
How to Host DeepSeek Locally on a Docker Home Server


In this tutorial, I will show you how to install the DeepSeek AI chat (or other LLM models) on your home server in just a few minutes and start using it locally, completely free and without limitations. This guide assumes that you already have a home server with Docker and Portainer installed. If you don’t have a home server yet, I recommend checking out my beginner’s guide on setting up a home server with a Raspberry Pi (link), where I cover the installation of Docker and Portainer in detail. And yes, you can even host DeepSeek on a Raspberry Pi!

Requirements:

  • Docker and Portainer installed
  • At least 1.2GB of free disk space available (to download the smallest distilled DeepSeek model)
  • At least 1.5GB of free RAM available (to run the smallest distilled DeepSeek model)

Adding the Docker Stack in Portainer

Create a new stack in Portainer and simply paste the following code:

services:
  # Open WebUI - the web front end you will use to chat with the models
  webui:
    image: ghcr.io/open-webui/open-webui:main
    container_name: webui
    ports:
      - 7000:8080/tcp          # web interface published on host port 7000
    volumes:
      - open-webui:/app/backend/data
    extra_hosts:
      - "host.docker.internal:host-gateway"
    depends_on:
      - ollama
    restart: unless-stopped

  # Ollama - the backend that downloads and runs the LLMs
  ollama:
    image: ollama/ollama
    container_name: ollama
    expose:
      - 11434/tcp
    ports:
      - 11434:11434/tcp        # Ollama API published on host port 11434
    healthcheck:
      test: ollama --version || exit 1
    volumes:
      - ollama:/root/.ollama   # downloaded models are stored in this volume
    restart: unless-stopped

volumes:
  ollama:
  open-webui:

Click “Deploy the stack” and wait for the images to download and the services to start. This setup will install two containers:

  • Ollama – A backend service that runs and manages large language models (LLMs) locally.
  • WebUI – A user-friendly web interface that allows you to interact with Ollama easily.

Once both containers are installed and running, you can access the web interface by opening:

http://host_name_or_ip:7000 in your browser.
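Before opening the web interface, you can optionally check that the Ollama backend is reachable by querying its HTTP API from any machine on your network. This is just a sanity check, not a required step; replace host_name_or_ip with your server’s address:

# Should return the Ollama version as JSON, e.g. {"version":"..."}
curl http://host_name_or_ip:11434/api/version

# Lists the models currently installed (empty right after the first start)
curl http://host_name_or_ip:11434/api/tags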

Using WebUI

On the WebUI page, you will first be prompted to create an administrator account. You can enter any credentials you like—these will not be transmitted anywhere and will be stored locally on your server.

Downloading a DeepSeek Model

  1. Go to the Workspace tab. Click on the drop-down arrow next to “Select a model”.
  2. Type in the search:
    • “deepseek-r1” for the 7b model or
    • “deepseek-r1:1.5b” for the 1.5b model.
  3. Click “Pull ‘deepseek-r1’ from Ollama.com” to download the selected model.

That’s it! Once the model is downloaded, you can start using your local DeepSeek AI chat!
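If you prefer the command line, you can also pull a model through the Ollama container itself instead of the WebUI. This is simply an alternative route; the container name ollama matches the stack above, and you can swap in any model tag:

# Download the smallest distilled DeepSeek model via the Ollama CLI
docker exec -it ollama ollama pull deepseek-r1:1.5b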

You can choose from different DeepSeek model versions depending on your available hardware. Here are some examples:

  • deepseek-r1:1.5b – The smallest distilled model, requiring 1.2GB of disk space and 1.5GB of free RAM to run.
  • deepseek-r1:7b – A larger distilled model that needs 4.7GB of disk space and 5.5GB of RAM.
  • deepseek-r1:671b – The most advanced full model currently available, requiring 405GB of disk space and about the same amount of RAM.

Check the full list of DeepSeek models here: DeepSeek models on Ollama.

You can download multiple models and switch between them just like in the ChatGPT web interface.

Additionally, feel free to explore other non-DeepSeek models available in the Ollama library.
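To see which models are already installed on your server, or to free up disk space by removing one, you can ask Ollama directly from the host. These are optional housekeeping commands, assuming the container name from the stack above:

# Show all models currently stored in the ollama volume
docker exec -it ollama ollama list

# Remove a model you no longer need
docker exec -it ollama ollama rm deepseek-r1:7b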

Performance Comparison

I tested the first two DeepSeek models on my two home server setups: a Raspberry Pi 4 (8GB DDR4) and an Intel N100 mini PC (32GB DDR5).

I’m not an expert in benchmarking LLMs, so for simplicity, I compared the performance of the 1.5b and 7b models on both servers by asking the AI the same question:

“Summarize Einstein’s theory of relativity in simple terms.”

I then measured the time it took for each model to generate a response. Here are the results I got:

Model         | Raspberry Pi 4 (8GB DDR4) | Intel N100 Mini PC (32GB DDR5)
DeepSeek 1.5b | 6m 43s                    | 1m 30s
DeepSeek 7b   | 26m 20s                   | 3m 54s

As expected, the Raspberry Pi 4 struggles with performance, while the Intel N100 Mini PC, combined with fast DDR5 memory, delivers significantly faster results. You can compare response quality and additional details in the screenshots below.
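If you want to get a rough number like this on your own hardware, one simple approach is to time a single non-streaming request against the Ollama API. This is only a sketch, not a rigorous benchmark: the prompt is shortened here to keep the shell quoting simple, and the first run also includes model loading time, so adjust the host, model tag, and prompt to your setup:

# Time one complete (non-streaming) answer from the 1.5b model
time curl -s http://host_name_or_ip:11434/api/generate -d '{
  "model": "deepseek-r1:1.5b",
  "prompt": "Summarize the theory of relativity in simple terms.",
  "stream": false
}'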

What’s Next?

  • Using a Self-Hosted LLM on Mobile
    For convenient access to your self-hosted LLM on a mobile device, you can use a dedicated app. I personally found this one to be quite useful. Simply add your server address in the app like this:
    http://host_name_or_ip:11434
  • Securing Remote Access
If you want to access your LLM server from outside your home network, do not expose the port directly without first setting up authentication. By default, Ollama does not require authentication, which is a security risk, so make sure you configure proper access control before opening the server to the internet (see the compose sketch after this list).
  • Leveraging GPU Acceleration
    If your home server has an NVIDIA GPU, consider exploring how to enable GPU acceleration for improved processing speed and performance.
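One simple way to reduce the Ollama API’s exposure is to bind its published port to the host’s loopback interface and let the WebUI reach it over the internal Docker network instead. The snippet below is only a sketch of that idea: it shows just the changed parts of the stack, assumes you do not need direct LAN access to port 11434 (for example, for the mobile app mentioned above), and is not a replacement for proper authentication on anything you expose to the internet:

  webui:
    environment:
      # Point Open WebUI at the ollama service over the internal Docker network
      - OLLAMA_BASE_URL=http://ollama:11434

  ollama:
    ports:
      # Publish the API on localhost only instead of all interfaces
      - 127.0.0.1:11434:11434/tcp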

Stay Updated

If you liked this guide, follow me on social media (links are in the website footer) to show your support and stay updated on similar articles in the future.

Enjoyed This Content? Support Me!

If you enjoyed this article and would like to see more like it, please consider supporting me by buying a coffee—it would mean a lot and keep me motivated!

Buy me a coffee

(Unauthorized copying of this page content is prohibited. Please use the Share buttons below or provide a direct link to this page instead. Thank you!)

2 Comments
Lad
2 months ago

Works great! To make it work with a GPU, I just added this to the ollama service definition:

    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]

In case of trouble, you can check this page: https://docs.docker.com/compose/how-tos/gpu-support/

Esvee
1 month ago

Thanks for this tutorial. I found this project on GitHub that does the same, but with Intel GPU support. I have an N100 too, and after deploying this I can indeed see the GPU being utilised by ollama with intel_gpu_top. I need to up the allocated VRAM in the BIOS to get it faster (currently I’m remote). Thought people might be interested. https://github.com/mattcurf/ollama-intel-gpu?tab=readme-ov-file
