Docker - Ollama

The official Ollama Docker image (ollama/ollama) runs Ollama in a container, either on CPU only or with GPU acceleration on NVIDIA, AMD (ROCm), and Vulkan-capable hardware.

Quick Start

CPU Only

Run Ollama using CPU inference:

docker run -d -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama

This creates a container with:

A named volume (-v ollama:/root/.ollama) for persistent model storage
Port 11434 published (-p 11434:11434) for API access
Detached mode (-d), so the container runs in the background
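
Because the container starts in the background, the API may not accept requests for a moment after docker run returns. The following is a minimal sketch of a readiness check, assuming curl is installed on the host. The function name wait_for_ollama and its two arguments are illustrative and not part of Ollama or Docker.

```shell
# Poll the server until it answers or the attempt budget runs out.
# $1: URL to probe (defaults to the published port on localhost)
# $2: maximum number of attempts, one per second (defaults to 30)
wait_for_ollama() {
  url="${1:-http://localhost:11434/}"
  tries="${2:-30}"
  i=0
  while [ "$i" -lt "$tries" ]; do
    # -f makes curl return non-zero on HTTP errors; the body is discarded.
    if curl -fsS -o /dev/null "$url" 2>/dev/null; then
      return 0
    fi
    i=$((i + 1))
    sleep 1
  done
  return 1
}
```

For example, `wait_for_ollama http://localhost:11434/ 60 && docker exec ollama ollama list` waits up to roughly a minute before listing models.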

GPU Support

NVIDIA GPU

For NVIDIA GPU acceleration, you need the NVIDIA Container Toolkit.

Install NVIDIA Container Toolkit

Ubuntu/Debian:

# Configure the repository
curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey \
    | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg

curl -fsSL https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list \
    | sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' \
    | sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list

sudo apt-get update

# Install the toolkit
sudo apt-get install -y nvidia-container-toolkit

RHEL/CentOS:

# Configure the repository
curl -fsSL https://nvidia.github.io/libnvidia-container/stable/rpm/nvidia-container-toolkit.repo \
    | sudo tee /etc/yum.repos.d/nvidia-container-toolkit.repo

# Install the toolkit
sudo yum install -y nvidia-container-toolkit

Configure Docker Runtime

Configure Docker to use the NVIDIA runtime:

sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker

Run with GPU

Start Ollama with GPU support:

docker run -d --gpus=all -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama

For NVIDIA JetPack systems, set the JetPack version:

docker run -d --gpus=all -e JETSON_JETPACK=5 -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama

Use JETSON_JETPACK=6 for JetPack 6.

AMD GPU (ROCm)

For AMD GPU support, use the rocm tag:

docker run -d --device /dev/kfd --device /dev/dri \
  -v ollama:/root/.ollama \
  -p 11434:11434 \
  --name ollama \
  ollama/ollama:rocm

Supported AMD GPUs:

Radeon RX 7900, 7800, 7700, 7600 series
Radeon RX 6900, 6800 series
Radeon PRO W7000/W6000 series
AMD Instinct MI series

In some distributions, you may need to pass additional --group-add arguments for device access. Check device permissions with:

ls -lnd /dev/kfd /dev/dri /dev/dri/*
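
The numeric group IDs in that listing are the values to pass to --group-add. As a rough sketch, the helper below turns the listing into ready-made flags. The name group_add_flags is hypothetical, and skipping GID 0 assumes the container process already runs as root.

```shell
# Read `ls -ln` style output on stdin and print one --group-add flag per
# distinct numeric group ID (4th column), in first-seen order.
# GID 0 is skipped on the assumption that the container already runs as root.
group_add_flags() {
  awk '$4 ~ /^[0-9]+$/ && $4 != 0 && !seen[$4]++ {
         printf "%s--group-add %s", sep, $4; sep = " "
       }
       END { if (sep != "") print "" }'
}
```

It could be used as `docker run -d --device /dev/kfd --device /dev/dri $(ls -lnd /dev/kfd /dev/dri /dev/dri/* | group_add_flags) -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama:rocm`, where the unquoted command substitution is intentional so that each flag becomes a separate argument.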

Vulkan Support

Vulkan provides cross-platform GPU acceleration and is bundled in the standard image. Enable it by setting OLLAMA_VULKAN=1:

docker run -d --device /dev/kfd --device /dev/dri \
  -e OLLAMA_VULKAN=1 \
  -v ollama:/root/.ollama \
  -p 11434:11434 \
  --name ollama \
  ollama/ollama

Using Ollama in Docker

Run a Model

Execute commands inside the running container:

docker exec -it ollama ollama run llama3.2

Pull Models

docker exec -it ollama ollama pull gemma3

List Models

docker exec -it ollama ollama list

Check Running Models

docker exec -it ollama ollama ps
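
Typing the docker exec prefix for every command gets repetitive, and the -t flag fails when the command is run from a script or pipe without a terminal. One possible convenience, sketched here as a hypothetical shell function for your own shell profile (it is not shipped with Ollama), forwards its arguments to the CLI inside the container named ollama:

```shell
# Hypothetical wrapper: run the ollama CLI inside the "ollama" container.
ollama() {
  if [ -t 0 ] && [ -t 1 ]; then
    # Interactive terminal: allocate a TTY so chat sessions work.
    docker exec -it ollama ollama "$@"
  else
    # Piped or scripted use: keep stdin open but do not request a TTY.
    docker exec -i ollama ollama "$@"
  fi
}
```

With this defined, `ollama list` and `ollama run llama3.2` behave like the commands above.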

Docker Compose

Create a docker-compose.yml file:

services:
  ollama:
    image: ollama/ollama
    container_name: ollama
    ports:
      - "11434:11434"
    volumes:
      - ollama:/root/.ollama
    restart: unless-stopped

volumes:
  ollama:

Start with Docker Compose:

docker compose up -d
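
The Compose file above runs on CPU only. As a sketch of an NVIDIA GPU variant, the function below writes a Compose file that reserves all GPUs through the Compose device reservation syntax. It assumes the NVIDIA Container Toolkit is already configured for Docker; the function name write_gpu_compose and the file name docker-compose.gpu.yml are illustrative.

```shell
# Write a Compose file that requests all NVIDIA GPUs for the ollama service.
# $1: output path (defaults to docker-compose.gpu.yml, a hypothetical name)
write_gpu_compose() {
  out="${1:-docker-compose.gpu.yml}"
  cat > "$out" <<'EOF'
services:
  ollama:
    image: ollama/ollama
    container_name: ollama
    ports:
      - "11434:11434"
    volumes:
      - ollama:/root/.ollama
    restart: unless-stopped
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]

volumes:
  ollama:
EOF
}
```

A possible invocation is `write_gpu_compose docker-compose.gpu.yml && docker compose -f docker-compose.gpu.yml up -d`.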

API Access

Access the Ollama API from the host machine:

curl http://localhost:11434/api/generate -d '{
  "model": "gemma3",
  "prompt": "Why is the sky blue?",
  "stream": false
}'

From other containers on the same Docker network, use the container name as the hostname:

curl http://ollama:11434/api/generate -d '{
  "model": "gemma3",
  "prompt": "Hello, world!"
}'
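
Hand-written JSON like the requests above breaks as soon as the prompt contains a double quote. The sketch below builds the request body from shell variables and escapes backslashes and double quotes only; it does not handle newlines or other control characters, for which a JSON tool such as jq would be a more robust choice. The function names and the OLLAMA_URL variable are assumptions made for this example.

```shell
# Escape backslashes first, then double quotes, for use inside a JSON string.
json_escape() {
  printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g'
}

# Print a non-streaming /api/generate request body.
# $1: model name, $2: prompt
generate_body() {
  printf '{"model":"%s","prompt":"%s","stream":false}\n' \
    "$(json_escape "$1")" "$(json_escape "$2")"
}

# Send the request; OLLAMA_URL is a hypothetical override for the base URL.
ollama_generate() {
  curl -s "${OLLAMA_URL:-http://localhost:11434}/api/generate" \
    -d "$(generate_body "$1" "$2")"
}
```

For instance, `ollama_generate gemma3 'Explain the word "container" in one sentence.'` sends a well-formed request even though the prompt contains quotes.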

Configuration

Environment Variables

Customize Ollama behavior with environment variables:

docker run -d --gpus=all \
  -e OLLAMA_DEBUG=1 \
  -e OLLAMA_HOST=0.0.0.0:11434 \
  -e OLLAMA_KEEP_ALIVE=24h \
  -v ollama:/root/.ollama \
  -p 11434:11434 \
  --name ollama \
  ollama/ollama

| Variable | Description | Default |
| --- | --- | --- |
| `OLLAMA_HOST` | Server bind address | `0.0.0.0:11434` |
| `OLLAMA_DEBUG` | Enable debug logging | `0` |
| `OLLAMA_MODELS` | Model storage path | `/root/.ollama/models` |
| `OLLAMA_KEEP_ALIVE` | Model keep-alive time | `5m` |
| `OLLAMA_NUM_PARALLEL` | Max parallel requests | `1` |
| `OLLAMA_MAX_LOADED_MODELS` | Max concurrent models | `3` |
| `OLLAMA_VULKAN` | Enable Vulkan acceleration | `0` |
| `JETSON_JETPACK` | JetPack version (5 or 6) | Auto-detect |
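
To confirm which of these settings a running container actually received, you can inspect its environment. The helper below is a small sketch; the name ollama_env is made up for this example, and it assumes the container is called ollama unless another name is passed.

```shell
# List the Ollama-related environment variables of a running container.
# $1: container name (defaults to "ollama")
ollama_env() {
  docker exec "${1:-ollama}" env | grep -E '^(OLLAMA_|JETSON_JETPACK=)' | sort
}
```

Variables that were never set with -e and are not defined by the image do not appear in the output; Ollama then falls back to its built-in defaults.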

Custom Model Location

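Bind-mount a host directory in place of the named volume. With the default OLLAMA_MODELS path, models are stored in the models subdirectory of the mounted directory: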

docker run -d --gpus=all \
  -v /path/to/models:/root/.ollama \
  -p 11434:11434 \
  --name ollama \
  ollama/ollama

Using a Proxy

Set HTTPS_PROXY to route model downloads through a proxy:

docker run -d --gpus=all \
  -e HTTPS_PROXY=https://proxy.example.com \
  -v ollama:/root/.ollama \
  -p 11434:11434 \
  --name ollama \
  ollama/ollama

Custom CA Certificate

If the proxy uses a self-signed certificate, build a custom image that trusts your CA certificate:

FROM ollama/ollama
COPY my-ca.pem /usr/local/share/ca-certificates/my-ca.crt
RUN update-ca-certificates

Build and run:

docker build -t ollama-with-ca .
docker run -d --gpus=all \
  -e HTTPS_PROXY=https://proxy.example.com \
  -v ollama:/root/.ollama \
  -p 11434:11434 \
  --name ollama \
  ollama-with-ca

Networking

Expose on Network

By default, Ollama binds to 0.0.0.0 inside the container, so once port 11434 is published with -p 11434:11434 the API is reachable from other machines at the host's address:

curl http://your-server-ip:11434/api/tags

Custom Network

Create a custom Docker network for inter-container communication:

docker network create ollama-net

docker run -d --gpus=all \
  --network ollama-net \
  -v ollama:/root/.ollama \
  --name ollama \
  ollama/ollama
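
The container above publishes no port, so it is reachable only from other containers attached to ollama-net, where Docker resolves the container name ollama as a hostname. A quick way to check this is to run a throwaway client container on the same network. The sketch below uses the public curlimages/curl image as an assumption (any image that provides curl would do), and the function name is illustrative.

```shell
# Call the Ollama API from a temporary container on the same Docker network.
# $1: network name (defaults to "ollama-net")
ollama_net_check() {
  network="${1:-ollama-net}"
  docker run --rm --network "$network" curlimages/curl \
    -fsS http://ollama:11434/api/tags
}
```

A JSON list of local models indicates that name resolution and the network are working.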

Updates

Update the Ollama container:

# Pull the latest image
docker pull ollama/ollama

# Stop and remove old container
docker stop ollama
docker rm ollama

# Start new container
docker run -d --gpus=all \
  -v ollama:/root/.ollama \
  -p 11434:11434 \
  --name ollama \
  ollama/ollama

Your models are preserved in the ollama volume.
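
The update steps can be collected into one function so that they always run in the same order. This is only a sketch: the name update_ollama is hypothetical, and the run flags mirror the GPU command above, so drop --gpus=all on CPU-only hosts and add any -e options you normally use.

```shell
# Pull the latest image, replace the container, and keep the model volume.
update_ollama() {
  # Stop here if the pull fails so the old container keeps running.
  docker pull ollama/ollama || return 1
  # Ignore errors from stop/rm in case the container does not exist yet.
  docker stop ollama >/dev/null 2>&1 || true
  docker rm ollama >/dev/null 2>&1 || true
  docker run -d --gpus=all \
    -v ollama:/root/.ollama \
    -p 11434:11434 \
    --name ollama \
    ollama/ollama
}
```

Because the pull happens first, a failed download leaves the existing container untouched.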

Logs and Debugging

View Container Logs

docker logs ollama

Follow Logs in Real-time

docker logs -f ollama
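
When searching the logs, note that docker logs writes the container's stderr to the host's stderr, so a plain pipe into grep can miss lines. The sketch below merges both streams before filtering. The function name and the default pattern are assumptions chosen for this example, not an Ollama log format guarantee.

```shell
# Search a container's combined stdout and stderr logs, case-insensitively.
# $1: extended regular expression (defaults to "error|warn")
# $2: container name (defaults to "ollama")
ollama_log_grep() {
  pattern="${1:-error|warn}"
  docker logs "${2:-ollama}" 2>&1 | grep -i -E "$pattern"
}
```

For example, `ollama_log_grep 'gpu|cuda|rocm'` narrows the output to lines that mention GPU-related terms.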

Enable Debug Mode

docker run -d --gpus=all \
  -e OLLAMA_DEBUG=1 \
  -v ollama:/root/.ollama \
  -p 11434:11434 \
  --name ollama \
  ollama/ollama

Interactive Shell

docker exec -it ollama /bin/bash

Troubleshooting

GPU Not Detected

For NVIDIA:

Verify the container runtime:

docker run --rm --gpus all ubuntu nvidia-smi

Check Docker daemon configuration:
```
cat /etc/docker/daemon.json
```
Ensure that nvidia-container-toolkit is installed and that the nvidia runtime appears in daemon.json.
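
As an additional check, you can ask the Docker daemon which runtimes it has registered instead of reading daemon.json by hand. The following sketch uses the Runtimes field of docker info; the function name is made up for this example.

```shell
# Succeed if the Docker daemon reports a runtime named "nvidia".
docker_has_nvidia_runtime() {
  docker info --format '{{json .Runtimes}}' 2>/dev/null | grep -q '"nvidia"'
}
```

A typical use is `docker_has_nvidia_runtime || echo "nvidia runtime not registered; re-run nvidia-ctk runtime configure and restart Docker"`.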

For AMD:

Check device permissions:
```
ls -l /dev/kfd /dev/dri
```

Add the groups that own the devices (commonly video and render):

docker run -d --device /dev/kfd --device /dev/dri \
  --group-add video --group-add render \
  -v ollama:/root/.ollama \
  -p 11434:11434 \
  --name ollama \
  ollama/ollama:rocm

Container Keeps Restarting

Check logs for errors:
```
docker logs ollama
```
Verify port availability:
```
netstat -tuln | grep 11434
```
Check system resources:
```
docker stats ollama
```

Cgroup Issues on Linux

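If GPU discovery fails after the container has been running for some time, switch Docker from the systemd cgroup driver to cgroupfs. Add the following key to /etc/docker/daemon.json, keeping any settings already in the file (such as the NVIDIA runtime entry):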

{
  "exec-opts": ["native.cgroupdriver=cgroupfs"]
}

Restart Docker:

sudo systemctl restart docker
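
After the restart, it is worth confirming that the daemon picked up the change. A minimal sketch, using the CgroupDriver field of docker info (the function name is illustrative):

```shell
# Print the cgroup driver the Docker daemon is currently using.
docker_cgroup_driver() {
  docker info --format '{{.CgroupDriver}}'
}
```

If this still prints systemd, check daemon.json for syntax errors and restart Docker again.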

Browse More Models

Explore available models at ollama.com/library.

Next Steps

API Reference: Integrate the Ollama API into your applications
GPU Configuration: Optimize GPU settings and performance
Linux Setup: Install Ollama directly on Linux
Model Library: Browse available models
