The official Ollama Docker image provides a containerized way to run Ollama, either on CPU only or with GPU acceleration (NVIDIA, AMD ROCm, or Vulkan).

Quick Start

CPU Only

Run Ollama using CPU inference:
docker run -d -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama
This creates a container with:
  • Volume mount for persistent model storage
  • Port 11434 exposed for API access
  • Background daemon mode
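
Because the container starts in the background, scripts may want to wait until the API answers before sending requests. The following is a minimal sketch, assuming curl is installed on the host; the function name wait_for_ollama and its arguments are illustrative and not part of Ollama.

```shell
# Poll the Ollama API until it responds or the attempts run out.
# wait_for_ollama is a hypothetical helper, not an Ollama command.
wait_for_ollama() {
  local url="${1:-http://localhost:11434}"
  local attempts="${2:-30}"
  local delay="${3:-1}"
  local i=1
  while [ "$i" -le "$attempts" ]; do
    # /api/tags lists local models; any successful response means the server is up.
    if curl -fsS -o /dev/null "$url/api/tags" 2>/dev/null; then
      echo "Ollama is reachable at $url"
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  echo "Ollama did not respond at $url after $attempts attempts" >&2
  return 1
}
```

Call it as wait_for_ollama right after docker run, or pass a different URL, attempt count, and delay in seconds as the three arguments.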

GPU Support

NVIDIA GPU

For NVIDIA GPU acceleration, install the NVIDIA Container Toolkit on the host. The steps below use apt, so they apply to Debian- and Ubuntu-based distributions.

1. Install NVIDIA Container Toolkit

# Configure the repository
curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey \
    | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg

curl -fsSL https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list \
    | sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' \
    | sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list

sudo apt-get update

# Install the toolkit
sudo apt-get install -y nvidia-container-toolkit

2. Configure Docker Runtime

Configure Docker to use the NVIDIA runtime:
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker

3. Run with GPU

Start Ollama with GPU support:
docker run -d --gpus=all -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama
For NVIDIA JetPack systems, set the JetPack version:
docker run -d --gpus=all -e JETSON_JETPACK=5 -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama
Use JETSON_JETPACK=6 for JetPack 6.

AMD GPU (ROCm)

For AMD GPU support, use the rocm tag:
docker run -d --device /dev/kfd --device /dev/dri \
  -v ollama:/root/.ollama \
  -p 11434:11434 \
  --name ollama \
  ollama/ollama:rocm
Supported AMD GPUs:
  • Radeon RX 7900, 7800, 7700, 7600 series
  • Radeon RX 6900, 6800 series
  • Radeon PRO W7000/W6000 series
  • AMD Instinct MI series
In some distributions, you may need to pass additional --group-add arguments for device access. Check device permissions with:
ls -lnd /dev/kfd /dev/dri /dev/dri/*
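
The numeric group IDs shown by that listing can be turned into --group-add arguments automatically. This is a sketch under assumptions: device_group_args is a hypothetical helper, and stat -c is the GNU coreutils form, so it may need adjusting on other systems.

```shell
# Print "--group-add <gid>" once per distinct owning group of the given device nodes.
# device_group_args is a hypothetical helper; stat -c is GNU coreutils syntax.
device_group_args() {
  local gids="" out="" gid dev
  for dev in "$@"; do
    [ -e "$dev" ] || continue                 # skip nodes missing on this host
    gid="$(stat -c '%g' "$dev")" || continue  # numeric group ID of the node
    case " $gids " in
      *" $gid "*) ;;                          # already collected
      *) gids="$gids $gid" ;;
    esac
  done
  for gid in $gids; do
    out="$out${out:+ }--group-add $gid"
  done
  printf '%s\n' "$out"
}
```

One possible use is docker run -d --device /dev/kfd --device /dev/dri $(device_group_args /dev/kfd /dev/dri/*) followed by the remaining arguments from the ROCm command above; the substitution is left unquoted on purpose so each flag becomes a separate argument.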

Vulkan Support

Vulkan provides cross-platform GPU acceleration and is bundled in the standard image:
docker run -d --device /dev/kfd --device /dev/dri \
  -e OLLAMA_VULKAN=1 \
  -v ollama:/root/.ollama \
  -p 11434:11434 \
  --name ollama \
  ollama/ollama

Using Ollama in Docker

Run a Model

Execute commands inside the running container:
docker exec -it ollama ollama run llama3.2

Pull Models

docker exec -it ollama ollama pull gemma3
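
When pulling from a script or CI job there is usually no terminal attached, and docker exec -it fails in that case, so the flags are dropped below. A minimal sketch; pull_models and the OLLAMA_CONTAINER variable are illustrative names, not part of Ollama or Docker.

```shell
# Pull each named model inside the container, stopping at the first failure.
# pull_models and OLLAMA_CONTAINER are hypothetical names chosen for this example.
pull_models() {
  local container="${OLLAMA_CONTAINER:-ollama}" model
  for model in "$@"; do
    echo "Pulling $model"
    # No -it: non-interactive callers have no TTY to attach.
    docker exec "$container" ollama pull "$model" || return 1
  done
}
```

For example, pull_models gemma3 llama3.2 pulls both models used elsewhere on this page.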

List Models

docker exec -it ollama ollama list

Check Running Models

docker exec -it ollama ollama ps

Docker Compose

Create a docker-compose.yml file:
services:
  ollama:
    image: ollama/ollama
    container_name: ollama
    ports:
      - "11434:11434"
    volumes:
      - ollama:/root/.ollama
    restart: unless-stopped

volumes:
  ollama:
Start with Docker Compose:
docker compose up -d

API Access

Access the Ollama API from the host machine:
curl http://localhost:11434/api/generate -d '{
  "model": "gemma3",
  "prompt": "Why is the sky blue?",
  "stream": false
}'
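From other containers on the same user-defined Docker network (including the default network Docker Compose creates), use the container name as the hostname: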
curl http://ollama:11434/api/generate -d '{
  "model": "gemma3",
  "prompt": "Hello, world!"
}'

Configuration

Environment Variables

Customize Ollama behavior with environment variables:
docker run -d --gpus=all \
  -e OLLAMA_DEBUG=1 \
  -e OLLAMA_HOST=0.0.0.0:11434 \
  -e OLLAMA_KEEP_ALIVE=24h \
  -v ollama:/root/.ollama \
  -p 11434:11434 \
  --name ollama \
  ollama/ollama
Variable                  Description                 Default
OLLAMA_HOST               Server bind address         0.0.0.0:11434
OLLAMA_DEBUG              Enable debug logging        0
OLLAMA_MODELS             Model storage path          /root/.ollama/models
OLLAMA_KEEP_ALIVE         Model keep-alive time       5m
OLLAMA_NUM_PARALLEL       Max parallel requests       1
OLLAMA_MAX_LOADED_MODELS  Max concurrent models       3
OLLAMA_VULKAN             Enable Vulkan acceleration  0
JETSON_JETPACK            JetPack version (5 or 6)    Auto-detect
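
To confirm which of these variables a running container actually received, its environment can be listed and filtered. A small sketch; show_ollama_env is a hypothetical helper and the default container name ollama follows the examples on this page.

```shell
# List the Ollama-related environment variables set inside a running container.
# show_ollama_env is a hypothetical helper name.
show_ollama_env() {
  local container="${1:-ollama}"
  docker exec "$container" env | grep -E '^(OLLAMA_|JETSON_JETPACK=)' | sort
}
```

Variables that were never set do not appear in the output, in which case Ollama falls back to its defaults.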

Custom Model Location

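Bind-mount a host directory in place of the named volume. The directory becomes Ollama's data directory, and models are stored in its models subdirectory: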
docker run -d --gpus=all \
  -v /path/to/models:/root/.ollama \
  -p 11434:11434 \
  --name ollama \
  ollama/ollama

Using a Proxy

Configure proxy settings:
docker run -d --gpus=all \
  -e HTTPS_PROXY=https://proxy.example.com \
  -v ollama:/root/.ollama \
  -p 11434:11434 \
  --name ollama \
  ollama/ollama

Custom CA Certificate

For self-signed certificates, create a custom image:
FROM ollama/ollama
COPY my-ca.pem /usr/local/share/ca-certificates/my-ca.crt
RUN update-ca-certificates
Build and run:
docker build -t ollama-with-ca .
docker run -d --gpus=all \
  -e HTTPS_PROXY=https://proxy.example.com \
  -v ollama:/root/.ollama \
  -p 11434:11434 \
  --name ollama \
  ollama-with-ca

Networking

Expose on Network

Inside the container, Ollama binds to 0.0.0.0 by default, so once port 11434 is published with -p 11434:11434 the API is reachable from other machines:
curl http://your-server-ip:11434/api/tags

Custom Network

Create a custom Docker network for inter-container communication:
docker network create ollama-net

docker run -d --gpus=all \
  --network ollama-net \
  -v ollama:/root/.ollama \
  --name ollama \
  ollama/ollama
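
Since this container publishes no port, one way to verify it is to call the API from a second container on the same network. A sketch under assumptions: check_from_network is a hypothetical helper, and the curlimages/curl image is used here only as a convenient client; any image with curl would do.

```shell
# Call the Ollama API from a throwaway container attached to the same network.
# check_from_network is a hypothetical helper; curlimages/curl is an assumed client image.
check_from_network() {
  local network="${1:-ollama-net}" host="${2:-ollama}"
  docker run --rm --network "$network" curlimages/curl -fsS "http://$host:11434/api/tags"
}
```

The container name ollama resolves through Docker's built-in DNS on user-defined networks such as ollama-net.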

Updates

Update the Ollama container:
# Pull the latest image
docker pull ollama/ollama

# Stop and remove old container
docker stop ollama
docker rm ollama

# Start new container with the same flags as before (omit --gpus=all on CPU-only hosts)
docker run -d --gpus=all \
  -v ollama:/root/.ollama \
  -p 11434:11434 \
  --name ollama \
  ollama/ollama
Your models are preserved in the ollama volume.

Logs and Debugging

View Container Logs

docker logs ollama

Follow Logs in Real-time

docker logs -f ollama

Enable Debug Mode

docker run -d --gpus=all \
  -e OLLAMA_DEBUG=1 \
  -v ollama:/root/.ollama \
  -p 11434:11434 \
  --name ollama \
  ollama/ollama

Interactive Shell

docker exec -it ollama /bin/bash

Troubleshooting

GPU Not Detected

For NVIDIA:
  1. Verify the container runtime:
    docker run --rm --gpus all ubuntu nvidia-smi
    
  2. Check that the Docker daemon configuration lists an nvidia entry under "runtimes":
    cat /etc/docker/daemon.json
    
  3. Ensure nvidia-container-toolkit is installed
For AMD:
  1. Check device permissions:
    ls -l /dev/kfd /dev/dri
    
  2. Add appropriate group permissions:
    docker run -d --device /dev/kfd --device /dev/dri \
      --group-add video --group-add render \
      -v ollama:/root/.ollama \
      -p 11434:11434 \
      --name ollama \
      ollama/ollama:rocm
    

Container Keeps Restarting

  1. Check logs for errors:
    docker logs ollama
    
  2. Verify port availability:
    netstat -tuln | grep 11434
    
  3. Check system resources:
    docker stats ollama
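
The container's recorded state often narrows down a restart loop faster than the logs alone. A minimal sketch using docker inspect with a Go template; container_state_summary is a hypothetical helper name.

```shell
# Summarize why a container may be restarting: status, restart count,
# last exit code, and whether the kernel OOM killer ended it.
# container_state_summary is a hypothetical helper name.
container_state_summary() {
  local container="${1:-ollama}"
  docker inspect \
    --format 'status={{.State.Status}} restarts={{.RestartCount}} exit_code={{.State.ExitCode}} oom_killed={{.State.OOMKilled}}' \
    "$container"
}
```

An oom_killed value of true suggests the model did not fit in available memory, while a non-zero exit code usually points back to an error in docker logs ollama.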
    

Cgroup Issues on Linux

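If GPU discovery fails after the container has been running for some time, switch Docker from the systemd cgroup driver to cgroupfs. Add the following key to /etc/docker/daemon.json, merging it into any existing settings rather than replacing the file (nvidia-ctk runtime configure writes to the same file):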
{
  "exec-opts": ["native.cgroupdriver=cgroupfs"]
}
Restart Docker:
sudo systemctl restart docker
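
After the restart, the active cgroup driver can be read back from docker info to confirm the change took effect. A small sketch; the two function names are illustrative.

```shell
# Report the cgroup driver the Docker daemon is currently using.
# Both function names are hypothetical helpers for this example.
docker_cgroup_driver() {
  docker info --format '{{.CgroupDriver}}'
}

# Succeeds only when the daemon uses the cgroupfs driver.
uses_cgroupfs() {
  [ "$(docker_cgroup_driver)" = "cgroupfs" ]
}
```

For example, uses_cgroupfs || echo "still using $(docker_cgroup_driver)" prints a reminder if the daemon has not picked up the new setting.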

Browse More Models

Explore available models at ollama.com/library.

Next Steps

  • API Reference: Integrate the Ollama API into your applications
  • GPU Configuration: Optimize GPU settings and performance
  • Linux Setup: Install Ollama directly on Linux
  • Model Library: Browse available models