Skip to main content
Claude Code is Anthropic’s agentic coding tool that can read, modify, and execute code in your working directory. Through Ollama’s Anthropic-compatible API, you can use open models like glm-4.7, qwen3-coder, and gpt-oss with Claude Code. Claude Code with Ollama

Installation

Install Claude Code from the official sources:
curl -fsSL https://claude.ai/install.sh | bash
Learn more at code.claude.com/docs.

Quick Setup

ollama launch claude
Ollama automatically:
1

Selects a model

Interactive picker with local and cloud models
2

Configures aliases

Sets up model routing (primary, fast, subagents)
3

Launches Claude Code

Starts the tool in your current directory

Configuration Only

ollama launch claude --config

Specify a Model

ollama launch claude --model gpt-oss:120b

Manual Setup

To manually configure Claude Code for Ollama:
1

Set environment variables

export ANTHROPIC_AUTH_TOKEN=ollama
export ANTHROPIC_API_KEY=""
export ANTHROPIC_BASE_URL=http://localhost:11434
2

Run Claude Code with a model

claude --model gpt-oss:20b

Inline Environment Variables

ANTHROPIC_AUTH_TOKEN=ollama \
ANTHROPIC_BASE_URL=http://localhost:11434 \
ANTHROPIC_API_KEY="" \
claude --model qwen3-coder

Model Requirements

Claude Code requires a large context window (at least 64k tokens) to understand your codebase.
See Context Length for how to adjust context length in Ollama.

Cloud Models

qwen3-coder:480b-cloud

Advanced code generation (260k context)

gpt-oss:120b-cloud

Large reasoning model (130k context)

glm-5:cloud

Reasoning and code generation (200k context)

deepseek-v3.1:671b-cloud

Massive reasoning model (160k context)
Explore more at ollama.com/search?c=cloud.

Local Models

qwen3-coder

Efficient code generation (~11GB VRAM)

glm-4.7

Reasoning and coding (~25GB VRAM)

gpt-oss:20b

OpenAI-style model for coding (~16GB VRAM)

Model Aliases

Claude Code uses model aliases to route different types of requests:
  • primary — Main model for complex tasks
  • fast — Lightweight model for quick operations
Ollama configures these automatically when you run ollama launch claude. Cloud models automatically populate the fast alias.
When Claude Code makes a request with a model alias (e.g., @fast), Ollama’s Anthropic-compatible API translates it to the configured model. This lets you optimize cost and performance by routing simple tasks to smaller models.

Subagent Support

Claude Code can delegate tasks to specialized subagents. With Ollama, you can configure different models for different agent types:
ollama launch claude
# → Select primary model (e.g., glm-5:cloud)
# → Ollama prompts for subagent configuration

Example Subagent Setup

  • Primary: glm-5:cloud (main reasoning)
  • Fast: qwen3:8b (quick tasks)
  • Code: qwen3-coder:480b-cloud (code generation)

Features

Multi-file Editing

Make changes across your entire codebase

Tool Calling

Execute shell commands and run tests

Context-Aware

Understands project structure and dependencies

Subagents

Delegate specialized tasks to different models

Usage Examples

Start in a Project Directory

cd ~/projects/my-app
ollama launch claude

Ask Claude Code to Make Changes

claude "Add error handling to the API routes"

Pass Extra Arguments

ollama launch claude -- --sandbox workspace-write

Review Changes Before Applying

Claude Code shows diffs for proposed changes. Press:
  • y — Accept changes
  • n — Reject changes
  • e — Edit manually

Connecting to ollama.com

To use cloud models hosted on ollama.com instead of running locally:
1

Create an API key

2

Export the key

export OLLAMA_API_KEY=your-key-here
3

Update base URL

export ANTHROPIC_BASE_URL=https://ollama.com
export ANTHROPIC_AUTH_TOKEN=ollama
export ANTHROPIC_API_KEY=$OLLAMA_API_KEY
4

Run Claude Code

claude --model glm-5:cloud

Troubleshooting

”Model not found” Error

Ensure the model is pulled:
ollama pull gpt-oss:20b
ollama list

Context Window Too Small

Increase the context window for local models:
ollama run qwen3-coder /set parameter num_ctx 65536
See Context Length for details.

Slow Performance

Considerations for better performance:
  • Use cloud models for large context windows
  • Use smaller local models for quick tasks
  • Ensure you have sufficient VRAM for the model
  • Close other applications using GPU resources

Authentication Issues

Verify environment variables:
env | grep ANTHROPIC
Expected output:
ANTHROPIC_AUTH_TOKEN=ollama
ANTHROPIC_API_KEY=
ANTHROPIC_BASE_URL=http://localhost:11434

Advanced Configuration

Persistent Environment Variables

Add to your shell profile (~/.bashrc, ~/.zshrc, etc.):
export ANTHROPIC_AUTH_TOKEN=ollama
export ANTHROPIC_API_KEY=""
export ANTHROPIC_BASE_URL=http://localhost:11434

Custom Aliases

Manually configure aliases in Ollama’s saved config:
~/.ollama/integrations/claude.json
{
  "models": ["glm-5:cloud"],
  "aliases": {
    "primary": "glm-5:cloud",
    "fast": "qwen3:8b",
    "code": "qwen3-coder:480b-cloud"
  }
}

Learn More

Claude Code Docs

Official Claude Code documentation

Anthropic API

Ollama’s Anthropic-compatible API

Context Length

Configure model context windows

Tool Calling

How tool calling works in Ollama