Synopsis

ollama run MODEL [PROMPT]

Description

The run command loads a model and lets you interact with it. If the model is not present locally, it is pulled automatically from the registry. run operates in one of two modes:
  • Interactive mode: no prompt is provided, so run opens a chat session
  • Single-shot mode: a prompt is provided, so run prints the response and exits
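Because the mode depends on whether a prompt argument is present, a script that loses its prompt may end up waiting in a chat session instead of exiting. The sketch below guards against that. The helper name ask and its exit status 2 are inventions of this example, not part of ollama.

```shell
# ask MODEL PROMPT: single-shot wrapper that refuses to run without a prompt.
# "ask" is a hypothetical helper; exit status 2 is this sketch's own convention.
ask() (
  model="${1-}"
  prompt="${2-}"
  if [ -z "$model" ] || [ -z "$prompt" ]; then
    echo "usage: ask MODEL PROMPT" >&2
    exit 2
  fi
  # With a prompt argument, run prints the response and exits.
  ollama run "$model" "$prompt"
)
```

Usage: ask llama3.2 "Why is the sky blue?"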

Arguments

MODEL
string
required
Name of the model to run (e.g., llama3.2, mistral, codellama:7b)
PROMPT
string
Optional prompt to send to the model. If omitted, run enters interactive mode.

Options

--keepalive
duration
Duration to keep the model loaded in memory (e.g., 5m, 1h)
  • Default: Server configuration (typically 5 minutes)
  • 0 - Unload immediately after use
  • -1 - Keep loaded indefinitely
--format
string
Response format. Use json to request JSON output.
ollama run llama3.2 --format json "List 3 programming languages"
--verbose
boolean
default:"false"
Show detailed timing information for the response, including:
  • Prompt evaluation time
  • Response generation time
  • Total tokens processed
--nowordwrap
boolean
default:"false"
Disable automatic word wrapping in terminal output

Thinking Mode Options

--think
string|boolean
Enable thinking mode for supported models. Accepts:
  • true or empty - Enable default thinking
  • false - Disable thinking
  • high, medium, low - Set thinking effort level
ollama run deepseek-r1 --think
ollama run deepseek-r1 --think=high
--hidethinking
boolean
default:"false"
Hide the thinking output from display (only show final answer)
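When the thinking level comes from a variable or a config file, it can help to check it against the accepted values before calling run. The following is a minimal sketch under that assumption. think_flag is a hypothetical helper, and the value list is simply the one documented above.

```shell
# think_flag [VALUE]: print the --think flag for a documented value.
# "think_flag" is a hypothetical helper, not an ollama command.
think_flag() {
  case "${1-}" in
    ""|true)               echo "--think" ;;     # default thinking
    false|high|medium|low) echo "--think=$1" ;;  # explicit value
    *)
      echo "think_flag: unsupported value: $1" >&2
      return 1
      ;;
  esac
}
```

Usage: ollama run deepseek-r1 "$(think_flag high)" "Complex reasoning task"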

Embedding Model Options

--truncate
boolean
default:"true"
For embedding models: truncate inputs that exceed the context length. Set to false to return an error instead.
--dimensions
integer
default:"0"
For embedding models: truncate output embeddings to the specified number of dimensions

Experimental Options

--experimental
boolean
default:"false"
Enable experimental agent loop with tool support
--experimental-yolo
boolean
default:"false"
Skip all tool approval prompts (use with caution)
--experimental-websearch
boolean
default:"false"
Enable web search tool in experimental mode

Examples

Interactive Mode

Start an interactive chat session:
ollama run llama3.2

Single Prompt

Run a one-off prompt:
ollama run llama3.2 "Why is the sky blue?"

Piped Input

Pipe content from stdin:
cat document.txt | ollama run llama3.2 "Summarize this document:"
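The same pattern extends to several files. The sketch below is one possible approach: it feeds each file to the model on stdin using a redirect, which the process sees the same way as the cat pipe above. summarize_all is a hypothetical helper name.

```shell
# summarize_all MODEL FILE...: summarize each readable file in turn.
# "summarize_all" is a hypothetical helper, not an ollama command.
summarize_all() (
  model="$1"
  shift
  for f in "$@"; do
    if [ ! -r "$f" ]; then
      echo "summarize_all: skipping unreadable file: $f" >&2
      continue
    fi
    printf '== %s ==\n' "$f"
    # The file content arrives on stdin; the prompt is the argument.
    ollama run "$model" "Summarize this document:" < "$f"
  done
)
```

Usage: summarize_all llama3.2 notes.txt report.txt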

JSON Output

Request structured JSON responses:
ollama run llama3.2 --format json "List 3 colors"

Control Model Memory

# Keep model loaded for 10 minutes
ollama run llama3.2 --keepalive 10m "Hello"

# Unload immediately after response
ollama run llama3.2 --keepalive 0 "Hello"

# Keep loaded indefinitely
ollama run llama3.2 --keepalive -1

Thinking Models

Use reasoning models with visible thinking:
# Enable thinking with default settings
ollama run deepseek-r1 --think "Solve this problem: ..."

# High effort thinking
ollama run deepseek-r1 --think=high "Complex reasoning task"

# Hide thinking output
ollama run deepseek-r1 --think --hidethinking "Just show the answer"

Embedding Models

Generate embeddings:
ollama run nomic-embed-text "Your text here"
Embedding models return JSON arrays of floating-point numbers representing the text embedding.

Interactive Mode Commands

When in interactive mode, you can use these special commands:
  • /bye - Exit the session
  • /clear - Clear conversation history
  • /show - Display model information
  • /set - Set session parameters (e.g., /set think, /set nothink)
  • /load <image> - Load an image for multimodal models
  • Multiline input: Type """ to begin multiline input, then """ again to submit it

Output Format

Interactive mode displays:
>>> Your prompt here

Model response appears here...

Non-interactive mode writes the response directly to stdout.

Environment Variables

OLLAMA_HOST
string
default:"http://127.0.0.1:11434"
Ollama server address
OLLAMA_EDITOR
string
Editor to use for multiline input (e.g., vim, nano)
OLLAMA_NOHISTORY
boolean
default:"false"
Disable saving chat history
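As an illustration of how these variables might be combined, the sketch below prints the server address that would apply and starts run against a chosen server with history saving disabled. effective_host and run_on are hypothetical helpers, and the value true for OLLAMA_NOHISTORY is an assumption based on the boolean type listed above.

```shell
# Print OLLAMA_HOST if it is set, otherwise the documented default.
effective_host() {
  printf '%s\n' "${OLLAMA_HOST:-http://127.0.0.1:11434}"
}

# run_on HOST MODEL [PROMPT]: run against HOST without saving chat history.
# The subshell keeps both variables from leaking into the calling shell.
run_on() (
  OLLAMA_HOST="$1"
  shift
  OLLAMA_NOHISTORY=true   # assumed spelling of the boolean value
  export OLLAMA_HOST OLLAMA_NOHISTORY
  ollama run "$@"
)
```

Usage: run_on http://127.0.0.1:11434 llama3.2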

Exit Codes

  • 0 - Success
  • 1 - Model not found, or another error occurred
  • 130 - Interrupted by user (Ctrl+C)
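A calling script can branch on these codes. The following is a minimal sketch, assuming the codes listed above are the only documented ones. describe_exit and run_and_report are hypothetical helper names.

```shell
# describe_exit CODE: map an exit status to the meaning listed above.
describe_exit() {
  case "$1" in
    0)   echo "success" ;;
    1)   echo "model not found or error occurred" ;;
    130) echo "interrupted by user" ;;
    *)   echo "unexpected exit code $1" ;;
  esac
}

# run_and_report MODEL [PROMPT]: run, describe the outcome on stderr,
# and pass the original exit status back to the caller.
run_and_report() (
  status=0
  ollama run "$@" || status=$?
  echo "ollama run: $(describe_exit "$status")" >&2
  exit "$status"
)
```

Usage: run_and_report llama3.2 "Hello" || echo "run failed"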