Synopsis
Description
The run command starts a model and lets you interact with it. If the model is not already present locally, it is pulled automatically from the registry.
You can use run in two modes:
- Interactive mode: No prompt provided, opens a chat interface
- Single-shot mode: Prompt provided, returns response and exits
Arguments
Name of the model to run (e.g., llama3.2, mistral, codellama:7b)
Optional prompt to send to the model. If omitted, enters interactive mode.
Options
Duration to keep the model loaded in memory (e.g., 5m, 1h)
- Default: Server configuration (typically 5 minutes)
- 0 - Unload immediately after use
- -1 - Keep loaded indefinitely
Response format. Use json to request JSON output.
Show detailed timing information for the response, including:
- Prompt evaluation time
- Response generation time
- Total tokens processed
Disable automatic word wrapping in terminal output
Thinking Mode Options
Enable thinking mode for supported models. Accepts:
- true or empty - Enable default thinking
- false - Disable thinking
- high, medium, low - Set thinking effort level
Hide the thinking output from display (only show final answer)
Embedding Model Options
For embedding models: truncate inputs that exceed context length. Set to false to error instead.
For embedding models: truncate output embeddings to specified dimension
Experimental Options
Enable experimental agent loop with tool support
Skip all tool approval prompts (use with caution)
Enable web search tool in experimental mode
Examples
Interactive Mode
Start an interactive chat session:
Single Prompt
Run a one-off prompt:
Piped Input
Pipe content from stdin:
JSON Output
Request structured JSON responses:
Control Model Memory
Thinking Models
Use reasoning models with visible thinking:
Embedding Models
Generate embeddings:
Embedding models return JSON arrays of floating-point numbers representing the text embedding.
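The scenarios in this section might look like the following shell sketch. Each invocation is wrapped in a small function so it can be copied into a script. The function names are hypothetical and the model names (llama3.2, deepseek-r1, nomic-embed-text) are placeholders. The flag spellings (--format, --keepalive, --think, --hidethinking) are assumed from current Ollama releases, so confirm them with ollama run --help on your installation.

```shell
# Hypothetical wrappers around `ollama run`; names and models are placeholders.

# Interactive mode: no prompt, opens a chat session.
example_interactive() { ollama run llama3.2; }

# Single prompt: prints the response and exits.
example_single() { ollama run llama3.2 "Why is the sky blue?"; }

# Piped input: the file named in $1 is fed to the command on stdin.
example_piped() { ollama run llama3.2 "Summarize this text" < "$1"; }

# JSON output: ask for a JSON-formatted response.
example_json() { ollama run --format json llama3.2 "List three colors as JSON"; }

# Control model memory: keep the model loaded for an hour, or unload it at once.
example_keep_loaded() { ollama run --keepalive 1h llama3.2 "Hello"; }
example_unload_now()  { ollama run --keepalive 0 llama3.2 "Hello"; }

# Thinking models: show the reasoning, or hide it and print only the answer.
example_think()         { ollama run --think=true deepseek-r1 "What is 17 * 23?"; }
example_hide_thinking() { ollama run --think=true --hidethinking deepseek-r1 "What is 17 * 23?"; }

# Embedding models: prints a JSON array of floating-point numbers.
example_embed() { ollama run nomic-embed-text "Hello, world"; }
```

The --think=true form passes the value explicitly, which avoids any ambiguity about whether the model name that follows would be read as the flag's value.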
Interactive Mode Commands
When in interactive mode, you can use these special commands:
- /bye - Exit the session
- /clear - Clear conversation history
- /show - Display model information
- /set - Set session parameters (e.g., /set think, /set nothink)
- /load <image> - Load an image for multimodal models
- Multiline input: Use """ to enter multiline mode, """ again to submit
Output Format
Interactive mode displays:
Environment Variables
Ollama server address
Editor to use for multiline input (e.g., vim, nano)
Disable saving chat history
Exit Codes
- 0 - Success
- 1 - Model not found or error occurred
- 130 - Interrupted by user (Ctrl+C)
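In a script, these exit codes can be used to tell a failed run from an interrupted one. The sketch below is a minimal illustration: the function name run_and_report is hypothetical, and the status messages go to stderr so that they do not mix with the model's response on stdout.

```shell
# Hypothetical helper: run a single prompt and report the outcome on stderr,
# based on the exit codes listed above.
run_and_report() {
  status=0
  ollama run "$1" "$2" || status=$?
  case "$status" in
    0)   echo "ok" >&2 ;;
    130) echo "interrupted" >&2 ;;
    *)   echo "failed (exit $status)" >&2 ;;
  esac
  return "$status"
}

# Usage:
#   run_and_report llama3.2 "Why is the sky blue?" > answer.txt
```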
Related Commands
- ollama pull - Download a model without running it
- ollama show - Display model information
- ollama ps - List currently running models
- ollama stop - Stop a running model