Start building with open models

Ollama makes it easy to get up and running with large language models locally. Download, run, and manage LLMs on your own hardware with a simple command-line interface.

Quick Start

Get Ollama running in minutes with your first model

Installation

Install Ollama on macOS, Linux, Windows, or Docker

CLI Reference

Complete command-line interface documentation

REST API

Integrate Ollama into your applications

Key Features

Local & Private

Run models entirely on your own hardware with no data leaving your machine

Easy to Use

Simple CLI commands to pull, run, and manage models

Model Library

Access hundreds of open source models from ollama.com/library

REST API

Integrate with applications using the built-in HTTP API

Multiple Backends

Optimized performance with llama.cpp for CPU and GPU acceleration

Cross-Platform

Works on macOS, Linux, Windows, and Docker

Ollama supports a wide range of open source models:
  • Gemma 3 - Google’s latest language model family
  • Llama 3.2 - Meta’s powerful language model
  • Mistral - High-performance efficient models
  • Phi-3 - Microsoft’s compact but capable models
  • DeepSeek R1 - Advanced reasoning models
  • Qwen - Alibaba’s multilingual models
Browse all models at ollama.com/library

Use Cases

1. Chat with Models

Run interactive chat sessions with language models directly in your terminal
ollama run gemma3
2. Integrate with Applications

Use the REST API to add AI capabilities to your applications
curl http://localhost:11434/api/chat -d '{
  "model": "gemma3",
  "messages": [{"role": "user", "content": "Hello!"}]
}'
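
The same request can also be sent from code. The following is a minimal sketch using only the Python standard library, not an official client. It assumes a server is listening at http://localhost:11434 and that gemma3 has already been pulled. The `stream` field and the `message.content` response path are assumptions to check against the REST API reference, and the helper names `build_chat_request` and `chat` are hypothetical.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # default address from the curl example


def build_chat_request(prompt, model="gemma3", base_url=OLLAMA_URL):
    """Build (but do not send) a POST request for /api/chat."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        # Assumption: ask for a single JSON object instead of a stream.
        "stream": False,
    }
    return urllib.request.Request(
        f"{base_url}/api/chat",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chat(prompt, model="gemma3", base_url=OLLAMA_URL, timeout=120):
    """Send the request and return the reply text (needs a running server)."""
    request = build_chat_request(prompt, model, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    # Assumption: the reply text sits under message.content in the response.
    return body["message"]["content"]
```

With the server running, `print(chat("Hello!"))` should print the model's reply. Building the request separately from sending it keeps the payload easy to inspect without a live server.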
3. Customize Models

Create custom models with specific instructions using Modelfiles
ollama create mymodel -f Modelfile

Client Libraries

Integrate Ollama with your favorite programming language:

Python

pip install ollama
Library: ollama-python

JavaScript

npm i ollama
Library: ollama-js
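
For the Python client, a first call might look like the sketch below. It assumes the `ollama` package is installed, a local server is running, and gemma3 has been pulled. `build_messages` and `ask` are hypothetical helper names. The package is imported inside `ask` so that the message-building helper can be used without it.

```python
def build_messages(prompt, system=None):
    """Return a chat message list, optionally led by a system message."""
    messages = []
    if system:
        messages.append({"role": "system", "content": system})
    messages.append({"role": "user", "content": prompt})
    return messages


def ask(prompt, model="gemma3", system=None):
    """Send one prompt through the ollama package and return the reply text."""
    import ollama  # installed with `pip install ollama`; imported lazily

    response = ollama.chat(model=model, messages=build_messages(prompt, system))
    return response["message"]["content"]
```

Calling `ask("Hello!")` sends a single user message. Passing `system="Answer in one sentence."` prepends a system message to steer the reply.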

Community & Support

Discord

Join the Ollama community

Twitter/X

Follow for updates

Reddit

Discuss with users

Example: Quick Chat

# Pull and run a model
ollama run gemma3

>>> Send a message (/? for help)

The model is downloaded automatically on first run, after which you can start chatting immediately.

Ollama runs a local API server on port 11434 by default. The server starts automatically when you run a model.

What’s Next?

Follow the Quick Start

Get your first model running in under 5 minutes

Explore the Model Library

Browse hundreds of available models

Read the API Docs

Learn how to integrate Ollama into your applications

Create Custom Models

Customize models with Modelfiles