Ollama supports tool calling (also known as function calling), which lets a model invoke tools and incorporate their results into its replies.

Calling a single tool

Invoke a single tool and include its response in a follow-up request (also known as “single-shot” tool calling).
curl -s http://localhost:11434/api/chat -H "Content-Type: application/json" -d '{
  "model": "qwen3",
  "messages": [
    {"role": "user", "content": "What is the temperature in New York?"}
  ],
  "stream": false,
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "get_temperature",
        "description": "Get the current temperature for a city",
        "parameters": {
          "type": "object",
          "required": ["city"],
          "properties": {
            "city": {"type": "string", "description": "The name of the city"}
          }
        }
      }
    }
  ]
}'
Generate a response with the tool result:
curl -s http://localhost:11434/api/chat -H "Content-Type: application/json" -d '{
  "model": "qwen3",
  "messages": [
    {"role": "user", "content": "What is the temperature in New York?"},
    {
      "role": "assistant",
      "tool_calls": [
        {
          "function": {
            "name": "get_temperature",
            "arguments": {"city": "New York"}
          }
        }
      ]
    },
    {"role": "tool", "tool_name": "get_temperature", "content": "22°C"}
  ],
  "stream": false
}'
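
The second request above is the first request's messages plus the assistant's tool call and one tool message per call. The following is a minimal sketch of assembling that body in Python with plain dictionaries. The helper name `build_follow_up` and the `tool_outputs` mapping are hypothetical, and sending the request is left out.

```python
import json

def build_follow_up(first_request: dict, assistant_message: dict, tool_outputs: dict) -> dict:
    """Build the body of the second /api/chat request.

    tool_outputs maps a tool name to the string your code produced for it
    (a convention invented for this sketch, not part of the API).
    """
    messages = list(first_request["messages"])
    # Echo the assistant message so the model sees the call it made
    messages.append(assistant_message)
    for call in assistant_message.get("tool_calls", []):
        name = call["function"]["name"]
        messages.append({"role": "tool", "tool_name": name, "content": tool_outputs[name]})
    return {"model": first_request["model"], "messages": messages, "stream": False}

first_request = {
    "model": "qwen3",
    "messages": [{"role": "user", "content": "What is the temperature in New York?"}],
    "stream": False,
}
assistant_message = {
    "role": "assistant",
    "tool_calls": [
        {"function": {"name": "get_temperature", "arguments": {"city": "New York"}}}
    ],
}

follow_up = build_follow_up(first_request, assistant_message, {"get_temperature": "22°C"})
print(json.dumps(follow_up, indent=2, ensure_ascii=False))
```

Keying the outputs by tool name only works when each tool is called at most once per turn. If the same tool may be called several times, pair each result with its call by position instead.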

Parallel tool calling

A model can request several tool calls in a single response. Run each one, then send all of the tool results back to the model:
from ollama import chat

def get_temperature(city: str) -> str:
  """Get the current temperature for a city"""
  temperatures = {
    "New York": "22°C",
    "London": "15°C",
    "Tokyo": "18°C"
  }
  return temperatures.get(city, "Unknown")

def get_conditions(city: str) -> str:
  """Get the current weather conditions for a city"""
  conditions = {
    "New York": "Partly cloudy",
    "London": "Rainy",
    "Tokyo": "Sunny"
  }
  return conditions.get(city, "Unknown")

messages = [{'role': 'user', 'content': 'What are the current weather conditions and temperature in New York and London?'}]

# The Python client automatically parses functions as a tool schema
response = chat(model='qwen3', messages=messages, tools=[get_temperature, get_conditions], think=True)

# Add the assistant message to the messages
messages.append(response.message)
if response.message.tool_calls:
  # Process each tool call
  for call in response.message.tool_calls:
    # Execute the appropriate tool
    if call.function.name == 'get_temperature':
      result = get_temperature(**call.function.arguments)
    elif call.function.name == 'get_conditions':
      result = get_conditions(**call.function.arguments)
    else:
      result = 'Unknown tool'
    # Add the tool result to the messages
    messages.append({'role': 'tool', 'tool_name': call.function.name, 'content': str(result)})

  # Generate the final response
  final_response = chat(model='qwen3', messages=messages, tools=[get_temperature, get_conditions], think=True)
  print(final_response.message.content)
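
As the number of tools grows, the if/elif chain above can be swapped for a lookup table. The sketch below assumes each tool call exposes `function.name` and `function.arguments`, as in the example above. The helpers `run_tool_calls` and `fake_call` are hypothetical, and `fake_call` only stands in for the objects found in `response.message.tool_calls` so the block runs without a model.

```python
from types import SimpleNamespace

def get_temperature(city: str) -> str:
    """Get the current temperature for a city"""
    return {"New York": "22°C", "London": "15°C"}.get(city, "Unknown")

def get_conditions(city: str) -> str:
    """Get the current weather conditions for a city"""
    return {"New York": "Partly cloudy", "London": "Rainy"}.get(city, "Unknown")

REGISTRY = {"get_temperature": get_temperature, "get_conditions": get_conditions}

def run_tool_calls(tool_calls, registry):
    """Run every requested call and return one tool message per call, in order."""
    tool_messages = []
    for call in tool_calls:
        func = registry.get(call.function.name)
        result = func(**call.function.arguments) if func else "Unknown tool"
        tool_messages.append(
            {"role": "tool", "tool_name": call.function.name, "content": str(result)}
        )
    return tool_messages

def fake_call(name, **arguments):
    # Stand-in for one entry of response.message.tool_calls
    return SimpleNamespace(function=SimpleNamespace(name=name, arguments=arguments))

calls = [
    fake_call("get_temperature", city="New York"),
    fake_call("get_conditions", city="London"),
    fake_call("get_humidity", city="London"),  # not registered
]
tool_messages = run_tool_calls(calls, REGISTRY)
for message in tool_messages:
    print(message)
```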

Multi-turn tool calling (Agent loop)

An agent loop calls the model repeatedly, running whatever tools it requests and feeding the results back, until the model replies without requesting any more tool calls:
from ollama import chat, ChatResponse

def add(a: int, b: int) -> int:
  """Add two numbers"""
  return a + b

def multiply(a: int, b: int) -> int:
  """Multiply two numbers"""
  return a * b

available_functions = {
  'add': add,
  'multiply': multiply,
}

messages = [{'role': 'user', 'content': 'What is (11434+12341)*412?'}]
while True:
  response: ChatResponse = chat(
    model='qwen3',
    messages=messages,
    tools=[add, multiply],
    think=True,
  )
  messages.append(response.message)
  print("Thinking:", response.message.thinking)
  print("Content:", response.message.content)
  if response.message.tool_calls:
    for tc in response.message.tool_calls:
      if tc.function.name in available_functions:
        print(f"Calling {tc.function.name} with arguments {tc.function.arguments}")
        result = available_functions[tc.function.name](**tc.function.arguments)
        print(f"Result: {result}")
      else:
        # Report unknown tools to the model instead of silently skipping them
        result = f"Unknown tool: {tc.function.name}"
      # Add the tool result to the messages
      messages.append({'role': 'tool', 'tool_name': tc.function.name, 'content': str(result)})
  else:
    # End the loop when there are no more tool calls
    break
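
A `while True` loop keeps running for as long as the model keeps requesting tools, so it can be worth capping the number of turns. The sketch below is one possible way to do that, not part of the Ollama SDK. `run_agent`, `max_turns`, and the scripted stand-in for `chat` are all hypothetical, and the stand-in exists only so the block runs without a model.

```python
from types import SimpleNamespace

def add(a: int, b: int) -> int:
    """Add two numbers"""
    return a + b

def multiply(a: int, b: int) -> int:
    """Multiply two numbers"""
    return a * b

def run_agent(chat_fn, messages, functions, max_turns=5):
    """Call chat_fn until the model stops requesting tools or max_turns is hit."""
    for _ in range(max_turns):
        message = chat_fn(messages).message
        messages.append(message)
        if not message.tool_calls:
            return message.content  # no more tool calls: this is the final answer
        for tc in message.tool_calls:
            func = functions.get(tc.function.name)
            result = func(**tc.function.arguments) if func else f"Unknown tool: {tc.function.name}"
            messages.append({"role": "tool", "tool_name": tc.function.name, "content": str(result)})
    raise RuntimeError(f"No final answer after {max_turns} turns")

# Scripted replies that imitate a model working through (11434+12341)*412
def reply(content="", tool_calls=None):
    return SimpleNamespace(message=SimpleNamespace(content=content, tool_calls=tool_calls))

def call(name, **arguments):
    return SimpleNamespace(function=SimpleNamespace(name=name, arguments=arguments))

script = iter([
    reply(tool_calls=[call("add", a=11434, b=12341)]),
    reply(tool_calls=[call("multiply", a=23775, b=412)]),
    reply(content="The result is 9795300."),
])

messages = [{"role": "user", "content": "What is (11434+12341)*412?"}]
answer = run_agent(lambda msgs: next(script), messages, {"add": add, "multiply": multiply})
print(answer)
```

With a real model, `chat_fn` would wrap the SDK call, for example `lambda msgs: chat(model='qwen3', messages=msgs, tools=[add, multiply], think=True)`.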

Tool calling with streaming

When streaming, accumulate every chunk of thinking, content, and tool_calls. Then send the accumulated fields back in the follow-up request, together with any tool results. The full example is in the source documentation at /workspace/source/docs/capabilities/tool-calling.mdx, starting at line 586.
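
The accumulation step can be sketched as follows. This is an illustration rather than the documented example: it assumes each chunk exposes a `message` with optional `thinking`, `content`, and `tool_calls` fields, and that each tool call arrives whole in a single chunk. The names `accumulate` and `fake_chunk` are hypothetical, and the fake chunks stand in for what iterating over `chat(..., stream=True)` would yield.

```python
from types import SimpleNamespace

def accumulate(stream):
    """Merge streamed chunks into a single assistant message."""
    thinking, content, tool_calls = "", "", []
    for chunk in stream:
        message = chunk.message
        if getattr(message, "thinking", None):
            thinking += message.thinking
        if getattr(message, "content", None):
            content += message.content
        if getattr(message, "tool_calls", None):
            tool_calls.extend(message.tool_calls)
    return {
        "role": "assistant",
        "thinking": thinking,
        "content": content,
        "tool_calls": tool_calls,
    }

def fake_chunk(thinking=None, content=None, tool_calls=None):
    # Stand-in for one streamed chunk
    return SimpleNamespace(
        message=SimpleNamespace(thinking=thinking, content=content, tool_calls=tool_calls)
    )

stream = [
    fake_chunk(thinking="The user wants "),
    fake_chunk(thinking="the temperature."),
    fake_chunk(content="Let me "),
    fake_chunk(content="check."),
    fake_chunk(tool_calls=[{"function": {"name": "get_temperature", "arguments": {"city": "New York"}}}]),
]

assistant_message = accumulate(stream)
print(assistant_message)
```

The accumulated message is what gets appended to the history before the tool messages, so the follow-up request carries the complete assistant turn.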

Using functions as tools (Python SDK)

The Python SDK automatically parses functions as a tool schema:
from ollama import chat

def get_temperature(city: str) -> str:
  """Get the current temperature for a city

  Args:
    city: The name of the city

  Returns:
    The current temperature for the city
  """
  temperatures = {
    'New York': '22°C',
    'London': '15°C',
  }
  return temperatures.get(city, 'Unknown')

available_functions = {
  'get_temperature': get_temperature,
}

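# Example prompt; any conversation history works here
messages = [{'role': 'user', 'content': 'What is the temperature in New York?'}]

# Directly pass the function as part of the tools list
response = chat(
  model='qwen3',
  messages=messages,
  tools=list(available_functions.values()),
  think=True
)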
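
To make the mapping from a function to a tool schema concrete, here is a rough sketch of how a schema could be derived from a signature and docstring using only the standard library. It is not the SDK's implementation. `function_to_schema` and `JSON_TYPES` are hypothetical, it covers only a few simple types, and it does not read per-argument descriptions from the docstring.

```python
import inspect
from typing import get_type_hints

# Illustrative mapping for a handful of simple Python types
JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}

def function_to_schema(func):
    """Derive a tool schema from a function's signature and docstring."""
    hints = get_type_hints(func)
    doc = inspect.getdoc(func) or ""
    properties, required = {}, []
    for name, param in inspect.signature(func).parameters.items():
        # Fall back to "string" for types this sketch does not know about
        properties[name] = {"type": JSON_TYPES.get(hints.get(name), "string")}
        if param.default is inspect.Parameter.empty:
            required.append(name)
    return {
        "type": "function",
        "function": {
            "name": func.__name__,
            "description": doc.splitlines()[0] if doc else "",
            "parameters": {"type": "object", "required": required, "properties": properties},
        },
    }

def get_temperature(city: str) -> str:
    """Get the current temperature for a city

    Args:
      city: The name of the city
    """
    return {"New York": "22°C", "London": "15°C"}.get(city, "Unknown")

schema = function_to_schema(get_temperature)
print(schema)
```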

Tool schema format

Tools are defined using a JSON schema:
{
  "type": "function",
  "function": {
    "name": "get_temperature",
    "description": "Get the current temperature for a city",
    "parameters": {
      "type": "object",
      "required": ["city"],
      "properties": {
        "city": {
          "type": "string",
          "description": "The name of the city"
        }
      }
    }
  }
}
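
Because the schema lists required parameters and their types, it can also be used to check a model's arguments before running a tool. The following is a minimal sketch of such a check. `check_arguments` and `PYTHON_TYPES` are hypothetical, and only the `required`, `properties`, and `type` keys shown above are considered.

```python
PYTHON_TYPES = {"string": str, "integer": int, "number": (int, float), "boolean": bool}

def check_arguments(tool: dict, arguments: dict) -> list:
    """Return a list of problems; an empty list means the arguments fit the schema."""
    params = tool["function"]["parameters"]
    problems = [
        f"missing required argument: {name}"
        for name in params.get("required", [])
        if name not in arguments
    ]
    for name, value in arguments.items():
        spec = params["properties"].get(name)
        if spec is None:
            problems.append(f"unexpected argument: {name}")
            continue
        expected = PYTHON_TYPES.get(spec.get("type"))
        if expected and not isinstance(value, expected):
            problems.append(f"{name} should be of type {spec['type']}")
    return problems

tool = {
    "type": "function",
    "function": {
        "name": "get_temperature",
        "description": "Get the current temperature for a city",
        "parameters": {
            "type": "object",
            "required": ["city"],
            "properties": {
                "city": {"type": "string", "description": "The name of the city"}
            },
        },
    },
}

print(check_arguments(tool, {"city": "New York"}))
print(check_arguments(tool, {}))
print(check_arguments(tool, {"city": 5}))
```

Any problems found can be returned to the model as the content of the tool message, which gives it a chance to retry with corrected arguments.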

Tool call response format

When a model wants to call a tool, it returns a tool_calls array:
{
  "role": "assistant",
  "tool_calls": [
    {
      "function": {
        "name": "get_temperature",
        "arguments": {"city": "New York"}
      }
    }
  ]
}
You must then respond with a tool message:
{
  "role": "tool",
  "tool_name": "get_temperature",
  "content": "22°C"
}

Tips

  • Use descriptive function names and clear docstrings
  • Provide detailed parameter descriptions to help the model understand when to use each tool
  • Enable thinking (think=True in the Python examples) for better reasoning about which tools to use
  • Handle tool errors gracefully by returning error messages as tool results
  • For complex workflows, use an agent loop to let the model decide when it’s done
  • When streaming with tool calls, always accumulate every partial field (thinking, content, and tool_calls)
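
The tip about tool errors can be sketched as a small wrapper that turns any exception into a string the model can read. The names `call_tool_safely` and `divide` are hypothetical, and the exact wording of the error text is a choice made for this sketch.

```python
def call_tool_safely(func, arguments: dict) -> str:
    """Run a tool and return its result, or an error description, as a string."""
    try:
        return str(func(**arguments))
    except Exception as exc:
        # Covers bad arguments from the model (TypeError) and failures inside the tool
        return f"Error: {type(exc).__name__}: {exc}"

def divide(a: float, b: float) -> float:
    """Divide a by b"""
    return a / b

ok = call_tool_safely(divide, {"a": 6, "b": 3})
zero = call_tool_safely(divide, {"a": 1, "b": 0})
bad_args = call_tool_safely(divide, {"a": 1})

for content in (ok, zero, bad_args):
    # Each string can be sent back as the content of a tool message
    print({"role": "tool", "tool_name": "divide", "content": content})
```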