Tool Calling - Ollama

Ollama supports tool calling (also known as function calling), which allows a model to invoke tools and incorporate their results into its replies.

Calling a single tool

Invoke a single tool and include its response in a follow-up request (also known as “single-shot” tool calling).

The examples below are given for cURL, Python, and JavaScript, in that order.

curl -s http://localhost:11434/api/chat -H "Content-Type: application/json" -d '{
  "model": "qwen3",
  "messages": [{"role": "user", "content": "What is the temperature in New York?"}
  ],
  "stream": false,
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "get_temperature",
        "description": "Get the current temperature for a city",
        "parameters": {
          "type": "object",
          "required": ["city"],
          "properties": {
            "city": {"type": "string", "description": "The name of the city"}
          }
        }
      }
    }
  ]
}'

Generate a response with the tool result:

curl -s http://localhost:11434/api/chat -H "Content-Type: application/json" -d '{
  "model": "qwen3",
  "messages": [
    {"role": "user", "content": "What is the temperature in New York?"},
    {
      "role": "assistant",
      "tool_calls": [
        {
          "function": {
            "name": "get_temperature",
            "arguments": {"city": "New York"}
          }
        }
      ]
    },
    {"role": "tool", "tool_name": "get_temperature", "content": "22°C"}
  ],
  "stream": false
}'

from ollama import chat

def get_temperature(city: str) -> str:
  """Get the current temperature for a city

  Args:
    city: The name of the city

  Returns:
    The current temperature for the city
  """
  temperatures = {
    "New York": "22°C",
    "London": "15°C",
    "Tokyo": "18°C",
  }
  return temperatures.get(city, "Unknown")

messages = [{"role": "user", "content": "What is the temperature in New York?"}]

# Pass functions directly as tools in the tools list or as a JSON schema
response = chat(model="qwen3", messages=messages, tools=[get_temperature], think=True)

messages.append(response.message)
if response.message.tool_calls:
  # Handles only the first tool call; use this pattern when the model returns a single call
  call = response.message.tool_calls[0]
  result = get_temperature(**call.function.arguments)
  # Add the tool result to the messages
  messages.append({"role": "tool", "tool_name": call.function.name, "content": str(result)})

  final_response = chat(model="qwen3", messages=messages, tools=[get_temperature], think=True)
  print(final_response.message.content)

import ollama from 'ollama'

function getTemperature(city: string): string {
  const temperatures: Record<string, string> = {
    'New York': '22°C',
    'London': '15°C',
    'Tokyo': '18°C',
  }
  return temperatures[city] ?? 'Unknown'
}

const tools = [
  {
    type: 'function',
    function: {
      name: 'get_temperature',
      description: 'Get the current temperature for a city',
      parameters: {
        type: 'object',
        required: ['city'],
        properties: {
          city: { type: 'string', description: 'The name of the city' },
        },
      },
    },
  },
]

const messages = [{ role: 'user', content: "What is the temperature in New York?" }]

const response = await ollama.chat({
  model: 'qwen3',
  messages,
  tools,
  think: true,
})

messages.push(response.message)
if (response.message.tool_calls?.length) {
  // Handles only the first tool call; use this pattern when the model returns a single call
  const call = response.message.tool_calls[0]
  const args = call.function.arguments as { city: string }
  const result = getTemperature(args.city)
  // Add the tool result to the messages
  messages.push({ role: 'tool', tool_name: call.function.name, content: result })

  // Generate the final response
  const finalResponse = await ollama.chat({ model: 'qwen3', messages, tools, think: true })
  console.log(finalResponse.message.content)
}

Parallel tool calling

A model can request several tool calls in a single response. Run each one, then send all of the tool results back to the model in the follow-up request:

The examples below are given for Python and JavaScript, in that order.

from ollama import chat

def get_temperature(city: str) -> str:
  """Get the current temperature for a city"""
  temperatures = {
    "New York": "22°C",
    "London": "15°C",
    "Tokyo": "18°C"
  }
  return temperatures.get(city, "Unknown")

def get_conditions(city: str) -> str:
  """Get the current weather conditions for a city"""
  conditions = {
    "New York": "Partly cloudy",
    "London": "Rainy",
    "Tokyo": "Sunny"
  }
  return conditions.get(city, "Unknown")

messages = [{'role': 'user', 'content': 'What are the current weather conditions and temperature in New York and London?'}]

# The Python client automatically parses functions as a tool schema
response = chat(model='qwen3', messages=messages, tools=[get_temperature, get_conditions], think=True)

# Add the assistant message to the messages
messages.append(response.message)
if response.message.tool_calls:
  # Process each tool call
  for call in response.message.tool_calls:
    # Execute the appropriate tool
    if call.function.name == 'get_temperature':
      result = get_temperature(**call.function.arguments)
    elif call.function.name == 'get_conditions':
      result = get_conditions(**call.function.arguments)
    else:
      result = 'Unknown tool'
    # Add the tool result to the messages
    messages.append({'role': 'tool', 'tool_name': call.function.name, 'content': str(result)})

  # Generate the final response
  final_response = chat(model='qwen3', messages=messages, tools=[get_temperature, get_conditions], think=True)
  print(final_response.message.content)

import ollama from 'ollama'

function getTemperature(city: string): string {
  const temperatures: { [key: string]: string } = {
    "New York": "22°C",
    "London": "15°C",
    "Tokyo": "18°C"
  }
  return temperatures[city] || "Unknown"
}

function getConditions(city: string): string {
  const conditions: { [key: string]: string } = {
    "New York": "Partly cloudy",
    "London": "Rainy",
    "Tokyo": "Sunny"
  }
  return conditions[city] || "Unknown"
}

const tools = [
  {
    type: 'function',
    function: {
      name: 'get_temperature',
      description: 'Get the current temperature for a city',
      parameters: {
        type: 'object',
        required: ['city'],
        properties: {
          city: { type: 'string', description: 'The name of the city' },
        },
      },
    },
  },
  {
    type: 'function',
    function: {
      name: 'get_conditions',
      description: 'Get the current weather conditions for a city',
      parameters: {
        type: 'object',
        required: ['city'],
        properties: {
          city: { type: 'string', description: 'The name of the city' },
        },
      },
    },
  }
]

const messages = [{ role: 'user', content: 'What are the current weather conditions and temperature in New York and London?' }]

const response = await ollama.chat({
  model: 'qwen3',
  messages,
  tools,
  think: true
})

// Add the assistant message to the messages
messages.push(response.message)
if (response.message.tool_calls) {
  // Process each tool call
  for (const call of response.message.tool_calls) {
    // Execute the appropriate tool
    let result: string
    if (call.function.name === 'get_temperature') {
      const args = call.function.arguments as { city: string }
      result = getTemperature(args.city)
    } else if (call.function.name === 'get_conditions') {
      const args = call.function.arguments as { city: string }
      result = getConditions(args.city)
    } else {
      result = 'Unknown tool'
    }
    // Add the tool result to the messages
    messages.push({ role: 'tool', tool_name: call.function.name, content: result })
  }

  // Generate the final response
  const finalResponse = await ollama.chat({ model: 'qwen3', messages, tools, think: true })
  console.log(finalResponse.message.content)
}

Multi-turn tool calling (Agent loop)

An agent loop lets the model decide when to invoke tools and incorporate their results into its replies. The loop repeats until the model responds without requesting a tool call:

The examples below are given for Python and JavaScript, in that order.

from ollama import chat, ChatResponse

def add(a: int, b: int) -> int:
  """Add two numbers"""
  return a + b

def multiply(a: int, b: int) -> int:
  """Multiply two numbers"""
  return a * b

available_functions = {
  'add': add,
  'multiply': multiply,
}

messages = [{'role': 'user', 'content': 'What is (11434+12341)*412?'}]
while True:
  response: ChatResponse = chat(
    model='qwen3',
    messages=messages,
    tools=[add, multiply],
    think=True,
  )
  messages.append(response.message)
  print("Thinking:", response.message.thinking)
  print("Content:", response.message.content)
  if response.message.tool_calls:
    for tc in response.message.tool_calls:
      if tc.function.name in available_functions:
        print(f"Calling {tc.function.name} with arguments {tc.function.arguments}")
        result = available_functions[tc.function.name](**tc.function.arguments)
        print(f"Result: {result}")
        # Add the tool result to the messages
        messages.append({'role': 'tool', 'tool_name': tc.function.name, 'content': str(result)})
  else:
    # End the loop when there are no more tool calls
    break

import ollama from 'ollama'

type ToolName = 'add' | 'multiply'

function add(a: number, b: number): number {
  return a + b
}

function multiply(a: number, b: number): number {
  return a * b
}

const availableFunctions: Record<ToolName, (a: number, b: number) => number> = {
  add,
  multiply,
}

const tools = [
  {
    type: 'function',
    function: {
      name: 'add',
      description: 'Add two numbers',
      parameters: {
        type: 'object',
        required: ['a', 'b'],
        properties: {
          a: { type: 'integer', description: 'The first number' },
          b: { type: 'integer', description: 'The second number' },
        },
      },
    },
  },
  {
    type: 'function',
    function: {
      name: 'multiply',
      description: 'Multiply two numbers',
      parameters: {
        type: 'object',
        required: ['a', 'b'],
        properties: {
          a: { type: 'integer', description: 'The first number' },
          b: { type: 'integer', description: 'The second number' },
        },
      },
    },
  },
]

async function agentLoop() {
  const messages = [{ role: 'user', content: 'What is (11434+12341)*412?' }]

  while (true) {
    const response = await ollama.chat({
      model: 'qwen3',
      messages,
      tools,
      think: true,
    })

    messages.push(response.message)
    console.log('Thinking:', response.message.thinking)
    console.log('Content:', response.message.content)

    const toolCalls = response.message.tool_calls ?? []
    if (toolCalls.length) {
      for (const call of toolCalls) {
        const fn = availableFunctions[call.function.name as ToolName]
        if (!fn) {
          continue
        }

        const args = call.function.arguments as { a: number; b: number }
        console.log(`Calling ${call.function.name} with arguments`, args)
        const result = fn(args.a, args.b)
        console.log(`Result: ${result}`)
        messages.push({ role: 'tool', tool_name: call.function.name, content: String(result) })
      }
    } else {
      break
    }
  }
}

agentLoop().catch(console.error)
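
The loops above keep running for as long as the model keeps requesting tools. The following is a minimal sketch of the same control flow with an upper bound on the number of turns, under stated assumptions: `run_agent`, `chat_fn`, `max_turns`, and `fake_chat` are hypothetical names, messages are plain dicts rather than SDK objects, and `fake_chat` is a scripted stand-in for a call such as `chat(model=..., messages=..., tools=...)` so the loop can run without a server.

```python
from typing import Any, Callable, Dict, List

Message = Dict[str, Any]


def run_agent(
    chat_fn: Callable[[List[Message]], Message],
    messages: List[Message],
    available_functions: Dict[str, Callable[..., Any]],
    max_turns: int = 8,
) -> Message:
    """Loop until the model replies without tool calls, or give up after max_turns."""
    for _ in range(max_turns):
        reply = chat_fn(messages)
        messages.append(reply)
        tool_calls = reply.get("tool_calls") or []
        if not tool_calls:
            return reply  # no tool calls left: this is the final answer
        for call in tool_calls:
            name = call["function"]["name"]
            fn = available_functions.get(name)
            # Answer every call, including unknown tools, so no request is left open
            content = str(fn(**call["function"]["arguments"])) if fn else f"Unknown tool: {name}"
            messages.append({"role": "tool", "tool_name": name, "content": content})
    raise RuntimeError(f"No final answer after {max_turns} turns")


def fake_chat(messages: List[Message]) -> Message:
    """Scripted stand-in for a model: add first, multiply second, then answer."""
    results = [m["content"] for m in messages if m["role"] == "tool"]
    if len(results) == 0:
        call = {"name": "add", "arguments": {"a": 11434, "b": 12341}}
    elif len(results) == 1:
        call = {"name": "multiply", "arguments": {"a": int(results[0]), "b": 412}}
    else:
        return {"role": "assistant", "content": f"The result is {results[1]}."}
    return {"role": "assistant", "content": "", "tool_calls": [{"function": call}]}


tools = {"add": lambda a, b: a + b, "multiply": lambda a, b: a * b}
history = [{"role": "user", "content": "What is (11434+12341)*412?"}]
final = run_agent(fake_chat, history, tools)
print(final["content"])
```

Raising once the cap is reached is one possible design choice; returning the last reply instead would also work if a partial answer is acceptable.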

Tool calling with streaming

When streaming, gather every chunk of thinking, content, and tool_calls, then return those fields together with any tool results in the follow-up request. See the full example in the source documentation at /workspace/source/docs/capabilities/tool-calling.mdx starting at line 586.

Using functions as tools (Python SDK)

The Python SDK automatically parses functions as a tool schema:

from ollama import chat

def get_temperature(city: str) -> str:
  """Get the current temperature for a city

  Args:
    city: The name of the city

  Returns:
    The current temperature for the city
  """
  temperatures = {
    'New York': '22°C',
    'London': '15°C',
  }
  return temperatures.get(city, 'Unknown')

available_functions = {
  'get_temperature': get_temperature,
}

messages = [{'role': 'user', 'content': 'What is the temperature in New York?'}]

# Pass the functions directly in the tools list
response = chat(
  model='qwen3',
  messages=messages,
  tools=list(available_functions.values()),
  think=True
)

Tool schema format

Tools are defined using a JSON schema:

{
  "type": "function",
  "function": {
    "name": "get_temperature",
    "description": "Get the current temperature for a city",
    "parameters": {
      "type": "object",
      "required": ["city"],
      "properties": {
        "city": {
          "type": "string",
          "description": "The name of the city"
        }
      }
    }
  }
}
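
To make the relationship between a Python function and the schema above concrete, here is a simplified sketch that derives a schema of the same shape from a function's signature and docstring. It is not the SDK's actual implementation: `function_to_tool` and `_JSON_TYPES` are hypothetical names, only a few primitive types are mapped, and parameter descriptions from the `Args:` section are not parsed.

```python
import inspect
from typing import Any, Callable, Dict, get_type_hints

# Assumed mapping from Python annotations to JSON schema type names
_JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}


def function_to_tool(fn: Callable[..., Any]) -> Dict[str, Any]:
    """Build a tool schema from a function's name, type hints, and docstring."""
    hints = get_type_hints(fn)
    properties: Dict[str, Dict[str, str]] = {}
    required = []
    for name, param in inspect.signature(fn).parameters.items():
        # Unannotated or unmapped parameters fall back to "string"
        properties[name] = {"type": _JSON_TYPES.get(hints.get(name), "string")}
        if param.default is inspect.Parameter.empty:
            required.append(name)
    doc = inspect.getdoc(fn) or ""
    return {
        "type": "function",
        "function": {
            "name": fn.__name__,
            # Use the first docstring line as the description
            "description": doc.splitlines()[0] if doc else "",
            "parameters": {
                "type": "object",
                "required": required,
                "properties": properties,
            },
        },
    }


def get_temperature(city: str) -> str:
    """Get the current temperature for a city

    Args:
      city: The name of the city
    """
    return "22°C"


schema = function_to_tool(get_temperature)
```

The resulting `schema` has the same structure as the JSON above, minus the per-parameter `description`, which is why writing the schema by hand can still be worthwhile when parameters need explanation.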

Tool call response format

When a model wants to call a tool, it returns a tool_calls array:

{
  "role": "assistant",
  "tool_calls": [
    {
      "function": {
        "name": "get_temperature",
        "arguments": {"city": "New York"}
      }
    }
  ]
}

You must then respond with a tool message:

{
  "role": "tool",
  "tool_name": "get_temperature",
  "content": "22°C"
}
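
When working with the REST API directly, the two formats above can be connected with a small helper. The sketch below assumes the response body has been trimmed to its `message` field; `follow_up_messages` is a hypothetical name, and the lookup table plays the same role as `available_functions` in the agent loop example.

```python
import json
from typing import Any, Callable, Dict, List


def follow_up_messages(
    history: List[Dict[str, Any]],
    response_body: str,
    available_functions: Dict[str, Callable[..., Any]],
) -> List[Dict[str, Any]]:
    """Return history + assistant message + one tool message per tool call."""
    assistant = json.loads(response_body)["message"]
    messages = history + [assistant]
    for call in assistant.get("tool_calls", []):
        fn = call["function"]
        result = available_functions[fn["name"]](**fn["arguments"])
        messages.append({"role": "tool", "tool_name": fn["name"], "content": str(result)})
    return messages


# Trimmed, illustrative response body containing only the assistant message
body = json.dumps({
    "message": {
        "role": "assistant",
        "tool_calls": [
            {"function": {"name": "get_temperature", "arguments": {"city": "New York"}}}
        ],
    }
})

messages = follow_up_messages(
    [{"role": "user", "content": "What is the temperature in New York?"}],
    body,
    {"get_temperature": lambda city: {"New York": "22°C"}.get(city, "Unknown")},
)
```

The returned list matches the `messages` array sent in the second cURL request of the single-tool example.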

Tips

Use descriptive function names and clear docstrings
Provide detailed parameter descriptions to help the model understand when to use each tool
Enable think mode for better reasoning about which tools to use
Handle tool errors gracefully by returning error messages as tool results
For complex workflows, use an agent loop to let the model decide when it’s done
Always accumulate all partial fields when streaming with tool calls
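
The tip about returning errors as tool results can be sketched as a small wrapper. This is one possible approach under stated assumptions: `safe_tool_message` and the `Error: ...` text format are hypothetical choices, not part of the Ollama API.

```python
from typing import Any, Callable, Dict


def safe_tool_message(
    name: str,
    arguments: Dict[str, Any],
    available_functions: Dict[str, Callable[..., Any]],
) -> Dict[str, str]:
    """Run a tool and always return a tool message, even when the call fails."""
    fn = available_functions.get(name)
    if fn is None:
        content = f"Error: unknown tool '{name}'"
    else:
        try:
            content = str(fn(**arguments))
        except Exception as exc:
            # Report the failure to the model instead of crashing the loop
            content = f"Error: {type(exc).__name__}: {exc}"
    return {"role": "tool", "tool_name": name, "content": content}


def divide(a: int, b: int) -> float:
    """Divide a by b"""
    return a / b


registry = {"divide": divide}
ok = safe_tool_message("divide", {"a": 6, "b": 3}, registry)
failed = safe_tool_message("divide", {"a": 1, "b": 0}, registry)
missing = safe_tool_message("nope", {}, registry)
```

Because the error text is sent back as an ordinary tool result, the model can read it and retry with different arguments or explain the failure to the user.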
