AI Engineering Academy · Lesson

Parallel Function Calling

Handle responses where the model calls multiple functions simultaneously, execute them in parallel with asyncio, and batch the results into a single follow-up API call.

What Is Parallel Function Calling?

OpenAI's models can request several function calls in a single response when the answer depends on information from independent sources. Instead of chaining tool calls sequentially, with each round trip waiting on the previous one, the model emits all of the tool calls at once. Your application executes them in parallel and sends every result back together, so the total wait is roughly that of the slowest call plus one follow-up request, rather than the sum of all calls with a round trip for each.
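To make the latency argument concrete, here is a minimal sketch that compares running two independent lookups one after the other with running them concurrently. The function fake_lookup and its 0.1 second delay are hypothetical stand-ins for a real tool call, not part of any API.

```python
import asyncio
import time

async def fake_lookup(city: str, delay: float = 0.1) -> str:
    """Hypothetical stand-in for a slow, independent tool call."""
    await asyncio.sleep(delay)  # simulates network latency
    return f"weather for {city}"

async def run_sequential(cities):
    # Each lookup waits for the previous one: total time is about the sum of delays
    return [await fake_lookup(c) for c in cities]

async def run_parallel(cities):
    # All lookups start together: total time is about the slowest single delay
    return await asyncio.gather(*(fake_lookup(c) for c in cities))

def timed(coro_fn, cities):
    """Run one of the strategies above and return (results, elapsed seconds)."""
    start = time.perf_counter()
    results = asyncio.run(coro_fn(cities))
    return results, time.perf_counter() - start
```

Calling timed(run_sequential, ['London', 'Tokyo']) and timed(run_parallel, ['London', 'Tokyo']) should return the same results, with the parallel version finishing in roughly half the time for two equal delays.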

Recognizing Parallel Tool Calls

When the model issues parallel tool calls, the response message contains a tool_calls list with more than one entry. Each entry has a unique id, a function name, and arguments encoded as a JSON string. You must process all of them before making the follow-up API call: the model expects one result for every tool call it issued, each returned as a tool message carrying the matching tool_call_id.
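The requirement that every tool call receives a result can be checked before the follow-up request is sent. The sketch below shows the shape of a tool result message and a small coverage check. Both helper names (tool_result_message, unanswered_ids) are illustrative and do not come from the OpenAI SDK.

```python
import json

def tool_result_message(tool_call_id: str, result) -> dict:
    """Build one 'tool' message answering the tool call with the given id."""
    return {
        "role": "tool",
        "tool_call_id": tool_call_id,   # must match the id the model issued
        "content": json.dumps(result),  # content is sent as a string
    }

def unanswered_ids(tool_call_ids, tool_messages) -> list:
    """Return issued ids that have no matching tool message, in issue order."""
    answered = {m["tool_call_id"] for m in tool_messages}
    return [cid for cid in tool_call_ids if cid not in answered]
```

If unanswered_ids returns a non-empty list, the follow-up request is incomplete and should not be sent until the missing results (or error payloads) are added.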

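from openai import OpenAI
import json

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Illustrative schema for the one tool the model may call; substitute your own
tools = [{
    'type': 'function',
    'function': {
        'name': 'get_current_weather',
        'description': 'Get the current weather for a city.',
        'parameters': {
            'type': 'object',
            'properties': {'location': {'type': 'string', 'description': 'City name, e.g. London'}},
            'required': ['location'],
        },
    },
}]

# A question that naturally requires two independent lookups
response = client.chat.completions.create(
    model='gpt-4o',
    messages=[{'role': 'user', 'content': 'Compare the weather in London and Tokyo right now.'}],
    tools=tools,
)

message = response.choices[0].message
# tool_calls is None when the model answers directly, so fall back to an empty list
tool_calls = message.tool_calls or []
print('Number of tool calls:', len(tool_calls))
# Might print: Number of tool calls: 2

for tc in tool_calls:
    args = json.loads(tc.function.arguments)  # arguments arrive as a JSON string
    print(f'  {tc.function.name}({args})')
# Example output:
#   get_current_weather({'location': 'London'})
#   get_current_weather({'location': 'Tokyo'})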
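Once the tool calls have been recognized, they can be executed concurrently and wrapped as tool messages for the follow-up request. What follows is a sketch under stated assumptions, not a definitive implementation: get_current_weather is a stub returning placeholder data, REGISTRY and fake_call are hypothetical helpers, and the follow-up request is shown only in comments because it needs a live API key.

```python
import asyncio
import json
from types import SimpleNamespace

async def get_current_weather(location: str) -> dict:
    """Hypothetical tool implementation; a real one would call a weather API."""
    await asyncio.sleep(0.05)  # simulated I/O wait
    return {"location": location, "temp_c": None, "note": "stub data"}

# Maps function names from the schema to the coroutines that implement them
REGISTRY = {"get_current_weather": get_current_weather}

async def run_one(tc) -> dict:
    """Execute one tool call and wrap its result as a 'tool' message."""
    try:
        args = json.loads(tc.function.arguments)
        result = await REGISTRY[tc.function.name](**args)
    except Exception as exc:
        # Report the failure to the model instead of leaving the call unanswered
        result = {"error": str(exc)}
    return {"role": "tool", "tool_call_id": tc.id, "content": json.dumps(result)}

async def run_all(tool_calls) -> list:
    """Run every tool call concurrently; results come back in issue order."""
    return await asyncio.gather(*(run_one(tc) for tc in tool_calls))

def fake_call(call_id, name, **args):
    """Stand-in mimicking the shape of an SDK tool call object, for local testing."""
    return SimpleNamespace(
        id=call_id,
        function=SimpleNamespace(name=name, arguments=json.dumps(args)),
    )

# Follow-up request (not executed here; assumes client, tools, and message
# from the lesson code, plus the original user message as user_message):
# tool_messages = asyncio.run(run_all(message.tool_calls))
# messages = [user_message, message, *tool_messages]
# final = client.chat.completions.create(model='gpt-4o', messages=messages, tools=tools)
```

The assistant message that contains tool_calls goes into the follow-up conversation before the tool messages, so the model can match each result to the call it made. Catching exceptions per call is one possible design choice: it keeps a single failing tool from leaving any tool_call_id unanswered.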

All lessons in this course

  1. Defining Function Schemas for the API
  2. Processing Tool Calls in Your Application
  3. Parallel Function Calling
  4. Building a Natural Language Database Interface