
How to Implement Function Calling with LLMs

Implementing function calling requires defining tool schemas that describe your functions to the model, handling the structured tool call output the model generates, executing the corresponding functions, and feeding results back in a loop until the model produces a final response. The pattern is the same across Claude, GPT-4, and Gemini, though the exact API fields differ by provider. Once implemented, your LLM can invoke any function you expose, transforming it from a text generator into a capable agent.

Before You Start

You need an API key from your model provider (Anthropic for Claude, OpenAI for GPT-4, or Google for Gemini). You need a working API client in your language of choice, either the official SDK or a raw HTTP client. You need at least one function you want the model to call, something concrete like a database lookup, weather API, or calculator. Starting with a simple, deterministic function makes debugging easier because you can predict what the correct output should be.
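As an illustration, a minimal deterministic starter tool might look like the hypothetical `add_numbers` below: the same inputs always produce the same output, so any discrepancy during debugging points to the model's arguments rather than the function itself.

```python
# A deterministic starter tool: predictable output makes it easy to verify
# that the model passed the arguments you expected.
def add_numbers(a: float, b: float) -> dict:
    return {"result": a + b}
```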

This guide uses the Anthropic Claude API for examples because its tool use implementation is clean and well-documented, but the concepts transfer directly to other providers. The provider comparison guide covers the specific differences between Claude, GPT-4, and Gemini function calling.

Step-by-Step Implementation

Step 1: Define your tool schemas.
Each tool needs a name, a description, and an input_schema that describes the parameters using JSON Schema. The description is not just documentation; the model reads it to decide when to use the tool, so write it as if you are explaining the tool's purpose to a colleague who needs to decide when to call it.
tools = [
    {
        "name": "get_weather",
        "description": "Returns current weather conditions for a city. Use this when the user asks about weather, temperature, or outdoor conditions for a specific location.",
        "input_schema": {
            "type": "object",
            "properties": {
                "city": {
                    "type": "string",
                    "description": "City name, e.g. 'San Francisco' or 'London'"
                },
                "units": {
                    "type": "string",
                    "enum": ["fahrenheit", "celsius"],
                    "description": "Temperature units. Defaults to fahrenheit for US cities, celsius for others."
                }
            },
            "required": ["city"]
        }
    }
]
Step 2: Register tools with the model API.
Pass your tool definitions to the model alongside your messages. With the Anthropic SDK, tools are a top-level parameter on the message creation call. The model receives these definitions as part of its context and uses them to decide when to generate tool calls.
import anthropic

client = anthropic.Anthropic()

response = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=1024,
    tools=tools,
    messages=[
        {"role": "user", "content": "What's the weather like in Tokyo?"}
    ]
)
Step 3: Handle tool call responses.
When the model decides to use a tool, it returns a response with stop_reason="tool_use" instead of stop_reason="end_turn". The response content includes a tool_use block with the tool name, arguments, and a unique ID that you will use to send the result back. Check the stop_reason on every response to determine whether you need to execute a tool or can present the text response to the user.
# Check if the model wants to call a tool
for block in response.content:
    if block.type == "tool_use":
        tool_name = block.name      # "get_weather"
        tool_input = block.input    # {"city": "Tokyo"}
        tool_use_id = block.id      # unique ID for this call
Step 4: Execute the function and return results.
Map the tool name to your actual function, run it with the provided arguments, and send the result back to the model as a tool_result message. The tool_use_id links the result to the specific call the model made, which is important when handling parallel tool calls.
import json

# Execute the actual function
def get_weather(city, units="celsius"):
    # Your real implementation: call a weather API, query a database, etc.
    return {"temp": 22, "condition": "partly cloudy", "humidity": 65}

result = get_weather(**tool_input)

# Send the result back to the model
followup = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=1024,
    tools=tools,
    messages=[
        {"role": "user", "content": "What's the weather like in Tokyo?"},
        {"role": "assistant", "content": response.content},
        {"role": "user", "content": [
            {
                "type": "tool_result",
                "tool_use_id": tool_use_id,
                "content": json.dumps(result)
            }
        ]}
    ]
)
Step 5: Build the multi-turn tool loop.
In many interactions, the model needs to call multiple tools sequentially or process one result and then call another tool. Build a loop that continues sending tool results to the model until the response has stop_reason="end_turn", indicating the model is done calling tools and has produced a final text response for the user.
messages = [{"role": "user", "content": user_input}]

while True:
    response = client.messages.create(
        model="claude-sonnet-4-6",
        max_tokens=1024,
        tools=tools,
        messages=messages
    )

    # Add the assistant's response to the message history
    messages.append({"role": "assistant", "content": response.content})

    # If no more tool calls, we have the final response
    if response.stop_reason == "end_turn":
        break

    # Process all tool calls in this response
    tool_results = []
    for block in response.content:
        if block.type == "tool_use":
            result = execute_tool(block.name, block.input)  # defined in Step 6
            tool_results.append({
                "type": "tool_result",
                "tool_use_id": block.id,
                "content": json.dumps(result)
            })
    messages.append({"role": "user", "content": tool_results})
Step 6: Add error handling and timeouts.
Wrap every tool execution in try/except (or try/catch) blocks. When a tool fails, return the error message as the tool result instead of crashing the loop. Include enough context in the error for the model to either retry with different parameters or explain the problem to the user. Set timeouts on external API calls to prevent a single slow tool from blocking the entire conversation.
# Map tool names to their implementations
tool_registry = {"get_weather": get_weather}

def execute_tool(name, arguments):
    try:
        func = tool_registry[name]
        result = func(**arguments)
        return {"success": True, "data": result}
    except KeyError:
        return {"success": False, "error": f"Unknown tool: {name}"}
    except TimeoutError:
        return {"success": False, "error": f"{name} timed out after 10 seconds. The service may be temporarily unavailable."}
    except Exception as e:
        return {"success": False, "error": f"{name} failed: {str(e)}"}

Common Pitfalls

The most common implementation mistake is not building the tool loop. Developers handle the first tool call correctly but then present the result directly to the user instead of sending it back to the model. The model needs to see its own tool results to generate a natural language response that interprets the data for the user. Without the loop, you get raw JSON shown to users instead of conversational responses.

Another frequent issue is forgetting to include the full message history in each API call. Unlike a chat session that maintains server-side state, API calls are stateless. Every call must include the entire conversation history including previous tool calls and results. Missing history causes the model to lose context and produce incoherent responses.

Tool definitions that are too vague cause the model to call tools incorrectly or at inappropriate times. If your tool description says "processes data" rather than "retrieves the order details for a specific order ID including status, shipping info, and line items," the model has to guess when to use it and often guesses wrong. Invest time in precise, detailed descriptions that disambiguate each tool from the others.
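As a concrete illustration (the order-lookup tool here is hypothetical), compare a vague definition with one that tells the model exactly what it returns and when to call it:

```python
# Too vague: the model must guess what this does and when it applies.
vague = {
    "name": "process_data",
    "description": "Processes data.",
    "input_schema": {
        "type": "object",
        "properties": {"id": {"type": "string"}},
        "required": ["id"],
    },
}

# Precise: states what it returns and the situations that call for it.
precise = {
    "name": "get_order_details",
    "description": (
        "Retrieves the order details for a specific order ID, including "
        "status, shipping info, and line items. Use this when the user "
        "asks about an existing order's status or contents."
    ),
    "input_schema": {
        "type": "object",
        "properties": {
            "order_id": {
                "type": "string",
                "description": "The order ID, e.g. 'ORD-1042'",
            }
        },
        "required": ["order_id"],
    },
}
```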

Finally, watch for infinite loops. If the model repeatedly calls the same tool with the same parameters (perhaps because the result does not satisfy its expectations), the loop runs indefinitely. Add a maximum iteration count (typically 10 to 20 cycles) and break out with an error message to the model explaining the limit was reached.
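The loop from Step 5 can be capped along these lines. This sketch uses plain dicts in place of SDK response objects, and run_agent, call_model, and execute_tool are illustrative stand-ins rather than Anthropic SDK names:

```python
MAX_ITERATIONS = 15  # cap on tool-call round trips

def run_agent(call_model, execute_tool, messages):
    """Drive the tool loop, bailing out after MAX_ITERATIONS round trips.

    call_model wraps the model API; execute_tool dispatches to your functions.
    Responses are simplified to dicts with "stop_reason" and "content" keys.
    """
    for _ in range(MAX_ITERATIONS):
        response = call_model(messages)
        messages.append({"role": "assistant", "content": response["content"]})
        if response["stop_reason"] == "end_turn":
            return response  # model is done calling tools
        tool_results = [
            {"type": "tool_result", "tool_use_id": b["id"],
             "content": execute_tool(b["name"], b["input"])}
            for b in response["content"] if b["type"] == "tool_use"
        ]
        messages.append({"role": "user", "content": tool_results})
    # Limit reached: tell the model instead of looping forever
    messages.append({"role": "user", "content":
                     "Tool iteration limit reached; answer with what you have."})
    return call_model(messages)
```

Surfacing the limit to the model, rather than silently breaking, gives it a chance to summarize partial results for the user.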

Next Steps

Once basic function calling works, move to designing better tool schemas for higher first-attempt accuracy, chaining multiple tools for complex workflows, and robust error handling for production reliability. For agents with many tools, the tool routing guide covers strategies for selecting the right tool from a large set.

Add persistent memory to your tool-using agent. Adaptive Recall lets your agent remember tool outcomes, learn usage patterns, and improve over time through cognitive scoring.
