How to Design Tool Schemas for AI Agents
Before You Start
You need a working function calling implementation so you can test your schemas with real model calls. Read the function calling implementation guide first if you have not set that up. You also need a clear list of the functions you want to expose as tools. Trying to design schemas in the abstract, without concrete functions and use cases, produces schemas that look good on paper but fail in practice.
Step-by-Step Schema Design
The tool name is the first thing the model evaluates when deciding which tool to use. Use verb-noun pairs that communicate exactly what the tool does:
get_order_status, create_support_ticket, search_products, send_notification. The verb tells the model what kind of operation this is (read vs. write vs. search), and the noun tells it what entity is involved. Avoid generic names like process, handle, or run_query that force the model to rely entirely on the description.
Consistency across tool names reduces selection errors. If you have CRUD operations on multiple entities, use the same verb pattern: get_customer, get_order, get_product rather than fetch_customer, retrieve_order, load_product. The model learns the naming convention and applies it when selecting tools, which is harder when every tool uses a different naming style.
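The naming convention can even be enforced mechanically. Below is a minimal sketch of a lint check for a verb_noun convention; the approved verb list and tool names are illustrative assumptions, not a required API:

```python
# Sketch: enforce a verb_noun naming convention across a tool set.
# APPROVED_VERBS is an illustrative choice -- adapt it to your own tools.
import re

APPROVED_VERBS = {"get", "create", "update", "delete", "search", "send"}

def check_tool_name(name: str) -> bool:
    """Return True if the name is verb_noun with an approved verb."""
    match = re.match(r"^([a-z]+)_[a-z_]+$", name)
    return bool(match) and match.group(1) in APPROVED_VERBS

tools = ["get_customer", "get_order", "get_product", "create_support_ticket"]
assert all(check_tool_name(t) for t in tools)
assert not check_tool_name("process")         # generic verb, no noun
assert not check_tool_name("retrieve_order")  # off-convention verb
```

Running a check like this in CI keeps new tools from drifting into mixed naming styles.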
The description is the most important field in the schema because it is what the model reads to decide when and why to use the tool. A good description answers four questions: What does this tool do? When should you use it? What does it return? How is it different from similar tools?
Compare these two descriptions for the same tool:
```
# Bad: vague, tells the model almost nothing useful
"description": "Gets user information"

# Good: precise, helps the model make selection decisions
"description": "Retrieves the full profile for a user by their user ID or email address. Returns name, email, subscription tier, signup date, and recent activity summary. Use this when you need to look up a specific user's account details. For searching users by criteria (name, company, or plan), use search_users instead."
```

The good description tells the model what parameters to pass (user ID or email), what it gets back (specific fields), when to use it (looking up a specific user), and when not to use it (searching by criteria). This level of detail prevents the model from calling get_user when it should call search_users, a common error with vague descriptions.
Every constraint you add to a parameter schema reduces the space of possible errors. The model generates parameter values by reasoning about the schema definition and the user's input. The more constraints the schema provides, the less the model has to guess.
Use enum for any parameter that has a fixed set of valid values. If a status field can only be "open", "in_progress", "resolved", or "closed", define those as an enum rather than allowing any string. The model will always select from the enum values, eliminating the possibility of invalid status strings. Use minimum and maximum for numeric fields with valid ranges. If a quantity must be between 1 and 100, say so in the schema rather than hoping the model infers it. Use format hints for strings that follow standard patterns: "format": "date" for dates, "format": "email" for emails, "format": "uri" for URLs.
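Put together, those constraints look like the schema sketch below. The tool fields (status, quantity, customer_email, due_date) are hypothetical; the keywords (enum, minimum, maximum, format) are standard JSON Schema, though how strictly format is enforced depends on your validator:

```python
# Sketch: a parameter schema that constrains values instead of hoping
# the model infers them. Field names are illustrative.
ticket_schema = {
    "type": "object",
    "properties": {
        "status": {
            "type": "string",
            # Fixed set of valid values -- the model selects, never invents.
            "enum": ["open", "in_progress", "resolved", "closed"],
            "description": "Current ticket status.",
        },
        "quantity": {
            "type": "integer",
            "minimum": 1,      # stated range beats an inferred one
            "maximum": 100,
            "description": "Number of items, between 1 and 100.",
        },
        "customer_email": {
            "type": "string",
            "format": "email",
            "description": "Customer email address. Case-insensitive.",
        },
        "due_date": {
            "type": "string",
            "format": "date",
            "description": "Due date in YYYY-MM-DD format.",
        },
    },
    "required": ["status"],
}
```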
Write parameter descriptions that include format expectations and edge case guidance. "Customer email address. Case-insensitive. If the user provides a name instead of an email, ask for the email rather than guessing" is far better than "email". The description should preempt the most common parameter generation errors you have observed.
Every required parameter is a potential failure point. If the model cannot extract a required value from the conversation, it either hallucinates a value, asks the user (adding a turn of latency), or fails entirely. Only require parameters that the function truly cannot execute without. Default everything else.
For example, a search function might accept query, category, date_range, sort_order, and limit. The only truly required parameter is query. Category can default to "all", date_range can default to "all time", sort_order can default to "relevance", and limit can default to 10. With only query as required, the model can call the tool successfully in the common case where the user just asks a question without specifying filters. The optional parameters are available when the user does specify them, but they do not block the basic use case.
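Server-side, this pattern is just a defaults merge. A minimal sketch, using the same hypothetical search parameters and defaults as above:

```python
# Sketch: only `query` is required; everything else falls back to a default.
# Names and default values mirror the example in the text and are illustrative.
SEARCH_DEFAULTS = {
    "category": "all",
    "date_range": "all_time",
    "sort_order": "relevance",
    "limit": 10,
}

def resolve_search_args(args: dict) -> dict:
    """Merge the model's arguments over the documented defaults."""
    if "query" not in args:
        raise ValueError("query is required")
    return {**SEARCH_DEFAULTS, **args}

resolved = resolve_search_args({"query": "wireless headphones"})
assert resolved["limit"] == 10 and resolved["sort_order"] == "relevance"
```

Because the defaults live in one place, the schema's parameter descriptions can state them accurately ("defaults to 10 if omitted") and stay in sync with the code.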
Design your test cases to cover the scenarios where tool selection and parameter generation are hardest: ambiguous queries that could match multiple tools ("look up that order" when you have both get_order and search_orders); implicit parameters where the value is not stated directly ("send that to my boss", where "my boss" must be resolved to an email address); multi-tool queries where the model needs to chain calls ("find my latest order and check if it shipped"); and edge cases like missing information, invalid formats, and out-of-range values.
Run at least 50 diverse test queries against your schema set and track three metrics: tool selection accuracy (did the model pick the right tool?), parameter accuracy (were the parameter values correct?), and first-attempt success rate (did the tool call succeed without retries?). If any metric is below 90%, the schemas need refinement.
Categorize failures into selection errors (wrong tool chosen), parameter errors (right tool, wrong arguments), and description gaps (the model used a tool in a scenario you did not intend). Each category has a different fix. Selection errors mean tool names or descriptions are not distinct enough. Parameter errors mean type constraints or descriptions are not specific enough. Description gaps mean the usage guidance is incomplete.
Build a feedback loop: log every tool call in production (tool name, arguments, success/failure, any error message), review failures weekly, and refine schemas based on the patterns you observe. Schemas are not write-once artifacts. They evolve as you discover how the model interprets them and where its interpretation diverges from your intent.
Schema Anti-Patterns
Overloaded tools that do multiple things based on an "action" parameter are hard for models to use correctly. A tool called manage_user with an action parameter that can be "create", "update", "delete", or "get" combines four distinct operations into one schema. The model must reason about which parameter combinations are valid for each action, and errors are common. Split overloaded tools into separate, focused tools: create_user, update_user, delete_user, get_user.
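The contrast is easy to see in schema form. A sketch with abbreviated, hypothetical schemas:

```python
# Sketch: one overloaded tool vs. four focused replacements.
overloaded = {
    "name": "manage_user",
    "parameters": {
        "action": {"enum": ["create", "update", "delete", "get"]},
        # Which of these apply depends on `action` -- the model must guess.
        "user_id": {"type": "string"},
        "email": {"type": "string"},
    },
}

focused = [
    {"name": "create_user", "parameters": {"email": {"type": "string"}}},
    {"name": "get_user", "parameters": {"user_id": {"type": "string"}}},
    {"name": "update_user", "parameters": {"user_id": {"type": "string"},
                                           "email": {"type": "string"}}},
    {"name": "delete_user", "parameters": {"user_id": {"type": "string"}}},
]
# Each focused schema carries exactly the parameters its operation needs,
# so no parameter combination can be invalid.
```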
Deeply nested parameter objects confuse models. If your tool requires a parameter like filter.criteria[0].field, the model must construct a multi-level JSON object correctly, which increases error rates significantly. Flatten nested structures into top-level parameters where possible: filter_field, filter_operator, filter_value instead of a nested filter object.
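Flattening moves the structural work from the model to your code: the model fills three simple string fields, and the server rebuilds the nested object. A minimal sketch with illustrative field names:

```python
# Sketch: the model emits flat parameters; the server rebuilds the nesting.
nested = {
    "filter": {"criteria": [{"field": "status",
                             "operator": "eq",
                             "value": "open"}]}
}

flat = {
    "filter_field": "status",
    "filter_operator": "eq",
    "filter_value": "open",
}

def unflatten(args: dict) -> dict:
    """Rebuild the nested filter server-side so the model never has to."""
    return {"filter": {"criteria": [{
        "field": args["filter_field"],
        "operator": args["filter_operator"],
        "value": args["filter_value"],
    }]}}

assert unflatten(flat) == nested
```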
Missing descriptions on optional parameters cause the model to either ignore them or hallucinate inappropriate values. Every parameter, even optional ones, should have a description that explains what it does and what the default behavior is when it is omitted.