OpenAPI Agent
This section describes how to build an Agent which can work on an OpenAPI specification.
In this guide, we build an OpenAPI Agent which can parse an OpenAPI specification, and then take actions accordingly. OpenAPI Specification (formerly Swagger Specification) is an API description format for REST APIs. An OpenAPI file allows you to describe your entire API, including:
- Available endpoints (/users) and operations on each endpoint (GET /users, POST /users)
- Operation parameters: input and output for each operation
- Authentication methods
- Contact information, license, terms of use, and other information.
The format is easy to learn and readable to both humans and machines. It is often used by backend teams to explain their endpoints to client-side teams.
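To make the format concrete, here is a minimal sketch of an OpenAPI document written as a TypeScript object, plus a helper that lists its operations. The `Example Users API` document, the `OpenApiDocument` type, and `listOperations` are illustrative assumptions, not part of the Pet Store example used later; real specifications usually live in YAML or JSON files and contain many more fields.

```typescript
// A minimal, hypothetical OpenAPI 3.0 document written as a TypeScript object.
// Only a small subset of the real format is modelled here.
type Operation = { summary: string; responses: Record<string, { description: string }> };
type OpenApiDocument = {
  openapi: string;
  info: { title: string; version: string };
  paths: Record<string, Partial<Record<"get" | "post" | "put" | "delete", Operation>>>;
};

const exampleSpec: OpenApiDocument = {
  openapi: "3.0.3",
  info: { title: "Example Users API", version: "1.0.0" },
  paths: {
    "/users": {
      get: { summary: "List users", responses: { "200": { description: "A list of users" } } },
      post: { summary: "Create a user", responses: { "201": { description: "User created" } } },
    },
  },
};

// Collect every "METHOD /path" pair that the document describes.
function listOperations(spec: OpenApiDocument): string[] {
  const operations: string[] = [];
  for (const [path, methods] of Object.entries(spec.paths)) {
    for (const method of Object.keys(methods)) {
      operations.push(`${method.toUpperCase()} ${path}`);
    }
  }
  return operations;
}

// Logs the two operations on /users: GET and POST.
console.log(listOperations(exampleSpec));
```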
Learn more about the OpenAPI Specification in its official docs.
Why does this matter? It lets you create an Agent that helps non-technical users interact with your API, much like an Assistant for your application's admin panel. As long as the OpenAPI spec is well-defined, your Agent can display data analytics, crunch numbers, and perform actions on your behalf.
The source code for this example is available on GitHub.
Video Tutorial
You can follow along with this example by watching the video below:
Guide
Step 1: Setup usdk
Follow Setup the SDK to set up NodeJS and usdk.
Step 2: Create an Agent (shortcut)
We can skip the Interview process and directly generate an Agent with a prompt by running:
This will directly scaffold an Agent for you in <your-agent-directory>. Learn more
Step 3: Setup Zod and openapi-zod-client
We'll first need an OpenAPI specification. We'll use Swagger's official example, the Pet Store API.
The OpenAPI specification is often served as a YAML or JSON file. We get this from their GitHub (although, in a real scenario, it may be available in one of the server's routes).
We will use the openapi-zod-client package from NPM to convert this specification to a bunch of Zod schemas. A Zod schema is just a more descriptive way to specify a data type. It also generates a Zodios client, which is, simply put, an easy way to call the endpoints specified in the OpenAPI specification.
First install the relevant packages:
Add a script in your package.json which calls the openapi-zod-client CLI tool (feel free to tweak the options by checking out what's available):
Step 4: Create a Dummy API
Run the script we added in Step 3:
This runs openapi-zod-client, which, if successful, uses the OpenAPI specification to create the API in Zod and Zodios, in the api.ts file.
The api.ts file contains all the Zod schemas, lists the endpoints as an array, and exports a Zodios client.
Clean up the generated code by:
- Removing partial and optional keywords.
- Commenting out endpoints that are unsupported, such as file uploads.
Step 4: Import Dependencies
Before creating your OpenAPI agent, ensure you have imported the relevant dependencies. Update the generated files to include only the required ones and clean up unnecessary code. For example:
Step 5: Build the OpenAPI Action Generator
Define a component that maps OpenAPI endpoints to actions for your agent. Each action will have a schema, description, and handler. Here’s how to start:
Here's a breakdown of the code above:
Purpose
The OpenAPI Action Generator is designed to:
- Clean the data of any irregularities: Ensure consistency in handling query and body parameters from the OpenAPI spec.
- Reduce the size of the action: Limit the number of response items displayed for clarity during testing.
- Generalize the schema across all actions: Use a standard schema (zod objects) to validate input parameters dynamically for every endpoint.
Steps and Explanations
1. Retrieve API endpoints from the client
- Why?: Extract the list of endpoints from the generated API client. This allows dynamic mapping of endpoints into actionable components for the agent.
2. Reduce and clean endpoint parameters
- Why?: This step consolidates query and body parameters into separate objects for clarity.
- We currently handle only query and body parameters, to keep things simple.
- z.object({}) initializes empty objects to represent schemas.
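The grouping step can be sketched without any dependencies. This is a simplified stand-in rather than the guide's actual code: the generated client stores Zod schemas, whereas this sketch uses plain type names, and `EndpointParameter`, `FieldTypes`, and `groupParameters` are hypothetical names chosen for illustration.

```typescript
// Simplified stand-ins: a real generated client stores Zod schemas here,
// while this sketch uses plain type names so it runs without dependencies.
type FieldTypes = Record<string, string>;
type EndpointParameter = {
  name: string;
  type: "Query" | "Body" | "Path" | "Header";
  fields: FieldTypes;
};

// Hypothetical parameter list for an endpoint that filters pets by status.
const parameters: EndpointParameter[] = [
  { name: "status", type: "Query", fields: { status: "string" } },
  { name: "limit", type: "Query", fields: { limit: "number" } },
  { name: "api_key", type: "Header", fields: { api_key: "string" } },
];

// Start from two empty objects (the role z.object({}) plays in the guide)
// and merge each parameter into the bucket that matches its type.
function groupParameters(params: EndpointParameter[]): { query: FieldTypes; body: FieldTypes } {
  return params.reduce(
    (acc, param) => {
      if (param.type === "Query") acc.query = { ...acc.query, ...param.fields };
      if (param.type === "Body") acc.body = { ...acc.body, ...param.fields };
      // Path and Header parameters are skipped, to keep things simple.
      return acc;
    },
    { query: {} as FieldTypes, body: {} as FieldTypes },
  );
}

const grouped = groupParameters(parameters);
console.log(grouped.query); // contains the status and limit fields
console.log(grouped.body); // stays empty for this endpoint
```

Keeping the two buckets separate makes it easy to decide later whether an endpoint needs a query, a body, both, or neither.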
3. Check and define schema dynamically
- Why?: Depending on the endpoint, it might have only query parameters, only body parameters, both, or neither.
- If both exist, they are wrapped into a single schema object.
- Otherwise, the body is kept in the root. This reduces the data size for the Agent.
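A minimal sketch of this decision, again using plain objects in place of Zod schemas. The function name `buildActionSchema` is hypothetical, and the handling of the query-only case (keeping it under a `query` key) is an assumption, since the text above only specifies the "both" and "body at root" cases.

```typescript
// Plain type-name maps stand in for Zod object schemas in this sketch.
type FieldTypes = Record<string, string>;

const isEmpty = (fields: FieldTypes): boolean => Object.keys(fields).length === 0;

function buildActionSchema(query: FieldTypes, body: FieldTypes): Record<string, unknown> {
  // Both kinds of parameters exist: wrap them in a single object.
  if (!isEmpty(query) && !isEmpty(body)) return { query, body };
  // Assumption: a query-only endpoint keeps its fields under a "query" key.
  if (!isEmpty(query)) return { query };
  // Body-only (or no parameters at all): keep the body at the root.
  return body;
}

console.log(buildActionSchema({ limit: "number" }, { name: "string" })); // wrapped query and body
console.log(buildActionSchema({}, { name: "string" })); // body fields at the root
console.log(buildActionSchema({}, {})); // empty object
```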
4. Create actions for each endpoint
- Why?: Define a unique action for each endpoint to:
- Bind the schema for validation.
- Include descriptive metadata (description and examples) for easier testing and understanding. In our example of the Pet Store API, we didn't have any examples, but there may be examples in real-world use cases.
- Assign a handler for executing the API request.
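The mapping from endpoints to actions might look roughly like the sketch below. The `Endpoint` and `ActionDefinition` types and the `toActions` helper are hypothetical and only model the idea; the real agent framework defines its own action component, and the handler here returns a string instead of calling the API.

```typescript
// Hypothetical, simplified shapes for an endpoint entry and an agent action.
type Endpoint = { alias: string; method: string; path: string; description?: string };
type ActionDefinition = {
  name: string;
  description: string;
  examples: unknown[];
  handler: (input: unknown) => string;
};

// Two illustrative entries modelled on the Pet Store API.
const endpoints: Endpoint[] = [
  { alias: "addPet", method: "post", path: "/pet", description: "Add a new pet to the store" },
  { alias: "findPetsByStatus", method: "get", path: "/pet/findByStatus" },
];

function toActions(list: Endpoint[]): ActionDefinition[] {
  return list.map((endpoint) => ({
    // One uniquely named action per endpoint.
    name: endpoint.alias,
    // Fall back to "METHOD /path" when the spec has no description.
    description: endpoint.description ?? `${endpoint.method.toUpperCase()} ${endpoint.path}`,
    // The Pet Store spec ships no examples, so this stays empty.
    examples: [],
    // Placeholder handler: a real one would execute the API request.
    handler: (input) => `would call ${endpoint.alias} with ${JSON.stringify(input)}`,
  }));
}

const actions = toActions(endpoints);
console.log(actions.map((action) => `${action.name}: ${action.description}`));
```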
5. Handle API requests and responses
- Why?:
- Dynamically prepare the request arguments (query or body) based on the input.
- Limit the response to a maximum of 10 items so that the AI's prompt isn't overloaded with information. Adjust this limit as you please.
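The two ideas in this step can be sketched as small helpers. This is an illustration under assumptions: the input is taken to be either wrapped as `{ query, body }`, wrapped as `{ query }`, or a bare body at the root, and `fakeCall` is a made-up stand-in for the generated client, which is not reproduced here.

```typescript
const MAX_ITEMS = 10;

// Decide what to send, based on the shape of the validated input.
// Caveat of this sketch: a bare body that itself has a "query" field is ambiguous.
function prepareRequest(input: Record<string, unknown>): { query?: unknown; body?: unknown } {
  const hasQuery = "query" in input;
  const hasBody = "body" in input;
  if (hasQuery && hasBody) return { query: input.query, body: input.body };
  if (hasQuery) return { query: input.query };
  return Object.keys(input).length > 0 ? { body: input } : {};
}

// Trim list responses so the agent's prompt is not flooded with data.
function limitResponse(response: unknown): unknown {
  return Array.isArray(response) ? response.slice(0, MAX_ITEMS) : response;
}

// Hypothetical stand-in for the generated client call; returns 25 fake pets.
async function fakeCall(_request: { query?: unknown; body?: unknown }): Promise<unknown> {
  return Array.from({ length: 25 }, (_, i) => ({ id: i + 1, name: `pet-${i + 1}` }));
}

async function handleAction(input: Record<string, unknown>): Promise<unknown> {
  const request = prepareRequest(input);
  const response = await fakeCall(request);
  return limitResponse(response);
}

handleAction({ query: { status: "available" } }).then((result) => {
  console.log(Array.isArray(result) ? result.length : result); // at most MAX_ITEMS entries
});
```

Non-array responses pass through `limitResponse` unchanged, so single-object endpoints are unaffected by the cap.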
6. Generate a monologue for the agent
- Why?: Summarize the action for the agent by displaying:
- The endpoint called.
- The request parameters sent.
- The response data received.
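As a rough sketch, the summary text could be assembled like this. `buildMonologue` is a hypothetical helper that only builds the string; how that string is handed to the agent depends on the agent framework and is not shown.

```typescript
// Build a plain-text summary of one API call for the agent to reflect on.
function buildMonologue(endpoint: string, request: unknown, response: unknown): string {
  return [
    `You called the endpoint: ${endpoint}`,
    `Request parameters: ${JSON.stringify(request)}`,
    `Response data: ${JSON.stringify(response)}`,
  ].join("\n");
}

// Illustrative values only; they are not real Pet Store responses.
const summary = buildMonologue(
  "findPetsByStatus",
  { query: { status: "available" } },
  [{ id: 1, name: "Rover" }],
);
console.log(summary);
```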
7. Wrap actions within the agent component
- Why?: Combine all dynamically generated actions into a cohesive agent interface. The Prompt provides instructions to the agent about its behavior when executing actions.
Step 6: (optional) Test the OpenAPI Agent
Run usdk chat to test the Agent in your CLI.
You can start by asking:
What operations can you perform?
It (hopefully) will be very straightforward in telling you what it can do.
Now try asking it:
Can you create a dog called Rover for me?
It might take a moment before it returns a response. Plug that response into the actual Pet Store's GET pet/{petId} API to check the result.
Does it work? Join our Discord community and tell us; we'd love to know!
Further Challenges
- Support Path arguments (e.g. GET pets/{petId})
- Allow the Agent to change query parameters in pagination, in order to get fine-grained results
- Try to get the Agent to chain its actions
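As a possible starting point for the first challenge, the sketch below substitutes path arguments into a template. It assumes OpenAPI-style `{name}` placeholders and a hypothetical `fillPath` helper; a generated client may use a different placeholder syntax, so adapt the pattern accordingly.

```typescript
// Replace each {name} placeholder with the matching, URL-encoded value.
function fillPath(template: string, params: Record<string, string | number>): string {
  return template.replace(/\{(\w+)\}/g, (_match, name: string) => {
    if (!(name in params)) throw new Error(`Missing path parameter: ${name}`);
    return encodeURIComponent(String(params[name]));
  });
}

console.log(fillPath("/pet/{petId}", { petId: 42 })); // "/pet/42"
```

Throwing on a missing parameter surfaces the problem to the Agent early instead of sending a request to a malformed URL.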