How to generate an OpenAPI/Swagger spec with Pydantic V2

Pydantic is considered by many API developers to be the best data validation library for Python, and with good reason. By defining an application’s models in Pydantic, developers benefit from a vastly improved development experience, runtime data validation and serialization, and automatic OpenAPI schema generation.

However, many developers don’t realize they can generate OpenAPI schemas from their Pydantic models, which they can then use to create SDKs, documentation, and server stubs.

In this guide, you’ll learn how to create new Pydantic models, generate an OpenAPI schema from them, and use the generated schema to create an SDK for your API. We’ll start with the simplest possible Pydantic model and gradually add more features to show how Pydantic models translate to OpenAPI schemas.

Prerequisites

Before we get started, make sure you have Python 3.8 or higher installed on your machine. Check your Python version by running the following command:

Terminal
python --version

We use Python 3.12.4 in this guide, but any version of Python 3.8 or higher should work.

You can clone our example repository from GitHub to follow along with the code snippets in this guide, or you can create a new Python project and install the required libraries as we go.

Create a New Python Project

First, create a new directory for the project and set up a virtual environment:

Terminal
# Create and open a new directory for the project
mkdir pydantic-openapi
cd pydantic-openapi
# Create a new virtual environment
python -m venv venv
# Activate the virtual environment
source venv/bin/activate

Install the Required Libraries

We’ll install Pydantic and PyYAML to generate and pretty-print the OpenAPI schema:

Terminal
# Install the Pydantic library
pip install pydantic
# Install the PyYAML library for pretty-printing the OpenAPI schema
pip install pyyaml

Pydantic to OpenAPI Schema Walkthrough

Let’s follow a step-by-step process to generate an OpenAPI schema from a Pydantic model using only Pydantic itself, with PyYAML for pretty-printing.

Define a Simple Pydantic Model

Create a new Python file called models.py and define a simple Pydantic model.

In this example, we define a Pydantic model called Pet with three fields: id, name, and breed. The id field is an integer, and the name and breed fields are strings.


Generate JSON Schema for the Pydantic Model

Add a new function called print_json_schema to the models.py file that prints the JSON schema for the Pet model.

This function uses the model_json_schema method provided by Pydantic to generate the JSON schema, which Python then prints to the console as YAML. We use YAML for readability, but the output is still a valid JSON schema.
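The step above can be sketched as follows. This is a minimal illustration rather than the guide's exact listing: it assumes the three-field Pet model described earlier and a print_json_schema function that takes no arguments.

```python
import yaml
from pydantic import BaseModel


class Pet(BaseModel):
    id: int
    name: str
    breed: str


def print_json_schema() -> None:
    # model_json_schema() returns a plain dict describing the model
    schema = Pet.model_json_schema()
    # Dump as YAML for readability; sort_keys=False keeps Pydantic's key order
    print(yaml.dump(schema, sort_keys=False))


print_json_schema()
```

All three fields appear under properties, and because none of them has a default value, all three are listed under required.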


Run python models.py to generate the JSON schema for the Pet model and print it as YAML:


After adding an Owner model that contains a list of Pet objects, run python models.py again to generate the JSON schema for both the Pet and Owner models and print it as YAML:


The generated schema includes definitions for both the Pet and Owner models. The Owner model has a reference to the Pet model, indicating that the Owner model contains a list of Pet objects.

Note that the root of the schema includes a $defs key that contains the definitions for both models, and the Owner model references the Pet model using the $ref keyword.
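One way to produce a combined schema like this is Pydantic's models_json_schema helper. The sketch below is an assumption about how the step could be written: the Owner fields (id, name, pets) are hypothetical, chosen only to match the description of an owner holding a list of pets.

```python
from typing import List

from pydantic import BaseModel
from pydantic.json_schema import models_json_schema


class Pet(BaseModel):
    id: int
    name: str
    breed: str


class Owner(BaseModel):
    # Hypothetical fields for illustration
    id: int
    name: str
    pets: List[Pet]


# Returns a (mapping, definitions) tuple; the definitions dict holds "$defs"
_, definitions = models_json_schema(
    [(Pet, "validation"), (Owner, "validation")]
)

pets_property = definitions["$defs"]["Owner"]["properties"]["pets"]
```

With the default reference template, pets_property is an array whose items point at "#/$defs/Pet".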


Next, we’ll update the print_json_schema function to print a JSON schema that resembles an OpenAPI schema’s components section.


Run python models.py to generate the OpenAPI schema for both the Pet and Owner models.

The generated OpenAPI schema includes the components section, with definitions for both the Pet and Owner models.
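A minimal sketch of this step, assuming the same hypothetical Owner fields as before: passing ref_template to models_json_schema makes every $ref point into components/schemas instead of $defs, and the definitions are then nested under a components key.

```python
from typing import List

from pydantic import BaseModel
from pydantic.json_schema import models_json_schema


class Pet(BaseModel):
    id: int
    name: str
    breed: str


class Owner(BaseModel):
    id: int
    name: str
    pets: List[Pet]


_, definitions = models_json_schema(
    [(Pet, "validation"), (Owner, "validation")],
    # {model} is replaced with each model's definition name
    ref_template="#/components/schemas/{model}",
)

# Re-home the definitions where OpenAPI expects them
components = {"components": {"schemas": definitions["$defs"]}}
```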


The JSON Schema we generated resembles an OpenAPI schema’s components section, but to generate a valid OpenAPI schema, we need to add the openapi and info sections.

Edit the print_json_schema function in models.py to include the openapi and info sections in the generated OpenAPI schema.


Run python models.py to generate the complete OpenAPI schema for both the Pet and Owner models.

The generated OpenAPI schema includes the openapi, info, and components sections with definitions for both the Pet and Owner models.
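The complete document can be assembled as a plain dictionary. In this sketch, build_openapi_document is a hypothetical helper name, the Owner fields are assumed, and the title and version are borrowed from the api.py listing later in this guide.

```python
from typing import Any, Dict, List, Sequence, Type

from pydantic import BaseModel
from pydantic.json_schema import models_json_schema


class Pet(BaseModel):
    id: int
    name: str
    breed: str


class Owner(BaseModel):
    id: int
    name: str
    pets: List[Pet]


def build_openapi_document(models: Sequence[Type[BaseModel]]) -> Dict[str, Any]:
    """Wrap the models' JSON Schemas in a minimal OpenAPI 3.1 document."""
    _, definitions = models_json_schema(
        [(model, "validation") for model in models],
        ref_template="#/components/schemas/{model}",
    )
    return {
        "openapi": "3.1.0",
        "info": {"title": "Pet Sitter API", "version": "0.0.1"},
        "components": {"schemas": definitions.get("$defs", {})},
    }


document = build_openapi_document([Pet, Owner])
```

The resulting dictionary can be passed to yaml.dump with sort_keys=False to keep the openapi, info, components ordering.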


Now we have a complete OpenAPI document that we can use to generate SDK clients for our API. However, the generated OpenAPI schema does not contain descriptions or example values for the models. We can add these details to the Pydantic models to improve the generated OpenAPI schema.


If we run python models.py, we see that our Owner schema now includes a description field, derived from the docstring we added to the Owner Pydantic model.
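As a small illustration of this behavior, Pydantic copies a model's docstring into the description key of its JSON Schema. The docstring text and Owner fields below are assumptions made for the example.

```python
from typing import List

from pydantic import BaseModel


class Pet(BaseModel):
    id: int
    name: str
    breed: str


class Owner(BaseModel):
    """An owner of one or more pets"""

    id: int
    name: str
    pets: List[Pet]


owner_schema = Owner.model_json_schema()
owner_description = owner_schema["description"]
```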


If we run python models.py, we see that our Pet schema now includes descriptions for each field.
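Field descriptions are typically supplied through Pydantic's Field function. The following sketch uses illustrative description strings that are not taken from the original guide.

```python
from pydantic import BaseModel, Field


class Pet(BaseModel):
    # "..." marks the field as required while still attaching metadata
    id: int = Field(..., description="Unique ID of the pet")
    name: str = Field(..., description="Name of the pet")
    breed: str = Field(..., description="Breed of the pet")


pet_properties = Pet.model_json_schema()["properties"]
```

Each entry in pet_properties now carries a description key alongside its type and title.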


If we run python models.py, we see that our Pet schema now includes example values for each field.
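Example values can be attached in the same way, using the examples argument of Field, which takes a list in Pydantic V2. The example values here are made up for illustration.

```python
from pydantic import BaseModel, Field


class Pet(BaseModel):
    id: int = Field(..., examples=[1])
    name: str = Field(..., examples=["Fluffy"])
    breed: str = Field(..., examples=["Siamese"])


pet_properties = Pet.model_json_schema()["properties"]
```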


If we run python models.py, we see that the breed field in the Pet schema now has two types: string and null, and it has been removed from the required list. Only id and name are required fields after marking breed as optional.
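A minimal sketch of an optional field, assuming breed is declared as Optional[str] with a default of None. Note that in Pydantic V2 the default is what removes the field from required; Optional[str] without a default would still be required.

```python
from typing import Optional

from pydantic import BaseModel


class Pet(BaseModel):
    id: int
    name: str
    # Nullable and optional: accepts a string or None, defaults to None
    breed: Optional[str] = None


pet_schema = Pet.model_json_schema()
breed_schema = pet_schema["properties"]["breed"]
```

The two types are expressed as an anyOf list containing a string schema and a null schema.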


In our generated OpenAPI schema, we have a new pet_type field in the Pet schema.


This enum is represented as a separate schema in the OpenAPI document.
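To illustrate, here is a sketch with a hypothetical PetType enum; the member names are assumptions, since the original listing is not shown. Pydantic emits the enum as its own definition and references it from the pet_type property.

```python
from enum import Enum

from pydantic import BaseModel


class PetType(str, Enum):
    # Hypothetical members for illustration
    cat = "cat"
    dog = "dog"


class Pet(BaseModel):
    id: int
    name: str
    pet_type: PetType


pet_schema = Pet.model_json_schema()
pet_type_definition = pet_schema["$defs"]["PetType"]
```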

models.py
from pydantic import BaseModel


class Pet(BaseModel):
    id: int
    name: str
    breed: str

Adding Paths and Operations to the OpenAPI Schema

Now that we have generated an OpenAPI schema from our Pydantic models, we can use the schema to generate SDK clients for our API.

However, the OpenAPI document we generated, while valid, does not include the paths section, which defines the API endpoints and operations.

When using Pydantic with FastAPI, you can define your API endpoints and operations directly in your FastAPI application. FastAPI automatically generates the OpenAPI schema for your API, including the paths section.

Let’s see how we can define API endpoints and operations in a framework-agnostic way and add them to the OpenAPI schema.

Install openapi-pydantic

We’ll use the openapi-pydantic library to define a complete OpenAPI schema with paths and operations.

The benefit of using openapi-pydantic is that it allows you to define the API endpoints and operations in a Python dictionary, while still getting the benefit of Pydantic’s IDE support and type checking.

The library includes convenience methods to convert Pydantic models to OpenAPI schema components and add them to the OpenAPI schema.

Install the openapi-pydantic library:

Terminal
pip install openapi-pydantic

Create a new Python file called api.py and define the API endpoints and operations using the openapi-pydantic library.

The api.py file saves the complete OpenAPI schema to a file named openapi.yaml.


Run python api.py to generate the complete OpenAPI schema with paths and operations and save it to a file named openapi.yaml.


Our api.py file imports Pet and Owner models from models.py.


We’ll use the models.py file from the previous steps to define the Pydantic models for Pet and Owner.


In api.py, we then define two response schemas as Pydantic models: PetsResponse and OwnersResponse.

Defining response schemas as Pydantic models allows us to reuse them in multiple operations, and to use them for validation and serialization in our API request handlers.
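As a rough illustration of that reuse, the same PetsResponse model can validate and serialize a payload inside a request handler. The payload values are invented for the example, and Pet is redefined here only to keep the snippet self-contained.

```python
from typing import List

from pydantic import BaseModel, Field, ValidationError


class Pet(BaseModel):
    id: int
    name: str
    breed: str


class PetsResponse(BaseModel):
    """A response containing a list of pets"""

    pets: List[Pet] = Field(..., description="List of pets")


payload = {"pets": [{"id": 1, "name": "Rex", "breed": "Labrador"}]}

# Validation: raises ValidationError if the payload doesn't match the model
response = PetsResponse.model_validate(payload)

# Serialization: back to plain Python types, ready to be encoded as JSON
serialized = response.model_dump()

try:
    PetsResponse.model_validate({"pets": [{"id": "abc", "name": "Rex", "breed": "Lab"}]})
    invalid_rejected = False
except ValidationError:
    invalid_rejected = True
```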


We’ll start by defining a function called construct_base_open_api that returns an OpenAPI object with the base configuration for our API.

The function defines the API title, version, and servers, and includes the paths for the /pets, /pets/{pet_id}, and /owners endpoints.


The /pets path includes two operations: GET to list all pets and POST to create a pet.

The GET operation returns a list of pets using the PetsResponse schema.


Note that we added operationId and description fields to the operations to provide additional information about the operation.

Clear operation IDs and descriptions help API users understand the purpose of each operation and allow SDK generators to create more informative client code.


The POST operation creates a pet using the Pet schema as the request body and returns the created pet using the Pet schema.

We use the PydanticSchema class from openapi-pydantic to reference the Pydantic model in the OpenAPI schema.

In a real-world application, you would likely not include the pet’s ID in the request body as the server would generate the ID, but for simplicity, we include it here.


This translates to the following OpenAPI operation:


The /pets/{pet_id} path includes a GET operation to get a pet by ID.

The operation includes a path parameter pet_id to specify the ID of the pet to retrieve.


The GET operation’s parameters field includes the path parameter pet_id with a description, required flag, and schema definition.

The responses field includes a 200 response with the Pet schema as the response body.


This translates to the following OpenAPI operation.

Note how the generated schema closely resembles the Pydantic model.


We’ll skip the rest of the openapi.yaml file, as it is similar to the components generated in the previous section.

api.py
from typing import List

import yaml
from pydantic import BaseModel, Field
from openapi_pydantic.v3 import OpenAPI, Info, PathItem, Operation
from openapi_pydantic.util import PydanticSchema, construct_open_api_with_schema_class

from models import Pet, Owner


class PetsResponse(BaseModel):
    """A response containing a list of pets"""

    pets: List[Pet] = Field(..., description="List of pets")


class OwnersResponse(BaseModel):
    """A response containing a list of owners"""

    owners: List[Owner] = Field(..., description="List of owners")


def construct_base_open_api() -> OpenAPI:
    return OpenAPI(
        openapi="3.1.0",
        info=Info(
            title="Pet Sitter API",
            version="0.0.1",
        ),
        servers=[
            {
                "url": "http://127.0.0.1:4010",
                "description": "Local prism server",
            },
        ],
        paths={
            "/pets": PathItem(
                get=Operation(
                    operationId="listPets",
                    description="List all pets",
                    responses={
                        "200": {
                            "description": "A list of pets",
                            "content": {
                                "application/json": {
                                    "schema": PydanticSchema(schema_class=PetsResponse)
                                }
                            },
                        }
                    },
                ),
                post=Operation(
                    operationId="createPet",
                    description="Create a pet",
                    requestBody={
                        "content": {
                            "application/json": {
                                "schema": PydanticSchema(schema_class=Pet)
                            }
                        }
                    },
                    responses={
                        "201": {
                            "description": "Pet created",
                            "content": {
                                "application/json": {
                                    "schema": PydanticSchema(schema_class=Pet)
                                }
                            },
                        }
                    },
                ),
            ),
            "/pets/{pet_id}": PathItem(
                get=Operation(
                    operationId="getPetById",
                    description="Get a pet by ID",
                    parameters=[
                        {
                            "name": "pet_id",
                            "in": "path",
                            "description": "ID of pet to return",
                            "required": True,
                            "schema": {
                                "type": "integer",
                                "format": "int64",
                            },
                            "examples": {"1": {"value": 1}},
                        },
                    ],
                    responses={
                        "200": {
                            "description": "A pet",
                            "content": {
                                "application/json": {
                                    "schema": PydanticSchema(schema_class=Pet)
                                }
                            },
                        }
                    },
                ),
            ),
            "/owners": PathItem(
                get=Operation(
                    operationId="listOwners",
                    description="List all owners",
                    responses={
                        "200": {
                            "description": "A list of owners",
                            "content": {
                                "application/json": {
                                    "schema": PydanticSchema(
                                        schema_class=OwnersResponse
                                    )
                                }
                            },
                        }
                    },
                ),
            ),
        },
    )


open_api = construct_base_open_api()
# Resolve every PydanticSchema reference into components/schemas
open_api = construct_open_api_with_schema_class(open_api)

if __name__ == "__main__":
    with open("openapi.yaml", "w") as file:
        file.write(
            yaml.dump(
                open_api.model_dump(
                    by_alias=True,
                    mode="json",
                    exclude_none=True,
                    exclude_unset=True,
                ),
                sort_keys=False,
            )
        )

Generating an SDK from the OpenAPI Schema

Now that we have a complete OpenAPI schema with paths and operations, we can use it to generate an SDK client for our API.

Prerequisites for SDK Generation

Install Speakeasy by following the Speakeasy installation instructions.

On macOS, you can install Speakeasy using Homebrew:

Terminal
brew install speakeasy-api/homebrew-tap/speakeasy

Authenticate with Speakeasy using the following command:

Terminal
speakeasy auth login

Generate an SDK Using Speakeasy

Run the following command to generate an SDK from the openapi.yaml file:

Terminal
speakeasy quickstart

Follow the on-screen prompts to provide the necessary configuration details for your new SDK, such as the name, schema location, and output path. Enter openapi.yaml when prompted for the OpenAPI document location and select TypeScript when prompted for the language you would like to generate.

Adding Speakeasy Extensions to the OpenAPI Schema

Speakeasy uses OpenAPI extensions to provide additional information for generating SDKs.

We can add extensions using OpenAPI Overlays, which are YAML files that Speakeasy overlays on top of the OpenAPI schema.

Alternatively, you can add extensions directly to the OpenAPI schema using the x- prefix.

For example, you can add the x-speakeasy-retries extension to have Speakeasy generate retry logic in the SDK.

Import the Dict and Any types from the typing module in api.py, and ConfigDict from pydantic.

We’ll use these types to define the x-speakeasy-retries extension in the OpenAPI schema.


In the OpenAPIwithRetries class, we define the x-speakeasy-retries extension.

Note that we need to use the alias parameter to define the extension with the x- prefix, then allow ourselves to use the xSpeakeasyRetries attribute in the class by setting populate_by_name=True in the model_config.
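The interaction between alias and populate_by_name can be seen in isolation with a small stand-in model. MiniDocument below is a hypothetical class used only for this illustration; it is not part of openapi-pydantic.

```python
from typing import Any, Dict

from pydantic import BaseModel, ConfigDict, Field


class MiniDocument(BaseModel):
    # Allow construction by attribute name as well as by alias
    model_config = ConfigDict(populate_by_name=True)

    title: str
    # "x-speakeasy-retries" is not a valid Python identifier, so it must be an alias
    xSpeakeasyRetries: Dict[str, Any] = Field(..., alias="x-speakeasy-retries")


# Populated by attribute name, thanks to populate_by_name=True
by_name = MiniDocument(title="Pet Sitter API", xSpeakeasyRetries={"strategy": "backoff"})

# Populated by alias, as it would be when loading an existing document
by_alias = MiniDocument.model_validate(
    {"title": "Pet Sitter API", "x-speakeasy-retries": {"strategy": "backoff"}}
)

# by_alias=True writes the extension key with its x- prefix
dumped = by_name.model_dump(by_alias=True)
```

This is why the listing below dumps the model with by_alias=True: without it, the output would contain xSpeakeasyRetries rather than x-speakeasy-retries.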


We then update the construct_base_open_api function to return an OpenAPIwithRetries object.


Add xSpeakeasyRetries to the OpenAPIwithRetries object in the construct_base_open_api function.


This translates to the following OpenAPI schema:

api.py
from typing import List, Dict, Any

import yaml
from pydantic import BaseModel, Field, ConfigDict
from openapi_pydantic.v3 import OpenAPI, Info, PathItem, Operation
from openapi_pydantic.util import PydanticSchema, construct_open_api_with_schema_class

from models import Pet, Owner


class PetsResponse(BaseModel):
    """A response containing a list of pets"""

    pets: List[Pet] = Field(..., description="List of pets")


class OwnersResponse(BaseModel):
    """A response containing a list of owners"""

    owners: List[Owner] = Field(..., description="List of owners")


class OpenAPIwithRetries(OpenAPI):
    """
    OpenAPI with xSpeakeasyRetries extension
    This class extends the OpenAPI model to include the x-speakeasy-retries extension.
    """

    xSpeakeasyRetries: Dict[str, Any] = Field(
        ...,
        description="Retry configuration for the API",
        alias="x-speakeasy-retries",
    )
    model_config = ConfigDict(
        populate_by_name=True,
    )


def construct_base_open_api() -> OpenAPIwithRetries:
    return OpenAPIwithRetries(
        openapi="3.1.0",
        info=Info(
            title="Pet Sitter API",
            version="0.0.1",
        ),
        servers=[
            {
                "url": "http://127.0.0.1:4010",
                "description": "Local prism server",
            },
        ],
        xSpeakeasyRetries={
            "strategy": "backoff",
            "backoff": {
                "initialInterval": 500,
                "maxInterval": 60000,
                "maxElapsedTime": 3600000,
                "exponent": 1.5,
            },
            "statusCodes": ["5XX"],
            "retryConnectionErrors": True,
        },
        paths={
            "/pets": PathItem(
                get=Operation(
                    operationId="listPets",
                    description="List all pets",
                    responses={
                        "200": {
                            "description": "A list of pets",
                            "content": {
                                "application/json": {
                                    "schema": PydanticSchema(schema_class=PetsResponse)
                                }
                            },
                        }
                    },
                ),
                post=Operation(
                    operationId="createPet",
                    description="Create a pet",
                    requestBody={
                        "content": {
                            "application/json": {
                                "schema": PydanticSchema(schema_class=Pet)
                            }
                        }
                    },
                    responses={
                        "201": {
                            "description": "Pet created",
                            "content": {
                                "application/json": {
                                    "schema": PydanticSchema(schema_class=Pet)
                                }
                            },
                        }
                    },
                ),
            ),
            "/pets/{pet_id}": PathItem(
                get=Operation(
                    operationId="getPetById",
                    description="Get a pet by ID",
                    parameters=[
                        {
                            "name": "pet_id",
                            "in": "path",
                            "description": "ID of pet to return",
                            "required": True,
                            "schema": {
                                "type": "integer",
                                "format": "int64",
                            },
                            "examples": {"1": {"value": 1}},
                        },
                    ],
                    responses={
                        "200": {
                            "description": "A pet",
                            "content": {
                                "application/json": {
                                    "schema": PydanticSchema(schema_class=Pet)
                                }
                            },
                        }
                    },
                ),
            ),
            "/owners": PathItem(
                get=Operation(
                    operationId="listOwners",
                    description="List all owners",
                    responses={
                        "200": {
                            "description": "A list of owners",
                            "content": {
                                "application/json": {
                                    "schema": PydanticSchema(
                                        schema_class=OwnersResponse
                                    )
                                }
                            },
                        }
                    },
                ),
            ),
        },
    )


open_api = construct_base_open_api()
# Resolve every PydanticSchema reference into components/schemas
open_api = construct_open_api_with_schema_class(open_api)

if __name__ == "__main__":
    with open("openapi.yaml", "w") as file:
        file.write(
            yaml.dump(
                open_api.model_dump(
                    by_alias=True,
                    mode="json",
                    exclude_none=True,
                    exclude_unset=True,
                ),
                sort_keys=False,
            )
        )

Add Tags to the OpenAPI Schema

To group operations in the OpenAPI schema, you can use tags. This also allows Speakeasy to structure the generated SDK code and documentation logically.

Add a tags field to the OpenAPIwithRetries object, then add a tags field to each operation in the construct_base_open_api function:

api.py
def construct_base_open_api() -> OpenAPIwithRetries:
    return OpenAPIwithRetries(
        # ...
        tags=[
            {
                "name": "pets",
                "description": "Operations about pets",
            },
            {
                "name": "owners",
                "description": "Operations about owners",
            },
        ],
        paths={
            "/pets": PathItem(
                get=Operation(
                    # ...
                    tags=["pets"],
                ),
                # ...
            ),
            # ...
        },
        # ...
    )

Run python api.py to update the openapi.yaml file with the tags field, then regenerate the SDK using Speakeasy.

Terminal
python api.py
speakeasy quickstart

Speakeasy will detect the changes to your OpenAPI schema, generate the SDK with the updated tags, and automatically increment the SDK’s version number.

Take a look at the generated SDK to see how Speakeasy groups operations by tags.

We Can Help Get Your Pydantic Models Ready for SDK Generation

In this tutorial, we learned how to generate an OpenAPI schema from Pydantic models and use it to generate an SDK client using Speakeasy.

If you would like to discuss how to get your Pydantic models ready for SDK generation, give us feedback, or shoot the breeze about all things OpenAPI and SDKs, join our Slack.