How to generate an OpenAPI/Swagger spec with Pydantic V2
Pydantic is considered by many API developers to be the best data validation library for Python, and with good reason. By defining an application’s models in Pydantic, developers benefit from a vastly improved development experience, runtime data validation and serialization, and automatic OpenAPI schema generation.
However, many developers don’t realize they can generate OpenAPI schemas from their Pydantic models, which they can then use to create SDKs, documentation, and server stubs.
In this guide, you’ll learn how to create new Pydantic models, generate an OpenAPI schema from them, and use the generated schema to create an SDK for your API. We’ll start with the simplest possible Pydantic model and gradually add more features to show how Pydantic models translate to OpenAPI schemas.
Prerequisites
Before we get started, make sure you have Python 3.8 or higher installed on your machine. Check your Python version by running the following command:
python --version
We use Python 3.12.4 in this guide, but any version of Python 3.8 or higher should work.
You can clone our example repository from GitHub to follow along with the code snippets in this guide, or you can create a new Python project and install the required libraries as we go.
Create a New Python Project
First, create a new project directory and set up a virtual environment:
# Create and open a new directory for the project
mkdir pydantic-openapi
cd pydantic-openapi
# Create a new virtual environment
python -m venv venv
# Activate the virtual environment
source venv/bin/activate
Install the Required Libraries
We’ll install Pydantic and PyYAML to generate and pretty-print the OpenAPI schema:
# Install the Pydantic library
pip install pydantic
# Install the PyYAML library for pretty-printing the OpenAPI schema
pip install pyyaml
Pydantic to OpenAPI Schema Walkthrough
Let’s follow a step-by-step process to generate an OpenAPI schema from a Pydantic model without any additional libraries.
Define a Simple Pydantic Model
Create a new Python file called models.py and define a simple Pydantic model.
In this example, we define a Pydantic model called Pet with three fields: id, name, and breed. The id field is an integer, and the name and breed fields are strings.
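As a minimal sketch of how this model behaves at runtime, the snippet below defines the Pet model described above and validates two inputs. The sample values ("Rex", "Labrador") are illustrative assumptions, not part of the guide's repository.

```python
from pydantic import BaseModel, ValidationError


class Pet(BaseModel):
    id: int
    name: str
    breed: str


# Valid input is parsed into a typed model instance.
pet = Pet(id=1, name="Rex", breed="Labrador")

# Invalid input raises a ValidationError that reports each failing field.
error_fields = []
try:
    Pet(id="not-a-number", name="Rex", breed="Labrador")
except ValidationError as exc:
    error_fields = [err["loc"][0] for err in exc.errors()]

if __name__ == "__main__":
    print(pet)
    print("Fields that failed validation:", error_fields)
```

The same type hints that drive this validation are what Pydantic later uses to build the JSON schema.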
Generate JSON Schema for the Pydantic Model
Add a new function called print_json_schema to the models.py file that prints the JSON schema for the Pet model.
This function uses the model_json_schema method provided by Pydantic to generate the JSON schema, which the function then prints to the console as YAML. We use YAML for readability, but the output is still a valid JSON schema.
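A function along these lines might look like the sketch below. It assumes PyYAML is installed as described in the prerequisites. The sort_keys=False argument is an assumption made here to keep Pydantic's key order, and the guide's own function may differ.

```python
import yaml
from pydantic import BaseModel


class Pet(BaseModel):
    id: int
    name: str
    breed: str


def print_json_schema() -> None:
    # model_json_schema() returns a plain dict describing the model.
    schema = Pet.model_json_schema()
    # Dump the dict as YAML purely for readability.
    print(yaml.dump(schema, sort_keys=False))


if __name__ == "__main__":
    print_json_schema()
```

Because model_json_schema returns an ordinary dictionary, it can also be passed to json.dumps if you prefer JSON output.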
Run python models.py to generate the JSON schema for the Pet model and print it as YAML:
Next, add an Owner model that contains a list of Pet objects to models.py, then run python models.py to generate the JSON schema for both the Pet and Owner models and print it as YAML:
The generated schema includes definitions for both the Pet and Owner models. The Owner model has a reference to the Pet model, indicating that the Owner model contains a list of Pet objects.
Note that the root of the schema includes a $defs key that contains the definitions for both models, and the Owner model references the Pet model using the $ref keyword.
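One way to produce a combined schema like this is Pydantic's models_json_schema helper, sketched below. The Owner fields (id, name, pets) are assumptions chosen to match the description above, and the guide's own code may build the combined schema differently.

```python
from typing import List

from pydantic import BaseModel
from pydantic.json_schema import models_json_schema


class Pet(BaseModel):
    id: int
    name: str
    breed: str


class Owner(BaseModel):
    id: int
    name: str
    pets: List[Pet]  # assumed field: an owner holds a list of pets


# models_json_schema() returns a (per-model mapping, combined schema) tuple.
_, combined = models_json_schema([(Pet, "validation"), (Owner, "validation")])

# The Owner definition points at the Pet definition through a $ref.
pets_ref = combined["$defs"]["Owner"]["properties"]["pets"]["items"]["$ref"]

if __name__ == "__main__":
    print(sorted(combined["$defs"]))
    print(pets_ref)
```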
Next, we’ll update the print_json_schema function to print a JSON schema that resembles an OpenAPI schema’s components section.
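A possible way to get this shape, assuming the models_json_schema helper from pydantic.json_schema is used, is to pass a ref_template so that references point at #/components/schemas instead of #/$defs. The Owner fields below are illustrative assumptions.

```python
from typing import List

from pydantic import BaseModel
from pydantic.json_schema import models_json_schema


class Pet(BaseModel):
    id: int
    name: str
    breed: str


class Owner(BaseModel):
    id: int
    name: str
    pets: List[Pet]  # assumed field


# ref_template controls how $ref values are written in the output.
_, combined = models_json_schema(
    [(Pet, "validation"), (Owner, "validation")],
    ref_template="#/components/schemas/{model}",
)

# Move the definitions under components.schemas, as OpenAPI expects.
components_doc = {"components": {"schemas": combined.get("$defs", {})}}

owner_schema = components_doc["components"]["schemas"]["Owner"]
pets_ref = owner_schema["properties"]["pets"]["items"]["$ref"]
```

Rewriting the references matters because a $ref that still points at #/$defs would not resolve once the definitions live under components.schemas.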
Run python models.py to generate the OpenAPI schema for both the Pet and Owner models.
The generated OpenAPI schema includes the components section, with definitions for both the Pet and Owner models.
The JSON Schema we generated resembles an OpenAPI schema’s components section, but to generate a valid OpenAPI schema, we need to add the openapi and info sections.
Edit the print_json_schema function in models.py to include the openapi and info sections in the generated OpenAPI schema.
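The sketch below shows one way such a document could be assembled. The helper name build_openapi_document is hypothetical, and the title and version values are borrowed from the api.py file later in this guide. It prints JSON to stay dependency-free, and the dictionary can be dumped as YAML instead.

```python
import json
from typing import Any, Dict, List

from pydantic import BaseModel
from pydantic.json_schema import models_json_schema


class Pet(BaseModel):
    id: int
    name: str
    breed: str


class Owner(BaseModel):
    id: int
    name: str
    pets: List[Pet]  # assumed field


def build_openapi_document(title: str, version: str) -> Dict[str, Any]:
    """Hypothetical helper: wrap the model schemas in a minimal OpenAPI document."""
    _, combined = models_json_schema(
        [(Pet, "validation"), (Owner, "validation")],
        ref_template="#/components/schemas/{model}",
    )
    return {
        "openapi": "3.1.0",
        "info": {"title": title, "version": version},
        "components": {"schemas": combined.get("$defs", {})},
    }


document = build_openapi_document("Pet Sitter API", "0.0.1")

if __name__ == "__main__":
    print(json.dumps(document, indent=2))
```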
Run python models.py to generate the complete OpenAPI schema for both the Pet and Owner models.
The generated OpenAPI schema includes the openapi, info, and components sections with definitions for both the Pet and Owner models.
Now we have a complete OpenAPI document that we can use to generate SDK clients for our API. However, the generated OpenAPI schema does not contain descriptions or example values for the models. We can add these details to the Pydantic models to improve the generated OpenAPI schema.
If we run python models.py, we see that our Owner schema now includes a description field, derived from the docstring we added to the Owner Pydantic model.
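As a small illustration of this behavior, the sketch below adds a docstring to an Owner model and reads the description back from the generated schema. The docstring wording and the Owner fields are assumptions.

```python
from typing import List

from pydantic import BaseModel


class Pet(BaseModel):
    id: int
    name: str
    breed: str


class Owner(BaseModel):
    """An owner of one or more pets."""

    id: int
    name: str
    pets: List[Pet]  # assumed field


# Pydantic copies the class docstring into the schema's description.
owner_schema = Owner.model_json_schema()

if __name__ == "__main__":
    print(owner_schema["description"])
```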
If we run python models.py, we see that our Pet schema now includes descriptions for each field.
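Field descriptions like these are typically set with Pydantic's Field function. The following sketch shows the pattern, with description strings that are illustrative assumptions rather than the guide's exact wording.

```python
from pydantic import BaseModel, Field


class Pet(BaseModel):
    id: int = Field(..., description="Unique identifier of the pet")
    name: str = Field(..., description="Name of the pet")
    breed: str = Field(..., description="Breed of the pet")


# Each description is emitted alongside the field's type in the schema.
pet_properties = Pet.model_json_schema()["properties"]

if __name__ == "__main__":
    for field_name, field_schema in pet_properties.items():
        print(field_name, "->", field_schema["description"])
```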
If we run python models.py, we see that our Pet schema now includes example values for each field.
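Example values can be attached through the examples argument of Field, as in this sketch. The example values themselves are assumptions for illustration.

```python
from pydantic import BaseModel, Field


class Pet(BaseModel):
    id: int = Field(..., examples=[1])
    name: str = Field(..., examples=["Rex"])
    breed: str = Field(..., examples=["Labrador"])


# The examples list is passed through to the generated schema unchanged.
pet_properties = Pet.model_json_schema()["properties"]

if __name__ == "__main__":
    for field_name, field_schema in pet_properties.items():
        print(field_name, "->", field_schema["examples"])
```

Documentation tools and SDK generators can use these values to produce more realistic sample requests and responses.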
If we run python models.py, we see that the breed field in the Pet schema now has two types: string and null, and it has been removed from the required list. Only id and name are required fields after marking breed as optional.
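A sketch of an optional field follows, assuming breed is declared as Optional[str] with a default of None. In Pydantic V2 the default is what removes the field from the required list; Optional[str] on its own only allows null as a value.

```python
from typing import Optional

from pydantic import BaseModel


class Pet(BaseModel):
    id: int
    name: str
    # The None default makes the field optional; Optional[str] allows null.
    breed: Optional[str] = None


pet_schema = Pet.model_json_schema()

# Optional[str] is expressed as anyOf with a string branch and a null branch.
breed_types = [option["type"] for option in pet_schema["properties"]["breed"]["anyOf"]]

if __name__ == "__main__":
    print("breed types:", breed_types)
    print("required:", pet_schema["required"])
```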
In our generated OpenAPI schema, we have a new pet_type field in the Pet schema.
This enum is represented as a separate schema in the OpenAPI document.
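The sketch below illustrates how an enum field ends up as its own schema. The PetType class name and its members are hypothetical. Only the pet_type field name comes from the text above.

```python
from enum import Enum

from pydantic import BaseModel


class PetType(str, Enum):
    # Hypothetical members, chosen for illustration.
    DOG = "dog"
    CAT = "cat"
    BIRD = "bird"


class Pet(BaseModel):
    id: int
    name: str
    pet_type: PetType


pet_schema = Pet.model_json_schema()

# The enum gets its own definition, and the field refers to it by $ref.
enum_values = pet_schema["$defs"]["PetType"]["enum"]
pet_type_ref = pet_schema["properties"]["pet_type"]["$ref"]

if __name__ == "__main__":
    print(enum_values)
    print(pet_type_ref)
```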
For reference, this is the initial models.py from the first step of the walkthrough:
from pydantic import BaseModel


class Pet(BaseModel):
    id: int
    name: str
    breed: str
Adding Paths and Operations to the OpenAPI Schema
Now that we have generated an OpenAPI schema from our Pydantic models, we can use the schema to generate SDK clients for our API.
However, the OpenAPI document we generated, while valid, does not include the paths section, which defines the API endpoints and operations.
When using Pydantic with FastAPI, you can define your API endpoints and operations directly in your FastAPI application. FastAPI automatically generates the OpenAPI schema for your API, including the paths section.
Let’s see how we can define API endpoints and operations in a framework-agnostic way and add them to the OpenAPI schema.
Install openapi-pydantic
We’ll use the openapi-pydantic library to define a complete OpenAPI schema with paths and operations.
The benefit of using openapi-pydantic is that it allows you to define the API endpoints and operations in a Python dictionary, while still getting the benefit of Pydantic’s IDE support and type checking.
The library includes convenience methods to convert Pydantic models to OpenAPI schema components and add them to the OpenAPI schema.
Install the openapi-pydantic library:
pip install openapi-pydantic
Create a new Python file called api.py and define the API endpoints and operations using the openapi-pydantic library.
The api.py file saves the complete OpenAPI schema to a file named openapi.yaml.
Run python api.py to generate the complete OpenAPI schema with paths and operations and save it to a file named openapi.yaml.
Our api.py file imports the Pet and Owner models from models.py.
We’ll use the models.py file from the previous steps to define the Pydantic models for Pet and Owner.
In api.py, we then define two response schemas as Pydantic models: PetsResponse and OwnersResponse.
Defining response schemas as Pydantic models allows us to reuse them in multiple operations, and to use them for validation and serialization in our API request handlers.
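To make the validation and serialization point concrete, here is a minimal sketch of how a request handler might use PetsResponse. The sample payload is an assumption, and the Pet model is redefined here only to keep the snippet self-contained.

```python
from typing import List

from pydantic import BaseModel, Field


class Pet(BaseModel):
    id: int
    name: str
    breed: str


class PetsResponse(BaseModel):
    """A response containing a list of pets"""

    pets: List[Pet] = Field(..., description="List of pets")


# Data as it might arrive from a database or another service (assumed sample).
raw = {"pets": [{"id": 1, "name": "Rex", "breed": "Labrador"}]}

# Validation: parse untrusted data into typed objects.
response = PetsResponse.model_validate(raw)

# Serialization: turn the typed objects back into JSON-compatible data.
payload = response.model_dump(mode="json")

if __name__ == "__main__":
    print(payload)
```

The same class can then be referenced from several operations, so the documented response shape and the runtime response shape stay in sync.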
We’ll start by defining a function called construct_base_open_api that returns an OpenAPI object with the base configuration for our API.
The function defines the API title, version, and servers, and includes the paths for the /pets, /pets/{pet_id}, and /owners endpoints.
The /pets path includes two operations: GET to list all pets and POST to create a pet.
The GET operation returns a list of pets using the PetsResponse schema.
Note that we added operationId and description fields to the operations to provide additional information about the operation.
Clear operation IDs and descriptions help API users understand the purpose of each operation and allow SDK generators to create more informative client code.
The POST operation creates a pet using the Pet schema as the request body and returns the created pet using the Pet schema.
We use the PydanticSchema class from openapi-pydantic to reference the Pydantic model in the OpenAPI schema.
In a real-world application, you would likely not include the pet’s ID in the request body as the server would generate the ID, but for simplicity, we include it here.
This translates to the following OpenAPI operation:
The /pets/{pet_id} path includes a GET operation to get a pet by ID.
The operation includes a path parameter pet_id to specify the ID of the pet to retrieve.
The GET operation’s parameters field includes the path parameter pet_id with a description, required flag, and schema definition.
The responses field includes a 200 response with the Pet schema as the response body.
This translates to the following OpenAPI operation.
Note how the generated schema closely resembles the Pydantic model.
We’ll skip the rest of the openapi.yaml file, as it is similar to the components generated in the previous section.
The complete api.py file:
from typing import List

import yaml
from pydantic import BaseModel, Field
from openapi_pydantic.v3 import OpenAPI, Info, PathItem, Operation
from openapi_pydantic.util import PydanticSchema, construct_open_api_with_schema_class

from models import Pet, Owner


class PetsResponse(BaseModel):
    """A response containing a list of pets"""

    pets: List[Pet] = Field(..., description="List of pets")


class OwnersResponse(BaseModel):
    """A response containing a list of owners"""

    owners: List[Owner] = Field(..., description="List of owners")


def construct_base_open_api() -> OpenAPI:
    return OpenAPI(
        openapi="3.1.0",
        info=Info(
            title="Pet Sitter API",
            version="0.0.1",
        ),
        servers=[
            {
                "url": "http://127.0.0.1:4010",
                "description": "Local prism server",
            },
        ],
        paths={
            "/pets": PathItem(
                get=Operation(
                    operationId="listPets",
                    description="List all pets",
                    responses={
                        "200": {
                            "description": "A list of pets",
                            "content": {"application/json": {"schema": PydanticSchema(schema_class=PetsResponse)}},
                        }
                    },
                ),
                post=Operation(
                    operationId="createPet",
                    description="Create a pet",
                    requestBody={
                        "content": {"application/json": {"schema": PydanticSchema(schema_class=Pet)}}
                    },
                    responses={
                        "201": {
                            "description": "Pet created",
                            "content": {"application/json": {"schema": PydanticSchema(schema_class=Pet)}},
                        }
                    },
                ),
            ),
            "/pets/{pet_id}": PathItem(
                get=Operation(
                    operationId="getPetById",
                    description="Get a pet by ID",
                    parameters=[
                        {
                            "name": "pet_id",
                            "in": "path",
                            "description": "ID of pet to return",
                            "required": True,
                            "schema": {
                                "type": "integer",
                                "format": "int64",
                            },
                            "examples": {"1": {"value": 1}},
                        },
                    ],
                    responses={
                        "200": {
                            "description": "A pet",
                            "content": {"application/json": {"schema": PydanticSchema(schema_class=Pet)}},
                        }
                    },
                ),
            ),
            "/owners": PathItem(
                get=Operation(
                    operationId="listOwners",
                    description="List all owners",
                    responses={
                        "200": {
                            "description": "A list of owners",
                            "content": {"application/json": {"schema": PydanticSchema(schema_class=OwnersResponse)}},
                        }
                    },
                ),
            ),
        },
    )


open_api = construct_base_open_api()
# Resolve the PydanticSchema placeholders and add the models to the components section
open_api = construct_open_api_with_schema_class(open_api)

if __name__ == "__main__":
    # Write the OpenAPI document to openapi.yaml
    with open("openapi.yaml", "w") as file:
        file.write(
            yaml.dump(
                open_api.model_dump(
                    by_alias=True,
                    mode="json",
                    exclude_none=True,
                    exclude_unset=True,
                ),
                sort_keys=False,
            )
        )
Generating an SDK from the OpenAPI Schema
Now that we have a complete OpenAPI schema with paths and operations, we can use it to generate an SDK client for our API.
Prerequisites for SDK Generation
Install Speakeasy by following the Speakeasy installation instructions.
On macOS, you can install Speakeasy using Homebrew:
brew install speakeasy-api/homebrew-tap/speakeasy
Authenticate with Speakeasy using the following command:
speakeasy auth login
Generate an SDK Using Speakeasy
Run the following command to generate an SDK from the openapi.yaml file:
speakeasy quickstart
Follow the onscreen prompts to provide the necessary configuration details for your new SDK, such as the name, schema location, and output path. Enter openapi.yaml when prompted for the OpenAPI document location and select TypeScript when prompted for which language you would like to generate.
Adding Speakeasy Extensions to the OpenAPI Schema
Speakeasy uses OpenAPI extensions to provide additional information for generating SDKs.
We can add extensions using OpenAPI Overlays, which are YAML files that Speakeasy overlays on top of the OpenAPI schema.
Alternatively, you can add extensions directly to the OpenAPI schema using the x- prefix.
For example, you can add the x-speakeasy-retries extension to have Speakeasy generate retry logic in the SDK.
Import the Dict and Any types from the typing module in api.py, and ConfigDict from pydantic.
We’ll use these types to define the x-speakeasy-retries extension in the OpenAPI schema.
In the OpenAPIwithRetries class, we define the x-speakeasy-retries extension.
Note that we need to use the alias parameter to define the extension with the x- prefix, then allow ourselves to use the xSpeakeasyRetries attribute in the class by setting populate_by_name=True in the model_config.
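The alias mechanism can be seen in isolation in the sketch below. DocumentWithRetries is a hypothetical stand-in that subclasses BaseModel rather than the OpenAPI class, so the snippet runs with Pydantic alone. The retry values are trimmed for brevity.

```python
from typing import Any, Dict

from pydantic import BaseModel, ConfigDict, Field


class DocumentWithRetries(BaseModel):
    """Hypothetical stand-in for OpenAPIwithRetries, using plain BaseModel."""

    # Allow construction by attribute name as well as by alias.
    model_config = ConfigDict(populate_by_name=True)

    openapi: str = "3.1.0"
    xSpeakeasyRetries: Dict[str, Any] = Field(..., alias="x-speakeasy-retries")


retries = {"strategy": "backoff", "retryConnectionErrors": True}

# populate_by_name=True lets us pass the Python-friendly attribute name.
by_name = DocumentWithRetries(xSpeakeasyRetries=retries)

# The alias is still accepted, which is useful when loading an existing document.
by_alias = DocumentWithRetries(**{"x-speakeasy-retries": retries})

# Dumping with by_alias=True emits the x- prefixed key that OpenAPI expects.
dumped = by_name.model_dump(by_alias=True)

if __name__ == "__main__":
    print(dumped)
```

The alias is needed because x-speakeasy-retries contains hyphens and so cannot be a Python attribute name.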
We then update the construct_base_open_api function to return an OpenAPIwithRetries object.
Add xSpeakeasyRetries to the OpenAPIwithRetries object in the construct_base_open_api function.
This translates to the following OpenAPI schema:
The updated api.py file with the retry extension:
from typing import List, Dict, Any

import yaml
from pydantic import BaseModel, Field, ConfigDict
from openapi_pydantic.v3 import OpenAPI, Info, PathItem, Operation
from openapi_pydantic.util import PydanticSchema, construct_open_api_with_schema_class

from models import Pet, Owner


class PetsResponse(BaseModel):
    """A response containing a list of pets"""

    pets: List[Pet] = Field(..., description="List of pets")


class OwnersResponse(BaseModel):
    """A response containing a list of owners"""

    owners: List[Owner] = Field(..., description="List of owners")


class OpenAPIwithRetries(OpenAPI):
    """OpenAPI with xSpeakeasyRetries extension

    This class extends the OpenAPI model to include the x-speakeasy-retries extension.
    """

    xSpeakeasyRetries: Dict[str, Any] = Field(
        ...,
        description="Retry configuration for the API",
        alias="x-speakeasy-retries",
    )
    model_config = ConfigDict(
        populate_by_name=True,
    )


def construct_base_open_api() -> OpenAPIwithRetries:
    return OpenAPIwithRetries(
        openapi="3.1.0",
        info=Info(
            title="Pet Sitter API",
            version="0.0.1",
        ),
        servers=[
            {
                "url": "http://127.0.0.1:4010",
                "description": "Local prism server",
            },
        ],
        xSpeakeasyRetries={
            "strategy": "backoff",
            "backoff": {
                "initialInterval": 500,
                "maxInterval": 60000,
                "maxElapsedTime": 3600000,
                "exponent": 1.5,
            },
            "statusCodes": ["5XX"],
            "retryConnectionErrors": True,
        },
        paths={
            "/pets": PathItem(
                get=Operation(
                    operationId="listPets",
                    description="List all pets",
                    responses={
                        "200": {
                            "description": "A list of pets",
                            "content": {"application/json": {"schema": PydanticSchema(schema_class=PetsResponse)}},
                        }
                    },
                ),
                post=Operation(
                    operationId="createPet",
                    description="Create a pet",
                    requestBody={
                        "content": {"application/json": {"schema": PydanticSchema(schema_class=Pet)}}
                    },
                    responses={
                        "201": {
                            "description": "Pet created",
                            "content": {"application/json": {"schema": PydanticSchema(schema_class=Pet)}},
                        }
                    },
                ),
            ),
            "/pets/{pet_id}": PathItem(
                get=Operation(
                    operationId="getPetById",
                    description="Get a pet by ID",
                    parameters=[
                        {
                            "name": "pet_id",
                            "in": "path",
                            "description": "ID of pet to return",
                            "required": True,
                            "schema": {
                                "type": "integer",
                                "format": "int64",
                            },
                            "examples": {"1": {"value": 1}},
                        },
                    ],
                    responses={
                        "200": {
                            "description": "A pet",
                            "content": {"application/json": {"schema": PydanticSchema(schema_class=Pet)}},
                        }
                    },
                ),
            ),
            "/owners": PathItem(
                get=Operation(
                    operationId="listOwners",
                    description="List all owners",
                    responses={
                        "200": {
                            "description": "A list of owners",
                            "content": {"application/json": {"schema": PydanticSchema(schema_class=OwnersResponse)}},
                        }
                    },
                ),
            ),
        },
    )


open_api = construct_base_open_api()
# Resolve the PydanticSchema placeholders and add the models to the components section
open_api = construct_open_api_with_schema_class(open_api)

if __name__ == "__main__":
    # Write the OpenAPI document to openapi.yaml
    with open("openapi.yaml", "w") as file:
        file.write(
            yaml.dump(
                open_api.model_dump(
                    by_alias=True,
                    mode="json",
                    exclude_none=True,
                    exclude_unset=True,
                ),
                sort_keys=False,
            )
        )
Add Tags to the OpenAPI Schema
To group operations in the OpenAPI schema, you can use tags. This also allows Speakeasy to structure the generated SDK code and documentation logically.
Add a tags field to the OpenAPIwithRetries object, then add a tags field to each operation in the construct_base_open_api function:
def construct_base_open_api() -> OpenAPIwithRetries:
    return OpenAPIwithRetries(
        # ...
        tags=[
            {
                "name": "pets",
                "description": "Operations about pets",
            },
            {
                "name": "owners",
                "description": "Operations about owners",
            },
        ],
        paths={
            "/pets": PathItem(
                get=Operation(
                    # ...
                    tags=["pets"],
                ),
                # ...
            ),
            # ...
        },
        # ...
    )
Run python api.py to update the openapi.yaml file with the tags field, then regenerate the SDK using Speakeasy.
python api.py
speakeasy quickstart
Speakeasy will detect the changes to your OpenAPI schema, generate the SDK with the updated tags, and automatically increment the SDK’s version number.
Take a look at the generated SDK to see how Speakeasy groups operations by tags.
We Can Help Get Your Pydantic Models Ready for SDK Generation
In this tutorial, we learned how to generate an OpenAPI schema from Pydantic models and use it to generate an SDK client using Speakeasy.
If you would like to discuss how to get your Pydantic models ready for SDK generation, give us feedback, or shoot the breeze about all things OpenAPI and SDKs, join our Slack.