Enabling Event-Streaming Operations
Server-sent events (SSE) is a core web feature that provides servers with a low overhead solution to push real-time events to the client when they become available. SSE can be used to stream chat completions from a large language model, real-time stock prices, and sensor readings to clients.
SSE is similar to WebSockets in that it uses a persistent connection but differs in that it is unidirectional - only the server sends events. SSE is simpler to implement in many existing backend HTTP frameworks.
INFO
Speakeasy makes it easy to build SSE into generated SDKs without vendor
extensions or heuristics. Leverage SSE by modeling SSE streams as
text/event-stream
responses with pure OpenAPI.
Here’s a short example of using an SDK to chat with an LLM and read its response as a stream:
import { SDK } from '@speakeasy/sdk';const sdk = new SDK()const response = await sdk.chat.create({prompt: "What are the top 3 French cheeses by consumption?"})for await (const event of response.chatStream) {process.stdout.write(event.data);}
INFO
The SSE feature is currently supported in TypeScript, Python, Go, and Java. Let us know if you’d like to see support for other languages.
Modeling SSE in OpenAPI
To implement SSE in a generated SDKs, model an API endpoint that serves an event
stream in an OpenAPI document. Each server-sent event can contain up to four
types of fields: id
, event
, data
, and retry
.
Basic Implementation
The example below illustrates an operation that streams events containing only a data
field that holds string content:
paths:/chat:post:summary: Create a chat completion from a promptoperationId: createtags: [chat]requestBody:required: truecontent:application/json:schema:$ref: "#/components/schemas/ChatRequest"responses:"200":description: Chat completion createdcontent:text/event-stream:schema:$ref: "#/components/schemas/ChatStream"components:schemas:ChatRequest:type: objectrequired: [prompt]properties:prompt:type: stringChatStream:description: A server-sent event containing chat completion contenttype: objectrequired: [data]properties:data:type: string
When data
is a JSON Object
SSE implementation isn’t limited to string data. If data
is specified as an
object instead of a string, then SDKs will assume the field will contain JSON
content. Raw data received from the server will be deserialized into an object
for the application code to consume.
components:schemas:ChatStream:description: A server-sent event containing chat completion contenttype: objectrequired: [data]properties:data:type: objectproperties:content:type: stringmodel:type: stringenum: ["foo-gpt-tiny", "foo-gpt-small"]created:type: integer
The Speakeasy-generated TypeScript SDK for the example above will allow users to access this object:
for await (const event of response.chatStream) {const { content, model, created } = event.data;process.stdout.write(content);}
Handling Multiple Event Types
Other streaming APIs send multiple types of events with the id
and event
fields. These event types can be described as a union (oneOf
) with the event
field acting as a discriminator:
components:schemas:ChatStream:oneOf:- $ref: "#/components/schemas/HeartbeatEvent"- $ref: "#/components/schemas/ChatEvent"discriminator:propertyName: eventmapping:ping: "#/components/schemas/HeartbeatEvent"completion: "#/components/schemas/ChatEvent"HeartbeatEvent:description: A server-sent event indicating that the server is still processing the requesttype: objectrequired: [event]properties:event:type: stringconst: "ping"ChatEvent:description: A server-sent event containing chat completion contenttype: objectrequired: [id, event, data]properties:id:type: stringevent:type: stringconst: completiondata:type: objectrequired: [content]properties:content:type: string
Endpoints with Multiple Response Types
For APIs that handle both JSON responses and streaming events, use URL fragments to define separate paths for each response type. Each fragment maps to a specific behavior—either returning a complete JSON response or streaming data. This approach allows Speakeasy to generate distinct SDK methods with clear return types while maintaining API flexibility.
paths:/chat:post:summary: >Create a chat completion from a prompt. The entire response isreturned as a single JSON object.operationId: createtags: [chat]requestBody:required: truecontent:application/json:schema:$ref: "#/components/schemas/ChatRequestJson"responses:"200":description: Chat completion createdcontent:application/json:schema:$ref: "#/components/schemas/ChatResponse"/chat#streamed:post:summary: >Create a chat completion from a prompt. The response is streamed inchunks as it is generated.operationId: createStreamedtags: [chat]requestBody:required: truecontent:application/json:schema:$ref: "#/components/schemas/ChatRequestStream"responses:"200":description: Chat completion createdcontent:text/event-stream:schema:$ref: "#/components/schemas/ChatStream"components:schemas:ChatRequest:# ...ChatRequestJson:allOf:- $ref: "#/components/schemas/ChatRequest"- type: objectproperties:stream:type: booleanenum: [false]default: falseChatRequestStream:allOf:- $ref: "#/components/schemas/ChatRequest"- type: objectproperties:stream:type: booleanenum: [true]default: trueChatResponse:# ...ChatStream:# ...
IMPORTANT
The stream
properties in the ChatRequestJson
and ChatRequestStream
schemas are treated as constants, ensuring that each request type always has a
fixed stream value (false for JSON responses and true for streamed responses).
In OpenAPI 3.0, this is achieved using single-value enums. For OpenAPI 3.1,
simplify schema by using the const
field instead of enum
, which
explicitly defines the property as having a constant value. This makes the
specification more concise and easier to maintain.
See the Speakeasy OpenAPI reference on enums for more information.
Use chat
for the non-streaming endpoint and chatStreamed
for the streaming
endpoint:
import { SDK } from '@speakeasy/sdk';const sdk = new SDK()// Non-streaming methodconst jsonResponse = await sdk.chat.create({prompt: "What are the top 3 French cheeses by consumption?"});console.log(jsonResponse.content);// Streaming methodconst stream = await sdk.chat.createStreamed({prompt: "What are the top 3 French cheeses by consumption?"});for await (const event of response.chatStream) {process.stdout.write(event.data);}
NOTE
Accross all of these examples, the schema for the events only ever specifies one or more of the four recognized fields. Adding other fields will trigger a validation error when generating an SDK with the Speakeasy CLI or GitHub action.
Sentinel events
Some SSE APIs will terminate the stream by sending a final, special event. This sentinel event is only used to signal that there are no more events and is not intended for application code to handle.
In the example below, the final data: [DONE]
event is the sentinel event:
HTTP/1.1 200 OKContent-Type: text/event-stream; charset=utf-8Date: Fri, 12 Jul 2024 14:29:22 GMTKeep-Alive: timeout=5, max=1000Connection: Keep-Alivedata: {"content": "there"}data: {"content": "are 7"}data: {"content": "continents in the world"}data: [DONE]
To hide this final event in generated SDK methods, use the x-speakeasy-sse-sentinel: <string>
extension on a text/event-stream
media object:
paths:/chat:post:summary: Create a chat completion from a promptoperationId: createtags: [chat]requestBody:required: truecontent:application/json:schema:$ref: '#/components/schemas/ChatRequest'responses:'200':description: Chat completion createdcontent:text/event-stream:+ x-speakeasy-sse-sentinel: '[DONE]'schema:$ref: '#/components/schemas/ChatEvent'components:schemas:ChatEvent:description: A server-sent event containing chat completion contenttype: objectrequired: [data]properties:data:type: objectrequired: [content]properties:content:type: string
Application code like the following TypeScript sample will behave as expected. The async iteration loop will finish when the sentinel event is encountered:
const llm = new LLM();const stream = await llm.chat.create({prompt: "How many continents are there?",});for await (const event of stream) {// ^? ChatEventprocess.stdout.write(event.data.content);}