AI Engineering Academy · Lesson

Structured Outputs with Pydantic

Define Pydantic models as your output schema, pass them to the OpenAI API through its structured outputs feature, and automatically deserialize responses into typed Python objects.

Why Pydantic for LLM Output?

Pydantic is a Python data validation library that defines data schemas using Python type hints. It excels at validating and deserializing data from external sources, and LLM output is one of the least reliable external sources you will encounter. Combining Pydantic schemas with OpenAI's structured outputs gives you type-safe, validated responses that are deserialized into Python objects automatically.
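As a minimal sketch of that round trip, the example below derives a JSON Schema from a model and then validates a JSON string against it. It assumes Pydantic v2. The Ticket model and the raw_response string are hypothetical, and no API call is made; the string only stands in for text a model might return.

```python
from typing import List

from pydantic import BaseModel


class Ticket(BaseModel):  # hypothetical schema, for illustration only
    title: str
    priority: int
    tags: List[str]


# JSON Schema derived from the type hints. A schema-constrained API is
# given a contract of this kind to shape its output.
schema = Ticket.model_json_schema()
field_names = sorted(schema["properties"])

# Stand-in for the raw text a model might return (assumption: hand-written here).
raw_response = '{"title": "Login fails", "priority": 2, "tags": ["auth", "bug"]}'

# One call parses the JSON, checks every field, and builds a typed object.
ticket = Ticket.model_validate_json(raw_response)
print(field_names)
print(ticket.title, ticket.priority, ticket.tags)
```

The same class serves as both the schema you publish and the type you receive, so the two cannot drift apart.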

Instead of writing data = json.loads(response) followed by manual field extraction and type casting, you get a typed Python object whose fields have all been validated against their declared types at runtime, with auto-completion in your IDE. The pattern is widely used in production code that extracts structured data from model output.
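The difference can be sketched side by side. The Invoice model and both JSON strings below are hypothetical, and the example assumes Pydantic v2. The manual path hands back whatever the JSON contained, while the Pydantic path rejects a value that does not fit the declared type.

```python
import json

from pydantic import BaseModel, ValidationError


class Invoice(BaseModel):  # hypothetical schema, for illustration only
    invoice_id: str
    total: float
    paid: bool


good = '{"invoice_id": "INV-7", "total": 129.5, "paid": false}'
bad = '{"invoice_id": "INV-8", "total": "a lot", "paid": false}'

# Manual approach: json.loads checks syntax only, so a wrong type passes
# through unnoticed until some later line tries to do arithmetic with it.
unchecked_total = json.loads(bad)["total"]

# Pydantic approach: parsing and validation happen in one call.
invoice = Invoice.model_validate_json(good)

bad_fields = []
try:
    Invoice.model_validate_json(bad)
except ValidationError as exc:
    # Each error records the location of the offending field.
    bad_fields = [err["loc"][0] for err in exc.errors()]

print(type(unchecked_total).__name__, invoice.total, bad_fields)
```

Failing at the parsing boundary keeps bad data out of the rest of the program, and the error already names the field that was wrong.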

Defining a Basic Pydantic Schema

A Pydantic model is a class inheriting from BaseModel with fields defined using Python type annotations. Field types can be Python primitives, other Pydantic models for nesting, or types from the typing module for lists, optionals, and unions.

from pydantic import BaseModel, Field
from typing import Optional, List
from enum import Enum

class Sentiment(str, Enum):
    positive = 'positive'
    negative = 'negative'
    neutral = 'neutral'

class ReviewAnalysis(BaseModel):
    sentiment: Sentiment
    confidence: float = Field(ge=0.0, le=1.0, description='Confidence score 0-1')
    key_themes: List[str] = Field(description='Main topics mentioned in the review')
    summary: str = Field(max_length=200, description='One-sentence summary')
    product_name: Optional[str] = Field(default=None, description='Product mentioned, if any')
    would_recommend: Optional[bool] = None

# Pydantic validates types and constraints at instantiation
example = ReviewAnalysis(
    sentiment=Sentiment.positive,
    confidence=0.95,
    key_themes=['fast delivery', 'good quality'],
    summary='Customer loves the product and quick shipping.',
    product_name='Wireless Headphones',
    would_recommend=True
)
print(example.model_dump_json(indent=2))
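The example above only constructs a valid instance. The sketch below, again assuming Pydantic v2, shows the two behaviours it does not exercise: a nested model built from a plain dict, and a ValidationError raised when constraints are violated. It uses a trimmed copy of ReviewAnalysis so that it runs on its own; the Reviewer model is a hypothetical addition for illustration.

```python
from enum import Enum
from typing import List, Optional

from pydantic import BaseModel, Field, ValidationError


class Sentiment(str, Enum):
    positive = "positive"
    negative = "negative"
    neutral = "neutral"


class Reviewer(BaseModel):  # hypothetical nested model
    name: str
    verified_purchase: bool = False


class ReviewAnalysis(BaseModel):  # trimmed copy of the model above
    sentiment: Sentiment
    confidence: float = Field(ge=0.0, le=1.0)
    key_themes: List[str]
    reviewer: Optional[Reviewer] = None


# A nested dict is converted into the nested model, and the plain string
# "positive" is converted into the matching enum member.
ok = ReviewAnalysis.model_validate({
    "sentiment": "positive",
    "confidence": 0.9,
    "key_themes": ["battery life"],
    "reviewer": {"name": "Sam", "verified_purchase": True},
})

# Two fields are invalid here: an unknown enum value and a confidence
# above the le=1.0 bound. Pydantic reports both in a single exception.
failed = []
try:
    ReviewAnalysis(sentiment="thrilled", confidence=1.7, key_themes=[])
except ValidationError as exc:
    failed = sorted(str(err["loc"][0]) for err in exc.errors())

print(type(ok.reviewer).__name__, ok.sentiment is Sentiment.positive, failed)
```

Because all violations are collected into one exception, a caller can log or act on the full list of failing fields rather than fixing them one at a time.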
