Structured Output


🎯 Structured Output in LangChain

The Magic Post Office Analogy

Imagine you’re running a magic post office. People send you messy, jumbled letters, and your job is to sort them into neat, labeled boxes. That’s exactly what Structured Output does in LangChain!

When an AI responds, it often gives you a big blob of text. But what if you need specific pieces of information in specific places? That’s where structured output comes in — it’s like training your AI to fill out a form instead of writing a free-form essay.


🔄 Output Parsing Strategies

What’s an Output Parser?

Think of it like a translator at the post office. The AI speaks in sentences, but your code speaks in data. The parser translates between them!

from langchain.output_parsers import (
    ResponseSchema,
    StructuredOutputParser
)

# Define what boxes we need
parser = StructuredOutputParser.from_response_schemas([
    ResponseSchema(name="title", description="Book title"),
    ResponseSchema(name="author", description="Author name")
])
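
Once the parser is defined, it can generate format instructions for your prompt and turn the model's reply back into a dict. A minimal sketch (the JSON string below just stands in for a real model reply):

# Instructions you can paste into your prompt
print(parser.get_format_instructions())

# Parse a reply that follows the schema
result = parser.parse(
    '{"title": "Dune", "author": "Frank Herbert"}'
)
print(result["author"])  # "Frank Herbert"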

Three Main Strategies

Strategy       | When to Use                    | Like…
Schema-based   | You know exactly what you want | A form with blank fields
Regex-based    | Simple patterns                | Finding phone numbers
Function-based | Complex extraction             | A smart assistant
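
The regex-based row deserves a quick illustration: you often don't need LangChain at all, because plain Python regular expressions do the sorting. A minimal sketch (the phone-number pattern is just an illustrative assumption):

import re

def extract_phone(text: str) -> dict:
    # Pull a simple 555-123-4567 style number out of free text
    match = re.search(r"\b\d{3}-\d{3}-\d{4}\b", text)
    return {"phone": match.group(0) if match else None}

print(extract_phone("Call me at 555-123-4567 tomorrow."))
# {'phone': '555-123-4567'}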

📦 Structured Output Patterns

Pattern 1: The Pydantic Model

This is the gold standard. You create a blueprint, and the AI fills it in!

from pydantic import BaseModel, Field

class MovieReview(BaseModel):
    title: str = Field(
        description="Movie name"
    )
    rating: int = Field(
        description="Score 1-10"
    )
    summary: str = Field(
        description="One sentence"
    )

Why it works: It’s like giving someone a Mad Libs book. They can only put words where the blanks are!
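
To use the blueprint outside of with_structured_output, pair it with PydanticOutputParser. A minimal sketch (the JSON string stands in for a model reply):

from langchain.output_parsers import PydanticOutputParser

parser = PydanticOutputParser(pydantic_object=MovieReview)

# Paste this into your prompt so the model knows the shape
print(parser.get_format_instructions())

# Parse a reply that follows the schema
review = parser.parse(
    '{"title": "Inception", "rating": 9, '
    '"summary": "A dream within a dream."}'
)
print(review.rating)  # 9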

Pattern 2: Dictionary Schema

Simpler, but less strict:

schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "age": {"type": "integer"}
    }
}
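
Recent releases of with_structured_output (next section) can take a JSON-schema dict like this directly and hand you back a plain dict. A hedged sketch, assuming a langchain-openai version that accepts JSON-schema dicts and adding a title so the schema has a name:

from langchain_openai import ChatOpenAI

person_schema = {
    "title": "Person",
    "description": "Basic facts about a person",
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "age": {"type": "integer"}
    },
    "required": ["name", "age"]
}

llm = ChatOpenAI(model="gpt-4o")
structured_llm = llm.with_structured_output(person_schema)

result = structured_llm.invoke("Tell me about Ada Lovelace")
print(result)  # e.g. {'name': 'Ada Lovelace', 'age': 36}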

✨ The with_structured_output Method

This is the easiest way to get structured output. It’s like putting a mold on your AI’s mouth — whatever comes out fits perfectly!

from langchain_openai import ChatOpenAI
from pydantic import BaseModel

class Person(BaseModel):
    name: str
    age: int

# Create the magic!
llm = ChatOpenAI(model="gpt-4")
structured_llm = llm.with_structured_output(
    Person
)

# Now it ALWAYS returns a Person
result = structured_llm.invoke(
    "Tell me about Marie Curie"
)
print(result.name)  # "Marie Curie"
print(result.age)   # 66

Why This is Amazing

graph TD
    A[Messy AI Response] --> B[with_structured_output]
    B --> C[Clean Pydantic Object]
    C --> D[Easy to Use in Code!]

Before: “Marie Curie was born in 1867 and lived to be 66 years old…”
After: Person(name="Marie Curie", age=66)


🔧 JSON Mode

Sometimes you just want the AI to speak JSON. That’s it. No fancy objects, just clean JSON.

llm = ChatOpenAI(
    # JSON mode needs a model that supports response_format
    model="gpt-4o",
    model_kwargs={
        "response_format": {
            "type": "json_object"
        }
    }
)

# OpenAI requires the prompt to mention "JSON" in this mode
response = llm.invoke(
    "List 3 fruits as JSON"
)
print(response.content)
# {"fruits": ["apple", "banana", "cherry"]}

When to Use JSON Mode

✅ Use JSON Mode        | ❌ Don't Use
Simple data extraction | Complex nested objects
Quick prototyping      | Strict validation needed
Flexible schemas       | Type safety required

🔧 Output Fixing Parser

The Problem: Sometimes the AI makes tiny mistakes. A missing quote. A wrong bracket. Your whole program crashes!

The Solution: The Output Fixing Parser — it’s like having a proofreader who can fix typos automatically.

from langchain.output_parsers import (
    OutputFixingParser,
    PydanticOutputParser
)

# Original parser
base_parser = PydanticOutputParser(
    pydantic_object=Person
)

# Wrap it with auto-fix powers!
fixing_parser = OutputFixingParser.from_llm(
    parser=base_parser,
    llm=ChatOpenAI()
)

# Now it can fix small errors
bad_output = '{"name": "Alice", age: 25}'
# Missing quotes around "age"!

fixed = fixing_parser.parse(bad_output)
# Works anyway! Returns Person object

How It Works

graph TD
    A[Broken Output] --> B{Can Parse?}
    B -->|Yes| C[Return Result]
    B -->|No| D[Ask AI to Fix]
    D --> E[Try Again]
    E --> C

🔄 Retry Parser

What if the AI just… gets it completely wrong? Not a typo, but a fundamental misunderstanding?

The Retry Parser doesn’t just fix — it asks again with better instructions!

from langchain.output_parsers import (
    RetryWithErrorOutputParser
)

retry_parser = RetryWithErrorOutputParser.from_llm(
    parser=base_parser,
    llm=ChatOpenAI()
)

# If parsing fails, it will:
# 1. Show the AI what went wrong
# 2. Ask it to try again
# 3. Parse the new response
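
A minimal sketch of the retry call itself; the retry parser needs the original prompt so it can re-ask, and the prompt text here is only an illustrative assumption:

from langchain_core.prompts import PromptTemplate

prompt = PromptTemplate.from_template(
    "Describe a person.\n{format_instructions}"
)
prompt_value = prompt.format_prompt(
    format_instructions=base_parser.get_format_instructions()
)

bad_response = "Her name is Alice, she's 25."  # not JSON at all

# Shows the LLM the error plus the original prompt, then parses the retry
person = retry_parser.parse_with_prompt(bad_response, prompt_value)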

The Difference

Parser | What It Does                 | Best For
Fixing | Corrects small syntax errors | Typos, missing quotes
Retry  | Re-asks with error feedback  | Wrong structure entirely

Think of it this way:

  • Fixing Parser = A spell checker
  • Retry Parser = A teacher saying “Try that answer again”

🛠️ Custom Output Parsers

Sometimes the built-in parsers don’t fit your needs. Time to build your own!

Creating a Custom Parser

from langchain.schema import BaseOutputParser

class EmotionParser(BaseOutputParser):
    """Extracts emotion from text."""

    def parse(self, text: str) -> str:
        # Your custom logic here
        text_lower = text.lower()

        if "happy" in text_lower:
            return "😊 Happy"
        elif "sad" in text_lower:
            return "😢 Sad"
        else:
            return "😐 Neutral"

    def get_format_instructions(self):
        return "Express an emotion."

Using Your Custom Parser

parser = EmotionParser()

result = parser.parse(
    "I'm so happy today!"
)
print(result)  # "😊 Happy"

Real-World Example: Email Parser

class EmailParser(BaseOutputParser):
    def parse(self, text: str) -> dict:
        lines = text.strip().split("\n")
        return {
            "subject": lines[0] if lines else "",
            "body": "\n".join(lines[1:])
        }
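
Because output parsers are Runnables, a custom parser drops straight into a chain. A minimal sketch, assuming an OpenAI chat model and the EmailParser above:

from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI

prompt = ChatPromptTemplate.from_template(
    "Write a short email about {topic}. Put the subject on the first line."
)
chain = prompt | ChatOpenAI(model="gpt-4o-mini") | EmailParser()

email = chain.invoke({"topic": "a team lunch"})
print(email["subject"])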

🎯 Putting It All Together

Here’s the complete flow:

graph TD
    A[User Question] --> B[LLM]
    B --> C{Output Type?}
    C -->|Simple JSON| D[JSON Mode]
    C -->|Strict Schema| E[with_structured_output]
    C -->|Custom Logic| F[Custom Parser]
    D --> G[Parse Result]
    E --> G
    F --> G
    G --> H{Valid?}
    H -->|No, Minor Error| I[Output Fixing Parser]
    H -->|No, Major Error| J[Retry Parser]
    H -->|Yes| K[Use in App!]
    I --> G
    J --> B
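
And here is one way the pieces combine in code; a minimal sketch that reuses the MovieReview model from earlier and assumes an OpenAI chat model:

from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI
from langchain.output_parsers import (
    OutputFixingParser,
    PydanticOutputParser
)

base_parser = PydanticOutputParser(pydantic_object=MovieReview)
llm = ChatOpenAI(model="gpt-4o-mini")

# Small formatting slips get auto-corrected on the way out
fixing_parser = OutputFixingParser.from_llm(parser=base_parser, llm=llm)

prompt = ChatPromptTemplate.from_messages([
    ("system", "Answer using this format:\n{format_instructions}"),
    ("human", "{question}")
]).partial(format_instructions=base_parser.get_format_instructions())

chain = prompt | llm | fixing_parser
review = chain.invoke({"question": "Review the movie Inception in one line."})
print(review.title, review.rating)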

🚀 Quick Reference

Need           | Solution                      | Code
Strict typing  | with_structured_output(Model) | Pydantic model
Just JSON      | JSON mode                     | response_format
Fix typos      | OutputFixingParser            | Wraps existing parser
Complete retry | RetryWithErrorOutputParser    | Re-invokes LLM
Special logic  | Custom parser                 | Extend BaseOutputParser

💡 Key Takeaways

  1. Structured output = AI filling out forms instead of writing essays
  2. with_structured_output is your best friend for type safety
  3. JSON mode is quick and dirty when you just need JSON
  4. Fixing parser handles typos automatically
  5. Retry parser handles complete misunderstandings
  6. Custom parsers let you build anything you need

You’re now ready to make your AI responses predictable, parseable, and powerful! 🎉
