
Structured Output Techniques

Getting reliable JSON, structured data, and type-safe outputs from LLMs. Covers JSON mode, function calling, constrained decoding, Pydantic validation, and handling partial/malformed output in streaming scenarios.

Quick Reference

  • JSON mode: model outputs valid JSON, but no schema enforcement (any valid JSON)
  • Function calling / tool use: model fills parameters for a declared function schema
  • Structured Outputs (OpenAI): guaranteed schema compliance via constrained decoding
  • Pydantic + instructor: Python-native schema validation with automatic retries
  • Streaming JSON: use partial parsers (e.g., partial-json-parser) to handle incomplete JSON
  • Always validate LLM output with a schema -- never trust the model to produce perfect structure

Approaches to Structured Output

There are several ways to get structured data from LLMs, each with different trade-offs in reliability, flexibility, and provider support. The right choice depends on how strictly the schema must be enforced and which providers you need to support.

Approach              | Schema enforced?           | Provider support          | Reliability      | Use when
----------------------|----------------------------|---------------------------|------------------|------------------------------------------------
Prompt-based          | No                         | All providers             | Low (70-90%)     | Quick prototyping, simple schemas
JSON mode             | Partial (valid JSON only)  | OpenAI, Anthropic, Gemini | Medium (90-95%)  | Need valid JSON, flexible schema
Function calling      | Yes (parameter schema)     | OpenAI, Anthropic, Gemini | High (95-99%)    | Tool use patterns, well-defined schemas
Structured Outputs    | Yes (constrained decoding) | OpenAI                    | Very high (99%+) | Critical schemas, zero tolerance for malformation
Instructor + Pydantic | Yes (validation + retry)   | Any provider              | Very high (99%+) | Type-safe Python, any provider

Constrained decoding explained

OpenAI's Structured Outputs feature uses constrained decoding: at each token-generation step, the model is only allowed to produce tokens that are valid continuations of the schema. The output is therefore guaranteed to be schema-compliant -- not because the model is good at following instructions, but because it physically cannot produce invalid output.
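The mechanism can be illustrated with a toy sketch. Here the "schema" admits exactly two outputs, tokens are single characters, and the (simulated) model's preferences are a scoring function; real systems apply the same mask over the model's actual token vocabulary. All names below are illustrative, not any library's API:

```python
# Toy constrained decoder: mask any token that cannot lead to a
# schema-compliant output, then let the "model" pick among survivors.
VALID_OUTPUTS = ['{"sentiment": "positive"}', '{"sentiment": "negative"}']

def allowed_tokens(prefix, vocab):
    """Tokens that keep the output a prefix of some valid completion."""
    return [t for t in vocab
            if any(s.startswith(prefix + t) for s in VALID_OUTPUTS)]

def decode(model_preference, vocab):
    out = ""
    while out not in VALID_OUTPUTS:
        mask = allowed_tokens(out, vocab)
        # The model's favorite token wins, but only among allowed ones --
        # junk tokens in the vocabulary are masked out entirely.
        out += max(mask, key=model_preference)
    return out

vocab = set("".join(VALID_OUTPUTS)) | set("xyz!")
prefers_n = lambda t: 2 if t == "n" else 1   # model "wants" negative
print(decode(prefers_n, vocab))              # '{"sentiment": "negative"}'
```

Even a model that strongly prefers invalid tokens cannot emit them: the mask removes them before selection, which is why constrained decoding gives a hard guarantee rather than a probabilistic one.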

JSON Mode and Function Calling

  • JSON mode: valid JSON, but no schema enforcement
  • Function calling: schema-enforced parameters
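To see the gap JSON mode leaves open, consider a simulated response (on the request side you would set `response_format={"type": "json_object"}` with OpenAI): the output always parses, but nothing stops the model from returning a different shape than you asked for.

```python
import json

# Simulated JSON-mode response: guaranteed to parse as JSON,
# but with a typo'd key and an unrequested field.
response_text = '{"nmae": "Ada", "extra": []}'

data = json.loads(response_text)   # succeeds: it is valid JSON
assert "name" not in data          # ...but not the schema we wanted
```

This is exactly why JSON mode sits at "Medium" reliability in the table above: the syntax is guaranteed, the shape is not.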
Function calling for extraction

Function calling was designed for tool use, but it is excellent for data extraction. Define a 'function' that represents your extraction schema, force the model to call it, and parse the arguments. This gives you schema enforcement without needing OpenAI's Structured Outputs feature.
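A sketch of the pattern, using the OpenAI-style tool format; the `record_contact` function and the response payload are hypothetical, and the actual API call is shown only as a comment:

```python
import json

# An OpenAI-style tool definition doubling as an extraction schema.
tools = [{
    "type": "function",
    "function": {
        "name": "record_contact",
        "description": "Record a contact mentioned in the text",
        "parameters": {
            "type": "object",
            "properties": {
                "name": {"type": "string"},
                "email": {"type": "string"},
            },
            "required": ["name", "email"],
        },
    },
}]

# With a real client you would pass tools=tools and force the call with
# tool_choice={"type": "function", "function": {"name": "record_contact"}},
# then read response.choices[0].message.tool_calls[0].function.arguments.
# Here we parse a simulated arguments payload:
raw_arguments = '{"name": "Ada Lovelace", "email": "ada@example.com"}'
contact = json.loads(raw_arguments)
assert set(contact) >= {"name", "email"}
```

Forcing the tool call is the key step: without it, the model may answer in prose instead of invoking your extraction "function".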

Pydantic + Instructor for Type Safety

The instructor library wraps LLM APIs with Pydantic model validation, giving you Python-native type safety with automatic retries on validation failure. It works with any provider and is the recommended approach for production Python applications.

Structured extraction with instructor and Pydantic
Instructor works with Anthropic too
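A minimal sketch of the instructor pattern, assuming Pydantic v2; the model name and prompt are placeholders, and the network call runs only when an API key is configured:

```python
import os
from pydantic import BaseModel, Field

class Person(BaseModel):
    name: str
    age: int = Field(ge=0, le=130)   # validated, not just typed

# The call itself requires credentials; the shape follows instructor's API.
if os.environ.get("OPENAI_API_KEY"):
    import instructor
    from openai import OpenAI

    client = instructor.from_openai(OpenAI())
    person = client.chat.completions.create(
        model="gpt-4o-mini",
        response_model=Person,   # instructor validates output against this
        max_retries=2,           # re-prompts the model on ValidationError
        messages=[{"role": "user",
                   "content": "Extract the person: Ada Lovelace, age 36."}],
    )
```

For Anthropic the setup is symmetric: `instructor.from_anthropic(anthropic.Anthropic())` yields a client with the same `response_model` interface, which is what makes this approach provider-portable.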
Always use Pydantic for production extraction

Raw JSON parsing with json.loads() is fragile. Pydantic gives you type validation, default values, custom validators, and clear error messages when the model output does not match your schema. Combined with instructor's automatic retry, this makes structured extraction robust enough for production.
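The difference is easy to demonstrate with a hypothetical `Invoice` schema (assuming Pydantic v2): `json.loads()` accepts structurally wrong data without complaint, while Pydantic reports exactly which fields failed.

```python
import json
from pydantic import BaseModel, ValidationError

class Invoice(BaseModel):
    invoice_id: str
    total: float

raw = '{"invoice_id": 42, "total": "not a number"}'
data = json.loads(raw)   # parses fine: json.loads checks syntax only

problems = set()
try:
    Invoice.model_validate(data)
except ValidationError as exc:
    # Each error names the offending field and the reason.
    problems = {e["loc"][0] for e in exc.errors()}
```

Those structured error messages are also what instructor feeds back to the model on retry, which is why the combination converges on valid output.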

Handling Streaming and Partial JSON

When streaming LLM responses, you receive tokens incrementally. If the output is JSON, the stream contains partial, invalid JSON until the response completes. Handling this requires special parsing approaches.
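Libraries like partial-json-parser handle this generally; as a hand-rolled sketch of the core idea, the snippet below tracks open strings and brackets in the fragment and appends the missing closers. It works when the stream is cut inside a string value or after a complete value; a cut right after a key (e.g. `'{"name":'`) is still unparseable:

```python
import json

def close_partial_json(fragment: str):
    """Best-effort completion of a truncated JSON fragment."""
    stack, in_string, escape = [], False, False
    for ch in fragment:
        if in_string:
            if escape:
                escape = False
            elif ch == "\\":
                escape = True          # next char is escaped
            elif ch == '"':
                in_string = False
        elif ch == '"':
            in_string = True
        elif ch in "{[":
            stack.append("}" if ch == "{" else "]")
        elif ch in "}]":
            stack.pop()
    closers = ('"' if in_string else "") + "".join(reversed(stack))
    return json.loads(fragment + closers)

# A stream cut off mid-value still yields usable partial data:
print(close_partial_json('{"name": "Ada", "tags": ["a", "b'))
```

Running the repair on every streamed chunk is what lets a UI render fields as they arrive instead of waiting for the final token.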

Streaming structured output with instructor
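A sketch of instructor's partial streaming, under the same assumptions as before (Pydantic v2, placeholder model name, network call gated on an API key): `create_partial` yields successively more complete instances, with fields still `None` until their tokens arrive.

```python
import os
from typing import Optional
from pydantic import BaseModel

class Report(BaseModel):
    title: Optional[str] = None
    summary: Optional[str] = None

if os.environ.get("OPENAI_API_KEY"):
    import instructor
    from openai import OpenAI

    client = instructor.from_openai(OpenAI())
    # Each iteration yields a Report with whatever fields have streamed in.
    for partial in client.chat.completions.create_partial(
        model="gpt-4o-mini",
        response_model=Report,
        messages=[{"role": "user",
                   "content": "Write a short status report."}],
    ):
        print(partial.title, partial.summary)
```

Treat each yielded object as provisional: early iterations are incomplete by design, as the trade-off below explains.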
Streaming + validation trade-off

You cannot fully validate a Pydantic model until the stream completes. Partial streaming gives you early access to fields but cannot enforce cross-field validation (like 'end_date must be after start_date'). For critical validation, wait for the complete response. For UX, show partial results but indicate they are provisional.

Choosing the Right Approach

Requirement                      | Best approach                 | Why
---------------------------------|-------------------------------|-----------------------------------------------------------
Quick prototype, any provider    | Prompt-based JSON             | Fastest to implement, no library dependencies
Valid JSON guaranteed            | JSON mode (response_format)   | Built into the API, zero parsing failures
Schema enforcement, OpenAI       | Structured Outputs            | Constrained decoding, physically can't produce invalid output
Schema enforcement, any provider | Instructor + Pydantic         | Works everywhere, type-safe, auto-retry
Tool use / agent workflows       | Function calling              | Designed for this, all providers support it
Streaming with typed output      | Instructor partial streaming  | Handles incremental Pydantic population

Defense in depth for structured output

Even with constrained decoding, validate the semantic content. A schema-valid JSON response can still contain nonsensical values (age: -5, date: '2099-13-45'). Layer your defenses: (1) constrained decoding for syntactic validity, (2) Pydantic validators for semantic validity, (3) business logic checks for domain validity.
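Layers (2) and (3) can live in one Pydantic model (v2 API; the `Event` schema and its bounds are illustrative):

```python
from datetime import date
from pydantic import BaseModel, field_validator, model_validator

class Event(BaseModel):
    name: str
    start_date: date          # '2099-13-45' already fails type coercion
    end_date: date
    attendee_age: int

    @field_validator("attendee_age")
    @classmethod
    def age_is_plausible(cls, v: int) -> int:
        # Semantic check: schema-valid ints like -5 are still rejected.
        if not 0 <= v <= 130:
            raise ValueError(f"implausible age: {v}")
        return v

    @model_validator(mode="after")
    def dates_are_ordered(self):
        # Cross-field business rule: requires the whole object.
        if self.end_date < self.start_date:
            raise ValueError("end_date must be on or after start_date")
        return self
```

Field validators catch per-value nonsense, while the `mode="after"` model validator enforces cross-field rules -- the same kind of check that, as noted above, cannot run until a streamed response is complete.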

  • Start with instructor + Pydantic -- it works with every provider and gives you the most flexibility
  • If you are OpenAI-only and need maximum reliability, use Structured Outputs
  • For agent/tool-use patterns, function calling is the natural fit across all providers
  • Always add Pydantic field validators for domain constraints (valid ranges, enum values, format checks)
  • Log validation failures -- they reveal prompt weaknesses and model limitations

Best Practices

Do

  • Use Pydantic models as your source of truth for output schemas -- they provide validation, docs, and type safety
  • Use instructor with automatic retries for production structured extraction
  • Add field-level validators in Pydantic for domain-specific constraints
  • Start with function calling or instructor -- avoid raw prompt-based JSON in production
  • Log and monitor validation failures to identify systematic extraction issues

Don’t

  • Don't use json.loads() without a Pydantic model -- you lose validation and type safety
  • Don't assume JSON mode guarantees your schema -- it only guarantees valid JSON syntax
  • Don't ignore streaming edge cases -- partial JSON requires special handling
  • Don't use Structured Outputs if you need provider portability (OpenAI-only feature)
  • Don't skip semantic validation -- schema-valid JSON can still contain wrong values

Key Takeaways

  • JSON mode guarantees valid JSON syntax but not your schema. Function calling and Structured Outputs enforce schemas.
  • Instructor + Pydantic is the best general-purpose approach: works with any provider, type-safe, auto-retries on failure.
  • Constrained decoding (OpenAI Structured Outputs) physically prevents schema violations at the token level.
  • Streaming JSON requires partial parsing -- use instructor's create_partial or partial-json-parser library.
  • Always validate semantics on top of structure: schema-valid JSON can still contain nonsensical values.
