You've shipped a feature that calls GPT-4 to extract structured data from user-submitted invoices.
In development it works perfectly — the model returns a clean JSON object every time. Then in production,
at 2 AM, you get a Sentry alert: JSON.parse: unexpected token. The model decided to preface its
response with "Sure! Here's the JSON you asked for:" before the actual payload. A week later, same feature,
different bug: the model returns totalAmount instead of total_amount, and your
downstream database write silently drops the field. If you've been prompting your way around LLM output
reliability, OpenAI Structured Outputs
is the fix you've been waiting for.
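That first failure mode is cheap to reproduce locally, no API call needed: one chatty prefix makes the entire payload unparseable. A minimal sketch (the wrapped string below is invented for illustration, not a real API response):

```python
import json

# Hypothetical model output: valid JSON wrapped in conversational filler.
raw = 'Sure! Here\'s the JSON you asked for:\n{"total_amount": 1450.0}'

try:
    json.loads(raw)
except json.JSONDecodeError as exc:
    # The prefix makes the whole string invalid JSON, not just messy JSON.
    print(f"parse failed: {exc.msg} at char {exc.pos}")
```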
Structured Outputs, released by OpenAI in August 2024, lets you supply a
JSON Schema
via the response_format parameter and receive a guaranteed-valid response that
matches that schema exactly. This is different from the older JSON mode ({"type": "json_object"}),
which only ensured the output was valid JSON — not that it matched any particular shape. It's also distinct
from function calling, which routes the model's output into a tool call but adds its own layer of ceremony.
Structured Outputs is the cleanest path: describe the shape you want, get back exactly that shape, every time.
Under the hood, OpenAI uses constrained decoding — the model's token sampling is guided by your schema so
it literally cannot produce an invalid response.
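Concretely, the difference shows up in the request body. A side-by-side sketch of the two response_format payloads (the schema here is a placeholder, not the article's full example):

```python
# Old JSON mode: guarantees valid JSON, says nothing about its shape.
json_mode = {"type": "json_object"}

# Structured Outputs: guarantees the output matches this exact schema.
invoice_schema = {
    "type": "object",
    "properties": {"total_amount": {"type": "number"}},
    "required": ["total_amount"],
    "additionalProperties": False,
}
structured = {
    "type": "json_schema",
    "json_schema": {
        "name": "invoice_extract",
        "strict": True,  # enables token-level constrained decoding
        "schema": invoice_schema,
    },
}
```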
Your First Structured Output
from openai import OpenAI
import json
client = OpenAI()
response = client.chat.completions.create(
model="gpt-4o-2024-08-06",
messages=[
{
"role": "user",
"content": "Extract the vendor name, invoice number, and total amount from this text: "
"Invoice #INV-2024-0892 from Acme Supplies Ltd. Total due: $1,450.00"
}
],
response_format={
"type": "json_schema",
"json_schema": {
"name": "invoice_extract",
"strict": True,
"schema": {
"type": "object",
"properties": {
"vendor_name": {"type": "string"},
"invoice_number": {"type": "string"},
"total_amount": {"type": "number"}
},
"required": ["vendor_name", "invoice_number", "total_amount"],
"additionalProperties": False
}
}
}
)
data = json.loads(response.choices[0].message.content)
print(data)
# {"vendor_name": "Acme Supplies Ltd", "invoice_number": "INV-2024-0892", "total_amount": 1450.0}

Three things to notice here. First, the model is gpt-4o-2024-08-06 — Structured Outputs
requires a model that explicitly supports it (the -2024-08-06 snapshot or later for GPT-4o, or
gpt-4o-mini). Second, response_format.type is "json_schema", not
"json_object". Third, "strict": True is what gives you the guarantee —
without it you're back to best-effort territory. The name field is a label the model sees;
it has no effect on parsing but makes your API logs readable.
Designing the JSON Schema
Here's a more realistic schema for a product catalog extraction task — the kind you'd use to pull structured data from unstructured product descriptions, e-commerce listings, or PDF datasheets. Use the JSON Schema Generator to build and validate your schema visually before wiring it into your API calls.
{
"type": "object",
"properties": {
"product_name": {
"type": "string",
"description": "The full commercial name of the product"
},
"sku": {
"type": "string",
"description": "Stock keeping unit identifier"
},
"price_usd": {
"type": "number",
"description": "Price in US dollars, numeric only"
},
"in_stock": {
"type": "boolean"
},
"categories": {
"type": "array",
"items": { "type": "string" }
},
"dimensions": {
"type": "object",
"properties": {
"width_cm": { "type": "number" },
"height_cm": { "type": "number" },
"depth_cm": { "type": "number" }
},
"required": ["width_cm", "height_cm", "depth_cm"],
"additionalProperties": false
}
},
"required": [
"product_name", "sku", "price_usd",
"in_stock", "categories", "dimensions"
],
"additionalProperties": false
}

- All properties must be in required in strict mode. You cannot have optional fields. If a field might not exist in the source data, use a union type: {"type": ["string", "null"]} and always include it in required.
- additionalProperties must be false at every object level. This applies recursively — your nested objects need it too, not just the root.
- Supported types in strict mode: string, number, integer, boolean, null, array, object. Type unions (["string", "null"]) are allowed.
- No $ref or recursive schemas in strict mode. Everything must be inlined. If you need a reusable definition, copy it.
- Add description fields generously. The model reads them. Saying "Price in US dollars, numeric only — do not include currency symbols" gets you cleaner output than hoping the model guesses right.
- Enums work. {"type": "string", "enum": ["pending", "shipped", "delivered"]} is fully supported and the model will only ever emit one of those three values.
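The nullable-field and enum rules combine naturally. A sketch of a strict-compliant schema for a hypothetical shipment-status task (field names invented for illustration):

```python
# Strict-mode compliant: the nullable field stays in `required`,
# and the enum limits status to exactly three values.
shipment_schema = {
    "type": "object",
    "properties": {
        "tracking_number": {
            "type": ["string", "null"],  # a union type, not an optional field
            "description": "Null when the order has not shipped yet",
        },
        "status": {
            "type": "string",
            "enum": ["pending", "shipped", "delivered"],
        },
    },
    # Every property is listed, even the nullable one.
    "required": ["tracking_number", "status"],
    "additionalProperties": False,
}
```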
Strict Mode vs Non-Strict
# Strict mode — guaranteed conformance, tighter schema rules
response_format_strict = {
"type": "json_schema",
"json_schema": {
"name": "product_extract",
"strict": True, # <-- the key flag
"schema": product_schema
}
}
# Non-strict — more schema flexibility, best-effort conformance
response_format_lenient = {
"type": "json_schema",
"json_schema": {
"name": "product_extract",
"strict": False,
"schema": product_schema
}
}

With strict: True, OpenAI pre-processes your schema the first time it's used
and caches the constrained decoder. The first call with a new schema takes slightly longer; subsequent
calls with the same schema are fast. What you get in return: the model output is structurally
guaranteed — you can call json.loads() and then access fields directly without
defensive checks. What you give up: $ref, anyOf across structural variants,
and recursive schemas are not supported. Non-strict mode accepts a wider range of
JSON Schema
features but falls back to best-effort — the model tries to follow the schema but isn't constrained
at the token level. For production extraction pipelines, always use strict mode. The schema restrictions
are manageable once you understand them.
Nested Objects and Arrays
Nested structures work well, but every nested object needs its own "additionalProperties": false
and its own "required" array listing all properties. A common mistake is applying strict rules
to the root object and forgetting the children — OpenAI will reject the schema with a validation error.
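One way to catch that mistake before the API rejects your schema is a small recursive lint pass of your own. This is a sketch, not an official OpenAI tool, and it checks only the two rules discussed here:

```python
def lint_strict_schema(schema: dict, path: str = "$") -> list[str]:
    """Flag spots that would fail strict-mode validation (partial check)."""
    problems = []
    if schema.get("type") == "object":
        props = schema.get("properties", {})
        if schema.get("additionalProperties") is not False:
            problems.append(f"{path}: additionalProperties must be false")
        if set(schema.get("required", [])) != set(props):
            problems.append(f"{path}: all properties must appear in required")
        for name, sub in props.items():
            problems.extend(lint_strict_schema(sub, f"{path}.{name}"))
    if schema.get("type") == "array" and isinstance(schema.get("items"), dict):
        problems.extend(lint_strict_schema(schema["items"], f"{path}[]"))
    return problems

# A root object that follows the rules but forgot them on its nested child:
bad = {
    "type": "object",
    "properties": {
        "customer": {"type": "object", "properties": {"name": {"type": "string"}}},
    },
    "required": ["customer"],
    "additionalProperties": False,
}
print(lint_strict_schema(bad))  # both problems reported under $.customer
```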
from openai import OpenAI
import json
client = OpenAI()
order_schema = {
"type": "object",
"properties": {
"order_id": {"type": "string"},
"customer": {
"type": "object",
"properties": {
"name": {"type": "string"},
"email": {"type": "string"}
},
"required": ["name", "email"],
"additionalProperties": False # required on nested objects too
},
"line_items": {
"type": "array",
"items": {
"type": "object",
"properties": {
"description": {"type": "string"},
"quantity": {"type": "integer"},
"unit_price": {"type": "number"}
},
"required": ["description", "quantity", "unit_price"],
"additionalProperties": False # required on array item schemas too
}
},
"total_usd": {"type": "number"}
},
"required": ["order_id", "customer", "line_items", "total_usd"],
"additionalProperties": False
}
response = client.chat.completions.create(
model="gpt-4o-2024-08-06",
messages=[{
"role": "user",
"content": (
"Parse this order: Order #ORD-5531 for Jane Smith ([email protected]). "
"2x Wireless Keyboard at $49.99 each, 1x USB Hub at $29.99. Total: $129.97"
)
}],
response_format={
"type": "json_schema",
"json_schema": {
"name": "order_extract",
"strict": True,
"schema": order_schema
}
}
)
order = json.loads(response.choices[0].message.content)
for item in order["line_items"]:
print(f"${item['unit_price']:.2f} x{item['quantity']} {item['description']}")

Handling Refusals
Even with Structured Outputs, the model can refuse to respond — typically when the prompt
triggers a content policy (asking it to extract data from something harmful). When this happens,
finish_reason is "stop" but message.content is null
and message.refusal contains the refusal text. If you don't check for this, you'll get
a TypeError when you try to call json.loads(None). Also watch for
finish_reason == "length" — if the response was cut off due to max_tokens,
the JSON will be incomplete and unparseable regardless of Structured Outputs.
import json
from openai import OpenAI
client = OpenAI()
def extract_invoice(raw_text: str) -> dict | None:
response = client.chat.completions.create(
model="gpt-4o-2024-08-06",
messages=[{"role": "user", "content": f"Extract invoice fields: {raw_text}"}],
response_format={
"type": "json_schema",
"json_schema": {
"name": "invoice_extract",
"strict": True,
"schema": invoice_schema  # same invoice schema as in the first example
}
},
max_tokens=1024
)
choice = response.choices[0]
if choice.finish_reason == "length":
raise ValueError("Response truncated — increase max_tokens or simplify your schema")
if choice.message.refusal:
# Model refused to answer — log and return None rather than crashing
print(f"Model refused: {choice.message.refusal}")
return None
return json.loads(choice.message.content)
result = extract_invoice("Invoice #2024-441 from BuildRight Inc., due $3,200 by Dec 15")
if result:
print(result["vendor_name"], result["total_amount"])

Using the Same Schema for Validation
One underused pattern: use the same JSON Schema
you pass to OpenAI to also validate data coming in from other sources — webhooks, file uploads,
third-party APIs. This gives you a single source of truth for your data shape. In Python, use the
jsonschema library. In Node.js, use Ajv.
You can also paste your schema into the JSON Validator to do a quick
manual sanity check without writing any code.
import json
import jsonschema
from jsonschema import validate, ValidationError
# The same schema used in your OpenAI call
invoice_schema = {
"type": "object",
"properties": {
"vendor_name": {"type": "string"},
"invoice_number": {"type": "string"},
"total_amount": {"type": "number"}
},
"required": ["vendor_name", "invoice_number", "total_amount"],
"additionalProperties": False
}
def validate_invoice(data: dict) -> bool:
try:
validate(instance=data, schema=invoice_schema)
return True
except ValidationError as e:
print(f"Validation failed: {e.message}")
print(f" Path: {' -> '.join(str(p) for p in e.path)}")
return False
# Validate a payload from a webhook — same schema, zero extra work
webhook_payload = json.loads(request_body)
if not validate_invoice(webhook_payload):
return HTTPResponse(status=400, body="Invalid invoice payload")
# Validate the OpenAI output too, for belt-and-suspenders safety
llm_output = json.loads(openai_response.choices[0].message.content)
assert validate_invoice(llm_output), "LLM output failed schema validation — check schema definition"
print(f"Processing invoice {llm_output['invoice_number']} for ${llm_output['total_amount']}")

JavaScript / Node.js Version
The OpenAI Node.js SDK
mirrors the Python API almost exactly: strict sits inside the
json_schema object the same way, and you parse the response with JSON.parse().
Validation with Ajv is the Node.js
equivalent of Python's jsonschema library — it's faster and has excellent TypeScript support.
import OpenAI from "openai";
import Ajv from "ajv";
const client = new OpenAI();
const ajv = new Ajv();
const invoiceSchema = {
type: "object",
properties: {
vendor_name: { type: "string" },
invoice_number: { type: "string" },
total_amount: { type: "number" },
line_items: {
type: "array",
items: {
type: "object",
properties: {
description: { type: "string" },
amount: { type: "number" }
},
required: ["description", "amount"],
additionalProperties: false
}
}
},
required: ["vendor_name", "invoice_number", "total_amount", "line_items"],
additionalProperties: false
};
const validateInvoice = ajv.compile(invoiceSchema);
async function extractInvoice(rawText) {
const response = await client.chat.completions.create({
model: "gpt-4o-2024-08-06",
messages: [{ role: "user", content: `Extract invoice fields: ${rawText}` }],
response_format: {
type: "json_schema",
json_schema: {
name: "invoice_extract",
strict: true,
schema: invoiceSchema
}
}
});
const choice = response.choices[0];
if (choice.message.refusal) {
throw new Error(`Model refused: ${choice.message.refusal}`);
}
const data = JSON.parse(choice.message.content);
// Validate even though strict mode guarantees structure —
// useful for catching schema drift between environments
if (!validateInvoice(data)) {
console.error("Schema validation errors:", validateInvoice.errors);
throw new Error("Output failed schema validation");
}
return data;
}
const invoice = await extractInvoice(
"Invoice #INV-881 from Nordic Parts AS. " +
"3x Brake Pads at $28.50 each. Total: $85.50"
);
console.log(`${invoice.vendor_name} — Invoice ${invoice.invoice_number}`);
invoice.line_items.forEach(item =>
console.log(` ${item.description}: $${item.amount}`)
);

Wrapping Up
Structured Outputs eliminates an entire category of production bugs — the kind where the model
returns almost-right JSON that breaks your parser at 2 AM. The workflow is straightforward: design your
schema carefully (every nested object needs additionalProperties: false and a full
required array), set strict: true, and handle refusals and truncation explicitly.
Once the schema is in place, you can reuse it across your stack — in the OpenAI call, in webhook validation,
in your test fixtures — with libraries like jsonschema (Python) or
Ajv (Node.js). If you're starting from
scratch on a schema, the JSON Schema Generator is the fastest way
to get a working base schema from a sample payload. The days of prompt-engineering your way to reliable
JSON output are over — use the tool that was built for it.