You've shipped a feature that calls GPT-4 to extract structured data from user-submitted invoices. In development it works perfectly — the model returns a clean JSON object every time. Then in production, at 2 AM, you get a Sentry alert: JSON.parse: unexpected token. The model decided to preface its response with "Sure! Here's the JSON you asked for:" before the actual payload. A week later, same feature, different bug: the model returns totalAmount instead of total_amount, and your downstream database write silently drops the field. If you've been prompting your way around LLM output reliability, OpenAI Structured Outputs is the fix you've been waiting for.

Structured Outputs, released by OpenAI in August 2024, lets you supply a JSON Schema via the response_format parameter and receive a guaranteed-valid response that matches that schema exactly. This is different from the older JSON mode ({"type": "json_object"}), which only ensured the output was valid JSON — not that it matched any particular shape. It's also distinct from function calling, which routes the model's output into a tool call but adds its own layer of ceremony. Structured Outputs is the cleanest path: describe the shape you want, get back exactly that shape, every time. Under the hood, OpenAI uses constrained decoding — the model's token sampling is guided by your schema so it literally cannot produce an invalid response.

Your First Structured Output

python
from openai import OpenAI
import json

client = OpenAI()

response = client.chat.completions.create(
    model="gpt-4o-2024-08-06",
    messages=[
        {
            "role": "user",
            "content": "Extract the vendor name, invoice number, and total amount from this text: "
                       "Invoice #INV-2024-0892 from Acme Supplies Ltd. Total due: $1,450.00"
        }
    ],
    response_format={
        "type": "json_schema",
        "json_schema": {
            "name": "invoice_extract",
            "strict": True,
            "schema": {
                "type": "object",
                "properties": {
                    "vendor_name":     {"type": "string"},
                    "invoice_number":  {"type": "string"},
                    "total_amount":    {"type": "number"}
                },
                "required": ["vendor_name", "invoice_number", "total_amount"],
                "additionalProperties": False
            }
        }
    }
)

data = json.loads(response.choices[0].message.content)
print(data)
# {"vendor_name": "Acme Supplies Ltd", "invoice_number": "INV-2024-0892", "total_amount": 1450.0}

Three things to notice here. First, the model is gpt-4o-2024-08-06 — Structured Outputs requires a model that explicitly supports it (the -2024-08-06 snapshot or later for GPT-4o, or gpt-4o-mini). Second, response_format.type is "json_schema", not "json_object". Third, "strict": True is what gives you the guarantee — without it you're back in best-effort territory. The name field is just a label: it has no effect on parsing, but it keeps your API logs readable.
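For contrast, here's the kind of defensive parsing that Structured Outputs makes unnecessary. Before strict schemas, a common workaround was to hunt for the first JSON object buried inside a chatty reply — a stdlib-only sketch (the helper name is mine):

```python
import json

def salvage_json(text: str) -> dict:
    """Extract the first JSON object from a reply that may include a preamble."""
    decoder = json.JSONDecoder()
    start = text.find("{")
    while start != -1:
        try:
            obj, _ = decoder.raw_decode(text, start)
            return obj
        except json.JSONDecodeError:
            start = text.find("{", start + 1)
    raise ValueError("no JSON object found in model output")

reply = 'Sure! Here\'s the JSON you asked for:\n{"total_amount": 1450.0}'
print(salvage_json(reply))  # {'total_amount': 1450.0}
```

It works most of the time, which is exactly the problem — the failure modes only show up in production.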

Designing the JSON Schema

Here's a more realistic schema for a product catalog extraction task — the kind you'd use to pull structured data from unstructured product descriptions, e-commerce listings, or PDF datasheets. Use the JSON Schema Generator to build and validate your schema visually before wiring it into your API calls.

json
{
  "type": "object",
  "properties": {
    "product_name": {
      "type": "string",
      "description": "The full commercial name of the product"
    },
    "sku": {
      "type": "string",
      "description": "Stock keeping unit identifier"
    },
    "price_usd": {
      "type": "number",
      "description": "Price in US dollars, numeric only"
    },
    "in_stock": {
      "type": "boolean"
    },
    "categories": {
      "type": "array",
      "items": { "type": "string" }
    },
    "dimensions": {
      "type": "object",
      "properties": {
        "width_cm":  { "type": "number" },
        "height_cm": { "type": "number" },
        "depth_cm":  { "type": "number" }
      },
      "required": ["width_cm", "height_cm", "depth_cm"],
      "additionalProperties": false
    }
  },
  "required": [
    "product_name", "sku", "price_usd",
    "in_stock", "categories", "dimensions"
  ],
  "additionalProperties": false
}
  • All properties must be in required in strict mode. You cannot have optional fields. If a field might not exist in the source data, use a union type: {"type": ["string", "null"]} and always include it in required.
  • additionalProperties must be false at every object level. This applies recursively — your nested objects need it too, not just the root.
  • Supported types in strict mode: string, number, integer, boolean, null, array, object. Type unions (["string", "null"]) are allowed.
  • $ref works in strict mode, but only within the schema itself: put reusable definitions in $defs and reference them with "$ref": "#/$defs/...", or use "$ref": "#" for root recursion. External references are not allowed.
  • Add description fields generously. The model reads them. Saying "Price in US dollars, numeric only — do not include currency symbols" gets you cleaner output than hoping the model guesses right.
  • Enums work. {"type": "string", "enum": ["pending", "shipped", "delivered"]} is fully supported and the model will only ever emit one of those three values.
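Putting the nullable-field and enum rules together, here's a strict-mode-compatible fragment (illustrative field names) written as a Python dict, ready to drop into a larger schema:

```python
# "discount_code" may be null, but it must still appear in "required";
# "status" is locked to exactly three values via an enum.
status_schema = {
    "type": "object",
    "properties": {
        "discount_code": {
            "type": ["string", "null"],
            "description": "Promo code if present, otherwise null",
        },
        "status": {
            "type": "string",
            "enum": ["pending", "shipped", "delivered"],
        },
    },
    "required": ["discount_code", "status"],
    "additionalProperties": False,
}
```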

Strict Mode vs Non-Strict

python
# Strict mode — guaranteed conformance, tighter schema rules
response_format_strict = {
    "type": "json_schema",
    "json_schema": {
        "name": "product_extract",
        "strict": True,   # <-- the key flag
        "schema": product_schema
    }
}

# Non-strict — more schema flexibility, best-effort conformance
response_format_lenient = {
    "type": "json_schema",
    "json_schema": {
        "name": "product_extract",
        "strict": False,
        "schema": product_schema
    }
}

With strict: True, OpenAI pre-processes your schema the first time it's used and caches the constrained decoder. The first call with a new schema takes slightly longer; subsequent calls with the same schema are fast. What you get in return: the model output is structurally guaranteed — you can call json.loads() and then access fields directly without defensive checks. What you give up: a slice of JSON Schema — validation keywords like minLength, pattern, minimum, and format are rejected, the root must be an object, and every field has to appear in required. Non-strict mode accepts a wider range of JSON Schema features but falls back to best effort — the model tries to follow the schema but isn't constrained at the token level. For production extraction pipelines, always use strict mode. The schema restrictions are manageable once you understand them.

Nested Objects and Arrays

Nested structures work well, but every nested object needs its own "additionalProperties": false and its own "required" array listing all properties. A common mistake is applying strict rules to the root object and forgetting the children — OpenAI will reject the schema with a validation error.
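You can catch the forgotten-children mistake locally, before making an API call, with a small recursive check. This is a sketch of my own, not an official OpenAI validator:

```python
def check_strict(schema: dict, path: str = "$") -> list[str]:
    """Recursively flag objects that would fail strict-mode validation."""
    problems = []
    if schema.get("type") == "object":
        props = schema.get("properties", {})
        if schema.get("additionalProperties") is not False:
            problems.append(f"{path}: additionalProperties must be false")
        if set(schema.get("required", [])) != set(props):
            problems.append(f"{path}: required must list every property")
        for name, sub in props.items():
            problems += check_strict(sub, f"{path}.{name}")
    if schema.get("type") == "array" and "items" in schema:
        problems += check_strict(schema["items"], f"{path}[]")
    return problems

# A schema whose nested "customer" object forgets additionalProperties:
bad = {
    "type": "object",
    "properties": {
        "customer": {
            "type": "object",
            "properties": {"name": {"type": "string"}},
            "required": ["name"],
        }
    },
    "required": ["customer"],
    "additionalProperties": False,
}
print(check_strict(bad))  # ['$.customer: additionalProperties must be false']
```

Running this in a unit test keeps schema regressions out of production.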

python
from openai import OpenAI
import json

client = OpenAI()

order_schema = {
    "type": "object",
    "properties": {
        "order_id": {"type": "string"},
        "customer": {
            "type": "object",
            "properties": {
                "name":  {"type": "string"},
                "email": {"type": "string"}
            },
            "required": ["name", "email"],
            "additionalProperties": False   # required on nested objects too
        },
        "line_items": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    "description": {"type": "string"},
                    "quantity":    {"type": "integer"},
                    "unit_price":  {"type": "number"}
                },
                "required": ["description", "quantity", "unit_price"],
                "additionalProperties": False  # required on array item schemas too
            }
        },
        "total_usd": {"type": "number"}
    },
    "required": ["order_id", "customer", "line_items", "total_usd"],
    "additionalProperties": False
}

response = client.chat.completions.create(
    model="gpt-4o-2024-08-06",
    messages=[{
        "role": "user",
        "content": (
            "Parse this order: Order #ORD-5531 for Jane Smith ([email protected]). "
            "2x Wireless Keyboard at $49.99 each, 1x USB Hub at $29.99. Total: $129.97"
        )
    }],
    response_format={
        "type": "json_schema",
        "json_schema": {
            "name": "order_extract",
            "strict": True,
            "schema": order_schema
        }
    }
)

order = json.loads(response.choices[0].message.content)
for item in order["line_items"]:
    print(f"${item['unit_price']:.2f} x{item['quantity']}  {item['description']}")

Handling Refusals

Even with Structured Outputs, the model can refuse to respond — typically when the prompt trips a content policy (say, asking it to extract data from something harmful). When this happens, finish_reason is "stop" but message.content is null and message.refusal contains the refusal text. If you don't check for this, json.loads(None) will raise a TypeError. Also watch for finish_reason == "length" — if the response was cut off due to max_tokens, the JSON will be incomplete and unparseable regardless of Structured Outputs.

python
import json
from openai import OpenAI

client = OpenAI()

# Same invoice schema as in the first example
invoice_schema = {
    "type": "object",
    "properties": {
        "vendor_name":    {"type": "string"},
        "invoice_number": {"type": "string"},
        "total_amount":   {"type": "number"}
    },
    "required": ["vendor_name", "invoice_number", "total_amount"],
    "additionalProperties": False
}

def extract_invoice(raw_text: str) -> dict | None:
    response = client.chat.completions.create(
        model="gpt-4o-2024-08-06",
        messages=[{"role": "user", "content": f"Extract invoice fields: {raw_text}"}],
        response_format={
            "type": "json_schema",
            "json_schema": {
                "name": "invoice_extract",
                "strict": True,
                "schema": invoice_schema
            }
        },
        max_tokens=1024
    )

    choice = response.choices[0]

    if choice.finish_reason == "length":
        raise ValueError("Response truncated — increase max_tokens or simplify your schema")

    if choice.message.refusal:
        # Model refused to answer — log and return None rather than crashing
        print(f"Model refused: {choice.message.refusal}")
        return None

    return json.loads(choice.message.content)


result = extract_invoice("Invoice #2024-441 from BuildRight Inc., due $3,200 by Dec 15")
if result:
    print(result["vendor_name"], result["total_amount"])

Using the Same Schema for Validation

One underused pattern: use the same JSON Schema you pass to OpenAI to also validate data coming in from other sources — webhooks, file uploads, third-party APIs. This gives you a single source of truth for your data shape. In Python, use the jsonschema library. In Node.js, use Ajv. You can also paste your schema into the JSON Validator to do a quick manual sanity check without writing any code.

python
import json
import jsonschema
from jsonschema import validate, ValidationError

# The same schema used in your OpenAI call
invoice_schema = {
    "type": "object",
    "properties": {
        "vendor_name":    {"type": "string"},
        "invoice_number": {"type": "string"},
        "total_amount":   {"type": "number"}
    },
    "required": ["vendor_name", "invoice_number", "total_amount"],
    "additionalProperties": False
}

def validate_invoice(data: dict) -> bool:
    try:
        validate(instance=data, schema=invoice_schema)
        return True
    except ValidationError as e:
        print(f"Validation failed: {e.message}")
        print(f"  Path: {' -> '.join(str(p) for p in e.path)}")
        return False


# Validate a payload from a webhook — same schema, zero extra work
# (request_body and HTTPResponse are stand-ins for your web framework's objects)
webhook_payload = json.loads(request_body)
if not validate_invoice(webhook_payload):
    return HTTPResponse(status=400, body="Invalid invoice payload")

# Validate the OpenAI output too, for belt-and-suspenders safety
llm_output = json.loads(openai_response.choices[0].message.content)
assert validate_invoice(llm_output), "LLM output failed schema validation — check schema definition"
print(f"Processing invoice {llm_output['invoice_number']} for ${llm_output['total_amount']}")

Build your schema faster: Use the JSON Schema Generator to create and refine your schema visually — paste in a sample JSON object and it generates a starting schema automatically. Copy the result directly into your OpenAI response_format.

JavaScript / Node.js Version

The OpenAI Node.js SDK mirrors the Python API almost exactly: strict sits inside the json_schema object in the same place, and you parse the response with JSON.parse() instead of json.loads(). For validation, Ajv is the Node.js counterpart of Python's jsonschema library — it's fast and has excellent TypeScript support.

js
import OpenAI from "openai";
import Ajv from "ajv";

const client = new OpenAI();
const ajv = new Ajv();

const invoiceSchema = {
  type: "object",
  properties: {
    vendor_name:    { type: "string" },
    invoice_number: { type: "string" },
    total_amount:   { type: "number" },
    line_items: {
      type: "array",
      items: {
        type: "object",
        properties: {
          description: { type: "string" },
          amount:      { type: "number" }
        },
        required: ["description", "amount"],
        additionalProperties: false
      }
    }
  },
  required: ["vendor_name", "invoice_number", "total_amount", "line_items"],
  additionalProperties: false
};

const validateInvoice = ajv.compile(invoiceSchema);

async function extractInvoice(rawText) {
  const response = await client.chat.completions.create({
    model: "gpt-4o-2024-08-06",
    messages: [{ role: "user", content: `Extract invoice fields: ${rawText}` }],
    response_format: {
      type: "json_schema",
      json_schema: {
        name: "invoice_extract",
        strict: true,
        schema: invoiceSchema
      }
    }
  });

  const choice = response.choices[0];

  if (choice.message.refusal) {
    throw new Error(`Model refused: ${choice.message.refusal}`);
  }

  const data = JSON.parse(choice.message.content);

  // Validate even though strict mode guarantees structure —
  // useful for catching schema drift between environments
  if (!validateInvoice(data)) {
    console.error("Schema validation errors:", validateInvoice.errors);
    throw new Error("Output failed schema validation");
  }

  return data;
}

const invoice = await extractInvoice(
  "Invoice #INV-881 from Nordic Parts AS. " +
  "3x Brake Pads at $28.50 each. Total: $85.50"
);

console.log(`${invoice.vendor_name} — Invoice ${invoice.invoice_number}`);
invoice.line_items.forEach(item =>
  console.log(`  ${item.description}: $${item.amount}`)
);

Wrapping Up

Structured Outputs eliminates an entire category of production bugs — the kind where the model returns almost-right JSON that breaks your parser at 2 AM. The workflow is straightforward: design your schema carefully (every nested object needs additionalProperties: false and a full required array), set strict: true, and handle refusals and truncation explicitly. Once the schema is in place, you can reuse it across your stack — in the OpenAI call, in webhook validation, in your test fixtures — with libraries like jsonschema (Python) or Ajv (Node.js). If you're starting from scratch on a schema, the JSON Schema Generator is the fastest way to get a working base schema from a sample payload. The days of prompt-engineering your way to reliable JSON output are over — use the tool that was built for it.