How to Reliably Parse JSON from LLM Responses

You asked the model for a JSON object containing invoice data. The prompt was clear: "Return only valid JSON. No explanation." What came back was a markdown code fence, two sentences of commentary, a JSON object — and then a helpful note at the bottom explaining each field. In production, at 2 AM, with a customer's data pipeline stalled. If you're building anything on top of LLM APIs, you already know this pain. LLMs are not JSON serializers. They're text generators that usually produce valid JSON — until they don't. This article covers the five ways they break it and the battle-tested patterns to handle each one.

The 5 Ways LLMs Break JSON

These aren't edge cases. Every single one of these will happen to you in production, usually the moment you stop checking for them.

Markdown code fences — The model wraps the JSON in ```json\n...\n``` because its training data is full of docs and README files that present JSON that way.
Trailing commentary — The model appends a sentence or paragraph after the closing brace: "Note: the total field is in USD."
Truncation — Long outputs get cut mid-object when the response hits the token limit, leaving you with structurally broken JSON and no closing braces.
Hallucinated keys — The model invents field names not in your schema. You asked for invoice_number, you got invoiceNumber, invoice_no, and ref_id — sometimes in the same response.
Wrong types — Numbers arrive as strings ("49.99" instead of 49.99), booleans as "true", arrays as comma-separated strings. Type coercion bugs in disguise.

Pattern 1: Strip Markdown Code Fences

This is the most common breakage and the easiest to fix. A simple regex strips the fence regardless of whether the language tag is json, JSON, or missing entirely. Run this before any other processing — it costs nothing and prevents a large class of errors.

python

import re

def strip_code_fences(text: str) -> str:
    """Remove markdown code fences from LLM output."""
    # Handles ```json, ```JSON, ``` (no lang tag), etc.
    pattern = r'^```(?:json|JSON)?\s*\n?(.*?)\n?```$'
    match = re.search(pattern, text.strip(), re.DOTALL)
    if match:
        return match.group(1).strip()
    return text.strip()

# Example: model returned a fenced block
raw = """
```json
{
  "invoice_number": "INV-2024-0192",
  "vendor": "Acme Supplies",
  "total": 1249.99,
  "currency": "USD"
}
```
"""

clean = strip_code_fences(raw)
invoice = json.loads(clean)  # now safe

function stripCodeFences(text) {
  // Handles ```json, ```JSON, bare ``` (no lang), etc.
  const match = text.trim().match(/^```(?:json|JSON)?\s*\n?([\s\S]*?)\n?```$/s);
  return match ? match[1].trim() : text.trim();
}

// raw response contains a triple-backtick fence (shown here as a single-quoted string)
const raw = '```json\n{\n  "invoice_number": "INV-2024-0192",\n  "vendor": "Acme Supplies",\n  "total": 1249.99\n}\n```';

const clean = stripCodeFences(raw);
const invoice = JSON.parse(clean); // safe

Pattern 2: Extract JSON with Regex

When the model adds text before or after the JSON object — "Here is the extracted data:", "Let me know if you need changes." — stripping fences isn't enough. You need to find the outermost {...} block and pull it out. The trick is using a greedy match that handles nested objects correctly. Note that this approach handles objects ({}); if your schema is an array, swap the character class accordingly.

python

import re
import json

def extract_json_object(text: str) -> str | None:
    """
    Extract the first complete JSON object from a string that may
    contain surrounding prose or commentary.
    """
    # Find the first { and last } to grab the outermost object
    match = re.search(r'\{.*\}', text, re.DOTALL)
    if not match:
        # Fall back to array extraction if no object found
        match = re.search(r'\[.*\]', text, re.DOTALL)
    return match.group(0) if match else None

# Model returned prose + JSON + footnote
raw_response = """
Based on the document you provided, here is the structured data:

{
  "invoice_number": "INV-2024-0192",
  "vendor": "Acme Supplies",
  "line_items": [
    {"description": "Office chairs", "qty": 4, "unit_price": 299.99},
    {"description": "Standing desk", "qty": 1, "unit_price": 649.99}
  ],
  "total": 1849.95
}

Note: unit prices are pre-tax. Let me know if you need the tax breakdown.
"""

json_str = extract_json_object(raw_response)
if json_str:
    invoice = json.loads(json_str)
    print(f"Parsed invoice: {invoice['invoice_number']}")
else:
    raise ValueError("No JSON object found in LLM response")

Pattern 3: Use json-repair for Structural Errors

Truncation and minor structural errors — a missing closing brace, an unquoted key, a trailing comma — are where regex extraction falls short. The json-repair library was built exactly for this. It applies a series of heuristics to recover as much valid structure as possible from broken JSON, similar to how browsers tolerate malformed HTML. Install it with pip install json-repair, then drop it into your parsing pipeline as the last line of defense before you give up on a response.

python

import json
import json_repair  # pip install json-repair

def parse_with_repair(text: str) -> dict | list | None:
    """
    Attempt standard parse first; fall back to json_repair for
    structurally broken responses (truncation, missing braces, etc.).
    """
    # First pass: clean up fences and extract the JSON substring
    cleaned = extract_json_object(strip_code_fences(text))
    if not cleaned:
        return None

    # Second pass: try the fast standard parse
    try:
        return json.loads(cleaned)
    except json.JSONDecodeError:
        pass

    # Third pass: let json_repair reconstruct broken structure
    try:
        repaired = json_repair.repair_json(cleaned, return_objects=True)
        return repaired if repaired else None
    except Exception:
        return None

# Works even on truncated output from a token-limited response
truncated = """
{
  "invoice_number": "INV-2024-0192",
  "vendor": "Acme Supplies",
  "line_items": [
    {"description": "Office chairs", "qty": 4
"""

result = parse_with_repair(truncated)
# Returns {"invoice_number": "INV-2024-0192", "vendor": "Acme Supplies",
#          "line_items": [{"description": "Office chairs", "qty": 4}]}

Manual debugging tip: When you're investigating a specific broken response, paste it into the JSON Fixer to see exactly what json-repair does to it — or use the JSON Validator to identify the exact line and character position of the syntax error before deciding whether to repair or re-prompt.

Pattern 4: Retry with Explicit Prompting

Sometimes the best parser is the model itself. If the output is garbled beyond what json-repair can fix — hallucinated keys, completely wrong structure, a response that's more prose than data — send the broken output back to the model with the parse error and ask it to fix its own mistake. Models are surprisingly good at this. Keep the retry count low (2–3 max) and track attempts to avoid infinite loops.

python

import json
from openai import OpenAI

client = OpenAI()

def call_model(messages: list) -> str:
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=messages
    )
    return response.choices[0].message.content

def extract_invoice_data(document_text: str, max_retries: int = 3) -> dict:
    """Extract structured invoice data with automatic retry on parse failure."""
    system_prompt = """Extract invoice data and return ONLY a JSON object with these fields:
{
  "invoice_number": string,
  "vendor": string,
  "issue_date": string (YYYY-MM-DD),
  "due_date": string (YYYY-MM-DD) or null,
  "line_items": [{"description": string, "qty": number, "unit_price": number}],
  "subtotal": number,
  "tax": number,
  "total": number,
  "currency": string (ISO 4217)
}
Return ONLY the JSON object. No markdown. No explanation."""

    messages = [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": f"Extract invoice data from:\n\n{document_text}"}
    ]

    for attempt in range(max_retries):
        raw = call_model(messages)

        try:
            cleaned = extract_json_object(strip_code_fences(raw))
            return json.loads(cleaned)
        except (json.JSONDecodeError, TypeError) as e:
            if attempt == max_retries - 1:
                raise ValueError(
                    f"Failed to parse JSON after {max_retries} attempts. "
                    f"Last error: {e}. Last response: {raw[:200]}"
                )

            # Feed the error back — the model often corrects itself
            messages.append({"role": "assistant", "content": raw})
            messages.append({
                "role": "user",
                "content": (
                    f"That response caused a JSON parse error: {e}\n"
                    f"Please return ONLY a valid JSON object. No markdown fences, "
                    f"no commentary, just the raw JSON."
                )
            })

    raise ValueError("Unexpected exit from retry loop")

Pattern 5: Skip Parsing — Use Structured Outputs Instead

If you control the model call and can afford to use newer APIs, structured outputs eliminate most of this complexity entirely. OpenAI Structured Outputs (available on GPT-4o and later) and Gemini's response schema both constrain the model's output at the token-generation level — it's mathematically impossible for the model to return a malformed JSON object because invalid tokens are suppressed during decoding. The downside: you give up some model creativity and these APIs cost slightly more per call. For high-volume extraction pipelines, they're usually worth it.

python

from pydantic import BaseModel
from openai import OpenAI

client = OpenAI()

class LineItem(BaseModel):
    description: str
    qty: int
    unit_price: float

class Invoice(BaseModel):
    invoice_number: str
    vendor: str
    issue_date: str          # YYYY-MM-DD
    total: float
    currency: str            # ISO 4217
    line_items: list[LineItem]

def extract_invoice_structured(document_text: str) -> Invoice:
    """
    Extract invoice using OpenAI Structured Outputs.
    The API guarantees the response matches the Invoice schema —
    no manual parsing or repair needed.
    """
    completion = client.beta.chat.completions.parse(
        model="gpt-4o-2024-08-06",
        messages=[
            {
                "role": "system",
                "content": "Extract invoice data from the provided document."
            },
            {"role": "user", "content": document_text}
        ],
        response_format=Invoice
    )
    return completion.choices[0].message.parsed

invoice = extract_invoice_structured(document_text)
print(f"Invoice {invoice.invoice_number}: ${invoice.total:.2f} {invoice.currency}")

A Production-Ready Parser (Python)

Here's what a production extraction function looks like when you combine all four defensive patterns into a single utility. This is the version I actually run in services that process thousands of LLM responses per day. It strips fences, extracts the JSON substring, attempts a clean parse, falls back to json_repair, and optionally validates against a JSON Schema before returning. If you're not using structured outputs, this is your foundation.

python

import re
import json
from typing import Any
import json_repair        # pip install json-repair
import jsonschema         # pip install jsonschema

def strip_code_fences(text: str) -> str:
    match = re.search(r'^```(?:\w+)?\s*\n?(.*?)\n?```$', text.strip(), re.DOTALL)
    return match.group(1).strip() if match else text.strip()

def extract_json_substring(text: str) -> str | None:
    match = re.search(r'\{.*\}', text, re.DOTALL) or re.search(r'\[.*\]', text, re.DOTALL)
    return match.group(0) if match else None

def parse_llm_json(text: str, schema: dict | None = None) -> Any:
    """
    Robustly parse JSON from LLM output.

    Steps:
      1. Strip markdown code fences
      2. Extract outermost JSON object/array (handles surrounding prose)
      3. Fast-path: standard json.loads
      4. Slow-path: json_repair for structurally broken responses
      5. Optional: validate against a JSON Schema

    Args:
        text:   Raw text returned by the LLM
        schema: Optional JSON Schema dict to validate the parsed result

    Returns:
        Parsed Python object (dict or list)

    Raises:
        ValueError: If parsing fails after all recovery attempts
        jsonschema.ValidationError: If schema validation fails
    """
    if not text or not text.strip():
        raise ValueError("LLM returned an empty response")

    # Step 1 — strip fences
    text = strip_code_fences(text)

    # Step 2 — extract JSON substring (handles prose before/after)
    json_str = extract_json_substring(text)
    if not json_str:
        raise ValueError(f"No JSON object or array found in response: {text[:200]!r}")

    # Step 3 — standard parse (fast path, no overhead)
    parsed = None
    try:
        parsed = json.loads(json_str)
    except json.JSONDecodeError as original_error:
        # Step 4 — repair and retry
        try:
            repaired = json_repair.repair_json(json_str, return_objects=True)
            if repaired is not None:
                parsed = repaired
        except Exception as repair_error:
            raise ValueError(
                f"JSON parse failed and repair also failed.\n"
                f"Parse error: {original_error}\n"
                f"Repair error: {repair_error}\n"
                f"Input (first 500 chars): {json_str[:500]!r}"
            ) from original_error

    if parsed is None:
        raise ValueError(f"Parsing returned None for input: {json_str[:200]!r}")

    # Step 5 — optional schema validation
    if schema is not None:
        jsonschema.validate(parsed, schema)  # raises ValidationError on mismatch

    return parsed


# --- Usage ---

INVOICE_SCHEMA = {
    "type": "object",
    "required": ["invoice_number", "vendor", "total"],
    "properties": {
        "invoice_number": {"type": "string"},
        "vendor":         {"type": "string"},
        "total":          {"type": "number"},
        "currency":       {"type": "string"},
        "line_items":     {"type": "array"}
    }
}

llm_response = """
Sure! Here's the structured data:

```json
{
  "invoice_number": "INV-2024-0192",
  "vendor": "Acme Supplies",
  "total": 1849.95,
  "currency": "USD",
  "line_items": [
    {"description": "Office chairs", "qty": 4, "unit_price": 299.99}
  ]
}
```

Let me know if you need any changes!
"""

invoice = parse_llm_json(llm_response, schema=INVOICE_SCHEMA)
print(f"Vendor: {invoice['vendor']}, Total: ${invoice['total']}")

JavaScript Version

The same logic in JavaScript. For the repair step, the closest equivalent to json_repair is JSON5 for tolerant parsing of near-valid JSON, or you can write a lightweight repair wrapper yourself. For client-side work, JSON.parse() with a good try/catch and a regex fallback covers the vast majority of production cases.

// npm install json5   (optional — for tolerant parsing of near-valid JSON)
import JSON5 from 'json5';

function stripCodeFences(text) {
  const match = text.trim().match(/^```(?:\w+)?\s*\n?([\s\S]*?)\n?```$/);
  return match ? match[1].trim() : text.trim();
}

function extractJsonSubstring(text) {
  // Greedy match for outermost object or array
  const objectMatch = text.match(/\{[\s\S]*\}/);
  if (objectMatch) return objectMatch[0];
  const arrayMatch = text.match(/\[[\s\S]*\]/);
  return arrayMatch ? arrayMatch[0] : null;
}

/**
 * Robustly parse JSON from LLM output.
 * Steps: strip fences → extract substring → JSON.parse → JSON5 fallback
 *
 * @param {string} text - Raw LLM response text
 * @returns {object|Array} Parsed JavaScript value
 * @throws {Error} If all parse attempts fail
 */
function parseLlmJson(text) {
  if (!text || !text.trim()) {
    throw new Error('LLM returned an empty response');
  }

  // Step 1 — strip markdown fences
  let cleaned = stripCodeFences(text);

  // Step 2 — extract JSON substring (skip surrounding prose)
  const jsonStr = extractJsonSubstring(cleaned);
  if (!jsonStr) {
    throw new Error(`No JSON object or array found in response: ${text.slice(0, 200)}`);
  }

  // Step 3 — standard JSON.parse (fast path)
  try {
    return JSON.parse(jsonStr);
  } catch (stdError) {
    // Step 4 — JSON5 tolerant parser (handles trailing commas, unquoted keys, etc.)
    try {
      return JSON5.parse(jsonStr);
    } catch (json5Error) {
      throw new Error(
        `JSON parse failed.\nStandard error: ${stdError.message}\nJSON5 error: ${json5Error.message}\nInput: ${jsonStr.slice(0, 300)}`
      );
    }
  }
}

// --- Usage ---

const llmResponse = `
Here is the product data you requested:

\`\`\`json
{
  "product_id": "SKU-8821-B",
  "name": "Ergonomic Office Chair",
  "price": 299.99,
  "in_stock": true,
  "tags": ["furniture", "ergonomic", "office"]
}
\`\`\`

Let me know if you need the full catalog!
`;

const product = parseLlmJson(llmResponse);
console.log(`Product: ${product.name} — $${product.price}`);
// → Product: Ergonomic Office Chair — $299.99

Wrapping Up

LLMs break JSON in five predictable ways, and each one has a predictable fix. Markdown fences and surrounding prose are cosmetic — a couple of regexes handle them reliably. Structural damage from truncation or minor formatting errors is what json_repair was built for. When the structure is correct but the content is wrong — bad keys, wrong types — that's a prompting problem, and a retry loop with the error message fed back to the model is your best tool. And if you can use Structured Outputs, do it — it eliminates the problem at the source rather than treating the symptoms. For ad-hoc debugging when a specific response is misbehaving, the JSON Fixer and JSON Formatter will save you time. Build the parse_llm_json utility once, test it against your worst historical responses, and move on — there are better problems to spend your debugging hours on.

← All JSON articles Browse all categories →