Structured Output Prompting: Get JSON, Tables, and Formatted Data From Any AI
“Give me the data in JSON format” works about 60% of the time. The other 40%, you get malformed JSON, extra commentary, missing fields, or the model deciding to explain why JSON is a great format instead of actually producing it.
Structured output prompting is the set of techniques that push that 60% to 95%+. If you’re building anything that programmatically consumes AI output, this is essential.
Why Models Struggle With Structure
LLMs generate text token by token. They don't have a built-in concept of "valid JSON"; they're predicting the next most likely token based on patterns in their training data. This means:
- They might forget to close a bracket
- They might add a trailing comma (invalid JSON)
- They might wrap the JSON in markdown code blocks you didn’t ask for
- They might add explanatory text before or after the JSON
Understanding this helps you write prompts that work with the model’s tendencies, not against them.
The Reliable JSON Prompt Template
This template works consistently across GPT-4, Claude, Gemini, and most capable models:
Extract the following information from the text below.
Respond with ONLY a JSON object. No explanation, no markdown, no additional text.
Required fields:
- name (string): The person's full name
- email (string): Email address, or null if not found
- company (string): Company name, or null if not found
- role (string): Job title, or null if not found
Example output:
{"name": "Jane Smith", "email": "[email protected]", "company": "Acme Inc", "role": "CTO"}
Text to extract from:
"""
{input_text}
"""
Why each element matters:
- "Respond with ONLY a JSON object": explicit instruction to suppress commentary
- Field definitions with types: the model knows exactly what to produce
- Null handling: tells the model what to do when data is missing (instead of guessing)
- Example output: shows the exact format expected
- Delimited input: triple quotes separate the instruction from the data
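In code, the template is just string assembly around the input. A minimal sketch (the function name is illustrative):

```python
def build_extraction_prompt(input_text: str) -> str:
    """Assemble the reliable JSON prompt template around the input text."""
    return (
        "Extract the following information from the text below.\n"
        "Respond with ONLY a JSON object. No explanation, no markdown, "
        "no additional text.\n\n"
        "Required fields:\n"
        "- name (string): The person's full name\n"
        "- email (string): Email address, or null if not found\n"
        "- company (string): Company name, or null if not found\n"
        "- role (string): Job title, or null if not found\n\n"
        "Example output:\n"
        '{"name": "Jane Smith", "email": "[email protected]", '
        '"company": "Acme Inc", "role": "CTO"}\n\n'
        'Text to extract from:\n"""\n' + input_text + '\n"""'
    )
```

Keeping the template in one place also means every extraction call sends exactly the same instructions, which makes output drift easier to diagnose.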
Beyond JSON: Other Structured Formats
CSV Output
Convert the following data into CSV format.
Output ONLY the CSV data with a header row. No explanation.
Use commas as delimiters. Wrap fields containing commas in double quotes.
Header: Name, Email, Department, Start Date
Data:
{input}
Markdown Tables
Organize this information into a markdown table.
Output ONLY the table. No text before or after.
Columns: Feature | Free Plan | Pro Plan | Enterprise
Sort rows by feature name alphabetically.
Information:
{input}
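If downstream code consumes the table rather than a human, a small hand-rolled parser (no dependencies) can turn it back into records:

```python
def parse_markdown_table(table: str) -> list[dict]:
    """Convert a markdown table into a list of row dicts keyed by header."""
    lines = [ln.strip() for ln in table.strip().splitlines() if ln.strip()]
    # Split each pipe-delimited row into cells, dropping the edge pipes
    rows = [[cell.strip() for cell in ln.strip("|").split("|")] for ln in lines]
    header, body = rows[0], rows[1:]
    # Drop the |---|---| separator row (cells made only of -, :, and spaces)
    body = [r for r in body if not all(set(c) <= set("-: ") for c in r)]
    return [dict(zip(header, r)) for r in body]
```

This assumes well-formed pipes and no escaped `|` characters inside cells, which holds for the constrained output the prompt above requests.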
YAML
Convert this configuration into valid YAML.
Output ONLY the YAML. No code fences, no explanation.
Use 2-space indentation. Include comments for non-obvious values.
Configuration:
{input}
Validation Strategies
Never trust AI output blindly. Always validate.
Python JSON Validation
import json
import re

def parse_ai_json(response: str) -> dict | None:
    """Parse a model response into a dict, repairing common JSON issues."""
    # Strip markdown code fences if present
    cleaned = response.strip()
    if cleaned.startswith("```"):
        cleaned = cleaned.split("\n", 1)[1]  # Drop the opening fence line
    if cleaned.endswith("```"):
        cleaned = cleaned[:-3]
    cleaned = cleaned.strip()
    try:
        return json.loads(cleaned)
    except json.JSONDecodeError:
        pass
    # Try to fix common issues:
    # trailing comma before a closing brace or bracket
    cleaned = re.sub(r",\s*([}\]])", r"\1", cleaned)
    # single quotes instead of double (crude; can break apostrophes in values)
    cleaned = cleaned.replace("'", '"')
    try:
        return json.loads(cleaned)
    except json.JSONDecodeError:
        return None
def validate_schema(data: dict, required_fields: dict) -> list[str]:
    """Validate that output matches the expected schema."""
    errors = []
    for field, expected_type in required_fields.items():
        if field not in data:
            errors.append(f"Missing field: {field}")
        elif not isinstance(data[field], expected_type) and data[field] is not None:
            errors.append(f"Wrong type for {field}: expected {expected_type.__name__}")
    return errors
Retry With Error Feedback
When validation fails, feed the error back to the model:
Your previous response was not valid JSON. The error was:
{error_message}
Your response was:
{previous_response}
Please fix the JSON and respond with ONLY the corrected JSON object.
This self-correction loop fixes most issues in one retry.
Advanced Techniques
Schema Enforcement With Pydantic
For production systems, define your expected output as a Pydantic model:
from pydantic import BaseModel, Field

class ExtractedContact(BaseModel):
    name: str = Field(description="Full name")
    email: str | None = Field(default=None, description="Email address")
    company: str | None = Field(default=None, description="Company name")
    role: str | None = Field(default=None, description="Job title")

# Include the schema in your prompt
schema_str = ExtractedContact.model_json_schema()
prompt = f"Extract contact info. Output must conform to this JSON schema:\n{schema_str}"

# Later, parse and validate the response in one step:
# contact = ExtractedContact.model_validate_json(response_text)
Function Calling / Tool Use
Most modern APIs (OpenAI, Anthropic, Google) support function calling, which forces the model to output structured data matching a predefined schema. This is more reliable than prompt-based approaches.
# OpenAI function calling example
response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": f"Extract contact info from: {text}"}],
    tools=[{
        "type": "function",
        "function": {
            "name": "save_contact",
            "parameters": {
                "type": "object",
                "properties": {
                    "name": {"type": "string"},
                    "email": {"type": "string"},
                    "company": {"type": "string"},
                },
                "required": ["name"],
            },
        },
    }],
)
Function calling gives you schema validation at the API level: the model is constrained to produce valid output.
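Even with function calling, the arguments arrive as a JSON string you still parse yourself. A sketch (in the OpenAI client, that string lives at `response.choices[0].message.tool_calls[0].function.arguments`; here it's simulated with a literal):

```python
import json

def extract_tool_arguments(arguments_json: str) -> dict:
    """Tool-call arguments are delivered as a JSON string; parse them to a dict."""
    return json.loads(arguments_json)

# Simulated arguments string, as a tool call would deliver it:
args = extract_tool_arguments('{"name": "Jane Smith", "company": "Acme Inc"}')
```

The API guarantees the string matches your schema's structure, so this parse rarely fails, but keeping it behind one function gives you a single place to add logging or fallbacks.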
Batch Processing
When extracting structured data from multiple items, process them one at a time rather than asking for an array of 50 objects. Models are more reliable with single-item extraction.
results = []
for item in items:
    result = extract_single(item)  # One API call per item
    if validate(result):
        results.append(result)
    else:
        results.append(retry_extract(item))
More API calls, but dramatically higher accuracy.
Common Failure Modes
1. The Chatty Model. “Sure! Here’s the JSON you requested:” followed by the actual JSON. Fix: “Respond with ONLY the JSON. Your entire response must be parseable as JSON.”
2. Markdown Wrapping. The model wraps output in ```json code fences. Fix: “Do not use code fences or markdown formatting.” Or just strip them in post-processing.
3. Hallucinated Fields. The model adds fields you didn’t ask for. Fix: “Include ONLY the fields listed above. Do not add any additional fields.”
4. Inconsistent Null Handling. Sometimes null, sometimes empty string, sometimes “N/A”. Fix: “Use null for missing values. Do not use empty strings, ‘N/A’, or ‘unknown’.”
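The null-handling failure mode is also cheap to defend against in post-processing. A sketch that normalizes common "missing" spellings to a real None (the marker set is ours; extend it as you see new variants):

```python
MISSING_MARKERS = {"", "n/a", "na", "none", "null", "unknown"}

def normalize_nulls(record: dict) -> dict:
    """Map empty strings and 'N/A'-style placeholder values to None."""
    out = {}
    for key, value in record.items():
        if isinstance(value, str) and value.strip().lower() in MISSING_MARKERS:
            out[key] = None
        else:
            out[key] = value
    return out
```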
Key Takeaways
- Always include an example of the exact output format you want.
- Use “Respond with ONLY [format]” to suppress commentary.
- Define every field with its type and null behavior.
- Validate and retry: never trust raw AI output in production.
- For production systems, use function calling / tool use over prompt-based formatting.
Structured output is where prompt engineering meets software engineering. Get it right, and AI becomes a reliable data processing tool. Get it wrong, and you're writing regex to parse free-form text, which is exactly what you were trying to avoid.