After this, you'll be able to enforce structured output from any prompt that feeds a downstream system, using the right tool for the job: JSON mode, XML tags, or schema validation.
Before you start
Complete Build Your First RAG Pipeline first; this lesson builds on retrieval so the chunks your pipeline returns have a contract the next step can reliably parse.
The idea
Your AI just produced a great recommendation as flowing prose. Now your database write is choking on it. The downstream system expected a JSON object with a `category` field and a `priority` integer. It got three paragraphs. The output was correct. The shape was wrong.
The moment your AI output feeds any downstream process, free-form text becomes a liability. A database write, a UI render, a second model call, a webhook: all of them need predictable shapes. Without a contract, every consumer writes its own fragile parser. With one, the output is the API.
There are three tools and they are not interchangeable. JSON mode (available in OpenAI and most providers) enforces valid JSON syntax but does not enforce field names or types. Use it for simple key-value outputs when you control the consumer. XML tags are the right choice for multi-part responses where field order matters or where the response mixes structured data with free text. Use tags like `<summary>`, `<action>`, `<confidence>` to carve the response into labeled sections your code can parse with a simple regex or XML parser. Schema validation (OpenAI structured outputs, Anthropic tool use with JSON schemas) is the strict option: it enforces field names, types, and required fields at the API level, rejecting responses that do not conform.
Here is the before and after: a team built an agent that triaged customer support tickets. The agent output a JSON object with three fields: `category` (one of five enum values), `priority` (1-5 integer), and `summary` (string under 200 characters). That object wrote directly to their database. Before schema enforcement, the agent occasionally returned `'priority': 'high'` instead of `'priority': 3`, and the database write crashed silently. After adding a strict schema via Anthropic tool use, invalid outputs were caught at the API level before they ever reached the database. The schema validation step caught 3 type errors per 100 calls that would have crashed the write silently.
Try it (20 min)
Watch out for
Paste this into Claude:
I have a prompt that currently produces free-form text but feeds a downstream process. Here is the prompt: [paste your prompt]. Here is what the downstream process needs: [describe the fields, types, and constraints: e.g., a JSON object with a 'category' field (one of: bug, feature, question), a 'priority' field (integer 1-5), and a 'summary' field (string under 150 chars)]. Help me: (1) choose the right structured output method for this use case (JSON mode, XML tags, or schema validation), (2) rewrite the prompt to enforce that structure, and (3) show me what a valid output looks like and what an invalid output looks like so I can test it.
What good looks like:
What a good response looks like:
For your support ticket triage prompt, here is my analysis.
Recommended method: schema validation via Anthropic tool use. Your downstream process is a database write with strict column types. JSON mode would catch `{'priority': 'high'}` only at parse time (after the write fails). Schema validation catches it at the API level before your code ever sees the response.
Here is the schema definition:
```json
{
"name": "triage_ticket",
"input_schema": {
"type": "object",
"properties": {
"category": { "type": "string", "enum": ["bug", "feature", "billing", "access", "other"] },
"priority": { "type": "integer", "minimum": 1, "maximum": 5 },
"summary": { "type": "string", "maxLength": 150 }
},
"required": ["category", "priority", "summary"]
}
}
```
Valid output example:
```json
{ "category": "bug", "priority": 3, "summary": "User cannot log in after password reset. Token expires immediately on first use." }
```
Invalid output the schema would reject (and that JSON mode would pass):
```json
{ "category": "bug", "priority": "high", "summary": "Login issue" }
```
The `priority: 'high'` would fail the `integer` type check. The API returns a validation error instead of a malformed response. Your database write never runs.Go deeper (15 min)
Paste this into Claude:
Pick one prompt from your current workflow that produces free-form text. It does not have to feed a database. It just needs to have predictable parts: maybe it always outputs a recommendation, a reason, and a confidence level, even if those are buried in prose right now. Add XML tags to that prompt to carve out the structure explicitly: for example, wrap the recommendation in `<recommendation>`, the reason in `<reason>`, and the confidence in `<confidence score='1-10'>`. Run it five times and confirm the tags are always present and parseable.
What good looks like:
What a good response looks like:
Your vendor evaluation prompt already has implicit structure. Every response you get contains a recommendation, a list of reasons, and a confidence signal buried in phrases like 'I am fairly confident' or 'this is a strong fit.' Here is the XML-tagged version. Original prompt ending: '...evaluate this vendor and give me your recommendation.' Revised prompt ending: '...evaluate this vendor and structure your response as follows: <recommendation>APPROVE or REJECT or DEFER</recommendation> <reasons> <reason>One specific reason per tag</reason> </reasons> <confidence score="1-10">Integer only</confidence>' Sample output after 5 runs (consistent): ```xml <recommendation>APPROVE</recommendation> <reasons> <reason>SOC 2 Type II certified, audit report available</reason> <reason>EU data residency supported, no cross-border transfer required</reason> <reason>SLA guarantees 99.9% uptime with financial penalty clause</reason> </reasons> <confidence score="8">8</confidence> ``` Two-line parser: ```python import re recommendation = re.search(r'<recommendation>(.*?)</recommendation>', output).group(1) confidence = int(re.search(r'<confidence score="(\d+)">', output).group(1)) ``` All 5 runs: tags present, confidence always an integer 1-10, recommendation always one of the three allowed values.
When this breaks
Claude can do it for you
Paste your prompt to Claude and say: 'This prompt feeds a downstream process. Design a structured output schema for it, choose the right enforcement method, and rewrite the prompt to use it. Show me a valid output and an invalid output so I can write a test.' It will design the contract for you.
You can now
Produce the same structured shape across 5 consecutive runs of a prompt that previously returned free text, and parse all 5 outputs cleanly without manual cleanup.
Key takeaways
Structured output is a contract between the model and the consumer. JSON mode for syntax, XML tags for multi-part prose, schema validation for strict types. Choose deliberately.