LLMs hallucinate structure. A production agent needs a validation layer between the raw model output and the code that acts on it. Here's a practical, type-safe pattern using Zod.
Here's a failure mode I've seen in nearly every AI agent codebase I've reviewed: the agent receives a model response, trusts the JSON it contains, and calls `.result.items[0].id` — which throws `Cannot read properties of undefined` at 2 AM because the model returned `{"result": null}` on an edge case.
The model didn't hallucinate the content. It hallucinated the *structure*.
This is surprisingly common, and the fix isn't "use a better prompt." The fix is a validation layer that runs between the raw model output and the code that acts on it.
---
Claude and GPT-4 both support structured output modes that constrain the model to emit valid JSON matching a given schema. This is genuinely useful and you should use it. But it doesn't fully solve the problem, for two reasons:
**1. JSON-valid is not semantically valid.**
The model can emit perfectly valid JSON that conforms to your schema and still be wrong. A string field that should be a UUID might contain a made-up identifier that fails a database lookup. An integer field labeled `confidence_score` might be 847 when your code expects a 0-1 float. The schema enforces types, not semantics.
**2. Not all LLM calls use structured output.**
If you're doing multi-step reasoning, chain-of-thought steps, tool call parsing, or processing outputs from models that don't support native JSON mode, you're parsing free-text responses. You need to handle that robustly.
---
Every agent call I build now goes through three stages:
raw model output
↓
[PARSE] – extract the structure from the text
↓
[VALIDATE] – assert the structure matches expectations
↓
[CLASSIFY] – categorize the outcome so the caller can handle itHere's the TypeScript implementation I actually use:
import { z } from "zod";
// 1. Define the schema for what you expect
const AnalysisResultSchema = z.object({
sentiment: z.enum(["positive", "negative", "neutral"]),
confidence: z.number().min(0).max(1),
key_points: z.array(z.string()).min(1).max(10),
action_required: z.boolean(),
follow_up: z.string().optional(),
});
type AnalysisResult = z.infer<typeof AnalysisResultSchema>;
// 2. The parse-validate-classify wrapper
type AgentOutput<T> =
| { ok: true; data: T }
| { ok: false; reason: "parse_failure" | "validation_failure" | "empty_response"; raw: string; error?: string };
function parseAgentOutput<T>(
raw: string,
schema: z.ZodSchema<T>
): AgentOutput<T> {
// Guard: empty or whitespace-only response
if (!raw.trim()) {
return { ok: false, reason: "empty_response", raw };
}
// Extract JSON from the response — models often wrap it in prose or code fences
const jsonMatch = raw.match(/```(?:json)?\s*([\s\S]*?)```/) ||
raw.match(/(\{[\s\S]*\}|\[[\s\S]*\])/);
const jsonString = jsonMatch ? jsonMatch[1] ?? jsonMatch[0] : raw.trim();
let parsed: unknown;