Agentic Levels

© 2026 Fuentes Studio

L8 · Lesson 3

Add Structured Logging

After this, you'll be able to instrument an agent run with structured JSON logs and a trace ID, then read a trace to find the real failure point in 90 seconds instead of 30 minutes of manual debugging.

Before you start

Before diving in, complete Spec-as-Test so each log line connects to a known expected output and failures are immediately traceable to a specific spec criterion.

The idea

Your pipeline fails at step 14 with a 'null reference' error. You spend 30 minutes debugging step 14. The actual cause: at step 7, an MCP call returned an empty object because the file path had a trailing slash. Every step from 7 to 14 silently passed malformed data forward until step 14 finally broke. Without a trace linking all 14 steps, that root cause is nearly impossible to find.

Structured logging is the fix. Each agent action writes one JSON line: timestamp, trace ID, step number, tool called, one-line input summary, one-line output summary, error if any. A trace ID is a single UUID generated at run start and attached to every log line from that run. When a run fails, you filter by trace ID and read the lines in order. The real failure point is where the output summary first shows an unexpected value, not where the exception fired.
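The log line described above can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed implementation: the `TraceLogger` class, its `log` method, and the field names (`input`, `output`, `ts`) are assumptions chosen to mirror the fields listed in this lesson.

```python
import io
import json
import uuid
from datetime import datetime, timezone


class TraceLogger:
    """Hypothetical helper: writes one JSON line per step, all sharing one trace ID."""

    def __init__(self, stream):
        self.stream = stream
        # One UUID per run, generated once at run start.
        self.trace_id = str(uuid.uuid4())
        self.step = 0

    def log(self, tool, input_summary, output_summary=None, error=None):
        # Step numbers increase by one per logged action.
        self.step += 1
        record = {
            "trace_id": self.trace_id,
            "step": self.step,
            "tool": tool,
            "input": input_summary,
            "output": output_summary,
            "error": error,
            "ts": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        }
        self.stream.write(json.dumps(record) + "\n")
        return record


# Usage: an in-memory stream stands in for a log file.
buffer = io.StringIO()
logger = TraceLogger(buffer)
logger.log("fetch_config", "path=/deploy/config.yaml", "config loaded: 12 keys")
logger.log("fetch_env_vars", "env=staging", "{}")
lines = buffer.getvalue().splitlines()
```

Because the trace ID is created in the constructor rather than in `log`, every line from one logger instance carries the same ID, which is what makes per-run filtering work.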

Here is a concrete case: a five-step deploy pipeline writes one JSON line per step, all sharing trace_id a3f9c2d1. Step 5 throws on a missing DATABASE_URL. Reading the trace from step 1 shows that step 3 (fetch_env_vars) returned an empty object. The real failure is the missing .env.staging file, two steps upstream of where the exception fired.

The instrument-first habit: before you scale any task, add logging. One structured JSON line per step. With it, you find the step-7 trailing-slash bug in 90 seconds. Without it, you spend 30 minutes on step 14.
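One way to build the habit is to put the logging in the runner instead of in each step, so no step can skip it. The sketch below assumes a pipeline expressed as a list of `(name, function)` pairs where each function receives the previous step's output. `run_pipeline`, `emit`, and the three sample steps are hypothetical names for illustration only.

```python
import json
import uuid
from datetime import datetime, timezone


def run_pipeline(steps, emit=print):
    """Run (name, func) steps in order, emitting one JSON line per step."""
    trace_id = str(uuid.uuid4())  # one ID for the whole run
    data = None
    for number, (name, func) in enumerate(steps, start=1):
        record = {
            "trace_id": trace_id,
            "step": number,
            "tool": name,
            "input": repr(data)[:80],  # summary, not the full payload
            "output": None,
            "error": None,
            "ts": datetime.now(timezone.utc).isoformat(),
        }
        try:
            data = func(data)
            record["output"] = repr(data)[:80]
        except Exception as exc:
            # Log the failing step before re-raising so the trace is complete.
            record["error"] = f"{type(exc).__name__}: {exc}"
            emit(json.dumps(record))
            raise
        emit(json.dumps(record))
    return data


# Hypothetical three-step run: step 1 silently returns an empty object,
# and the exception only fires two steps later.
steps = [
    ("fetch_env_vars", lambda _: {}),
    ("merge_config", lambda env: {"env": env}),
    ("deploy", lambda cfg: cfg["env"]["DATABASE_URL"]),
]
lines = []
try:
    run_pipeline(steps, emit=lines.append)
except KeyError:
    pass  # the trace in `lines` now shows where the empty value first appeared
```

The exception is re-raised after logging, so the pipeline still fails loudly. The trace simply records the failing step alongside every step before it.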

Try it (22 min)

Watch out for

  • Logging the error message only, not the step context. The error message tells you what broke. The step context tells you why.
  • Using a different trace ID per step instead of per run. One run, one trace ID. That is what makes filtering possible.
  • Structured logs that are too verbose. Log input and output summaries, not full payloads. Full payloads make traces unreadable.
  • Debugging from the error line first. Always read the trace in sequence from step 1. The error line is the last symptom, not the root cause.
  • Adding logging after a failure. Add it before you need it. The first run you scale is the one that will fail silently.
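To keep traces readable, the verbosity point above can be enforced with a small summarizer that runs before anything is logged. This is a sketch under assumptions: the `summarize` function, its 80-character default, and the summary wording are illustrative choices, not part of any library.

```python
def summarize(payload, max_len=80):
    """Return a one-line summary of a payload instead of the payload itself."""
    if payload is None:
        return "null"
    if isinstance(payload, dict):
        if not payload:
            return "{}"  # keep empty objects visible: they are often the root cause
        return f"dict: {len(payload)} keys"
    if isinstance(payload, (list, tuple)):
        return f"list: {len(payload)} items"
    # Collapse newlines and runs of whitespace so the summary stays on one line.
    text = " ".join(str(payload).split())
    if len(text) > max_len:
        return text[: max_len - 3] + "..."
    return text
```

Note that empty values are deliberately not hidden behind a generic summary. An output of `{}` is exactly the kind of signal you want to see when reading a trace in order.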

Paste this into Claude:

I want to add structured logging to an agent workflow I have. Here is the workflow or skill: [paste your skill definition or describe a multi-step task you run regularly]. Help me add structured logging to it. For each step, log: (1) a trace ID (generate one UUID per run at the start), (2) the step number, (3) the tool called, (4) a one-line input summary, (5) a one-line output summary or error. Format each log line as JSON. Then show me what a trace would look like for a run that fails at step 3 because the tool returned an empty result.

What good looks like:

  • Every step in your workflow now writes a structured JSON log line
  • A single trace ID links all log lines from one run
  • You can filter all log lines for one run using just the trace ID
  • You simulated a failure mid-run and can identify the failure step from the trace alone, without reading the full output
  • You can read the trace in sequence and name the step where malformed data first appeared

What a good response looks like:

Structured trace for a 5-step deploy workflow (run trace_id: a3f9c2d1):

```json
{"trace_id":"a3f9c2d1","step":1,"tool":"fetch_config","input":"path=/deploy/config.yaml","output":"config loaded: 12 keys","error":null,"ts":"2026-04-26T14:00:01Z"}
{"trace_id":"a3f9c2d1","step":2,"tool":"validate_schema","input":"config: 12 keys","output":"validation passed","error":null,"ts":"2026-04-26T14:00:02Z"}
{"trace_id":"a3f9c2d1","step":3,"tool":"fetch_env_vars","input":"env=staging","output":"{}","error":null,"ts":"2026-04-26T14:00:03Z"}
{"trace_id":"a3f9c2d1","step":4,"tool":"merge_config","input":"config+env","output":"merged: 12 keys (env vars: 0)","error":null,"ts":"2026-04-26T14:00:04Z"}
{"trace_id":"a3f9c2d1","step":5,"tool":"deploy","input":"merged config","output":null,"error":"TypeError: Cannot read properties of undefined (reading 'DATABASE_URL')","ts":"2026-04-26T14:00:09Z"}
```

Reading the trace: step 3 returned {} (empty env vars). Step 4 silently merged with zero env keys. Step 5 tried to read DATABASE_URL and failed. Root cause: step 3, not step 5. Fix: check why fetch_env_vars returned empty for env=staging (missing .env.staging file).
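The same reading can be automated as a first pass. The sketch below filters a log by trace ID, sorts by step, and reports the first step whose output summary looks empty, separately from the step where the error fired. The trace is modeled on the one in this lesson, plus one line from a hypothetical second run (`b7e1d004`) to show the filter working. The set of "suspicious" outputs is an assumption, and a real workflow would need its own definition of an unexpected value.

```python
import json

TRACE = """\
{"trace_id":"a3f9c2d1","step":1,"tool":"fetch_config","input":"path=/deploy/config.yaml","output":"config loaded: 12 keys","error":null,"ts":"2026-04-26T14:00:01Z"}
{"trace_id":"b7e1d004","step":1,"tool":"fetch_config","input":"path=/deploy/config.yaml","output":"config loaded: 12 keys","error":null,"ts":"2026-04-26T14:00:01Z"}
{"trace_id":"a3f9c2d1","step":2,"tool":"validate_schema","input":"config: 12 keys","output":"validation passed","error":null,"ts":"2026-04-26T14:00:02Z"}
{"trace_id":"a3f9c2d1","step":3,"tool":"fetch_env_vars","input":"env=staging","output":"{}","error":null,"ts":"2026-04-26T14:00:03Z"}
{"trace_id":"a3f9c2d1","step":4,"tool":"merge_config","input":"config+env","output":"merged: 12 keys (env vars: 0)","error":null,"ts":"2026-04-26T14:00:04Z"}
{"trace_id":"a3f9c2d1","step":5,"tool":"deploy","input":"merged config","error":"TypeError: Cannot read properties of undefined (reading 'DATABASE_URL')","ts":"2026-04-26T14:00:09Z"}
"""

# Assumption: output summaries that count as "unexpected" for this workflow.
EMPTY_OUTPUTS = {None, "", "{}", "[]", "null"}


def load_run(text, trace_id):
    """Parse JSON lines, keep one run, and order it by step number."""
    records = [json.loads(line) for line in text.splitlines() if line.strip()]
    run = [r for r in records if r["trace_id"] == trace_id]
    return sorted(run, key=lambda r: r["step"])


def first_suspect_step(run):
    """Step number of the first non-error line with an empty output, else None."""
    for record in run:
        # .get tolerates lines that omit the output key entirely.
        if record.get("error") is None and record.get("output") in EMPTY_OUTPUTS:
            return record["step"]
    return None


def error_step(run):
    """Step number of the first line that recorded an error, else None."""
    for record in run:
        if record.get("error"):
            return record["step"]
    return None


run = load_run(TRACE, "a3f9c2d1")
suspect = first_suspect_step(run)
failed = error_step(run)
```

For this trace, `suspect` points at fetch_env_vars and `failed` points at deploy. The gap between the two is the distance between the root cause and the symptom. A heuristic like this only narrows the search, so you still read the lines in order to confirm.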

When this breaks

  • Breaks when each step uses its own trace ID because filtering by ID no longer reconstructs a single run, and the timeline that makes debugging possible is gone.
  • Breaks when logs contain full payloads because the trace becomes unreadable and the root-cause search reverts to ad-hoc grepping.
  • Breaks when logging is added after the first incident because the run that motivated the work was already lost, and the pattern that produced it is harder to recognize without history.
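The first two failure modes above can be caught mechanically before they cost you a debugging session. The following is a rough sketch of a log linter. `lint_trace_log`, the 500-byte line budget, and the warning text are all assumptions to adjust for your own workflow. The third failure mode, adding logging too late, cannot be detected from the log itself.

```python
import json
from collections import Counter


def lint_trace_log(lines, max_line_bytes=500):
    """Return warnings about per-step trace IDs and oversized log lines."""
    warnings = []
    records = [json.loads(line) for line in lines if line.strip()]

    # If every trace ID appears exactly once, IDs are probably minted per step.
    lines_per_trace = Counter(r["trace_id"] for r in records)
    single_line_ids = [t for t, n in lines_per_trace.items() if n == 1]
    if len(records) > 1 and len(single_line_ids) == len(lines_per_trace):
        warnings.append("every trace_id appears once: IDs look per-step, not per-run")

    # Oversized lines usually mean a full payload was logged instead of a summary.
    for line_no, line in enumerate(lines, start=1):
        if len(line.encode("utf-8")) > max_line_bytes:
            warnings.append(
                f"line {line_no} exceeds {max_line_bytes} bytes: log a summary, not the payload"
            )
    return warnings


# Usage with hypothetical log lines.
healthy = [json.dumps({"trace_id": "a3f9c2d1", "step": i}) for i in (1, 2, 3)]
per_step_ids = [json.dumps({"trace_id": f"id-{i}", "step": i}) for i in (1, 2, 3)]
healthy_warnings = lint_trace_log(healthy)
per_step_warnings = lint_trace_log(per_step_ids)
```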

Claude can do it for you

Say to Claude: 'I want to add structured logging to this workflow: [paste steps]. For each step, write a JSON log line with trace ID, step number, tool, input summary, output summary, and error if any. Then show me what the trace would look like if step [N] returned an empty result, and identify which downstream steps would be silently affected.'

You can now

Read a real failed-run trace from step 1 and name the step where malformed data first appeared, distinct from the step where the exception fired.

Key takeaways

The error you see is downstream of where the problem happened. Read the trace from step 1, not from the error line. Log every step before you need to debug anything.

  • Every step writes a JSON line with trace ID, step number, tool, input summary, output summary, and error
  • One run, one trace ID. That is what makes per-run filtering possible.
  • The exception fires at the last symptom, not the root cause. Read traces in order, from step 1.
  • Add logging before you scale, not after the first silent failure. The first run you scale is the one that will need a trace.

Go deeper

  • Claude Code Hooks and Settings (PostToolUse hooks for structured logging)
  • Harness Engineering (OpenAI, audit trail and replay patterns)