After this, you'll be able to measure the token cost of a skill or MCP workflow, identify the expensive steps, and restructure the sequence to front-load cheap validation before costly generation.
Before you start
Complete Trust No Tool Response first; this lesson builds on clean, labeled tool calls so your cost measurements reflect real work rather than retries caused by injection failures.
The idea
You build a deploy skill and run it 20 times in a day. You do not notice until the bill arrives that each run cost 35,000 tokens because you put the generation step before the validation step. Every validation failure triggered a full regeneration cycle. The sequence was wrong, and it cost you more than four times what it needed to.
A linter call or schema check costs roughly 500 to 2,000 tokens. A code generation or refactoring call costs 10,000 to 50,000 tokens. That is a gap of roughly 20x to 25x, whether you compare the low ends or the high ends of those ranges. One rule closes most of the waste: run all validation steps before any generation steps. If the validator fails, stop. Do not generate. Fix the input, then retry.
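The rule can be sketched as a small gate in front of the generator. This is a minimal illustration, not a real skill runtime: `run_gated`, `has_name`, and `fake_generate` are hypothetical names, and the toy validator stands in for whatever linter or schema check your workflow calls.

```python
from typing import Callable, Optional

# Hypothetical step signatures: a validator returns a list of error strings,
# a generator returns the generated text.
Validator = Callable[[str], list[str]]
Generator = Callable[[str], str]


def run_gated(source: str, validators: list[Validator], generate: Generator) -> Optional[str]:
    """Run every cheap validator first; call the expensive generator only if all pass."""
    errors: list[str] = []
    for validate in validators:
        errors.extend(validate(source))
    if errors:
        # Stop here: no generation tokens are spent on input known to be bad.
        print("validation failed:", errors)
        return None
    return generate(source)


# Toy stand-ins for a schema check and a generation call (assumptions for the demo).
def has_name(source: str) -> list[str]:
    return [] if "name:" in source else ["missing 'name' field"]


calls = {"generate": 0}


def fake_generate(source: str) -> str:
    calls["generate"] += 1  # count how often the expensive step actually runs
    return source + "\nreplicas: 2"
```

Calling `run_gated("image: nginx", [has_name], fake_generate)` returns `None` without ever touching the generator, which is the whole point: the failure costs only the validation call.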
Here is the before and after. The original sequence was (1) generate config, (2) validate, (3) fix errors, (4) push. Every validation failure looped back to step 1. Average cost: 35,000 tokens per run. The reordered sequence is (1) fetch current config, (2) validate existing state, (3) generate only if clean, (4) push. Average cost: 8,000 tokens per run, a 77% reduction from one sequencing change.
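The arithmetic behind those figures is easy to check. The sketch below uses only the numbers quoted in this lesson (35,000 and 8,000 tokens per run, 20 runs a day); the function and variable names are illustrative.

```python
def reduction(before: int, after: int) -> float:
    """Fraction of tokens saved by moving from `before` to `after`."""
    return (before - after) / before


BEFORE = 35_000      # average tokens per run, original order
AFTER = 8_000        # average tokens per run, reordered
RUNS_PER_DAY = 20    # from the scenario at the top of this lesson

saved_fraction = reduction(BEFORE, AFTER)
cost_ratio = BEFORE / AFTER
daily_saving = RUNS_PER_DAY * (BEFORE - AFTER)

print(f"reduction: {saved_fraction:.0%}")          # reduction: 77%
print(f"cost ratio: {cost_ratio:.1f}x")            # cost ratio: 4.4x
print(f"tokens saved per day: {daily_saving:,}")   # tokens saved per day: 540,000
```

Swap in your own before and after averages to get a cost-per-iteration figure for your skill.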
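For Claude Desktop users: after any skill run, ask 'how many tokens did each step use?' and Claude will estimate from the session transcript. Treat the answer as an estimate, not a meter reading. For Claude Code users: wire a PostToolUse hook to log each tool call as it completes. The hook payload may not carry token counts itself, so pair the log with the usage figures reported for the session, or use payload size as a rough proxy. After 5 runs you will see which step is consuming the budget.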
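A hook logger along these lines might look like the sketch below. It rests on several assumptions: that the hook receives a JSON payload on stdin with `tool_name`, `tool_input`, and `tool_response` fields (check the hooks reference for your version), that the log path is one you choose, and that four characters per token is an acceptable rough proxy. It does not read real token counts.

```python
import json
import time
from pathlib import Path

# Hypothetical log location; change to suit your project.
LOG_PATH = Path("tool_cost_log.jsonl")


def approx_tokens(value) -> int:
    """Crude size proxy: about 4 characters per token. Not a real token count."""
    return len(json.dumps(value, default=str)) // 4


def summarize(event: dict) -> dict:
    """Reduce one hook event to a small record worth logging."""
    return {
        "ts": time.time(),
        "tool": event.get("tool_name", "unknown"),
        "approx_input_tokens": approx_tokens(event.get("tool_input", {})),
        "approx_output_tokens": approx_tokens(event.get("tool_response", {})),
    }


def log_event(raw: str, path: Path = LOG_PATH) -> dict:
    """Parse the JSON payload and append one line to the log file."""
    record = summarize(json.loads(raw))
    with path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(record) + "\n")
    return record
```

To use it as a hook script, the entry point would call `log_event(sys.stdin.read())`. Summing `approx_output_tokens` by `tool` over a handful of runs is usually enough to show which step dominates.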
Try it (18 min)
Paste this into Claude:
I want to audit the token cost of a skill I have built (or a workflow I do regularly). Here are the steps in the workflow: [list your steps in order, with what tool or action each step calls]. For each step, estimate: (1) Is this a cheap validation step or an expensive generation step? (2) Roughly how many tokens does this step consume on average? (3) If a step fails, does the failure trigger another expensive generation cycle? After the audit, reorder the steps so that all cheap validation steps run before any expensive generation steps. Then show me the estimated cost difference between the original order and the optimized order.
What a good response looks like:
Token cost audit for a 4-step deploy skill:
Original order:
Step 1: generate_deployment_config - EXPENSIVE (~18,000 tokens, full generation)
Step 2: run_validator - CHEAP (~800 tokens, shell call)
Step 3: fix_validation_errors - EXPENSIVE (~12,000 tokens if errors found, loops back to step 1)
Step 4: push - CHEAP (~400 tokens)
Problems found:
- Steps 1 and 3 create a retry loop: every validation failure triggers a new generation cycle
- Average run with 1 validation error: ~31,200 tokens. With 2 errors: ~49,600 tokens.
Optimized order:
Step 1: fetch_current_config - CHEAP (~600 tokens, MCP read)
Step 2: run_validator on existing config - CHEAP (~800 tokens)
Step 3: generate_delta_only (skipped entirely if step 2 passes) - EXPENSIVE (~8,000 tokens, smaller scope)
Step 4: push - CHEAP (~400 tokens)
Estimated cost: ~9,800 tokens per run when step 3 runs, a reduction of roughly 69% from the original ~31,200. When step 2 passes and step 3 is skipped (most runs), the cost drops to ~1,800 tokens.
The key insight: validate what already exists before you generate anything new.
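The totals in a sample audit like this one can be checked with a few lines of arithmetic before you trust the headline percentage. The sketch below uses the per-step estimates quoted in the sample; the `total` helper is illustrative.

```python
# Per-step token estimates taken from the sample audit.
original = {
    "generate_deployment_config": 18_000,
    "run_validator": 800,
    "fix_validation_errors": 12_000,
    "push": 400,
}
optimized = {
    "fetch_current_config": 600,
    "run_validator": 800,
    "generate_delta_only": 8_000,
    "push": 400,
}


def total(steps: dict[str, int], skip: frozenset[str] = frozenset()) -> int:
    """Sum step costs, leaving out any step named in `skip`."""
    return sum(cost for name, cost in steps.items() if name not in skip)


original_cost = total(original)                                        # 31,200
with_generation = total(optimized)                                     # 9,800
without_generation = total(optimized, frozenset({"generate_delta_only"}))  # 1,800

# Fraction saved when the delta generation still has to run.
saving = 1 - with_generation / original_cost                           # about 0.686

print(original_cost, with_generation, without_generation, round(saving, 3))
```

Note that the 9,800-token figure includes the generation step. A run where validation passes and generation is skipped costs about 1,800 tokens, so the saving on those runs is larger still.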
Claude can do it for you
Paste your skill steps to Claude and say: 'For each step, classify it as cheap validation or expensive generation, estimate token cost, and tell me if any failure triggers a retry cycle. Then give me an optimized ordering and estimate how much cheaper it would be per run.' Claude Desktop users: just ask this after running the skill once. Claude Code users: wire a hook to log tokens per tool call automatically.
You can now
Reorder a real skill so validation steps run before generation steps and produce a cost-per-iteration estimate that shows at least a 30% reduction from the original sequence.
Key takeaways
Validate cheap before you generate expensive. Know your cost-per-iteration from the first run, not after you have scaled a wasteful workflow.