Multi-agent systems that coordinate on their own
After this, you'll be able to distinguish peer-to-peer coordination from hub-and-spoke orchestration, name the emergent failure modes that only appear at fleet scale, and decide when an autonomous team is the wrong choice.
Before you start
Complete Running Background Agents first; this lesson builds on the branch-per-agent isolation and cost-management patterns that peer-to-peer coordination extends.
The idea
Level 10 removes the orchestrator bottleneck. At Level 9, one coordinator dispatches to workers and collects results. Level 10 is peer-to-peer: agents claim tasks from a shared store, pass findings directly to each other, and coordinate without a human relay. This is not an incremental improvement on Level 9. It is a different architecture with different failure modes.
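To make "agents claim tasks from a shared store" concrete, here is a minimal sketch. It assumes an in-memory store guarded by a lock; `TaskStore`, `claim`, and `run_agent` are hypothetical names for illustration, not part of any particular framework. A real deployment would need a store with an equivalent atomic claim operation.

```python
import threading
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class TaskStore:
    """Hypothetical in-memory stand-in for a shared task store."""
    pending: list[str]
    claimed: dict[str, str] = field(default_factory=dict)  # task -> agent id
    _lock: threading.Lock = field(default_factory=threading.Lock, repr=False)

    def claim(self, agent_id: str) -> Optional[str]:
        """Atomically hand one pending task to the caller, or None if empty."""
        with self._lock:
            if not self.pending:
                return None
            task = self.pending.pop(0)
            self.claimed[task] = agent_id
            return task


def run_agent(agent_id: str, store: TaskStore) -> None:
    # Each agent pulls work itself; no coordinator dispatches to it.
    while store.claim(agent_id) is not None:
        pass  # a real agent would do the work here


store = TaskStore(pending=[f"task-{i}" for i in range(20)])
threads = [
    threading.Thread(target=run_agent, args=(f"agent-{n}", store))
    for n in range(4)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

The lock is what keeps two agents from claiming the same task. Note that the store is still a shared dependency: it removes the dispatching coordinator, not every central component.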
Be honest about what is and is not solved. True peer-to-peer coordination, where agents communicate directly without any shared coordinator, is still a research pattern in early 2026. Most real multi-agent setups that claim peer-to-peer are closer to distributed hub-and-spoke. The seams are visible. Nobody has fully solved reliability, recovery, and trust boundaries at production scale.
Here is a concrete case: a team shipped a 'fully autonomous' 8-agent pipeline for content generation. On inspection, 6 of 8 agents still routed through a central Redis queue with a single coordinator process reading results. That coordinator was a hub. Calling it peer-to-peer was marketing. The actual peer exchange (2 agents passing structured data directly via shared memory) worked reliably. The other 6 connections did not. Honest architecture naming would have caught the design gap in week 1 instead of week 6.
The hard problem at Level 10 is emergent failures: failure modes that no individual agent would produce in isolation. Feedback loops. Conflicting writes. Resource starvation. A single agent making a bad decision is recoverable. A fleet of agents amplifying that decision at machine speed is not. Reproducibility is the first safety requirement. If you cannot replay a multi-agent run and get the same result, you cannot debug it.
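One way to approach replayability, sketched under simplifying assumptions: route every source of nondeterminism (scheduling order, agent decisions) through a single seeded generator and append each step to an event log. The names `run_fleet` and `replay` are hypothetical, and the random integer stands in for an agent's decision. Real model calls are not deterministic by default, so a real system would likely need to log their outputs rather than regenerate them.

```python
import random

Event = tuple[int, str, int]  # (step, agent id, state change)


def run_fleet(seed: int, steps: int, log: list[Event]) -> int:
    """Simulate a fleet run whose only randomness comes from one seed."""
    rng = random.Random(seed)
    agents = ["a", "b", "c"]
    state = 0
    for step in range(steps):
        agent = rng.choice(agents)   # which agent acts next
        delta = rng.randint(-3, 3)   # stand-in for that agent's decision
        state += delta
        log.append((step, agent, delta))
    return state


def replay(log: list[Event]) -> int:
    """Rebuild the final state from the recorded events alone."""
    state = 0
    for _step, _agent, delta in log:
        state += delta
    return state


first_log: list[Event] = []
second_log: list[Event] = []
first_result = run_fleet(seed=42, steps=50, log=first_log)
second_result = run_fleet(seed=42, steps=50, log=second_log)
```

If two runs with the same seed ever produce different logs, some nondeterminism is leaking in from outside the seeded source, and that leak is the first thing to find.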
Level 10 also means knowing when not to deploy an agent team. When a task requires nuanced judgment, irreversible actions, or domain expertise the model does not have, human review checkpoints are not overhead. They are the correct design. The goal is not maximum autonomy. It is maximum autonomy within the boundaries where correctness is machine-checkable.
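The decision rule above can be written down as an explicit gate. This is a sketch, not a complete policy: the `Task` fields and the `route` function are hypothetical, and deciding whether a task really is machine-checkable or reversible is the hard part that the code takes as given.

```python
from dataclasses import dataclass
from enum import Enum


class Route(Enum):
    AUTONOMOUS = "autonomous"
    HUMAN_REVIEW = "human_review"


@dataclass(frozen=True)
class Task:
    name: str
    machine_checkable: bool       # can correctness be verified automatically?
    reversible: bool              # can the action be undone?
    needs_judgment: bool = False  # does it require nuanced judgment?
    in_model_domain: bool = True  # does the model have the domain expertise?


def route(task: Task) -> Route:
    """Send a task to the agent team only when every condition is met."""
    if task.needs_judgment or not task.reversible or not task.in_model_domain:
        return Route.HUMAN_REVIEW
    if not task.machine_checkable:
        return Route.HUMAN_REVIEW
    return Route.AUTONOMOUS


lint_fix = Task("apply lint fixes", machine_checkable=True, reversible=True)
db_drop = Task("drop production table", machine_checkable=True, reversible=False)
```

Here `route(lint_fix)` returns `Route.AUTONOMOUS` and `route(db_drop)` returns `Route.HUMAN_REVIEW`. The gate defaults to human review whenever any condition fails.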
Try it (5 min)
Watch out for
Paste this into Claude:
I am running a Level 9 multi-agent setup and considering moving toward Level 10 peer-to-peer coordination. Here is my current architecture: [describe how your supervisor dispatches to workers, what each worker does, and how results flow back]. Help me with three things. (1) Identify the single biggest bottleneck in my orchestrator: is it a throughput limit, a decision only the orchestrator can make, or a state synchronization problem? (2) Tell me whether removing the orchestrator would actually solve that bottleneck, or whether it would just relocate the problem. (3) If decentralization is justified, name one seam in my architecture that is genuinely safe to convert to peer-to-peer and one seam that must remain coordinator-gated, with a specific reason for each.
What good looks like:
When this breaks
You can now
Identify in your current Level 9 architecture which single seam, if any, is genuinely safe to convert to peer-to-peer and explain why the other seams must stay coordinator-gated.
Key takeaways
Level 10 is peer-to-peer coordination without a hub. The hard problem is emergent failure: failures the fleet creates that no single agent would produce alone. The discipline is knowing when not to deploy autonomy.