Agentic Levels

© 2026 Fuentes Studio·Privacy·Terms

L10 · Lesson 1

Hub-and-Spoke vs Peer-to-Peer

After this, you'll be able to distinguish hub-and-spoke from peer-to-peer agent architectures, identify which pattern you're actually running, and map the seams in your own project where the architectures differ.

Before you start

Before diving in, complete Hosted or Self-Hosted so your architecture choice is grounded in the hosting infrastructure you already have in place.

The idea

Your L9 setup has a supervisor agent that dispatches to workers and aggregates results. You are wondering whether you even need that supervisor. What if each agent just claimed tasks from a shared queue and handed off results directly? That is the Level 10 question, and the answer is more nuanced than it looks.

Level 9 is hub-and-spoke: a coordinator in the middle, workers at the edges. Level 10 explores removing that coordinator. Peer-to-peer means agents claim tasks from a shared store, pass findings to other agents directly, and coordinate without any single orchestrator. That is the idea. The reality as of 2026 is more cautious.

A well-cited early result: 16 parallel agents built a C compiler by claiming subtasks from a shared queue. That architecture is closer to distributed hub-and-spoke than true peer-to-peer. Each agent still consulted a shared task store, which is a passive coordinator. True peer-to-peer, where agents communicate directly with no shared arbiter at all, is still a research pattern. It is not production-ready in most contexts.
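
One way to make the three-way distinction concrete is to classify a system by who talks to whom. The sketch below is illustrative only: the `classify_topology` function, the edge-list representation, the "hybrid" and "unconnected" labels, and the node names are assumptions made for this example, not part of any framework.

```python
def classify_topology(edges, agents, supervisors=(), shared_stores=()):
    """Classify a multi-agent setup by who talks to whom, not by agent count.

    edges: (sender, receiver) pairs observed in the running system.
    agents: names of the worker agents.
    supervisors: active coordinators that dispatch and aggregate.
    shared_stores: passive coordinators such as a shared task queue.
    """
    agents, supervisors, shared_stores = set(agents), set(supervisors), set(shared_stores)
    edges = list(edges)

    via_supervisor = any(a in supervisors or b in supervisors for a, b in edges)
    via_store = any(a in shared_stores or b in shared_stores for a, b in edges)
    direct = any(a in agents and b in agents for a, b in edges)

    # More than one kind of path means the system mixes patterns.
    if sum([via_supervisor, via_store, direct]) > 1:
        return "hybrid"
    if via_supervisor:
        return "hub-and-spoke"
    if via_store:
        return "distributed hub-and-spoke"
    if direct:
        return "peer-to-peer"
    return "unconnected"


# Twenty workers behind one supervisor: agent count does not change the answer.
workers = [f"worker-{i}" for i in range(20)]
star = [("supervisor", w) for w in workers] + [(w, "supervisor") for w in workers]
print(classify_topology(star, workers, supervisors=["supervisor"]))  # hub-and-spoke

# Two agents that only touch a shared queue.
print(classify_topology([("a", "queue"), ("b", "queue")], ["a", "b"],
                        shared_stores=["queue"]))  # distributed hub-and-spoke

# Two agents that message each other with no intermediary.
print(classify_topology([("a", "b"), ("b", "a")], ["a", "b"]))  # peer-to-peer
```

The only inputs are communication paths, which is the point: concurrency and agent count never enter the decision.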

The failure modes differ, and the difference matters. Hub-and-spoke: if the coordinator fails, work stops but nothing conflicts. Peer-to-peer: if two agents claim overlapping work, or one agent's output triggers another in a loop, you get emergent failures that no single agent caused and no single point of control catches.

The practical question is not which is better. It is: which seams in your project could tolerate shared-queue coordination, and which still need a coordinator? Most L10 practitioners run hybrid architectures. Hub-and-spoke for high-stakes seams. Shared-queue for low-stakes parallelizable work where the tasks are fully independent.

Desktop path: sketch your architecture in a Claude.ai chat session. Ask Claude to draw the communication topology as an ASCII diagram and classify each seam.

CLI path: use Claude Code Sub-Agents with a shared task queue file as the coordination layer for your peer-style seams.

Watch out for

  • Calling your architecture peer-to-peer because agents run in parallel. Parallel workers reporting to a shared coordinator is still hub-and-spoke. The pattern is in the communication topology, not the concurrency.
  • Removing the coordinator before you have observability. Hub-and-spoke gives you one place to check status. Remove it without a replacement visibility layer and you lose the ability to debug mid-run.
  • Treating the shared task queue as a passive file. The queue is the coordinator. Its consistency guarantees are as important as any supervisor agent's logic.
  • Assuming peer-to-peer is the upgrade path from hub-and-spoke. Many production systems top out at well-designed hub-and-spoke, and that is the right call. Peer-to-peer is not a reward for reaching Level 10; it is a tradeoff.
  • Conflating agent count with architecture level. 20 workers behind one supervisor is still L9. Two agents sharing a queue with no supervisor is closer to L10.

Try it (20 min)

Paste this into Claude:

I want to map my current agent architecture as hub-and-spoke or peer-to-peer. Here is how my agents currently work: [describe your setup, e.g. 'I have a supervisor that dispatches to 4 workers and aggregates results', or 'I have agents that each pull from a task queue independently']. Help me: (1) Classify my current architecture as hub-and-spoke, distributed hub-and-spoke, or peer-to-peer and explain which signals point to each. (2) Draw a simple diagram in text showing the communication paths: which agents talk to which, and through what intermediary if any. (3) Identify one seam in my project where removing the hub coordinator would be safe, and one seam where it would not be safe and why. (4) Write a one-paragraph summary of my architecture I could hand to a new team member.

What good looks like:

  • You produced a text diagram showing actual communication paths, not an ideal-state diagram
  • Your classification is specific: hub-and-spoke, distributed hub-and-spoke, or peer-to-peer, with the signal that determined it
  • You identified at least one seam safe for peer-to-peer and one that is not, with a reason for each
  • The reason for the unsafe seam is specific: what would conflict, loop, or stall without a coordinator
  • You can explain the distinction to someone who has only worked at Level 9

What a good response looks like:

Architecture classification for a documentation + testing pipeline:

Current setup: supervisor dispatches to 4 workers (doc-writer, test-writer, type-checker, linter). Workers return results to supervisor. Supervisor aggregates and reports.

Classification: HUB-AND-SPOKE. Signal: workers do not communicate with each other directly. All results flow through the supervisor, which is an active coordinator rather than a passive shared queue. Even though workers run in parallel, the topology is still hub-and-spoke.

Text diagram:
  You
   |
  Supervisor
  /  |  |  \
doc test type lint
  \  |  |  /
  Supervisor (aggregates)
   |
  You (review)

Safe seam for shared-queue: the linter and type-checker. They are fully independent, read-only, and have no shared state. Either could claim tasks from a queue and return results without coordination.

Unsafe seam: doc-writer and test-writer. Both read src/models/ to understand types. If one modifies a model file mid-run, the other's output becomes stale. This seam needs the supervisor to serialize or gate the shared file access.

Summary for new team member: 'We run a hub-and-spoke pipeline. The supervisor coordinates all workers. Linting and type-checking are candidates to move to shared-queue coordination; doc and test generation must remain gated.'
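
The safe/unsafe call in the response above comes down to whether one worker writes something another worker reads or writes. A minimal sketch of that check, assuming each worker can declare its read and write paths up front. The `Worker` type, the `conflicts` and `seam_is_independent` helpers, and every path other than `src/models/` are hypothetical, and the assumption that test-writer may write to `src/models/` is made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Worker:
    name: str
    reads: frozenset = field(default_factory=frozenset)
    writes: frozenset = field(default_factory=frozenset)


def conflicts(a: Worker, b: Worker) -> set:
    """Paths that one worker writes while the other reads or writes them."""
    a_hits_b = a.writes & (b.reads | b.writes)
    b_hits_a = b.writes & (a.reads | a.writes)
    return set(a_hits_b | b_hits_a)


def seam_is_independent(a: Worker, b: Worker) -> bool:
    """True when neither worker can invalidate the other's inputs or outputs."""
    return not conflicts(a, b)


# Read-only workers: nothing to conflict on.
linter = Worker("linter", reads=frozenset({"src/"}))
type_checker = Worker("type-checker", reads=frozenset({"src/"}))
print(seam_is_independent(linter, type_checker))  # True

# Hypothetical: test-writer may modify model files that doc-writer reads.
doc_writer = Worker("doc-writer", reads=frozenset({"src/models/"}),
                    writes=frozenset({"docs/"}))
test_writer = Worker("test-writer", reads=frozenset({"src/models/"}),
                     writes=frozenset({"src/models/", "tests/"}))
print(conflicts(doc_writer, test_writer))  # {'src/models/'}
```

Paths are compared as exact strings here, so nested paths such as `src/` and `src/models/` would not be detected as overlapping; a real check would need prefix matching.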

Go deeper (20 min)

Paste this into Claude:

Sketch what your project would look like if you converted one hub-and-spoke seam to a shared-queue pattern. Pick the seam you identified as safe. Design: (1) What goes into the shared task queue (task format, fields, status field). (2) How an agent claims a task atomically without two agents grabbing the same one. (3) What a claiming agent writes back to the queue when it finishes. (4) What happens if an agent claims a task and crashes before finishing. Do not implement this yet. Just design the protocol on paper and identify the two biggest risks.

What good looks like:

  • Your task queue schema has at least: task ID, status (unclaimed/claimed/done/failed), claimer ID, claimed-at timestamp
  • Your claim protocol prevents double-claiming (describe the atomic check, even if informal)
  • You have a timeout or heartbeat mechanism for crashed agents
  • You named two specific risks in your design, not generic 'things could go wrong'

What a good response looks like:

Shared-queue design for the linter seam:

Task queue entry:
{
  "task_id": "lint-src-routes-users",
  "status": "unclaimed",
  "target_file": "src/routes/users.ts",
  "claimer_id": null,
  "claimed_at": null,
  "completed_at": null,
  "result": null
}

Claim protocol: the agent reads the queue, finds the first unclaimed task, and in one atomic step (file lock or compare-and-swap) sets status to 'claimed' and writes its agent ID and a timestamp to claimer_id and claimed_at. If two agents read simultaneously, the second writer fails to acquire the lock (or its compare-and-swap fails) and moves on to the next unclaimed task.
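
To make the compare-and-swap variant concrete, here is a minimal sketch. It swaps the JSON queue file for a SQLite table, which is an assumption of this example and not something the design above prescribes; the column names mirror the queue entry fields, and `open_queue` and `claim_next` are hypothetical helper names.

```python
import sqlite3
import time


def open_queue(path=":memory:"):
    # isolation_level=None puts the connection in autocommit mode,
    # so each statement is committed on its own.
    db = sqlite3.connect(path, isolation_level=None)
    db.execute("""
        CREATE TABLE IF NOT EXISTS tasks (
            task_id      TEXT PRIMARY KEY,
            status       TEXT NOT NULL DEFAULT 'unclaimed',
            target_file  TEXT,
            claimer_id   TEXT,
            claimed_at   REAL,
            completed_at REAL,
            result       TEXT
        )""")
    return db


def claim_next(db, agent_id, now=None):
    """Claim the first unclaimed task, or return None if there is none."""
    now = time.time() if now is None else now
    while True:
        row = db.execute(
            "SELECT task_id FROM tasks WHERE status = 'unclaimed' "
            "ORDER BY task_id LIMIT 1").fetchone()
        if row is None:
            return None
        # Compare-and-swap: the update only applies if the task is still unclaimed.
        cur = db.execute(
            "UPDATE tasks SET status = 'claimed', claimer_id = ?, claimed_at = ? "
            "WHERE task_id = ? AND status = 'unclaimed'",
            (agent_id, now, row[0]))
        if cur.rowcount == 1:
            return row[0]
        # Another agent won the race for this task; look for the next one.


db = open_queue()
db.execute("INSERT INTO tasks (task_id, target_file) "
           "VALUES ('lint-src-routes-users', 'src/routes/users.ts')")
print(claim_next(db, "agent-a"))  # lint-src-routes-users
print(claim_next(db, "agent-b"))  # None
```

The guard in the `WHERE` clause is what prevents double-claiming: an agent learns whether it won by checking how many rows its update changed, not by trusting its earlier read.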

On finish: agent writes result and sets status to 'done'. If lint passed: result = 'PASS'. If lint found issues: result = 'FAIL: [list of violations]'.

Crash handling: a watchdog checks for tasks where status = 'claimed' and claimed_at is more than 5 minutes ago. It resets those tasks to 'unclaimed' for retry.
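
The watchdog rule can be sketched as a single conditional update. As before, storing the queue in SQLite is an assumption of this example, `reclaim_stale` is a hypothetical helper name, and the table is trimmed to the columns the watchdog needs.

```python
import sqlite3

CLAIM_TIMEOUT_SECONDS = 5 * 60  # the 5-minute window from the design above


def reclaim_stale(db, now, timeout=CLAIM_TIMEOUT_SECONDS):
    """Reset claimed tasks whose claim is older than the timeout.

    Returns the number of tasks put back to 'unclaimed'.
    """
    cur = db.execute(
        "UPDATE tasks SET status = 'unclaimed', claimer_id = NULL, claimed_at = NULL "
        "WHERE status = 'claimed' AND claimed_at < ?",
        (now - timeout,))
    return cur.rowcount


db = sqlite3.connect(":memory:", isolation_level=None)
db.execute("CREATE TABLE tasks (task_id TEXT PRIMARY KEY, status TEXT, "
           "claimer_id TEXT, claimed_at REAL)")
db.executemany("INSERT INTO tasks VALUES (?, ?, ?, ?)", [
    ("stale-task", "claimed", "agent-a", 0.0),    # claimed 1000 s before 'now'
    ("fresh-task", "claimed", "agent-b", 900.0),  # claimed 100 s before 'now'
])
print(reclaim_stale(db, now=1000.0))  # 1
```

A fixed timeout cannot tell a crashed agent from a slow one. A heartbeat column that agents refresh while working would narrow that gap, at the cost of more writes to the queue.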

Two biggest risks:
1. The file lock is not atomic on the filesystem being used (NFS, some cloud volumes). If locking fails silently, two agents claim the same task and both write results, creating a race on the result field.
2. The watchdog reclaims a task that was actually in progress but slow. The original agent finishes and writes a result after the reclaiming agent has already written one. Last-write-wins silently discards the first result.
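
One possible mitigation for the second risk is to make the completion write conditional on the claim still being current. This is a sketch under assumptions, not part of the design above: the `attempt` column (incremented on every claim) and the `complete` helper are additions for illustration, and the queue is again assumed to live in SQLite.

```python
import sqlite3


def complete(db, task_id, agent_id, attempt, result, now):
    """Write a result only if this agent still holds the current claim."""
    cur = db.execute(
        "UPDATE tasks SET status = 'done', result = ?, completed_at = ? "
        "WHERE task_id = ? AND status = 'claimed' "
        "AND claimer_id = ? AND attempt = ?",
        (result, now, task_id, agent_id, attempt))
    return cur.rowcount == 1


db = sqlite3.connect(":memory:", isolation_level=None)
db.execute("CREATE TABLE tasks (task_id TEXT PRIMARY KEY, status TEXT, "
           "claimer_id TEXT, attempt INTEGER, completed_at REAL, result TEXT)")
# State after the watchdog reset agent-a's claim (attempt 1)
# and agent-b claimed the task again (attempt 2).
db.execute("INSERT INTO tasks VALUES "
           "('lint-src-routes-users', 'claimed', 'agent-b', 2, NULL, NULL)")

# The slow original agent finishes late; its write is rejected.
print(complete(db, "lint-src-routes-users", "agent-a", 1, "PASS", 500.0))  # False
# The current claimer's write goes through.
print(complete(db, "lint-src-routes-users", "agent-b", 2, "PASS", 501.0))  # True
```

A rejected write is visible to the agent that made it, so the stale result can be logged or dropped deliberately instead of silently overwriting the current one.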

When this breaks

  • Breaks when teams classify by agent count instead of communication topology: 20 parallel workers behind one supervisor still funnel every decision through the hub, so the assumed parallelism gains never appear.
  • Breaks when the shared task queue is treated as a passive file rather than the coordinator: the queue's consistency guarantees (atomic claims, ordering, durability) silently become the system's reliability ceiling.
  • Breaks when peer-to-peer is adopted without observability: the single status pane that hub-and-spoke provided is gone, and mid-run debugging becomes archaeology over scattered logs.

Claude can do it for you

Say to Claude: 'Review my current agent setup: [describe it]. Classify it as hub-and-spoke, distributed hub-and-spoke, or peer-to-peer. Draw the communication paths as a text diagram. Then identify the one seam most ready to move toward a shared-queue pattern and the one seam that should stay coordinator-gated, with a reason for each.'

You can now

Classify your current multi-agent architecture as hub-and-spoke, distributed hub-and-spoke, or peer-to-peer using the communication topology (not the agent count), and name one seam safe for shared-queue coordination plus one that must remain coordinator-gated.

Key takeaways

Most multi-agent systems people call peer-to-peer are distributed hub-and-spoke. Know which one you are running before you try to remove the hub.

  • Hub-and-spoke routes every communication through a coordinator; peer-to-peer routes agent-to-agent. Classify by topology, not concurrency
  • Most production systems labeled peer-to-peer are distributed hub-and-spoke with a passive shared queue acting as coordinator
  • Pick seams individually: high-stakes work stays coordinator-gated, low-stakes independent work can move to shared-queue
  • Remove the orchestrator only when you have replaced its observability and consistency guarantees, not before

Go deeper

  • Claude Code Sub-Agents (coordination patterns reference)
  • microsoft/autogen (research-grade multi-agent coordination)
  • 12-factor-agents: Factor 12, stateless composable agents