Multi-Agent Orchestration Patterns: Queues, State, and Handoffs
Reliable multi-agent systems aren't about clever prompts — they're about boring distributed-systems discipline: durable queues between agents, state held outside the model, and idempotent handoffs that survive retries. The model is the worker; the queue is the backbone.
Every Wednesday. 28,400+ operators. Zero fluff.
Pattern 1: Put a durable queue between every agent
The first instinct is to call agent B directly from inside agent A. Don’t. Direct calls couple the two: if B is slow, A blocks; if B fails, A’s work is lost; if you need to scale B, you can’t without touching A.
Instead, A finishes its work and enqueues a message for B. B is a separate worker that drains the queue at its own pace.

// Agent A finishes and hands off via the queue; no direct call to B
await env.ENRICH_QUEUE.send({
  traceId,
  type: "enrich",
  payload: classifierResult,
});
// A's job is done. B will pick this up independently.

On Cloudflare I use Workers Queues for exactly this; they are the same primitives behind the agent stack I use. The queue gives you four things for free: buffering (B can be down without losing work), retries (failed messages redeliver), backpressure (a spike queues instead of crashing), and decoupling (scale or redeploy B without touching A). Every one of those is something you’d otherwise have to build by hand and get wrong.
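The consuming side can be sketched in a few lines. This is a minimal illustration, not the author's worker: the Message shape (a body plus ack() and retry()) is a local stand-in loosely modeled on per-message acknowledgement in queue consumers, and EnrichMessage and drainBatch are hypothetical names.

```typescript
// Local stand-in types (assumption): a consumer that can acknowledge or
// retry each message individually.
interface EnrichMessage {
  traceId: string;
  type: "enrich";
  payload: unknown;
}

interface Message<T> {
  body: T;
  ack(): void;   // mark as handled; should not be redelivered
  retry(): void; // ask the queue to redeliver this message later
}

// Drain one batch: ack messages whose handler succeeds, retry the rest.
// A single failing message does not fail the whole batch.
async function drainBatch<T>(
  messages: Message<T>[],
  handle: (body: T) => Promise<void>,
): Promise<void> {
  for (const msg of messages) {
    try {
      await handle(msg.body);
      msg.ack();
    } catch {
      msg.retry();
    }
  }
}
```

Acknowledging per message rather than per batch means one slow or failing job is redelivered on its own, which is the retry behavior described above.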
Pattern 2: Hold state outside the model, always
The most common multi-agent bug is assuming the model remembers anything between steps. It doesn’t. Each model call is stateless; the only memory is what you put in the prompt. So the source of truth for “where is this job in the pipeline” must live in a database, not in a conversation.
I keep a single job record that every agent reads and updates:

interface JobState {
  traceId: string;
  stage: "classified" | "enriched" | "acted" | "done" | "failed";
  data: Record<string, unknown>;
  attempts: number;
  updatedAt: number;
}

Each agent does the same loop: read the job state, do its work, write the new state, enqueue the next stage. The model never holds the state; it receives the relevant slice as input and returns a result. This is what makes the system restartable: if a worker dies mid-job, the state record still says exactly where things stood, and the redelivered queue message picks up from there. It also makes debugging tractable, because the state table is a queryable record of every job’s journey, the same instrumentation mindset from how I measure whether an agent is working.
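One way to sketch the write step of that loop is a compare-and-set on the stage, so a worker only advances a job that is still where it expects it to be. The in-memory Map stands in for the database, and JobStore, createJob, and this particular advanceJob signature are assumptions for illustration.

```typescript
type Stage = "classified" | "enriched" | "acted" | "done" | "failed";

interface JobState {
  traceId: string;
  stage: Stage;
  data: Record<string, unknown>;
  attempts: number;
  updatedAt: number;
}

// In-memory stand-in for the state table (assumption: a real system would
// use a database with a conditional update).
type JobStore = Map<string, JobState>;

function createJob(
  store: JobStore,
  traceId: string,
  data: Record<string, unknown>,
): JobState {
  const job: JobState = {
    traceId,
    stage: "classified",
    data,
    attempts: 0,
    updatedAt: Date.now(),
  };
  store.set(traceId, job);
  return job;
}

// Advance only if the job is still at `from`. Returns false when the job is
// missing or another delivery already moved it, so the caller can skip.
function advanceJob(
  store: JobStore,
  traceId: string,
  from: Stage,
  to: Stage,
  result: Record<string, unknown>,
): boolean {
  const job = store.get(traceId);
  if (!job || job.stage !== from) return false;
  store.set(traceId, {
    ...job,
    stage: to,
    data: { ...job.data, ...result },
    updatedAt: Date.now(),
  });
  return true;
}
```

In a SQL-backed table the same idea is usually a conditional update (an UPDATE whose WHERE clause includes the expected stage), with the affected-row count telling you whether the advance happened.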
Pattern 3: Make every handoff idempotent
Queues guarantee at-least-once delivery, not exactly-once. That means a message can be delivered twice — network blips, retries, redeploys. If your agent’s action isn’t idempotent, a double-delivery double-acts: two confirmation emails, two bookings, two charges. This is the single nastiest class of orchestration bug, and it’s the one teams discover in production.
The fix is to make each action idempotent, keyed on the job’s traceId and current stage:

async function handleEnrich(msg: QueueMessage, env: Env) {
  const job = await getJob(env, msg.traceId);
  if (job.stage !== "classified") {
    // Already processed past this stage, so this is a duplicate delivery. Skip.
    return;
  }
  const result = await enrich(job.data);
  await advanceJob(env, msg.traceId, "enriched", result);
  await env.ACT_QUEUE.send({ traceId: msg.traceId, type: "act" });
}

The stage check makes the operation safe to run twice: the second delivery sees the job has already advanced and no-ops. One gap to watch: if the worker dies after advanceJob but before the send, the redelivery also no-ops and the next stage is never enqueued, so either re-send when the stage is already "enriched" or sweep periodically for jobs stuck in a stage. For external side effects (sending an email, charging a card), pass an idempotency key to the downstream API so it deduplicates too. Assume every message will be delivered twice and design so that’s harmless, because eventually it will be.
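For the external side effects, the idempotency-key idea can be sketched like this: derive a deterministic key from the job and stage, and let the downstream remember which keys it has already handled. FakeEmailApi, idempotencyKey, and sendConfirmation are hypothetical names; a real provider does the deduplication server-side (Stripe, for example, accepts an Idempotency-Key request header).

```typescript
// Deterministic key: the same job and stage always produce the same key,
// so a redelivered message reuses it instead of minting a new one.
function idempotencyKey(traceId: string, stage: string): string {
  return `${traceId}:${stage}`;
}

// Hypothetical downstream that deduplicates on the key.
class FakeEmailApi {
  private seen = new Set<string>();
  public sent: string[] = [];

  send(to: string, key: string): "sent" | "duplicate" {
    if (this.seen.has(key)) return "duplicate";
    this.seen.add(key);
    this.sent.push(to);
    return "sent";
  }
}

// The agent's action: safe to call again on a duplicate delivery.
function sendConfirmation(
  api: FakeEmailApi,
  traceId: string,
  to: string,
): "sent" | "duplicate" {
  return api.send(to, idempotencyKey(traceId, "act"));
}
```

The key must not include anything that changes between deliveries (a timestamp, a random ID), otherwise the retry looks like a new request.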
Pattern 4: Orchestrator vs choreography — pick deliberately
There are two ways to wire the flow, and the right choice depends on complexity.
Choreography (what I default to): each agent knows only the next step and enqueues it. The flow emerges from the chain. Simple, decentralized, easy to extend — add a stage by inserting a queue. The downside is that no single place describes the whole flow, so a complex pipeline can get hard to reason about.
Orchestration (a central coordinator): one orchestrator owns the flow, calls each agent in turn, and decides what’s next based on results. The whole flow lives in one readable place and branching logic is explicit. The cost is a central component that must itself be durable — if the orchestrator’s own state isn’t externalized (Pattern 2), it becomes the single point of failure.
My rule: choreography until branching gets complex, then a durable orchestrator. A linear three-stage pipeline is choreography. A flow with conditional routing, parallel fan-out, and joins wants an orchestrator whose state lives in the database so it can resume after a crash.
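A rough sketch of what a durable orchestrator can look like: the routing decision is a pure function of the persisted state, and the coordinator saves after every step, so a restart re-reads the record and continues. The step names, FlowState, and the review branch are invented for illustration.

```typescript
type Step = "classify" | "enrich" | "review" | "act";

interface FlowState {
  traceId: string;
  completed: Step[];    // steps already finished; persisted after each one
  needsReview: boolean; // a branching flag set from an earlier result
}

function has(s: FlowState, step: Step): boolean {
  return s.completed.indexOf(step) !== -1;
}

// Pure routing: given the persisted state, which step runs next?
function nextStep(s: FlowState): Step | "done" {
  if (!has(s, "classify")) return "classify";
  if (!has(s, "enrich")) return "enrich";
  if (s.needsReview && !has(s, "review")) return "review";
  if (!has(s, "act")) return "act";
  return "done";
}

// Run until done, saving after every step. A crash loses at most the step
// in flight, which is why each step still has to be idempotent.
function runFlow(
  state: FlowState,
  runStep: (step: Step, s: FlowState) => void,
  save: (s: FlowState) => void,
): FlowState {
  let current = state;
  while (true) {
    const step = nextStep(current);
    if (step === "done") return current;
    runStep(step, current);
    current = { ...current, completed: current.completed.concat(step) };
    save(current);
  }
}
```

Because nextStep depends only on saved state, resuming after a crash is the same call as starting fresh: load the record and call runFlow again.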
Pattern 5: Fan-out, fan-in without losing pieces
When one job spawns N parallel sub-tasks (enrich 50 records, summarize 20 docs) and you need to wait for all of them before continuing, you need a join. The trick is a counter in the job state:
- Parent enqueues N child messages and writes expected: N, completed: 0 to the job record.
- Each child does its work and atomically increments completed.
- The child that bumps completed to equal expected enqueues the next stage.
The atomic increment is load-bearing — without it, two children finishing simultaneously can both think they’re not the last, and the join never fires. Use a counter the datastore can increment atomically, or a transaction. This pattern lets you parallelize the expensive middle of a pipeline (often Haiku-cheap work — see the Haiku vs Sonnet cost math) while keeping a clean join at the end.
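The counter join can be sketched as follows. This is an in-memory illustration under stated assumptions: the Map stands in for the job table, and FanState, incrementCompleted, fanOut, and onChildDone are hypothetical names. The increment is only atomic here because the code runs synchronously, with no await between the read and the write.

```typescript
interface FanState {
  expected: number;
  completed: number;
}

// Stand-in for an atomic increment that returns the updated row
// (assumption: a real datastore provides this via an atomic counter
// or a transaction).
function incrementCompleted(
  store: Map<string, FanState>,
  traceId: string,
): FanState {
  const s = store.get(traceId);
  if (!s) throw new Error(`unknown job ${traceId}`);
  const next: FanState = { ...s, completed: s.completed + 1 };
  store.set(traceId, next);
  return next;
}

// Parent: record the expected count before enqueuing children, so no child
// can finish against a missing counter (a design choice of this sketch).
function fanOut(
  store: Map<string, FanState>,
  traceId: string,
  items: string[],
  enqueueChild: (item: string) => void,
): void {
  store.set(traceId, { expected: items.length, completed: 0 });
  for (const item of items) enqueueChild(item);
}

// Child: after its work, bump the counter. Only the caller that observes
// completed === expected enqueues the next stage.
function onChildDone(
  store: Map<string, FanState>,
  traceId: string,
  enqueueNext: () => void,
): void {
  const s = incrementCompleted(store, traceId);
  if (s.completed === s.expected) enqueueNext();
}
```

The decision to fire the join uses the value returned by the increment itself, not a separate read. Since delivery is at-least-once, a child that can be redelivered also needs its own duplicate check before incrementing, or the counter can overshoot.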
What I’d skip
You don’t need a heavyweight agent framework to do any of this. Queues, a state table, and idempotency keys are primitives every platform already has. I’ve watched teams reach for elaborate multi-agent frameworks to get features a queue gives you for free, and inherit a black box that’s harder to debug than the plumbing it replaced. Start with the boring primitives. Reach for a framework only when you’ve felt a specific pain it solves.
The summary: agents are stateless workers, queues are the durable backbone, state lives in a database, and every handoff is safe to run twice. That’s the whole game.
FAQ
Should agents call each other directly or go through a queue?
Through a queue. Direct calls couple agents — one’s failure or slowness propagates to the other, and you can’t scale or redeploy independently. A durable queue gives you buffering, retries, backpressure, and decoupling for free.
Where should multi-agent state live?
Outside the model, in a database, as a job record each agent reads and updates. Model calls are stateless, so the source of truth for pipeline progress must be external — that’s what makes the system restartable after a crash.
How do I prevent an agent from acting twice on the same job?
Make handoffs idempotent. Check the job’s stage before acting and no-op if it’s already advanced, and pass idempotency keys to external APIs. Queues deliver at-least-once, so assume every message can arrive twice and design so duplicates are harmless.
Do I need a multi-agent framework?
Usually no. Durable queues, a state table, and idempotency keys cover most production needs with primitives your platform already provides. Adopt a framework only when you hit a concrete problem it uniquely solves, not by default.