Multi-Agent Orchestration Patterns: Queues, State, and Handoffs

Alejandro Rioja

June 7, 2026 · 6 min read

TL;DR

Reliable multi-agent systems aren't about clever prompts — they're about boring distributed-systems discipline: durable queues between agents, state held outside the model, and idempotent handoffs that survive retries. The model is the worker; the queue is the backbone.

Table of contents

Pattern 1: Put a durable queue between every agent
Pattern 2: Hold state outside the model, always
Pattern 3: Make every handoff idempotent
Pattern 4: Orchestrator vs choreography — pick deliberately
Pattern 5: Fan-out, fan-in without losing pieces
What I’d skip
FAQ

Pattern 1: Put a durable queue between every agent

The first instinct is to call agent B directly from inside agent A. Don’t. Direct calls couple the two: if B is slow, A blocks; if B fails, A’s work is lost; if you need to scale B, you can’t without touching A.

Instead, A finishes its work and enqueues a message for B. B is a separate worker that drains the queue at its own pace.

typescript

// Agent A finishes, hands off via the queue — no direct call to B
await env.ENRICH_QUEUE.send({
  traceId,
  type: "enrich",
  payload: classifierResult,
});
// A's job is done. B will pick this up independently.

On Cloudflare I use Cloudflare Queues for exactly this; they are the same primitives behind the agent stack I use. The queue gives you four things for free: buffering (B can be down without losing work), retries (failed messages are redelivered), backpressure (a spike queues up instead of crashing B), and decoupling (scale or redeploy B without touching A). Every one of those is something you’d otherwise have to build by hand, and probably get wrong.
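To make buffering and retries concrete, here is a toy in-memory model of a queue with redelivery. It is a sketch for reasoning about the semantics, not Cloudflare’s API: `ToyQueue`, `drain`, `deadLetters`, and `maxAttempts` are names invented for this example, and a real queue persists messages outside the process and runs the consumer separately.

```typescript
// Toy model of a queue with redelivery. All names are hypothetical;
// a real queue stores messages durably, outside the worker process.
interface ToyMessage<T> {
  body: T;
  attempts: number; // how many times delivery has been tried so far
}

class ToyQueue<T> {
  private pending: ToyMessage<T>[] = [];
  readonly deadLetters: T[] = [];
  private readonly maxAttempts: number;

  constructor(maxAttempts: number) {
    this.maxAttempts = maxAttempts;
  }

  // Producer side: agent A only appends; it never waits on the consumer.
  send(body: T): void {
    this.pending.push({ body, attempts: 0 });
  }

  // Consumer side: agent B drains at its own pace. A handler that throws
  // gets the message redelivered until maxAttempts is exhausted.
  drain(handler: (body: T) => void): void {
    while (this.pending.length > 0) {
      const msg = this.pending.shift()!;
      msg.attempts += 1;
      try {
        handler(msg.body);
      } catch {
        if (msg.attempts < this.maxAttempts) {
          this.pending.push(msg); // retry later
        } else {
          this.deadLetters.push(msg.body); // give up, but keep the evidence
        }
      }
    }
  }
}

// Usage: B fails once on "job-2", yet nothing is lost.
const enrichQueue = new ToyQueue<string>(3);
enrichQueue.send("job-1");
enrichQueue.send("job-2");

const processed: string[] = [];
let failedOnce = false;
enrichQueue.drain((traceId) => {
  if (traceId === "job-2" && !failedOnce) {
    failedOnce = true;
    throw new Error("transient failure in agent B");
  }
  processed.push(traceId);
});
```

Agent A’s `send` calls return immediately, and the failed message is simply tried again. That retry is also why a handler can see the same message more than once, which matters for Pattern 3.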

Pattern 2: Hold state outside the model, always

The most common multi-agent bug is assuming the model remembers anything between steps. It doesn’t. Each model call is stateless; the only memory is what you put in the prompt. So the source of truth for “where is this job in the pipeline” must live in a database, not in a conversation.

I keep a single job record that every agent reads and updates:

typescript

interface JobState {
  traceId: string;
  stage: "classified" | "enriched" | "acted" | "done" | "failed";
  data: Record<string, unknown>;
  attempts: number;
  updatedAt: number;
}

Each agent does the same loop: read the job state, do its work, write the new state, enqueue the next stage. The model never holds the state — it receives the relevant slice as input and returns a result. This is what makes the system restartable: if a worker dies mid-job, the state record still says exactly where things stood, and the redelivered queue message picks up from there. It also makes debugging tractable, because the state table is a queryable record of every job’s journey — the same instrumentation mindset from how I measure whether an agent is working.
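That loop can be sketched as one function. This is a minimal illustration, assuming an in-memory `Map` stands in for the database and an array stands in for the queue; `runStage`, `jobStore`, `outbox`, and the `work` callback are hypothetical names, and how `attempts` is counted is my assumption rather than a rule.

```typescript
type Stage = "classified" | "enriched" | "acted" | "done" | "failed";

interface JobState {
  traceId: string;
  stage: Stage;
  data: Record<string, unknown>;
  attempts: number;
  updatedAt: number;
}

// Stand-ins for a real database table and a real queue (assumptions).
const jobStore = new Map<string, JobState>();
const outbox: { traceId: string; type: string }[] = [];

// One agent's loop: read state, do the work, write state, enqueue the next stage.
function runStage(
  traceId: string,
  from: Stage,
  to: Stage,
  nextType: string,
  work: (data: Record<string, unknown>) => Record<string, unknown>,
): void {
  const job = jobStore.get(traceId);
  if (!job || job.stage !== from) return; // nothing to do at this stage

  // Record the attempt before doing any work, so a crash leaves a trace.
  const attempted: JobState = { ...job, attempts: job.attempts + 1, updatedAt: Date.now() };
  jobStore.set(traceId, attempted);

  const result = work(job.data); // the model sees only this slice of state

  jobStore.set(traceId, {
    ...attempted,
    stage: to,
    data: { ...job.data, ...result },
    updatedAt: Date.now(),
  });
  outbox.push({ traceId, type: nextType });
}

// Usage: the first attempt dies mid-job, the redelivery picks up from the record.
jobStore.set("t-1", {
  traceId: "t-1",
  stage: "classified",
  data: { email: "ana@example.com" },
  attempts: 0,
  updatedAt: Date.now(),
});

try {
  runStage("t-1", "classified", "enriched", "act", () => {
    throw new Error("worker died mid-job");
  });
} catch {
  // A real queue would redeliver the message at this point.
}
runStage("t-1", "classified", "enriched", "act", () => ({ company: "Acme" }));
```

After the failed attempt the record still says `classified` with `attempts: 1`, so the second run has everything it needs without any memory of the first.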

Pattern 3: Make every handoff idempotent

Most queues, Cloudflare Queues included, guarantee at-least-once delivery, not exactly-once. That means a message can be delivered twice after network blips, retries, or redeploys. If your agent’s action isn’t idempotent, a double delivery double-acts: two confirmation emails, two bookings, two charges. This is the single nastiest class of orchestration bug, and it’s the one teams discover in production.

The fix is to make each handler idempotent. The simplest guard is a check on the job’s stage:

typescript

async function handleEnrich(msg: QueueMessage, env: Env) {
  // Read the durable job record; its stage decides whether there is work to do.
  const job = await getJob(env, msg.traceId);
  if (job.stage !== "classified") {
    // Already processed past this stage, so this is a duplicate delivery. Skip.
    return;
  }
  const result = await enrich(job.data);
  // Persist the result and the new stage, then hand off to the next agent.
  await advanceJob(env, msg.traceId, "enriched", result);
  await env.ACT_QUEUE.send({ traceId: msg.traceId, type: "act" });
}

The stage check makes the operation safe to run twice: the second delivery sees the job has already advanced and no-ops. For external side effects (sending an email, charging a card), pass an idempotency key to the downstream API so it deduplicates too. Assume every message will be delivered twice and design so that’s harmless — because eventually it will be.
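For the external side-effect case, here is a rough sketch of how an idempotency key and a conditional stage update can work together. Everything in it is an assumption for illustration: `FakeEmailApi` stands in for a downstream service that dedupes on a key (as some payment APIs do), and `actKey`, `advanceIf`, and `handleAct` are hypothetical helpers, not part of any library.

```typescript
type ActStage = "enriched" | "acted";

// Hypothetical downstream API that ignores a request whose key it has seen.
class FakeEmailApi {
  private seenKeys = new Set<string>();
  sentCount = 0;

  send(idempotencyKey: string, to: string): void {
    if (this.seenKeys.has(idempotencyKey)) return; // duplicate request, ignore
    this.seenKeys.add(idempotencyKey);
    this.sentCount += 1; // a real API would deliver the email to `to` here
  }
}

const stages = new Map<string, ActStage>(); // stand-in for the job table

// The key is derived from the job and the step, so every retry reuses it.
function actKey(traceId: string): string {
  return `${traceId}:act`;
}

// Compare-and-set: advance only if the job is still at the expected stage.
// In a SQL store this could be one UPDATE with the stage in the WHERE clause.
function advanceIf(traceId: string, from: ActStage, to: ActStage): boolean {
  if (stages.get(traceId) !== from) return false;
  stages.set(traceId, to);
  return true;
}

function handleAct(traceId: string, recipient: string, api: FakeEmailApi): void {
  if (stages.get(traceId) !== "enriched") return; // duplicate delivery, skip
  api.send(actKey(traceId), recipient); // side effect first, deduped downstream
  advanceIf(traceId, "enriched", "acted");
}

// Usage: simulate a worker that dies after sending but before advancing.
const emailApi = new FakeEmailApi();
stages.set("t-1", "enriched");
emailApi.send(actKey("t-1"), "ana@example.com"); // first delivery, cut short

// The redelivery runs the whole handler: the API dedupes, the stage advances.
handleAct("t-1", "ana@example.com", emailApi);

// A late duplicate is stopped by the stage check.
handleAct("t-1", "ana@example.com", emailApi);
```

Doing the side effect before advancing the stage is a deliberate choice in this sketch: a crash between the two steps leads to a harmless repeat on redelivery rather than a skipped action.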

Pattern 4: Orchestrator vs choreography — pick deliberately

There are two ways to wire the flow, and the right choice depends on complexity.

Choreography (what I default to): each agent knows only the next step and enqueues it. The flow emerges from the chain. Simple, decentralized, easy to extend — add a stage by inserting a queue. The downside is that no single place describes the whole flow, so a complex pipeline can get hard to reason about.

Orchestration (a central coordinator): one orchestrator owns the flow, dispatches work to each agent in turn, and decides what’s next based on results. The whole flow lives in one readable place and branching logic is explicit. The cost is a central component that must itself be durable: if the orchestrator’s own state isn’t externalized (Pattern 2), it becomes a single point of failure.

My rule: choreography until branching gets complex, then a durable orchestrator. A linear three-stage pipeline is choreography. A flow with conditional routing, parallel fan-out, and joins wants an orchestrator whose state lives in the database so it can resume after a crash.
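Here is one way the durable-orchestrator side of that rule could look. It is a sketch under assumptions: `decideNext`, `orchestrate`, `onTaskDone`, and the `needsEnrichment` routing flag are all hypothetical, and a `Map` and an array stand in for the database and the queues.

```typescript
type FlowStage = "classified" | "enriched" | "acted" | "done";
type Task = "enrich" | "act" | "finish";

interface FlowJob {
  traceId: string;
  stage: FlowStage;
  needsEnrichment: boolean; // hypothetical routing flag set by the classifier
}

// The whole flow in one readable place: given persisted state, what runs next?
function decideNext(job: FlowJob): Task | null {
  switch (job.stage) {
    case "classified":
      return job.needsEnrichment ? "enrich" : "act"; // conditional routing
    case "enriched":
      return "act";
    case "acted":
      return "finish";
    case "done":
      return null; // nothing left to do
  }
}

const flowStore = new Map<string, FlowJob>(); // stand-in for a database table
const dispatched: { traceId: string; task: Task }[] = []; // stand-in for queues

// One orchestrator step. It keeps nothing in memory between calls, so a
// fresh process can call it after a crash and continue from the stored stage.
function orchestrate(traceId: string): void {
  const job = flowStore.get(traceId);
  if (!job) return;
  const task = decideNext(job);
  if (task !== null) dispatched.push({ traceId, task });
}

// Called when an agent reports that a task finished.
function onTaskDone(traceId: string, task: Task): void {
  const job = flowStore.get(traceId);
  if (!job) return;
  const stageAfter: Record<Task, FlowStage> = {
    enrich: "enriched",
    act: "acted",
    finish: "done",
  };
  flowStore.set(traceId, { ...job, stage: stageAfter[task] });
  orchestrate(traceId);
}

// Usage: this job skips enrichment because the classifier said so.
flowStore.set("t-1", { traceId: "t-1", stage: "classified", needsEnrichment: false });
orchestrate("t-1"); // dispatches "act"
onTaskDone("t-1", "act"); // stage becomes "acted", dispatches "finish"
onTaskDone("t-1", "finish"); // stage becomes "done", nothing more to dispatch
```

The branching lives in `decideNext` and nowhere else, which is the readability win. The orchestrator stays restartable only because every decision is recomputed from the stored record.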

Pattern 5: Fan-out, fan-in without losing pieces

When one job spawns N parallel sub-tasks (enrich 50 records, summarize 20 docs) and you need to wait for all of them before continuing, you need a join. The trick is a counter in the job state:

1. The parent writes expected: N, completed: 0 to the job record, then enqueues N child messages, so no child can finish before the record exists.
2. Each child does its work and atomically increments completed.
3. The child that bumps completed to equal expected enqueues the next stage.

The atomic increment is load-bearing. Without it, two children finishing at the same moment can both read the same value of completed and both write back that value plus one, so one increment is lost, completed never reaches expected, and the join never fires. Use a counter the datastore can increment atomically, or a transaction. This pattern lets you parallelize the expensive middle of a pipeline (often Haiku-cheap work; see the Haiku vs Sonnet cost math) while keeping a clean join at the end.
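The join can be sketched in a few lines. This is an in-memory illustration, not a datastore implementation: `fanOut`, `completeChild`, `fanIn`, and the queue arrays are hypothetical names. Recording child ids in `doneChildren` is my own addition on top of the plain counter, so that a duplicate delivery (Pattern 3) cannot be counted twice.

```typescript
interface FanInState {
  expected: number;
  completed: number;
  doneChildren: Set<string>; // assumption: child ids kept to absorb duplicates
}

const fanIn = new Map<string, FanInState>(); // stand-in for the job record
const childQueue: { traceId: string; childId: string }[] = [];
const nextStageQueue: string[] = [];

function fanOut(traceId: string, childIds: string[]): void {
  // Write the counters first so no child can finish before the record exists.
  fanIn.set(traceId, {
    expected: childIds.length,
    completed: 0,
    doneChildren: new Set<string>(),
  });
  for (const childId of childIds) {
    childQueue.push({ traceId, childId });
  }
}

// This runs as one synchronous step, which is atomic in single-threaded
// JavaScript. Against a real datastore the same step has to be an atomic
// increment or a transaction.
function completeChild(traceId: string, childId: string): void {
  const state = fanIn.get(traceId);
  if (!state || state.doneChildren.has(childId)) return; // unknown job or duplicate
  state.doneChildren.add(childId);
  state.completed += 1;
  if (state.completed === state.expected) {
    nextStageQueue.push(traceId); // only the last child fires the join
  }
}

// Usage: three children, with child "b" delivered twice.
fanOut("t-1", ["a", "b", "c"]);
completeChild("t-1", "a");
completeChild("t-1", "b");
completeChild("t-1", "b"); // duplicate delivery, ignored
completeChild("t-1", "c"); // reaches expected, fires the join
completeChild("t-1", "c"); // late duplicate, join does not fire again
```

With a plain counter and no record of which children finished, the duplicate `b` would push `completed` to 3 before `c` had done its work, and the join would fire early. That is why this sketch tracks ids as well.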

What I’d skip

You don’t need a heavyweight agent framework to do any of this. Queues, a state table, and idempotency keys are primitives every platform already has. I’ve watched teams reach for elaborate multi-agent frameworks to get features a queue gives you for free, and inherit a black box that’s harder to debug than the plumbing it replaced. Start with the boring primitives. Reach for a framework only when you’ve felt a specific pain it solves.

The summary: agents are stateless workers, queues are the durable backbone, state lives in a database, and every handoff is safe to run twice. That’s the whole game.

FAQ

Should agents call each other directly or go through a queue?

Through a queue. Direct calls couple agents — one’s failure or slowness propagates to the other, and you can’t scale or redeploy independently. A durable queue gives you buffering, retries, backpressure, and decoupling for free.

Where should multi-agent state live?

Outside the model, in a database, as a job record each agent reads and updates. Model calls are stateless, so the source of truth for pipeline progress must be external — that’s what makes the system restartable after a crash.

How do I prevent an agent from acting twice on the same job?

Make handoffs idempotent. Check the job’s stage before acting and no-op if it’s already advanced, and pass idempotency keys to external APIs. Queues deliver at-least-once, so assume every message can arrive twice and design so duplicates are harmless.

Do I need a multi-agent framework?

Usually no. Durable queues, a state table, and idempotency keys cover most production needs with primitives your platform already provides. Adopt a framework only when you hit a concrete problem it uniquely solves, not by default.
