Alejandro Rioja.
The Agent Stack I Use to Run 30+ Production Agents (No Python)

Alejandro Rioja
TL;DR

I run 30+ production AI agents using TypeScript, Cloudflare Workers/Queues/KV, and Claude models — no Python, no agent framework. The stack is boring on purpose: Workers handle scheduling and queuing, KV stores state, and the Anthropic SDK drives the model calls directly. The constraint that matters is not the AI layer — it's the infrastructure around it.

Why no Python

The honest answer: I write TypeScript every day for my website and product work. Adding a second language for agents means two runtimes, two dependency trees, two deploy pipelines. The productivity cost isn’t theoretical — I’ve paid it on past projects and decided not to again.

The second reason is Cloudflare. Workers run TypeScript natively at the edge, with Queues, KV, Durable Objects, and Cron Triggers built in. The entire agent infrastructure I need — scheduling, state, async job processing — is one wrangler deploy away. There is no Python equivalent of that with the same operational surface area.

The third reason is that most “Python-is-better-for-AI” arguments are really “Python has more ML libraries.” I don’t train models. I call APIs. The Anthropic SDK is first-class TypeScript. LangChain and its cousins are complexity I don’t want. When you’re shipping agents, not researching them, simplicity wins.

The core infrastructure: three Cloudflare primitives

Every agent I run touches at least one of these three:

Cloudflare Workers: the compute layer. A Worker is the agent’s runtime: it receives a trigger (cron, queue message, HTTP), runs the model call(s), and writes outputs somewhere. Cold start is under 5ms. The limit to watch is CPU time, not wall-clock time: waiting on a model API response does not count against it. Cloudflare’s documented limits at the time of writing are roughly 10 ms of CPU per invocation on the free plan and a 30-second default on the paid plan, which can be raised to several minutes (up to 15 minutes for some Cron Triggers), so check the current limits page before you design around them. Almost everything I build fits in 30 seconds; the ones that don’t use Queues to fan out.

Cloudflare Queues: async job processing. When a task might take longer than a request, or when I need to fan out (generate 12 translations in parallel), I push messages onto a Queue and let bound consumers process them independently. No polling, no setTimeout hacks.

Cloudflare KV: lightweight state. Agent run history, last-processed timestamps, cached API responses. KV is eventually consistent (a write can take up to a minute or so to become visible in other locations), which is fine for agents because I’m not running transactions. It gives me a dead-simple key-value store I can read/write from any Worker without spinning up a database.
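
Here is a minimal sketch of the last-processed-timestamp pattern. The key layout (`agent:<name>:last-run`) and the helper names are hypothetical, and `KVLike` is a hand-written stand-in for the small part of a KV binding the sketch uses, so it runs without Cloudflare types.

```typescript
// Hand-written stand-in for the subset of a KV binding used below (assumption).
interface KVLike {
  get(key: string): Promise<string | null>;
  put(key: string, value: string): Promise<void>;
}

// Hypothetical key layout: one key per agent, holding an ISO timestamp.
function lastRunKey(agent: string): string {
  return `agent:${agent}:last-run`;
}

// Pure check: has at least minIntervalMs passed since the stored timestamp?
function isDue(lastRunIso: string | null, now: Date, minIntervalMs: number): boolean {
  if (lastRunIso === null) return true; // the agent has never run
  const last = Date.parse(lastRunIso);
  if (Number.isNaN(last)) return true; // unreadable state: run rather than stall
  return now.getTime() - last >= minIntervalMs;
}

// Read state, decide, run, then record the run. Because KV is eventually
// consistent, two overlapping invocations can both see the old value, so
// `work` should be safe to run twice.
async function runIfDue(
  kv: KVLike,
  agent: string,
  now: Date,
  minIntervalMs: number,
  work: () => Promise<void>,
): Promise<boolean> {
  const last = await kv.get(lastRunKey(agent));
  if (!isDue(last, now, minIntervalMs)) return false;
  await work();
  await kv.put(lastRunKey(agent), now.toISOString());
  return true;
}
```

Keeping `isDue` pure makes the scheduling rule testable without any binding; only `runIfDue` touches storage.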

The model layer: Anthropic SDK, two models

I use exactly two Claude models:

The Anthropic SDK in TypeScript is straightforward. Here’s the pattern I use for every model call:

typescript
import Anthropic from "@anthropic-ai/sdk";

// In a Worker, bindings and secrets arrive on `env` inside a handler,
// so the client is created per call instead of at module scope.
async function runAgent(prompt: string, systemPrompt: string, env: Env): Promise<string> {
  const client = new Anthropic({ apiKey: env.ANTHROPIC_API_KEY });

  const message = await client.messages.create({
    model: "claude-sonnet-4-6",
    max_tokens: 2048,
    system: systemPrompt,
    messages: [{ role: "user", content: prompt }],
  });

  // Without tools, the reply is a single text block; fail loudly otherwise.
  const block = message.content[0];
  if (!block || block.type !== "text") throw new Error("Unexpected content type");
  return block.text;
}

That’s the whole model interface. No abstractions on top. When I need tool use, I add a tools array. When I need streaming, I swap messages.create for messages.stream. There is no framework managing this for me — and I don’t want one.
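
Once a tools array is in play, the response can mix text blocks with tool calls, so reading only the first block stops being safe. The sketch below shows one way to pull each kind out. The block types are simplified local stand-ins for the SDK's types (only the fields used here are modeled), and the helper names are illustrative.

```typescript
// Simplified stand-ins for the SDK's content block types (assumption).
type TextBlock = { type: "text"; text: string };
type ToolUseBlock = { type: "tool_use"; id: string; name: string; input: unknown };
type ContentBlock = TextBlock | ToolUseBlock;

// Join every text block in order, ignoring tool calls.
function extractText(content: ContentBlock[]): string {
  return content
    .filter((block): block is TextBlock => block.type === "text")
    .map((block) => block.text)
    .join("\n");
}

// Return the first call to the named tool, or undefined if there is none.
function findToolCall(content: ContentBlock[], name: string): ToolUseBlock | undefined {
  return content.find(
    (block): block is ToolUseBlock => block.type === "tool_use" && block.name === name,
  );
}
```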

A real agent: the content pipeline

The most complex agent I run is the content pipeline. It generates blog posts, translates them into 12 languages, renders OG card SVGs, and drafts LinkedIn promos — all as drafts, gated behind my review before anything publishes.

The Worker entry point looks like this:

typescript
// src/workers/content-pipeline.ts
export default {
  async fetch(request: Request, env: Env): Promise<Response> {
    const { topic, slug } = await request.json<{ topic: string; slug: string }>();

    // Step 1: generate EN post
    const enPost = await generatePost(topic, env);
    await env.CONTENT_KV.put(`draft:${slug}:en`, enPost);

    // Step 2: fan out translations via Queue
    const locales = ["ar", "de", "es", "fr", "hi", "it", "ja", "ko", "nl", "pt", "ru", "zh"];
    for (const locale of locales) {
      await env.TRANSLATION_QUEUE.send({ slug, locale, content: enPost });
    }

    return Response.json({ status: "queued", slug });
  },
};
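
The loop above sends one message per call. Queue bindings also expose `sendBatch`, which takes a list of `{ body }` entries. Below is a sketch of building and chunking that list. The 100-message cap is my assumption about the per-call limit, so check the current Queues limits; batches also have a total size cap, which may make one-at-a-time sends the safer choice for long posts. `QueueLike` and the helper names are hypothetical.

```typescript
// Job shape inferred from the consumer shown in this post.
interface TranslationJob {
  slug: string;
  locale: string;
  content: string;
}

// Stand-in for the part of a Queue binding used here (assumption).
interface QueueLike<T> {
  sendBatch(messages: Iterable<{ body: T }>): Promise<void>;
}

// sendBatch expects each entry wrapped as { body }.
function buildJobs(slug: string, content: string, locales: string[]): { body: TranslationJob }[] {
  return locales.map((locale) => ({ body: { slug, locale, content } }));
}

// Split a list into consecutive chunks of at most `size` items.
function chunk<T>(items: T[], size: number): T[][] {
  if (size < 1) throw new RangeError("size must be at least 1");
  const out: T[][] = [];
  for (let i = 0; i < items.length; i += size) out.push(items.slice(i, i + size));
  return out;
}

// Assumed per-call message cap; verify against the current Queues limits.
const MAX_BATCH = 100;

// Send every job in as few sendBatch calls as the cap allows.
async function fanOut(
  queue: QueueLike<TranslationJob>,
  slug: string,
  content: string,
  locales: string[],
): Promise<number> {
  const batches = chunk(buildJobs(slug, content, locales), MAX_BATCH);
  for (const batch of batches) await queue.sendBatch(batch);
  return batches.length; // number of sendBatch calls made
}
```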

The Queue consumer handles each translation independently:

typescript
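// src/workers/translation-consumer.ts
export default {
  async queue(batch: MessageBatch<TranslationJob>, env: Env): Promise<void> {
    for (const message of batch.messages) {
      const { slug, locale, content } = message.body;
      try {
        const translated = await translatePost(content, locale, env);
        await env.CONTENT_KV.put(`draft:${slug}:${locale}`, translated);
        message.ack(); // acknowledged messages are not redelivered
      } catch (err) {
        // Retry only this message so one failure doesn't stall the rest of the batch.
        console.error(`translation failed for ${slug}:${locale}`, err);
        message.retry();
      }
    }
  },
};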

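Each translation is its own queue message and is acknowledged on its own, so a failure never undoes translations that already succeeded; the Queue redelivers unacknowledged messages automatically. How many run at once depends on the consumer’s batch size and concurrency settings: with a batch size of 1, each translation gets its own Worker invocation. Either way, I get 12 translations fanned out without managing threads or promise pools myself.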
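
If plain automatic retries hit the model API too quickly, a consumer can space them out by passing a delay to `message.retry()`. This is a sketch of exponential backoff driven by the message's attempt count. The 10-second base and 600-second cap are arbitrary illustration values, and `MessageLike` is a local stand-in for the queue message type.

```typescript
// Exponential backoff: 10s, 20s, 40s, ... capped at maxSeconds.
// `attempts` is expected to be 1 on the first delivery.
function retryDelaySeconds(attempts: number, baseSeconds = 10, maxSeconds = 600): number {
  const n = Math.max(1, Math.floor(attempts));
  return Math.min(maxSeconds, baseSeconds * 2 ** (n - 1));
}

// Stand-in for the parts of a queue message this sketch touches (assumption).
interface MessageLike<T> {
  body: T;
  attempts: number;
  ack(): void;
  retry(options?: { delaySeconds?: number }): void;
}

// Process one message: ack on success, schedule a delayed retry on failure.
async function handleMessage<T>(
  message: MessageLike<T>,
  work: (body: T) => Promise<void>,
): Promise<void> {
  try {
    await work(message.body);
    message.ack();
  } catch {
    message.retry({ delaySeconds: retryDelaySeconds(message.attempts) });
  }
}
```

Pairing this with a dead-letter queue or a retry cap in the consumer config keeps a permanently failing message from cycling forever.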

A real agent: the event promoter

Pickleland runs pickleball events. I built an agent that scans the booking platform for events in the next 4 days, drafts Facebook group posts per event, and surfaces them for my review before anything goes out.

The agent calls a scraping Worker, passes the event list to Claude with a structured prompt, and writes the draft posts to KV. The prompt is explicit about tone (community-focused, not salesy) and format (one post per event, under 150 words, include the booking link).

typescript
const systemPrompt = `You are a community manager for a pickleball facility.
Write Facebook group posts for upcoming events.
Rules:
- Max 150 words per post
- Lead with what's fun about the event, not the price
- Include the booking URL exactly as provided
- Do not use exclamation marks more than once per post
- Tone: friendly, local, not corporate`;
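
Prompt rules are requests, not guarantees, so a cheap mechanical check before a draft reaches review can catch the obvious misses. The sketch below covers the three rules that can be verified in code. The function name and the whitespace-based word count are illustrative choices.

```typescript
interface DraftCheck {
  ok: boolean;
  problems: string[];
}

// Check a drafted post against the mechanical rules in the system prompt.
function checkDraft(post: string, bookingUrl: string): DraftCheck {
  const problems: string[] = [];

  // Whitespace-separated tokens are a rough proxy for words.
  const words = post.trim().split(/\s+/).filter((w) => w.length > 0);
  if (words.length > 150) problems.push(`too long: ${words.length} words`);

  // The URL must appear verbatim, not shortened or paraphrased.
  if (!post.includes(bookingUrl)) problems.push("booking URL missing or altered");

  // At most one exclamation mark per post.
  const exclamations = (post.match(/!/g) ?? []).length;
  if (exclamations > 1) problems.push(`too many exclamation marks: ${exclamations}`);

  return { ok: problems.length === 0, problems };
}
```

Tone is not something a regex can judge, which is one more reason the human review step stays in the loop.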

The constraint that matters here isn’t the model — it’s the workflow. The agent runs on a cron trigger at 8am daily. The draft posts land in a review queue. I approve or edit, then a separate publish Worker fires. No event gets posted without a human seeing it first.
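
One way to make the "no post without review" rule structural rather than a habit is a small state machine that the publish Worker consults. This is a sketch with hypothetical status and action names, not a description of the actual review queue.

```typescript
type DraftStatus = "pending_review" | "approved" | "rejected" | "published";
type DraftAction = "approve" | "reject" | "publish";

// Allowed transitions. "publish" is only reachable from "approved",
// so a draft cannot go out without passing review.
const TRANSITIONS: Record<DraftStatus, Partial<Record<DraftAction, DraftStatus>>> = {
  pending_review: { approve: "approved", reject: "rejected" },
  approved: { publish: "published" },
  rejected: {},
  published: {},
};

// Return the next status, or throw if the action is not allowed from here.
function nextStatus(current: DraftStatus, action: DraftAction): DraftStatus {
  const next = TRANSITIONS[current][action];
  if (next === undefined) {
    throw new Error(`cannot ${action} a draft that is ${current}`);
  }
  return next;
}
```

Storing the status next to the draft in KV and calling `nextStatus` before every write means a bug in the cron agent cannot skip the review step.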

How I manage 30+ agents without losing my mind

The honest answer: Cloudflare’s dashboard is my control plane. Every Worker shows me invocation count, error rate, and CPU time. Every Queue shows message throughput and failures. KV shows storage usage.

Beyond that:

The discipline isn’t technical. It’s deciding what an agent is allowed to do autonomously versus what needs my sign-off. Content drafts: autonomous. Anything that touches a customer: human review. Anything that sends money: not an agent job.
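
That policy is small enough to write down as data. The sketch below encodes the three tiers from the paragraph above; the category names and the `assertAllowed` guard are hypothetical.

```typescript
type ActionKind = "content_draft" | "customer_facing" | "moves_money";
type Decision = "autonomous" | "human_review" | "forbidden";

// The three tiers: drafts run freely, customer-facing actions need
// sign-off, and anything that sends money is never an agent job.
const POLICY: Record<ActionKind, Decision> = {
  content_draft: "autonomous",
  customer_facing: "human_review",
  moves_money: "forbidden",
};

function decide(kind: ActionKind): Decision {
  return POLICY[kind];
}

// Guard to call before any side effect. Throws unless the action may proceed.
function assertAllowed(kind: ActionKind, humanApproved: boolean): void {
  const decision = decide(kind);
  if (decision === "forbidden") {
    throw new Error(`${kind} is not an agent job`);
  }
  if (decision === "human_review" && !humanApproved) {
    throw new Error(`${kind} requires human sign-off`);
  }
}
```

Because `POLICY` is typed as a `Record` over `ActionKind`, adding a new kind of action fails to compile until someone decides which tier it belongs to.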

What I’d change if I were starting today

One thing: I’d set up structured outputs (JSON mode) from day one instead of retrofitting them onto agents that already shipped. Parsing free-text Claude output is a tax. When you define a Zod schema for the expected response shape and have the model fill it in, you get typed data back and your downstream Workers don’t have to guess.

typescript
import { z } from "zod";

const EventPostSchema = z.object({
  headline: z.string().max(80),
  body: z.string().max(600),
  bookingUrl: z.string().url(),
  suggestedPostTime: z.enum(["morning", "afternoon", "evening"]),
});

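Then I pass the schema definition to Claude as a tool (expressed as JSON Schema in the tool’s input_schema), use tool_choice: { type: "tool", name: "format_post" }, and get a tool_use block back every time. Parsing that block’s input with the same Zod schema catches the occasional response that doesn’t fit. No regex, no “sometimes Claude adds a preamble” bugs.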
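
Here is a sketch of what the two halves can look like: the tool definition that goes into the request, and the validation applied to what comes back. To stay dependency-free it uses a hand-written JSON Schema and a hand-rolled check where `EventPostSchema.safeParse` would normally sit. The tool description and the helper names are illustrative.

```typescript
// Mirrors the fields of EventPostSchema from this post.
interface EventPost {
  headline: string;
  body: string;
  bookingUrl: string;
  suggestedPostTime: "morning" | "afternoon" | "evening";
}

const POST_TIMES = ["morning", "afternoon", "evening"] as const;

// Tool definition with a hand-written JSON Schema for its input.
const formatPostTool = {
  name: "format_post",
  description: "Return one Facebook group post as structured data.",
  input_schema: {
    type: "object",
    properties: {
      headline: { type: "string", maxLength: 80 },
      body: { type: "string", maxLength: 600 },
      bookingUrl: { type: "string" },
      suggestedPostTime: { type: "string", enum: [...POST_TIMES] },
    },
    required: ["headline", "body", "bookingUrl", "suggestedPostTime"],
  },
};

// Forcing the tool makes the model answer with a tool_use block for it.
const toolChoice = { type: "tool", name: "format_post" } as const;

function isUrl(value: string): boolean {
  try {
    new URL(value);
    return true;
  } catch {
    return false;
  }
}

// Validate the `input` of the returned tool_use block. Returns null on any
// mismatch so the caller can retry or flag the draft.
function parseEventPost(input: unknown): EventPost | null {
  if (typeof input !== "object" || input === null) return null;
  const { headline, body, bookingUrl, suggestedPostTime } = input as Record<string, unknown>;
  if (typeof headline !== "string" || headline.length > 80) return null;
  if (typeof body !== "string" || body.length > 600) return null;
  if (typeof bookingUrl !== "string" || !isUrl(bookingUrl)) return null;
  const time = POST_TIMES.find((t) => t === suggestedPostTime);
  if (time === undefined) return null;
  return { headline, body, bookingUrl, suggestedPostTime: time };
}
```

In a request, `formatPostTool` would go in the `tools` array and `toolChoice` in `tool_choice`; the validation step still matters because a forced tool call does not by itself guarantee every field constraint is met.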

The operator’s bottom line

The agent stack that works in production is the one you can debug at 10pm when something breaks. For me, that’s TypeScript + Cloudflare + Anthropic SDK — not because it’s the flashiest combination, but because every layer is observable, deployable, and replaceable independently. Frameworks are bets on abstractions. I’d rather own the plumbing.

