Alejandro Rioja.

The Cheapest Way to Run a Content Agent on Cloudflare

Alejandro Rioja
7 min read
TL;DR

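You can run a full content-generation and translation pipeline on Cloudflare Workers + Claude API for roughly $0.54 per post with Sonnet 3.5, or about $0.06 with Haiku (see the cost tables below). Workers and KV fit inside Cloudflare's free tier, which handles 100K requests/day; the only fixed cost is the $5/month Workers Paid plan that Queues requires. The constraint is that long-running LLM calls should not block the request that triggers them, which Queues solves by moving the work to async consumers.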


Why Cloudflare instead of AWS or Vercel

AWS Lambda charges per GB-second of compute. Vercel charges per execution and has aggressive rate limits on the hobby tier. Both add cold-start unpredictability at scale.

Cloudflare Workers runs on V8 isolates, not containers. Cold starts are sub-millisecond. The free tier is genuinely useful: 100,000 requests per day, 10ms CPU time per request (wall-clock time is longer). The paid plan is $5/month for 10 million requests.

For an AI content pipeline, the Workers compute cost is nearly zero. The real cost is the Claude API. That’s the only variable you need to optimize.

The architecture (text diagram)

code
HTTP trigger (cron or webhook)
        │
        ▼
  Worker: orchestrator
        │
        ├─ Write job metadata → KV (status: "pending")
        │
        └─ Enqueue 1 EN generation job
                │
                ▼
         Queue: content-jobs
                │
        ┌──────┴──────┐
        ▼             ▼
  Worker: writer   Worker: translator (×12 locales)
        │                    │
        ▼                    ▼
  Claude API             Claude API
        │                    │
        └──────┬─────────────┘
               ▼
        KV: posts/{slug}/{locale}
               │
               ▼
     Webhook → GitHub → deploy

The orchestrator Worker is the entry point. It writes an initial status record to KV, then pushes one job onto a Cloudflare Queue. The queue consumer picks up that job and writes the EN post; once the EN body is ready, it enqueues 12 translation jobs, one per locale, which the queue processes in parallel. Everything lands in KV keyed by slug and locale (the code below uses post:{slug}:{locale}). A final webhook kicks a GitHub Actions deploy.
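The last hop in the diagram, the webhook that kicks the deploy, is not shown in the Workers below. One way it could look is a call to GitHub's `repository_dispatch` endpoint. This is a minimal sketch, not my exact setup: the `content-published` event name, the `my-org`/`my-site` values, and the `GITHUB_TOKEN` secret are all placeholders, and it assumes the deploy workflow is configured to listen for that event.

```typescript
// Sketch: build the request that triggers a GitHub Actions deploy through
// the repository_dispatch endpoint. All names here are placeholders.
interface DispatchRequest {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
}

function buildDeployDispatch(
  owner: string, // hypothetical: GitHub account or org
  repo: string,  // hypothetical: repository that holds the site
  token: string, // hypothetical: token stored as a Worker secret
  slug: string
): DispatchRequest {
  return {
    url: `https://api.github.com/repos/${owner}/${repo}/dispatches`,
    init: {
      method: "POST",
      headers: {
        authorization: `Bearer ${token}`,
        accept: "application/vnd.github+json",
        // GitHub rejects API requests that lack a User-Agent header
        "user-agent": "content-pipeline-worker",
        "content-type": "application/json",
      },
      // event_type must match the workflow's repository_dispatch trigger
      body: JSON.stringify({
        event_type: "content-published",
        client_payload: { slug },
      }),
    },
  };
}

// Usage inside a Worker, once every locale is stored:
//   const { url, init } = buildDeployDispatch("my-org", "my-site", env.GITHUB_TOKEN, slug);
//   const res = await fetch(url, init);
//   if (!res.ok) throw new Error(`Deploy dispatch failed: ${res.status}`);
```

Returning a plain `{ url, init }` pair keeps the builder easy to unit test without touching the network.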

Cloudflare cost table

| Resource | Free tier | Paid ($5/mo) | My actual usage |
|---|---|---|---|
| Workers requests | 100K/day | 10M/month | ~500/day |
| Workers CPU time | 10ms/req | 30s/req | ~8ms/req |
| KV reads | 100K/day | 10M/month | ~2K/day |
| KV writes | 1K/day | 1M/month | ~300/day |
| Queues messages | Not available | 1M/month included | ~300/month |
| KV storage | 1 GB | 1 GB included | ~200 MB |

For a site publishing 10 posts/month, Workers and KV on their own would cost $0: requests, reads, and writes at that volume all fit in the free tier. The only paid feature I use is Queues, which requires the $5/month Workers Paid plan. That plan includes 1M queue messages.

My Cloudflare bill: $5/month flat, regardless of post volume up to ~3,000 posts/month.

Claude API cost table

This is where the real money goes. Here’s the breakdown per post.

EN generation (one post, ~1,500 words):

| Model | Input tokens | Output tokens | Cost |
|---|---|---|---|
| Claude Haiku 3.5 | ~2,000 | ~2,500 | $0.005 |
| Claude Sonnet 3.5 | ~2,000 | ~2,500 | $0.042 |

12 translations (each ~1,500 words):

| Model | Input tokens (×12) | Output tokens (×12) | Cost |
|---|---|---|---|
| Claude Haiku 3.5 | ~24,000 | ~30,000 | $0.054 |
| Claude Sonnet 3.5 | ~24,000 | ~30,000 | $0.500 |

Total per post (EN + 12 translations):

| Model | Total cost |
|---|---|
| Claude Haiku 3.5 | ~$0.059 |
| Claude Sonnet 3.5 | ~$0.542 |

I use Sonnet for EN generation (quality matters for the canonical post) and Haiku for translations (the source text is already written; the model just needs to faithfully translate). That blended approach costs about $0.10 per post in practice.
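To make the blended figure easy to re-run for other volumes, here is a small sketch of the arithmetic. The per-post costs are copied from the tables above and the $5 plan fee from the Cloudflare section; the function names are illustrative only.

```typescript
// Per-post Claude costs in USD, copied from the tables above.
const COST = {
  haiku: { en: 0.005, translations: 0.054 },
  sonnet: { en: 0.042, translations: 0.5 },
} as const;

type Model = keyof typeof COST;

// One post: EN generation on one model, 12 translations on another.
function costPerPost(enModel: Model, translationModel: Model): number {
  return COST[enModel].en + COST[translationModel].translations;
}

// Monthly total: flat Cloudflare plan fee plus Claude spend.
function monthlyCost(postsPerMonth: number, perPost: number, planFee = 5): number {
  return planFee + postsPerMonth * perPost;
}

const blended = costPerPost("sonnet", "haiku");    // 0.042 + 0.054
const allSonnet = costPerPost("sonnet", "sonnet"); // 0.042 + 0.500

console.log(blended.toFixed(3));                  // "0.096"
console.log(allSonnet.toFixed(3));                // "0.542"
console.log(monthlyCost(10, blended).toFixed(2)); // "5.96"
```

At 10 posts/month the Claude spend is about $1, so the flat plan fee dominates the bill until volume grows well past that.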

TypeScript Worker: the orchestrator

typescript
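// src/workers/orchestrator.ts
// Entry point: records a "pending" status in KV, then queues the EN generation job.
// Queue and KVNamespace are ambient types from @cloudflare/workers-types
// (assumption: the package is listed under "types" in tsconfig.json).

interface Env {
  CONTENT_QUEUE: Queue;
  POSTS_KV: KVNamespace;
}

export default {
  async fetch(request: Request, env: Env): Promise<Response> {
    // Reject bodies that are not a JSON object with string slug and topic
    let slug: unknown;
    let topic: unknown;
    try {
      ({ slug, topic } = await request.json<{ slug?: unknown; topic?: unknown }>());
    } catch {
      return Response.json({ error: "Body must be a JSON object" }, { status: 400 });
    }
    if (typeof slug !== "string" || typeof topic !== "string" || !slug || !topic) {
      return Response.json({ error: "slug and topic are required" }, { status: 400 });
    }

    // Write initial status
    await env.POSTS_KV.put(
      `status:${slug}`,
      JSON.stringify({ status: "pending", createdAt: Date.now() })
    );

    // Enqueue the EN generation job
    await env.CONTENT_QUEUE.send({
      type: "generate",
      slug,
      topic,
      locale: "en",
    });

    return Response.json({ queued: true, slug });
  },
};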
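Because the slug becomes part of every KV key, it may be worth validating the payload more strictly before anything is queued. The sketch below is one option under an assumption I'm introducing here: slugs are lowercase letters, digits, and single hyphens. `parseJobRequest` and `SLUG_PATTERN` are hypothetical names, not part of the Worker above.

```typescript
// Sketch: strict validation of the orchestrator's JSON payload.
interface JobRequest {
  slug: string;
  topic: string;
}

// Assumption: slugs look like "my-first-post" (no colons, slashes, or spaces),
// so they can never collide with the ":" separators used in KV keys.
const SLUG_PATTERN = /^[a-z0-9]+(?:-[a-z0-9]+)*$/;

function parseJobRequest(payload: unknown): JobRequest | null {
  if (typeof payload !== "object" || payload === null) return null;
  const { slug, topic } = payload as Record<string, unknown>;
  if (typeof slug !== "string" || !SLUG_PATTERN.test(slug)) return null;
  if (typeof topic !== "string" || topic.trim() === "") return null;
  return { slug, topic: topic.trim() };
}

// Usage inside fetch():
//   const job = parseJobRequest(await request.json());
//   if (!job) return Response.json({ error: "Invalid payload" }, { status: 400 });
```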

TypeScript Worker: the writer + translator

typescript
// src/workers/content-processor.ts
// Queue consumer: runs the EN "generate" job, then one "translate" job per locale.
interface Env {
  CONTENT_QUEUE: Queue<ContentJob>;
  POSTS_KV: KVNamespace;
  ANTHROPIC_API_KEY: string;
}

// Message shapes carried on the queue
type ContentJob =
  | { type: "generate"; slug: string; topic: string; locale: "en" }
  | { type: "translate"; slug: string; locale: string; sourceText: string };

const LOCALES = ["ar","de","es","fr","hi","it","ja","ko","nl","pt","ru","zh"];

// Sonnet for the canonical EN post; Haiku for translations to cut cost
const WRITER_MODEL = "claude-3-5-sonnet-20241022";
const TRANSLATOR_MODEL = "claude-3-5-haiku-20241022";

export default {
  async queue(batch: MessageBatch<ContentJob>, env: Env): Promise<void> {
    for (const msg of batch.messages) {
      const job = msg.body;

      try {
        if (job.type === "generate") {
          const post = await generatePost(job.topic, env.ANTHROPIC_API_KEY);
          await env.POSTS_KV.put(`post:${job.slug}:en`, post);

          // Fan out translations: one queue message per target locale
          for (const locale of LOCALES) {
            await env.CONTENT_QUEUE.send({
              type: "translate",
              slug: job.slug,
              locale,
              sourceText: post,
            });
          }
        } else if (job.type === "translate") {
          const translated = await translatePost(
            job.sourceText,
            job.locale,
            env.ANTHROPIC_API_KEY,
            TRANSLATOR_MODEL
          );
          await env.POSTS_KV.put(`post:${job.slug}:${job.locale}`, translated);
        }

        msg.ack();
      } catch (err) {
        // Do not ack a failed job; ask the queue to redeliver it instead
        console.error(`Job failed for ${job.slug} (${job.type}):`, err);
        msg.retry();
      }
    }
  },
};

// Shared Claude Messages API call used by both the writer and the translator
async function callClaude(prompt: string, model: string, apiKey: string): Promise<string> {
  const res = await fetch("https://api.anthropic.com/v1/messages", {
    method: "POST",
    headers: {
      "x-api-key": apiKey,
      "anthropic-version": "2023-06-01",
      "content-type": "application/json",
    },
    body: JSON.stringify({
      model,
      max_tokens: 4096,
      messages: [{ role: "user", content: prompt }],
    }),
  });
  if (!res.ok) {
    throw new Error(`Claude API error ${res.status}: ${await res.text()}`);
  }
  const data = await res.json<{ content: Array<{ type: string; text?: string }> }>();
  const text = data.content.find((block) => block.type === "text")?.text;
  if (!text) {
    throw new Error("Claude API returned no text content");
  }
  return text;
}

async function generatePost(topic: string, apiKey: string): Promise<string> {
  return callClaude(`Write a detailed blog post about: ${topic}`, WRITER_MODEL, apiKey);
}

async function translatePost(
  text: string,
  locale: string,
  apiKey: string,
  model: string
): Promise<string> {
  const prompt = `Translate the following blog post to ${locale}. Preserve all markdown. Keep code blocks in English.\n\n${text}`;
  return callClaude(prompt, model, apiKey);
}
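When a Claude call fails (a rate limit, for example), retrying immediately tends to fail again. Cloudflare Queues messages expose an `attempts` counter and accept a `delaySeconds` option on `retry()`, so a backoff helper along these lines could space the retries out. The 10-second base and 10-minute cap are illustrative values, not numbers from my pipeline, and `backoffSeconds` is a hypothetical helper.

```typescript
// Sketch: exponential backoff for queue retries, capped at a maximum delay.
function backoffSeconds(attempts: number, baseSeconds = 10, maxSeconds = 600): number {
  // attempts is 1 on the first delivery, so the first retry waits baseSeconds
  const exponent = Math.max(0, attempts - 1);
  return Math.min(maxSeconds, baseSeconds * 2 ** exponent);
}

// Usage in the queue consumer when a job throws:
//   msg.retry({ delaySeconds: backoffSeconds(msg.attempts) });

console.log(backoffSeconds(1));  // 10
console.log(backoffSeconds(3));  // 40
console.log(backoffSeconds(10)); // 600 (capped)
```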

Alternatives comparison

| Platform | Base cost | Per-request cost | Cold starts | TypeScript DX |
|---|---|---|---|---|
| Cloudflare Workers | $0–$5/mo | Near zero | Sub-ms | Excellent |
| AWS Lambda | $0 (limited) | $0.20/1M req | 100ms–1s | Good |
| Vercel Functions | $0 (limited) | Usage-based | 200ms–2s | Good |
| Self-hosted VPS | $5–$20/mo | $0 | None (always on) | Any |
| Fly.io Machines | $0 (limited) | Per GB-sec | ~500ms | Good |

Self-hosted is the one case where Cloudflare loses. A $6/month Hetzner VPS runs Node 24/7 with no cold starts and no per-request cost. But you manage deploys, uptime, and scaling yourself. Workers handles all that for free at my volume.

The AWS Lambda comparison sounds competitive on paper, but Lambda’s 15-minute max execution time is a real limit for long LLM chains. Workers has a 30-second CPU limit on the paid plan, which sounds worse but is fine in practice: the limit counts CPU time, not the wall-clock time a Worker spends waiting on the Claude API, and Queues split the pipeline into short async jobs instead of one long blocking invocation.

KV for state management

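KV is Cloudflare’s globally distributed key-value store. Reads are fast everywhere (edge-cached). Writes are eventually consistent and typically propagate globally within about 60 seconds, so a read right after a write can briefly return the old value.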

For a content pipeline, I use three key patterns:

typescript
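// Job status
await kv.put(`status:${slug}`, JSON.stringify({ status, updatedAt: Date.now() }));

// Post content
await kv.put(`post:${slug}:${locale}`, markdownContent);

// Index for listing. This read-modify-write is not atomic, so two jobs
// finishing at the same moment can overwrite each other's update.
const index = (await kv.get<string[]>("index:posts", "json")) ?? [];
if (!index.includes(slug)) {
  // Skip slugs already listed so retried jobs do not add duplicates
  await kv.put("index:posts", JSON.stringify([...index, slug]));
}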
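The status record starts as "pending", so something has to decide when a post is finished. One possible approach, sketched here as an assumption rather than what my pipeline does, is to list the `post:{slug}:` keys and check which locales are still missing. `missingLocales` and the "done" status value are hypothetical names.

```typescript
// EN plus the 12 translation locales used by the content processor
const ALL_LOCALES = ["en", "ar", "de", "es", "fr", "hi", "it", "ja", "ko", "nl", "pt", "ru", "zh"];

// Given the KV key names found for a post, return the locales not yet stored.
function missingLocales(
  slug: string,
  keyNames: string[],
  locales: string[] = ALL_LOCALES
): string[] {
  const prefix = `post:${slug}:`;
  const present = new Set(
    keyNames
      .filter((name) => name.startsWith(prefix))
      .map((name) => name.slice(prefix.length))
  );
  return locales.filter((locale) => !present.has(locale));
}

// Usage with a KV namespace binding:
//   const { keys } = await kv.list({ prefix: `post:${slug}:` });
//   const missing = missingLocales(slug, keys.map((k) => k.name));
//   if (missing.length === 0) {
//     await kv.put(`status:${slug}`, JSON.stringify({ status: "done", updatedAt: Date.now() }));
//   }
```

KV listings are eventually consistent too, so a check like this can lag behind the most recent writes by a short while.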

KV costs at my scale: $0. The free tier gives 1K writes/day and 100K reads/day. I write ~300 KV entries per month (25 posts × 13 locales = 325). Nowhere near the limit.

If you’re publishing 100 posts/month in 12 locales plus EN, that’s only ~1,300 post writes/month (100 × 13). Even at 100 posts/day you’d hit ~39,000 writes/month, still well under the 1M/month included in the $5 paid plan.
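For other volumes, the write count is easy to sketch. This assumes one KV write per stored locale (12 translations plus EN) and ignores the handful of status and index writes; the 1M figure is the paid-plan allowance from the Cloudflare cost table above.

```typescript
// Paid-plan KV write allowance from the Cloudflare cost table
const KV_WRITES_INCLUDED_PER_MONTH = 1000000;

// One write per stored locale: the translations plus the EN original
function postWritesPerMonth(postsPerMonth: number, translatedLocales = 12): number {
  return postsPerMonth * (translatedLocales + 1);
}

// Share of the monthly allowance consumed, as a fraction between 0 and 1
function shareOfAllowance(postsPerMonth: number): number {
  return postWritesPerMonth(postsPerMonth) / KV_WRITES_INCLUDED_PER_MONTH;
}

console.log(postWritesPerMonth(25));   // 325
console.log(postWritesPerMonth(3000)); // 39000
console.log(shareOfAllowance(3000));   // 0.039
```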

The operator’s bottom line

My full content pipeline (generate, translate, store, deploy) costs about $6/month all in: $5 for Cloudflare Workers Paid plus roughly $1 of Claude API usage (10 posts/month at ~$0.10 each). That’s it. No servers to manage, no container orchestration, no surprise bills from Lambda cold starts hitting a provisioned concurrency threshold.

The blended model strategy (Sonnet for EN, Haiku for translations) cuts Claude costs by roughly 80% compared with running everything on Sonnet (~$0.10 vs ~$0.54 per post), with no perceptible quality drop on translations. Run the numbers for your volume and you’ll find Cloudflare + Claude Haiku is genuinely the cheapest way to run this kind of pipeline at founder scale.


