The Cheapest Way to Run a Content Agent on Cloudflare
You can run a full content-generation and translation pipeline on Cloudflare Workers + Claude API for roughly $0.54 per post with Sonnet 3.5, or about $0.06 with Haiku 3.5. Cloudflare's free tier handles 100K requests/day, so you only pay when scale demands it. The constraint is that LLM calls are long-running, which Queues solves by moving them off the request path into background consumers.
Why Cloudflare instead of AWS or Vercel
AWS Lambda charges per GB-second of compute. Vercel charges per execution and has aggressive rate limits on the hobby tier. Both add cold-start unpredictability at scale.
Cloudflare Workers runs on V8 isolates, not containers. Cold starts are sub-millisecond. The free tier is genuinely useful: 100,000 requests per day and 10ms of CPU time per request. CPU time excludes time spent waiting on network I/O, so a Worker that is awaiting a slow Claude API response uses very little of it. The paid plan is $5/month and includes 10 million requests.
For an AI content pipeline, the Workers compute cost is nearly zero. The real cost is the Claude API. That’s the only variable you need to optimize.
The architecture (text diagram)
HTTP trigger (cron or webhook)
        │
        ▼
Worker: orchestrator
        │
        ├─ Write job metadata → KV (status: "pending")
        │
        └─ Enqueue 1 EN generation job
                  │
                  ▼
         Queue: content-jobs
                  │
           ┌──────┴──────┐
           ▼             ▼
   Worker: writer   Worker: translator (×12 locales)
           │             │
           ▼             ▼
      Claude API     Claude API
           │             │
           └──────┬──────┘
                  ▼
      KV: post:{slug}:{locale}
                  │
                  ▼
      Webhook → GitHub → deploy

The orchestrator Worker is the entry point. It writes an initial status record to KV, then pushes one generation job onto a Cloudflare Queue. The queue consumer picks that job up in its writer role (EN) and, once the EN body is ready, enqueues 12 translation jobs that are processed in parallel. In the code below, writer and translator are two job types handled by a single consumer Worker. Everything lands in KV keyed by post:{slug}:{locale}. A final webhook kicks off a GitHub Actions deploy.
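The post does not say how the final webhook reaches GitHub. One common option is GitHub's repository dispatch endpoint, so here is a minimal sketch under that assumption. The names `DispatchOptions`, `buildDispatchRequest`, `triggerDeploy`, the event type, and the user-agent string are all hypothetical and not part of the original pipeline.

```typescript
// Hypothetical options: owner, repo, eventType and token are placeholders.
interface DispatchOptions {
  owner: string;
  repo: string;
  token: string; // a GitHub token allowed to create repository dispatch events
  eventType: string; // matched by `on: repository_dispatch` in the workflow
  slug: string;
}

// Build the request separately from sending it, so it can be inspected in tests.
function buildDispatchRequest(opts: DispatchOptions): Request {
  return new Request(
    `https://api.github.com/repos/${opts.owner}/${opts.repo}/dispatches`,
    {
      method: "POST",
      headers: {
        authorization: `Bearer ${opts.token}`,
        accept: "application/vnd.github+json",
        // GitHub's API requires a User-Agent header; this value is arbitrary.
        "user-agent": "content-pipeline-worker",
        "content-type": "application/json",
      },
      body: JSON.stringify({
        event_type: opts.eventType,
        client_payload: { slug: opts.slug },
      }),
    }
  );
}

// Send the dispatch and fail loudly on any non-2xx response.
async function triggerDeploy(opts: DispatchOptions): Promise<void> {
  const res = await fetch(buildDispatchRequest(opts));
  if (!res.ok) {
    throw new Error(`GitHub dispatch failed: ${res.status}`);
  }
}
```

In this sketch the translator would call `triggerDeploy` once the last locale for a slug has been written. How "last" is detected is left open here, since the original does not describe it.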
Cloudflare cost table
| Resource | Free tier | Paid ($5/mo) | My actual usage |
|---|---|---|---|
| Workers requests | 100K/day | 10M/month | ~500/day |
| Workers CPU time | 10ms/req | 30s/req | ~8ms/req |
| KV reads | 100K/day | 10M/month | ~2K/day |
| KV writes | 1K/day | 1M/month | ~300/month |
| Queues messages | — | 1M/month included | ~300/month |
| KV storage | 1 GB | 1 GB included | ~200 MB |
For a site publishing 10 posts/month, Workers and KV usage fits inside the free tier. The only paid feature I use is Queues, which requires the $5/month Workers Paid plan; that plan includes 1M queue messages.
My Cloudflare bill: $5/month flat, regardless of post volume up to ~3,000 posts/month.
Claude API cost table
This is where the real money goes. Here’s the breakdown per post.
EN generation (one post, ~1,500 words):
| Model | Input tokens | Output tokens | Cost |
|---|---|---|---|
| Claude Haiku 3.5 | ~2,000 | ~2,500 | $0.005 |
| Claude Sonnet 3.5 | ~2,000 | ~2,500 | $0.042 |
12 translations (each ~1,500 words):
| Model | Input tokens (×12) | Output tokens (×12) | Cost |
|---|---|---|---|
| Claude Haiku 3.5 | ~24,000 | ~30,000 | $0.054 |
| Claude Sonnet 3.5 | ~24,000 | ~30,000 | $0.500 |
Total per post (EN + 12 translations):
| Model | Total cost |
|---|---|
| Claude Haiku 3.5 | ~$0.059 |
| Claude Sonnet 3.5 | ~$0.542 |
I use Sonnet for EN generation (quality matters for the canonical post) and Haiku for translations (the source text is already written; the model just needs to faithfully translate). That blended approach costs about $0.10 per post in practice: roughly $0.042 for the Sonnet EN post plus $0.054 for 12 Haiku translations.
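The per-post figures above come down to tokens multiplied by price. Here is a minimal sketch of that arithmetic, assuming a Sonnet 3.5 list price of $3 per million input tokens and $15 per million output tokens. Those prices are an assumption for illustration, so check the current Anthropic price list before relying on them. `ModelPrice`, `callCost`, `postCost`, and `monthlyTotal` are hypothetical helper names.

```typescript
// USD per million tokens.
interface ModelPrice {
  inputPerMTok: number;
  outputPerMTok: number;
}

// ASSUMED prices, for illustration only.
const SONNET: ModelPrice = { inputPerMTok: 3, outputPerMTok: 15 };

// Cost of a single API call in USD.
function callCost(inputTokens: number, outputTokens: number, price: ModelPrice): number {
  return (inputTokens * price.inputPerMTok + outputTokens * price.outputPerMTok) / 1_000_000;
}

// Cost of one post: one EN generation call plus one translation call per locale.
// Token counts default to the estimates used in the tables above.
function postCost(
  enPrice: ModelPrice,
  translationPrice: ModelPrice,
  locales = 12,
  inputTokens = 2_000,
  outputTokens = 2_500
): number {
  const en = callCost(inputTokens, outputTokens, enPrice);
  const translations = locales * callCost(inputTokens, outputTokens, translationPrice);
  return en + translations;
}

// Monthly bill: flat platform fee plus per-post API spend.
function monthlyTotal(postsPerMonth: number, costPerPost: number, platformFee = 5): number {
  return platformFee + postsPerMonth * costPerPost;
}
```

With these assumed prices, a Sonnet EN call comes to roughly $0.04 and 12 Sonnet translations to roughly $0.52, which is in line with the table. Passing a cheaper `ModelPrice` as `translationPrice` models the blended strategy, and `monthlyTotal(10, 0.10)` gives the monthly bill at 10 posts.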
TypeScript Worker: the orchestrator
// src/workers/orchestrator.ts
// Type-only import: @cloudflare/workers-types ships no runtime code.
import type { KVNamespace, Queue } from "@cloudflare/workers-types";

interface Env {
  CONTENT_QUEUE: Queue;
  POSTS_KV: KVNamespace;
}

export default {
  async fetch(request: Request, env: Env): Promise<Response> {
    const { slug, topic } = (await request.json()) as { slug: string; topic: string };

    // Reject malformed requests before touching KV or the queue
    if (!slug || !topic) {
      return Response.json({ error: "slug and topic are required" }, { status: 400 });
    }

    // Write initial status
    await env.POSTS_KV.put(
      `status:${slug}`,
      JSON.stringify({ status: "pending", createdAt: Date.now() })
    );

    // Enqueue the EN generation job
    await env.CONTENT_QUEUE.send({
      type: "generate",
      slug,
      topic,
      locale: "en",
    });

    return Response.json({ queued: true, slug });
  },
};

TypeScript Worker: the writer + translator
// src/workers/content-processor.ts
import type { KVNamespace, MessageBatch, Queue } from "@cloudflare/workers-types";

// Job shapes carried on the queue; `type` selects the branch in the consumer.
type ContentJob =
  | { type: "generate"; slug: string; topic: string; locale: "en" }
  | { type: "translate"; slug: string; locale: string; sourceText: string };

interface Env {
  CONTENT_QUEUE: Queue<ContentJob>;
  POSTS_KV: KVNamespace;
  ANTHROPIC_API_KEY: string;
}

const LOCALES = ["ar", "de", "es", "fr", "hi", "it", "ja", "ko", "nl", "pt", "ru", "zh"];

export default {
  async queue(batch: MessageBatch<ContentJob>, env: Env): Promise<void> {
    for (const msg of batch.messages) {
      const job = msg.body;
      try {
        if (job.type === "generate") {
          const post = await generatePost(job.topic, env.ANTHROPIC_API_KEY);
          await env.POSTS_KV.put(`post:${job.slug}:en`, post);

          // Fan out translations
          for (const locale of LOCALES) {
            await env.CONTENT_QUEUE.send({
              type: "translate",
              slug: job.slug,
              locale,
              sourceText: post,
            });
          }
        } else if (job.type === "translate") {
          const translated = await translatePost(
            job.sourceText,
            job.locale,
            env.ANTHROPIC_API_KEY,
            // Use Haiku for translations to cut cost
            "claude-3-5-haiku-20241022"
          );
          await env.POSTS_KV.put(`post:${job.slug}:${job.locale}`, translated);
        }
        msg.ack();
      } catch (err) {
        // Do not acknowledge failed jobs; ask the queue to redeliver them.
        console.error(`job failed for ${job.slug} (${job.type})`, err);
        msg.retry();
      }
    }
  },
};

// Shared helper: one Messages API call, returning the first text block.
async function callClaude(apiKey: string, model: string, prompt: string): Promise<string> {
  const res = await fetch("https://api.anthropic.com/v1/messages", {
    method: "POST",
    headers: {
      "x-api-key": apiKey,
      "anthropic-version": "2023-06-01",
      "content-type": "application/json",
    },
    body: JSON.stringify({
      model,
      max_tokens: 4096,
      messages: [{ role: "user", content: prompt }],
    }),
  });
  // Surface API errors instead of failing later on a missing `content` field
  if (!res.ok) {
    throw new Error(`Claude API error ${res.status}: ${await res.text()}`);
  }
  const data = (await res.json()) as { content: Array<{ type: string; text?: string }> };
  const text = data.content.find((block) => block.type === "text")?.text;
  if (!text) {
    throw new Error("Claude API returned no text content");
  }
  return text;
}

async function generatePost(topic: string, apiKey: string): Promise<string> {
  return callClaude(
    apiKey,
    "claude-3-5-sonnet-20241022",
    `Write a detailed blog post about: ${topic}`
  );
}

async function translatePost(
  text: string,
  locale: string,
  apiKey: string,
  model: string
): Promise<string> {
  return callClaude(
    apiKey,
    model,
    `Translate the following blog post to ${locale}. Preserve all markdown. Keep code blocks in English.\n\n${text}`
  );
}

Alternatives comparison
| Platform | Base cost | Per-request cost | Cold starts | TypeScript DX |
|---|---|---|---|---|
| Cloudflare Workers | $0–$5/mo | Near zero | Sub-ms | Excellent |
| AWS Lambda | $0 (limited) | $0.20/1M req | 100ms–1s | Good |
| Vercel Functions | $0 (limited) | Usage-based | 200ms–2s | Good |
| Self-hosted VPS | $5–$20/mo | $0 | None (always on) | Any |
| Fly.io Machines | $0 (limited) | Per GB-sec | ~500ms | Good |
Self-hosted is the one case where Cloudflare loses. A $6/month Hetzner VPS runs Node 24/7 with no cold starts and no per-request cost. But you manage deploys, uptime, and scaling yourself. Workers handles all that for free at my volume.
The AWS Lambda comparison sounds competitive on paper, but Lambda’s 15-minute max execution time is a real limit for long LLM chains. Workers has a 30-second CPU limit on the paid plan, which sounds worse but is fine in practice: CPU time does not include time spent waiting on the Claude API, and with Queues each job is a short, independent unit of async work rather than one long blocking invocation.
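Because each job is its own queue message, a failed Claude call can be retried on its own instead of restarting a whole chain. The sketch below shows one way to do that with exponential backoff, assuming the `attempts` field and the `retry({ delaySeconds })` option of Cloudflare Queues messages. `RetryableMessage`, `backoffSeconds`, `processWithBackoff`, and the 30-second base and 900-second cap are hypothetical choices, not values from the original pipeline.

```typescript
// Minimal shape of the parts of a queue message this sketch uses.
interface RetryableMessage<T> {
  body: T;
  attempts: number; // delivery attempts so far, starting at 1
  ack(): void;
  retry(options?: { delaySeconds?: number }): void;
}

// Exponential backoff: 30s, 60s, 120s, ... capped at maxSeconds.
// The base and cap are assumed values; tune them for your rate limits.
function backoffSeconds(attempts: number, baseSeconds = 30, maxSeconds = 900): number {
  const exponent = Math.max(0, attempts - 1);
  return Math.min(maxSeconds, baseSeconds * 2 ** exponent);
}

// Run the handler for one message: ack on success, delayed retry on failure.
async function processWithBackoff<T>(
  msg: RetryableMessage<T>,
  handler: (body: T) => Promise<void>
): Promise<void> {
  try {
    await handler(msg.body);
    msg.ack();
  } catch {
    msg.retry({ delaySeconds: backoffSeconds(msg.attempts) });
  }
}
```

Inside a `queue` handler this would be called once per entry in `batch.messages`. The queue's configured maximum retry count still decides when a message is finally dropped or dead-lettered.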
KV for state management
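KV is Cloudflare’s globally distributed key-value store. Reads are fast everywhere (edge-cached). It is eventually consistent: writes typically become visible globally within about 60 seconds, so a read from another location right after a write can still return the old value.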
For a content pipeline, I use three key patterns:
// kv is the KV namespace binding (env.POSTS_KV in the Workers above)

// Job status
await kv.put(`status:${slug}`, JSON.stringify({ status, updatedAt: Date.now() }));

// Post content
await kv.put(`post:${slug}:${locale}`, markdownContent);

// Index for listing (read-modify-write, so concurrent writers can overwrite each other)
const index = (await kv.get<string[]>("index:posts", "json")) ?? [];
await kv.put("index:posts", JSON.stringify([...index, slug]));

KV costs at my scale: $0. The free tier gives 1K writes/day and 100K reads/day. I write ~300 KV entries per month (25 posts × 13 locales = 325). Nowhere near the limit.
If you’re publishing 100 posts/day in 13 locales (EN + 12), you’d hit ~1,300 writes/day, or ~39,000 writes/month. That exceeds the free tier’s 1K writes/day but is still far under the 1M/month included in the $5 paid plan.
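One caveat with a single index key is that it is a read-modify-write: two jobs finishing at about the same time can each read the old list and one update is lost. A possible alternative, sketched here under the assumption that one small key per post is acceptable, is to write `index:posts:{slug}` keys and enumerate them with KV's `list` call. `KVLike`, `INDEX_PREFIX`, `indexKey`, `slugFromKey`, `addToIndex`, and `listSlugs` are hypothetical names, and `KVLike` only models the subset of the KV binding that this sketch uses.

```typescript
// Minimal subset of the KV binding used here; the real KVNamespace has more options.
interface KVLike {
  put(key: string, value: string): Promise<void>;
  list(options: { prefix: string; cursor?: string }): Promise<{
    keys: { name: string }[];
    list_complete: boolean;
    cursor?: string;
  }>;
}

const INDEX_PREFIX = "index:posts:"; // hypothetical key prefix

function indexKey(slug: string): string {
  return `${INDEX_PREFIX}${slug}`;
}

function slugFromKey(name: string): string {
  return name.slice(INDEX_PREFIX.length);
}

// One write per post, with no shared value to overwrite.
async function addToIndex(kv: KVLike, slug: string): Promise<void> {
  await kv.put(indexKey(slug), String(Date.now()));
}

// Page through every key under the prefix and return the slugs.
async function listSlugs(kv: KVLike): Promise<string[]> {
  const slugs: string[] = [];
  let cursor: string | undefined;
  do {
    const page = await kv.list({ prefix: INDEX_PREFIX, cursor });
    for (const key of page.keys) {
      slugs.push(slugFromKey(key.name));
    }
    cursor = page.list_complete ? undefined : page.cursor;
  } while (cursor);
  return slugs;
}
```

The trade-off is that listing costs list operations instead of a single read, and newly written keys are still subject to KV's eventual consistency.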
The operator’s bottom line
My full content pipeline (generate, translate, store, deploy) costs about $6/month at 10 posts/month: $5 for Cloudflare Workers Paid plus roughly $1 of Claude API usage at ~$0.10 per post. That’s it. No servers to manage, no container orchestration, no surprise bills from Lambda cold starts hitting a provisioned concurrency threshold.
The blended model strategy (Sonnet for EN, Haiku for translations) cuts Claude costs by roughly 80% compared with running everything on Sonnet (about $0.10 vs $0.54 per post), with no quality drop I can perceive in the translations. Run the numbers for your volume and you’ll find Cloudflare + Claude Haiku is genuinely the cheapest way to run this kind of pipeline at founder scale.
Related: The agent stack I use to run 30+ production agents · Event-triggered vs scheduled agents: which pattern for which job · How I measure whether an AI agent is actually working
Want to run a content pipeline like this for your site? Get in touch — I design and deploy production agent stacks for operator teams.