Case studies: my 30+ agents

Module 08 · est. 50 min · You’ll walk away with: a teardown of every agent I run in production — its shape, trigger, tools, where the human gate sits, what broke, and what I’d steal for your own fleet.

TL;DR: Here’s my actual fleet, opened up. Each agent is one of the four shapes from Module 03, hardened with the memory/tools/evals from Module 04, prompted per Module 05, guarded per Module 06, and deployed per Module 07. The pattern across all of them: embarrassingly narrow scope, a human gate on anything customer- or money-facing, and a draft-then-approve rhythm for the high-stakes ones. The wins came from picking the right shape. The failures came from the times I forgot that and tried to make one agent do two jobs. Steal whatever maps to your business.

[Operator’s read] This is the module I’d have paid for alone. Everyone teaches “how to build an agent.” Almost nobody shows you thirty of them running real businesses, including the ones that failed and got rebuilt. No theater. Here’s what actually runs.

How to read these

For each agent: Shape · Trigger · Tools · Human gate · Win/failure/rebuild. Find the ones shaped like a problem you have and copy the structure. They all share the same skeleton from Module 02 — these are organ transplants, not ground-up builds.

My whole stack, for reference: TypeScript / shell / prompt (no Python), Claude Code + Cowork as the cockpit, the Claude API as the brain, Cloudflare Pages/Workers + GitHub Actions cron as the runtime, MCP servers for tool connections. Two main businesses behind these: alejandrorioja.com + a handful of content sites, and Pickleland, a pickleball facility in Pflugerville, TX.

The Pickleland fleet (a real-world local business)

1. Promote upcoming events

Shape: Scheduled (runs a few days ahead of events).
Trigger: cron, GitHub Actions.
Tools: read events from PlayByPoint (our booking system), match each event to the right Facebook group audiences, draft a bundled email, draft per-event FB posts.
Human gate: hard yes. Drafts only. Nothing sends without my approval. This talks to customers and fills (or fails to fill) seats — exactly the “talks to a customer” trigger from Module 06.
Win: turned a recurring “what should we promote this week?” scramble into a one-click review. The agent does the scan-and-draft; I do the judgment.
What I’d steal: the bundle-then-split structure — one email to my list, but event-by-event posts to matched audiences. Generic blasts underperform; matched ones convert. The agent does the matching I’d never bother to do by hand.
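
A minimal sketch of the bundle-then-split structure, not the production code. The types `FacilityEvent`, `Audience`, and `Draft`, the tag-overlap matching rule, and the `pending_approval` status are all hypothetical; the real PlayByPoint and Facebook data will look different.

```typescript
// Hypothetical shapes for illustration only.
interface FacilityEvent {
  id: string;
  title: string;
  tags: string[]; // e.g. ["beginner", "tournament"]
}

interface Audience {
  groupName: string;
  interests: string[];
}

interface Draft {
  channel: "email" | "facebook";
  target: string;
  body: string;
  status: "pending_approval"; // drafts only; nothing is sent from this code
}

// An audience matches an event when they share at least one tag (assumed rule).
function matchAudiences(event: FacilityEvent, audiences: Audience[]): Audience[] {
  return audiences.filter((a) => a.interests.some((i) => event.tags.indexOf(i) !== -1));
}

// One bundled email for the whole list, plus one post per (event, matched audience).
function buildDrafts(events: FacilityEvent[], audiences: Audience[]): Draft[] {
  if (events.length === 0) return [];
  const drafts: Draft[] = [
    {
      channel: "email",
      target: "full-list",
      body: events.map((e) => `- ${e.title}`).join("\n"),
      status: "pending_approval",
    },
  ];
  for (const e of events) {
    for (const a of matchAudiences(e, audiences)) {
      drafts.push({
        channel: "facebook",
        target: a.groupName,
        body: `Coming up: ${e.title}`,
        status: "pending_approval",
      });
    }
  }
  return drafts;
}
```

Every draft carries the same status on purpose: the type has no "sent" state, so the approval step has to live outside this function.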

2. Reply to reviews and social comments

Shape: Polling.
Trigger: frequent cron, checks the Metricool inbox for new Google reviews and social comments across FB, IG, YouTube.
Tools: read inbox, draft reply in our voice, post (gated).
Human gate: drafts for anything spicy; the warm/easy ones I trust more. The few-shot examples from Module 05 are what make the voice right — warm, specific, accountable, never defensive.
Failure → rebuild: early version replied too generically (“Thanks for your feedback!”). Useless. The rebuild added few-shot examples of good replies and a rule to always reference a specific detail from the review. Tone fixed.
What I’d steal: the polling checkpoint (last-handled review ID) so it never double-replies, and the few-shot voice block. Any business with reviews needs this.
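
A minimal sketch of the polling checkpoint, assuming inbox items carry a numeric ID that only increases. That ordering is an assumption; if the real IDs are not ordered, a set of already-handled IDs does the same job. `InboxItem` and `selectUnhandled` are hypothetical names.

```typescript
interface InboxItem {
  id: number; // assumed to increase with each new review or comment
  text: string;
}

interface PollResult {
  toHandle: InboxItem[];
  nextCheckpoint: number;
}

// Keep only items newer than the checkpoint, oldest first, and compute
// the checkpoint to store once those items have been drafted.
function selectUnhandled(items: InboxItem[], lastHandledId: number): PollResult {
  const toHandle = items
    .filter((item) => item.id > lastHandledId)
    .sort((a, b) => a.id - b.id);
  const nextCheckpoint =
    toHandle.length > 0 ? toHandle[toHandle.length - 1].id : lastHandledId;
  return { toHandle, nextCheckpoint };
}
```

Persist `nextCheckpoint` only after the drafts are safely saved. If the run crashes midway, the next poll picks the same items up again instead of skipping them.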

3. Find viral reels

Shape: Scheduled.
Trigger: cron.
Tools: search IG/TikTok for viral pickleball reels (300k+ views), filter, summarize what’s trending for content inspiration.
Human gate: none needed — output is a list to me, reversible, low stakes. Auto-runs.
What I’d steal: the threshold filter (300k+ views) baked into the prompt as a hard rule. The agent’s value is filtering, not finding — anyone can find reels; the agent finds the ones worth copying.
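
A minimal sketch of the threshold as a hard rule. The prompt wording, the `Reel` shape, and the code-side re-check are assumptions added for illustration; the source only says the 300k+ rule lives in the prompt.

```typescript
const MIN_VIEWS = 300000;

// Hypothetical shape of one search result.
interface Reel {
  url: string;
  views: number;
  caption: string;
}

// The threshold is stated in the prompt as a rule, not a suggestion.
function buildReelPrompt(minViews: number): string {
  return [
    "Find viral pickleball reels for content inspiration.",
    `HARD RULE: only include reels with at least ${minViews} views.`,
    "If none qualify, return an empty list.",
  ].join("\n");
}

// Belt and braces: re-apply the same threshold to whatever comes back.
function enforceThreshold(reels: Reel[], minViews: number): Reel[] {
  return reels.filter((r) => r.views >= minViews);
}
```

Driving both the prompt and the filter from one constant means the rule cannot drift between what the model is told and what the code accepts.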

4. Quarterly team bonding

Shape: Scheduled (fires near quarter-end).
Trigger: cron, last-Saturday-of-quarter logic.
Tools: pick a conflict-free Saturday (verified against PlayByPoint for no tournaments), pick a restaurant from a rotating list, pull all employees + contractors from Gusto, send Google Calendar invites.
Human gate: light — I get a heads-up, but the logistics run themselves.
What I’d steal: the constraint-verification step — it doesn’t just pick a Saturday, it checks the booking system to confirm no conflict. That “verify the world is in the expected state” move (Module 03’s scheduled-agent failure mode) is what makes it trustworthy. And pulling the roster live from Gusto means it never invites someone who left.
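
The constraint-verification step can be sketched as follows. This version assumes the candidates are the Saturdays of the quarter's final month, tried latest first, and that the booking system's conflicts arrive as a list of `YYYY-MM-DD` strings. Both are assumptions, and the function names are hypothetical.

```typescript
// Saturdays of a month as YYYY-MM-DD strings, latest first.
// `month` is 0-based (0 = January), matching the JavaScript Date API.
function saturdaysLatestFirst(year: number, month: number): string[] {
  const found: string[] = [];
  const daysInMonth = new Date(Date.UTC(year, month + 1, 0)).getUTCDate();
  for (let day = daysInMonth; day >= 1; day--) {
    const d = new Date(Date.UTC(year, month, day));
    if (d.getUTCDay() === 6) found.push(d.toISOString().slice(0, 10));
  }
  return found;
}

// Pick the latest Saturday in the quarter's last month that is not blocked.
// Returns null when every candidate conflicts, so the caller can flag a human.
function pickBondingSaturday(
  year: number,
  quarter: 1 | 2 | 3 | 4,
  blockedDates: string[],
): string | null {
  const lastMonthOfQuarter = quarter * 3 - 1; // Q1 -> March (2), Q4 -> December (11)
  for (const candidate of saturdaysLatestFirst(year, lastMonthOfQuarter)) {
    if (blockedDates.indexOf(candidate) === -1) return candidate;
  }
  return null;
}
```

The `null` branch matters as much as the happy path: when every Saturday is taken, the agent should stop and say so rather than book over a tournament.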

The personal / operator fleet

5. Daily morning brief

Shape: Scheduled.
Trigger: cron, 6am Central (the UTC-offset dance from Module 07).
Tools: overnight summary, today’s calendar, top focus, review queue — cross-cutting, touches personal + work sources + Notion.
Human gate: none. It’s for me, reversible, low stakes — auto-sends. The textbook “auto-run” case from Module 06.
What I’d steal: this is the Module 02 digest agent, grown up. If you build one agent from this course, build this. It’s the gateway drug — low risk, daily value, and it teaches you the whole loop.
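
On the 6am Central trigger: GitHub Actions evaluates cron expressions in UTC, so 6am Central is 11:00 UTC during daylight time and 12:00 UTC during standard time. One way to handle that, offered as an assumption rather than a description of the actual setup, is to schedule both hours and let a small guard decide which run proceeds. The function names are hypothetical.

```typescript
// Hour of day (0-23) for `now` in the given IANA time zone.
function localHour(now: Date, timeZone: string): number {
  const formatted = new Intl.DateTimeFormat("en-US", {
    timeZone,
    hour: "numeric",
    hour12: false,
  }).format(now);
  // Some runtimes render midnight as "24"; normalize it to 0.
  return parseInt(formatted, 10) % 24;
}

// Only the run that lands in the 6am hour in Central time sends the brief.
function shouldRunBrief(now: Date): boolean {
  return localHour(now, "America/Chicago") === 6;
}
```

Checking the hour rather than the exact minute leaves room for scheduled runs that start a few minutes late.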

6. AI-trends digest

Shape: Scheduled, twice weekly.
Trigger: cron.
Tools: research new AI workflow techniques worth stealing, write up to 3 ideas (TL;DR / how it works / why it works / how to use it).
Human gate: none — output to me.
What I’d steal: the capped output (max 3 ideas). An uncapped “find everything new in AI” agent produces noise. Three good ones, formatted identically every time, is usable. Constraint is a feature.
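
A minimal sketch of the cap and the fixed format, assuming the model returns structured ideas that code then trims and renders. The `Idea` shape and function names are hypothetical; the four section labels come from the format described above.

```typescript
interface Idea {
  tldr: string;
  howItWorks: string;
  whyItWorks: string;
  howToUse: string;
}

const MAX_IDEAS = 3;

// Every idea is rendered with the same four labels, in the same order.
function formatIdea(idea: Idea, position: number): string {
  return [
    `Idea ${position}`,
    `TL;DR: ${idea.tldr}`,
    `How it works: ${idea.howItWorks}`,
    `Why it works: ${idea.whyItWorks}`,
    `How to use it: ${idea.howToUse}`,
  ].join("\n");
}

// Enforce the cap in code even if the prompt already asks for it.
function formatDigest(ideas: Idea[]): string {
  return ideas
    .slice(0, MAX_IDEAS)
    .map((idea, i) => formatIdea(idea, i + 1))
    .join("\n\n");
}
```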

7. Weekly marketing recap

Shape: Scheduled, Monday 7:30am.
Trigger: cron.
Tools: pull last week’s Pickleland marketing numbers, write a one-page report.
Human gate: none — it’s a report to me.
Failure → fix: early version ran Monday morning before some weekend data had finalized, a textbook “the world wasn’t ready” failure (Module 03). Fix: a guard that checks that the data is complete before writing and, if it isn’t, stops and flags instead of reporting half-numbers as if they were whole.
What I’d steal: the readiness guard. Scheduled agents assume the world is ready at run time. Make them check.
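
A minimal sketch of a readiness guard, assuming the data source can report per-day numbers with a `finalized` flag. That flag, the types, and the function name are hypothetical; the real check depends on what the marketing source exposes.

```typescript
interface DailyMetrics {
  date: string; // YYYY-MM-DD
  finalized: boolean;
}

type RecapDecision =
  | { action: "write_report" }
  | { action: "stop_and_flag"; missing: string[] };

// Write the report only when every expected day is present and finalized.
function checkReadiness(expectedDates: string[], received: DailyMetrics[]): RecapDecision {
  const missing: string[] = [];
  for (const date of expectedDates) {
    const ready = received.some((m) => m.date === date && m.finalized);
    if (!ready) missing.push(date);
  }
  if (missing.length === 0) return { action: "write_report" };
  return { action: "stop_and_flag", missing };
}
```

Returning the list of missing dates gives the flag something concrete to say, which makes the failure quick to diagnose.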

8. LMNT supplement reorder

Shape: Scheduled check.
Trigger: cron.
Tools: order LMNT Sparkling from drinklmnt.com via Claude in Chrome (browser automation), specified flavors/quantities, through checkout.
Human gate: spends money — so a small spend ceiling is the gate rather than per-order approval (Module 06: small, to a known vendor, capped). Below the ceiling it just runs.
What I’d steal: browser automation as a tool when there’s no API. Not everything has a clean API; sometimes the “tool” is driving a real browser. It’s less robust than an API call, so I keep the spend ceiling low and the scope tiny.
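
A minimal sketch of a spend ceiling as a gate, meant to run before the browser automation reaches checkout. The ceiling value, the types, and the fallback to manual approval are assumptions for illustration.

```typescript
interface PlannedOrder {
  vendor: string;
  totalUsd: number;
}

interface SpendPolicy {
  allowedVendor: string; // the one known vendor this agent may buy from
  ceilingUsd: number;
}

type SpendDecision = "proceed" | "needs_approval";

// Below the ceiling and at the known vendor: run. Anything else: ask a human.
function checkSpend(order: PlannedOrder, policy: SpendPolicy): SpendDecision {
  const validTotal = order.totalUsd >= 0 && order.totalUsd <= policy.ceilingUsd;
  const knownVendor = order.vendor === policy.allowedVendor;
  return validTotal && knownVendor ? "proceed" : "needs_approval";
}
```

A total that fails to parse (`NaN`) fails both comparisons, so a broken page scrape falls through to approval instead of through to checkout.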

9. WordPress multi-site manager

Shape: Conversational + scheduled hybrid.
Trigger: I invoke it conversationally for big changes; scheduled passes for routine audits.
Tools: update existing posts, research topics, create drafts across many sites (futuresharks.com, alejandrorioja.com, flux.la, and others) via the WordPress REST API.
Human gate: drafts for new content; the content-refresh passes (like the 2026 rewrite campaign) run more autonomously but reviewably.
What I’d steal: one tool layer, many sites. The same tools.ts (WordPress REST) points at a dozen sites by swapping a base URL. Build the integration once, reuse across every property. This is the “agents share tools.ts, not triggers” lesson from Module 03 made real.
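
A minimal sketch of "one tool layer, many sites", not the actual tools.ts. It builds request descriptions for the WordPress REST API posts endpoint (`/wp-json/wp/v2/posts`) and leaves sending and authentication to the caller. `createWpTools` and `WpRequest` are hypothetical names.

```typescript
// A description of an HTTP call; the caller adds auth headers and sends it.
interface WpRequest {
  url: string;
  method: "POST";
  body: string;
}

interface WpSite {
  baseUrl: string; // e.g. "https://flux.la"
}

// The same two tools, bound to whichever site's base URL is passed in.
function createWpTools(site: WpSite) {
  const root = site.baseUrl.replace(/\/+$/, "") + "/wp-json/wp/v2";
  return {
    // New content always starts as a draft, never as a published post.
    createDraft(title: string, content: string): WpRequest {
      return {
        url: `${root}/posts`,
        method: "POST",
        body: JSON.stringify({ title, content, status: "draft" }),
      };
    },
    // Updating an existing post targets /posts/<id>.
    updatePost(id: number, content: string): WpRequest {
      return {
        url: `${root}/posts/${id}`,
        method: "POST",
        body: JSON.stringify({ content }),
      };
    },
  };
}
```

Pointing the agent at another property is then a one-line change: `createWpTools({ baseUrl: "https://futuresharks.com" })`.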

10. Find-viral-reels / content-research siblings

Shape: Scheduled.
Note: I run several near-identical content-research agents that share a tool layer and differ only in their prompt (pickleball vs. other niches). This is the “copy the last one, change the prompt” philosophy from Module 02 at fleet scale. I don’t build new agents from scratch anymore — I clone a working one and re-aim it.
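
The clone-and-re-aim move can be sketched as data: the tool layer stays fixed and only the name and prompt change. The `ResearchAgent` shape, the prompt text, and the second niche are hypothetical.

```typescript
interface ResearchAgent {
  name: string;
  toolLayer: string; // identifier of the shared tool module
  prompt: string;
}

// Hypothetical prompt template; only the niche varies between siblings.
function researchPrompt(niche: string): string {
  return `Find viral ${niche} reels and summarize what is trending.`;
}

// A clone keeps the working agent's tool layer and swaps the aim.
function cloneAgent(base: ResearchAgent, name: string, niche: string): ResearchAgent {
  return { name, toolLayer: base.toolLayer, prompt: researchPrompt(niche) };
}

const pickleballReels: ResearchAgent = {
  name: "find-viral-reels",
  toolLayer: "tools.ts",
  prompt: researchPrompt("pickleball"),
};

// "home fitness" is a made-up niche for the example.
const fitnessReels = cloneAgent(pickleballReels, "find-viral-reels-fitness", "home fitness");
```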

The patterns across all 30+

Step back from the individual agents and here’s what’s actually true across the whole fleet:

1. They’re all embarrassingly narrow. Not one of them is “an AI assistant for my business.” Each does one job. The narrowness is why they’re reliable. The moment I tried to make event-promo also handle DMs, it got worse at both. (Module 03’s trap, learned the hard way.)

2. Scheduled dominates. Most of my fleet is cron-triggered. Polling is second. Event-triggered and conversational are rarer. When in doubt about a new agent, it’s probably scheduled. Don’t reach for fancy event plumbing you don’t need.

3. The human gate maps cleanly to blast radius. For-me-and-reversible → autonomous (morning brief, trends, recap). Talks-to-customers → draft-then-approve (event promo, social reply). Spends-money → spend ceiling (LMNT). Irreversible/high-stakes → conversational forever (a human drives every turn). I never decide the gate by vibes; I decide it by blast radius (Module 06).

4. They share tool layers, not triggers. Many agents reuse the same integrations (WordPress REST, the booking system, Slack, Gusto) via shared code and MCP servers. The trigger is what makes them separate agents. Build tools once, reuse everywhere.

5. Every failure became an eval. Social-reply’s generic-reply failure, weekly-recap’s stale-data failure, event-promo’s wrong-audience matches — each one is now a test case that can’t recur. The fleet’s reliability is accumulated past failures, encoded. That’s not a metaphor; it’s literally what the eval folders contain.

6. I rebuild without shame. Several of these are version 2 or 3. The first version of an agent is a hypothesis. You ship it, watch it, and the rebuild is informed by reality instead of imagination. Don’t try to build the perfect agent. Build the v1, learn, rebuild. The “30+ agents” headline includes a graveyard of v1s I’m glad I shipped anyway.
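
Pattern 5 can be made concrete with a small example. This is a minimal sketch of the generic-reply failure encoded as an eval case, assuming a simple string check; real evals may well use a model-graded rubric instead. The `ReplyEvalCase` shape and the sample review are hypothetical.

```typescript
interface ReplyEvalCase {
  review: string;
  mustMention: string[]; // specific details taken from the review
  bannedPhrases: string[]; // the generic filler that caused the failure
}

// Passes only if the reply references a detail and avoids the filler.
function passesReplyEval(reply: string, evalCase: ReplyEvalCase): boolean {
  const lower = reply.toLowerCase();
  const mentionsDetail = evalCase.mustMention.some(
    (detail) => lower.indexOf(detail.toLowerCase()) !== -1,
  );
  const usesFiller = evalCase.bannedPhrases.some(
    (phrase) => lower.indexOf(phrase.toLowerCase()) !== -1,
  );
  return mentionsDetail && !usesFiller;
}

// The original failure, kept as a permanent test case (sample review is made up).
const genericReplyCase: ReplyEvalCase = {
  review: "Loved the ball machine session, staff were great.",
  mustMention: ["ball machine", "staff"],
  bannedPhrases: ["thanks for your feedback"],
};
```

Run against the old behavior, `passesReplyEval("Thanks for your feedback!", genericReplyCase)` is false, so that regression is caught before a reply ever reaches a customer.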

Your fleet starts with one

You don’t start with thirty. You start with one — almost certainly your morning brief or your single highest-leverage yes/yes task from Module 01. You ship it (Module 07), watch it for a week, harden it from what breaks (Module 04), and then you clone it and re-aim it at the next task. That’s how thirty happens: not a grand plan, but one working agent copied thirty times, each re-pointed at a different narrow job. The first one is the hard one. After that, it’s repetition.

Hands-on lab — your fleet plan

Step 1 — Match your tasks to my agents. Take your classified task list (Module 03). For each, find the agent above it most resembles and note what you’d steal — the checkpoint, the few-shot voice, the readiness guard, the spend ceiling, the constraint-verify step.

Step 2 — Set your gates by blast radius. For each planned agent, write its human gate using the four buckets: autonomous / draft-then-approve / spend-ceiling / conversational-forever. Don’t decide by vibe; decide by what happens if it’s wrong.

Step 3 — Pick your fleet order. Sequence your first five agents. Rule: start with a for-you, reversible, autonomous one (morning brief energy) to build the muscle safely. Save the customer-facing and money-spending ones for after you trust your own building.

Step 4 — Identify your shared tool layer. Which integrations will more than one agent need (your Slack, your CRM, your calendar, your booking system)? Build or connect those first — via MCP servers where they exist — so later agents are prompt-swaps, not from-scratch builds.
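
For Step 2, the four buckets can be written down as a small decision function. This is a sketch under one assumption: when an agent falls into more than one bucket, the strictest gate wins, in the order shown. That ordering is my reading of the blast-radius idea, and the names are hypothetical.

```typescript
type Gate =
  | "autonomous"
  | "draft_then_approve"
  | "spend_ceiling"
  | "conversational_forever";

interface AgentProfile {
  irreversibleOrHighStakes: boolean;
  talksToCustomers: boolean;
  spendsMoney: boolean;
}

// Decide the gate from what happens if the agent is wrong, strictest first.
function chooseGate(profile: AgentProfile): Gate {
  if (profile.irreversibleOrHighStakes) return "conversational_forever";
  if (profile.talksToCustomers) return "draft_then_approve";
  if (profile.spendsMoney) return "spend_ceiling";
  return "autonomous"; // for-me and reversible
}
```

Filling in the three booleans for each planned agent forces the "what happens if it's wrong" question before any prompt gets written.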

Deliverable: a written fleet plan covering your first five agents, each with a shape, a trigger, a human gate set by blast radius, and a note on which stolen pattern from my fleet it’s based on. You finish the course with a map of the agents I’d run if your business were mine. Now go ship the first one. You already know how: every piece is in Modules 01 through 07. The only thing left is to do it.

That’s the whole playbook. The same one running my businesses today. If something breaks while you build, that’s not failure — that’s Module 07’s iterate loop, working as designed. Ship the v1. Watch it. Rebuild. Repeat thirty times.