How to read this

Most agent budgets blow up in one of two places: sending a giant prompt on every request, or running a frontier model on steps a cheaper one handles fine. This calculator makes both visible. Enter your real per-request token counts, then compare the tiers in the table. The gap between tiers is usually larger than people expect, and prompt caching often cuts cost more than switching models does.
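To make the arithmetic behind that comparison concrete, here is a minimal sketch of a per-request cost estimate. It is not the calculator's actual code. The tier names, the prices, and the `cache_read_multiplier` default are placeholder assumptions, and the one-time surcharge some providers add for writing to the cache is ignored. Substitute your provider's published rates before drawing conclusions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    """A model tier with HYPOTHETICAL prices in USD per million tokens."""
    name: str
    input_per_mtok: float
    output_per_mtok: float


def request_cost(tier: Tier, input_tokens: int, output_tokens: int,
                 cached_tokens: int = 0,
                 cache_read_multiplier: float = 0.1) -> float:
    """Estimate the USD cost of one request.

    cached_tokens is the part of the input served from the prompt cache.
    cache_read_multiplier is an ASSUMED discount: cached input is billed
    at this fraction of the normal input price.
    """
    if not 0 <= cached_tokens <= input_tokens:
        raise ValueError("cached_tokens must be between 0 and input_tokens")
    fresh_tokens = input_tokens - cached_tokens
    # Cached tokens count as a discounted fraction of a normal input token.
    billable_input = fresh_tokens + cached_tokens * cache_read_multiplier
    input_cost = billable_input * tier.input_per_mtok / 1_000_000
    output_cost = output_tokens * tier.output_per_mtok / 1_000_000
    return input_cost + output_cost


# Placeholder tiers, not real price points.
SMALL = Tier("small", input_per_mtok=1.0, output_per_mtok=5.0)
LARGE = Tier("large", input_per_mtok=3.0, output_per_mtok=15.0)

for tier in (SMALL, LARGE):
    # Example workload: 20k-token prompt, 500-token reply, 18k tokens cacheable.
    plain = request_cost(tier, 20_000, 500)
    cached = request_cost(tier, 20_000, 500, cached_tokens=18_000)
    print(f"{tier.name}: ${plain:.4f} uncached, ${cached:.4f} with caching")
```

With these placeholder numbers, the larger tier with caching comes out cheaper per request than the smaller tier without it. That result depends entirely on the assumed prices and on how much of the prompt is cacheable, so treat it as an illustration of the effect rather than a finding.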

For the full reasoning behind these trade-offs, see AI agent cost math: when Haiku beats Sonnet and prompt caching without switching models.