GPT vs Claude vs Gemini: which AI API is cheapest in 2026?

Published 10 June 2026 · reference prices, verify before budgeting

"Which AI API is cheapest?" has no single answer — it depends entirely on your token mix. But once you look at the actual per-token prices, clear patterns emerge. Here's the honest, numbers-first comparison.

The headline prices ($ per 1M tokens)

Model	Input	Output	Tier
GPT-4o	$2.50	$10.00	Frontier
Claude Sonnet 4	$3.00	$15.00	Frontier
Claude Opus 4	$15.00	$75.00	Top-end
Gemini 2.5 Pro	$1.25	$10.00	Frontier
GPT-4o mini	$0.15	$0.60	Small
Claude Haiku 3.5	$0.80	$4.00	Small
Gemini 2.0 Flash	$0.10	$0.40	Small

⚠️ Reference prices, June 2026. Always verify on the provider's page. Try your own numbers in the cost calculator.

The biggest lever isn't the brand — it's the tier

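Notice the gap. Within the same provider, the small model is far cheaper than the frontier one: roughly 4× to 25× at the prices above. GPT-4o mini vs GPT-4o: about 17× cheaper on both input and output. Gemini 2.0 Flash vs Gemini 2.5 Pro: 12.5× cheaper on input and 25× on output. Claude Haiku 3.5 vs Claude Sonnet 4 is the narrowest gap, at just under 4×. For OpenAI and Google, that difference is far larger than the gap between the frontier models themselves (at most 2.4× on input and 1.5× on output). So the first question is never "GPT or Claude?" but "do I actually need a frontier model for this task?"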
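To check the tier gap yourself, here is a minimal Python sketch that divides each frontier price by the matching small-model price. The prices are copied from the table above; the frontier/small pairings and the names `PRICES`, `PAIRS` and `price_ratios` are assumptions made for this illustration, not anything the providers define.

```python
# Reference prices from the table above, in USD per 1M tokens: (input, output).
PRICES = {
    "GPT-4o": (2.50, 10.00),
    "GPT-4o mini": (0.15, 0.60),
    "Claude Sonnet 4": (3.00, 15.00),
    "Claude Haiku 3.5": (0.80, 4.00),
    "Gemini 2.5 Pro": (1.25, 10.00),
    "Gemini 2.0 Flash": (0.10, 0.40),
}

# Which small model to compare with which frontier model is an assumption
# of this sketch: one pair per provider.
PAIRS = [
    ("GPT-4o", "GPT-4o mini"),
    ("Claude Sonnet 4", "Claude Haiku 3.5"),
    ("Gemini 2.5 Pro", "Gemini 2.0 Flash"),
]


def price_ratios(frontier, small):
    """Return (input_ratio, output_ratio): how many times cheaper the small model is."""
    frontier_in, frontier_out = PRICES[frontier]
    small_in, small_out = PRICES[small]
    return frontier_in / small_in, frontier_out / small_out


for frontier, small in PAIRS:
    in_ratio, out_ratio = price_ratios(frontier, small)
    print(f"{small} vs {frontier}: {in_ratio:.1f}x input, {out_ratio:.1f}x output")
```

Swap in current prices before relying on the ratios, since the gap moves whenever a provider reprices a tier.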

For classification, routing, extraction, tagging, short summaries and simple chat, a small model almost always passes the eval. Reserve the expensive models for genuinely hard reasoning, long-context analysis, or quality-critical writing.

A concrete example

Say you process 1,000 requests/day, each with 1,000 input tokens and 500 output tokens. Over a 30-day month that is 30M input tokens and 15M output tokens. Monthly cost:

Model	Cost / month
Gemini 2.0 Flash	$9
GPT-4o mini	$13.50
Claude Haiku 3.5	$84
Gemini 2.5 Pro	$187.50
GPT-4o	$225
Claude Sonnet 4	$315
Claude Opus 4	$1,575

Same workload, a 175× spread from cheapest to most expensive. If a small model does the job, you're choosing between ~$9 and ~$1,575/month. That's the whole game.
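The arithmetic behind these monthly figures is short enough to sketch in Python. This assumes a 30-day month and the reference prices from the headline table; the function name `monthly_cost` and its parameters are made up for this example and are not part of any provider SDK.

```python
# Reference prices from the headline table, in USD per 1M tokens: (input, output).
PRICES = {
    "GPT-4o": (2.50, 10.00),
    "Claude Sonnet 4": (3.00, 15.00),
    "Claude Opus 4": (15.00, 75.00),
    "Gemini 2.5 Pro": (1.25, 10.00),
    "GPT-4o mini": (0.15, 0.60),
    "Claude Haiku 3.5": (0.80, 4.00),
    "Gemini 2.0 Flash": (0.10, 0.40),
}


def monthly_cost(model, requests_per_day=1_000, input_tokens=1_000,
                 output_tokens=500, days=30):
    """Estimated monthly spend in USD for one model at the reference prices."""
    input_price, output_price = PRICES[model]
    requests = requests_per_day * days
    # Prices are quoted per 1M tokens, so convert token counts to millions.
    input_cost = requests * input_tokens / 1_000_000 * input_price
    output_cost = requests * output_tokens / 1_000_000 * output_price
    return input_cost + output_cost


# Print the models from cheapest to most expensive for the default workload.
for model in sorted(PRICES, key=monthly_cost):
    print(f"{model:<18} ${monthly_cost(model):>9,.2f}")
```

Change `input_tokens` and `output_tokens` to your own averages to see how the bill moves for your workload.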

Who wins on what

Cheapest overall: Gemini 2.0 Flash.
Cheapest from OpenAI: GPT-4o mini, the safe default for most production tasks.
Best free tier: Gemini (AI Studio), no credit card needed.
Strongest reasoning per dollar: debatable, but Claude Sonnet and GPT-4o trade blows. For output-heavy workloads, compare output prices first: at the reference prices above, GPT-4o ($10.00) undercuts Claude Sonnet 4 ($15.00) per 1M output tokens.

Three rules to cut the bill

1. Default to the small model. Only escalate when an eval proves you need more.
2. Watch output, not input. Output costs 4–8× more per token than input in the table above, so cap max tokens.
3. Measure before you scale. Run your real token counts through the calculator before switching a feature on for every user.
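To see why capping output matters, the sketch below computes a worst-case ceiling on monthly output spend, assuming every response runs all the way to the cap. It uses GPT-4o's reference output price and the 1,000 requests/day workload from the example earlier; `output_cost_ceiling` is a hypothetical helper, and the name of the actual token-limit setting differs between provider APIs.

```python
def output_cost_ceiling(output_price_per_m, max_output_tokens, requests_per_month):
    """Upper bound on monthly output spend in USD if every response hits the cap."""
    return requests_per_month * max_output_tokens / 1_000_000 * output_price_per_m


REQUESTS_PER_MONTH = 30_000   # 1,000 requests/day over an assumed 30-day month
GPT_4O_OUTPUT_PRICE = 10.00   # USD per 1M output tokens, reference price

# Compare a few candidate caps; the cap values themselves are illustrative.
for cap in (250, 500, 1_000):
    ceiling = output_cost_ceiling(GPT_4O_OUTPUT_PRICE, cap, REQUESTS_PER_MONTH)
    print(f"cap={cap:>5} tokens -> at most ${ceiling:,.2f}/month on output")
```

The ceiling scales linearly with the cap, so halving the allowed output halves the worst-case output bill. Real spend is usually lower because most responses stop before the limit.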

Want the exact number for your usage? Open the AI API Cost Calculator →

Reference estimates, June 2026. Not affiliated with OpenAI, Anthropic or Google.