The cheapest AI API in 2026 — real cost at 1 million requests

Published 2026-06-12 · reference prices, verify before budgeting

"Which AI API is cheapest?" has no answer until you fix a workload. So we did. Same job on every major model: 1,000,000 requests, each with 1,000 input tokens and 500 output tokens (a typical short chatbot or generation call). That's 1 billion input tokens and 500 million output tokens. Here's what each provider charges.

The numbers

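| Model | Input $/1M tokens | Output $/1M tokens | Cost for 1M requests |
| --- | --- | --- | --- |
| Gemini 2.0 Flash (cheapest) | $0.10 | $0.40 | $300 |
| GPT-4o mini | $0.15 | $0.60 | $450 |
| DeepSeek-V3 | $0.27 | $1.10 | $820 |
| Gemini 2.5 Flash | $0.30 | $2.50 | $1,550 |
| Claude Haiku 3.5 | $0.80 | $4.00 | $2,800 |
| GPT-4o | $2.50 | $10.00 | $7,500 |
| Claude Sonnet 4 | $3.00 | $15.00 | $10,500 |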

Same workload. $300 on Gemini 2.0 Flash, $10,500 on Claude Sonnet 4 — a 35× spread for the exact same number of requests. The model you pick is, by a wide margin, the biggest lever on your AI bill.
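To check the totals yourself, here is a minimal sketch of the arithmetic: tokens per request times request count, divided by one million, times the per-1M price, summed for input and output. The prices are copied from the table; the names `PRICES` and `workload_cost` are ours, for illustration only.

```python
# Prices in USD per 1M tokens as (input, output), copied from the table above.
PRICES = {
    "Gemini 2.0 Flash": (0.10, 0.40),
    "GPT-4o mini": (0.15, 0.60),
    "DeepSeek-V3": (0.27, 1.10),
    "Gemini 2.5 Flash": (0.30, 2.50),
    "Claude Haiku 3.5": (0.80, 4.00),
    "GPT-4o": (2.50, 10.00),
    "Claude Sonnet 4": (3.00, 15.00),
}


def workload_cost(price_in, price_out, requests=1_000_000,
                  tokens_in=1_000, tokens_out=500):
    """Return the USD cost of a workload given per-1M-token prices."""
    input_cost = requests * tokens_in / 1_000_000 * price_in
    output_cost = requests * tokens_out / 1_000_000 * price_out
    return input_cost + output_cost


# Cost of the article's workload on every model, cheapest first.
costs = {model: workload_cost(p_in, p_out)
         for model, (p_in, p_out) in PRICES.items()}
for model, cost in sorted(costs.items(), key=lambda kv: kv[1]):
    print(f"{model:<18} ${cost:>9,.0f}")

# Ratio between the most and least expensive model for the same workload.
spread = max(costs.values()) / min(costs.values())
print(f"spread: {spread:.0f}x")
```

Change `requests`, `tokens_in` and `tokens_out` to match your own traffic; the ranking can shift when your output-to-input ratio differs from the 1:2 used here.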

But cheapest isn't always right

The catch every honest comparison has to add: a cheaper model that needs two tries, or writes longer answers, can cost more than a pricier one that nails it first time. Quality matters. The pattern most teams land on is tiering: a cheap model (Gemini Flash, GPT-4o mini, DeepSeek) for routine, high-volume calls, and a frontier model (GPT-4o, Claude Sonnet) only for the hard requests. Even sending 20% of traffic to a frontier model keeps the bill far below frontier-only pricing: at the table's prices, 80% on Gemini 2.0 Flash plus 20% on GPT-4o comes to about $1,740 for this workload, and about $2,340 with Claude Sonnet 4 as the frontier tier.
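Both effects are easy to put numbers on. The sketch below uses the per-workload totals from the table; the 20% frontier share comes from the paragraph above, while the "two attempts per request" figure is a hypothetical retry rate chosen for illustration, not a measured one. The function names are ours.

```python
def blended_cost(cheap_cost, frontier_cost, frontier_share):
    """Cost when `frontier_share` of requests go to the frontier model."""
    return (1 - frontier_share) * cheap_cost + frontier_share * frontier_cost


def cost_with_retries(base_cost, avg_attempts):
    """Cost when each request takes `avg_attempts` calls on average."""
    return base_cost * avg_attempts


# Totals for the 1M-request workload, taken from the table above (USD).
GEMINI_20_FLASH = 300
DEEPSEEK_V3 = 820
GEMINI_25_FLASH = 1_550
GPT_4O = 7_500
CLAUDE_SONNET_4 = 10_500

# Tiering: 80% of traffic on the cheapest model, 20% on a frontier model.
print(f"${blended_cost(GEMINI_20_FLASH, GPT_4O, 0.20):,.0f}")           # $1,740
print(f"${blended_cost(GEMINI_20_FLASH, CLAUDE_SONNET_4, 0.20):,.0f}")  # $2,340

# Retries (assumed rate): DeepSeek-V3 at two attempts per request
# costs more than Gemini 2.5 Flash at one attempt.
retried = cost_with_retries(DEEPSEEK_V3, avg_attempts=2)
print(f"${retried:,.0f} vs ${GEMINI_25_FLASH:,.0f}")  # $1,640 vs $1,550
```

The retry multiplier assumes every attempt uses the same token counts; in practice a retry often carries extra context, which pushes the real figure higher.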

The other levers

After model choice, two things move the needle: output length (output is billed 4–5× the input rate on most models here, and about 8× on Gemini 2.5 Flash, so cap it) and prompt caching (repeated context billed at a steep discount on OpenAI, Anthropic and DeepSeek). Trim the system prompt, cap max_tokens, and cache what repeats.
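A rough sketch of how those two levers change the bill, using GPT-4o's prices from the table. The halved output length, the 800 cached input tokens and the 50% cached-token discount are illustrative assumptions, not any provider's published rates; check the real cached-input price on the provider's pricing page. `request_cost` and its parameters are hypothetical names.

```python
def request_cost(tokens_in, tokens_out, price_in, price_out,
                 cached_tokens=0, cached_discount=0.0):
    """Return the USD cost of one request.

    cached_tokens   -- input tokens served from the prompt cache
                       (counted as part of tokens_in)
    cached_discount -- fraction taken off the input price for cached
                       tokens (an assumed value, not a provider rate)
    """
    fresh_tokens = tokens_in - cached_tokens
    billable_in = fresh_tokens + cached_tokens * (1 - cached_discount)
    return (billable_in * price_in + tokens_out * price_out) / 1_000_000


P_IN, P_OUT = 2.50, 10.00   # GPT-4o, USD per 1M tokens, from the table
REQUESTS = 1_000_000

# Baseline: the article's workload, 1,000 input and 500 output tokens.
baseline = REQUESTS * request_cost(1_000, 500, P_IN, P_OUT)

# Lever 1: shorter answers (assumed: average output halved to 250 tokens).
capped = REQUESTS * request_cost(1_000, 250, P_IN, P_OUT)

# Lever 2: caching (assumed: 800 of 1,000 input tokens cached at 50% off).
cached = REQUESTS * request_cost(1_000, 500, P_IN, P_OUT,
                                 cached_tokens=800, cached_discount=0.5)

print(f"baseline ${baseline:,.0f}")   # $7,500
print(f"capped   ${capped:,.0f}")     # $5,000
print(f"cached   ${cached:,.0f}")     # $6,500
```

Under these assumptions the output cap saves more than caching does, because output tokens dominate the bill at a 4× price ratio; a workload with long repeated prompts and short answers would tilt the other way.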

Price your real workload

Your tokens aren't ours. Plug your real input/output lengths and volume into the AI API cost calculator — it ranks every model for your numbers. Building something specific? Use the AI app cost estimator, the chatbot cost calculator or the RAG cost calculator. Per-model deep dives: GPT-4o, Claude, Gemini, DeepSeek.

Reference prices (June 2026). Verify on each provider's pricing page before budgeting. Not affiliated with any provider.