
GPT-4o vs cheaper models — same workload

Identical tokens and volume, priced on each model.

Model | Input $/1M | Output $/1M | Cost / month
⚠️ GPT-4o reference pricing ~$2.50 input / $10.00 output per 1M tokens (June 2026). Prices change and vary by tier, region, batch and cached input — confirm on OpenAI's pricing page. Need every model side by side? Use the full AI API cost calculator.

How GPT-4o pricing works

GPT-4o is billed per token, split into input (your prompt plus any context) and output (what the model writes back). At roughly $2.50 / 1M input and $10.00 / 1M output, an output token costs four times as much as an input token, so unless your prompts are much longer than the replies, the biggest lever on your bill is answer length. Capping max_tokens and asking for concise responses cut output cost directly; trimming system prompts does the same for input.
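To make the arithmetic concrete, here is a minimal sketch of the per-token billing described above. It uses the reference prices quoted on this page; the workload figures (1,000 input tokens, 500 output tokens, 10,000 requests a day, a 30-day month) and the function names are made up for illustration.

```python
# Reference prices quoted on this page (USD per 1M tokens); confirm current rates.
INPUT_PRICE_PER_M = 2.50
OUTPUT_PRICE_PER_M = 10.00


def request_cost(input_tokens: int, output_tokens: int) -> float:
    """USD cost of a single request at the reference prices."""
    return (input_tokens * INPUT_PRICE_PER_M
            + output_tokens * OUTPUT_PRICE_PER_M) / 1_000_000


def monthly_cost(input_tokens: int, output_tokens: int,
                 requests_per_day: int, days: int = 30) -> float:
    """USD cost of a month of identical requests (30-day month assumed)."""
    return request_cost(input_tokens, output_tokens) * requests_per_day * days


# Hypothetical workload, for illustration only.
baseline = monthly_cost(1_000, 500, 10_000)
shorter_answers = monthly_cost(1_000, 250, 10_000)  # same prompts, half the output

print(f"per request:       ${request_cost(1_000, 500):.4f}")  # $0.0075
print(f"baseline / month:  ${baseline:.2f}")                  # $2250.00
print(f"half-length / mo:  ${shorter_answers:.2f}")           # $1500.00
```

In this example, halving the answer length cuts the bill by a third even though output is only a third of the tokens, which is the 4:1 price ratio at work.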

The second lever is model choice. GPT-4o is a frontier model; for routine classification, extraction and short replies, GPT-4o mini or Gemini Flash can be 15–25× cheaper while still producing good-enough output. A common pattern is to use a cheap model by default and send only the hard requests to GPT-4o; the table above shows what that switch is worth at your volume. For the full picture of an app, try the AI app cost estimator; for a support bot, the chatbot cost calculator. Full setup and key steps are in the OpenAI guide.
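The routing pattern can be estimated with a short sketch like the one below. The GPT-4o prices are the reference figures from this page. The GPT-4o mini prices ($0.15 input / $0.60 output per 1M tokens) are an assumption not stated here, so check them against the provider's pricing page. The workload and the 20% "hard request" share are hypothetical.

```python
# USD per 1M tokens as (input, output).
PRICES = {
    "gpt-4o": (2.50, 10.00),      # reference pricing quoted on this page
    "gpt-4o-mini": (0.15, 0.60),  # ASSUMPTION: not stated on this page, verify
}


def monthly_cost(model: str, input_tokens: int, output_tokens: int,
                 requests_per_month: float) -> float:
    """USD cost of a month of identical requests on one model."""
    input_price, output_price = PRICES[model]
    per_request = (input_tokens * input_price
                   + output_tokens * output_price) / 1_000_000
    return per_request * requests_per_month


def routed_monthly_cost(hard_fraction: float, input_tokens: int,
                        output_tokens: int, requests_per_month: int) -> float:
    """Blend: hard_fraction of requests go to GPT-4o, the rest to GPT-4o mini."""
    hard = requests_per_month * hard_fraction
    easy = requests_per_month - hard
    return (monthly_cost("gpt-4o", input_tokens, output_tokens, hard)
            + monthly_cost("gpt-4o-mini", input_tokens, output_tokens, easy))


# Hypothetical workload: 1,000 input + 500 output tokens, 300,000 requests a month.
all_4o = monthly_cost("gpt-4o", 1_000, 500, 300_000)
all_mini = monthly_cost("gpt-4o-mini", 1_000, 500, 300_000)
routed = routed_monthly_cost(0.20, 1_000, 500, 300_000)

print(f"all GPT-4o:       ${all_4o:.2f}")    # $2250.00
print(f"all GPT-4o mini:  ${all_mini:.2f}")  # $135.00
print(f"20% to GPT-4o:    ${routed:.2f}")    # $558.00
```

Under these assumed numbers, sending only one request in five to GPT-4o costs about a quarter of running everything on it. The real saving depends on how many requests actually need the larger model.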

FAQ

What's GPT-4o's cost per 1M tokens? About $2.50 input and $10.00 output (reference, June 2026).

How do I lower a GPT-4o bill? Shorten outputs, cache repeated context, and route easy requests to GPT-4o mini.