GPT-4o API Cost Calculator

How GPT-4o pricing works

GPT-4o is billed per token, split into input (your prompt plus any context) and output (what the model writes back). At roughly $2.50 per 1M input tokens and $10.00 per 1M output tokens, output costs four times as much as input, so the biggest lever on your bill is usually how long the answers are. Capping max_tokens and asking for concise responses cut output cost directly; trimming system prompts does the same for input.
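To make the arithmetic concrete, here is a minimal sketch of the per-request cost at the reference rates quoted above. The function name and the token counts are hypothetical, chosen only for illustration, and the rates should be checked against current pricing before you rely on them.

```python
# Reference rates from this page, in USD per 1M tokens (may be out of date).
INPUT_PER_M = 2.50
OUTPUT_PER_M = 10.00


def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Cost in USD of a single request at the reference GPT-4o rates."""
    return (input_tokens * INPUT_PER_M + output_tokens * OUTPUT_PER_M) / 1_000_000


# Hypothetical request: 1,000 prompt tokens with a 500-token reply.
verbose = request_cost(1_000, 500)
# Same prompt, but the reply is capped at 200 tokens.
concise = request_cost(1_000, 200)

print(f"verbose: ${verbose:.4f}  concise: ${concise:.4f}")
```

In this example, cutting the reply from 500 to 200 tokens lowers the per-request cost from $0.0075 to $0.0045, while the prompt's share stays fixed at $0.0025.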

The second lever is model choice. GPT-4o is a frontier model; for routine classification, extraction and short replies, GPT-4o mini or Gemini Flash can be 15–25× cheaper while still producing results that are good enough. A common pattern is to use a cheap model by default and send only the hard requests to GPT-4o; the table above shows what that switch is worth at your volume. For the full picture of an app, try the AI app cost estimator; for a support bot, the chatbot cost calculator. Full setup and key steps are in the OpenAI guide.
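The routing pattern can be sketched as a blended average. This is an illustration under stated assumptions, not the calculator's own logic: the GPT-4o mini rates are the reference figures from the FAQ on this page, and `hard_share` (the fraction of traffic sent to GPT-4o) is a hypothetical parameter you would estimate from your own workload.

```python
# Reference rates in USD per 1M tokens, as (input, output).
RATES = {
    "gpt-4o": (2.50, 10.00),
    "gpt-4o-mini": (0.15, 0.60),
}


def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost in USD of one request on the given model."""
    in_rate, out_rate = RATES[model]
    return (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000


def blended_cost(hard_share: float, input_tokens: int, output_tokens: int) -> float:
    """Average cost per request when `hard_share` of traffic goes to GPT-4o
    and the rest goes to GPT-4o mini (assumes both see similar token counts)."""
    if not 0.0 <= hard_share <= 1.0:
        raise ValueError("hard_share must be between 0 and 1")
    expensive = request_cost("gpt-4o", input_tokens, output_tokens)
    cheap = request_cost("gpt-4o-mini", input_tokens, output_tokens)
    return hard_share * expensive + (1.0 - hard_share) * cheap


# Hypothetical workload: 1,000 input and 500 output tokens per request.
for share in (1.0, 0.2, 0.0):
    print(f"{share:.0%} on GPT-4o: ${blended_cost(share, 1_000, 500):.5f} per request")
```

The sketch ignores any extra cost of deciding which requests are hard, such as a classifier call, which a real router would need to account for.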

FAQ

What's GPT-4o's cost per 1M tokens? About $2.50 input and $10.00 output (reference, July 2026).

How do I lower a GPT-4o bill? Shorten outputs, cache repeated context, and route easy requests to GPT-4o mini.

How this calculator works

The GPT-4o API Cost Calculator estimates what you will pay to run GPT-4o at scale by combining the two things OpenAI meters separately: input tokens (your prompt) and output tokens (the model's reply), each billed at its own per-token rate. You enter the average input tokens per request, the output tokens per request, and your requests per day, and the tool multiplies those figures out to a per-request, daily, and monthly bill. It also factors in prompt caching, which discounts repeated input tokens, and places your total next to cheaper alternatives like GPT-4o mini and Gemini Flash so you can see the price gap at a glance.
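The calculation described above can be approximated in a few lines. This is a rough sketch, not the calculator's actual code: the 50% discount on cached input tokens, the 30-day month, and every name in the block are assumptions made for illustration.

```python
from dataclasses import dataclass

# Reference rates from this page, in USD per 1M tokens.
INPUT_PER_M = 2.50
OUTPUT_PER_M = 10.00
# Assumption: cached input tokens are billed at half the normal input rate.
CACHED_INPUT_DISCOUNT = 0.50
# Assumption: a month is treated as 30 days.
DAYS_PER_MONTH = 30


@dataclass
class Estimate:
    per_request: float
    daily: float
    monthly: float


def estimate(input_tokens: int, output_tokens: int, requests_per_day: int,
             cached_share: float = 0.0) -> Estimate:
    """Estimate GPT-4o spend in USD.

    `cached_share` is the fraction of input tokens served from the prompt cache.
    """
    cached = input_tokens * cached_share
    fresh = input_tokens - cached
    # Cached tokens are charged at the discounted rate, fresh tokens at full rate.
    billable_input = fresh + cached * (1.0 - CACHED_INPUT_DISCOUNT)
    per_request = (billable_input * INPUT_PER_M
                   + output_tokens * OUTPUT_PER_M) / 1_000_000
    daily = per_request * requests_per_day
    return Estimate(per_request, daily, daily * DAYS_PER_MONTH)


# Hypothetical workload: 1,500 input tokens (60% cached), 400 output tokens,
# 10,000 requests per day.
result = estimate(1_500, 400, 10_000, cached_share=0.6)
print(f"per request: ${result.per_request:.6f}")
print(f"daily:       ${result.daily:.2f}")
print(f"monthly:     ${result.monthly:.2f}")
```

Setting `cached_share` to 0 reproduces the uncached bill, which makes it easy to see how much the fixed parts of a prompt are worth caching.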

This matters because token costs stay invisible until volume multiplies them into a real number, and by then the bill is already set. The calculator's biggest lever is usually output tokens: they typically cost several times more than input, so a verbose response repeated across thousands of requests dominates the total far more than a long prompt does. The practical move is to tighten your output length, lean on prompt caching for the fixed parts of your prompts, and only step down to a smaller model where accuracy allows. Run the numbers at your real daily volume rather than a single call, because the per-request difference between models looks trivial until it is multiplied by a month of traffic.

Frequently asked questions

How much does the GPT-4o API cost?

GPT-4o reference pricing is about $2.50 per million input tokens and $10.00 per million output tokens (July 2026). Your actual bill is (input tokens × input price + output tokens × output price) per request, multiplied by your request volume. Output is four times the price of input, so long answers dominate the cost.

Is GPT-4o mini cheaper than GPT-4o?

Yes, dramatically. GPT-4o mini is around $0.15 input / $0.60 output per million tokens, roughly 17x cheaper than GPT-4o on both input and output. For most high-volume, routine tasks the mini model is the right default, with GPT-4o reserved for the harder requests.
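The price gap can be checked directly from the reference rates. The snippet below is a small sketch; the workload figures and the 30-day month are hypothetical assumptions, not data from the calculator.

```python
# Reference rates in USD per 1M tokens, as (input, output).
GPT_4O = (2.50, 10.00)
GPT_4O_MINI = (0.15, 0.60)

# Price ratio between the two models for input and for output.
input_ratio = GPT_4O[0] / GPT_4O_MINI[0]
output_ratio = GPT_4O[1] / GPT_4O_MINI[1]


def monthly_cost(rates: tuple[float, float], input_tokens: int, output_tokens: int,
                 requests_per_day: int, days: int = 30) -> float:
    """Monthly spend in USD for a fixed per-request token profile."""
    per_request = (input_tokens * rates[0] + output_tokens * rates[1]) / 1_000_000
    return per_request * requests_per_day * days


# Hypothetical workload: 1,000 input and 500 output tokens, 10,000 requests per day.
big = monthly_cost(GPT_4O, 1_000, 500, 10_000)
small = monthly_cost(GPT_4O_MINI, 1_000, 500, 10_000)

print(f"input ratio: {input_ratio:.1f}x, output ratio: {output_ratio:.1f}x")
print(f"GPT-4o: ${big:,.2f}/month  GPT-4o mini: ${small:,.2f}/month")
```

Because both ratios are the same, the gap does not depend on the mix of input and output tokens; it scales only with volume.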
