Cheapest models for this workload

Same usage, every model ranked by monthly cost.

Model | Cost / month | Per user
โš ๏ธ Estimate using reference prices (June 2026) and list rates. Real bills vary with caching, batching, region, tiers and add-ons. Type your real token counts for the closest figure.

How to read this

Your AI bill is mostly users × calls × tokens × price. The biggest levers, in order: which model (a small model is often 20–50× cheaper than a frontier one), output length (output tokens cost 3–5× as much as input tokens, so cap it), and input length (trim system prompts and retrieved context). Caching repeated context and batching non-urgent jobs cut costs further.
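The users × calls × tokens × price relationship can be sketched in a few lines of Python. This is a rough illustration, not how the calculator itself computes its figures: the function name `monthly_cost`, its parameters, and the prices below ($1 and $4 per million tokens) are all made-up assumptions chosen so that output costs 4× input. It ignores caching, batching, and tier discounts.

```python
def monthly_cost(users, calls_per_user, input_tokens, output_tokens,
                 input_price_per_m, output_price_per_m):
    """Estimated monthly spend in dollars.

    Token counts are per call; prices are dollars per million tokens.
    All names here are hypothetical, for illustration only.
    """
    calls = users * calls_per_user
    # Dollar-weighted tokens for one call, before the per-million scaling.
    weighted_tokens = (input_tokens * input_price_per_m
                       + output_tokens * output_price_per_m)
    return calls * weighted_tokens / 1_000_000


# Hypothetical workload: 1,000 users making 100 calls each per month,
# 1,500 input tokens and 500 output tokens per call.
baseline = monthly_cost(1000, 100, 1500, 500, 1.00, 4.00)

# Lever: cap output at 250 tokens per call.
capped_output = monthly_cost(1000, 100, 1500, 250, 1.00, 4.00)

# Lever: trim input (system prompt, retrieved context) to 1,000 tokens.
trimmed_input = monthly_cost(1000, 100, 1000, 500, 1.00, 4.00)

print(f"baseline:       ${baseline:.2f}")
print(f"capped output:  ${capped_output:.2f}")
print(f"trimmed input:  ${trimmed_input:.2f}")
```

With these assumed numbers, removing 250 output tokens saves more than removing 500 input tokens, which is why output length ranks above input length in the list of levers.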

Picking a model? Compare them head-to-head on the AI API cost calculator, or read GPT vs Claude vs Gemini. Building RAG or a chatbot? Those have their own cost drivers (embeddings, vector DB, retrieval size); guides are coming.