Groq API — pricing, free tier & how to get a key

Groq runs open models like Llama and Mixtral on its custom LPU hardware, and its claim to fame is speed: hundreds of tokens per second, far faster than typical GPU inference. Pricing is per token like everyone else, but the latency is the selling point — ideal for chatbots, voice agents and anything where users wait on the response. Here's what it costs and how to get your key.
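To make the speed argument concrete, here is a rough back-of-the-envelope sketch. The two throughput figures are illustrative placeholders chosen for the arithmetic, not measured Groq or GPU benchmarks, and `generation_seconds` is a hypothetical helper name.

```python
def generation_seconds(output_tokens: int, tokens_per_second: float) -> float:
    """Time to stream a reply, ignoring network and queueing delay."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return output_tokens / tokens_per_second


reply_tokens = 300  # roughly a few paragraphs of text

# Illustrative throughputs only; substitute numbers you have measured yourself.
for label, tps in [("faster endpoint", 300.0), ("slower endpoint", 50.0)]:
    print(f"{label}: {generation_seconds(reply_tokens, tps):.1f} s")
```

For a fixed reply length, the wait scales inversely with throughput, which is why tokens per second matters most in chat and voice interfaces.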

Groq API pricing (reference, July 2026)

Model	Input ($/1M tokens)	Output ($/1M tokens)	Best for
Llama 3.3 70B	$0.59	$0.79	Quality + speed balance
Llama 3.1 8B (cheapest)	$0.05	$0.08	High volume, simple tasks
Mixtral 8x7B	$0.24	$0.24	Cheap mixture-of-experts

⚠️ Reference prices, July 2026: Groq adds and retires models often and adjusts pricing. Confirm on groq.com/pricing before budgeting. Prices are per 1M tokens; input and output tokens are billed at separate rates.

✓ Last verified: 2026-07-15 · Source: official provider pricing page

→ Estimate your bill on the AI API cost calculator or model a whole app with the AI app cost estimator.

Is there a free tier?

Yes. GroqCloud includes a free tier with generous per-minute and per-day rate limits and no card required to start, which suits development and low-volume apps. For production throughput you move to paid on-demand. If a free quota matters, compare with Google Gemini and Mistral, which also offer real free tiers.
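Because the free tier enforces rate limits, a client usually needs to back off and retry when it hits them. The sketch below assumes the API signals rate limiting with HTTP status 429, which is the common convention but should be confirmed against Groq's documentation. `call_with_backoff` and the `send` callable are hypothetical names used only for illustration.

```python
import time
from typing import Callable, Tuple


def call_with_backoff(
    send: Callable[[], Tuple[int, str]],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, str]:
    """Call `send` and retry while it reports HTTP 429, doubling the wait each time."""
    status, body = send()
    for attempt in range(1, max_attempts):
        if status != 429:  # assumption: 429 means "rate limited"
            break
        sleep(base_delay * 2 ** (attempt - 1))  # waits of 1x, 2x, 4x, ... base_delay
        status, body = send()
    return status, body


# Demo with a fake sender so nothing touches the network.
fake_responses = iter([(429, "rate limited"), (429, "rate limited"), (200, "ok")])
recorded_waits = []
result = call_with_backoff(lambda: next(fake_responses), sleep=recorded_waits.append)
print(result, recorded_waits)
```

Passing `sleep` as a parameter keeps the helper easy to test; in real use, leave it at the default and have `send` perform the actual HTTP call.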

How to get a Groq API key (step by step)

1. Go to console.groq.com and create an account.
2. Open the API Keys page and click Create API Key; copy the key right away, because you only get one chance to copy it.
3. Use the free tier immediately, or add billing under Settings → Billing for higher limits.
4. The API is OpenAI-compatible; most OpenAI SDKs work once you point the base URL at https://api.groq.com/openai/v1 and supply your Groq key.

Test it with a simple request:

# Quick test: export GROQ_API_KEY in your shell first, or paste your key in place of $GROQ_API_KEY
curl https://api.groq.com/openai/v1/chat/completions \
  -H "Authorization: Bearer $GROQ_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"llama-3.1-8b-instant","messages":[{"role":"user","content":"hi"}]}'
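The same request can be sketched in Python using only the standard library. The endpoint URL and model id are taken from the curl command above. The helper name `build_chat_request`, the explicit User-Agent value, and the `choices[0].message.content` response path (the usual OpenAI-style shape) are assumptions to verify against Groq's API reference.

```python
import json
import os
import urllib.request

GROQ_CHAT_URL = "https://api.groq.com/openai/v1/chat/completions"


def build_chat_request(
    prompt: str, api_key: str, model: str = "llama-3.1-8b-instant"
) -> urllib.request.Request:
    """Build (but do not send) a chat completions POST request."""
    payload = {"model": model, "messages": [{"role": "user", "content": prompt}]}
    return urllib.request.Request(
        GROQ_CHAT_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
            # Some API gateways reject urllib's default user agent, so set one explicitly.
            "User-Agent": "groq-quickstart-example/0.1",
        },
        method="POST",
    )


if __name__ == "__main__":
    key = os.environ.get("GROQ_API_KEY")
    if key:  # only send when a key is configured
        with urllib.request.urlopen(build_chat_request("hi", key), timeout=30) as resp:
            reply = json.load(resp)
        # Assumed OpenAI-style response shape.
        print(reply["choices"][0]["message"]["content"])
    else:
        print("Set GROQ_API_KEY to send the request.")
```

Separating request construction from sending makes it easy to inspect the headers and body before spending any quota.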

Estimate your cost

Use the AI API Cost Calculator to plug in your token counts and request volume — it ranks Groq's open models against GPT-4o, Claude, Gemini and DeepSeek from cheapest to most expensive for your workload.
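For a quick estimate without the calculator, the arithmetic is simple enough to sketch by hand. The prices below are the July 2026 reference figures from the pricing table in this article; the workload (100,000 requests a month at 500 input and 200 output tokens each) is a hypothetical example, as is the helper name `estimate_cost`.

```python
# Reference prices in USD per 1M tokens: (input, output). Verify before budgeting.
PRICES_PER_MILLION = {
    "Llama 3.3 70B": (0.59, 0.79),
    "Llama 3.1 8B": (0.05, 0.08),
    "Mixtral 8x7B": (0.24, 0.24),
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of one request; input and output are priced separately."""
    in_price, out_price = PRICES_PER_MILLION[model]
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000


# Hypothetical workload, used only to illustrate the calculation.
monthly_requests = 100_000
tokens_in, tokens_out = 500, 200

# List models from cheapest to most expensive for this workload.
for model in sorted(PRICES_PER_MILLION, key=lambda m: estimate_cost(m, tokens_in, tokens_out)):
    monthly = monthly_requests * estimate_cost(model, tokens_in, tokens_out)
    print(f"{model}: ${monthly:.2f}/month")
```

Under these assumptions the monthly totals come to about $4.10 for Llama 3.1 8B, $16.80 for Mixtral 8x7B and $45.30 for Llama 3.3 70B.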

Cheaper alternatives

Groq's Llama 3.1 8B is already among the cheapest hosted models anywhere. For near-frontier quality at a similarly low price, DeepSeek-V3 and Mistral Small are the obvious comparisons. If you want one key across many providers and automatic routing, see OpenRouter.

FAQ

Why use Groq instead of OpenAI?

Speed. Groq serves open models at very high tokens-per-second, so responses feel instant, which is valuable for chat, voice and agent loops. You give up proprietary frontier models in exchange for low latency and low cost.

Does Groq have a free tier?

Yes — the GroqCloud free tier has rate limits and no upfront cost, good for testing and low-volume use.

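How do I get a Groq API key?

Sign up at console.groq.com, open the API Keys page, create a new key and copy it once. The free tier works immediately within its rate limits; add billing for higher on-demand throughput.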

Is the Groq API OpenAI-compatible?

Yes — it exposes an OpenAI-style chat completions endpoint, so most OpenAI client libraries work by changing the base URL and key.

Not affiliated with Groq. Prices are reference estimates — always verify on the official pricing page.