AssemblyAI is a speech-to-text and audio-intelligence API: transcription, speaker diarization, summarization, sentiment analysis, and PII redaction, plus real-time streaming. Usage is billed by audio duration, so your cost scales with the hours of audio you process, not the number of requests. Here's the pricing, the free credit, and how to get a key.
| Product | Price | Billed by / notes |
|---|---|---|
| Free credit (free tier) | $0 | ~$50 starter credit, no card |
| Async transcription (Universal) | ~$0.37 / hr | per hour of audio |
| Real-time streaming | ~$0.47 / hr | per hour of audio |
| Audio Intelligence add-ons | small per-hour surcharge | summarization, sentiment, PII, etc. |
New accounts get free starting credit (around $50) that you can spend across transcription and the audio-intelligence models, with no credit card required to begin. At ~$0.37/hour, that covers roughly 135 hours of async audio to prototype with. After the credit runs out, billing is pure pay-as-you-go: rates are quoted per hour but charged by the second of audio, rounded up.
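The arithmetic behind those figures can be sketched in a few lines. This is an illustration only: the rate and credit constants are the reference estimates from this guide, not official prices, and the rounding rule (round each file up to the next whole second) is an assumption about what "rounded up" means.

```python
import math

# Reference estimates from this guide, not official prices.
ASYNC_RATE_PER_HOUR = 0.37
FREE_CREDIT_USD = 50.0


def hours_covered(credit_usd: float, rate_per_hour: float) -> float:
    """Hours of audio a given credit buys at a flat hourly rate."""
    return credit_usd / rate_per_hour


def cost_for_audio(duration_seconds: float, rate_per_hour: float) -> float:
    """Cost of one file, assuming duration is rounded up to a whole second."""
    billable_seconds = math.ceil(duration_seconds)
    return billable_seconds * rate_per_hour / 3600


print(f"{hours_covered(FREE_CREDIT_USD, ASYNC_RATE_PER_HOUR):.0f} hours")  # 135 hours
print(f"${cost_for_audio(90.4, ASYNC_RATE_PER_HOUR):.4f}")                 # $0.0094
```

Swap in the current rates from the official pricing page before relying on the numbers.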
1. Sign up at assemblyai.com and verify your email.
2. Copy your API key from the dashboard home or account page. It is available immediately, with no separate key-creation step.
3. Add billing later only when you exhaust the free credit.
Transcribe a hosted audio file:
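A minimal sketch of that flow, using only the Python standard library. It assumes the v2 REST endpoints (`POST /v2/transcript` to submit, `GET /v2/transcript/{id}` to poll) and the `status`, `text`, and `error` response fields; verify these against the official API reference. The helper names (`build_request`, `call`, `transcribe`) and the `send` parameter are hypothetical, introduced here for illustration.

```python
import json
import time
import urllib.request

# Assumed v2 REST base URL; confirm against the official docs.
BASE_URL = "https://api.assemblyai.com/v2"


def build_request(api_key, path, payload=None):
    """Build a request that carries the API key in the Authorization header."""
    data = json.dumps(payload).encode("utf-8") if payload is not None else None
    req = urllib.request.Request(
        f"{BASE_URL}{path}", data=data, method="POST" if data else "GET"
    )
    req.add_header("Authorization", api_key)
    if data:
        req.add_header("Content-Type", "application/json")
    return req


def call(req):
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)


def transcribe(api_key, audio_url, poll_seconds=3.0, send=call):
    """Submit a hosted audio URL, then poll until the job finishes."""
    job = send(build_request(api_key, "/transcript", {"audio_url": audio_url}))
    while job["status"] not in ("completed", "error"):
        time.sleep(poll_seconds)
        job = send(build_request(api_key, f"/transcript/{job['id']}"))
    if job["status"] == "error":
        raise RuntimeError(job.get("error", "transcription failed"))
    return job["text"]
```

To try it, call `transcribe(os.environ["ASSEMBLYAI_API_KEY"], "https://example.com/audio.mp3")` with your own key and a publicly reachable audio URL (both placeholders here). Reading the key from an environment variable keeps it out of source control.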
Deepgram (see our Deepgram guide) is often a touch cheaper per hour and very fast. OpenAI Whisper API bills ~$0.006/minute (~$0.36/hr) and is simple but has fewer built-in intelligence features. Google Speech-to-Text is enterprise-grade but pricier and fiddlier. AssemblyAI wins on the breadth of bundled audio-intelligence models and developer experience. For text-to-speech instead, see ElevenLabs.
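To compare providers on equal terms, it helps to convert every rate to dollars per hour and multiply by your expected volume. The sketch below does that with the reference figures quoted in this guide. Deepgram is left out because no specific rate is given here, and the 200 hours/month workload is a made-up example.

```python
def per_minute_to_per_hour(rate_per_minute: float) -> float:
    """Convert a per-minute price to a per-hour price."""
    return rate_per_minute * 60


# Reference estimates from this guide, not official prices.
RATES_PER_HOUR = {
    "AssemblyAI async": 0.37,
    "AssemblyAI streaming": 0.47,
    "OpenAI Whisper API": per_minute_to_per_hour(0.006),
}


def monthly_costs(hours_per_month: float, rates=RATES_PER_HOUR) -> dict:
    """Estimated monthly cost per provider: hours of audio x hourly rate."""
    return {name: round(hours_per_month * rate, 2) for name, rate in rates.items()}


# Hypothetical workload of 200 hours of audio per month, cheapest first.
for name, cost in sorted(monthly_costs(200).items(), key=lambda kv: kv[1]):
    print(f"{name}: ${cost:.2f}")
```

At this volume the gap between plain-transcription rates is only a few dollars, so bundled features and latency are likely to matter more than the headline price.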
Yes, AssemblyAI has a free tier: starting credit (~$50), no credit card, usable across transcription and audio-intelligence models. After that, it is pay-as-you-go per hour of audio.
To get an API key, sign up at assemblyai.com, verify your email, and copy the key from the dashboard home or account page. Pass it in the Authorization header of each request.
AssemblyAI and Deepgram are close on price. Deepgram is often slightly cheaper per hour for plain transcription; AssemblyAI bundles more audio-intelligence features (summaries, sentiment, redaction). Estimate your real cost as hours of audio × rate.
AssemblyAI bills by duration of audio processed (priced per hour, charged by the second), not by number of API calls. Real-time streaming and add-on models cost a bit more than plain async transcription.
Not affiliated with AssemblyAI. Prices are reference estimates — always verify on the official pricing page.