Whisper

OpenAI's open-source speech-to-text model — runs locally, best-in-class transcription accuracy.

Whisper is OpenAI's open-source speech-to-text model, widely regarded as one of the most accurate general-purpose transcription models available. It supports roughly 100 languages, performs well on accented speech and noisy audio, and produces punctuated, readable transcripts. The weights are freely available, so you can run it locally on your own machine with no per-request cost. Many popular tools (reportedly including Descript, Krisp, and Otter) use Whisper under the hood, and it's a common default when developers build transcription into their own apps.
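As a rough illustration of running Whisper locally, here is a minimal Python sketch. It assumes the `openai-whisper` package and `ffmpeg` are installed. The helper names (`format_segments`, `transcribe_file`), the `"base"` model size, and the file name in the usage note are illustrative choices, not part of this listing.

```python
from typing import Dict, List


def format_segments(segments: List[Dict]) -> str:
    """Render Whisper-style segments as '[start -> end] text' lines."""
    lines = []
    for seg in segments:
        # Each segment carries start/end times in seconds plus its text.
        lines.append(
            f"[{seg['start']:.1f}s -> {seg['end']:.1f}s] {seg['text'].strip()}"
        )
    return "\n".join(lines)


def transcribe_file(path: str, model_name: str = "base") -> str:
    """Transcribe a local audio file and return timestamped lines."""
    # Imported lazily so format_segments works without the package installed.
    import whisper  # assumption: installed via `pip install openai-whisper`

    model = whisper.load_model(model_name)  # downloads weights on first use
    result = model.transcribe(path)  # dict with "text", "segments", "language"
    return format_segments(result["segments"])
```

Calling `transcribe_file("meeting.mp3")` (a hypothetical file) would return one timestamped line per segment. Larger model sizes generally trade speed for accuracy.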

Self-hosted use is free apart from your own compute. The OpenAI API charges $0.006 per minute of audio, which is among the cheapest transcription APIs available. Whisper is most often compared to Deepgram and AssemblyAI for production use: Whisper's edge is accuracy, open-source availability, and the ability to self-host, while Deepgram's is lower latency for real-time transcription.
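To put the API price in concrete terms, here is a small Python sketch of the arithmetic. It uses the $0.006 per minute rate quoted above and ignores any billing granularity or rounding rules the provider may apply. The function name `api_cost_usd` is hypothetical.

```python
API_PRICE_PER_MINUTE = 0.006  # USD per minute of audio, as quoted in this listing


def api_cost_usd(audio_minutes: float,
                 price_per_minute: float = API_PRICE_PER_MINUTE) -> float:
    """Estimate the API cost in USD for a given amount of audio."""
    if audio_minutes < 0:
        raise ValueError("audio_minutes must be non-negative")
    # Round to 4 decimal places to hide floating-point noise.
    return round(audio_minutes * price_per_minute, 4)


one_hour = api_cost_usd(60)               # one hour of audio
thousand_hours = api_cost_usd(60 * 1000)  # 1,000 hours of audio
```

At this rate, one hour of audio comes to about $0.36 and 1,000 hours to about $360, which is the figure to weigh against the hardware cost of self-hosting.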

Made by: OpenAI
Pricing: Open-source (free to self-host) · OpenAI API $0.006/minute
Best for: Transcription, local speech-to-text, multilingual audio, developer API integration

Alternatives to Whisper