Speech Recognition

While ChatGPT garners the limelight, the audio-to-text transcription capabilities of OpenAI's Whisper model make it a hidden power tool.

Jul 05, 2023

We are used to interacting with computers through text, but speech would often be a more efficient interface. While many programs these days have a voice interface, the quality is often questionable. In addition, the voice interface is normally tied to a specific application, whereas a general-purpose voice transcriber that works anywhere would be far more useful.

Enter Whisper.

Whisper is OpenAI’s automatic speech recognition system, which is accessible through an API. Released in September 2022, the system was trained on 680,000 hours of multilingual and multitask supervised data collected from the web.

In this post, I’ll show you the simple process that I have set up to record audio and transcribe it to text. I use this quite regularly, especially to capture rough ideas for a text I have in mind. The transcribed text then serves as a starting point for working with ChatGPT to piece together a stronger narrative.
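To give a rough idea of what the transcription step can look like, here is a minimal sketch. It is not necessarily the setup described in this post. It assumes the official `openai` Python package in version 1.0 or later, the hosted `whisper-1` model, and an API key in the `OPENAI_API_KEY` environment variable. The helper name `transcribe_file` and the file name `idea.m4a` are hypothetical, chosen only for illustration.

```python
from pathlib import Path


def transcribe_file(client, audio_path, model="whisper-1"):
    """Send one recorded audio file to the transcription endpoint.

    `client` is assumed to be an `openai.OpenAI()` instance (openai>=1.0).
    Returns the transcribed text with surrounding whitespace removed.
    """
    # The endpoint expects the audio as a binary file object.
    with Path(audio_path).open("rb") as audio_file:
        result = client.audio.transcriptions.create(model=model, file=audio_file)
    return result.text.strip()


# Example usage (requires the openai package and a valid API key):
#
#   from openai import OpenAI
#   text = transcribe_file(OpenAI(), "idea.m4a")  # hypothetical recording
#   print(text)
```

Passing the client in as an argument keeps the helper independent of how the API key is configured, and the returned string can be pasted straight into a ChatGPT prompt.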

Subscribe to Engineering Prompts to keep reading this post and get 7 days of free access to the full post archives.