Engineering Prompts

Speech Recognition

While ChatGPT garners the limelight, OpenAI's Whisper model for audio-to-text transcription is a hidden power tool.

Marcel Salathé
Jul 05, 2023

We are used to interacting with computers through text, but speech would often be a more efficient interface. Many programs now offer a voice interface, but its quality is often questionable. Moreover, the voice interface is usually tied to a specific piece of software, whereas a general-purpose voice transcriber would be incredibly useful.

Enter Whisper.

Whisper is OpenAI’s automatic speech recognition system, which is accessible through an API. Released in September 2022, the system was trained on 680,000 hours of multilingual and multitask supervised data collected from the web.

In this post, I’ll show you the simple process that I have set up to record and transcribe speech. I use it quite regularly, especially to capture basic ideas for a text I have in mind; the transcript then serves as a starting point for working with ChatGPT to piece together a stronger narrative.
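
To give a rough idea of what the transcription step can look like, here is a minimal sketch using the official `openai` Python package and the hosted `whisper-1` model. It is not necessarily the setup described in the rest of this post. The function name `transcribe`, the optional `client` argument, and the file name in the usage note are assumptions made for illustration, and the sketch assumes that an `OPENAI_API_KEY` environment variable is set.

```python
from pathlib import Path


def transcribe(audio_path, client=None, model="whisper-1"):
    """Send a recorded audio file to the transcription API and return the text.

    `client` is a hypothetical convenience parameter: pass your own client
    object, or leave it as None to create a default OpenAI client.
    """
    if client is None:
        # Imported lazily so the function can be defined without the package.
        # Assumes `pip install openai` and OPENAI_API_KEY in the environment.
        from openai import OpenAI

        client = OpenAI()

    # The API expects the audio as a binary file object.
    with Path(audio_path).open("rb") as audio_file:
        result = client.audio.transcriptions.create(model=model, file=audio_file)

    # With the default response format, the transcript is in `result.text`.
    return result.text
```

Calling `transcribe("idea.m4a")` on a recording made with any voice memo app (the file name here is just a placeholder) would return the transcript as a string, which can then be pasted into ChatGPT.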
