Integrate speech recognition and machine learning

Access this AI accelerator on GitHub

This accelerator presents a workflow for transcribing audio files using OpenAI's Whisper model. Whisper is a state-of-the-art speech recognition system designed to handle a wide range of audio types and accents, and is highly effective at converting spoken language in audio files into written text.

The workflow includes steps to use Whisper to transcribe audio files, process them efficiently, and store the transcriptions in a structured format for further analysis or use. This can be particularly useful for tasks such as generating subtitles, transcribing meetings, or converting speech from various audio sources into text for machine learning.
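The transcription step can be sketched as follows. This is a minimal illustration, not the accelerator's exact code: the batching logic is factored into a function that takes any `transcribe(path) -> text` callable, and the file names are hypothetical.

```python
import pandas as pd

def transcribe_files(paths, transcribe):
    """Run `transcribe(path) -> text` over each audio file and
    collect the results in a structured table."""
    rows = [{"file": p, "transcription": transcribe(p)} for p in paths]
    return pd.DataFrame(rows)

# With the open-source whisper package, the callable would be wired up as:
#   import whisper
#   model = whisper.load_model("base")  # smaller models trade accuracy for speed
#   transcribe = lambda p: model.transcribe(p)["text"]
#
# df = transcribe_files(["meeting_01.wav", "meeting_02.wav"], transcribe)
# df.to_csv("transcriptions.csv", index=False)  # ready for DataRobot
```

Storing the transcriptions as a CSV keeps them in a format that can be uploaded directly to DataRobot as a modeling dataset.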

In this example, you use the transcribed data to build a classification model with DataRobot. DataRobot handles model training, selection, and deployment, and provides insights into the data.

This accelerator demonstrates how to use the Python API client to:

  • Set up the environment (install and import necessary libraries including Whisper and dependencies).
  • Securely connect to DataRobot.
  • Get data (publicly available audio files in this example).
  • Transcribe audio with Whisper.
  • Use the transcription to create a classification model in DataRobot.
  • Retrieve and evaluate model performance and insights.
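The DataRobot steps above can be sketched with the Python API client. This is an outline under assumptions, not the accelerator's exact code: the endpoint, API token, dataset path, project name, and target column (`label`) are all placeholders.

```python
import datarobot as dr

# Connect securely to DataRobot (keep the token out of source control,
# e.g. in an environment variable or a drconfig.yaml file)
dr.Client(endpoint="https://app.datarobot.com/api/v2", token="YOUR_API_TOKEN")

# Create a project from the transcription dataset and run Autopilot
# against a hypothetical target column named "label"
project = dr.Project.create(
    sourcedata="transcriptions.csv",
    project_name="Audio transcription classification",
)
project.analyze_and_model(target="label", worker_count=-1)
project.wait_for_autopilot()

# Retrieve the leaderboard and inspect the top model's insights
best_model = project.get_models()[0]
feature_impact = best_model.get_or_request_feature_impact()
```

Feature impact shows which transcription-derived features drive the model's predictions, which is a useful starting point for evaluating the data for insights.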

Updated January 31, 2024