DEV Community

Sunder Kumar

Speech to Text using AssemblyAI

This is a submission for the AssemblyAI Challenge: Sophisticated Speech-to-Text.

What I Built

I built a Speech-to-Text Application that showcases the power of Universal-2, AssemblyAI’s latest speech-to-text model. The application:

  1. Supports Multilingual Transcription: Users can choose from multiple languages, which makes the app usable by a global audience.
  2. Outputs with Formatting and Timestamps: The application delivers well-structured transcripts with proper nouns, punctuation, and timestamps.
  3. User-Friendly Interface: Built with Streamlit, the app offers an intuitive frontend for easy navigation and interaction.
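
To illustrate the timestamped output, here is a minimal sketch of how word-level timestamps could be rendered as readable lines. It assumes each word is a dict with `text`, `start`, and `end` keys, with times in milliseconds; the helper names `format_timestamp` and `word_lines` are hypothetical and are not taken from the project repository.

```python
def format_timestamp(ms):
    """Convert a millisecond offset to an MM:SS.mmm string."""
    minutes, remainder = divmod(int(ms), 60_000)
    seconds, millis = divmod(remainder, 1000)
    return f"{minutes:02d}:{seconds:02d}.{millis:03d}"


def word_lines(words):
    """Render one line per word in the form "[start - end] text".

    Assumes each item is a dict with "text", "start", and "end" keys,
    where start and end are millisecond offsets into the audio.
    """
    return [
        f"[{format_timestamp(w['start'])} - {format_timestamp(w['end'])}] {w['text']}"
        for w in words
    ]
```

In a Streamlit frontend, the resulting lines could be joined with newlines and passed to a text display widget.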

Demo

Link to GitHub Repository

Journey

Incorporating Universal-2:
The application uses Universal-2 through AssemblyAI’s API. The backend:

  1. Uploads audio files using AssemblyAI's upload endpoint.
  2. Submits transcription requests, including optional parameters like language_code and punctuate.
  3. Polls the transcription status until completion, then fetches the final transcript with timestamps and a word-by-word breakdown.
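
The three backend steps above can be sketched roughly as follows. This is an illustrative outline rather than the project's actual code: the function name `transcribe`, the `session` parameter (any object with requests-style `post` and `get` methods, such as `requests.Session()`), and the polling interval are assumptions. The endpoint paths and JSON field names follow my understanding of AssemblyAI's v2 REST API and should be checked against the official documentation.

```python
import time

API_BASE = "https://api.assemblyai.com/v2"


def transcribe(session, api_key, audio_bytes, language_code="en",
               poll_interval=3.0, sleep=time.sleep):
    """Upload audio, request a transcript, and poll until it finishes.

    `session` is assumed to expose requests-style `post` and `get` methods.
    Returns the final transcript JSON as a dict.
    """
    headers = {"authorization": api_key}

    # 1. Upload the raw audio bytes; the response carries an upload URL.
    upload = session.post(f"{API_BASE}/upload", headers=headers, data=audio_bytes)
    upload.raise_for_status()
    audio_url = upload.json()["upload_url"]

    # 2. Submit the transcription request with the optional parameters.
    job = session.post(
        f"{API_BASE}/transcript",
        headers=headers,
        json={
            "audio_url": audio_url,
            "language_code": language_code,
            "punctuate": True,
        },
    )
    job.raise_for_status()
    transcript_id = job.json()["id"]

    # 3. Poll until the job reaches a terminal status.
    while True:
        poll = session.get(f"{API_BASE}/transcript/{transcript_id}", headers=headers)
        poll.raise_for_status()
        result = poll.json()
        if result["status"] == "completed":
            return result  # includes the text and the word-level breakdown
        if result["status"] == "error":
            raise RuntimeError(result.get("error", "transcription failed"))
        sleep(poll_interval)
```

Passing the session and the `sleep` function as parameters keeps the sketch easy to exercise with a fake session, without network access or real waiting.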

Screenshots

Home Page
Audio Processing
Final Results

Team Submission:
I, Sunder Kumar, worked on this project independently.
