This is a submission for the AssemblyAI Challenge: Sophisticated Speech-to-Text.
What I Built
I built a note-taking app on AssemblyAI's Universal-2 model that lets users transcribe audio into notes and store them securely. Users choose the kind of transcription output they need for each recording, from a plain transcript to summaries, chapters, or sentiment analysis.
Core Features
Transcription Options
Users can choose from a range of transcription types to customize their audio processing:
Speech to Text: Converts spoken audio into a complete textual transcript.
Summarization: Provides a concise summary of the audio content.
Content Moderation: Identifies and flags sensitive or inappropriate content.
Auto Chapters: Automatically segments audio into summarized chapters.
Sentiment Analysis: Detects emotional tones within the audio (e.g., positive, negative, or neutral).
Entity Detection: Extracts named entities like people, organizations, or locations.
Topic Detection: Identifies key topics discussed in the audio.
Key Phrases: Highlights important phrases or points spoken in the audio.
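One way to wire these options to a transcription request is a lookup table from the selected option to the flag that enables it. The sketch below is illustrative, not the app's actual code: the `Feature` identifiers and the `buildTranscriptParams` helper are hypothetical, and the flag names (`summarization`, `content_safety`, `auto_chapters`, `sentiment_analysis`, `entity_detection`, `iab_categories`, `auto_highlights`) follow AssemblyAI's transcript API as I understand it, so check them against the current documentation.

```typescript
// Hypothetical identifiers for the options offered in the UI.
type Feature =
  | "speech_to_text"
  | "summarization"
  | "content_moderation"
  | "auto_chapters"
  | "sentiment_analysis"
  | "entity_detection"
  | "topic_detection"
  | "key_phrases";

// Assumed mapping from each option to the request flag that enables it.
const FEATURE_FLAGS: Record<Feature, Record<string, boolean>> = {
  speech_to_text: {}, // a plain transcript needs no extra flag
  summarization: { summarization: true },
  content_moderation: { content_safety: true },
  auto_chapters: { auto_chapters: true },
  sentiment_analysis: { sentiment_analysis: true },
  entity_detection: { entity_detection: true },
  topic_detection: { iab_categories: true },
  key_phrases: { auto_highlights: true },
};

// Build the request body for one audio file and one selected option.
function buildTranscriptParams(
  audioUrl: string,
  feature: Feature
): Record<string, string | boolean> {
  return { audio_url: audioUrl, ...FEATURE_FLAGS[feature] };
}
```

Keeping the mapping in one table means a new option only needs a new entry, and the request-building code stays unchanged.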
Multi-Language Support
The software supports transcription in multiple languages, making it accessible to a global audience.
User Authentication
Users can sign up and log in to save and manage their transcriptions as notes securely in a database.
Note Management
Saved notes can be easily retrieved, deleted, or shared.
Data Security
All notes are stored securely in a database, ensuring user privacy and data protection.
Tools:
- AssemblyAI
- Next.js
- Supabase
- Cloudinary
Demo
Transcrire - https://transcrire.vercel.app/
Journey
The application leverages Universal-2, AssemblyAI’s Speech-to-Text (STT) model, to power a note-taking platform with advanced transcription capabilities.
API Key: I acquired an API key from AssemblyAI and securely stored it in the environment variables for authentication.
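A minimal sketch of reading the key on the server and failing fast when it is missing. The variable name `ASSEMBLYAI_API_KEY` and the helper `getAssemblyAIKey` are assumptions for illustration, not names taken from the project.

```typescript
// Minimal shape of an environment map, so the helper does not depend on Node typings.
type Env = Record<string, string | undefined>;

// Returns the trimmed key, or throws if it is absent or blank.
// ASSEMBLYAI_API_KEY is an assumed variable name.
function getAssemblyAIKey(env: Env): string {
  const key = env["ASSEMBLYAI_API_KEY"]?.trim();
  if (!key) {
    throw new Error("ASSEMBLYAI_API_KEY is not set");
  }
  return key;
}
```

In a Next.js route handler this would be called as `getAssemblyAIKey(process.env)`. Because the variable name has no `NEXT_PUBLIC_` prefix, Next.js does not expose it to the browser bundle.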
The website uses AssemblyAI's SDK to transcribe audio files and retrieve additional insights such as summaries and sentiment analysis.
Once the audio file is uploaded to Cloudinary, the app starts a transcription request by passing the upload_url to AssemblyAI's SDK. Depending on the selected feature, the request also includes optional parameters such as language detection, speaker diarization, or content moderation.
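The request flow can be sketched as: create a transcript job, then poll until it finishes. The app itself uses AssemblyAI's SDK; to stay self-contained, this sketch instead calls the REST endpoint `https://api.assemblyai.com/v2/transcript` directly, where the file URL is sent as `audio_url`. The `FetchLike` type and the `transcribe` helper are hypothetical names, and the fetch implementation is passed in so the function can be exercised with a stub.

```typescript
// Minimal fetch-like signature (an assumption, so a stub can stand in for real HTTP).
type FetchLike = (
  url: string,
  init?: { method?: string; headers?: Record<string, string>; body?: string }
) => Promise<{ ok: boolean; status: number; json(): Promise<any> }>;

const API_BASE = "https://api.assemblyai.com/v2";

async function transcribe(
  fetchFn: FetchLike,
  apiKey: string,
  params: Record<string, string | boolean>, // e.g. { audio_url, sentiment_analysis: true }
  pollMs = 3000
): Promise<any> {
  const headers = { authorization: apiKey, "content-type": "application/json" };

  // Create the transcription job.
  const created = await fetchFn(`${API_BASE}/transcript`, {
    method: "POST",
    headers,
    body: JSON.stringify(params),
  });
  if (!created.ok) throw new Error(`Create failed with status ${created.status}`);
  const { id } = await created.json();

  // Poll until the job reaches a terminal state ("completed" or "error").
  for (;;) {
    const res = await fetchFn(`${API_BASE}/transcript/${id}`, { headers });
    if (!res.ok) throw new Error(`Poll failed with status ${res.status}`);
    const transcript = await res.json();
    if (transcript.status === "completed") return transcript;
    if (transcript.status === "error") throw new Error(String(transcript.error));
    await new Promise<void>((resolve) => setTimeout(resolve, pollMs));
  }
}
```

On a server runtime with a global `fetch`, that function can be passed as `fetchFn`. The SDK wraps the same create-then-wait cycle, so application code normally would not write the polling loop by hand.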
Users can sign up to:
Save transcribed notes to a Supabase database.
View or delete their transcriptions.
Copy or share transcriptions directly from the app.
Building on Universal-2 lets the website handle a range of transcription needs, from plain transcripts to summaries and sentiment analysis, and turn audio into text that users can save, manage, and share as notes.