Souvik Biswas

Decifer — Generate transcripts from audio using Flutter and Deepgram

Overview

Decifer is a cross-platform mobile app that generates transcripts from either a voice recording or an uploaded audio file.

Transcription Playback

Try out the app: https://play.google.com/store/apps/details?id=com.souvikbiswas.deepgram_transcribe

Typically, using the Deepgram API requires you to maintain a server, but I have made this project totally serverless. Continue reading to learn more.

Here's a brief demo of the entire app in action:

Submission Category:

Analytics Ambassadors

Link to Code on GitHub

The entire app is open source. Try it out, and feel free to contribute to the project 😉:

sbis04 / decifer

Generate your audio transcripts with ease.


Blog post about this project: https://dev.to/sbis04/decifer-generate-transcripts-with-ease-5hl3

Try out the app: https://appdistribution.firebase.dev/i/a57e37b2fda28351

A cross-platform mobile app that helps you generate transcripts from either a voice recording or an uploaded audio file. The project uses a totally serverless architecture.


Project Description

The primary features of the app are as follows:

  • Generate transcript from audio recording & audio file using Deepgram API.
  • Cloud-sync for syncing across multiple devices using the same account.
  • Transcription confidence map view.
  • Export as PDF and share with anyone.

Architecture

I'm using a totally serverless architecture for this project 🤯. Let's have a look at how it works:

Decifer architecture

The mobile app is built with Flutter and integrated with Firebase. I have used Firebase Cloud Functions to deploy the backend code that communicates with the Deepgram API.

Firebase Cloud Functions lets you run backend code in a serverless architecture.

I have deployed the following function to Firebase:



const functions = require("firebase-functions");
const {Deepgram} = require("@deepgram/sdk");

// Callable function: receives {url} from the app and returns a JSON string
// with the WebVTT transcript and one confidence value per utterance.
exports.getTranscription = functions.https.onCall(async (data, context) => {
  try {
    // The Deepgram API key is read from the function's environment.
    const deepgram = new Deepgram(process.env.DEEPGRAM_API_KEY);
    const audioSource = {
      url: data.url,
    };

    // `utterances: true` splits the transcript into utterances, which are
    // used for the confidence list below.
    const response = await deepgram.transcription.preRecorded(audioSource, {
      punctuate: true,
      utterances: true,
    });

    const utterances = response.results.utterances;
    console.log(utterances.length);

    // Collect the confidence score of each utterance, in order.
    const confidenceList = utterances.map((utterance) => utterance.confidence);

    const webvttTranscript = response.toWebVTT();

    const finalTranscript = {
      transcript: webvttTranscript,
      confidences: confidenceList,
    };

    // Serialize once so the app receives a single JSON string to parse.
    const finalTranscriptJSON = JSON.stringify(finalTranscript);
    console.log(finalTranscriptJSON);

    return finalTranscriptJSON;
  } catch (error) {
    console.error(`Unable to transcribe. Error ${error}`);
    throw new functions.https.HttpsError("aborted", "Could not transcribe");
  }
});



The getTranscription function takes an audio URL, uses the Deepgram API to generate a transcript along with a confidence score for each utterance, and returns both as a JSON string that the app can parse.
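To make the returned format more concrete, here is a minimal sketch of how a client could turn that JSON string into a list of timed cues. The app itself is written in Flutter, so this JavaScript version only illustrates the idea. The function names `vttTimeToSeconds` and `parseTranscription`, the sample payload, and the assumption that there is one confidence value per WebVTT cue (in the same order) are mine, not taken from the project.

```javascript
// Convert a WebVTT timestamp ("HH:MM:SS.mmm" or "MM:SS.mmm") to seconds.
function vttTimeToSeconds(timestamp) {
  const parts = timestamp.trim().split(":").map(Number);
  // Fold left to right: 1:02:03.5 -> ((1 * 60) + 2) * 60 + 3.5
  return parts.reduce((total, part) => total * 60 + part, 0);
}

// Parse the JSON string returned by getTranscription into a list of cues.
// Assumption: confidences[i] belongs to the i-th cue of the WebVTT text.
function parseTranscription(jsonString) {
  const {transcript, confidences} = JSON.parse(jsonString);
  const cues = [];

  // WebVTT blocks are separated by blank lines. Only blocks with a "-->"
  // timing line are cues, so the header and NOTE blocks are skipped.
  for (const block of transcript.split(/\r?\n(?:\r?\n)+/)) {
    const lines = block
        .split(/\r?\n/)
        .map((line) => line.trim())
        .filter((line) => line !== "");
    const timingIndex = lines.findIndex((line) => line.includes("-->"));
    if (timingIndex === -1) continue;

    const [start, end] = lines[timingIndex].split("-->");
    cues.push({
      start: vttTimeToSeconds(start),
      // Drop optional cue settings that may follow the end timestamp.
      end: vttTimeToSeconds(end.trim().split(/\s+/)[0]),
      text: lines.slice(timingIndex + 1).join(" "),
      confidence: confidences[cues.length] ?? null,
    });
  }
  return cues;
}

// Hand-written sample payload (hypothetical, not real Deepgram output).
const samplePayload = JSON.stringify({
  transcript: "WEBVTT\n\n00:00:00.000 --> 00:00:01.500\nHello there\n",
  confidences: [0.98],
});
console.log(parseTranscription(samplePayload));
```

Each resulting cue carries its start and end time in seconds, its text, and its confidence, which is enough to drive both a synchronized playback view and a confidence view.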

App screens

The Flutter application consists of the following pages/screens:

  • Login Page
  • Register Page
  • Dashboard Page
  • Record Page
  • Upload Page
  • Transcription Page

The Login and Register pages authenticate the user. Authentication gives each user a unique account, which is needed to store the generated transcripts in Firestore and to sync them across devices.

Register Page

The Dashboard Page displays a list of all the transcripts currently present on the user's account. It also has two buttons - one for navigating to the Record Page and the other for navigating to the Upload Page.

Dashboard Page

The Record Page lets you record audio using the device microphone and then transcribe it using Deepgram. You can always re-record if you are not happy with the last recording.

Record Page

From the Upload Page, you can choose any audio file on your device and generate a transcript from it.

Upload Page

The Transcription Page is where the entire transcript can be viewed. It offers synchronized audio-transcript playback, highlighting the part of the transcript that corresponds to the audio currently playing.
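The highlighting logic can be sketched as a lookup from the current playback position to the transcript part that covers it. This is an illustration in JavaScript, not the app's Flutter code. The function name `findActiveCueIndex` and the cue shape (`start` and `end` in seconds) are hypothetical, and the sketch assumes the cues are sorted by start time and do not overlap.

```javascript
// Return the index of the cue that contains `position` (in seconds),
// or -1 when the position falls outside every cue (for example, silence).
// Assumes cues are sorted by start time and do not overlap.
function findActiveCueIndex(cues, position) {
  let low = 0;
  let high = cues.length - 1;

  // Binary search: each step discards the half that cannot contain position.
  while (low <= high) {
    const mid = Math.floor((low + high) / 2);
    if (position < cues[mid].start) {
      high = mid - 1;
    } else if (position >= cues[mid].end) {
      low = mid + 1;
    } else {
      return mid;
    }
  }
  return -1;
}

// Hypothetical cues for illustration only.
const demoCues = [
  {start: 0, end: 1.5, text: "Hello there"},
  {start: 1.5, end: 4, text: "This is a test"},
  {start: 5, end: 7, text: "Goodbye"},
];
console.log(findActiveCueIndex(demoCues, 2.2));
```

Calling this on every position update from the audio player gives the index of the part to highlight. A linear scan would also work for short transcripts; the binary search just keeps the lookup cheap for long ones.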

Transcription Page

You can also see a confidence map for each part of the transcript. It shows how confident the transcription is for that part (darker means higher confidence).
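One simple way to get the "darker is higher confidence" effect is to map each confidence value to the opacity of a fixed base color. The sketch below is a JavaScript illustration under that assumption. The function name `confidenceToShade`, the base color, and the minimum opacity of 0.15 are hypothetical choices, not values taken from the app.

```javascript
// Map a confidence in [0, 1] to an rgba() color string whose opacity grows
// with confidence, so higher confidence renders darker.
// The default base color is an arbitrary choice for illustration.
function confidenceToShade(confidence, baseRgb = [33, 150, 243]) {
  // Treat missing or non-numeric values as zero confidence.
  const value = Number.isFinite(confidence) ? confidence : 0;
  // Clamp so out-of-range values never produce an invalid opacity.
  const clamped = Math.min(1, Math.max(0, value));
  // Keep a small minimum opacity so low-confidence parts stay visible.
  const alpha = 0.15 + 0.85 * clamped;
  const [r, g, b] = baseRgb;
  return `rgba(${r}, ${g}, ${b}, ${alpha.toFixed(2)})`;
}

console.log(confidenceToShade(0.9));
console.log(confidenceToShade(0.3));
```

A linear mapping is the simplest option. If most confidences cluster near 1, a steeper curve would make the differences easier to see.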

Confidence Map

You can also easily print the generated transcript or share it as a PDF.

Export transcript

Deepgram

Overview of my Deepgram dashboard (completed the mission, Get a Transcript via API or SDK):

Deepgram Overview

Usage analytics of the Deepgram API:

Deepgram Usage Analytics

Log of one of the API calls for transcribing from audio:

Deepgram Logs


Top comments (10)

Valentine Sean Chanengeta

great piece of work 🔥🔥🔥

Souvik Biswas

Thanks! 😊

BekahHW

This is awesome! Great job on this.

Souvik Biswas

Thanks! 😊

Orim Dominic Adah

This is great man!
I love how you explained your process.
I wish you the best and I've starred the repo!
Awesome!

Souvik Biswas

Thanks! 😊

Souvik Biswas

Let me know in the comments what do you think about the project 🙂

GiuseppeCrBrandi

Useless blog since you don't explain how the code works. This post and nothing is the same thing.

Souvik Biswas

Actually this is not an explanatory blog, this was just a hackathon submission. I'll try to write a proper blog post with the code explanation when I get some time.

Phillip Strefling

Useless reply since you don't explain your critique. This comment adds nothing to the conversation.