In this blog post, we will create transcripts for YouTube videos using Deepgram's Speech Recognition API. First, we will download a video and convert it to an mp3 audio file. Then, we will use Deepgram to generate a transcript. Finally, we will store the transcript in a text file and delete the media file.
We need a sample video, so I am using the teaser trailer for Shang-Chi and the Legend of the Ten Rings. If that is a spoiler for you, go ahead and grab another video link.
Before We Start
You will need:
- Node.js installed on your machine - download it here.
- A Deepgram project API key - get one here.
- A YouTube video ID, which is part of the URL of a video. The one we will be using is ir-mWUYH_uo.
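If you only have a full video link, the ID can be pulled out of it programmatically. The following is a minimal sketch, not part of the original project: extractVideoId is a hypothetical helper, and it only covers standard watch links and youtu.be short links.

```javascript
// Hypothetical helper: pull the video ID out of a YouTube URL.
// Only handles the two URL shapes below; anything else returns null.
const YOUTUBE_HOSTS = ['youtube.com', 'www.youtube.com', 'm.youtube.com']

function extractVideoId(videoUrl) {
  let parsed
  try {
    parsed = new URL(videoUrl)
  } catch {
    return null // not a valid absolute URL
  }

  // Short links: https://youtu.be/<id>
  if (parsed.hostname === 'youtu.be') {
    return parsed.pathname.slice(1) || null
  }

  // Watch pages: https://www.youtube.com/watch?v=<id>
  if (YOUTUBE_HOSTS.includes(parsed.hostname) && parsed.pathname === '/watch') {
    return parsed.searchParams.get('v')
  }

  return null
}

console.log(extractVideoId('https://www.youtube.com/watch?v=ir-mWUYH_uo')) // ir-mWUYH_uo
```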
Create a new directory and navigate to it with your terminal. Run npm init -y to create a package.json file and then install the following packages:
npm install @deepgram/sdk ffmpeg-static youtube-mp3-downloader
Create an index.js file, and open it in your code editor.
Preparing Dependencies
At the top of your file require these four packages:
const fs = require('fs')
const YoutubeMp3Downloader = require('youtube-mp3-downloader')
const { Deepgram } = require('@deepgram/sdk')
const ffmpeg = require('ffmpeg-static')
fs is the built-in file system module for Node.js. It is used to read and write files, which we will be doing a few times throughout this post. ffmpeg-static includes a version of ffmpeg in our node_modules directory, and requiring it returns the file path to that binary.
Initialize the Deepgram and YoutubeMp3Downloader clients:
const deepgram = new Deepgram('YOUR DEEPGRAM KEY')
const YD = new YoutubeMp3Downloader({
  ffmpegPath: ffmpeg,
  outputPath: './',
  youtubeVideoQuality: 'highestaudio',
})
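Hardcoding the key is fine for a quick test, but you may prefer to keep it out of your source file. Here is a small sketch of reading it from the environment instead. Both getDeepgramKey and the DEEPGRAM_API_KEY variable name are assumptions made for illustration; the SDK does not require either.

```javascript
// Hypothetical helper: read the Deepgram key from the environment.
// DEEPGRAM_API_KEY is an assumed variable name, not one the SDK requires.
function getDeepgramKey(env = process.env) {
  const key = env.DEEPGRAM_API_KEY
  if (!key) {
    throw new Error('Set the DEEPGRAM_API_KEY environment variable first')
  }
  return key
}

// Usage in index.js (commented out so this snippet runs without the SDK installed):
// const deepgram = new Deepgram(getDeepgramKey())
```

You would then start the script with the variable set in your shell, for example DEEPGRAM_API_KEY=your-key node index.js.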
Download Video and Convert to MP3
Under the hood, the youtube-mp3-downloader package will download the video and convert it with ffmpeg on our behalf. While it is doing this, it triggers several events. We are going to use the progress event so we know how far through the download we are, and the finished event, which indicates we can move on.
YD.download('ir-mWUYH_uo')
YD.on('progress', (data) => {
  console.log(data.progress.percentage + '% downloaded')
})
YD.on('finished', async (err, video) => {
  const videoFileName = video.file
  console.log(`Downloaded ${videoFileName}`)
  // Continue on to get transcript here
})
Save and run the file with node index.js and you should see the download progress in your terminal, followed by the mp3 file appearing in your project directory.
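The handlers above do not cover a failed download. The sketch below shows one way the event flow could be wired up with failures handled. watchDownload is a hypothetical helper, and it assumes the downloader also emits an error event, as the package's README describes; check this against the version you install. A plain EventEmitter stands in for YD so the flow can be tried without downloading anything.

```javascript
const { EventEmitter } = require('events')

// Hypothetical helper: attach logging handlers to anything that emits the
// same events as the downloader. `log` defaults to console.log.
function watchDownload(downloader, log = console.log) {
  downloader.on('progress', (data) => {
    // Round the percentage so the terminal output stays readable
    log(`${data.progress.percentage.toFixed(1)}% downloaded`)
  })
  downloader.on('error', (error) => {
    log(`Download failed: ${error}`)
  })
  downloader.on('finished', (err, video) => {
    if (err) {
      log(`Download failed: ${err}`)
      return
    }
    log(`Downloaded ${video.file}`)
  })
}

// A plain EventEmitter stands in for YD so the flow can be tried offline.
const fakeDownloader = new EventEmitter()
watchDownload(fakeDownloader)
fakeDownloader.emit('progress', { progress: { percentage: 42.1234 } }) // 42.1% downloaded
fakeDownloader.emit('finished', null, { file: './example.mp3' }) // Downloaded ./example.mp3
```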
Get Transcript from Deepgram
In place of the comment above, prepare and send a Deepgram transcription request:
const file = {
  buffer: fs.readFileSync(videoFileName),
  mimetype: 'audio/mp3',
}
const options = {
  punctuate: true,
}
const result = await deepgram.transcription
  .preRecorded(file, options)
  .catch((e) => console.log(e))
console.log(result)
There are lots of options which can make your transcript more useful, including diarization, which recognizes different speakers; a profanity filter, which replaces profanity with nearby terms; and punctuation. We are using punctuation in this tutorial to show you how setting options works.
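As a rough illustration, an options object that turns on all three features mentioned here might look like the sketch below. The diarize and profanity_filter names follow Deepgram's API parameter names and are not taken from this tutorial's code, so confirm them against the SDK version you have installed.

```javascript
// Option names below follow Deepgram's API query parameters; confirm them
// against the SDK version you have installed.
const options = {
  punctuate: true, // add punctuation to the transcript
  diarize: true, // recognize different speakers
  profanity_filter: true, // replace recognized profanity with nearby terms
}

console.log(options)
```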
Rerun your code and you should see a JSON object printed in your terminal.
Saving Transcript and Deleting Media
There is a lot of data that comes back from Deepgram, but all we want is the transcript which, with the options we provided, is a single string of text. Add the following line to access just the transcript:
const transcript = result.results.channels[0].alternatives[0].transcript
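Because the request above only logs errors, result can be undefined if something went wrong, and the line above would then throw. A defensive version could look like this sketch. extractTranscript is a hypothetical helper, and sampleResult is a trimmed, made-up object that only mirrors the nesting of the path used above.

```javascript
// Hypothetical helper: return the first transcript, or null if the response
// is missing (for example, because the request failed and was only logged).
function extractTranscript(result) {
  return result?.results?.channels?.[0]?.alternatives?.[0]?.transcript ?? null
}

// Trimmed, made-up response with the same nesting as the path used above.
const sampleResult = {
  results: {
    channels: [{ alternatives: [{ transcript: 'Hello there.' }] }],
  },
}

console.log(extractTranscript(sampleResult)) // Hello there.
console.log(extractTranscript(undefined)) // null
```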
Now that we have the string, we can write it to a text file:
// writeFileSync is synchronous and does not take a callback
fs.writeFileSync(`${videoFileName}.txt`, transcript)
console.log(`Wrote ${videoFileName}.txt`)
Then, if desired, delete the mp3 file:
fs.unlinkSync(videoFileName)
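Since videoFileName already ends in .mp3, appending .txt to it produces a name ending in .mp3.txt. If you would rather swap the extension, something along these lines could work. saveTranscript and its deleteMedia option are hypothetical names used for illustration; the sketch relies only on Node's built-in fs and path modules.

```javascript
const fs = require('fs')
const path = require('path')

// Hypothetical helper: write the transcript next to the media file, swapping
// the media extension for .txt, then optionally delete the media file.
function saveTranscript(mediaPath, transcript, { deleteMedia = false } = {}) {
  const { dir, name } = path.parse(mediaPath)
  const transcriptPath = path.join(dir, `${name}.txt`)

  fs.writeFileSync(transcriptPath, transcript)

  // Delete only after the transcript has been written successfully
  if (deleteMedia) {
    fs.unlinkSync(mediaPath)
  }
  return transcriptPath
}

// Usage inside the finished handler:
// saveTranscript(videoFileName, transcript, { deleteMedia: true })
```

Deleting the media file only after the write succeeds means a failed write leaves the mp3 in place, so you do not have to download it again.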
Summary
Transcribing YouTube videos has never been easier thanks to Deepgram's Speech Recognition API and the Deepgram Node SDK. Your final code should look like this:
const fs = require('fs')
const YoutubeMp3Downloader = require('youtube-mp3-downloader')
const { Deepgram } = require('@deepgram/sdk')
const ffmpeg = require('ffmpeg-static')

const deepgram = new Deepgram('YOUR DEEPGRAM KEY')
const YD = new YoutubeMp3Downloader({
  ffmpegPath: ffmpeg,
  outputPath: './',
  youtubeVideoQuality: 'highestaudio',
})

YD.download('ir-mWUYH_uo')

YD.on('progress', (data) => {
  console.log(data.progress.percentage + '% downloaded')
})

YD.on('finished', async (err, video) => {
  const videoFileName = video.file
  console.log(`Downloaded ${videoFileName}`)

  const file = {
    buffer: fs.readFileSync(videoFileName),
    mimetype: 'audio/mp3',
  }
  const options = {
    punctuate: true,
  }

  const result = await deepgram.transcription
    .preRecorded(file, options)
    .catch((e) => console.log(e))
  // If the request failed, the error was logged above and result is undefined
  if (!result) return

  const transcript = result.results.channels[0].alternatives[0].transcript

  // writeFileSync is synchronous and does not take a callback
  fs.writeFileSync(`${videoFileName}.txt`, transcript)
  console.log(`Wrote ${videoFileName}.txt`)

  fs.unlinkSync(videoFileName)
})
Check out the other options supported by the Deepgram Node SDK and if you have any questions feel free to reach out to us on Twitter (we are @DeepgramDevs).