I. Introduction
Shazam is the popular music recognition application co-founded by Chris Barton and Philip Inghelbrecht, who met as students at the University of California, Berkeley. Shazam identifies songs using an audio fingerprint derived from a time-frequency representation called a spectrogram. It uses a smartphone or computer's built-in microphone to capture a brief sample of the audio being played, and it stores a catalogue of audio fingerprints in a database. The user tags a song for about 10 seconds and the application creates an audio fingerprint of the sample. Shazam then analyzes the captured sound and looks for a match in a database of millions of songs. If it finds one, it returns information such as the artist, song title, and album to the user. Some implementations of Shazam also include links to services such as iTunes, Apple Music, Spotify, YouTube, or Groove Music.
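To make the matching idea concrete, here is a toy sketch (not Shazam's real algorithm or data): the catalogue behaves like a lookup table keyed by fingerprint, and a query either returns the song's metadata or nothing.
// Conceptual sketch only: a fingerprint catalogue as a plain lookup table.
// Real fingerprints are hashes derived from spectrogram peaks; the string
// key here is a made-up stand-in.
const catalogue = new Map([
    ['fp-1a2b3c', { artist: 'Some Artist', title: 'Some Title', album: 'Some Album' }]
]);

function identify(fingerprint) {
    // a hit returns the song metadata, a miss returns null
    return catalogue.get(fingerprint) || null;
}

console.log(identify('fp-1a2b3c')); // { artist: 'Some Artist', ... }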
I.1. Prerequisites
To follow this article, you need to have Node.js installed and to understand the basics: how to set up and configure a simple server, real-time communication with socket.io, HTTP requests, and core JavaScript. With that out of the way, let's start and clone the Shazam app. The API returns all the data we need about the song, which can be displayed however we want, but in this article we are only going to fetch that data.
II. Server side
II.1. Setting up our server
Start by creating a new Node.js project. Create a folder with the name of your choice and run the following commands from the terminal.
npm init -y
npm install axios ejs express socket.io
npm init initializes a new Node.js application, whereas the second command installs express, ejs, axios, and socket.io (the http module is built into Node.js, so it does not need to be installed). With that done, let's go ahead and create our server. Create an index.js file and write the following code in it.
const express = require('express');
const path = require('path');
const axios = require('axios');
const app = express();
const server = require('http').createServer(app);
const io = require('socket.io')(server, { cors: { origin: "*" } });

app.use(express.json());
// serve the client-side scripts from the src folder
app.use('/static', express.static(path.join(__dirname, './src')));
app.set('view engine', 'ejs');
app.set('socketio', io);
app.set("views", path.join(__dirname, "views"));

function socket() {
    io.on('connection', (socket) => {
        socket.on('song', (blob) => {
            console.log(blob);
            // socket.io delivers the browser Blob as a Node.js Buffer
            const buffer = blob;
            const base64 = buffer.toString('base64');
            const options = {
                method: 'POST',
                url: 'https://shazam.p.rapidapi.com/songs/detect',
                headers: {
                    'content-type': 'text/plain',
                    'x-rapidapi-key': `${process.env.SHAZAM_RAPID_API_KEY}`,
                    'x-rapidapi-host': 'shazam.p.rapidapi.com'
                },
                data: `${base64}`
            };
            axios.request(options).then(function (response) {
                console.log(response.data);
                const matches = response.data.matches.length;
                console.log(matches);
            });
        });
    });
}
socket();

const port = process.env.PORT || 7070;
server.listen(port, () => {
    console.log(`server is running on ${port}`);
});
In the code above we initialize our server with express and socket.io; the socket.io package allows real-time communication between the server and the client without refreshing the browser.
const app = express();
const server = require('http').createServer(app);
const io = require('socket.io')(server, { cors: { origin: "*" } });
Then we set up the middlewares; you can find documentation about middleware in Express.js here.
app.use(express.json());
app.use('/static', express.static(path.join(__dirname, './src')));
app.set('view engine', 'ejs');
app.set('socketio', io);
app.set("views", path.join(__dirname, "views"));
After that, in the function called "socket" we create a connection between the server and the client.
function socket() {
    io.on('connection', (socket) => {
        ...
    });
}
Then we catch the event called "song" sent from the client with the blob payload. As we said, the Shazam API from RapidAPI accepts only base64 as the request body, so we have to encode the received data. When a browser Blob travels through socket.io, it arrives on the server as a Node.js Buffer, which we encode on these lines:
const buffer = blob;
const base64 = buffer.toString('base64');
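If you want to be defensive about the payload type, a small sketch (assuming the client might also send an ArrayBuffer instead of a Blob) would be:
socket.on('song', (blob) => {
    // normalize: keep a Buffer as-is, wrap anything else with Buffer.from()
    const buffer = Buffer.isBuffer(blob) ? blob : Buffer.from(blob);
    const base64 = buffer.toString('base64');
});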
Here we are; let's make the request to the endpoint. Before that, go to RapidAPI (link here) and subscribe to the API to get the API key. With that done, let's make the request:
const options = {
    method: 'POST',
    url: 'https://shazam.p.rapidapi.com/songs/detect',
    headers: {
        'content-type': 'text/plain',
        'x-rapidapi-key': `${process.env.SHAZAM_RAPID_API_KEY}`,
        'x-rapidapi-host': 'shazam.p.rapidapi.com'
    },
    data: `${base64}`
};
axios.request(options).then(function (response) {
    console.log(response.data);
    const matches = response.data.matches.length;
    console.log(matches);
});
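The request above only logs the number of matches. In practice you will also want to catch failures and read the song metadata from the response. The response shape used below (a track object with title and subtitle fields) is an assumption about the detect endpoint, so guard every access:
axios.request(options).then(function (response) {
    // assumed shape: response.data.track exists only when a match is found
    const track = response.data && response.data.track;
    if (track) {
        console.log('Matched: ' + track.subtitle + ' - ' + track.title);
    } else {
        console.log('No match found');
    }
}).catch(function (error) {
    console.error('Shazam request failed:', error.message);
});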
III. Client side
The Shazam API from RapidAPI accepts only a base64 string as the body of the request. That means we are going to record a piece of a song, get its blob, and then convert that blob to a base64 string. Note that the recorded audio must be mono (a single channel). (put the ref link)
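In this walkthrough the blob-to-base64 conversion happens on the server, but for reference, a browser-side sketch using the standard FileReader API would look like this:
// For reference only: convert a Blob to a base64 string in the browser.
function blobToBase64(blob) {
    return new Promise((resolve, reject) => {
        const reader = new FileReader();
        // reader.result looks like "data:audio/wav;base64,...."; strip the prefix
        reader.onloadend = () => resolve(reader.result.split(',')[1]);
        reader.onerror = reject;
        reader.readAsDataURL(blob);
    });
}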
III.1. Create a basic "ejs" template in the "views" folder, named "record.ejs"
<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <meta http-equiv="X-UA-Compatible" content="IE=edge">
    <script src="https://cdn.socket.io/3.1.3/socket.io.min.js" integrity="sha384-cPwlPLvBTa3sKAgddT6krw0cJat7egBga3DJepJyrLl4Q9/5WLra3rrnMcyTyOnh" crossorigin="anonymous"></script>
    <!-- Recorder.js is needed by record.js; here we assume a copy of the library is served from the static folder -->
    <script src="/static/recorder.js"></script>
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>Shazam</title>
</head>
<body class="main-body">
    <button id="record">Record</button>
    <script src="/static/record.js"></script>
</body>
</html>
As you can see, the code above is a simple EJS template that contains a button labeled Record. Next, let's create a JavaScript file called record.js in the src folder; it is imported into record.ejs by the script tag.
var gumStream;
//stream from getUserMedia()
var rec;
//Recorder.js object
var input;
//MediaStreamAudioSourceNode we'll be recording

// shim for AudioContext when it's not available
const AudioContext = window.AudioContext || window.webkitAudioContext;
const audioContext = new AudioContext;
// socket.io client (the io global comes from the CDN script in record.ejs);
// it connects back to the server that served this page
const socket = io();

const record = document.getElementById('record');
record.addEventListener('click', startRecording);

function startRecording() {
    console.log('recording started');
    var constraints = {
        audio: true,
        video: false
    };
    navigator.mediaDevices.getUserMedia(constraints).then(function (stream) {
        /* assign to gumStream for later use */
        gumStream = stream;
        /* use the stream */
        input = audioContext.createMediaStreamSource(stream);
        /* create the Recorder object and configure it to record mono sound (1 channel); recording 2 channels would double the file size */
        rec = new Recorder(input, {
            numChannels: 1
        });
        // start the recording process
        rec.record();
        // stop automatically after 3 seconds
        setTimeout(stopRecording, 3000);
    }).catch(function (err) {
        console.log(err);
    });
}

function stopRecording() {
    // tell the recorder to stop the recording
    rec.stop();
    // stop microphone access
    gumStream.getAudioTracks()[0].stop();
    // create the WAV blob and pass it on to createDownloadLink
    rec.exportWAV(createDownloadLink);
}

function createDownloadLink(blob) {
    // send the blob to the server
    socket.emit('song', blob);
}
Let's explain what happens in the code above: we created global variables that are going to store our media stream, the recorder, and the input node.
var gumStream;
//stream from getUserMedia()
var rec;
//Recorder.js object
var input;
//MediaStreamAudioSourceNode we'll be recording
To work with the Web Audio API, we created the audio context, which gives us access to all the functionality of the audio API.
const AudioContext = window.AudioContext || window.webkitAudioContext;
const audioContext = new AudioContext;
Another way to create a context is:
const audioContext = new AudioContext();
Now we have our principal variable and access to all the functionality of the audio API. Recording is launched by the "click" event on the record button.
record.addEventListener('click', startRecording);
Then we created the "startRecording" function.
function startRecording() {
...
}
First, in this function we define which kind of media stream we need to record; in this case we need only audio.
function startRecording() {
    …
    var constraints = {
        audio: true,
        video: false
    };
}
Then navigator.mediaDevices.getUserMedia(constraints) prompts the user for permission to use a media input, which produces a media stream with the tracks specified in the constraints object. Note that this method returns a Promise that resolves to a MediaStream object. If the user denies permission, or no matching media is available, the promise is rejected with NotAllowedError or NotFoundError respectively.
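A short sketch of how those two rejection cases can be told apart (the error names come from the getUserMedia specification):
navigator.mediaDevices.getUserMedia({ audio: true, video: false })
    .then(function (stream) {
        // recording can start here
    })
    .catch(function (err) {
        if (err.name === 'NotAllowedError') console.log('Permission denied');
        else if (err.name === 'NotFoundError') console.log('No microphone found');
        else console.log(err);
    });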
Once the promise resolves, we get the stream, which is stored in the gumStream variable.
function startRecording() {
    …
    navigator.mediaDevices.getUserMedia(constraints).then(function (stream) {
        /* assign to gumStream for later use */
        gumStream = stream;
        ...
    });
}
By default the audio recorded by the media device has two channels. To manipulate our stream, we used the createMediaStreamSource() method of the audioContext. A new MediaStreamAudioSourceNode object, representing the audio node whose media is obtained from the specified source stream, is stored in the input variable.
function startRecording() {
    …
    navigator.mediaDevices.getUserMedia(constraints).then(function (stream) {
        /* assign to gumStream for later use */
        gumStream = stream;
        input = audioContext.createMediaStreamSource(stream);
        ...
    });
}
Then we created a Recorder object configured to record mono sound (1 channel).
function startRecording() {
    …
    navigator.mediaDevices.getUserMedia(constraints).then(function (stream) {
        /* assign to gumStream for later use */
        gumStream = stream;
        input = audioContext.createMediaStreamSource(stream);
        rec = new Recorder(input, {
            numChannels: 1
        });
        ...
    });
}
After that, we are able to start the recording with rec.record(), but we don't need to record the whole song, just a few seconds that are useful to the Shazam API; let's take 3 seconds. To handle that, we used setTimeout(stopRecording, 3000). The stopRecording() function stops the recording process with rec.stop() and stops access to the recording media device (in this case the microphone). Inside this function we also create the WAV blob (WAV, the Waveform Audio File Format, is a standard for storing an audio bitstream) and pass it to the createDownloadLink() function, which sends it to the server using socket.io.
function stopRecording() {
    // tell the recorder to stop the recording
    rec.stop();
    // stop microphone access
    gumStream.getAudioTracks()[0].stop();
    // create the WAV blob and pass it on to createDownloadLink
    rec.exportWAV(createDownloadLink);
}
function createDownloadLink(blob) {
    // send the blob to the server
    socket.emit('song', blob);
}
IV. Conclusion
Thanks for reading.
You can find the full code here.