Transition Guide: Agora Web Video SDK to Zoom Web Video SDK
Quick snapshot of the steps:
- Install the Zoom Web Video SDK
- Import the Zoom Video SDK
- Replace the Agora join logic with the Zoom join logic
- Implement backend logic for handling token generation and calls to both SDKs' REST APIs
- Implement camera and audio setup using Zoom SDK methods, along with methods for controlling the media peripherals (mute and unmute)
- Implement SDK listener callbacks for joining, leaving, and video state changes
- Implement logic for controlling remote user video in each SDK's listeners
- Implement the meeting leave logic
Features of this app:
- Dynamic selection between SDKs in the same app (a sketch of a possible mode toggle follows this list)
- Join a session unique to each service
- Mute/unmute media peripherals
- Leave the meeting
- Check whether the meeting participant limit has been exceeded using each service's REST API
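The dynamic SDK selection is driven by a simple mode toggle that the rest of the app reads (referenced later as this.toggle.mode in ui.service.ts). The actual toggle service isn't shown in this guide, so the following is only a minimal, hypothetical sketch:
// toggle.service.ts — hypothetical sketch of the mode toggle that ui.service.ts
// reads as this.toggle.mode; the real service may differ.
import { Injectable } from '@angular/core';

export type SdkMode = 'zoom' | 'agora';

@Injectable({ providedIn: 'root' })
export class ToggleService {
  // Which SDK the app should use for the current session.
  mode: SdkMode = 'zoom';

  setMode(mode: SdkMode): void {
    this.mode = mode;
  }
}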
Installing the SDK
Access Zoom Video SDK in the Zoom Marketplace
- Go to marketplace.zoom.us and log in to your Zoom Video SDK account
- Click Build Video SDK in the right corner; you are navigated to your app's credentials page, where you can see your Video SDK credentials and API credentials
- Store these credentials in a .env file for local development
ZOOM_SDK_KEY=
ZOOM_SDK_SECRET=
ZOOM_API_KEY=
ZOOM_API_SECRET=
AGORA_APP_ID=
AGORA_APP_CERTIFICATE=
AGORA_CUST_ID=
AGORA_CUST_SECRET=
Import Zoom SDK
Replace Agora's RtcEngine with Zoom's SDK by either adding the CDN link to the index.html file or installing the package via npm:
$ npm install @zoom/videosdk
In zoom.service.ts, import the Zoom Video SDK via this statement:
import ZoomVideo from '@zoom/videosdk'
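Since this app keeps both SDKs side by side, agora.service.ts keeps its existing Agora import; it is shown here for completeness (this assumes the agora-rtc-sdk-ng package, which exports the client and track types used later in this guide):
// agora.service.ts — existing Agora Web SDK import, kept because the app
// supports both services.
import AgoraRTC, {
  IAgoraRTCClient,
  IAgoraRTCRemoteUser,
  ILocalTrack,
  IRemoteAudioTrack,
  IRemoteVideoTrack,
} from 'agora-rtc-sdk-ng';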
Frontend changes
The Zoom SDK renders video data onto a canvas element. Replace the div element with a canvas element that has the desired styling and the #videoCanvas reference, as shown below (in this case we keep both elements and select between them with *ngIf):
From:
<div class="video-canvas">
<div *ngIf="mode === 'agora'" class="center-video-div" #videoCanvas></div>
</div>
To:
<div class="video-canvas">
<canvas *ngIf="mode === 'zoom'" width="1920" class="center-video-canvas" height="1080" #videoCanvas></canvas>
<div *ngIf="mode === 'agora'" class="center-video-div" #videoCanvas></div>
</div>
Backend
We use a local Node.js server to handle token generation, credential fetching, and secure REST API calls for both services. Credentials are stored in a .env file that is not tracked by git and are loaded into the server with the dotenv library. The server runs the Express framework and uses a fetch library so requests are made the same way as on the frontend.
Our endpoints are:
/zoom-session-count - uses the Zoom REST API to fetch the number of users in the session and returns it to the frontend. Utilizes the utility function generateJWTToken and the /videosdk/sessions/ Zoom REST API endpoint.
/zoomtoken - generates a Zoom Video SDK JWT to use when joining a session, via the utility function generateJWTToken (a minimal sketch of this handler follows this list).
/agora-token - generates an Agora token to use when joining a channel.
/agora-appid - returns the Agora appID stored in the .env file.
/agora-channel-count - uses the Agora REST API to fetch the number of users in the channel and returns it to the frontend. This uses the /channel/user/ Agora REST API endpoint.
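To make the token flow concrete, here is a minimal sketch of what the /zoomtoken handler might look like on the Node.js server. It assumes the jsonwebtoken package, and the JWT payload field names (app_key, tpc, role_type, and so on) should be verified against Zoom's current Video SDK authorization docs:
// server.ts — hypothetical /zoomtoken handler (Express + dotenv + jsonwebtoken).
// Payload field names are assumptions; check them against the Zoom Video SDK docs.
import express from 'express';
import jwt from 'jsonwebtoken';
import 'dotenv/config';

const app = express();

// The frontend sends a POST with query parameters, so register a POST route.
app.post('/zoomtoken', (req, res) => {
  const topic = String(req.query.topic ?? '');
  const name = String(req.query.name ?? '');
  const password = String(req.query.password ?? '');
  const iat = Math.floor(Date.now() / 1000);

  const payload = {
    app_key: process.env.ZOOM_SDK_KEY, // Video SDK key from the .env file
    tpc: topic,                        // session name the client will join
    role_type: 1,                      // 1 = host, 0 = participant
    user_identity: name,
    session_key: password,
    version: 1,
    iat,
    exp: iat + 2 * 60 * 60,            // expire in two hours
  };

  // Sign with the SDK secret; the frontend passes this token to client.join().
  const token = jwt.sign(payload, process.env.ZOOM_SDK_SECRET as string, {
    algorithm: 'HS256',
  });
  res.send(token);
});

app.listen(3001); // port used by the frontend calls in this guide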
Joining the meeting
We use the same information given in the join form for both services. In ui.service.ts, when the mode is set to Agora, the logic for joining an Agora session is executed:
ui.service.ts
async joinSession(name: string, sessionId: string, password: string): Promise<boolean> {
let exceeded: boolean = false;
switch (this.toggle.mode) {
case "zoom":
console.log("joining zoom");
await this.zoomService.joinSession(name, sessionId, password).then( (e) => { exceeded = e; } );
break;
case "agora":
console.log("joining agora");
await this.agoraService.joinSession(name, sessionId, password).then( (e) => { exceeded = e; } );
break;
}
return exceeded;
}
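For reference, a component's form-submit handler might call this dispatcher and raise the alert mentioned later like so (hypothetical code; the actual component isn't shown in this guide):
// Hypothetical form-submit handler in the join-form component.
async onSubmit(): Promise<void> {
  const exceeded = await this.uiService.joinSession(this.name, this.sessionId, this.password);
  if (exceeded) {
    // The session already has the maximum number of participants — notify the user.
    alert("This session is full (4 participant limit reached).");
  }
}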
First, we check whether the limit of four users per session has been exceeded via the Agora REST API:
agora.service.ts
let count!: number;
let url: string = "http://localhost:3001/agora-channel-count/?channelname=" + sessionId;
await fetch(url).then( async res => {
await res.text().then(data => {
count = parseInt(data);
console.log(count, data)
});
});
if (count >= 4) return true;
If the limit is exceeded, the exceeded flag is set to true, the frontend sees it, and the session is not joined; an alert notifies the user that the limit has been reached. Otherwise, the Agora appID and Agora token are retrieved from our backend server and the session is joined using the Agora SDK join function:
agora.service.ts
this.sessionId = sessionId;
this.agoraEngine = AgoraRTC.createClient({ mode: "rtc", codec: "vp8" });
url = "http://localhost:3001/agora-appID";
// Note: the request options object is a RequestInit, not a Headers instance.
let settings: RequestInit = {
mode: "cors",
method: 'GET',
headers: {
"Content-Type": "text/plain"
}
};
await fetch(url, settings).then( async res => {
await res.text().then(data => {
this.appID = data.toString().trim();
});
});
url = "http://localhost:3001/agora-token?name=" + name + "&topic=" + sessionId + "&password=" + password;
await fetch(url, settings).then( async res => {
await res.text().then(data => {
this.token = data.toString().trim();
});
});
await this.agoraEngine!.join(this.appID, this.sessionId, this.token, this.uid);
return false;
}
Remote users in Agora trigger the user-published event whenever they publish their media; we handle it with the userPublished callback shown in the media setup section below.
The Zoom SDK join logic works the same way: when the mode is set to Zoom, the logic for joining a Zoom session is executed in ui.service.ts. First, we check whether the limit of four users per session has been exceeded, this time via the Zoom REST API:
zoom.service.ts
let count!: number;
let check: string = "http://localhost:3001/zoom-session-count/?sessionname=" + sessionId;
await fetch(check).then( async res => {
await res.text().then(data => {
count = parseInt(data);
console.log(count, data)
});
});
if (count >= 4) return true;
If the limit is exceeded, the exceeded flag is set to true, the frontend sees it, and the session is not joined; an alert notifies the user that the limit has been reached. Otherwise, the Zoom Video SDK JWT is retrieved from our backend server and the session is joined using the Zoom SDK join function:
zoom.service.ts
let token!: string; // will hold the Video SDK JWT returned by the backend
let url: string = "http://localhost:3001/zoomtoken?name=" + name + "&topic=" + sessionId + "&password=" + password;
// As above, the request options object is a RequestInit, not a Headers instance.
let settings: RequestInit = {
mode: "cors",
method: 'POST',
headers: {
"Content-Type": "text/plain"
}
};
await fetch(url, settings).then( async res => {
await res.text().then(data => {
token = data.toString().trim();
});
});
await this.client.join(sessionId, token, name, password).then(() => {
this.stream = this.client.getMediaStream();
this.populateParticipantList();
}).catch((error: any) => {
console.log(error);
});
return false;
}
For remote users, the user-added event is triggered once a user joins, and that user is added to the participant list. Their video and/or audio can be consumed once they set up their peripherals later in the process:
private userAdded = ()=>{
let participantList: any = this.client.getAllUser();
participantList.forEach((participant: any, i: number) => {
this.participants[i].userId = participant.userId;
});
console.log(this.participants);
};
Setting up Media
Media setup in Agora uses a pub/sub model: users publish their video and audio to the channel, and other users subscribe to receive it. In this app, we use a local user object and an array of remote participant objects, which share the same structure, to keep track of each user and their peripherals:
private localUser: AgoraParticipant = {
userId: '',
PlayerContainer: null,
AudioTrack: null,
VideoTrack: null,
};
private remoteParticipantGrid: AgoraRemoteParticipant[] = [
{
userId: '',
PlayerContainer: null,
AudioTrack: null,
VideoTrack: null
},
{
userId: '',
PlayerContainer: null,
AudioTrack: null,
VideoTrack: null
},
{
userId: '',
PlayerContainer: null,
AudioTrack: null,
VideoTrack: null
}
]
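The AgoraParticipant and AgoraRemoteParticipant interfaces themselves aren't shown in this guide; inferred from how the properties are used, they might look like this (the track types come from agora-rtc-sdk-ng):
// Hypothetical participant interfaces, inferred from the service code in this guide.
import {
  ICameraVideoTrack,
  IMicrophoneAudioTrack,
  IRemoteAudioTrack,
  IRemoteVideoTrack,
} from 'agora-rtc-sdk-ng';

interface AgoraParticipant {
  userId: string;
  PlayerContainer: HTMLElement | null;      // div the local video is played into
  AudioTrack: IMicrophoneAudioTrack | null;
  VideoTrack: ICameraVideoTrack | null;
}

interface AgoraRemoteParticipant {
  userId: string;
  PlayerContainer: HTMLElement | null;      // div the remote video is played into
  AudioTrack: IRemoteAudioTrack | null;
  VideoTrack: IRemoteVideoTrack | null;
}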
After joining the session, we set up the local user to publish their video to the channel:
async setupLocalUserView(): Promise<void> {
console.log(this.videoCanvas, this.appID, this.token, this.uid, this.sessionId);
this.localUser.PlayerContainer = this.videoCanvas;
this.localUser.PlayerContainer!.id = this.uid.toString();
this.createGrid();
await AgoraRTC.createMicrophoneAndCameraTracks().then( async (audioVideoTracks) => {
this.localUser.AudioTrack = audioVideoTracks[0];
this.localUser.VideoTrack = audioVideoTracks[1];
});
await this.agoraEngine!.publish([<ILocalTrack>this.localUser.AudioTrack, <ILocalTrack>this.localUser.VideoTrack]);
this.localUser.VideoTrack!.play(<HTMLElement>this.localUser.PlayerContainer);
await this.localUser.AudioTrack!.setMuted(true);
console.log("publish success!");
}
The local user's media is then played (the Agora SDK injects a video element into the player container, so the track's .play method is used to play the video).
We then register the userPublished callback with the Agora SDK's user-published listener to subscribe to each participant's media as it is published (the listener registration itself is sketched after the callback below):
private userPublished = async (user: IAgoraRTCRemoteUser, mediaType: any) => {
await this.agoraEngine!.subscribe(user, mediaType);
let userFound: number = this.remoteParticipantGrid.findIndex( (participant: AgoraRemoteParticipant) => { return (participant.userId === user.uid.toString()) });
let newUser: number = this.remoteParticipantGrid.findIndex( (participant: AgoraRemoteParticipant) => { return (participant.userId === '') });
let userIndex = (userFound != -1) ? userFound : newUser;
if (mediaType == "video") {
this.remoteParticipantGrid[userIndex].VideoTrack = <IRemoteVideoTrack>user.videoTrack;
this.remoteParticipantGrid[userIndex].AudioTrack = <IRemoteAudioTrack>user.audioTrack;
this.remoteParticipantGrid[userIndex].userId = user.uid.toString();
this.remoteParticipantGrid[userIndex].PlayerContainer!.id = user.uid.toString();
this.remoteParticipantGrid[userIndex].VideoTrack!.play(<HTMLElement>this.remoteParticipantGrid[userIndex].PlayerContainer);
}
if (mediaType == "audio") {
this.remoteParticipantGrid[userIndex].AudioTrack = <IRemoteAudioTrack>user.audioTrack;
this.remoteParticipantGrid[userIndex].AudioTrack!.play();
}
console.log("SUBSCRIBE SUCCESS", this.remoteParticipantGrid);
}
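The registration of these callbacks isn't shown in the original service; a minimal sketch of what it might look like, using Agora's standard event names, is:
// Sketch of wiring the callbacks defined in agora.service.ts to the client
// (call this after createClient; the method name startEventListeners is an assumption).
startEventListeners(): void {
  this.agoraEngine!.on("user-published", this.userPublished);
  this.agoraEngine!.on("user-unpublished", this.userUnpublished); // sketched later in this guide
  this.agoraEngine!.on("user-left", this.userLeft);
}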
Each remote user is added to the remoteParticipantGrid array so we can quickly access their data and control their video element in the DOM.
Zoom handles this more simply: a single canvas element is used to paint every user's video, with x/y coordinates passed to the Zoom Video SDK render method. After joining the session, a media stream is retrieved from the SDK to control the participant video and audio data that the SDK delivers to the client:
await this.client.join(sessionId, token, name, password).then(() => {
this.stream = this.client.getMediaStream();
this.populateParticipantList();
}).catch((error: any) => {
console.log(error);
});
We use a single participants array to keep track of all users and their data within the session:
private participants: Participant[] = [ //placeholder values
{userId : '', X : 0, Y : 540},
{userId : '', X : 960, Y : 540},
{userId : '', X : 0, Y : 0},
{userId : '', X : 960, Y : 0}
];
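The Participant interface isn't shown in this guide; inferred from how it's used, it might look like this (userId starts as an empty string and is later replaced by the ID the Zoom SDK assigns):
// Hypothetical Participant interface for the Zoom grid, inferred from usage.
interface Participant {
  userId: string | number; // '' until filled with the SDK-assigned user ID
  X: number;               // x-offset of this participant's tile on the canvas
  Y: number;               // y-offset of this participant's tile on the canvas
}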
In ui.service.ts, we call the setupCamera and setupMicrophone methods. The SDK finds our video and audio peripherals automatically and starts them upon calling .startVideo and .startAudio, respectively:
async setupCamera(): Promise<void> {
await this.stream.startVideo().then( () => {
this.stream.renderVideo(this.videoCanvas, this.participants[0].userId, 960, 540, this.participants[0].X, this.participants[0].Y,3);
});
console.log("camera setup");
}
async setupMicrophone(): Promise<void> {
await this.stream.startAudio().then( () => {
setTimeout(() => { // workaround: muteAudio() throws "no audio joined" unless we wait briefly, even though startAudio() has already resolved
this.stream.muteAudio();
console.log("audio muted");
}, 150);
});
console.log("microphone setup");
}
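If you would rather avoid the setTimeout workaround, one possible alternative is to mute only after the SDK reports that audio has joined. This assumes the Zoom Video SDK's current-audio-change event and its join action; verify the event name and payload against the SDK version you are using:
// Possible alternative to the setTimeout workaround (event name and payload
// shape are assumptions — check them against your Zoom Video SDK version).
async setupMicrophone(): Promise<void> {
  this.client.on('current-audio-change', (payload: any) => {
    if (payload.action === 'join') {
      // Audio has actually joined, so muting should no longer throw "no audio joined".
      this.stream.muteAudio();
      console.log("audio muted after join");
    }
  });
  await this.stream.startAudio();
  console.log("microphone setup");
}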
We also start our SDK event listeners to listen for the user-added, user-removed, and peer-video-state-change events during the session (the registration is sketched at the end of this section). Finally, we render the videos of all participants in the session on the canvas:
async renderParticipantsVideo(): Promise<void> {
console.log("rendering participant videos");
this.participants.forEach( async (participant) => {
if (participant.userId !== '') {
console.log("rendering participant:", participant.userId, participant.X, participant.Y);
await this.stream.stopRenderVideo(this.videoCanvas, participant.userId);
await this.stream.renderVideo(this.videoCanvas, participant.userId, 960, 540, participant.X, participant.Y,3);
}
});
}
This approach requires minimal access to the DOM and lets the SDK handle the painting internally; you only have to tell the SDK where on the canvas each video should go.
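The listener registration itself isn't shown above; a minimal sketch, using the handler names from this guide and Zoom's user-added, user-removed, and peer-video-state-change events, might be:
// Sketch of registering the Zoom Video SDK event listeners used in this guide
// (the method name startEventListeners is an assumption).
startEventListeners(): void {
  this.client.on('user-added', this.userAdded);
  this.client.on('user-removed', this.userRemoved);
  this.client.on('peer-video-state-change', this.userVideoStateChange);
}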
Controlling Audio and Video
Both Agora and Zoom make it easy to control audio and video through their mute and unmute functions. Both also provide methods for checking the audio and video state, which we use to determine whether either peripheral is muted. The similarities between the Zoom and Agora functions are clear:
agora.service.ts
async toggleAudio(): Promise<void> {
if (this.localUser.AudioTrack!.muted) {
await this.localUser.AudioTrack!.setMuted(false);
} else {
await this.localUser.AudioTrack!.setMuted(true);
}
}
async toggleVideo(): Promise<void> {
if (this.localUser.VideoTrack!.muted) {
await this.localUser.VideoTrack!.setMuted(false);
} else {
await this.localUser.VideoTrack!.setMuted(true);
}
}
isMutedAudio() {
return this.localUser.AudioTrack!.muted;
}
isMutedVideo() {
return this.localUser.VideoTrack!.muted;
}
zoom.service.ts
async toggleVideo(): Promise<void> {
if (!this.client.getCurrentUserInfo().bVideoOn) {
await this.stream.startVideo().then( () => {
this.stream.renderVideo(this.videoCanvas, this.participants[0].userId, 960, 540, this.participants[0].X, this.participants[0].Y,3);
});
} else {
await this.stream.stopVideo();
}
}
async toggleAudio(): Promise<void> {
if (this.client.getCurrentUserInfo().muted) {
await this.stream.unmuteAudio();
} else {
await this.stream.muteAudio();
}
}
isMutedAudio(): boolean {
return this.client.getCurrentUserInfo().muted;
}
isMutedVideo(): boolean {
// bVideoOn reports whether video is on, so invert it to report "muted".
return !this.client.getCurrentUserInfo().bVideoOn;
}
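In ui.service.ts, the mute buttons can dispatch to whichever service is active, mirroring the joinSession switch shown earlier (a hypothetical sketch):
// ui.service.ts — hypothetical dispatcher for the audio mute button.
async toggleAudio(): Promise<void> {
  if (this.toggle.mode === 'zoom') {
    await this.zoomService.toggleAudio();
  } else {
    await this.agoraService.toggleAudio();
  }
}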
The major difference between the SDKs is how their listeners behave. In Agora, a user's video and/or audio is unpublished when that media is muted. The user-unpublished event is picked up by the userUnpublished listener, and because the media is unpublished, subscribed users no longer see or hear it.
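The userUnpublished handler isn't shown in the original service, so the following is only a minimal sketch of what it might look like, mirroring the userPublished handler above; the exact cleanup your app needs may differ:
// Hypothetical userUnpublished callback: stop playing the media that the
// remote user just muted/unpublished.
private userUnpublished = async (user: IAgoraRTCRemoteUser, mediaType: any) => {
  let userIndex: number = this.remoteParticipantGrid.findIndex(
    (participant: AgoraRemoteParticipant) => participant.userId === user.uid.toString()
  );
  if (userIndex === -1) return;
  if (mediaType == "video") {
    this.remoteParticipantGrid[userIndex].VideoTrack?.stop();
    this.remoteParticipantGrid[userIndex].VideoTrack = null;
  }
  if (mediaType == "audio") {
    this.remoteParticipantGrid[userIndex].AudioTrack?.stop();
    this.remoteParticipantGrid[userIndex].AudioTrack = null;
  }
  console.log("user unpublished", user.uid, mediaType);
};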
For Zoom, user media is tracked through the stream object retrieved from the SDK. Media state changes are seen by the userVideoStateChange listener, which decides whether to render a user's video based on the event's action:
private userVideoStateChange = (payload: any) => {
let userIndex = this.participants.findIndex( (participant) => { return (participant.userId === payload.userId) });
console.log("video state change:", payload.userId);
if (payload.action === 'Start') {
this.stream.renderVideo(this.videoCanvas, payload.userId, 960, 540, this.participants[userIndex].X, this.participants[userIndex].Y,3);
} else if (payload.action === 'Stop') {
this.stream.stopRenderVideo(this.videoCanvas, payload.userId);
}
};
Leaving the session
Leaving the session or channel is similar for both SDKs: the media peripherals are stopped first, and then the session is left, which triggers the respective leave listener on the other clients (userRemoved for Zoom, userLeft for Agora). The sequence for each SDK is shown below:
Agora
async leaveSession(): Promise<void> {
this.localUser.AudioTrack!.close();
this.localUser.VideoTrack!.close();
await this.agoraEngine!.leave();
console.log("You left the channel");
}
userLeft function:
private userLeft = async (user: IAgoraRTCRemoteUser, reason: string) => {
let userIndex: number = this.remoteParticipantGrid.findIndex( (participant: AgoraRemoteParticipant) => { return (participant.userId === user.uid.toString()) });
let parent: HTMLElement|null = document.querySelector("body > app-root > div > app-meeting > div > app-video-client > app-video-canvas > div");
this.renderer.removeChild(parent, this.remoteParticipantGrid[userIndex].PlayerContainer);
this.remoteParticipantGrid.splice(userIndex, 1);
let PlayerContainer: HTMLElement = this.renderer.createElement('div');
this.renderer.setStyle(PlayerContainer, "width", "550px");
this.renderer.setStyle(PlayerContainer, "height", "380px");
this.renderer.setStyle(PlayerContainer, "display", "inline-block");
this.renderer.appendChild(parent, PlayerContainer);
let newSlot: AgoraRemoteParticipant = {
userId: '',
PlayerContainer: PlayerContainer,
AudioTrack: null,
VideoTrack: null
}
this.remoteParticipantGrid.push(newSlot);
console.log("new grid", this.remoteParticipantGrid);
};
Zoom
async leaveSession(): Promise<void> {
await this.stream.stopVideo();
await this.stream.stopAudio();
console.log("audio and video stopped");
if (this.client.isHost()){
console.log("ending session");
await this.client.leave(true);
} else {
await this.client.leave();
}
}
userRemoved function:
private userRemoved = async (payload: any)=>{
console.log("user left,", payload);
this.participants = [ //reset array
{userId : '', X : 0, Y : 540},
{userId : '', X : 960, Y : 540},
{userId : '', X : 0, Y : 0},
{userId : '', X : 960, Y : 0} ];
let participantList: any = this.client.getAllUser();
console.log("user removed", participantList);
participantList.forEach((participant: any, i: number) => {
this.participants[i].userId = participant.userId;
});
console.log(this.participants);
await this.renderParticipantsVideo();
};
Now we have a fully functional video conferencing app powered by the Zoom Video SDK!