Introduction
You have probably used an OTT/streaming platform such as Netflix, Hotstar, or YouTube. If so, you have likely seen the small preview that appears when you hover over the seek bar of a video on any of these platforms, like below:
This user experience is excellent. It lets you quickly peek at earlier or later parts of the video. I refer to this component as the FrameTooltip component because it functions as a tooltip that displays the video frame at a specific point in time.
In this blog post, we will build this component. So, without any further ado, let's get started.
Prerequisites
- Basic to intermediate understanding of the React ecosystem and the Context API.
- Version control: Git
- JavaScript and TypeScript
- Basic knowledge of running Python scripts
The What?
We are trying to build a tooltip component that displays the video frame at a specific point in time during the following interactions:
- When hovering on the seek bar of the video.
- While dragging the thumb of the seek bar of the video.
The Why?
We are doing this:
- To explore a real-world scenario and its complexities.
- To sharpen our thinking from a system design point of view.
- Because it’s fun 😀
Some backstory
This blog post is part of this series, which I recommend you read. To give a quick heads-up on the project architecture, here is a brief overview:
In this project I am trying to implement a clone of YouTube's video player with React. The project architecture is explained in the following GIF:
Different controls are available on the video, such as the seek bar, play button, etc. I call these control components. Whenever one of these control components changes, a global state is updated, which in turn updates the video’s actual state. You can read more about this here.
The How?
Now let us dive into what it takes to build this component. Below are the required steps:
- Video preprocessing steps: whichever video sample you choose needs to go through the steps below
  - Generating an image sprite of the video
  - Generating a VTT file for the video based on the image sprite
  - Adding image coordinates to the VTT
- Loading the VTT file in the video
- Extracting the data from the VTT into the project
I know there is a lot of jargon here, but please bear with me. I will explain all of it in the coming sections.
Video Preprocessing Step: Generating Image Sprite
It is imperative that we first understand what image sprites are.
- An image sprite is a single image that consists of multiple smaller images.
- It is more efficient to fetch a single image that contains many images than to fetch every image separately.
Consider this GIF for an example:
In the GIF, we start with multiple separate images, which are then placed in a grid. That grid is the image sprite.
Based on this concept, we arrive at our first step, i.e. generating an image sprite of the video:
- We first capture frames of the video as individual images (one screenshot per second in our case).
- And then we place all of them into a single JPEG file.
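To make the grid idea concrete, here is a minimal sketch of how the position of the Nth thumbnail inside a sprite can be worked out. The helper name spriteRegion and the columns parameter are assumptions for illustration only; the actual number of thumbnails per row depends on the tool that builds the sprite.

```typescript
// Hypothetical helper (not from the project): locate a thumbnail in a sprite grid.
interface SpriteRegion {
  x: number; // left offset in pixels
  y: number; // top offset in pixels
  w: number; // thumbnail width in pixels
  h: number; // thumbnail height in pixels
}

function spriteRegion(
  frameIndex: number, // 0-based index of the thumbnail
  columns: number, // thumbnails per row (assumed, depends on the sprite tool)
  thumbWidth: number,
  thumbHeight: number
): SpriteRegion {
  const col = frameIndex % columns; // position within the row
  const row = Math.floor(frameIndex / columns); // which row we are on
  return { x: col * thumbWidth, y: row * thumbHeight, w: thumbWidth, h: thumbHeight };
}

// Example: the 5th thumbnail (index 4) in a grid of 200x83 thumbnails.
console.log(spriteRegion(4, 5, 200, 83));
```

Thumbnails fill the grid left to right and then wrap to the next row, so a single index is enough to recover both coordinates.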
Video Preprocessing Step: Generating VTT file for the Video
So what is a VTT file?
- VTT is short for WebVTT (Web Video Text Tracks). It is a file format for displaying timed text tracks using the track element.
- The main purpose of a VTT file is to overlay text on videos, such as subtitles.
- You can read more about VTT on MDN.
Any VTT file would have the following format:
WEBVTT

00:01.000 --> 00:04.000
- [subtitle 1]: Lorem Ipsum has been the industry's standard dummy text

00:02.000 --> 00:03.000
- [subtitle 2]: when an unknown printer took it to make a type specimen book.
The first cue above means: from the 1st second until the 4th second, display the subtitle [subtitle 1]: Lorem Ipsum has been the industry's standard dummy text.
As you can see, each entry in this file corresponds to text that is shown over the video during the specified period. Have a look at this in action below:
In the above GIF, you can also see that we load this VTT file with the help of the track element.
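To get a feel for the timing line of a cue, here is a small sketch that converts WebVTT timestamps into seconds. The function names vttTimestampToSeconds and parseCueTiming are made up for this example. In the browser you would not normally need this, because the track element parses the file for you.

```typescript
// Hypothetical helpers for illustration; the browser parses VTT files itself.

// Converts "mm:ss.ttt" or "hh:mm:ss.ttt" into a number of seconds.
function vttTimestampToSeconds(timestamp: string): number {
  const [clock = "", millis = "0"] = timestamp.trim().split(".");
  const parts = clock.split(":").map(Number);
  // Pad with leading zeros so we always have [hours, minutes, seconds].
  while (parts.length < 3) parts.unshift(0);
  const [hours = 0, minutes = 0, seconds = 0] = parts;
  return hours * 3600 + minutes * 60 + seconds + Number(millis) / 1000;
}

// Splits a timing line such as "00:01.000 --> 00:04.000" into start and end.
function parseCueTiming(line: string): { start: number; end: number } {
  const [start = "", end = ""] = line.split("-->");
  return { start: vttTimestampToSeconds(start), end: vttTimestampToSeconds(end) };
}

console.log(parseCueTiming("00:01.000 --> 00:04.000"));
```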
For our example, we will do the following:
- For each second of the video, our .vtt file will contain the name of the image sprite file along with the coordinates of that second's video frame within the sprite.
- We store this data in the cues so that we can look up the respective region of the sprite file for any time span of the video.
- We later prepend the cue text with the URL of the host that stores the sprite file, e.g. Dropbox.
Our VTT file will look like below:
WEBVTT

Img 1
00:00:00.000 --> 00:00:01.000
tears-of-steel-battle-clip-medium_sprite.jpg#xywh=0,0,200,83

Img 2
00:00:01.000 --> 00:00:02.000
tears-of-steel-battle-clip-medium_sprite.jpg#xywh=200,0,200,83

Img 3
00:00:02.000 --> 00:00:03.000
tears-of-steel-battle-clip-medium_sprite.jpg#xywh=400,0,200,83

Img 4
00:00:03.000 --> 00:00:04.000
tears-of-steel-battle-clip-medium_sprite.jpg#xywh=600,0,200,83

Img 5
00:00:04.000 --> 00:00:05.000
tears-of-steel-battle-clip-medium_sprite.jpg#xywh=800,0,200,83
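Each cue here consists of an identifier (e.g. Img 1), the time span it covers, and the sprite file name followed by #xywh=x,y,width,height, i.e. the position and size of that second's frame inside the sprite.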
All these steps can be done manually with online tools, but while implementing this component I found this cool Python script: https://github.com/vlanard/videoscripts. It combines all these steps and generates the .vtt file along with the image sprite JPEG file.
Each entry in this .vtt will be similar to what we saw just above, i.e. the name of the image sprite file along with the coordinates of the video frame in the sprite.
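If you are curious what producing such a file involves, here is a rough sketch that builds the same kind of VTT text in TypeScript. This is not how the Python script works internally. It is only an illustration of the format, and buildSpriteVtt, its parameters, and the columns value are all assumptions.

```typescript
// Illustrative only: builds VTT text in the format shown above.

// Formats whole seconds as hh:mm:ss.000
function formatVttTimestamp(totalSeconds: number): string {
  const pad = (n: number): string => (n < 10 ? "0" + n : String(n));
  const hours = Math.floor(totalSeconds / 3600);
  const minutes = Math.floor((totalSeconds % 3600) / 60);
  const seconds = totalSeconds % 60;
  return `${pad(hours)}:${pad(minutes)}:${pad(seconds)}.000`;
}

function buildSpriteVtt(
  spriteFile: string, // sprite file name, e.g. "my_sprite.jpg"
  frameCount: number, // number of thumbnails in the sprite
  columns: number, // thumbnails per row (assumed)
  thumbWidth: number,
  thumbHeight: number,
  intervalSeconds = 1 // one thumbnail per second, as in this post
): string {
  const blocks: string[] = ["WEBVTT"];
  for (let i = 0; i < frameCount; i++) {
    const x = (i % columns) * thumbWidth;
    const y = Math.floor(i / columns) * thumbHeight;
    const start = formatVttTimestamp(i * intervalSeconds);
    const end = formatVttTimestamp((i + 1) * intervalSeconds);
    blocks.push(
      `Img ${i + 1}\n${start} --> ${end}\n${spriteFile}#xywh=${x},${y},${thumbWidth},${thumbHeight}`
    );
  }
  // Cues must be separated by a blank line in a VTT file.
  return blocks.join("\n\n") + "\n";
}

console.log(buildSpriteVtt("my_sprite.jpg", 3, 5, 200, 83));
```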
Now that we know what we need, let us start with the implementation:
- First, choose the sample video file that you want to work with. I chose the open-source Tears of Steel video for this project.
- Next, clone the videoscripts project linked above.
- The script comes with certain defaults that need to be tuned to our requirements. These defaults are as follows:
THUMB_RATE_SECONDS=45
THUMB_WIDTH=100
Where,
  - THUMB_RATE_SECONDS tells the script to take a screenshot every Nth second.
  - THUMB_WIDTH tells the script to generate each screenshot with a width of N pixels.
We need to change these defaults to the following values:
THUMB_RATE_SECONDS=1
THUMB_WIDTH=200
Here we tell the script to take a screenshot of the video every second and to generate each image with a width of 200px.
- Now save the script and run the command below from inside the project folder:
python makesprites.py <path-to-.mp4-video-file>
- Once the command has run, it generates a thumbs output folder that contains:
  - Screenshots of each frame
  - Image sprite
  - .vtt file
Our image sprite and .vtt file will look like this:
WEBVTT

Img 1
00:00:00.000 --> 00:00:01.000
tears-of-steel-battle-clip-medium_sprite.jpg#xywh=0,0,200,83

Img 2
00:00:01.000 --> 00:00:02.000
tears-of-steel-battle-clip-medium_sprite.jpg#xywh=200,0,200,83

Img 3
00:00:02.000 --> 00:00:03.000
tears-of-steel-battle-clip-medium_sprite.jpg#xywh=400,0,200,83

Img 4
00:00:03.000 --> 00:00:04.000
tears-of-steel-battle-clip-medium_sprite.jpg#xywh=600,0,200,83

Img 5
00:00:04.000 --> 00:00:05.000
tears-of-steel-battle-clip-medium_sprite.jpg#xywh=800,0,200,83
We now have everything we need. Let us start loading and extracting this data into the project.
Loading and Extracting VTT in the project
To load the above VTT, follow these steps:
- Add a track element inside the video element of the Video.tsx file like below:
- Once the track element is added, we make use of a ref, trackMetaDataRef, to access this track element in the useEffect below:
Since we want the video frame to be displayed for every second while hovering over the seek bar, we run the above effect whenever the global state hoveredDuration changes.
💡 NOTE: hoveredDuration and hoveredUrl are states that get updated whenever we hover over the seek bar. You can learn about them here in the given sequence:
Changes in context, reducer, actions, triggering action on mousemove to update hoveredDuration
In this effect, we access all the available cues and index the cue array with the truncated hoveredDuration value. We then update the hoveredThumbnailUrl global state with the currentCue. Here is how both of these states update when we hover over the seek bar:
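The lookup described above could be sketched roughly as follows. This is not the code from the repository. CueLike and cueTextAt are names I made up for the example, and the sketch assumes that the VTT has exactly one cue per second starting at 0, which holds for a file generated with THUMB_RATE_SECONDS=1.

```typescript
// Minimal shape of what we need from a cue (the browser's VTTCue has more fields).
interface CueLike {
  startTime: number;
  endTime: number;
  text: string;
}

// Hypothetical helper: returns the cue text for the hovered second, if any.
function cueTextAt(cues: ArrayLike<CueLike>, hoveredDuration: number): string | undefined {
  // With one cue per second, the truncated duration doubles as the array index.
  const index = Math.trunc(hoveredDuration);
  if (index < 0 || index >= cues.length) return undefined;
  return cues[index]?.text;
}

// Example with two hand-written cues.
const sampleCues: CueLike[] = [
  { startTime: 0, endTime: 1, text: "sprite.jpg#xywh=0,0,200,83" },
  { startTime: 1, endTime: 2, text: "sprite.jpg#xywh=200,0,200,83" },
];
console.log(cueTextAt(sampleCues, 1.7));
```

If the thumbnails were taken at a different interval, indexing by the truncated second would no longer line up, and you would have to search for the cue whose startTime and endTime enclose the hovered time instead.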
Loading the image sprite into the project
Now that we have the current cue for that second in hoveredThumbnailUrl, we can easily access it from the context wherever we want. Since we want to show this frame in a tooltip, we create a FrameTooltip component.
Add the below code in the file src/components/Seekbar/FrameTooltip.tsx. If this file doesn’t exist, create it at the given location:
The code is pretty simple:
- The component accepts the current frame duration (duration), the current text cue from the VTT file (thumbnailUrl), and the dimensions and coordinates in dims.
- We call this component in the [Seekbar](https://github.com/keyurparalkar/react-youtube-player-clone/blob/1bb248b86d3771d1551d71b2ef30090fb38f8e69/src/components/Seekbar/index.tsx#L114) component in the following way:
- We split the hoveredThumbnailUrl so that we get the sprite name and the coordinates, like below:
hoveredThumbnailUrl = tears-of-steel-battle-clip-medium_sprite.jpg#xywh=0,0,200,83
/* After splitting we get: */
spriteName = tears-of-steel-battle-clip-medium_sprite.jpg
coords = 0,0,200,83
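As a rough illustration of the split described above, here is one way it could be written. parseThumbnailCue and ParsedThumbnail are hypothetical names, and the code in the repository may do this differently.

```typescript
// Hypothetical parser for a cue text such as "sprite.jpg#xywh=0,0,200,83".
interface ParsedThumbnail {
  spriteName: string;
  dims: { x: number; y: number; w: number; h: number };
}

function parseThumbnailCue(cueText: string): ParsedThumbnail | null {
  // Everything before "#xywh=" is the sprite name, everything after is the coordinates.
  const [spriteName = "", fragment = ""] = cueText.trim().split("#xywh=");
  const numbers = fragment.split(",").map(Number);
  // Reject cues that do not carry exactly four numeric values.
  if (!spriteName || numbers.length !== 4 || numbers.some((n) => isNaN(n))) {
    return null;
  }
  const [x = 0, y = 0, w = 0, h = 0] = numbers;
  return { spriteName, dims: { x, y, w, h } };
}

console.log(
  parseThumbnailCue("tears-of-steel-battle-clip-medium_sprite.jpg#xywh=0,0,200,83")
);
```

Returning null for malformed input keeps the caller from rendering a tooltip with broken coordinates.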
Next, we pass this component to the custom tooltip component that I have implemented here. We pass our FrameTooltip as the content prop to the tooltip:
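For reference, a common way to show a single frame out of a sprite is CSS background positioning. The sketch below is an assumption about how a component like FrameTooltip could compute its inline style, not the code from the repository. frameStyle, FrameDims, and the example URL are made up.

```typescript
// Hypothetical style helper: crops one frame out of the sprite via CSS.
interface FrameDims {
  x: number;
  y: number;
  w: number;
  h: number;
}

interface FrameStyle {
  width: string;
  height: string;
  backgroundImage: string;
  backgroundPosition: string;
  backgroundRepeat: string;
}

function frameStyle(spriteUrl: string, dims: FrameDims): FrameStyle {
  return {
    // The box is exactly one thumbnail in size...
    width: `${dims.w}px`,
    height: `${dims.h}px`,
    backgroundImage: `url(${spriteUrl})`,
    // ...and the sprite is shifted up and to the left so the wanted frame shows through.
    backgroundPosition: `-${dims.x}px -${dims.y}px`,
    backgroundRepeat: "no-repeat",
  };
}

console.log(frameStyle("https://example.com/sprite.jpg", { x: 200, y: 0, w: 200, h: 83 }));
```

In a React component, an object like this could be passed to the style prop of a div, so only one sprite image is downloaded no matter how many frames are previewed.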
With everything set, let us see how our changes look:
Summary
Well, folks, we learned a lot in this blog post. To summarize:
- We learned what image sprites and VTT files are.
- We learned how to generate an image sprite out of a video.
- We worked out a way to create a custom VTT file.
- We also saw how to load and extract the VTT into our existing project.
- We saw the usage of the track element.
That’s all, folks! In the next blog post, we are going to implement the chapters functionality. So stay tuned!
The entire code for this tutorial can be found here.
Thank you for reading!