ndesmic

Building an Extension to Record Videos

Update: If you are newly viewing this, it seems that changes in Chrome may have broken the video recording functionality. If you record a video element you'll probably get a black screen with audio only, although canvas recording still works. Very unfortunate.


I wanted to do a series on WebExtensions because honestly there aren't enough of them. I was thinking about what something useful could be and settled on an extension that can record a video element on the page. Kinda like a VCR. Lots of websites don't let you download video, or they obscure the source through arcane magic (cough, YouTube). But if we can capture content straight from the displayed source then that seems pretty useful. Note that because Google owns YouTube, an extension like this may not be welcome in the Chrome Web Store, but with a recipe you can make your own!

Manifest

The first thing we need is a manifest.json. This has no relation to the standard web app manifest; it was created earlier, specifically for Chrome extensions, but the idea is the same. It's a JSON file with application metadata, much like package.json.



{
  "name": "web-vcr",
  "version": "1.0",
  "description": "Save Videos",
  "manifest_version": 3,
  "content_scripts": [
    {
      "matches": ["*://*/*"],
      "css": ["css/styles.css"],
      "js": ["js/content-script.js"]
    }
  ]
}



The first few fields are obvious. manifest_version is the version of the extension manifest format you are using. At the time of writing most extensions use 2, but that has been deprecated and will someday be removed. 3 has major API changes, most of which are for the better (though the ability to inspect requests is a lot weaker, which is good for privacy and bad for standard ad-blocking). There's a lot of other stuff you might want to add here but we'll keep it simple for now. We'll start with a content script. A content script is basically a script that gets injected into pages whose URL matches the given match pattern. It should be noted that this script has its own separate execution environment: you cannot access JavaScript variables or objects from the page, but you do share the DOM. The pattern I give allows access to all pages, a very big permission to ask for but necessary if this is to be usable everywhere. We'll inject 2 things: some styles and a script.

A simple content script

We'll start with a simple script that finds all video elements and adds a record button next to each one:



//content-script.js
const videos = Array.from(document.querySelectorAll("video"));

videos.forEach(video => {
    const recordBtn = document.createElement("button");
    recordBtn.textContent = "🔴";
    recordBtn.classList.add("web-vcr-btn");
    video.parentElement.appendChild(recordBtn);
});



This won't do anything yet, but we can at least test a simple content script (and if you're building something else this can be a jumping-off point).

Also create css/styles.css. It can be empty but it needs to exist to pass validation.

Loading a local extension

First go to the URL chrome://extensions. This URL generally works in other Chromium-based browsers too; Firefox uses about:debugging instead. I'll be using Chrome but the steps will mostly be similar for others. Enable developer mode to confirm you know what you are doing. Then you can click "Load unpacked". Extensions you submit to stores are packed up into what's basically a zip file. Unpacked extensions, like those we use for development, point to folders, and "Pack extension" will pack them for you when you're ready to submit. Find your project folder and load it. You'll get errors if there's something wrong; otherwise it will load and you'll see it appear in the list with a default icon.

Now you have an extension! To test, try navigating to a page with video elements, like YouTube. To inspect what's going on you can open up dev tools as usual and, in the Sources panel, find the "Content scripts" tab. You'll see all loaded extensions and you can browse to your extension and inspect the scripts, set breakpoints and all that like usual. You should see the button appear somewhere near the video. This part is not an exact science. We don't know how every page lays out videos, so it will be really hard to place the button correctly on every site. This is probably good enough for now though; we can always fine-tune later for the bigger sites and ones we think are important.

In fact one hint I'll give for YouTube is to set the z-index in order to get around some of the annoying overlay elements:



/* css/styles.css */
.web-vcr-btn {
  position: relative; /* z-index only affects positioned elements (or flex/grid items) */
  z-index: 999999;
}



Any time you make changes, be sure to reload the extension on the extensions page, then refresh the tab you're testing in so the new content script is injected.

Recording

I actually did a post on this before (https://dev.to/ndesmic/how-to-record-a-canvas-element-and-make-a-gif-4852) and it's the same for video elements as it is for canvas elements. But we can do a quick recap.



//js/content-script.js
const videos = Array.from(document.querySelectorAll("video"));

videos.forEach(video => {
    const recordBtn = document.createElement("button");
    recordBtn.textContent = "🔴";
    recordBtn.classList.add("web-vcr-btn");

    let recording = false;
    let mediaRecorder;
    let recordedChunks;

    recordBtn.addEventListener("click", e => {
        e.preventDefault();
        e.stopPropagation();
        recording = !recording;
        if (recording) {
            recordBtn.textContent = "⬜";
            const stream = video.captureStream();
            mediaRecorder = new MediaRecorder(stream, {
                mimeType: 'video/webm;codecs=vp9',
                ignoreMutedMedia: true
            });
            recordedChunks = [];
            mediaRecorder.ondataavailable = e => {
                if (e.data.size > 0) {
                    recordedChunks.push(e.data);
                }
            };
            mediaRecorder.start();
        } else {
            recordBtn.textContent = "🔴";
            // Build the download once the recorder has flushed its final chunk;
            // the stop event fires after the last dataavailable event.
            mediaRecorder.onstop = () => {
                const blob = new Blob(recordedChunks, {
                    type: "video/webm"
                });
                const url = URL.createObjectURL(blob);
                const a = document.createElement("a");
                a.href = url;
                a.download = "recording.webm";
                a.click();
                URL.revokeObjectURL(url);
            };
            mediaRecorder.stop();
        }
    });

    video.parentElement.appendChild(recordBtn);
});



We use video.captureStream() to create a stream. Unlike the GIF version I don't pass a framerate, so it takes the native one. We create a new MediaRecorder with the appropriate format, in this case VP9, which is one of the more modern codecs but is not supported in all browsers. Chunks will come in from the recorder and we collect them in an array (this is all done in memory for now), and finally we start the media recorder. When stop is clicked/toggled we stop the media recorder, turn the array of chunks into a blob and turn that into a download with the appropriate mimetype. Then we do some cleanup.
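Since VP9 is not available everywhere, one option is to probe for a supported format before creating the recorder. The following is only a sketch: pickMimeType and the candidate list are my own illustrative names and choices, not part of the extension. MediaRecorder.isTypeSupported is the standard static method for this check.

```javascript
// Candidate formats in order of preference (an illustrative list, not from the post).
const CANDIDATE_TYPES = [
    "video/webm;codecs=vp9",
    "video/webm;codecs=vp8",
    "video/webm"
];

// Returns the first type the checker accepts, or "" so the browser picks its default.
// The checker is injectable so the function can be exercised outside a browser.
function pickMimeType(
    candidates = CANDIDATE_TYPES,
    isSupported = type => MediaRecorder.isTypeSupported(type)
) {
    return candidates.find(type => isSupported(type)) ?? "";
}

// In the content script (browser only):
//   mediaRecorder = new MediaRecorder(stream, { mimeType: pickMimeType() });
```

If you go this way, the blob type and the download file name should follow whatever format was actually chosen.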

Also note the stopPropagation and preventDefault. Some pages like YouTube use delegated event listeners and this will stop those from triggering.

Dealing with mutation

Some sites use a SPA architecture or otherwise load video after the fact with scripts. We'll want to deal with this as well.



const videoObserver = new MutationObserver((mutationList, observer) => {
    for (const mutation of mutationList) {
        if (mutation.type === "childList") {
            for(const child of mutation.addedNodes){
                if(child.nodeName === "VIDEO"){
                    addRecorder(child);
                } else if(child.nodeType === Node.ELEMENT_NODE) {
                    Array.from(child.querySelectorAll("video")).forEach(video => addRecorder(video));
                }
            }
        }
    }
});

videoObserver.observe(document.body, {
    subtree: true,
    childList: true
});



addRecorder is what used to be the callback in the videos.forEach loop; I've just extracted it. This is the most reliable way to watch for changes. It might be a little more efficient to poll the DOM at long intervals, but this will pick up all changes and do so immediately.

We create the mutation observer first. Then we observe the body with subtree and childList set to true. This means the observer will pick up added and removed child nodes (childList) throughout the whole tree under the body (subtree), so anything new added under the body will be fed into the callback. The callback gets a list of changes, so we need to iterate over them. We're only looking for childList mutations (this is also the only kind that will be fed to us due to the observer configuration, but it's always good to check). Of those mutations, multiple elements can be added at a single time, so we iterate over those elements. If the added element is a video we add a recorder button to it. If it is not, we need to check whether the added element contains videos in its subtree, because the observer will not report these separately: an element with children counts as a single addition. If there are any, we add recorders to them too. If the added node is not an element (nodeType 1), for example a text node, then it has no querySelectorAll method and calling it would throw, so we ignore it.
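The per-node decision described above can be pulled out into a small function, which also makes it easy to test without a real DOM. This is a sketch only; collectVideos is a hypothetical helper name and not something from the extension's source.

```javascript
const ELEMENT_NODE = 1; // same value as Node.ELEMENT_NODE in the browser

// Given one node from mutation.addedNodes, returns the video elements it contributes.
function collectVideos(node) {
    // Text and comment nodes have no querySelectorAll, so skip them.
    if (node.nodeType !== ELEMENT_NODE) return [];
    // The added node is itself a video.
    if (node.nodeName === "VIDEO") return [node];
    // Otherwise look for videos nested anywhere inside the added element.
    return Array.from(node.querySelectorAll("video"));
}

// Inside the observer callback (browser only):
//   for (const child of mutation.addedNodes) {
//       collectVideos(child).forEach(video => addRecorder(video));
//   }
```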

I also added one small protection:



const buttonMap = new WeakMap();

function addRecorder(video) {
    if(buttonMap.get(video)) return;
    console.log(`Attaching Video`, video);
    const recordBtn = document.createElement("button");
    recordBtn.textContent = "🔴";
    recordBtn.classList.add("web-vcr-btn");

    buttonMap.set(video, recordBtn);

    let recording = false;
    let mediaRecorder;
    let recordedChunks;

    recordBtn.addEventListener("click", e => { /* yada yada: same handler as before */ });

    video.parentElement.appendChild(recordBtn);
}



Every time a button is added we keep track of which video it is associated with in a WeakMap. If an entry already exists we skip that video. This way, if an element gets moved around we won't create duplicate buttons.
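The same guard can be written as a generic "attach once" helper. This is a sketch of the pattern rather than the extension's actual code; attachOnce and its create callback are hypothetical names. A WeakMap is a good fit because its entries don't keep a video alive once the page drops it.

```javascript
const attached = new WeakMap();

// Runs create(target) only the first time a given target is seen,
// and returns the stored value on every later call.
function attachOnce(target, create) {
    if (attached.has(target)) return attached.get(target);
    const value = create(target);
    attached.set(target, value);
    return value;
}

// In the content script (browser only), the button creation would go in the callback:
//   attachOnce(video, v => { /* create the button, append it, return it */ });
```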

It should be noted that this is watching changes on the whole DOM and therefore is very expensive especially on large pages. It's fine for now, but likely something we might want to reconsider at some point.

Encrypted Media

Some media you might want to record uses Encrypted Media Extensions (EME). This will cause the browser to throw an error when you try to make a media stream from it, so we should support this case too. This will be different since we cannot record directly. Instead we will use a separate but very similar API called getDisplayMedia. getDisplayMedia is designed for screen sharing applications, so it captures the whole screen, an application window or a tab, and for security purposes only the user is allowed to choose which. This means that they have to manually select the source.



// when trying to connect the stream (inside an async function, because of the await)
try {
    stream = video.captureStream();
} catch(ex){
    if (ex.name === 'NotSupportedError') {
        stream = await navigator.mediaDevices.getDisplayMedia({
            video: {
                cursor: "never"
            },
            audio: true
        });
    }
}



I couldn't find an easy way to tell if a video uses EME (there might be an attribute but I couldn't find a good way to look it up) so I have this ugly try block instead. If captureStream fails with a NotSupportedError it probably means the video is using EME, or something else is wrong. In this case we fall back to getDisplayMedia to record the whole screen. getDisplayMedia takes a set of constraints. We need to set 2. On the video we want cursor: "never" so that we don't show the cursor (this was probably for security but it works great here!). We also need to set audio: true so that we capture audio. According to notes this will only get audio from the tab instead of all system audio, which is fine because that's what we want. Now the flow is slightly different. In the case of an error the user will be given a prompt to select what they want to share, and it's important they select the tab they want to record. With this we can even capture EME-protected video, but we'll also grab all the UI elements too.
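One thing worth keeping in mind with the getDisplayMedia path: the browser keeps capturing (and showing its sharing indicator) until the stream's tracks are stopped or the user ends sharing from the browser UI. A small helper along these lines could be called when recording finishes. It is a sketch, and stopStream is a hypothetical name that doesn't appear in the extension.

```javascript
// Stops every track of a MediaStream so the browser ends the capture.
// Returns the number of tracks that were stopped.
function stopStream(stream) {
    const tracks = stream.getTracks();
    tracks.forEach(track => track.stop());
    return tracks.length;
}

// After mediaRecorder.stop() in the content script (browser only):
//   stopStream(stream);
```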

Basic browser action

Browser actions are little buttons that appear next to the URL bar. Previously Chrome supported 2 types: one next to the URL bar (browser action) and one in the URL bar (page action), but in manifest v3 these have been combined into just "actions", which behave like the old browser actions. These are typically used in extensions to open dropdowns with settings or actions. We'll want to make use of this too. One of the problems I found testing some video sites is that they have overlays that react to the mouse. In the unencrypted case that's fine, but if we're recording via getDisplayMedia we want to avoid triggering overlays. So what I want is a button that can start capture without triggering user interaction in the page.

First let's add the action to the extension in the manifest. Just below content_scripts, at the root, add:



  "action" : {
    "default_icon": "img/icon.png",
    "default_popup": "html/popup.html"
  }



This has two references, one to an icon which will be the icon that appears in the browser and an html page for the popup. Create both. Here's my most basic popup page:



<!-- html/popup.html -->
<!doctype html>
<html lang="en">
    <head>
        <title>Popup</title>
        <link rel="stylesheet" href="./css/popup.css">
        <link rel="icon" href="./img/icon.png" type="image/png">
        <meta charset="utf-8">
        <meta name="theme-color" content="#ff6400">
        <meta name="viewport" content="width=device-width">
        <meta name="description" content="Web VCR Settings">
    </head>
    <body>
        <button id="capture-btn">Capture the screen</button>
        <script src="js/popup.js" type="module"></script>
    </body>
</html>




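You'll need to create the css/popup.css and js/popup.js files, but really all this is is a blank page with a button. Note that the relative paths in popup.html resolve against the html/ folder, so either place those files under it or adjust the paths to match your layout.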


In the popup.js we just need to wire up the click handler to send a message to the content script:



const captureBtn = document.getElementById("capture-btn");
let recording = false;

captureBtn.addEventListener("click", () => {
    recording = !recording;
    chrome.tabs.query({ active: true, currentWindow: true }, tabs => {
        chrome.tabs.sendMessage(tabs[0].id, { command: "record" });
    });
    captureBtn.textContent = recording ? "Stop Recording" : "Record Screen";
});



We query the active tab (since we don't care about the others) with chrome.tabs.query, passing in a query for tabs that are active and in the current window (there should only be one). Then we send a message with chrome.tabs.sendMessage. The content script can listen for this. I also changed the button so it toggles, but realistically this will not work well. If the popup closes or the tab is reloaded we will lose the current recording state. We can fix this by doing more complex message passing shenanigans when the popup opens, but for simplicity let's just leave it broken for now.
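For the curious, here is roughly what those message passing shenanigans might look like: the popup asks the content script for its state when it opens. This is a sketch under assumptions. The "status" command and createStatusListener are hypothetical and not part of the extension; only the listener signature (request, sender, sendResponse) and the callback form of chrome.tabs.sendMessage are standard.

```javascript
// Hypothetical helper: builds a listener for chrome.runtime.onMessage.
// "status" is an assumed command name, not one the extension defines.
function createStatusListener(isRecording) {
    return (request, sender, sendResponse) => {
        if (request.command === "status") {
            // Answer synchronously with the current recording state.
            sendResponse({ recording: isRecording() });
        }
    };
}

// Content script (browser only), where `recording` is whatever flag it keeps:
//   chrome.runtime.onMessage.addListener(createStatusListener(() => recording));
// Popup, when it opens:
//   chrome.tabs.sendMessage(tabId, { command: "status" }, response => {
//       captureBtn.textContent = response?.recording ? "Stop Recording" : "Record Screen";
//   });
```

This only restores the button label; a reloaded tab still loses the recording itself.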

Back in the content script I had to refactor everything so that recording is not reliant on button clicks.



async function record(video) {
    let mediaRecorder;
    let recordedChunks;
    let stream;
    try {
        try {
            stream = video.captureStream();
        } catch (ex) {
            if (ex.name === 'NotSupportedError') {
                stream = await navigator.mediaDevices.getDisplayMedia({
                    video: {
                        cursor: "never"
                    },
                    audio: true
                });
            }
        }
        mediaRecorder = new MediaRecorder(stream, {
            mimeType: 'video/webm;codecs=vp9',
            ignoreMutedMedia: true
        });
        recordedChunks = [];
        mediaRecorder.ondataavailable = e => {
            if (e.data.size > 0) {
                recordedChunks.push(e.data);
            }
        };
        mediaRecorder.start();
    } catch (ex) {
        console.log(ex);
    }

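    return () => {
        if (!mediaRecorder) return; // recording never started
        // Build the download once the recorder has flushed its final chunk;
        // the stop event fires after the last dataavailable event.
        mediaRecorder.onstop = () => {
            const blob = new Blob(recordedChunks, {
                type: "video/webm"
            });
            const url = URL.createObjectURL(blob);
            const a = document.createElement("a");
            a.href = url;
            a.download = "recording.webm";
            a.click();
            URL.revokeObjectURL(url);
        };
        mediaRecorder.stop();
    };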
}



record will start recording and then return a function that stops recording. If you are familiar with React hooks it's a bit like that. But now we can separate the UI from the recording functionality.



// Maps each video to the stop function returned by record().
const recorderMap = new WeakMap();

async function addRecorder(video) {
    if(buttonMap.get(video)) return;
    console.log(`Attaching Video`, video);
    const recordBtn = document.createElement("button");
    recordBtn.textContent = "🔴";
    recordBtn.classList.add("web-vcr-btn");
    buttonMap.set(video, recordBtn);

    let recording = false;

    recordBtn.addEventListener("click", async e => {
        e.preventDefault();
        e.stopPropagation();
        recording = !recording;
        if (recording) {
            recorderMap.set(video, await record(video));
            recordBtn.textContent = "⬜";
        } else {
            recorderMap.get(video)?.();
            recordBtn.textContent = "🔴";
        }
    });

    video.parentElement.appendChild(recordBtn);
}



This also works in our favor because the button will handle errors better by changing the UI state after success.

Now that we have that we can actually hook up the main video recorder:



let mainVideoRecorder;
chrome.runtime.onMessage.addListener(async (request, sender, sendResponse) => {
    const videos = Array.from(document.querySelectorAll("video"));

    if(videos.length === 0) return;

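    // Use the rendered size; video.width/height only reflect the HTML attributes
    // and are 0 when the page sizes the video with CSS.
    let maxVideo;
    for(const video of videos){
        if(!maxVideo || (video.clientWidth * video.clientHeight) > (maxVideo.clientWidth * maxVideo.clientHeight)){
            maxVideo = video;
        }
    }

    if (request.command === "record") {
        if(mainVideoRecorder){
            mainVideoRecorder();
            mainVideoRecorder = null;
        } else {
            mainVideoRecorder = await record(maxVideo);
        }
    }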
});



When we get a message we look at all the videos present. If there are none we bail out because we can't do anything. Otherwise we find the video with the largest screen area and assume that is the main video. If no recording is in progress we start one on that video; if a recorder is already running we stop it instead.
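The "largest video wins" heuristic can also live in its own small function. This sketch assumes the rendered size (clientWidth times clientHeight) is the right measure of area, and pickLargestVideo is a hypothetical name; the area function is a parameter so other measures can be swapped in.

```javascript
// Returns the element with the largest area, or null for an empty list.
// By default the area is the rendered size in CSS pixels.
function pickLargestVideo(videos, area = v => v.clientWidth * v.clientHeight) {
    let best = null;
    for (const video of videos) {
        if (best === null || area(video) > area(best)) {
            best = video;
        }
    }
    return best;
}

// In the message listener (browser only):
//   const maxVideo = pickLargestVideo(document.querySelectorAll("video"));
```

On a tie the earlier element in document order is kept, which matches the loop in the listener.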

Conclusion

So there is a non-trivial and useful extension using modern web APIs.

There are some improvements I could see. Maybe the badge itself is the record button. Maybe we can add video export settings for things like animated gifs. Maybe we could try to go full screen or remove UI to make recording EME videos better. The UI and branding definitely suck so we could clean that up. Maybe we don't even need the inline button: what if we just take the biggest and most centered video in the viewport to avoid UI fighting? Maybe there could be an audio-only mode? Sometimes the bitrate will suffer when recording; maybe we can find a way to fix that? All just some thoughts in my head. Maybe I'll develop this into something releasable on the Chrome Web Store some day.

But in either case you should now have the skills to make your own and if Google decides such an extension is not allowed for whatever reason then you can side load and improve it.

Source code is here: https://github.com/ndesmic/web-vcr/tree/v1

Top comments (2)

Oleg

hey, looks like the code from repo doesn't work at current chrome version!

ndesmic

You're right. It doesn't seem to work with video elements anymore. It seems like there might have been a change in chrome that either accidentally broke it or perhaps deliberately prevents this.