Introduction
Welcome to my updated article on capturing screenshots with Puppeteer! Since my last blog post in 2021, I've gained more experience with Puppeteer and picked up techniques that make screenshot capturing more efficient. In this article, I will give an overview of Puppeteer, walk you through the installation process, and share code examples for capturing screenshots of web pages, including multiple screenshots in one go.
What is Puppeteer?
Puppeteer is a powerful Node.js library backed by Google that offers a high-level API for controlling headless Chrome or Chromium over the DevTools Protocol. It enables developers to perform various tasks such as capturing screenshots and PDFs of web pages, running end-to-end tests, diagnosing performance issues, and much more.
Installation
Before we dive into the code, let's make sure Puppeteer is installed. You can install it with npm using the following command (Yarn users can run yarn add puppeteer instead):
npm install puppeteer
Please note that installing Puppeteer also downloads a compatible browser build (Chromium in older releases, Chrome for Testing in newer ones), which may take some time depending on your network speed.
Capture a GitHub Profile Screenshot
Let's start by capturing a screenshot of a GitHub profile. I've updated the code to make it more concise and added some error handling.
const fs = require("fs");
const puppeteer = require("puppeteer");

async function captureScreenshot() {
  const targetUrl = "https://github.com/sagar-gavhane";
  const screenshotPath = "screenshots/github-profile.jpeg";

  // Create the output directory if it doesn't exist yet
  if (!fs.existsSync("screenshots")) {
    fs.mkdirSync("screenshots");
  }

  let browser;

  try {
    // Launch headless Chromium browser
    browser = await puppeteer.launch({ headless: true });

    // Create a new page
    const page = await browser.newPage();

    // Set viewport width and height
    await page.setViewport({ width: 1440, height: 1080 });

    // Navigate to the target URL
    await page.goto(targetUrl);

    // Capture screenshot and save it
    await page.screenshot({ path: screenshotPath });

    console.log("\n🎉 GitHub profile screenshot captured successfully.");
  } catch (err) {
    console.error("❌ Error:", err.message);
  } finally {
    // Always close the browser, even if one of the steps above failed
    if (browser) {
      await browser.close();
    }
  }
}

captureScreenshot();
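The page.screenshot call above passes only a path, so Puppeteer captures just the visible viewport and infers the image format. As a rough sketch of how the options could be made explicit, the helper below builds an options object from the output file name. Note that buildScreenshotOptions is a hypothetical helper name (it is not part of Puppeteer) and the default quality of 80 is an arbitrary assumption; path, type, quality, and fullPage are documented page.screenshot options.

```javascript
const path = require("path");

// Maps supported file extensions to the `type` values page.screenshot() accepts.
const TYPES_BY_EXTENSION = {
  ".jpg": "jpeg",
  ".jpeg": "jpeg",
  ".png": "png",
  ".webp": "webp",
};

// Hypothetical helper (not part of Puppeteer): builds an options object
// for page.screenshot() based on the output file's extension.
function buildScreenshotOptions(filePath, { fullPage = false, quality = 80 } = {}) {
  const extension = path.extname(filePath).toLowerCase();
  const type = TYPES_BY_EXTENSION[extension];

  if (!type) {
    throw new Error(`Unsupported screenshot extension: "${extension}"`);
  }

  const options = { path: filePath, type, fullPage };

  // `quality` does not apply to PNG images, so only set it for lossy formats.
  if (type !== "png") {
    options.quality = quality;
  }

  return options;
}
```

With this helper, the call inside captureScreenshot could become await page.screenshot(buildScreenshotOptions(screenshotPath, { fullPage: true })), which captures the full scrollable page instead of only the 1440x1080 viewport.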
Capture Multiple Screenshots
In real-world scenarios, you will often need to capture several screenshots in one run. Let me show you how to do that with Puppeteer. We'll use a pages.json file to store the URLs and names of the web pages we want to capture, and a function that loops over those entries.
Here's the pages.json file, followed by an updated version of the function:
[
  {
    "id": "c1472465-ede8-4376-853c-39274242aa69",
    "url": "https://github.com/microsoft/vscode",
    "name": "VSCode"
  },
  {
    "id": "6b08743e-9454-4829-ab3a-91ad2ce9a6ac",
    "url": "https://github.com/vuejs/vue",
    "name": "vue"
  },
  {
    "id": "08923d12-caf2-4d5e-ba41-3019a9afbf9b",
    "url": "https://github.com/tailwindlabs/tailwindcss",
    "name": "tailwindcss"
  },
  {
    "id": "daeacf42-1ab9-4329-8f41-26e7951b69cc",
    "url": "https://github.com/getify/You-Dont-Know-JS",
    "name": "You Dont Know JS"
  }
]
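Since the capture loop relies on every entry having an id, a name, and a url, it can help to check the file before launching a browser. The following is a minimal sketch of such a check, not something Puppeteer provides: validatePages is a hypothetical helper name, and the rules it enforces (non-empty strings, http or https URLs, unique ids) are assumptions about what this particular script needs.

```javascript
// Hypothetical helper: verifies that each entry has the fields the capture
// loop uses and that no two entries would write to the same screenshot file.
function validatePages(entries) {
  if (!Array.isArray(entries)) {
    throw new TypeError("pages.json must contain an array of entries");
  }

  const seenIds = new Set();

  entries.forEach((entry, index) => {
    for (const key of ["id", "name", "url"]) {
      if (typeof entry[key] !== "string" || entry[key].trim() === "") {
        throw new Error(`Entry ${index}: "${key}" must be a non-empty string`);
      }
    }

    // The URL constructor throws on malformed URLs.
    const { protocol } = new URL(entry.url);
    if (protocol !== "http:" && protocol !== "https:") {
      throw new Error(`Entry ${index}: url must use http or https`);
    }

    // The id is used as the file name, so duplicates would overwrite each other.
    if (seenIds.has(entry.id)) {
      throw new Error(`Entry ${index}: duplicate id "${entry.id}"`);
    }
    seenIds.add(entry.id);
  });

  return entries;
}
```

Calling validatePages(require("./pages.json")) at the top of the script would surface a malformed entry right away instead of partway through a run. The capture function itself looks like this: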
const fs = require("fs");
const puppeteer = require("puppeteer");
const pages = require("./pages.json");

async function captureMultipleScreenshots() {
  // Create the output directory if it doesn't exist yet
  if (!fs.existsSync("screenshots")) {
    fs.mkdirSync("screenshots");
  }

  let browser;

  try {
    // Launch headless Chromium browser
    browser = await puppeteer.launch({ headless: true });

    // Create a new page
    const page = await browser.newPage();

    // Set viewport width and height
    await page.setViewport({
      width: 1440,
      height: 1080,
    });

    // Visit each URL in turn and save a screenshot named after its id
    for (const { id, name, url } of pages) {
      await page.goto(url);
      await page.screenshot({ path: `screenshots/${id}.jpeg` });
      console.log(`✅ ${name} - (${url})`);
    }

    console.log(`\n🎉 ${pages.length} screenshots captured successfully.`);
  } catch (err) {
    console.error("❌ Error:", err.message);
  } finally {
    // Always close the browser, even if one of the steps above failed
    if (browser) {
      await browser.close();
    }
  }
}

captureMultipleScreenshots();
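One limitation of the loop above is that a single failing page (a timeout, for example) throws out of the loop and skips every remaining URL. As a sketch of one way around that, the function below wraps each capture in its own try/catch and reports which entries succeeded and which failed. Here captureAll is a hypothetical helper name, and the waitUntil: "networkidle2" setting and 30-second timeout are assumptions chosen for illustration; both are documented page.goto options.

```javascript
// Sketch: capture each entry independently so one failure doesn't stop the rest.
// `page` is expected to be a Puppeteer Page; any object with compatible
// goto() and screenshot() methods works, which keeps the function easy to test.
async function captureAll(page, entries, outputDir = "screenshots") {
  const results = { succeeded: [], failed: [] };

  for (const { id, name, url } of entries) {
    try {
      // Wait until the network is mostly idle so late-loading content is included.
      await page.goto(url, { waitUntil: "networkidle2", timeout: 30000 });
      await page.screenshot({ path: `${outputDir}/${id}.jpeg` });
      results.succeeded.push(name);
    } catch (err) {
      // Record the failure and continue with the next entry.
      results.failed.push({ name, reason: err.message });
    }
  }

  return results;
}
```

Inside captureMultipleScreenshots, the for loop could then be replaced with const { succeeded, failed } = await captureAll(page, pages), followed by logging both lists once the run finishes.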
Conclusion
Puppeteer is an excellent tool for capturing screenshots and automating browser-related tasks. In this updated article, I've shown you how to capture a single screenshot as well as multiple screenshots with Puppeteer, making your web scraping and testing workflows more efficient and productive.
Feel free to explore Puppeteer's extensive API documentation on GitHub to discover even more features and capabilities.
Happy coding!
Top comments (5)
Make this into a serverless function and voilà! A nice and handy API to get screenshots.
Yes, I built a serverless function for my website, but running it takes time, so it will drastically increase the cost.
What about a $5 DigitalOcean droplet? How many users can it handle (concurrently or otherwise) if you optimize it for this purpose?
Yep, like the author says, it will take time and increase cost.
Awesome! This is what I needed. Thanks