This article will take your PDF verification skills to the next level using Playwright.
The PDF document used in this example is the "Tesla Powerwall 2 Datasheet". The file is hosted at the following location: https://oedtrngbj.wpengine.com/wp-content/uploads/Powerwall_2_AC_Datasheet_EN_NA.pdf
Observe that this PDF document contains multiple pages, some illustrations and a table.
Caveats
Let me be upfront about the issues I have encountered with this approach:
- This solution only seems to work when the test is run using Chromium in headed mode.
- The elements inside the PDF viewer component are not accessible to Playwright, which means you will not be able to mask or hide dynamic elements in the PDF, for example a customer ID or dynamic date/time stamps.
- We are using Playwright's built-in visual comparison library. It is advisable to get familiar with the maintenance required to keep the baseline images up to date. See the Visual comparisons page in the Playwright documentation.
If you are happy with these compromises, read on!
page.setContent() + toMatchSnapshot() = 🤩
Using setContent(), load the PDF into an iframe like so:
const pdfResource =
  'https://oedtrngbj.wpengine.com/wp-content/uploads/Powerwall_2_AC_Datasheet_EN_NA.pdf';
const iframe = `<iframe src="${pdfResource}#zoom=60" style="width: 100%;height:100%;border: none;"></iframe>`;
await page.setContent(iframe);
await page.waitForTimeout(5000);
NOTE: You may need to experiment with the zoom level, width and height attributes to suit your needs.
ANOTHER NOTE: The waitForTimeout function is used here to wait for the PDF contents to be loaded into the iframe.
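A fixed five-second wait can be both slower than necessary and too short on a bad day. One possible alternative, sketched below, is to poll until two consecutive captures are byte-for-byte identical. The helper name waitUntilStable and its default interval and timeout are hypothetical and not part of Playwright, and I have not verified this against the Chromium PDF viewer. Be aware that a frame that has not started rendering yet will also look "stable", so a short initial wait may still be needed.

```typescript
// Hypothetical helper (not a Playwright API): polls `capture` until two
// consecutive results are byte-for-byte identical, or the timeout elapses.
function sameBytes(a: Uint8Array, b: Uint8Array): boolean {
  if (a.length !== b.length) return false;
  for (let i = 0; i < a.length; i += 1) {
    if (a[i] !== b[i]) return false;
  }
  return true;
}

async function waitUntilStable<T extends Uint8Array>(
  capture: () => Promise<T>,
  intervalMs = 500, // assumed polling interval
  timeoutMs = 10000, // assumed upper bound before giving up
): Promise<T> {
  const deadline = Date.now() + timeoutMs;
  let previous = await capture();
  while (Date.now() < deadline) {
    // Pause between captures so the viewer has time to change.
    await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
    const current = await capture();
    if (sameBytes(previous, current)) return current;
    previous = current;
  }
  throw new Error(`Content did not stabilise within ${timeoutMs} ms`);
}
```

In a test, this might stand in for the waitForTimeout call along the lines of `const shot = await waitUntilStable(() => page.locator('iframe').screenshot());`, with the returned buffer then passed to the snapshot assertion.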
We will make use of Playwright's expect(screenshot).toMatchSnapshot(name[, options]) assertion (https://playwright.dev/docs/test-assertions#screenshot-assertions-to-match-snapshot-1) to compare a screenshot of the element matching a locator against a stored baseline image. In our case, we need to take a screenshot of the iframe above with the PDF file fully loaded at a particular page.
Our solution will make use of this call:
expect(await page.locator('iframe').screenshot()).toMatchSnapshot();
The completed test will look like this ...
import { test, expect } from '@playwright/test';

test('validate a complex pdf', async ({ page }) => {
  const pdfResource =
    'https://oedtrngbj.wpengine.com/wp-content/uploads/Powerwall_2_AC_Datasheet_EN_NA.pdf';
  // Embed the PDF in an iframe so the browser's PDF viewer renders it.
  const iframe = `<iframe src="${pdfResource}#zoom=60" style="width: 100%;height:100%;border: none;"></iframe>`;
  await page.setContent(iframe);
  // Give the viewer time to load and render the document.
  await page.waitForTimeout(5000);
  // Compare a screenshot of the iframe against the stored baseline.
  expect(await page.locator('iframe').screenshot()).toMatchSnapshot();
});
Run the test. It should fail on this first run, because there is no baseline to compare against yet.
Playwright has not found a golden snapshot of the element, so on the very first test execution it automatically generates this file for you. You will need to commit these files to your repo.
Rerun the test. This time it should pass, since it now has a baseline image to compare against.
OK, that's nice. We managed to validate the first page of the PDF.
But most PDFs you will encounter out in the wild contain multiple pages. Let us amend our test to cater for multiple pages.
test('validate a complex pdf II, all pages', async ({ page }) => {
  const numberOfPages = 2;
  const pdfResource =
    'https://oedtrngbj.wpengine.com/wp-content/uploads/Powerwall_2_AC_Datasheet_EN_NA.pdf';
  for (let i = 1; i <= numberOfPages; i += 1) {
    // The page=N fragment asks the viewer to open the document at page N.
    const iframe = `<iframe src="${pdfResource}#zoom=60&page=${i}" style="width: 100%;height:100%;border: none;"></iframe>`;
    await page.setContent(iframe);
    await page.waitForTimeout(5000);
    // One named baseline image per page.
    expect(await page.locator('iframe').screenshot()).toMatchSnapshot({
      name: `pdf_validation_page_${i}.png`,
    });
  }
});
Job done.
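Both tests build the same iframe markup by hand, differing only in the URL fragment. If you find yourself tweaking the zoom level or dimensions often, a small helper along these lines may keep things tidy. The name buildPdfIframe and its options object are my own invention for illustration. The defaults simply mirror the values used in the tests above.

```typescript
// Hypothetical helper: builds the iframe markup used in the tests above.
interface PdfIframeOptions {
  zoom?: number; // viewer zoom percentage, as in the #zoom=60 fragment
  page?: number; // 1-based page number for the page= fragment
  width?: string; // CSS width of the iframe
  height?: string; // CSS height of the iframe
}

function buildPdfIframe(pdfUrl: string, options: PdfIframeOptions = {}): string {
  const { zoom = 60, page, width = '100%', height = '100%' } = options;
  // Collect the viewer parameters that go after the '#' in the URL.
  const fragment = [`zoom=${zoom}`];
  if (page !== undefined) fragment.push(`page=${page}`);
  return `<iframe src="${pdfUrl}#${fragment.join('&')}" style="width: ${width};height:${height};border: none;"></iframe>`;
}
```

Inside the loop, the markup could then be produced with something like `buildPdfIframe(pdfResource, { page: i })` and passed to page.setContent() as before.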
Other improvements for you to consider
(all totally optional and entirely up to you to implement)
- Dynamically determine the number of pages. The example above uses a predefined value for the expected number of pages.
- Remove the hard-coded waitForTimeout and implement a better way of waiting for the contents to be loaded.
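For the first improvement (the page count), here is a rough sketch of one way to estimate it without a PDF library: count the "/Type /Page" object markers in the raw file bytes. The function name estimatePdfPageCount is hypothetical, and this is a heuristic only. It assumes the page objects are stored uncompressed, so PDFs that keep them inside compressed object streams will likely report zero or a wrong count, and a dedicated PDF parsing library would be the more robust choice.

```typescript
// Hypothetical heuristic: estimates the page count by counting
// "/Type /Page" markers in the raw PDF bytes. Assumes the page objects
// are not stored inside compressed object streams.
function estimatePdfPageCount(pdfBytes: Uint8Array): number {
  // Map each byte to one character so binary stream data cannot break decoding.
  const text = Array.from(pdfBytes, (byte) => String.fromCharCode(byte)).join('');
  // \b stops "/Type /Pages" (the page tree node) from being counted.
  const matches = text.match(/\/Type\s*\/Page\b/g);
  return matches ? matches.length : 0;
}
```

In a Playwright test, the bytes could be fetched with `const response = await page.request.get(pdfResource);` and then passed in as `estimatePdfPageCount(await response.body())` to replace the hard-coded numberOfPages.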