Benjamin Lannon

Writing a validator for Gatsby's Site Showcase

This past weekend, I wrote a simple Node script that runs through the Site Showcase for GatsbyJS. I was very happy with the end results and wanted to talk through the process.

Why do this?

The showcase is a place to show off sites that use Gatsby, and as Gatsby continues to grow it currently lists over 450 of them. Checking every so often that those sites are still around and still using Gatsby is a good sanity check.

This could also be added to a CI process, so that when someone submits a site to the showcase, the check is automated rather than someone having to confirm by hand that it is actually a Gatsby site.

Dig into the code

The entire source for this is located on my GitHub:

lannonbr / gatsby-site-showcase-validator

Node script to validate links on Gatsby Site Showcase are actually Gatsby sites

Gatsby Site Showcase Validator

Notice: This Repo is archived as it has been merged into Gatsby's core repo here: https://github.com/gatsbyjs/gatsby/tree/master/.github/actions/gatsby-site-showcase-validator

A simple node script that visits and checks all of the sites in the Site Showcase for Gatsby.

Instructions

  1. Clone down the repository
  2. Install the dependencies with npm or yarn
  3. Run node index.js

Run in Docker

You can also run it in Docker with the following command:

docker run --rm lannonbr/gatsby-site-showcase-validator:1.0.0



To start, I pull down the sites.yml file from Gatsby's repo, which is used to build the showcase, and parse it with js-yaml. That gives me an array of sites, and I loop through it and fetch the main_url of each one.
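As a rough illustration of that loop, here is a minimal sketch. It assumes Node 18 or newer (for the global fetch) and that the parsed YAML is an array of objects with a main_url field. The function name fetchAllSites and the shape of the returned objects are made up for this example; the actual script may be structured differently.

```javascript
// Hypothetical helper: fetch the HTML for every showcase entry.
// `sites` is assumed to be the array produced by parsing sites.yml;
// only the `main_url` field comes from the post.
// `fetchFn` defaults to the global fetch (Node 18+) and can be swapped for a stub.
async function fetchAllSites(sites, fetchFn = fetch) {
  const results = [];
  for (const site of sites) {
    try {
      const response = await fetchFn(site.main_url);
      const html = await response.text();
      results.push({ url: site.main_url, html, error: null });
    } catch (err) {
      // Network failures (DNS, timeouts, TLS) are recorded instead of thrown,
      // so one dead site does not stop the whole run.
      results.push({ url: site.main_url, html: null, error: err.message });
    }
  }
  return results;
}
```

Fetching one site at a time keeps the sketch simple and is gentle on the network. With several hundred sites, fetching in small batches would be a reasonable next step.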

Once I have the HTML for an individual site, I pass it into cheerio, an HTML parser for Node.js with a jQuery-like API.

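const cheerio = require("cheerio");
const $ = cheerio.load(siteHtml); // parse the fetched HTML so it can be queried with selectors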

The main part of the code checks whether the page has an element with the id "___gatsby", which is the default container that Gatsby renders the site into.

// Select the container element by its id
const gatsbyContainer = $("#___gatsby");

if (gatsbyContainer.length !== 0) {
  // The page is a Gatsby site
} else {
  // The page is not a Gatsby site, print out an error message
}
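To see what the check is looking for without installing anything, the sketch below approximates it with a regular expression. This is a swapped-in technique, not what the script does: the function name looksLikeGatsby is hypothetical, and a regex can be fooled by the same text inside a comment or script, so a real parser like cheerio remains the more robust choice.

```javascript
// Rough stand-in for the cheerio check: looks for an id attribute equal to
// ___gatsby in raw HTML. A regular-expression approximation, not an HTML parser.
function looksLikeGatsby(html) {
  // Matches id="___gatsby", id='___gatsby', or unquoted id=___gatsby,
  // followed by whitespace or the end of the tag.
  return /\sid=(["']?)___gatsby\1[\s>]/.test(html);
}

console.log(looksLikeGatsby('<body><div id="___gatsby"></div></body>')); // true
console.log(looksLikeGatsby('<body><div id="root"></div></body>')); // false
```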

Beyond that, the whole file is only about 70 lines of code, and it did a fairly good job.

Results

According to the stats, only ~20 sites were either not loading or had moved off Gatsby as a framework. Out of the 450+ sites in the showcase, that is less than 5%, a low enough share to suggest the showcase is reliable: the sites that claim to be Gatsby sites almost always are.
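The arithmetic can be reproduced in a few lines. This is only a sketch: summarize and the isGatsby flag are hypothetical names, and 450 and 20 are the rounded figures from this post rather than exact counts.

```javascript
// Hypothetical result shape: one { url, isGatsby } object per showcase entry.
function summarize(results) {
  const invalid = results.filter((r) => !r.isGatsby).length;
  // Guard against dividing by zero when there are no results.
  const percentInvalid =
    results.length === 0 ? 0 : (invalid / results.length) * 100;
  return { total: results.length, invalid, percentInvalid };
}

// Rounded figures from the post: roughly 450 sites, roughly 20 failures.
const sample = Array.from({ length: 450 }, (_, i) => ({
  url: `https://site-${i}.example`,
  isGatsby: i >= 20,
}));

const summary = summarize(sample);
console.log(
  `${summary.invalid} of ${summary.total} invalid (${summary.percentInvalid.toFixed(1)}%)`
); // prints "20 of 450 invalid (4.4%)"
```

In a CI job, a non-zero summary.invalid could be turned into a failing exit code, for example by setting process.exitCode = 1. Whether a single dead site should fail a build is a judgment call.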

The most likely reason a site comes back as invalid is that it used to be a Gatsby site, was later rewritten, and was never removed from the showcase. Now that a tool like this exists, that number can decrease further and stay fairly low. Maintaining a large site showcase is a big task, but a little automation and a few scripts can make it easier.
