TL;DR
I just released generate-sitemap 1.10.0, a GitHub Action for generating XML sitemaps for static websites. The Action is implemented in Python and generates an XML sitemap by crawling the GitHub repository containing the html of the site, using commit dates to generate the <lastmod> tags in the sitemap.
This release introduces an option to specify a list of directories and/or individual files to exclude from the sitemap. The Action already excluded individual html files automatically if the head of the page contained a noindex directive for robots, and it already applied exclusions based on the contents of the site's robots.txt. The new option lets you specify additional paths to exclude. The motivating case was a feature request from a user who wanted to exclude a directory of content shared across multiple pages (i.e., the pages that depend on that content should be in the sitemap, but not necessarily the shared html itself).
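To make the idea of path-based exclusion concrete, here is a minimal Python sketch. It is not the Action's actual code: the function name is_excluded, the sample paths, and the matching rule (a page is dropped if it equals an excluded file or sits anywhere under an excluded directory) are all assumptions chosen for illustration.

```python
from pathlib import PurePosixPath


def is_excluded(path: str, exclude_paths: list[str]) -> bool:
    """Hypothetical helper: True if path is an excluded file or lies
    inside an excluded directory (paths are relative to the site root)."""
    target = PurePosixPath("/" + path.lstrip("/"))
    for excluded in exclude_paths:
        prefix = PurePosixPath("/" + excluded.strip("/"))
        # Compare whole path components, so "/shared" does not
        # accidentally match "/sharedstuff/page.html".
        if target == prefix or prefix in target.parents:
            return True
    return False


# Made-up example site: one shared directory and one file to exclude.
pages = ["/index.html", "/shared/header.html", "/blog/post.html", "/drafts.html"]
excluded = ["/shared", "/drafts.html"]
kept = [p for p in pages if not is_excluded(p, excluded)]
print(kept)
```

Matching on whole path components rather than raw string prefixes is a design choice in this sketch; it keeps a directory exclusion from catching sibling paths that merely start with the same characters.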
Changelog 1.10.0 - 2023-11-15
Added
- Ability to specify list of paths to exclude from sitemap, via new input exclude-paths.
Dependencies
- Bump cicirello/pyaction from 4.25.0 to 4.26.0
More Information
Please consider starring generate-sitemap's GitHub repository:
cicirello / generate-sitemap
Generate an XML sitemap for a GitHub Pages site using GitHub Actions
generate-sitemap
Check out all of our GitHub Actions: https://actions.cicirello.org/
About
The generate-sitemap GitHub action generates a sitemap for a website hosted on GitHub Pages, and has the following features:
- Support for both xml and txt sitemaps (you choose using one of the action's inputs).
- When generating an xml sitemap, it uses the last commit date of each file to generate the <lastmod> tag in the sitemap entry. If the file was created during that workflow run, but not yet committed, then it instead uses the current date (however, we recommend if possible committing newly created files first).
- Supports URLs for html and pdf files in the sitemap, and has inputs to control the included file types (defaults include both html and pdf files in the sitemap).
- Now also supports including URLs for a user specified list of additional file extensions in the sitemap.
- …
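The <lastmod> behavior in the list above can be sketched roughly as follows. This is an illustrative approximation, not the Action's implementation: the helper names last_commit_date and url_entry and the example.com URL are hypothetical. The sketch asks git for the committer date of the most recent commit touching a file and falls back to the current time when no commit is found.

```python
import subprocess
from datetime import datetime, timezone
from xml.sax.saxutils import escape


def last_commit_date(path: str) -> str:
    """Hypothetical helper: ISO 8601 date of the last commit touching
    path, or the current UTC time if the file has no commit yet."""
    try:
        result = subprocess.run(
            ["git", "log", "-1", "--format=%cI", "--", path],
            capture_output=True, text=True, check=False,
        )
        stamp = result.stdout.strip()
    except OSError:
        # git is not installed or could not be started.
        stamp = ""
    if not stamp:
        stamp = datetime.now(timezone.utc).isoformat(timespec="seconds")
    return stamp


def url_entry(url: str, lastmod: str) -> str:
    """Format one sitemap <url> element with its <lastmod> tag."""
    return (
        "<url>\n"
        f"<loc>{escape(url)}</loc>\n"
        f"<lastmod>{lastmod}</lastmod>\n"
        "</url>"
    )


# Made-up URL and date, used only to show the output shape.
entry = url_entry("https://example.com/index.html", "2023-11-15T10:00:00-05:00")
print(entry)
```

In a real workflow you would pass the result of last_commit_date(path) as the lastmod argument for each crawled file.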
For more information, see my earlier post about generate-sitemap here on DEV, as well as its webpage.
Generate an XML Sitemap for a Static Website in GitHub Actions
Vincent A. Cicirello ・ Nov 23 '22
Where You Can Find Me
Follow me here on DEV and on GitHub:
Or visit my website: