I just moved this site's content over to a headless Ghost instance. As a part of that move, instead of processing Markdown, I'm retrieving raw HTML. I like that. It means fewer dependencies and a less complex build process.
But it also means I need to do a little more work in setting up syntax highlighting myself, rather than letting Remark do it for me.
Prism has two implementation options: process markup on a server, or do it with a client-side JavaScript package. As I was digging in, I was a little startled to find so many tutorials suggesting you just go with the latter: slap another script onto your site and call it good.
On one hand, I get it. It's certainly the quickest, simplest approach. But I'm a front-end performance stickler, and I wasn't satisfied with it.
The Downsides of Client-Side Prism.js
If you're keen on site performance, there are a couple of reasons you might avoid running Prism fully in the browser.
First, it'll impact your bundle size. Prism's client-side JavaScript package isn't huge, adding about 7kb of (gzipped) page weight.
Not too bad. But that alone won't cover every language you might need. You'll just get the defaults: markup, css, clike, and javascript.
To handle that, Prism offers an autoloader which will automatically load the necessary languages based on your code snippet markup. This is handy in accounting for the languages you eventually want to support, without loading all of them up front 100% of the time. (Don't even consider that route... it's ~2.7mb of non-gzipped JavaScript.)
Even so, those little bundles start to add up. If you have snippets from five different languages on a page, each of them needs to be independently lazily loaded, gradually increasing the amount of code being shipped to and executed within the browser.
For your typical blog post, this probably doesn't amount to anything substantive. But page weight isn't the only thing impacted.
Second, it'll cause a flash of un-formatted snippets. In order for your snippets to be styled correctly, Prism needs to transform your code into a particular form of markup. For example, this:
<pre>
<code class="language-js">
const greeting = 'Hello, world!';
function sayGreeting() {
  console.log(greeting);
}
</code>
</pre>
... is transformed into this:
<pre class="language-js">
<code class="language-js">
<span class="token keyword">const</span> greeting <span class="token operator">=</span> <span class="token string">'Hello, world!'</span><span class="token punctuation">;</span>
<span class="token keyword">function</span> <span class="token function">sayGreeting</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">{</span>
  <span class="token console class-name">console</span><span class="token punctuation">.</span><span class="token method function property-access">log</span><span class="token punctuation">(</span>greeting<span class="token punctuation">)</span><span class="token punctuation">;</span>
<span class="token punctuation">}</span>
</code>
</pre>
But it takes work to download, parse, and execute that code, and then let the browser paint the result to the screen. If your readers are on a slow connection, it can lead to some ugly layout shift.
And that problem only compounds when your content has several snippets rendering at once.
Run It on the Server Instead
Prism's support for processing markup in Node is right on the front page of its documentation. It's straightforward to set up, and simple to load any language you need to support. Here's how you'd parse a single PHP snippet, for example.
const Prism = require("prismjs");
const loadLanguages = require("prismjs/components/index");
// Load all languages.
loadLanguages();
// Pass Prism the raw code only, without the surrounding <pre><code> tags.
const snippet = `
$greeting = 'Hello, world!';
echo $greeting;
`;
// Returns the snippet as a string of token <span> markup.
const html = Prism.highlight(snippet, Prism.languages.php, 'php');
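One thing to plan for: in recent versions of Prism, as far as I can tell, Prism.highlight() throws when it's handed a language whose grammar hasn't been loaded (a typo in a class name is enough). Here's a minimal sketch of a wrapper that falls back to plain escaped text instead. The names safeHighlight and escapeHtml are my own, not part of Prism, and the Prism instance is passed in as an argument, so treat this as a starting point rather than a definitive implementation.

```javascript
// Hypothetical helper (not part of Prism): escape raw code for safe HTML output.
function escapeHtml(code) {
  return code
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Hypothetical wrapper: highlight when the grammar is loaded, otherwise
// return the escaped code as-is so one bad snippet doesn't break the build.
function safeHighlight(prism, code, language) {
  const grammar = prism.languages[language];

  if (!grammar) {
    return escapeHtml(code);
  }

  return prism.highlight(code, grammar, language);
}

// Usage, assuming `Prism` was required and languages were loaded as above:
// const html = safeHighlight(Prism, "echo $greeting;", "php");
```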
But you're probably receiving the entirety of a page's content at once, and in order to process snippets embedded within a bunch of other text, you need to get a little more creative.
Processing Code Snippets within Text Content
Depending on your preferences, you've got a couple of options.
Option #1: RegEx & a Replacement Function
If you're committed to keeping your dependencies as minimal as possible, you could run all of your content through a .replace() callback, relying on a regular expression to extract & process snippets. I spent way too much time tinkering with this, and this pattern appears to work reliably.
const Prism = require("prismjs");
const loadLanguages = require("prismjs/components/index");
// Load all languages.
loadLanguages();
const processedContent = content.replace(
  /(<pre>\n\s*<code class="language-(.*)">)([\s\S]*?)(<\/code>\n\s*<\/pre>)/g,
  (_wrapper, openingTags, language, codeSnippet, closingTags) => {
    const snippet = Prism.highlight(
      codeSnippet,
      Prism.languages[language],
      language
    );
    return `${openingTags}${snippet}${closingTags}`;
  }
);
Using that pattern, each snippet is parsed based on the language noted in the class, represented by language-(.*).
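A caveat with this approach: the text captured between the tags is still HTML, so characters like < and & typically arrive as entities (&lt; and &amp;). Prism escapes its output again, so passing entities straight through can leave you with double-escaped code on the page. Here's a rough sketch of decoding before highlighting. decodeEntities and highlightSnippets are hypothetical names of my own, the entity list only covers the common cases, and the highlighter is passed in as a function so you can plug in the Prism call from above.

```javascript
// Hypothetical helper: decode the handful of entities that typically show up
// in escaped code. Not exhaustive; extend it if your content uses others.
function decodeEntities(html) {
  return html
    .replace(/&lt;/g, "<")
    .replace(/&gt;/g, ">")
    .replace(/&quot;/g, '"')
    .replace(/&#0?39;/g, "'")
    .replace(/&amp;/g, "&"); // Last, so "&amp;lt;" becomes "&lt;", not "<".
}

// Same pattern as above, with the highlighter injected as highlight(code, language).
function highlightSnippets(content, highlight) {
  return content.replace(
    /(<pre>\n\s*<code class="language-(.*)">)([\s\S]*?)(<\/code>\n\s*<\/pre>)/g,
    (_wrapper, openingTags, language, codeSnippet, closingTags) =>
      `${openingTags}${highlight(decodeEntities(codeSnippet), language)}${closingTags}`
  );
}

// Usage, assuming `Prism` and `content` are set up as in the snippet above:
// const processedContent = highlightSnippets(content, (code, language) =>
//   Prism.highlight(code, Prism.languages[language], language)
// );
```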
But many of us have scars from working with regular expressions that appear to work reliably, and then unexpectedly bite us. So, there's a more predictable approach you could leverage as well, so long as you're running in a Node environment that supports it.
Option #2: JSDOM
If a RegEx feels too risky, JSDOM is also an option, allowing you to manipulate snippets as though they were in the browser DOM.
const Prism = require("prismjs");
const loadLanguages = require("prismjs/components/index");
const JSDOM = require("jsdom").JSDOM;
// Load all languages.
loadLanguages();
const content = getContentFromWherever();
// Pop markup into a JSDOM instance.
const dom = new JSDOM(content);
// Query for every code snippet.
const codeBlocks = dom.window.document.querySelectorAll("pre code");
// Parse each snippet with Prism.
codeBlocks.forEach((block) => {
  // Identify the language of the code block.
  const language = block.classList[0].replace("language-", "");
  // Add language class to parent <pre> tag.
  block.parentElement.classList.add(`language-${language}`);
  // Extract the code to be processed.
  const code = block.textContent;
  // Process the code according to the specific language.
  const html = Prism.highlight(
    code,
    Prism.languages[language],
    language
  );
  // Replace the block's contents with the processed snippet.
  block.innerHTML = html;
});
// Spit out the result.
const processedContent = dom.window.document.body.innerHTML;
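The classList[0] lookup above assumes the language class always comes first, and that every code element has one. If your content doesn't guarantee that, here's a small sketch of a more defensive lookup. getLanguage is a hypothetical helper of my own, and it only relies on the element having a className string, which JSDOM elements provide.

```javascript
// Hypothetical helper: pull the language out of a class attribute such as
// "line-numbers language-php", wherever the language class appears.
// Returns null when no language class is present.
function getLanguage(element) {
  const match = /(?:^|\s)language-([\w-]+)/.exec(element.className);

  return match ? match[1] : null;
}

// Usage inside the forEach callback above:
// const language = getLanguage(block);
// if (!language) return; // Leave unlabeled snippets untouched.
```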
No matter which of these you choose, you'll get the key benefits: the ability to leverage all the languages you like with no performance cost to the user, and no flash of unstyled snippets. All that's left for the browser is a small amount of CSS for styling.
You Make the Call
It's not always going to be prudent to set up Prism server-side. Maybe your server or site's build process doesn't rely on Node (for example, if your site runs WordPress or Hugo). Or, maybe your back end is just too opaque and/or complicated to warrant investing in such a change. That's fine.
But definitely consider it if you're able. It's a small thing you can do to keep your site's user experience as optimal as possible.