As always, the full solution is available on Github. Scroll to the bottom of the article for the link.
Introduction to CSP
As a reminder, CSP stands for Content Security Policy: a security standard that helps prevent cross-site scripting (XSS), clickjacking, and other code injection attacks by controlling what resources a user agent is allowed to load for a given page using the Content-Security-Policy header.
Perhaps the most critical directive of CSP is the script-src directive, which controls what JS code the browser can load and execute.
The highest level of protection is achieved by using the so-called 'strict' CSP. If you want to know more about strict vs non-strict CSPs, including examples and rationales, please visit these excellent resources:
There are two methods for implementing a strict CSP: nonces or hashes.
Defining the requirements
Let's first list the features of the ideal CSP solution to help us pick the right approach:
1. It should provide the highest level of security (ie: strict CSP)
2. It should work for both author and publish instances (ie: it should not rely on a publish-only dispatcher)
3. It should be easy to maintain
4. It should be cacheable
5. It should work for <script> tags that:
   i. point to internal clientlibs
   ii. contain inline JS
   iii. point to external clientlibs (eg: external analytics script)
Hashes vs Nonce-nse
While there are certainly cases where the nonce approach makes sense, in my opinion, hashes are a superior solution for most CMS use cases. That's because using nonces presents the following disadvantages:
- Because it requires the HTML document to contain nonces that are unique to each request, it makes the result impossible to cache and fails requirement 4. Since caching is a critical performance optimization, this is usually disqualifying.
- It's not a useful mechanism for protection against untrusted external scripts, so it fails requirement 5.iii. This kind of protection would require an integrity check which is, you guessed it, a hash (which we can re-use for our CSP!)
Hashes, by comparison, can be cached because they are specific to the script content, which typically only changes with a software release. They can also be used to validate the integrity of scripts from untrusted sources.
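To make the caching argument concrete, here is a minimal sketch (plain JDK, not part of the solution on Github) that contrasts the two kinds of CSP source. The class and method names are made up for illustration, and SHA-256 is an assumption; any algorithm supported by CSP would behave the same way.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.security.SecureRandom;
import java.util.Base64;

// Hypothetical helper, for illustration only.
final class NonceVsHash {
    private static final SecureRandom RANDOM = new SecureRandom();

    // A nonce has to be unpredictable and unique per response, so it is random.
    static String newNonce() {
        final byte[] bytes = new byte[16];
        RANDOM.nextBytes(bytes);
        return "'nonce-" + Base64.getEncoder().encodeToString(bytes) + "'";
    }

    // A hash source depends only on the script content, so it is stable across requests.
    static String hashSource(final String scriptContent) {
        try {
            final MessageDigest digest = MessageDigest.getInstance("SHA-256");
            final byte[] hash = digest.digest(scriptContent.getBytes(StandardCharsets.UTF_8));
            return "'sha256-" + Base64.getEncoder().encodeToString(hash) + "'";
        } catch (final NoSuchAlgorithmException e) {
            // SHA-256 is mandatory on every Java platform, so this should not happen
            throw new IllegalStateException(e);
        }
    }

    public static void main(final String[] args) {
        final String script = "console.log('Hello, World!');";
        // Two calls produce two different nonces: the page cannot be cached with them.
        System.out.println(newNonce() + " vs " + newNonce());
        // Two calls produce the same hash: the page and its header can be cached.
        System.out.println(hashSource(script) + " vs " + hashSource(script));
    }
}
```

As long as the script content does not change, the hash source stays identical, which is what allows the HTML and its CSP header to be cached together.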
Therefore, we can conclude that a hash-based CSP is the best solution for most AEM use cases.
If the above rationale doesn't apply to your use case, then this Medium article by Saravana Prakash can show you how to achieve a nonce-based CSP in AEM.
Solution design
Now that we know what approach to take, let's design the solution.
Where do scripts come from anyway?
There are typically 3 ways in which <script> tags are added to a page's HTML:
1. Added directly via HTL files that make up the Page component (eg: customfooterlibs.html or customheaderlibs.html)
2. Added indirectly via the Page Policy
3. Added as dependencies to clientlibs defined in points 1 and 2.
So the sequence of events should be:
- Let AEM add all the <script> tags in the page HTML
- For each <script> tag:
  - Calculate the hash using one of the following:
    - The inline content of the script
    - The content of the referenced clientlib
    - The integrity attribute of the untrusted script
  - Add it to the tag
  - Add it to the CSP header
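The "calculate the hash" step boils down to a small decision. Here is a rough sketch of it in plain Java; the enum and method names are hypothetical and only mirror the three cases listed above, they are not taken from the actual solution.

```java
import java.util.Optional;

// Hypothetical names, for illustration only.
final class HashSourceSelector {

    enum HashSource { INLINE_CONTENT, CLIENTLIB_CONTENT, INTEGRITY_ATTRIBUTE }

    /**
     * @param src         value of the src attribute, or null for an inline script
     * @param isClientlib whether src resolves to a clientlib served by AEM
     * @param integrity   value of the integrity attribute, or null if absent
     * @return where the hash should come from, or empty if the script cannot be trusted
     */
    static Optional<HashSource> select(final String src, final boolean isClientlib, final String integrity) {
        if (src == null) {
            return Optional.of(HashSource.INLINE_CONTENT);
        }
        if (isClientlib) {
            return Optional.of(HashSource.CLIENTLIB_CONTENT);
        }
        if (integrity != null) {
            return Optional.of(HashSource.INTEGRITY_ATTRIBUTE);
        }
        // External script without an integrity attribute: no hash, so the browser will block it
        return Optional.empty();
    }
}
```

Returning an empty result for an external script without an integrity attribute is a deliberate choice in this sketch: such a script never makes it into the CSP.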
Transformers to the rescue!
Unfortunately, I'm not talking about Optimus Prime, but rather a SAX output pipeline that will use a Transformer to add the CSP hashes to the <script> tags on the HTML page.
Implementation
In this section I will highlight the most important parts of the solution. For a complete solution, see the Github link at the bottom of the article.
Creating the transformer
The transformer handles the following cases:
Inline scripts
Example:
<script>
console.log('Hello, World!');
</script>
If the <script> tag is inline, the hash can be calculated using the innerText of the element.
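One detail worth keeping in mind: the hash has to cover the exact text between the tags, whitespace included, because that is what the browser hashes. The sketch below (plain JDK, hypothetical class name, SHA-256 assumed) shows that trimming the text yields a different hash.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Base64;

// Hypothetical helper, for illustration only.
final class InlineScriptHash {

    // Returns a CSP hash value such as "sha256-<base64 digest>" for the given text
    static String sha256Source(final String text) {
        try {
            final MessageDigest digest = MessageDigest.getInstance("SHA-256");
            final byte[] hash = digest.digest(text.getBytes(StandardCharsets.UTF_8));
            return "sha256-" + Base64.getEncoder().encodeToString(hash);
        } catch (final NoSuchAlgorithmException e) {
            // SHA-256 is mandatory on every Java platform, so this should not happen
            throw new IllegalStateException(e);
        }
    }

    public static void main(final String[] args) {
        // The inner text of the example above, including the surrounding line breaks
        final String asWritten = "\nconsole.log('Hello, World!');\n";
        System.out.println(sha256Source(asWritten));
        // Trimming changes the bytes and therefore the hash, so the browser would block the script
        System.out.println(sha256Source(asWritten.trim()));
    }
}
```

This is why the transformer appends the character data as-is instead of normalizing it.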
Clientlib scripts
Example:
<script src="/etc.clientlibs/demo/clientlibs/clientlib-site.min.js"></script>
If the <script> tag points to a clientlib served from AEM, the hash can be calculated using the content of the clientlib.
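Before the content can be read, the proxied URL has to be mapped back to the clientlib's location in the repository. Here is a standalone sketch of that mapping. It is a slightly stricter variant of what the transformer does (it only accepts paths that start with /etc.clientlibs/), the class name is made up, and it assumes the clientlib lives under /apps.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.Optional;

// Hypothetical helper, for illustration only.
final class ClientlibPathMapper {
    private static final String PROXY_PREFIX = "/etc.clientlibs/";

    // Maps a proxied clientlib URL to its assumed /apps path, or empty if src is not a proxied clientlib
    static Optional<String> toAppsPath(final String src) {
        final String path;
        try {
            path = new URI(src).getPath();
        } catch (final URISyntaxException e) {
            return Optional.empty();
        }
        if (path == null || !path.startsWith(PROXY_PREFIX)) {
            return Optional.empty();
        }
        String mapped = "/apps/" + path.substring(PROXY_PREFIX.length());
        // Strip the extension so that the path points at the clientlib folder
        if (mapped.endsWith(".min.js")) {
            mapped = mapped.substring(0, mapped.length() - ".min.js".length());
        } else if (mapped.endsWith(".js")) {
            mapped = mapped.substring(0, mapped.length() - ".js".length());
        }
        return Optional.of(mapped);
    }

    public static void main(final String[] args) {
        // Prints Optional[/apps/demo/clientlibs/clientlib-site]
        System.out.println(toAppsPath("/etc.clientlibs/demo/clientlibs/clientlib-site.min.js"));
        // Prints Optional.empty because the script is not served through the clientlib proxy
        System.out.println(toAppsPath("https://cdn.jsdelivr.net/npm/bootstrap@5.3.3/dist/js/bootstrap.bundle.min.js"));
    }
}
```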
External scripts
Example:
<script src="https://cdn.jsdelivr.net/npm/bootstrap@5.3.3/dist/js/bootstrap.bundle.min.js" integrity="sha384-YvpcrYf0tY3lHB60NNkmXc5s9fDVZLESaAA55NDzOxhy9GkcIdslK1eN7N6jIeHz" crossorigin="anonymous"></script>
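If the <script> tag points to an external script, no hash is calculated at all: the value of the integrity attribute is re-used for the CSP. The sketch below shows one way to turn such a value into CSP sources. Note that it goes a bit further than the transformer, which re-uses the attribute value as-is: the validation regex and the handling of several space-separated hashes are my own assumptions, and the class name is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper, for illustration only.
final class IntegrityToCsp {
    // An algorithm prefix followed by a base64 digest
    private static final Pattern SRI_HASH = Pattern.compile("(sha256|sha384|sha512)-[A-Za-z0-9+/]+={0,2}");

    // Converts an integrity attribute value into quoted CSP hash sources, skipping malformed tokens
    static List<String> toCspSources(final String integrity) {
        final List<String> sources = new ArrayList<>();
        if (integrity == null) {
            return sources;
        }
        for (final String token : integrity.trim().split("\\s+")) {
            if (SRI_HASH.matcher(token).matches()) {
                sources.add("'" + token + "'");
            }
        }
        return sources;
    }

    public static void main(final String[] args) {
        final String integrity = "sha384-YvpcrYf0tY3lHB60NNkmXc5s9fDVZLESaAA55NDzOxhy9GkcIdslK1eN7N6jIeHz";
        // Prints ['sha384-YvpcrYf0tY3lHB60NNkmXc5s9fDVZLESaAA55NDzOxhy9GkcIdslK1eN7N6jIeHz']
        System.out.println(toCspSources(integrity));
    }
}
```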
Here is the code for the transformer. I've been very generous with the comments to explain the rationale behind each step. This snippet refers to some POJOs and configuration that you can see in the Github diff at the end of the article.
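import java.io.IOException;
import java.io.InputStream;
import java.net.URI;
import java.net.URISyntaxException;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Base64;

import com.adobe.granite.ui.clientlibs.HtmlLibrary;
import com.adobe.granite.ui.clientlibs.HtmlLibraryManager;
import com.adobe.granite.ui.clientlibs.LibraryType;
import lombok.Getter;
import lombok.RequiredArgsConstructor;
import lombok.extern.slf4j.Slf4j;
import org.apache.sling.api.SlingHttpServletRequest;
import org.apache.sling.api.SlingHttpServletResponse;
import org.apache.sling.api.resource.NonExistingResource;
import org.apache.sling.api.resource.Resource;
import org.apache.sling.rewriter.DefaultTransformer;
import org.apache.sling.rewriter.ProcessingComponentConfiguration;
import org.apache.sling.rewriter.ProcessingContext;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;

// ContentSecurityPolicy and TransformerElement are the POJOs from the Github diff
// (assumed here to live in the same package as this class).
@RequiredArgsConstructor
@Slf4j
public class CspHashTransformer extends DefaultTransformer {

    @Getter
    private final String hashingAlgorithm;
    private final HtmlLibraryManager htmlLibraryManager;

    private SlingHttpServletRequest request;
    private SlingHttpServletResponse response;
    private ContentSecurityPolicy csp;
    private TransformerElement currentElement;

    @Override
    public void init(final ProcessingContext context, final ProcessingComponentConfiguration config) throws IOException {
        super.init(context, config);
        request = context.getRequest();
        response = context.getResponse();
        csp = new ContentSecurityPolicy();
        // We initialize the CSP with a strict-dynamic directive to allow for our trusted scripts to
        // load other scripts without being blocked by the browser.
        csp.addScriptSrcElem("'strict-dynamic'");
    }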

    /**
     * Process the start of an element. If the element has a src attribute pointing to a clientlib, calculate the hash
     * and add it to the element as an integrity attribute and to the CSP header.
     *
     * @param namespaceUri  the namespace URI of the element
     * @param localName     the local name of the element
     * @param qualifiedName the qualified name of the element
     * @param attributes    the attributes of the element
     * @throws SAXException if an error occurs during processing
     */
    @Override
    public void startElement(final String namespaceUri, final String localName,
                             final String qualifiedName, final Attributes attributes) throws SAXException {
        currentElement = new TransformerElement(namespaceUri, localName, qualifiedName, attributes);
        log.debug("Start processing element {}", currentElement);
        addIntegrityAttributeAndCspForSrc();
        super.startElement(currentElement.namespaceUri(), currentElement.localName(), currentElement.qualifiedName(), currentElement.attributes());
    }

    /**
     * Called by the SAX parser when it encounters character data. Used to append the character data to the inner text
     * of the current element.
     *
     * @param ch     the character array being read
     * @param start  the start index of the character array
     * @param length the length of the character array
     * @throws SAXException if an error occurs during processing
     */
    @Override
    public void characters(final char[] ch, final int start, final int length) throws SAXException {
        if (currentElement != null) {
            currentElement.innerText().append(ch, start, length);
        }
        super.characters(ch, start, length);
    }

    /**
     * Process the end of an element. If the element has inner text, calculate the hash and add it to the CSP header.
     *
     * @param namespaceUri  the namespace URI of the element
     * @param localName     the local name of the element
     * @param qualifiedName the qualified name of the element
     * @throws SAXException if an error occurs during processing
     */
    @Override
    public void endElement(final String namespaceUri, final String localName,
                           final String qualifiedName) throws SAXException {
        if (currentElement != null) {
            log.debug("End processing element {}", currentElement);
            addCspForInnerText();
            // Reset so that character data following this element is not appended to it
            currentElement = null;
        }
        // Always forward the event, otherwise the closing tag would be dropped from the output
        super.endElement(namespaceUri, localName, qualifiedName);
    }

    private void addIntegrityAttributeAndCspForSrc() {
        // Get the source of the script
        final String src = currentElement.attributes().getValue("src");
        if (src == null) {
            log.debug("No src attribute found for element {}", currentElement);
            return;
        }

        // Attempt to find a clientlib associated with the src attribute
        final HtmlLibrary clientlib = getHtmlLibrary(src);
        // Attempt to read the integrity attribute
        final String integrity = currentElement.attributes().getValue("integrity");
        // If no clientlib can be found using the src, then assume the src is external
        final boolean isExternal = clientlib == null;

        // For security reasons, we consider that an external script without an integrity attribute is invalid. It will
        // not be added to the CSP and therefore will fail to load/execute in the browser.
        if (isExternal && integrity == null) {
            log.error("Integrity attribute missing from external src <{}>. Hash cannot be calculated.", src);
            return;
        }

        // Re-use the integrity hash if possible, else calculate the hash from the clientlib content
        final String hash = isExternal
                ? integrity
                : getHashFromClientlib(clientlib);

        // If no hash can be calculated, then the script will not be added to the CSP and therefore will fail to load
        if (hash == null) {
            log.debug("No clientlib or external hash found for src <{}>. Hash cannot be calculated.", src);
            return;
        }

        // For internal scripts, add the integrity attribute containing the hash. Security-wise this does not provide
        // any benefit as the CSP will already enforce the hash, but it is good practice to include it so that you can
        // easily identify which script corresponds to which hash for debugging purposes.
        if (!isExternal) {
            addIntegrityAttribute(hash);
        }

        // Finally, add the hash to the CSP
        addHashToCsp(hash);
    }

    private String getHashFromClientlib(final HtmlLibrary clientlib) {
        try (final InputStream inputStream = clientlib.getInputStream(true)) {
            final String hash = calculateHashAndEncodeBase64(inputStream);
            log.debug("Hash for <{}>: <{}>", clientlib.getPath(), hash);
            return hash;
        } catch (final IOException e) {
            log.error("Could not read clientlib <{}>", clientlib.getPath(), e);
            return null;
        }
    }

    private void addCspForInnerText() {
        final String innerText = currentElement.innerText().toString();
        if (innerText.isEmpty()) {
            log.debug("Element {} has no inner text", currentElement);
            return;
        }
        final String hash = calculateHashAndEncodeBase64(innerText);
        addHashToCsp(hash);
    }

    /**
     * Adds the hash as an integrity attribute to the current element.
     *
     * @param hash the hash to add
     */
    private void addIntegrityAttribute(final String hash) {
        final AttributesImpl attributes = new AttributesImpl(currentElement.attributes());
        // The fourth argument is the SAX attribute type; "CDATA" is the type for plain string values
        attributes.addAttribute(currentElement.namespaceUri(), "integrity", "integrity", "CDATA", hash);
        currentElement.attributes(attributes);
    }

    /**
     * Adds the hash to the Content-Security-Policy header.
     *
     * @param hash the hash to add
     */
    private void addHashToCsp(final String hash) {
        csp.addScriptSrcElem("'" + hash + "'");
        response.setHeader("Content-Security-Policy", csp.toString());
    }

    /**
     * Find the clientlib associated with the src attribute if such a clientlib exists.
     *
     * @param src the src attribute of the element
     * @return the clientlib associated with the src attribute, or null if no such clientlib exists
     */
    private HtmlLibrary getHtmlLibrary(final String src) {
        final String path;
        try {
            path = new URI(src).getPath();
        } catch (final URISyntaxException e) {
            log.error("src attribute of element {} is not a valid URI", currentElement, e);
            return null;
        }
        // Find true path of clientlib in /apps (or /libs, via overlay)
        final String appsPath = path
                .replace("etc.clientlibs", "apps")
                .replace(".min.js", "");
        final Resource resource = request.getResourceResolver().resolve(appsPath);
        if (resource instanceof NonExistingResource) {
            log.error("Could not find resource using path <{}>", appsPath);
            return null;
        }
        return htmlLibraryManager.getLibrary(LibraryType.JS, resource.getPath());
    }

    private String calculateHashAndEncodeBase64(final InputStream inputStream) {
        try {
            final MessageDigest digest = MessageDigest.getInstance(hashingAlgorithm);
            // Read the stream in chunks so that large clientlibs are not loaded into memory at once
            final byte[] buffer = new byte[4096];
            int bytesRead;
            while ((bytesRead = inputStream.read(buffer)) != -1) {
                digest.update(buffer, 0, bytesRead);
            }
            final byte[] hash = digest.digest();
            final String hashString = Base64.getEncoder().encodeToString(hash);
            return hashAndAlgorithm(hashString);
        } catch (final IOException e) {
            log.error("Error reading input stream for hashing", e);
            return null;
        } catch (final NoSuchAlgorithmException e) {
            log.error("Hashing algorithm not found", e);
            return null;
        }
    }

    private String calculateHashAndEncodeBase64(final String string) {
        try {
            final MessageDigest digest = MessageDigest.getInstance(hashingAlgorithm);
            final byte[] hash = digest.digest(string.getBytes(StandardCharsets.UTF_8));
            final String hashString = Base64.getEncoder().encodeToString(hash);
            return hashAndAlgorithm(hashString);
        } catch (final NoSuchAlgorithmException e) {
            log.error("Hashing algorithm not found", e);
            return null;
        }
    }

    // Converts e.g. "SHA-256" and a base64 digest into the CSP format "sha256-<digest>"
    private String hashAndAlgorithm(final String hash) {
        return hashingAlgorithm.toLowerCase().replace("-", "") + "-" + hash;
    }
}
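The transformer relies on a ContentSecurityPolicy POJO that is only shown in the Github diff. To give you an idea of what such a class needs to do, here is a hypothetical minimal version. It is a sketch based on how the transformer uses it (addScriptSrcElem and toString), not the actual class from the solution, hence the different name.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical stand-in for the ContentSecurityPolicy POJO, for illustration only.
final class ContentSecurityPolicySketch {
    // A LinkedHashSet keeps the insertion order and ignores duplicate sources
    private final Set<String> scriptSrcElem = new LinkedHashSet<>();

    void addScriptSrcElem(final String source) {
        scriptSrcElem.add(source);
    }

    // Renders the value of the Content-Security-Policy header
    @Override
    public String toString() {
        if (scriptSrcElem.isEmpty()) {
            return "";
        }
        return "script-src-elem " + String.join(" ", scriptSrcElem) + ";";
    }

    public static void main(final String[] args) {
        final ContentSecurityPolicySketch csp = new ContentSecurityPolicySketch();
        csp.addScriptSrcElem("'strict-dynamic'");
        csp.addScriptSrcElem("'sha256-47DEQpj8HBSa+/TImW+5JCeuQeRkm5NMpJWZG3hSuFU='");
        // Prints script-src-elem 'strict-dynamic' 'sha256-47DEQpj8HBSa+/TImW+5JCeuQeRkm5NMpJWZG3hSuFU=';
        System.out.println(csp);
    }
}
```

Using a set means that a clientlib included twice on the same page only contributes its hash once.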
Adding the transformer to the pipeline
To create a pipeline that includes our transformer, we need to create a Rewriter by adding a node at /apps/demo/config/rewriter/links-pipeline with the following properties:
<?xml version="1.0" encoding="UTF-8"?>
<jcr:root xmlns:jcr="http://www.jcp.org/jcr/1.0" xmlns:nt="http://www.jcp.org/jcr/nt/1.0"
          jcr:primaryType="nt:unstructured"
          contentTypes="text/html"
          generatorType="htmlparser"
          order="1"
          paths="[/content]"
          serializerType="htmlwriter"
          transformerTypes="[csp-hash-transformer]">
    <generator-htmlparser
            jcr:primaryType="nt:unstructured"
            includeTags="[SCRIPT]"/>
</jcr:root>
The result
Now, if we load scripts onto our page using customfooterlibs.html:
<!-- This HTL include demonstrates the loading of a clientlib -->
<sly data-sly-use.clientlib="core/wcm/components/commons/v1/templates/clientlib.html">
    <sly data-sly-call="${clientlib.js @ categories='demo.base', async=true}"/>
</sly>
<!-- This script element demonstrates the loading of an external script -->
<script src="https://cdn.jsdelivr.net/npm/bootstrap@5.3.3/dist/js/bootstrap.bundle.min.js"
integrity="sha384-YvpcrYf0tY3lHB60NNkmXc5s9fDVZLESaAA55NDzOxhy9GkcIdslK1eN7N6jIeHz"
crossorigin="anonymous"></script>
<!-- This script element demonstrates the inline script hashing -->
<script>
    console.log('This was logged from an inline script!');
</script>
<!-- This inline script loads an external script to demonstrate the 'dynamic' principle -->
<script>
    const script = document.createElement('script');
    script.src = 'https://cdn.jsdelivr.net/npm/jquery@3.7.1/dist/jquery.min.js';
    document.body.appendChild(script);
    addEventListener("load", (event) => console.log('jQuery version:',$().jquery));
</script>
We should receive a CSP like this (yours will vary depending on your clientlibs/dependencies):
Content-Security-Policy: script-src-elem 'strict-dynamic' 'sha256-47DEQpj8HBSa+/TImW+5JCeuQeRkm5NMpJWZG3hSuFU=' 'sha256-XjA+iLg5j0FvhFkZc7LcXfbQJ0b3gvw2c2jj9vv65q0=' 'sha256-Uv8dzRTPRZ2++L/ZWgfN9lPjdvzsDYVS4rEfvWCA0x0=' 'sha256-3dfW5u+XJXRPqcC3F8wewmnAr6oxejxP7ArjOE38P2Q=' 'sha256-5hrKOpQWBa1NuajxV3udxJCgNMQMD/lUApbmGxMmpuM=' 'sha256-wlCSQBL9yeqVFrMGUIlSAc0Wfb1JydFIkk8wiBq/o5M=' 'sha256-WJ3od+zqoblT5apcuXdUh4o1UWwVnb5AjQhmHWIu2OY=' 'sha384-YvpcrYf0tY3lHB60NNkmXc5s9fDVZLESaAA55NDzOxhy9GkcIdslK1eN7N6jIeHz' 'sha256-QFYsdZ/eGhCq89XHZ7IOsy5A9dTKeyvduAx2RnCqvAA=' 'sha256-Fb8sXaPGzkkQJOoKLIpn0I5s+VOOiBlZMYGgv8wHxZI=' 'sha256-g2cCN9gX44Hp5lFL/iomg3hI3LeG/LRkzeNJfQZjJGI=';
And you should see that the demonstration scripts have executed as expected in the browser console:
This was logged from an inline script!
jQuery version: 3.7.1
What about the other CSP directives?
Good question! There are indeed dozens of other directives you can use to fine-tune your CSP.
This article will not give you a comprehensive strategy for dealing with all your CSP requirements. It only shows you how to automate the configuration of the script-src-elem directive.
Thankfully, using multiple CSP headers is a valid approach, so you can add the rest of the directives anywhere you like as additional CSP headers. Just make sure you understand how multiple CSP headers interact with each other to avoid surprises.
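The key rule to remember is that the browser enforces every policy it receives: a resource has to be allowed by all of them, so an additional header can only make things stricter, never looser. Here is a deliberately simplified model of that rule in plain Java. The class name and the representation of a policy as a map are my own assumptions, and the model ignores fallbacks between directives (for example to script-src or default-src).

```java
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical, simplified model of how multiple CSP headers combine. For illustration only.
final class MultiplePolicies {

    /**
     * @param directive the directive being checked, e.g. "script-src-elem"
     * @param source    the source expression being checked, e.g. "'sha256-abc'"
     * @param policies  one map per CSP header, from directive name to its allowed sources
     * @return true only if every policy that declares the directive allows the source
     */
    static boolean isAllowed(final String directive, final String source,
                             final List<Map<String, Set<String>>> policies) {
        for (final Map<String, Set<String>> policy : policies) {
            final Set<String> allowed = policy.get(directive);
            // A policy that does not declare the directive does not restrict it (in this model)
            if (allowed != null && !allowed.contains(source)) {
                return false;
            }
        }
        return true;
    }

    public static void main(final String[] args) {
        final Map<String, Set<String>> generated = Map.of("script-src-elem", Set.of("'sha256-abc'"));
        final Map<String, Set<String>> manual = Map.of("img-src", Set.of("'self'"));
        final Map<String, Set<String>> conflicting = Map.of("script-src-elem", Set.of("'sha256-xyz'"));

        // Prints true: the second header does not talk about scripts at all
        System.out.println(isAllowed("script-src-elem", "'sha256-abc'", List.of(generated, manual)));
        // Prints false: the second header declares the same directive without our hash
        System.out.println(isAllowed("script-src-elem", "'sha256-abc'", List.of(generated, conflicting)));
    }
}
```

In practice this means the additional headers should avoid declaring script directives themselves, otherwise they may block the scripts that the transformer just allowed.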
Conclusion
As promised, all the code is available in one easy-to-read diff on Github. You can find it here.
If you have any comments or ideas about this article, the topic matter or the format, don't hesitate to leave a comment or to reach out to me on LinkedIn!