Ricardo Iván Vieitez Parra for Apeleg Limited

Originally published at apeleg.com

Modern and robust hotlink protection in 2022

What a hotlink is

Hotlinking refers to the practice of third-party web properties loading resources (most commonly images) directly from your own server.

For example, if you operate the website yourbusiness.example, you may have an image at https://yourbusiness.example/infographic.png that you use within your website. A hotlink is when an unrelated property (for example, the site anotherbusiness.test) embeds that image directly on their website by reference to your server, for example using the following HTML code:

<img
    src=https://yourbusiness.example/infographic.png
    alt="Widget purchases per capita"
>

Unauthorised hotlinks are generally undesirable, not only because they can facilitate reproducing your content without permission but also because, since the resources are being loaded directly from your server, they can burden you with additional server costs in computing and bandwidth.

TL;DR: If your website and all your resources are in the same domain, add the Cross-Origin-Resource-Policy: same-site response header to your resources. If you use a CDN or serve some resources from an external domain, add the Cross-Origin-Resource-Policy: same-origin and Access-Control-Allow-Origin: https://yourbusiness.example response headers to your (external) resources and force a CORS request by using the crossorigin attribute.

Older approach to hotlink protection

Hotlink protection has historically relied on the HTTP Referer header, which indicates the source of the document loading the resource.

When a request is made to https://yourbusiness.example/infographic.png from your website, this referrer header would look something like Referer: https://yourbusiness.example/statistics.html, whereas when it's embedded in a third-party website, it might look like Referer: https://anotherbusiness.test/widgets-consumption/.

This approach is usually implemented in various webservers as follows.

Example configuration for Referer-based protection

Apache: .htaccess

RewriteEngine on
RewriteCond %{HTTP_REFERER} !^$
RewriteCond %{HTTP_REFERER} !^https?://(.+\.)?yourbusiness\.example(/|$) [NC]
RewriteRule \.(jpe?g|png|gif)$ - [NC,F,L]

nginx

location ~ \.(jpe?g|png|gif)$ {
    valid_referers
        none blocked
        server_names
            *.yourbusiness.example
            yourbusiness.example;

    if ($invalid_referer) {
        return 403;
    }
}

Microsoft IIS

<rule
    name="Hotlinking Prevention"
    stopProcessing="true"
>
    <match
        url=".*\.(jpe?g|png|gif)$"
    />
    <conditions>
        <add
            input="{HTTP_REFERER}"
            pattern="^$"
            negate="true" />
        <add
            input="{HTTP_REFERER}"
            pattern="^https?://(.+\.)?yourbusiness\.example/.*$"
            negate="true" />
    </conditions>
    <action
        type="CustomResponse"
        statusCode="403"
    />
</rule>

Issues with Referer-based protection

Hotlink protection using the Referer has a number of issues that make it inappropriate in many scenarios.

Hotlinking vs direct linking

Perhaps the most compelling argument against relying on the Referer header is that doing so also breaks regular direct links, which may not always be desirable. This is because the Referer header is sent not only when loading resources but also when a navigation action happens. Hence, a citation like According to YourBusiness, <a href=https://yourbusiness.example/infographic.png>widget purchases per capita</a> have increased steadily over the last century would result in a Referer header naming an external domain being sent, which will trigger hotlink protection and result in blocked content. In most cases, this is a bad user experience.

Referer omission

Another reason why this kind of protection is ineffective is that the Referer header is not sent along with requests in all cases.

One common case is that generally resources loaded from a plain HTTP location (i.e., with the URL starting with http://) will not include a Referer when they are being loaded from an HTTPS site. Therefore, an image loaded from http://yourbusiness.example/infographic.png at https://example.com will typically not have a Referer header.

Although the above scenario is not as relevant nowadays as most sites use HTTPS, that is just one example of how the Referer header can be omitted. A much more relevant argument is that the embedding site is in full control of whether a Referer header is sent along, and suppressing it is as simple as adding the referrerpolicy="no-referrer" attribute to the img tag.

It may be tempting to address this issue by requiring that a Referer header be present. However, there are other reasons why a Referer header may not be sent, such as direct navigation by a user, browser configuration and extensions that block this header or proxies that remove it. Thus, simply blocking incoming requests missing this header could result in poor user experience for some users, as your page will look 'broken' since resources won't load.

Additional processing

A third disadvantage of Referer-based protection is that it's dynamic by nature, meaning that every incoming request needs to be evaluated by your server to check whether the Referer contains an allowed value. This is a relatively small concern, but it introduces a tiny amount of latency to the response and results in more complex cache management.

As the web is moving more and more towards static or pre-rendered content and serverless platforms, effective protection that involves as little server logic as possible is ideal.

A more robust approach

Web standards and browsers have come a long way in the last few decades, and all of the tools for effective and robust protection against hotlinking in the most common scenarios are now available. Specifically, developments in the fetch standard regarding cross-origin requests arm us with request and response headers that we can use to implement browser-enforced hotlink protection.

Some relevant headers

Origin

The Origin HTTP request header is in many respects similar to the Referer header with some enhancements that make it more suitable for requests across different web properties.

One key difference between Referer and Origin is that the former typically is a full URL (for example, https://www.example.com/page/) whilst the latter is always just an origin, or the first part of the URL, like https://www.example.com.
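To make the difference concrete, the following is a minimal sketch of how an origin can be derived from a full URL. The helper name origin_of is made up for illustration, and the sketch ignores edge cases such as IPv6 hosts and opaque origins.

```python
from urllib.parse import urlsplit

# Default ports are left out when an origin is written as a string.
DEFAULT_PORTS = {"http": 80, "https": 443}


def origin_of(url: str) -> str:
    """Return scheme://host[:port] for a URL (hypothetical helper)."""
    parts = urlsplit(url)
    origin = f"{parts.scheme}://{parts.hostname or ''}"
    # Only non-default ports are part of the serialised origin.
    if parts.port is not None and parts.port != DEFAULT_PORTS.get(parts.scheme):
        origin += f":{parts.port}"
    return origin


# A Referer-style value (full URL) reduced to an Origin-style value.
print(origin_of("https://www.example.com/page/"))
```

The path, query string and fragment are discarded, which is why an Origin value reveals less about the embedding page than a Referer value does.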

The Origin request header is sent for requests with methods other than GET or HEAD, or for requests that are explicitly marked as cross-origin (i.e., from one site to another).

Cross-Origin-Resource-Policy

The Cross-Origin-Resource-Policy HTTP response header is used to define a policy for cross-origin requests made in no-cors mode.

In practical terms, this header alone is sufficient in most cases for effective hotlink protection.

Cross-Origin-Resource-Policy can take one of three values: same-origin, same-site or cross-origin. All we need to do is include the header Cross-Origin-Resource-Policy with either of same-origin or same-site, and embedding of your resources by external websites will be blocked.

Difference between same-origin and same-site

The value same-origin specifies a stricter policy than same-site.

Consider the site at https://example.com and the resources https://example.com/a.png and https://images.example.com/b.png. While both resources are part of the same site (i.e., example.com), each resource has a different origin: https://example.com and https://images.example.com, respectively. While both same-origin and same-site will allow for the site to use the first resource, only same-site will allow it to embed the second resource, as the origins are different.
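The comparison above can be sketched in a few lines. This is a simplified illustration, not how browsers implement it: it assumes that the site is the last two labels of the host name (browsers use the Public Suffix List instead) and it leaves out the scheme-related rules of the same-site check. The function names are made up for this example.

```python
from urllib.parse import urlsplit

DEFAULT_PORTS = {"http": 80, "https": 443}


def origin_tuple(url: str):
    """(scheme, host, port) triple that identifies an origin."""
    parts = urlsplit(url)
    port = parts.port or DEFAULT_PORTS.get(parts.scheme)
    return (parts.scheme, parts.hostname, port)


def naive_site(url: str) -> str:
    # ASSUMPTION: the registrable domain is the last two labels of the host.
    # This is wrong for suffixes such as co.uk; browsers consult the
    # Public Suffix List.
    host = urlsplit(url).hostname or ""
    return ".".join(host.split(".")[-2:])


def is_same_origin(page: str, resource: str) -> bool:
    return origin_tuple(page) == origin_tuple(resource)


def is_same_site(page: str, resource: str) -> bool:
    return naive_site(page) == naive_site(resource)


page = "https://example.com"
for resource in ("https://example.com/a.png", "https://images.example.com/b.png"):
    print(resource, is_same_origin(page, resource), is_same_site(page, resource))
```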

Access-Control-Allow-Origin

The Access-Control-Allow-Origin HTTP response header is relevant for so-called CORS requests, which are requests that include an Origin header. Furthermore, the CORS protocol defines some versatile mechanisms that allow sites to define policies for cross-origin resource sharing.

Normally, most resources that we would like to protect against hotlinking (for example, images) are not loaded using CORS requests. However, CORS requests can be made explicitly by adding the crossorigin or crossorigin="anonymous" attribute to the resource tag, for example like this: <img crossorigin alt=Example src=sample.jpg>.

The Access-Control-Allow-Origin header tells the browser that a certain origin is actually allowed to make a cross-origin request, and it should take either of two values: *, indicating that the resource can be requested by all origins, or the value of the Origin header included with the request. Other values, or the absence of this header, tell the browser that the request is not allowed in the context of the CORS protocol.

Hotlink protection for sites using a single origin

Many smaller sites serve all of their content from a single origin. For example, if you use WordPress, you may have the site https://yourbusiness.example and have most of your images under the https://yourbusiness.example/wp-content/uploads directory.

For this simple case, an effective way to implement hotlink protection is to include the header Cross-Origin-Resource-Policy: same-origin along with your responses, and this will prevent hotlinking by any other sites.

It is important that the Access-Control-Allow-Origin header not be sent, or if it is, that it be set to the origin of your site (e.g., https://yourbusiness.example). Sending Access-Control-Allow-Origin with any other value, especially * or an unvalidated copy of the Origin header included in the request, will allow hotlinking if a CORS request is made.

Configuration

Apache .htaccess

This policy can be implemented in Apache by using the .htaccess file (or alternatively the main configuration file) with something along these lines:

<FilesMatch "\.(jpe?g|png|gif)$">
    <IfModule mod_headers.c>
        Header set Cross-Origin-Resource-Policy "same-origin"
    </IfModule>
</FilesMatch>

The FilesMatch directive can be adjusted as needed depending on the files that require hotlink protection. Because this technique does not have many of the limitations of the Referer-based one, it is even possible to skip this check and include the header with all responses.

nginx

In the relevant server block, this policy can be implemented as follows, adjusting the location part as needed:

location ~ \.(jpe?g|png|gif)$ {
    add_header Cross-Origin-Resource-Policy same-origin;
}

Hotlink protection for sites using subdomains

Oftentimes sites serve resources from a subdomain. For instance, the main site could be available at https://yourbusiness.example while images and other static resources are hosted at https://static.yourbusiness.example. It may also be the case that certain parts of the site reside in a subdomain (for example, https://store.yourbusiness.example) and subdomains share resources.

For these scenarios, hotlink protection can still use the Cross-Origin-Resource-Policy header, except that the same-site value (instead of same-origin) is likely the most appropriate choice.
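Where headers are set in application code rather than in the web server configuration, the same policy can be applied there. The following is a minimal sketch assuming a Python WSGI application; the name corp_middleware is made up for this example, and the Apache and nginx configurations shown earlier work equally well with the value changed to same-site.

```python
def corp_middleware(app, policy="same-site"):
    """Wrap a WSGI app so that every response carries the given
    Cross-Origin-Resource-Policy value (hypothetical helper)."""

    def wrapped(environ, start_response):
        def start_with_corp(status, headers, exc_info=None):
            # Drop any existing value so the header appears exactly once.
            headers = [
                (name, value)
                for name, value in headers
                if name.lower() != "cross-origin-resource-policy"
            ]
            headers.append(("Cross-Origin-Resource-Policy", policy))
            return start_response(status, headers, exc_info)

        return app(environ, start_with_corp)

    return wrapped
```

Passing policy="same-origin" instead gives the single-origin behaviour described in the previous section.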

Hotlink protection for sites using external resources

It is increasingly common for sites to use CDNs to serve static resources, which often are accessed through a separate domain. For example, https://yourbusiness.example might load images from the origin https://yourbusiness.cdnprovider.example.

Hotlink protection in this scenario is slightly more involved than when all resources are part of the same site: requests are by definition cross-origin and cross-site, and CORS policies are only enforced for requests that use the CORS protocol.

Since we want to load resources from a different origin, we can't rely on the Cross-Origin-Resource-Policy header alone. This is because the only value that would seem appropriate is cross-origin, which does not provide any hotlink protection whatsoever: adding the header Cross-Origin-Resource-Policy: cross-origin to CDN responses would result in anyone being able to load the resource in question from any origin, which is exactly the situation that we are trying to avoid.

Fortunately, we can force requests to use the CORS protocol and this way have more granular control over who has access.

Counter-intuitively, an appropriate value for the Cross-Origin-Resource-Policy header in CDN responses is same-origin. By using this value, non-CORS requests to the CDN will fail, which leaves only CORS requests as a way to load resources, which equips us with more granular ways of defining an access policy through the Access-Control-Allow-Origin header.

Hence, for the CDN or external domain case, we need three elements for hotlink protection:

  • Cross-Origin-Resource-Policy: same-origin in the external resource response. This blocks non-CORS requests.
  • Access-Control-Allow-Origin: https://yourbusiness.example in the external resource response (where https://yourbusiness.example is the origin the resource will be loaded from). This tells the browser that the response is intended for use by this origin and this origin only.
  • crossorigin (or crossorigin="anonymous") attribute in the document referencing the resource. This means that <img src=https://yourbusiness.cdnprovider.example/image.webp> becomes <img crossorigin src=https://yourbusiness.cdnprovider.example/image.webp>.

This approach will effectively block hotlinking from origins other than https://yourbusiness.example, which is exactly what we are looking for. As an added bonus, this same configuration also works for the single-origin case discussed earlier and, with the caveats that follow, for the single-site case.
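How the three elements interact can be illustrated with a simplified model of the browser's decision. This is a sketch for reasoning about the configuration, not a description of any browser's implementation: the function browser_allows is made up, credentials and preflights are ignored, and the page and the resource are assumed to be on different sites whenever their origins differ.

```python
from typing import Mapping


def browser_allows(mode: str, page_origin: str, resource_origin: str,
                   response_headers: Mapping[str, str]) -> bool:
    """Simplified model: may the page use this subresource response?"""
    headers = {name.lower(): value for name, value in response_headers.items()}
    if page_origin == resource_origin:
        return True  # same-origin loads are not restricted by these headers
    if mode == "cors":
        # CORS check: the header must be * or match the page's origin.
        allowed = headers.get("access-control-allow-origin")
        return allowed in ("*", page_origin)
    # no-cors request: the Cross-Origin-Resource-Policy check applies.
    # ASSUMPTION: page and resource are cross-site, so same-site also blocks.
    policy = headers.get("cross-origin-resource-policy")
    return policy not in ("same-origin", "same-site")


CDN = "https://yourbusiness.cdnprovider.example"
CDN_HEADERS = {
    "Cross-Origin-Resource-Policy": "same-origin",
    "Access-Control-Allow-Origin": "https://yourbusiness.example",
}

for mode, page in [
    ("cors", "https://yourbusiness.example"),      # <img crossorigin> on your site
    ("no-cors", "https://yourbusiness.example"),   # plain <img> on your site
    ("cors", "https://anotherbusiness.test"),      # <img crossorigin> elsewhere
    ("no-cors", "https://anotherbusiness.test"),   # plain <img> elsewhere
]:
    print(mode, page, browser_allows(mode, page, CDN, CDN_HEADERS))
```

Under this model, only the CORS request from https://yourbusiness.example is allowed, which is why the crossorigin attribute is required on your own pages.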

Advantages, caveats and limitations

The solution presented is simple (it requires adding a few HTTP headers to the response and a small change to the HTML markup), robust and has good browser support (see note 1). Because the policy is enforced by the browser itself, it also degrades gracefully, meaning that the resources will still load normally in the few browsers still in use that don't support these headers.

Moreover, this solution for hotlink protection is in many cases stateless, meaning that no conditional logic is required in the server, as the values for the headers are determined in advance.

The main caveat applies to sites that make use of multiple domains or subdomains and that either

  • load content served from an external domain (such as a CDN), or
  • use the crossorigin attribute.

Since the Access-Control-Allow-Origin header can only contain a single origin (and there is no syntax for allowing subdomains), these sites will require some server-side logic to give an appropriate value to the Access-Control-Allow-Origin header. The Access-Control-Allow-Origin header must contain the value of the Origin header sent in the original request, and this value must be validated first to ensure that it's an allowed value. Moreover, it's likely that these sites will need to add Origin to their Vary response header to allow for proper caching, because the response now depends on the Origin request header.
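The server-side logic involved can be quite small. The following is a minimal sketch, assuming that every HTTPS origin under yourbusiness.example should be allowed; the names SITE, is_allowed_origin and cors_headers are made up for this example, and the allow rule should be adapted to the origins you actually use.

```python
from typing import Dict, Optional
from urllib.parse import urlsplit

SITE = "yourbusiness.example"  # hypothetical site whose origins are allowed


def is_allowed_origin(origin: str) -> bool:
    """Accept https origins on SITE or on any of its subdomains."""
    parts = urlsplit(origin)
    host = parts.hostname or ""
    # The leading dot prevents hosts like evilyourbusiness.example from matching.
    return parts.scheme == "https" and (host == SITE or host.endswith("." + SITE))


def cors_headers(request_origin: Optional[str]) -> Dict[str, str]:
    """Response headers for a resource, given the request's Origin header."""
    # The response depends on the Origin request header, so caches must
    # key on it.
    headers = {"Vary": "Origin"}
    if request_origin and is_allowed_origin(request_origin):
        # Echo the validated origin back; never echo an unvalidated value.
        headers["Access-Control-Allow-Origin"] = request_origin
    return headers


print(cors_headers("https://store.yourbusiness.example"))
print(cors_headers("https://anotherbusiness.test"))
```

Requests from origins that fail validation receive no Access-Control-Allow-Origin header at all, so the browser rejects the response for CORS requests.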

Direct linking protection

In certain scenarios, it may be desirable to prevent direct access or direct linking to certain resources. The new Sec- headers allow for controlling these actions.

Images and other embeddable content

The Sec-Fetch-Dest request header is useful in these cases to have fine-grained control over when a browser is allowed to download a certain resource.

For example, to prevent direct access to an image while still allowing it to be embedded, you can check that this header is set to image and reject any other value. To allow both direct access and embedding as an image, but not other uses (for example, a fetch or XHR request), only the document and image values would be allowed.

Note that this header is not yet supported by all browsers (most notably Safari), so it's advised that, if you decide to make decisions based on this header, you allow requests that do not have it set.
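A check along these lines could look as follows. This is a minimal sketch with a made-up function name, dest_allowed, and it follows the advice above by letting through requests that lack the header.

```python
from typing import Collection, Mapping


def dest_allowed(request_headers: Mapping[str, str],
                 allowed: Collection[str] = ("image",)) -> bool:
    """Decide whether to serve a resource based on Sec-Fetch-Dest."""
    headers = {name.lower(): value for name, value in request_headers.items()}
    dest = headers.get("sec-fetch-dest")
    if dest is None:
        # The browser does not send the header: allow the request.
        return True
    return dest in allowed


# Embedding only: <img> is served, direct navigation is refused.
print(dest_allowed({"Sec-Fetch-Dest": "image"}))
print(dest_allowed({"Sec-Fetch-Dest": "document"}))

# Embedding and direct access, but not fetch() or XHR, which send "empty".
print(dest_allowed({"Sec-Fetch-Dest": "empty"}, allowed=("image", "document")))
```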

External direct links

You may want to prevent or discourage external websites directly linking to certain files on your site (for instance, a large PDF file) while still allowing your users to access this content by directly linking files internally.

The Sec-Fetch-Site request header can be helpful for taking different actions based on where a user came from. For example, if this header is set to cross-site, then you might decide to issue a 303 See Other redirect to a page discussing the resource in question.

Like Sec-Fetch-Dest, Sec-Fetch-Site does not yet have wide enough support and may not be present in all requests. It's recommended to let requests without this header through normally.
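The redirect described above can be sketched as follows. The function name respond_to_download and the landing page URL are made up for this example; requests without the header, or with any value other than cross-site, are served normally.

```python
from typing import Dict, Mapping, Tuple


def respond_to_download(request_headers: Mapping[str, str],
                        landing_url: str) -> Tuple[int, Dict[str, str]]:
    """Return (status code, extra response headers) for a file request."""
    headers = {name.lower(): value for name, value in request_headers.items()}
    if headers.get("sec-fetch-site") == "cross-site":
        # The visitor followed a link from another site: send them to a
        # page about the resource instead of the file itself.
        return 303, {"Location": landing_url}
    # same-origin, same-site, none (typed URL or bookmark) or header missing.
    return 200, {}


# Hypothetical landing page for a large PDF file.
LANDING = "https://yourbusiness.example/reports/widgets/"
print(respond_to_download({"Sec-Fetch-Site": "cross-site"}, LANDING))
print(respond_to_download({"Sec-Fetch-Site": "same-origin"}, LANDING))
print(respond_to_download({}, LANDING))
```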


  1. As of the time of writing, Cross-Origin-Resource-Policy is supported by over 93% of global users.
