In my previous post, I explained how straightforward it is to host a static website on S3 with HTTPS support and a custom domain. Naturally, this should include Single Page Applications (SPAs) since they're essentially bundles of HTML, JS, CSS, and other assets. However, when combined with S3, SPAs pose unique challenges that manifest under specific circumstances. Let's explore the problem and its solutions.
Problem with SPA
Single Page Applications, as the name suggests, comprise just one HTML file and one or more JS files. The HTML file loads on the initial request, and all the magic happens on the client side through DOM manipulation by the JS code.
Moreover, SPAs often feature a routing component, such as React Router in React's case, which intercepts navigation requests when a link is clicked. Instead of sending an HTTP request to the server to navigate to a sub-page, the router stops the browser from navigating and simply replaces the current content. Despite this, it appears as though you're navigating since the router updates the URL.
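To make the interception concrete, here is a minimal sketch of what a client-side router does. It is not React Router's actual implementation; `createRouter`, `routes`, and `render` are hypothetical names chosen for illustration, and `history` is assumed to be the browser's History API object (`window.history`).

```javascript
// Minimal client-side router sketch (illustrative; not React Router's code).
// `routes` maps paths to views, `render` swaps the page content, and
// `history` is assumed to be the browser's History API object.
function createRouter(routes, render, history) {
  return {
    // Call this from a link's click handler after event.preventDefault(),
    // so the browser never sends a request for `path` to the server.
    navigate(path) {
      history.pushState({}, '', path);     // only the address bar changes
      render(routes[path] || routes['*']); // replace the current content
    },
  };
}
```

Because no HTTP request is sent for the new path, the server never has to know that a path such as /about exists, as long as the app was loaded from the root first.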
For instance, consider this React Router demo app deployed on S3 via static website hosting:
If you navigate to its sub-pages, everything should work smoothly. However, try accessing a sub-page directly or refreshing while on a sub-page like /about:
spa-hosting-example.s3-website-us-east-1.amazonaws.com/about
Oops, an error occurred! But why now and not earlier?
The crux of the issue is this: when you access a resource like /about on an S3 website hosting bucket, S3 first attempts to fetch the about object from the bucket's root. If this object isn't found, it looks for about/index.html. If this object isn't found either, it returns a 404 Not Found error.
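The lookup order described above can be modeled in a few lines. This is a simplified sketch, not S3's actual implementation: `resolveWebsiteRequest` and the example bucket contents are hypothetical, and the real service handles further details (such as redirects for folder paths) that the model ignores.

```javascript
// Simplified model of how an S3 website endpoint resolves a request path.
// `keys` is a Set of object keys in the bucket (hypothetical example data).
function resolveWebsiteRequest(keys, path, indexDocument = 'index.html') {
  // Strip the leading slash: "/about" -> "about"
  const key = path.replace(/^\/+/, '');

  // The bucket root maps to the index document.
  if (key === '') {
    return keys.has(indexDocument)
      ? { status: 200, key: indexDocument }
      : { status: 404, key: null };
  }

  // 1. Try the object with exactly that key.
  if (keys.has(key)) return { status: 200, key };

  // 2. Try the index document inside a "folder" of that name.
  const folderIndex = key.replace(/\/+$/, '') + '/' + indexDocument;
  if (keys.has(folderIndex)) return { status: 200, key: folderIndex };

  // 3. Nothing matched: this is the 404 the SPA user sees.
  return { status: 404, key: null };
}

// A typical SPA bundle only contains index.html plus assets.
const bucket = new Set(['index.html', 'static/js/main.js']);
console.log(resolveWebsiteRequest(bucket, '/').status);      // 200
console.log(resolveWebsiteRequest(bucket, '/about').status); // 404
```

Since the SPA bundle contains neither an about object nor about/index.html, every route other than the root ends in the last branch.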
On a side note: It's also possible to host a website on S3 without enabling static website hosting. In this case, CloudFront accesses the S3 bucket via its REST endpoint instead of its website endpoint. The same error would occur in this scenario, but you would get a 403 Access Denied instead of a 404 Not Found error. This AWS document explains the various reasons for 403 errors.
Solution(s)
This problem can be fixed either via S3 or CloudFront.
S3 Error Document
If you're using S3's static website hosting feature, you can configure an error document for 404 Not Found errors. But instead of a separate error page, you return the index.html from your SPA bundle. This means any request that S3 can't resolve returns the index.html document.
It's worth noting that despite responding with the correct index.html, S3 will still send a 404 HTTP response code. The AWS docs highlight that some browsers might override the S3 error document for 404 errors, displaying their own error page instead.
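As a rough sketch, the website configuration for this setup could look like the object below. The field names follow the WebsiteConfiguration structure of the S3 PutBucketWebsite API; the helper `spaWebsiteConfiguration` is a hypothetical name, and the assumption is that the SPA's entry point is a single index.html in the bucket root.

```javascript
// Builds a website configuration in which the index document and the
// error document are the same file (assumed: index.html in the bucket root).
function spaWebsiteConfiguration(indexDocument = 'index.html') {
  return {
    IndexDocument: { Suffix: indexDocument }, // served for "/" and folder paths
    ErrorDocument: { Key: indexDocument },    // served, with status 404, for unknown keys
  };
}

console.log(JSON.stringify(spaWebsiteConfiguration(), null, 2));
```

The same two settings are available under the bucket's static website hosting properties in the S3 console.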
CloudFront Custom Error Response
CloudFront offers functionality similar to S3, enabling customized responses to HTTP error codes. For a 404 Not Found, CloudFront can be set to return the default index.html document. Unlike S3, CloudFront lets us adjust the HTTP response code to 200 OK.
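A minimal sketch of such a configuration is shown below. The field names follow the CustomErrorResponses structure of a CloudFront distribution config; the helper `spaErrorResponses` is a hypothetical name, and /index.html is assumed to be the SPA's entry point. Passing 403 in addition to 404 is an assumption that only applies if the bucket is accessed via its REST endpoint.

```javascript
// Maps each given origin error code to a "200 + index.html" response.
function spaErrorResponses(errorCodes, indexPath = '/index.html') {
  const items = errorCodes.map((code) => ({
    ErrorCode: code,             // status CloudFront receives from S3
    ResponsePagePath: indexPath, // document returned to the viewer instead
    ResponseCode: '200',         // status the viewer actually sees (a string in the API)
  }));
  return { Quantity: items.length, Items: items };
}

console.log(JSON.stringify(spaErrorResponses([404]), null, 2));
```

One trade-off of this approach is that requests for assets that are genuinely missing also receive index.html with a 200 status.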
CloudFront Functions
This advanced solution uses CloudFront Functions or Lambda@Edge to inspect all incoming requests and rewrite their URLs, similar to rewrite rules in Apache or Nginx.
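The function's code is not reproduced in this text, so the following is only a sketch of a viewer-request CloudFront Function consistent with the behavior described here: any URI without a dot is treated as a client-side route and rewritten to the root index.html. The /index.html target is an assumption.

```javascript
// Viewer-request CloudFront Function (sketch).
// CloudFront calls handler() for each request before forwarding it to S3.
function handler(event) {
  var request = event.request;
  var uri = request.uri;

  // URIs without a dot (e.g. /about or /about/) are assumed to be SPA
  // routes rather than files, so they are pointed at the app's entry document.
  if (!uri.includes('.')) {
    request.uri = '/index.html';
  }

  // URIs with a dot (e.g. /static/js/main.js) pass through unchanged.
  return request;
}
```

Returning the modified request object tells CloudFront to continue to the origin with the new URI; the URL in the viewer's address bar is not touched.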
When such a function is assigned to a CloudFront distribution, requests to /about or /about/ will yield the root index.html document. However, users won't perceive this change as it affects only the communication between CloudFront and S3, not between the user and CloudFront.
Be aware that this function is quite rudimentary and will probably require some adjustments and review before it is used in production. For example, it only checks whether the URL contains a dot and, if it doesn't, rewrites the URL. A route that legitimately contains a dot would therefore not be rewritten.
Two official AWS resources delve deeper into CloudFront Functions and Lambda@Edge:
- URL rewrite to append index.html to the URI for single page applications
- Implementing Default Directory Indexes in Amazon S3-backed Amazon CloudFront Origins Using Lambda@Edge
Conclusion
Now, which option should you choose? As is often the case, it depends on your specific needs. S3's error document is the simplest, eliminating the need for additional services like CloudFront. But if you're already using CloudFront for HTTPS and custom domains, it would be logical to employ its custom error response feature. However, if your SPA isn't limited to a single index.html but has several index files located in various subfolders, a single fallback document won't work. In this case, it may make sense to opt for the most powerful option and use CloudFront Functions or Lambda@Edge.
I hope you found this post helpful. If you have any questions or comments, feel free to leave them below. If you'd like to connect with me, you can find me on LinkedIn or GitHub. Thanks for reading!
Top comments (13)
Thanks for the great post! I'd like to confirm whether I understood it correctly. I'd really appreciate a reply.
I can deploy websites using S3 with static website hosting, but when it comes to SPAs, all routes (actually, it depends on the app) are supposed to be handled by index.html (the app). If a user first enters the index, the other routes work because the user doesn't actually navigate. But if a user first enters one of the other routes, it will display an error page because files that correspond to the routes don't exist in the storage. There is a way to change the content when S3 gets a request for a file that doesn't exist, but it still responds with a 404 status code. Basically, it is correct to send that code since the file doesn't exist in the storage. Some browsers display their own error pages when they get the 404 status code.
There are two ways to solve this problem.
- CloudFront - It can override the response status code and the content. If I change the status code to 200, web browsers will accept it as a normal response, and if it sends index.html, all the routes will be handled by the app's index.html.
- CloudFront Functions - It catches a request and changes the uri to index.html to display the app.
Hi @lico, yes you got it! :-)
My post was mainly related to React SPAs with React Router, but the principle applies to other frameworks as well. The first HTTP request from the user is the crucial step here: if the URL contains a path like /about, S3 will try to find an object named about. If it doesn't find it, it will continue looking for a folder named about and try to find the object index.html in that folder. If it doesn't find that either, it returns a 404.
With CloudFront, we can handle this case on the server side without the user being aware of it. In the case of custom error responses, CloudFront receives the 404 from S3 and simply requests the index.html file from S3. This is then returned to the user like a normal 200 response. CloudFront Functions go a step further and let us intercept the request to S3 and modify it directly. So in this case we don't process a 404 from S3, but directly request the correct "index.html".
I got it! Thank you for the answer and the post!!
Thanks Chris, I'd like to take this a step further with another use case for a cloudfront function that I wrote about. Also read thru all three parts of this comprehensive series about challenges overcome during cloudfront migration. Let me know what u and others think, thanks
Entitled - Enabling AWS S3 to behave more like a Web Server
dev.to/rickdelpo1/enabling-aws-s3-...
Thank you Rick! I'll take a look at your post, the title sounds promising! :-)
I stumbled around a bit along the way but had some critical takeaways that now have me sold on AWS.
What were these?
Most Importantly I learned about what I can do with Cloudfront functions using the request object: conditionals, redirects, subfolders, security headers at response level. Then I learned about some obstacles encountered along the way during my migration and how to overcome them. My 3 part series is a deep dive into all these details.
Great
Thank you for the great post,
But I believe there are better alternatives for hosting SPAs, like Netlify, which takes care of all the extra steps needed to make S3 work.
Hi @hussam22 thank you!
But how would you define "better"?
I won't argue that Netlify and Vercel are a valid choice. My personal blog zirkelc.dev itself runs directly on Vercel because it's just super convenient. However, I still want to emphasize the value of understanding how these things work. Or as in this post, why they don't work the way we expect them to.
Hey, why not simply use AWS Amplify, which will automatically do all of this under the hood, with advanced capabilities?
Hi @imdkbj, of course you could simply use Amplify for hosting and it will take care of all these things.
However, I think it's always good to understand what's going on behind the scenes. Services like S3, CloudFront, Route53, etc. are the low-level building blocks, while Amplify is more of a toolchain that puts these building blocks together in the right combination.
If Amplify meets all your needs, that's perfect, then stick with it. But there are many users, myself included, who either already have a specific infrastructure in place, need a specific feature that is not currently supported, or simply want full control over their resources.