Caching is one of the key techniques for improving the performance of any system. Whether it is the frontend, the backend, CPUs, or our day-to-day devices, you will find caching being used in one way or another. And that's what we are going to learn in this two-post series.
The first post covers the basics of caching, and the second will cover related concepts and more advanced topics.
What will we be covering?
- What & why of caching
- Different types of cache
- Caching in different layers with example
What is Caching?
Caching is temporarily saving frequently accessed data in a storage layer, i.e. a cache, so that you don't need to make a complete request again to get the same data.
Key Points
- Temporary data
- Frequent Access
- Avoid complete requests for same data
Cache storage is usually faster and sits closer to the processing unit, e.g. the frontend or the backend.
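To make the idea concrete, here is a minimal sketch of the pattern in TypeScript: an in-memory Map acts as the cache, and we only call the slow data source when the entry is missing or expired. The fetchUser call and TTL value are just placeholders for illustration.

```typescript
// Minimal in-memory cache: store each result with an expiry timestamp.
type Entry<T> = { value: T; expiresAt: number };

const cache = new Map<string, Entry<unknown>>();

async function getCached<T>(
  key: string,
  ttlMs: number,
  loader: () => Promise<T> // the "slow" source, e.g. a DB query or API call
): Promise<T> {
  const hit = cache.get(key);
  if (hit && hit.expiresAt > Date.now()) {
    return hit.value as T; // cache hit: no request to the slow source
  }
  const value = await loader(); // cache miss: make the full request once
  cache.set(key, { value, expiresAt: Date.now() + ttlMs });
  return value;
}

// Usage: cache a user profile for 60 seconds (fetchUser is hypothetical).
// const user = await getCached("user:42", 60_000, () => fetchUser(42));
```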
Why do we need Caching?
We can understand the need for caching by looking back at the key points above.
Let's understand the problem:
- Our primary databases are comparatively slow, and retrieving data from them takes time.
- Requests for frequently accessed data add unnecessary overhead to our server.
- In mission-critical systems, the load on our server and database needs to be kept low.
- Some queries and API calls are expensive, and computing them again and again is wasteful.
Therefore, to keep these points in check, we add a layer of caching.
Advantages of caching
- Faster access to data results in a better user experience
- Reduces latency: no need to hit the main DB again for the same data
- Reduces the workload on the main DB and server, making the system more cost-effective
- In many cases, because data is cached, it remains available offline
- Saves bandwidth and reduces network traffic
Different types of caches
There are different types of caches that cater to different use cases. For example, a CPU needs a different type of cache than a frontend application.
Browser Caching
Starting with the frontend, we have browser-level caching. It is very useful for frontend applications, as it reduces the need to make repeated API calls for the same data.
When a user visits a page and gets the data, we save that data in the browser cache; when the user visits the same page again, we read the data from the cache instead of making an API call. We can cache many things on the frontend, e.g. API responses, images, videos, other assets, and user info.
Apart from the browser's cache storage, we can also leverage the browser's Local Storage.
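As a rough sketch, here is how a frontend might keep an API response in Local Storage with a simple expiry check; the endpoint, cache key, and TTL are made up for the example.

```typescript
// Cache an API response in localStorage with a timestamp-based expiry.
const PRODUCTS_KEY = "products-cache";  // hypothetical cache key
const TTL_MS = 5 * 60 * 1000;           // keep the data for 5 minutes

async function loadProducts(): Promise<unknown[]> {
  const raw = localStorage.getItem(PRODUCTS_KEY);
  if (raw) {
    const { data, savedAt } = JSON.parse(raw);
    if (Date.now() - savedAt < TTL_MS) {
      return data; // still fresh: no API call needed
    }
  }
  const response = await fetch("/api/products"); // hypothetical endpoint
  const data = await response.json();
  localStorage.setItem(PRODUCTS_KEY, JSON.stringify({ data, savedAt: Date.now() }));
  return data;
}
```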
CDN: Content Delivery Networks
You can think of a CDN as a set of storage servers spread across the globe, where data is served from the nearest available CDN node. This avoids the need to make a request to another region just to get the data.
CDNs are mostly used for content like images, videos, audio, static files such as CSS and JS, streaming content, and large files. Examples: AWS CloudFront, Cloudinary, Akamai.
Application Server Cache
This is the cache we add on the backend side of our application. When we receive a request for some data, we fetch it from our main database, save it in a cache (e.g. Redis or Memcached) for future requests, and return the response to the user.
When a request for the same data reaches the server, we first check whether it exists in the cache; if it does, we return it directly. This reduces the load on the database and also makes requests faster.
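Here is a minimal sketch of that cache-aside flow on a Node.js backend using the node-redis client. The getProductsFromDb function, the cache key, and the 60-second TTL are assumptions for illustration.

```typescript
import { createClient } from "redis";

const redis = createClient(); // assumes a local Redis instance
await redis.connect();

// Hypothetical database query; replace with your real data access code.
async function getProductsFromDb(): Promise<object[]> {
  return []; // e.g. SELECT * FROM products
}

async function getProducts(): Promise<object[]> {
  const cached = await redis.get("products"); // 1. check the cache first
  if (cached) {
    return JSON.parse(cached); // cache hit: skip the database entirely
  }
  const products = await getProductsFromDb(); // 2. cache miss: query the DB
  await redis.set("products", JSON.stringify(products), { EX: 60 }); // 3. store for 60 seconds
  return products;
}
```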
Database caching
Databases also implement caching to improve query performance. Suppose a query is frequently run by multiple users: the database caches the query's result and execution plan, and serves the cached result for subsequent requests.
This results in faster queries and takes some load off the DB.
Other types of caches
There are other caches that are used in different scenarios:
- CPU Cache: L1 cache (smallest and closest to the CPU core), L2 cache (larger, typically per core), L3 cache (largest, shared among cores)
- Memory Cache: RAM caches to speed up I/O operations, and disk caches to speed up reads/writes
- DNS Caching: DNS resolvers also cache IP addresses and other records for domain name lookups, e.g. AWS Route 53
Caching in different layers
As we can see, caching can be utilized in various places. Let's discuss some examples and how we can leverage caching to improve our system. For simplicity, we will focus on frontend and backend scenarios.
Here is our example:
Components: we have our client, a load balancer that forwards requests to our servers, two API servers, one media server that serves static content like images and files, and finally our database.
Let's say we are trying to get a long list of products for our users and it's taking a good amount of time to receive the data from our database.
Issues in our current setup
- Because the requested data is a long list of products, every request adds overhead to our database.
- Servers can reach their limit if too many users start making requests.
- Not a good experience for the client, because requests take time.
Let's start solving these issues
Backend
First up is the backend side of our application.
We can add a caching layer that saves the data retrieved from the DB in cache storage. Then, when other users request the list of products, instead of hitting our DB again, we use the cached data and return that.
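Wiring this into the worked example, the products endpoint on one of our API servers could look roughly like this. It reuses the getProducts helper from the Redis sketch earlier; the ./cache module path and port are assumptions.

```typescript
import express from "express";
import { getProducts } from "./cache"; // hypothetical module holding the cache-aside helper above

const app = express();

app.get("/products", async (_req, res) => {
  try {
    // getProducts() returns cached data when available
    // and falls back to the database on a cache miss.
    const products = await getProducts();
    res.json(products);
  } catch {
    res.status(500).json({ error: "failed to load products" });
  }
});

app.listen(3000); // hypothetical port
```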
Benefits:
- Reduces load on database
- Makes future requests faster, i.e. reduces latency
Frontend
Next, we can cache data on the frontend at the browser level. This way, if a user revisits the products page, we don't need to refetch the products.
Libraries like Redux, React Query, and other solutions help us store data locally so that we don't need to make unnecessary requests to our server.
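For instance, a rough React Query sketch could look like this; the fetchProducts function, endpoint, and staleTime value are assumptions.

```tsx
import { useQuery } from "@tanstack/react-query";

// Hypothetical fetcher for the products endpoint.
async function fetchProducts(): Promise<{ id: number; name: string }[]> {
  const res = await fetch("/api/products");
  return res.json();
}

function ProductList() {
  // React Query caches the result under the "products" key, so revisiting
  // this page within staleTime reuses the cached data instead of refetching.
  const { data, isLoading } = useQuery({
    queryKey: ["products"],
    queryFn: fetchProducts,
    staleTime: 5 * 60 * 1000, // treat the data as fresh for 5 minutes
  });

  if (isLoading) return <p>Loading…</p>;
  return (
    <ul>
      {data?.map((p) => (
        <li key={p.id}>{p.name}</li>
      ))}
    </ul>
  );
}
```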
Benefits:
- Reduces load on our servers
- Better user experience
Static Files
We can also improve requests for static files. Currently, they are stored on our own servers. Let's assume users from different regions are requesting product images: users in a nearby region will get the images faster than users in a distant region.
To handle this, we can utilize a CDN (Content Delivery Network). Because CDN nodes are spread across the globe, our data will be available close to our users.
Thus, we will upload our static files to a CDN.
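As a rough sketch of that step, uploading an image to S3 (served through a CloudFront distribution) with a long Cache-Control header might look like this; the bucket name, object key, and file path are made up.

```typescript
import { readFileSync } from "node:fs";
import { S3Client, PutObjectCommand } from "@aws-sdk/client-s3";

const s3 = new S3Client({ region: "us-east-1" }); // assumed region

// Upload a product image to the bucket behind our CDN distribution.
// The long Cache-Control lets edge locations (and browsers) keep the file.
await s3.send(
  new PutObjectCommand({
    Bucket: "my-product-media",           // hypothetical bucket name
    Key: "images/product-123.jpg",        // hypothetical object key
    Body: readFileSync("./product-123.jpg"),
    ContentType: "image/jpeg",
    CacheControl: "public, max-age=31536000, immutable",
  })
);
```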
Other Improvements
- Using pagination to reduce the amount of data requested
- Rate limiter to prevent excessive traffic or abuse by limiting the number of requests that can be processed within a specific time period
- Using multiple read replicas for our database
Some Popular Cache Databases
- Redis
- Memcached
- Google Cloud Memorystore
- AWS ElastiCache
- Amazon DynamoDB Accelerator (DAX)
- Azure Cache for Redis
CDN Services
- Cloudinary (with a nice free plan for hobby projects)
- AWS CloudFront
- Azure CDN
That's it for this post. I hope you learned something new about caching. Share your suggestions or improvements in the comments. In the next post, I will cover concepts like cache eviction policies, cache invalidation strategies, common challenges, and scaling cache storage.