What is a cache?
A cache is an in-memory data store that holds frequently requested data for clients.
It acts as a fast key-value store sitting close to the client applications.
Why is it needed? What about DBMS?
In-memory cache systems like Redis store data in physical RAM, unlike external databases, where data lives on hard disks or SSDs. Because of that, accessing and retrieving data from the cache is much faster than retrieving it from external data stores.
Hitting the main datastore every time a client makes a request to the API can add a lot of latency, especially for queries that perform many aggregations and lookups and therefore have long response times.
In addition, it increases the load and the number of connections on the main database, which in turn forces us to increase the resources allocated to the server running that database.
How does Redis cache work?
Redis is very handy here because it provides different data types to store, such as strings, lists, and hashes.
The whole idea behind how it works is to store the data we get from the main database inside Redis as key-value pairs.
The next time we need this data, we first check whether it is available in the Redis cache. If it is, the data is retrieved from Redis without any further computation or external database connection. If it is not in the cache, the data is retrieved from the main DB and then cached in Redis.
This check can be implemented in multiple ways. One of them is a global middleware that checks for cache availability, as shown below using Express.js. Note that when an entry in Redis expires, its key and value no longer exist, so we can check for cache availability simply by looking up the key we stored:
const redis = require("redis");
const client = redis.createClient(); // assumes a local Redis server and the node_redis v3 callback API

const cacheAvailability = (req, res, next) => {
  const { id } = req.params; // or from the body or any other input; adapt the middleware as needed
  client.get(id, (error, result) => {
    if (error) return next(error); // forward errors to Express instead of throwing inside the callback
    if (result !== null) {
      // Cache hit: values are stored as stringified JSON, so parse before responding
      return res.json(JSON.parse(result));
    }
    return next(); // Cache miss: continue to the route handler
  });
};
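The whole read-through cycle (check the cache, fall back to the main database on a miss, then cache the result) can be sketched end to end with a plain Map standing in for Redis; fetchFromDb, the TTL value, and the returned shape are illustrative assumptions, not part of any real API:

```javascript
// In-memory stand-in for Redis: Map of key -> { value, expiresAt }
const cache = new Map();
const TTL_MS = 60 * 1000; // assumed TTL: entries live for one minute

// Hypothetical stand-in for a query against the main database
const fetchFromDb = (id) => ({ id, name: `user-${id}` });

let dbCalls = 0; // track how often we actually hit the "database"

const getUser = (id) => {
  const entry = cache.get(id);
  if (entry && entry.expiresAt > Date.now()) {
    return entry.value; // cache hit: skip the database entirely
  }
  // Cache miss (absent or expired): query the database, then cache the result
  dbCalls += 1;
  const value = fetchFromDb(id);
  cache.set(id, { value, expiresAt: Date.now() + TTL_MS });
  return value;
};
```

Calling getUser("42") twice performs only one database fetch; the second call is served straight from the cache.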
When should it be considered?
There are two terms that should be considered when it comes to caching:
Hit rate
: the number of times we actually get the data from the cache instead of from the external data source.
Miss rate
: the number of times the requested data is not in the cache (missing or expired) and has to be fetched from the main database.
Caching is meant for frequently requested data that does not change often. With that in mind, the higher the hit rate, the more performance and efficiency we gain in our applications.
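Tracking these two rates takes nothing more than a pair of counters; a minimal sketch (the simulated 9-to-1 hit/miss split below is just an illustration):

```javascript
// Simple counters for cache telemetry
let hits = 0;
let misses = 0;

// Call this after every cache lookup
const recordLookup = (found) => (found ? hits++ : misses++);

// Hit rate = hits / total lookups
const hitRate = () => (hits + misses === 0 ? 0 : hits / (hits + misses));

// Simulate 9 cache hits and 1 miss
for (let i = 0; i < 9; i++) recordLookup(true);
recordLookup(false);

console.log(hitRate()); // 0.9
```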
Constantly checking the cache, finding it empty, and then fetching the data from the main DB just to store it in the cache again is wasted work.
There should be a sensible relationship between the data we need to cache, how long it stays in the cache, and how expensive the operation is when we fetch it directly from the main database.
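A back-of-the-envelope calculation shows how this trade-off plays out; the 2 ms and 200 ms latencies below are assumed figures for illustration, not measurements:

```javascript
// Back-of-the-envelope: how hit rate affects average response time.
const cacheMs = 2;  // assumed cache read latency
const dbMs = 200;   // assumed main-database query latency

// On a miss we pay for the cache check plus the database round trip
const avgLatency = (hitRate) =>
  hitRate * cacheMs + (1 - hitRate) * (cacheMs + dbMs);

console.log(avgLatency(0.9)); // ~22 ms: roughly 9x faster than always querying the DB
console.log(avgLatency(0.1)); // ~182 ms: barely better than having no cache at all
```

With a low hit rate, the cache check itself becomes pure overhead, which is exactly why rarely requested or rapidly changing data is a poor caching candidate.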