Outline:
- What is a cache
- Why should you cache data in your project?
- What data should you cache?
- How to use Redis to cache data in your project
- References
- Conclusion
## What is a cache
In computing, a cache is a hardware or software component that stores data so that future requests for that data can be served faster; the data stored in a cache might be the result of an earlier computation or a copy of data stored elsewhere.
-- [Wikipedia](https://en.wikipedia.org/wiki/Cache_(computing))
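To make the definition concrete, here is a minimal sketch using Python's standard `functools.lru_cache`. The `slow_square` function and the `call_count` counter are hypothetical stand-ins for "an earlier computation"; they are not part of the project discussed later.

```python
from functools import lru_cache

call_count = 0  # counts how often the function body really runs


@lru_cache(maxsize=128)
def slow_square(n: int) -> int:
    """Hypothetical stand-in for an expensive computation or lookup."""
    global call_count
    call_count += 1
    return n * n


first = slow_square(12)   # cache miss: the body runs and the result is stored
second = slow_square(12)  # cache hit: the stored result is returned
```

The second call never reaches the function body, which is the whole point of a cache: the same answer, served from a faster place.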
## Why should you cache data in your project?
Caching data in your project can offer several benefits, including:
- **Improved Performance**: Caching allows you to store frequently accessed data in a faster-to-access location, such as memory, making data retrieval significantly faster. This can lead to faster response times and a better user experience.
- **Reduced Resource Usage**: By serving cached data instead of repeatedly generating or fetching it from a slower source (e.g., a database or an external API), you can reduce the load on your server and conserve resources, which can lower operational costs.
- **Scalability**: Caching helps your project scale more efficiently by reducing the demands on your data sources. It allows your application to handle more requests and users without overloading your backend systems.
- **Better Responsiveness**: Cached data can provide near-instantaneous responses to user requests, making your application feel more responsive, particularly for frequently accessed content or resources.
- **Lower Latency**: Caching reduces the time it takes to retrieve data, which is especially crucial for real-time applications, online games, financial systems, and any application where low latency is essential.
- **Load Balancing**: In a load-balanced environment, a shared cache lets every server reuse results that another server has already computed, so no server has to repeatedly process the same requests.
- **Offline Availability**: Cached data can provide a fallback mechanism, ensuring that your application can continue to function even if the primary data source (e.g., an API or database) becomes temporarily unavailable.
- **Cost Savings**: By reducing the need for frequent and resource-intensive data access, caching can lead to cost savings in terms of hardware, bandwidth, and operational expenses.
- **Traffic Handling**: Caching can help absorb sudden spikes in traffic by serving cached content during peak periods, reducing the risk of server overload or downtime.
## What data should you cache?
The data you should cache in your project depends on your specific use case, performance requirements, and the characteristics of your application. Caching is a valuable technique for improving performance, but it's essential to cache the right data to achieve the desired benefits.
Here are some guidelines to help you determine what data to cache:
- **Frequently Accessed Data**: Cache data that is accessed often but changes rarely, that is, data that is read far more often than it is written. Examples include static content like images, stylesheets, and JavaScript files, as well as database query results for frequently used queries.
- **Expensive Computations**: Cache the results of computationally expensive operations or calculations. If your application performs complex calculations or generates reports, caching the results can save CPU cycles and reduce response times.
- **Session Data**: If your application relies on user sessions, consider caching session data in memory. This can include user authentication status, user preferences, and shopping cart contents. Caching session data can reduce database load and improve session access times.
- **API Responses**: If your application serves APIs, consider caching API responses for read-heavy endpoints. Responses that don't change frequently or depend on static data can be cached to reduce the load on your API servers.
- **Rendered Templates**: Cache parts of rendered templates that are computationally intensive or don't change frequently. Template fragment caching can be used to store specific sections of a template, such as a sidebar or a product listing, while still rendering dynamic content.
- **Database Query Results**: Cache the results of database queries that are resource-intensive and return relatively static data. Use Django's caching framework or an appropriate caching library to store and retrieve query results.
- **External API Responses**: If your application relies on external APIs, cache the responses from these APIs to reduce the number of requests to external services. Choose the cache expiration carefully so that the cached responses don't drift too far from the live data.
- **Real-Time Data**: For real-time applications, cache fast-changing data briefly (with a short timeout) to reduce the overhead of fetching it on every request. Examples include stock prices, weather data, or live sports scores.
- **Metadata**: Cache metadata or configuration data that is read frequently but doesn't change often. This can include information about application settings, user roles, or available product categories.
- **Frequently Computed Results**: Cache the results of frequently computed data that can be reused across multiple requests. For example, if your application generates recommendations based on user behavior, cache the recommendation results.
- **User-Generated Content**: Be cautious when caching user-generated content, as it can change frequently. If you decide to cache user-generated content, implement cache expiration and invalidation mechanisms to ensure freshness.
- **Custom Business Logic**: Identify custom business logic in your application that can benefit from caching. This could involve caching custom calculations, business rules, or specific data transformations.
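Several of the guidelines above mention expiration and invalidation. As a rough illustration of those two ideas, here is a tiny in-memory cache written in plain Python. The `TTLCache` class, the `categories` key, and the fake clock are all hypothetical names invented for this sketch; a real project would normally rely on Django's cache framework or Redis instead.

```python
import time
from typing import Any, Callable, Dict, Tuple


class TTLCache:
    """Tiny in-memory cache with per-entry expiration (illustrative only)."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock  # injectable so expiry can be simulated
        self._entries: Dict[str, Tuple[float, Any]] = {}

    def set(self, key: str, value: Any, timeout: float) -> None:
        # Store the value together with the moment it stops being valid
        self._entries[key] = (self._clock() + timeout, value)

    def get(self, key: str, default: Any = None) -> Any:
        entry = self._entries.get(key)
        if entry is None:
            return default  # never cached
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._entries[key]  # expired: drop the stale entry
            return default
        return value

    def delete(self, key: str) -> None:
        # Explicit invalidation, e.g. right after the source data changes
        self._entries.pop(key, None)


now = [0.0]  # fake clock so the example does not have to sleep
demo_cache = TTLCache(clock=lambda: now[0])

demo_cache.set("categories", ["books", "music"], timeout=20)
fresh = demo_cache.get("categories")    # still within the 20 seconds

now[0] = 25.0                           # pretend 25 seconds have passed
expired = demo_cache.get("categories")  # entry has timed out

demo_cache.set("categories", ["books"], timeout=20)
demo_cache.delete("categories")         # invalidate before the timeout
after_delete = demo_cache.get("categories")
```

Expiration bounds how stale an entry can get, while explicit deletion removes it the moment you know the source has changed. Most real caches, Redis included, offer both.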
## How to use Redis to cache data in your project
Let's say we have the following function in the views.py of our Django application:
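```
from django.shortcuts import render

from .models import Subject  # assumes Subject is defined in this app's models.py


def index(request):
    subjects = Subject.objects.all()  # hits the database on every request
    context = {
        'subjects': subjects,
    }
    return render(request, 'cachedjangoapp/index.html', context)
```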
The index function above retrieves data from the Subject model's database table using Django's object-relational mapper (ORM), assigns the data to the context dictionary, and passes that dictionary to the index.html template. Let's say the data in the Subject model is accessed frequently but changes rarely. To cache the data using Redis, we will:
- Install redis-py in our environment using the following command:
`pip install redis==4.3.4`
- Then, edit the settings.py file of our Django project and modify the CACHES setting, as follows:
```
CACHES = {
    'default': {
        'BACKEND': 'django.core.cache.backends.redis.RedisCache',
        'LOCATION': 'redis://127.0.0.1:6379',
    }
}
```
The project now uses the RedisCache cache backend, which is built into Django 4.0 and later. The location is defined in the format redis://[host]:[port]. Here, 127.0.0.1 points to the local host and 6379 is the default port for Redis.
- Start the Redis Docker container using the following command:
`docker run -it --rm --name redis -p 6379:6379 redis`
If you want to run the container in the background (in detached mode), you can use the -d option.
- We will use the low-level cache API, which lets you cache specific queries or calculations. To cache the `Subject.objects.all()` query in the index function, we import cache and use its get and set methods to read and write cached data:
```
from django.core.cache import cache
from django.shortcuts import render

from .models import Subject  # assumes Subject is defined in this app's models.py


def index(request):
    subjects = cache.get('all_subjects')
    if subjects is None:  # cache miss: not cached yet, or timed out
        subjects = Subject.objects.all()
        cache.set('all_subjects', subjects, 20)  # keep for 20 seconds
    context = {
        'subjects': subjects,
    }
    return render(request, 'cachedjangoapp/index.html', context)
```
Above, you try to get the all_subjects key from the cache using cache.get(). This returns None if the given key is not found. If no key is found (not cached yet, or cached but timed out), you perform the query to retrieve all Subject objects and cache the result using cache.set(). We use set(key, value, timeout) to store the subjects QuerySet under the key 'all_subjects' for 20 seconds. If you don't specify a timeout, Django uses the default timeout specified for the cache backend in the CACHES setting.
## Using Django Debug Toolbar to check the cache requests
- First install Django Debug Toolbar with the following command:
`pip install django-debug-toolbar==3.6.0`
- Edit the settings.py file of your project and add debug_toolbar to the INSTALLED_APPS setting as follows.
```
INSTALLED_APPS = [
    # ...
    'debug_toolbar',
]
```
- In the same file, add the following line to the MIDDLEWARE setting. It should be placed before any other middleware except for middleware that encodes the response’s content, such as GZipMiddleware, which, if present, should come first:
```
MIDDLEWARE = [
    'debug_toolbar.middleware.DebugToolbarMiddleware',
    # ...
]
```
- Add the following lines at the end of the settings.py file:
```
INTERNAL_IPS = [
    '127.0.0.1',
]
```
Django Debug Toolbar will only display if your IP address matches an entry in the INTERNAL_IPS setting.
- Edit the main urls.py file of the project and add the following URL pattern to urlpatterns (include must be imported from django.urls):
```
path('debug/', include('debug_toolbar.urls')),
```
- Run the development server and open the URL path of the index function in your browser (for me, it's http://127.0.0.1:8000/).
You should now see Django Debug Toolbar on the right side of the page. Click on Cache in the sidebar menu. You will see the following panel:
![cache panel of django debug toolbar](https://dev-to-uploads.s3.amazonaws.com/uploads/articles/cp331gqr81ecp3ha9p8k.PNG)
Under Total calls you should see 2. The first time the index view is executed there are two cache requests. Under Commands you will see that the get command has been executed once, and that the set command has been executed once as well. The get command corresponds to the call that retrieves the all_subjects cache key. This is the first call displayed under Calls. The first time the view is executed a cache miss occurs because no data is cached yet. That’s why there is 1 under Cache misses. Then, the set command is used to store the results of the subjects QuerySet in the cache using the all_subjects cache key. This is the second call displayed under Calls.
In the SQL menu item of Django Debug Toolbar, you will see the total number of SQL queries executed in this request. This includes the query to retrieve all subjects that are then stored in the cache:
![SQL menu](https://dev-to-uploads.s3.amazonaws.com/uploads/articles/z2l7tvr825watipv7y9j.PNG)
Reload the page in the browser and click on Cache in the sidebar menu:
![cache menu on reload](https://dev-to-uploads.s3.amazonaws.com/uploads/articles/x2k1tujcz50cn1ksbx1n.PNG)
Now, there is only a single cache request. Under Total calls you should see 1. And under Commands you can see that the cache request corresponds to a get command. In this case there is a cache hit (see Cache hits) instead of a cache miss because the data has been found in the cache. Under Calls you can see the get request to retrieve the all_subjects cache key.
Check the SQL menu item of the debug toolbar. You should see that there is one less SQL query in this request. You are saving one SQL query because the view finds the data in the cache and doesn’t need to retrieve it from the database:
![SQL menu on reload](https://dev-to-uploads.s3.amazonaws.com/uploads/articles/xyh93q6m4xtnz68u4bun.PNG)
Successive requests to the same URL will retrieve the data from the cache. When the timeout is reached (20 seconds in our case), the next request to the URL will generate a cache miss, the QuerySet will be executed, and the data will be cached for another 20 seconds. You can define a different default timeout in the TIMEOUT element of the CACHES setting.
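As a rough sketch of that last point, the CACHES setting from earlier could carry a TIMEOUT entry. The 15-minute value is an arbitrary assumption chosen for illustration. Django's documented default is 300 seconds, and a value of None means entries never expire.

```python
CACHES = {
    "default": {
        "BACKEND": "django.core.cache.backends.redis.RedisCache",
        "LOCATION": "redis://127.0.0.1:6379",
        # Default timeout in seconds for keys set without an explicit timeout
        # (assumed value for illustration: 15 minutes)
        "TIMEOUT": 60 * 15,
    }
}
```

A timeout passed directly to cache.set(), like the 20 seconds used in the view, still takes precedence over this default.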
**Note**: above we used the low-level cache API. Django also offers:
- **Template cache**: Allows you to cache template fragments. You need to load the cache template tags in your template using `{% load cache %}`. Then, you will be able to use the `{% cache %}` template tag to cache specific template fragments.
<br>
- **Per-view cache**: Provides caching for individual views. You use the cache_page decorator located at `django.views.decorators.cache`. The decorator requires a timeout argument (in seconds).
<br>
- **Per-site cache**: The highest-level cache. It caches your entire site. To allow the per-site cache, edit the settings.py file of your project and add the UpdateCacheMiddleware and FetchFromCacheMiddleware classes to the MIDDLEWARE setting, as follows:
```
MIDDLEWARE = [
    'debug_toolbar.middleware.DebugToolbarMiddleware',
    'django.middleware.security.SecurityMiddleware',
    'django.contrib.sessions.middleware.SessionMiddleware',
    'django.middleware.cache.UpdateCacheMiddleware',
    'django.middleware.common.CommonMiddleware',
    'django.middleware.cache.FetchFromCacheMiddleware',
    'django.middleware.csrf.CsrfViewMiddleware',
    'django.contrib.auth.middleware.AuthenticationMiddleware',
    'django.contrib.messages.middleware.MessageMiddleware',
    'django.middleware.clickjacking.XFrameOptionsMiddleware',
]
```
## References
Antonio Melé: Django 4 By Example. 4th ed., Packt, 2022.
Cache (computing). In Wikipedia. https://en.wikipedia.org/wiki/Cache_(computing)
## Conclusion
We have touched on various aspects of caches and caching. However, it's important to note that caching should be implemented thoughtfully. Not all data or resources should be cached, and cached data needs to be managed carefully to ensure it remains consistent with the source data and doesn't become stale.
<br>
Thanks for reading