Caching is a great way to speed up slow pages and to make your API faster in general.
Let's say we want to cache the response of our API.
Key to cache invalidation
The first thing to think about when adding a cache is cache invalidation.
Rails has handy methods for that: both ActiveRecord models and ActiveRecord::Relation respond to cache_key and cache_key_with_version.
While cache_key returns only the model name and id of a record, e.g.
Product.where("name like ?", "%Game%").last.cache_key
# => "products/124"
cache_key_with_version additionally appends the timestamp of the last change, e.g.
Product.find(124).cache_key_with_version
# => "products/124-20240624103815954181"
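By default the version is taken from the updated_at column. Rails versions before 6.0 also let you pass a different timestamp column to a record's cache_key, like this (newer versions expect you to override cache_version on the model instead):
Product.find(124).cache_key(:last_reviewed_at)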
For a relation it works a bit differently: the key is based on a hash of the generated SQL query, so the query params are taken into account too.
Product.where("name like ?", "%Game%").cache_key
# => "products/query-1850ab3d302391b85b8693e941286659"
And cache_key_with_version additionally appends the number of matching records and the latest updated_at timestamp among them:
Product.where("name like ?", "%Game%").cache_key_with_version
# => "products/query-e0db51fbb1a07ab9545d84d80aac3d16-124-20240628093023387346"
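To make the difference between the key and the version concrete, here is a minimal plain-Ruby sketch. It only imitates the idea and is not how Rails implements it: the Record struct and its methods are hypothetical stand-ins for a model. The stable part identifies the record, and the version part changes whenever updated_at changes.

```ruby
# Plain-Ruby imitation of the key/version split; not Rails' implementation.
Record = Struct.new(:id, :updated_at) do
  # Stable part: identifies the record and never changes.
  def cache_key
    "products/#{id}"
  end

  # Volatile part: changes whenever the record is updated.
  def cache_version
    updated_at.utc.strftime("%Y%m%d%H%M%S%6N")
  end

  def cache_key_with_version
    "#{cache_key}-#{cache_version}"
  end
end

product = Record.new(124, Time.utc(2024, 6, 24, 10, 38, 15, 954_181))
old_key = product.cache_key_with_version

# Simulate an update: the plain key stays the same, the versioned key changes.
product.updated_at = Time.utc(2024, 6, 28, 9, 30, 23, 387_346)
new_key = product.cache_key_with_version
```

An entry stored under old_key is never read again after the update, so it does not have to be deleted explicitly; it just expires.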
Two caching strategies
Adding caching is easy; the hard part is cache invalidation. There are two strategies for it:
- Push method: on some trigger (e.g., once an hour, or when a product changes), we prepare the data and store it in the cache.
  The upside is that the data is always in the cache when a user calls the API.
  The downside is that we need to prepare the data everywhere a trigger fires, and be careful with cache invalidation (it's very easy to miss a place that occasionally changes your model).
- Pull method: the moment we need the data, we check the cache, and if it's not there, we get the data the usual way and store it in the cache for the future.
  It's easier to implement, as we worry less about invalidation (though it still needs some work).
  But it handles cache misses worse, as some users will wait the full time for a response. So to use it, your API should not be too slow!
For the push method, cache_key is good enough, but for the pull method we need to check whether the cached data is still correct, so cache_key_with_version is our friend.
I'd say it's generally better to start with the pull method, and use the push method when the cache can be refreshed on a schedule, e.g. once a day.
But you should always take into consideration your requirements.
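The two strategies can be sketched with a plain Hash standing in for the cache. This is a toy illustration under stated assumptions: Catalog, pull, and push are hypothetical names, and a real app would use Rails.cache instead of a constant.

```ruby
# Toy in-memory cache; a real app would use Rails.cache or similar.
CACHE = {}

# Hypothetical slow data source that counts how often it is hit.
class Catalog
  attr_reader :loads, :version

  def initialize
    @loads = 0
    @version = 1
  end

  def load_products
    @loads += 1
    "products-v#{@version}"
  end

  def change!
    @version += 1
  end
end

# Pull: check the cache first, compute and store on a miss.
# The version is part of the key, so stale entries are never read again.
def pull(catalog)
  CACHE["products-#{catalog.version}"] ||= catalog.load_products
end

# Push: a trigger (scheduled job, model callback) refreshes a fixed key up front.
def push(catalog)
  CACHE["products"] = catalog.load_products
end

catalog = Catalog.new
pull(catalog)    # miss: hits the data source
pull(catalog)    # hit: served from the cache
catalog.change!
pull(catalog)    # miss again, because the versioned key changed
push(catalog)    # always hits the data source, whether or not anyone asks
```

Note that pull only pays for the data when somebody requests it, while push pays on every trigger but keeps readers from ever waiting.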
Going to the project
We'll go with the pull method as the easier one, and use cache_key_with_version. Let's add a helper method to the Grape API:
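helpers do
  # Returns the JSON for a resource, cached per versioned key and namespace.
  def present(resource, namespace = 'MyAPI', caching: false)
    return resource.to_json unless caching

    cache_key = [resource.cache_key_with_version, namespace].join('-')
    Rails.cache.fetch(cache_key, expires_in: 1.day) do
      resource.to_json
    end
  rescue Redis::TimeoutError
    # Fall back to the uncached response if Redis is too slow to answer
    resource.to_json
  end
end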
Besides resource.cache_key_with_version, we also keep separate caches for different API namespaces (we don't want users to see the admin's output).
In more complex examples, you could add more params, e.g., options for serializers if they change the output when SQL is the same.
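One possible way to fold such extra params into the key is to append a digest of them. This is only a sketch: build_cache_key and the option names are hypothetical and not part of Rails or Grape.

```ruby
require "digest"

# Hypothetical helper: builds a cache key from the versioned resource key,
# the API namespace, and any serializer options that change the output.
def build_cache_key(versioned_key, namespace, options = {})
  parts = [versioned_key, namespace]
  unless options.empty?
    # Sort so the same options in a different order produce the same key.
    canonical = options.sort_by { |name, _| name.to_s }
                       .map { |name, value| "#{name}=#{value}" }
                       .join("&")
    parts << Digest::MD5.hexdigest(canonical)
  end
  parts.join("-")
end

plain   = build_cache_key("products/124-1", "API::V2::Products")
compact = build_cache_key("products/124-1", "API::V2::Products", { view: "compact", locale: "en" })
same    = build_cache_key("products/124-1", "API::V2::Products", { locale: "en", view: "compact" })
full    = build_cache_key("products/124-1", "API::V2::Products", { view: "full", locale: "en" })
```

Here compact and same end up identical, while full gets its own cache entry even though the SQL behind it is the same.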
Then we can use it in the API like this:
class Edu::API::V2::Products < Grape::API
  resources :products do
    get do
      products = Product.some_query(params)
      present products, "API::V2::Products", caching: true
    end
  end
end
The next thing is to make sure we invalidate products when the output depends on related models. Our cache depends on the updated_at column of products, so changes to other models will not affect it by default. We can fix that by touching the product whenever a related record changes:
class Option < ApplicationRecord
  # invalidate product cache if option changes
  belongs_to :product, touch: true
end

class Product < ApplicationRecord
  # invalidate product cache if promos change
  has_and_belongs_to_many :promos,
                          after_add: :touch_updated_at,
                          after_remove: :touch_updated_at

  def touch_updated_at(_promo)
    touch if persisted?
  end
end
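Why touching works can be shown with a small plain-Ruby imitation. FakeProduct and FakeOption are hypothetical stand-ins, not ActiveRecord: saving the child bumps the parent's timestamp, and with it the version part of the parent's cache key.

```ruby
# Plain-Ruby imitation of `belongs_to ..., touch: true`; not ActiveRecord itself.
class FakeProduct
  attr_reader :updated_at

  def initialize(updated_at)
    @updated_at = updated_at
  end

  def touch(now)
    @updated_at = now
  end

  def cache_version
    updated_at.utc.strftime("%Y%m%d%H%M%S%6N")
  end
end

class FakeOption
  attr_reader :name

  def initialize(product)
    @product = product
    @name = nil
  end

  # Saving the child bumps the parent's timestamp, as touch: true would.
  def update(name, now)
    @name = name
    @product.touch(now)
  end
end

product = FakeProduct.new(Time.utc(2024, 6, 24))
version_before = product.cache_version

FakeOption.new(product).update("XL", Time.utc(2024, 6, 25))
version_after = product.cache_version
```

The product itself was never edited, yet its version changed, so the next request computes a different cache key and regenerates the response.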
Things to take into consideration
It's better to use a separate Redis database number for the cache. Otherwise Redis can fill up with cache entries and start evicting other data, e.g. Sidekiq-related data (this depends on the Redis eviction config).
Beware of untrusted input. An attacker could fill up your cache by sending many different parameter values.
E.g., if you have a search parameter, it's better not to cache those requests, as every search string creates its own cache entry.
You can easily disable the cache for specific params:
params do
  optional :search
end
get do
  products = Product.some_query(params)
  caching = params[:search].blank?
  present products, "API::V2::Products", caching: caching
end
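If more than one param is involved, the same idea can be turned around into an allow-list: cache only when every param has a known, bounded set of values. This is a sketch, and CACHEABLE_PARAMS and cacheable? are hypothetical names, not part of Grape.

```ruby
# Hypothetical allow-list: only these params, with these values, may create cache entries.
CACHEABLE_PARAMS = {
  "sort" => %w[name price],
  "page" => (1..20).map(&:to_s)
}.freeze

# Cache only when every param is on the allow-list with an expected value,
# so the number of distinct cache keys stays bounded.
def cacheable?(params)
  params.all? do |name, value|
    CACHEABLE_PARAMS.fetch(name.to_s, []).include?(value.to_s)
  end
end

cacheable?({ "sort" => "name", "page" => 3 })   # allowed values only
cacheable?({ "search" => "Game" })               # free-form input, skip the cache
cacheable?({ "page" => "9999" })                 # outside the allowed range, skip the cache
```

With this allow-list there are at most 2 x 20 param combinations per query, so an attacker cannot grow the cache beyond that by varying the input.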
Links
Documentation:
- https://guides.rubyonrails.org/caching_with_rails.html Rails caching guide
- https://apidock.com/rails/ActiveRecord/Base/cache_key
- https://apidock.com/rails/ActiveRecord/Relation/cache_key
Additional concepts:
- Russian-doll caching
- Push vs Pull caching strategies in System Design