Gateway-level rate limiting
- Gateway-level rate limiting is a popular approach in which rate limits are configured and enforced at the API gateway, in front of your application servers.
- Gateway-level rate limiting is typically implemented in API gateways such as Kong, Google's Apigee, or Amazon API Gateway.
- Gateway-level rate limiting can provide simple and effective rate limiting, but may not offer as much fine-grained control as other approaches.
Token bucket algorithm
- The token bucket algorithm is a popular rate limiting algorithm that involves allocating tokens to API requests.
- The tokens are refilled at a set rate, and when an API request is made, it must consume a token.
- If there are no tokens available, the request is rejected.
- The token bucket algorithm is commonly used in many rate limiting libraries and tools, such as rate-limiter, redis-rate-limiter, and Google Cloud Endpoints (see the sketch below).
More: Token Bucket vs Bursty Rate Limiter by @animir
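To make the mechanics concrete, here is a minimal in-memory sketch of a token bucket. The class and parameter names are illustrative, not taken from any of the libraries above:

```js
// Minimal in-memory token bucket: holds up to `capacity` tokens,
// refilled continuously at `refillRate` tokens per second.
class TokenBucket {
  constructor(capacity, refillRate) {
    this.capacity = capacity;
    this.refillRate = refillRate;
    this.tokens = capacity;
    this.lastRefill = Date.now();
  }

  tryConsume() {
    // Lazily top the bucket up based on the time elapsed since the last call.
    const now = Date.now();
    const elapsedSec = (now - this.lastRefill) / 1000;
    this.tokens = Math.min(this.capacity, this.tokens + elapsedSec * this.refillRate);
    this.lastRefill = now;

    if (this.tokens >= 1) {
      this.tokens -= 1;
      return true; // request allowed
    }
    return false; // no tokens left: reject
  }
}

// Allow bursts of up to 10 requests, refilled at 5 requests per second.
const bucket = new TokenBucket(10, 5);
console.log(bucket.tryConsume()); // true while tokens remain
```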
Leaky bucket algorithm
- The leaky bucket algorithm is similar to the token bucket algorithm, but instead of allocating tokens, API requests are added to a "bucket" at a set rate.
- If the bucket overflows, the requests are rejected.
- The leaky bucket algorithm can be useful for smoothing out request bursts and for ensuring that requests are processed at a consistent rate (see the sketch below).
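A minimal in-memory sketch of the idea, with a fixed-size queue drained on a timer. All names here are illustrative:

```js
// Minimal leaky bucket: requests queue up in a fixed-size bucket and
// "leak" out (get processed) at a constant rate, regardless of bursts.
class LeakyBucket {
  constructor(capacity, leakRatePerSec, processFn) {
    this.capacity = capacity;
    this.queue = [];
    // Drain one queued request every 1000 / leakRatePerSec milliseconds.
    setInterval(() => {
      const request = this.queue.shift();
      if (request) processFn(request);
    }, 1000 / leakRatePerSec);
  }

  add(request) {
    if (this.queue.length >= this.capacity) {
      return false; // bucket overflow: reject the request
    }
    this.queue.push(request);
    return true; // queued; it will be processed at the steady leak rate
  }
}

// Hold at most 20 pending requests, processing 2 per second.
const bucket = new LeakyBucket(20, 2, (req) => console.log('processing', req));
```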
Sliding window algorithm
- The sliding window algorithm is a rate limiting approach that involves tracking the number of requests made in a sliding window of time.
- If the number of requests exceeds a set limit, further requests are rejected.
- The sliding window algorithm is commonly used in many rate limiting libraries and tools, such as Django Ratelimit, Express Rate Limit, and Kubernetes rate limiting (see the sketch below).
More: Rate limiting using the Sliding Window algorithm by @satrobit
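Here is a minimal in-memory sketch of the sliding-window-log variant, which keeps a timestamp per request. Again, the names are illustrative:

```js
// Minimal sliding-window log: remember each request's timestamp and count
// only those that fall inside the last `windowMs` milliseconds.
class SlidingWindowLimiter {
  constructor(limit, windowMs) {
    this.limit = limit;
    this.windowMs = windowMs;
    this.timestamps = [];
  }

  allow() {
    const now = Date.now();
    // Drop timestamps that have slid out of the window.
    this.timestamps = this.timestamps.filter((t) => now - t < this.windowMs);
    if (this.timestamps.length >= this.limit) {
      return false; // limit reached within the current window
    }
    this.timestamps.push(now);
    return true;
  }
}

// Allow 100 requests per rolling 60-second window.
const limiter = new SlidingWindowLimiter(100, 60 * 1000);
```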
Distributed rate limiting
- For high-traffic APIs, it may be necessary to implement rate limiting across multiple servers.
- Approaches such as Redis-backed counters or consistent-hashing-based key partitioning can be used to enforce a shared limit across multiple servers.
- Distributed rate limiting helps keep limits consistent across servers and can reduce the impact of traffic spikes; the walkthrough below uses the Redis-based approach.
In this example, we'll create a simple Next.js application with a rate-limited API endpoint using Redis and Upstash. Upstash is a serverless Redis database provider that allows you to interact with Redis easily and cost-effectively.
First, let's create a new Next.js project:
```bash
npx create-next-app redis-rate-limit-example
cd redis-rate-limit-example
```
Install the required dependencies (ioredis speaks the standard Redis protocol, which Upstash supports, so no Upstash-specific client is needed):

```bash
npm install ioredis@4.27.6 express-rate-limit@5.3.0
```
Create a `.env.local` file in the project root to store your Upstash Redis credentials:

```
UPSTASH_REDIS_URL=your_upstash_redis_url_here
```
Replace `your_upstash_redis_url_here` with your actual Upstash Redis URL.
Create a new API route in `pages/api/limited.js`:
```js
import rateLimit from 'express-rate-limit';
import { connectRedis } from '../../lib/redis';
import { RedisStore } from '../../lib/redis-store';

const redisClient = connectRedis();

const rateLimiter = rateLimit({
  store: new RedisStore({ client: redisClient, windowMs: 60 * 1000 }),
  windowMs: 60 * 1000, // 1 minute
  max: 5, // limit each IP to 5 requests per minute
  // Next.js API routes don't populate req.ip, so derive the client IP ourselves.
  keyGenerator: (req) => req.headers['x-forwarded-for'] || req.socket.remoteAddress,
  handler: (req, res, next) => {
    res.status(429).json({ message: 'Too many requests, please try again later.' });
    next(); // the 429 has already been sent; let the wrapped promise resolve
  },
});

// express-rate-limit is Express-style middleware; wrap it in a Promise so it
// can be awaited inside a Next.js API route.
function runMiddleware(req, res, fn) {
  return new Promise((resolve, reject) => {
    fn(req, res, (result) =>
      result instanceof Error ? reject(result) : resolve(result)
    );
  });
}

export default async function handler(req, res) {
  try {
    await runMiddleware(req, res, rateLimiter);
  } catch (error) {
    return res.status(500).json({ message: 'Internal server error' });
  }
  if (res.writableEnded) return; // the limiter already sent the 429 response
  res.status(200).json({ message: 'Success! Your request was not rate-limited.' });
}

export const config = {
  api: {
    bodyParser: false,
  },
};
```
Create a `lib/redis.js` file to handle Redis connections:
```js
import Redis from 'ioredis';

// Cache the client so hot reloads and repeated invocations reuse one connection.
let cachedRedis = null;

export function connectRedis() {
  if (cachedRedis) {
    return cachedRedis;
  }
  const redis = new Redis(process.env.UPSTASH_REDIS_URL);
  cachedRedis = redis;
  return redis;
}
```
Create a new `RedisStore` class in `lib/redis-store.js`:
```js
import { connectRedis } from './redis';

// A minimal store implementing the interface express-rate-limit v5 expects:
// incr(key, callback), decrement(key), and resetKey(key).
export class RedisStore {
  constructor({ client, windowMs = 60 * 1000 } = {}) {
    this.redis = client || connectRedis();
    this.windowSeconds = Math.ceil(windowMs / 1000);
  }

  async incr(key, cb) {
    try {
      const hits = await this.redis.incr(key);
      if (hits === 1) {
        // First hit in this window: start the window's expiry clock.
        await this.redis.expire(key, this.windowSeconds);
      }
      cb(null, hits);
    } catch (err) {
      cb(err);
    }
  }

  decrement(key) {
    this.redis.decr(key).catch(() => {});
  }

  resetKey(key) {
    this.redis.del(key).catch(() => {});
  }
}
```
Now you can test your rate-limited API endpoint by starting the development server:
```bash
npm run dev
```
Visit `http://localhost:3000/api/limited` in your browser, or use a tool like Postman or curl to make requests. You should see the "Success! Your request was not rate-limited." message. If you make more than 5 requests within a minute, you'll receive the rate limit message:
Too many requests, please try again later.
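One quick way to trip the limit from a terminal (the URL assumes the dev server defaults):

```bash
# Fire six requests in quick succession; the sixth should return the 429 message.
for i in $(seq 1 6); do curl -s http://localhost:3000/api/limited; echo; done
```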
User-based rate limiting
- Some APIs may require rate limiting at the user level, rather than the IP address or client ID level.
- User-based rate limiting involves tracking the number of requests made by a particular user account, and limiting requests if the user exceeds a set limit.
- User-based rate limiting is commonly used in many API frameworks, such as Django REST Framework, and can be implemented using session-based or token-based authentication (see the sketch below).
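With the express-rate-limit setup from the walkthrough above, switching from per-IP to per-user limiting mostly comes down to the key you count against. A sketch, assuming an upstream auth middleware has already attached a `req.user` object (that property is an assumption, not something express-rate-limit provides):

```js
import rateLimit from 'express-rate-limit';

// Per-user limiter: count requests against the authenticated user's ID.
// `req.user` is assumed to be attached by your own auth middleware.
const userRateLimiter = rateLimit({
  windowMs: 60 * 1000, // 1 minute
  max: 5,
  keyGenerator: (req) => (req.user ? `user:${req.user.id}` : req.ip),
});
```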
API key rate limiting
- For APIs that require authentication with an API key, rate limiting can be implemented at the API key level.
- API key rate limiting involves tracking the number of requests made with a particular API key, and limiting requests if the key exceeds a set limit.
- API key rate limiting is commonly used in many API frameworks, such as Flask-Limiter, and can be implemented using API key-based authentication (see the sketch below).
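The same keyGenerator approach works for API keys. A sketch, assuming clients send their key in an `x-api-key` header (the header name is an assumption; use whatever your API expects):

```js
import rateLimit from 'express-rate-limit';

// Per-key limiter: count requests against the API key instead of the client IP.
// Unauthenticated requests fall back to per-IP limiting.
const apiKeyRateLimiter = rateLimit({
  windowMs: 60 * 1000, // 1 minute
  max: 5,
  keyGenerator: (req) => req.headers['x-api-key'] || req.ip,
});
```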
Custom rate limiting
- Finally, it's worth noting that there are many other rate limiting approaches that can be customized to suit the needs of a particular API.
- Some examples include adaptive rate limiting, which adjusts the rate limit based on the current traffic load, and request complexity-based rate limiting, which takes into account the complexity of individual requests when enforcing rate limits.
- Custom rate limiting approaches can be useful for optimizing the rate limiting strategy for a specific API use case (an adaptive-limit sketch follows below).
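As one illustration, an adaptive limiter might scale its ceiling with current server load. This is only a sketch of the idea; the load signal and thresholds are invented for the example:

```js
// Adaptive limit: shrink the per-client allowance as overall load rises.
// `getCurrentLoad` is a hypothetical function returning a load factor in [0, 1].
function currentMaxRequests(baseMax, getCurrentLoad) {
  const load = getCurrentLoad();
  if (load > 0.9) return Math.floor(baseMax * 0.25); // heavy load: clamp hard
  if (load > 0.7) return Math.floor(baseMax * 0.5); // elevated load: halve
  return baseMax; // normal operation: full allowance
}
```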
For my latest project, Pub Index API, I'm using an API gateway for rate limiting.