Advanced Concepts: Pagination, Filtering, Versioning, and Rate Limiting
Lesson 5.4

Advanced Rate Limiting and Throttling

Simple IP-based rate limiting is the starting point, but production APIs need more sophisticated strategies. This lesson covers rate limiting algorithms, per-user limits, Redis-backed distributed rate limiting, and communicating limits to clients through headers.

Rate Limiting Algorithms

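The Fixed Window algorithm counts requests in fixed time windows (e.g., 100 requests per hour). It is simple, but it allows bursts at window boundaries: a client that uses its full quota at the end of one window and again at the start of the next can send up to twice the limit in a short span. The Sliding Window algorithm counts requests in a rolling window that ends at the current request, which removes the boundary effect and gives more uniform protection against bursts. The Token Bucket algorithm gives each client a bucket of tokens that refills at a constant rate up to a fixed capacity, and each request consumes one token. It naturally allows short bursts (up to the bucket's capacity) while enforcing a long-term average rate equal to the refill rate.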
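
To make the token bucket behaviour concrete, here is a minimal in-memory sketch. The class name TokenBucket, its option names (capacity, refillPerSecond, now), and the fake clock are illustrative assumptions, not part of any library; a production limiter would keep one bucket per client and store it somewhere shared.

```javascript
// Minimal token bucket sketch (illustrative; names are assumptions).
class TokenBucket {
  constructor({ capacity, refillPerSecond, now = () => Date.now() }) {
    this.capacity = capacity;               // maximum burst size
    this.refillPerSecond = refillPerSecond; // long-term average rate
    this.now = now;                         // injectable clock, in milliseconds
    this.tokens = capacity;                 // the bucket starts full
    this.lastRefill = now();
  }

  // Add tokens for the time elapsed since the last refill, capped at capacity.
  refill() {
    const current = this.now();
    const elapsedSeconds = (current - this.lastRefill) / 1000;
    this.tokens = Math.min(
      this.capacity,
      this.tokens + elapsedSeconds * this.refillPerSecond
    );
    this.lastRefill = current;
  }

  // Returns true and spends one token if the request is allowed.
  tryConsume() {
    this.refill();
    if (this.tokens >= 1) {
      this.tokens -= 1;
      return true;
    }
    return false;
  }
}

// Usage with a fake clock so the behaviour is deterministic.
let fakeTime = 0;
const bucket = new TokenBucket({
  capacity: 5,
  refillPerSecond: 1,
  now: () => fakeTime,
});

// A burst of six requests: the first five pass, the sixth is rejected.
const burst = Array.from({ length: 6 }, () => bucket.tryConsume());

// Two seconds later, two tokens have been refilled.
fakeTime += 2000;
const afterWait = [bucket.tryConsume(), bucket.tryConsume(), bucket.tryConsume()];
```

Here capacity sets how large a burst is tolerated and refillPerSecond sets the sustained rate, which is why the two can be tuned independently.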

Per-User Rate Limiting with Redis

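npm install express-rate-limit ioredis rate-limit-redis

// Recent rate-limit-redis releases export RedisStore by name; older ones use a default export.
const { rateLimit } = require('express-rate-limit');
const { RedisStore } = require('rate-limit-redis');
const Redis = require('ioredis');

const redisClient = new Redis(); // assumes Redis at localhost:6379

// 60 requests per minute, keyed by user ID (req.user set by earlier auth middleware), else by IP.
const userRateLimiter = rateLimit({
  windowMs: 60 * 1000,
  max: 60,
  keyGenerator: (req) => req.user ? req.user.id : req.ip,
  store: new RedisStore({ sendCommand: (...args) => redisClient.call(...args) })
});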

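Using Redis ensures rate limit state is shared across all API server instances. Without a shared store, each instance keeps its own in-memory counter, so a client whose requests are spread across M instances by a load balancer can make up to N requests to each one, or N × M in total, before being limited.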
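
To show what "shared state" means in practice, the following sketch implements a fixed-window counter on top of the Redis INCR and EXPIRE commands. It is an illustration under stated assumptions rather than how rate-limit-redis works internally: the function name hitFixedWindow, the key format, and the in-memory createFakeRedis stand-in are all hypothetical. The redis argument is assumed to be any client exposing incr(key) and expire(key, seconds), which an ioredis instance does.

```javascript
// Fixed-window counter backed by a shared store (illustrative sketch).
// `redis` is assumed to expose incr(key) and expire(key, seconds).
async function hitFixedWindow(redis, clientKey, { limit, windowSeconds, nowMs = Date.now() }) {
  // All requests in the same window map to the same key.
  const windowIndex = Math.floor(nowMs / (windowSeconds * 1000));
  const key = `ratelimit:${clientKey}:${windowIndex}`;

  // INCR is atomic, so concurrent server instances never lose an update.
  const count = await redis.incr(key);
  if (count === 1) {
    // First hit in this window: let the key expire once the window is over.
    await redis.expire(key, windowSeconds);
  }
  return { allowed: count <= limit, remaining: Math.max(0, limit - count) };
}

// In-memory stand-in for Redis so the sketch runs without a server.
function createFakeRedis() {
  const counts = new Map();
  return {
    async incr(key) {
      const next = (counts.get(key) || 0) + 1;
      counts.set(key, next);
      return next;
    },
    async expire() {
      return 1; // expiry is not simulated in this stand-in
    },
  };
}

// Every call shares one store, as separate API instances would share Redis.
async function demo() {
  const sharedStore = createFakeRedis();
  const startMs = 1705320000000;
  const results = [];
  for (let i = 0; i < 4; i++) {
    const hit = await hitFixedWindow(sharedStore, 'user-42', {
      limit: 3, windowSeconds: 60, nowMs: startMs,
    });
    results.push(hit.allowed);
  }
  // One window later the counter starts again under a new key.
  const next = await hitFixedWindow(sharedStore, 'user-42', {
    limit: 3, windowSeconds: 60, nowMs: startMs + 60000,
  });
  results.push(next.allowed);
  return results;
}
```

Because every instance increments the same key, the fourth request is rejected no matter which server receives it. With a real deployment, passing an ioredis client in place of the fake should work the same way, although the separate INCR and EXPIRE calls are often combined into a Lua script or transaction so a crash between them cannot leave a key without an expiry.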

Rate Limit Headers

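Communicate limits to clients via response headers: X-RateLimit-Limit: 100 (the maximum number of requests allowed per window), X-RateLimit-Remaining: 47 (requests left in the current window), and X-RateLimit-Reset: 1705320000 (when the window resets, here as a Unix timestamp in seconds). When the limit is exceeded, return 429 Too Many Requests and add Retry-After: 37 so clients know how many seconds to wait before retrying.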
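
As a rough sketch of how these headers might be assembled, the helper below builds them from the limiter's state. The function name buildRateLimitHeaders and its parameter names are hypothetical, and it assumes X-RateLimit-Reset is expressed as a Unix timestamp in seconds, as in the example values above.

```javascript
// Builds rate limit response headers from limiter state (illustrative sketch).
function buildRateLimitHeaders({ limit, remaining, resetEpochSeconds, nowEpochSeconds, exceeded }) {
  const headers = {
    'X-RateLimit-Limit': String(limit),
    'X-RateLimit-Remaining': String(Math.max(0, remaining)),
    'X-RateLimit-Reset': String(resetEpochSeconds),
  };
  if (exceeded) {
    // Retry-After is the number of seconds until the window resets.
    headers['Retry-After'] = String(Math.max(0, resetEpochSeconds - nowEpochSeconds));
  }
  return headers;
}

// A request that is still within the limit.
const okHeaders = buildRateLimitHeaders({
  limit: 100,
  remaining: 47,
  resetEpochSeconds: 1705320000,
  nowEpochSeconds: 1705319963,
  exceeded: false,
});

// A request that exceeded the limit 37 seconds before the reset.
const limitedHeaders = buildRateLimitHeaders({
  limit: 100,
  remaining: 0,
  resetEpochSeconds: 1705320000,
  nowEpochSeconds: 1705319963,
  exceeded: true,
});
// limitedHeaders['Retry-After'] is '37'
```

In an Express handler, the resulting object could be applied with res.set(headers), followed by res.status(429) when the limit was exceeded.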