Script Valley
System Design: APIs, Caching & Scalability
Rate Limiting and Throttling (Lesson 4.3)

Token bucket and leaky bucket rate limiting explained



Token bucket vs leaky bucket

Token Bucket

A bucket holds up to N tokens. Tokens refill at a fixed rate, such as 10 per second, and never exceed the bucket's capacity. Each request consumes one token. If at least one token is available, the request is allowed, so a client can burst up to the bucket capacity. If the bucket is empty, the request is rejected with HTTP 429 (Too Many Requests).

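// Assumes an async Redis client, an Express-style `res`, and BUCKET_MAX / REFILL_RATE (tokens per second) defined elsewhere.
// Note: this read-modify-write sequence is not atomic, so concurrent requests for the same user can race.
const [storedTokens, storedRefill] = await Promise.all([
  redis.get(`tokens:${userId}`),
  redis.get(`refill:${userId}`),
]);
const now = Date.now();
// Missing keys mean a first request: start with a full bucket instead of NaN.
const tokens = storedTokens === null ? BUCKET_MAX : parseFloat(storedTokens);
const lastRefill = storedRefill === null ? now : Number(storedRefill);
const elapsed = (now - lastRefill) / 1000; // seconds since the last refill
const newTokens = Math.min(BUCKET_MAX, tokens + elapsed * REFILL_RATE);
if (newTokens < 1) return res.status(429).end();
await redis.set(`tokens:${userId}`, String(newTokens - 1));
await redis.set(`refill:${userId}`, String(now));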
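The Redis snippet above needs a live server to run. As a rough, self-contained illustration of the same refill-then-consume logic, here is a minimal in-memory sketch. The TokenBucket class, its constructor parameters, and the injectable clock are assumptions made for this example and are not part of the lesson's code.

```javascript
// Minimal in-memory token bucket (illustrative sketch, single process only).
// `TokenBucket`, `tryConsume`, and the `now` clock parameter are hypothetical names.
class TokenBucket {
  constructor(capacity, refillRatePerSec, now = () => Date.now()) {
    this.capacity = capacity;
    this.refillRatePerSec = refillRatePerSec;
    this.now = now;
    this.tokens = capacity; // a new bucket starts full
    this.lastRefill = now();
  }

  // Returns true if the request is allowed, false if it should receive a 429.
  tryConsume() {
    const current = this.now();
    const elapsedSec = (current - this.lastRefill) / 1000;
    // Add the tokens earned since the last call, capped at capacity.
    this.tokens = Math.min(
      this.capacity,
      this.tokens + elapsedSec * this.refillRatePerSec
    );
    this.lastRefill = current;
    if (this.tokens < 1) return false;
    this.tokens -= 1;
    return true;
  }
}

// A fake clock keeps the demonstration deterministic.
let fakeTime = 0;
const tokenBucket = new TokenBucket(5, 10, () => fakeTime);

// Seven back-to-back requests: the bucket covers the first five.
const burst = Array.from({ length: 7 }, () => tokenBucket.tryConsume());
console.log(burst);

// 200 ms later, 10 tokens/s * 0.2 s = 2 tokens have been refilled.
fakeTime += 200;
const afterRefill = [
  tokenBucket.tryConsume(),
  tokenBucket.tryConsume(),
  tokenBucket.tryConsume(),
];
console.log(afterRefill);
```

With a capacity of 5, the burst is absorbed up to five requests and the rest are rejected until the refill rate makes more tokens available.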

Leaky Bucket

Requests enter the bucket at any rate. The bucket drains at a constant rate, processing one request per time unit. When the bucket is full, excess requests spill over and are rejected. This smooths traffic regardless of input bursts: the output rate never exceeds the drain rate. It is used when a steady downstream rate is critical, such as in front of payment processors or SMS gateways.
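The description above can be sketched as a bounded queue that releases at most one request per drain interval. This is a minimal illustration, not a production design: the LeakyBucket class and its offer and leak methods are hypothetical names, and the drain is driven by a manual loop here so the result is deterministic.

```javascript
// Minimal leaky bucket modelled as a bounded FIFO queue (illustrative sketch).
// `LeakyBucket`, `offer`, and `leak` are hypothetical names for this example.
class LeakyBucket {
  constructor(capacity) {
    this.capacity = capacity;
    this.queue = [];
  }

  // Returns true if the request was queued, false if the bucket overflowed.
  offer(request) {
    if (this.queue.length >= this.capacity) return false;
    this.queue.push(request);
    return true;
  }

  // Call once per drain interval; releases at most one queued request.
  // Returns undefined when the bucket is empty.
  leak() {
    return this.queue.shift();
  }
}

const leaky = new LeakyBucket(3);

// Five requests arrive at once: three fit, two spill over.
const accepted = ["a", "b", "c", "d", "e"].map((r) => leaky.offer(r));
console.log(accepted);

// Four drain intervals: one request leaves per interval, in arrival order.
const processed = [];
for (let tick = 0; tick < 4; tick++) {
  const next = leaky.leak();
  if (next !== undefined) processed.push(next);
}
console.log(processed);
```

In a running service the leak call would typically be driven by a timer such as setInterval, so the downstream system sees at most one request per interval no matter how bursty the input is.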

Choosing Between Them

Use token bucket when clients should be able to burst occasionally, such as mobile apps catching up after offline mode. Use leaky bucket when downstream systems require a smooth, predictable request rate and cannot handle spikes.
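To make the trade-off concrete, the following rough simulation pushes the same burst through both algorithms and records how many requests are forwarded downstream in each one-second tick. The function names, the tick-based model, and the parameter values are assumptions chosen for illustration only.

```javascript
// Hypothetical discrete-time comparison: each array entry is one second.

// Token bucket: forwards immediately while tokens last; the rest are rejected.
function simulateTokenBucket(arrivals, capacity, refillPerTick) {
  let tokens = capacity; // bucket starts full
  return arrivals.map((count, i) => {
    if (i > 0) tokens = Math.min(capacity, tokens + refillPerTick);
    const forwarded = Math.min(count, tokens);
    tokens -= forwarded;
    return forwarded;
  });
}

// Leaky bucket: queues up to capacity, then forwards at a fixed drain rate.
function simulateLeakyBucket(arrivals, capacity, drainPerTick) {
  let queued = 0;
  return arrivals.map((count) => {
    queued = Math.min(capacity, queued + count); // overflow is dropped
    const forwarded = Math.min(queued, drainPerTick);
    queued -= forwarded;
    return forwarded;
  });
}

// Ten requests arrive in the first second, then nothing.
const arrivals = [10, 0, 0, 0, 0, 0];
const tokenOut = simulateTokenBucket(arrivals, 5, 1);
const leakyOut = simulateLeakyBucket(arrivals, 5, 1);

console.log("token bucket:", tokenOut); // [5, 0, 0, 0, 0, 0]
console.log("leaky bucket:", leakyOut); // [1, 1, 1, 1, 1, 0]
```

Both buckets let five of the ten requests through, but the token bucket passes them downstream in a single spike while the leaky bucket spreads them over five seconds.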
