Caching Systems: Lesson 4.5
How to design a distributed cache system
cache sharding, cache cluster topology, replication for availability, hot key problem, local vs remote cache, cache hierarchy
Distributed Cache Architecture
A single Redis node typically tops out at roughly 100 GB of memory and on the order of 100K ops/second; the exact limits depend on hardware and workload. Larger systems need the cache distributed across multiple nodes.
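As a rough back-of-the-envelope sketch, the node count follows from whichever limit is hit first, memory or throughput. The function name `nodes_needed` and its parameters are hypothetical, the defaults reuse the per-node figures quoted above, and replicas and headroom are deliberately left out.

```python
import math


def nodes_needed(data_gb, peak_ops_per_sec, node_gb=100, node_ops_per_sec=100_000):
    """Estimate the minimum shard count (assumption: no replicas, no headroom)."""
    by_memory = math.ceil(data_gb / node_gb)
    by_throughput = math.ceil(peak_ops_per_sec / node_ops_per_sec)
    # The tighter of the two constraints decides the cluster size.
    return max(by_memory, by_throughput, 1)
```

For example, 450 GB of data at 250K ops/second is memory-bound (5 nodes), while 50 GB at 1M ops/second is throughput-bound (10 nodes).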
Sharding the Cache
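Use consistent hashing to map keys to cache nodes, so that adding or removing a node remaps only a fraction of the keys. The client library (or a proxy like Twemproxy) handles routing. Redis Cluster takes a related approach: each key maps to one of 16384 fixed hash slots, and slots are assigned to nodes.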
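To make the routing step concrete, here is a minimal sketch of a consistent-hash ring with virtual nodes. The class `HashRing`, the `vnodes` parameter, and the choice of MD5 are illustrative assumptions, not what any particular client library does internally.

```python
import bisect
import hashlib


class HashRing:
    """Minimal consistent-hash ring with virtual nodes (illustrative only)."""

    def __init__(self, nodes, vnodes=100):
        self._vnodes = vnodes
        self._ring = []  # sorted list of (hash, node) points
        for node in nodes:
            self.add_node(node)

    @staticmethod
    def _hash(value):
        # MD5 is used only for its even distribution, not for security.
        return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)

    def add_node(self, node):
        # Each node gets several points on the ring to even out the load.
        for i in range(self._vnodes):
            bisect.insort(self._ring, (self._hash(f"{node}#{i}"), node))

    def remove_node(self, node):
        self._ring = [point for point in self._ring if point[1] != node]

    def get_node(self, key):
        if not self._ring:
            raise ValueError("ring has no nodes")
        # First point clockwise from the key's hash, wrapping at the end.
        idx = bisect.bisect_right(self._ring, (self._hash(key), ""))
        return self._ring[idx % len(self._ring)][1]
```

With `ring = HashRing(["cache-0", "cache-1", "cache-2"])`, calling `ring.get_node("user:123")` always returns the same node, and removing one node only reassigns the keys that node owned.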
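# Client-side routing with the redis-py-cluster package (pip install redis-py-cluster)
# Note: redis-py 4.1+ ships its own cluster client as redis.cluster.RedisCluster
from rediscluster import RedisCluster

# Seed nodes; the client discovers the rest of the cluster topology from these
startup_nodes = [
    {'host': 'cache-0', 'port': 6379},
    {'host': 'cache-1', 'port': 6379},
    {'host': 'cache-2', 'port': 6379},
]
rc = RedisCluster(startup_nodes=startup_nodes, decode_responses=True)
rc.set('user:123', 'alice')  # automatically routed to the correct shard
Hot Key Problem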
If one key (e.g., a viral post) gets 1M reads/second, all requests hit one cache node, because sharding spreads different keys across nodes but cannot split traffic for a single key. Solutions:
- Replicate hot keys to multiple nodes and randomly pick one per request
- Use a local in-process cache (L1) for ultra-hot keys, backed by Redis (L2)
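The first option can be sketched as key suffixing: write the value under several derived keys and read from a random one. The helper names, the `#r<i>` suffix scheme, and the fan-out of 8 are assumptions for illustration; `client` is any object with `get` and `set` methods, such as a Redis client.

```python
import random

REPLICAS = 8  # hypothetical fan-out; tune to the observed read rate


def replica_keys(key, replicas=REPLICAS):
    """Derive the copy names for a hot key, e.g. 'post:42#r0' .. 'post:42#r7'."""
    return [f"{key}#r{i}" for i in range(replicas)]


def set_hot(client, key, value, replicas=REPLICAS):
    # Writes cost `replicas` operations, so reserve this for read-heavy keys.
    for copy_key in replica_keys(key, replicas):
        client.set(copy_key, value)


def get_hot(client, key, replicas=REPLICAS, rng=random):
    # Each read picks one copy at random, spreading load across the copies.
    return client.get(f"{key}#r{rng.randrange(replicas)}")
```

Because the derived keys hash independently, the copies tend to land on different shards, though that is not guaranteed. In Redis Cluster, avoid wrapping the base key in a `{...}` hash tag here, since that would force every copy into the same slot.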
Local vs Remote Cache Hierarchy
- L1 (in-process): microsecond access, limited size, per-instance cache (Guava Cache, Caffeine)
- L2 (Redis cluster): millisecond access, shared across instances, larger
- L3 (database): tens of milliseconds, source of truth
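A read path through this hierarchy might look like the sketch below. `TieredCache`, its parameters, and the oldest-first eviction are hypothetical simplifications; `l2` is any client with `get` and `set` (for example a Redis client), and `load_from_db` is an assumed callable that fetches from the source of truth.

```python
import time


class TieredCache:
    """Look up L1 (in-process dict), then L2 (remote), then the database."""

    def __init__(self, l2, load_from_db, l1_ttl=5.0, l1_max=1000):
        self._l1 = {}  # key -> (expires_at, value)
        self._l2 = l2
        self._load_from_db = load_from_db
        self._l1_ttl = l1_ttl  # short TTL bounds staleness across instances
        self._l1_max = l1_max

    def get(self, key):
        now = time.monotonic()
        entry = self._l1.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # L1 hit: no network round trip

        value = self._l2.get(key)
        if value is None:
            # L2 miss: fall back to the source of truth and backfill L2.
            value = self._load_from_db(key)
            if value is not None:
                self._l2.set(key, value)
        if value is not None:
            self._remember(key, value, now)
        return value

    def _remember(self, key, value, now):
        if key not in self._l1 and len(self._l1) >= self._l1_max:
            # Crude eviction: drop the oldest-inserted entry.
            self._l1.pop(next(iter(self._l1)))
        self._l1[key] = (now + self._l1_ttl, value)
```

The short L1 TTL is the main design choice: each application instance holds its own copy, so a small TTL limits how long instances can disagree after an update. A production setup would more likely use a purpose-built L1 such as Caffeine, or an LRU structure, than a plain dict.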
