Script Valley
System Design: APIs, Caching & Scalability
Caching Fundamentals, Lesson 2.1

What is caching and why does it matter for performance

cache definition, cache hit vs miss, cache hit ratio, latency reduction, origin offloading, cost reduction, when not to cache

The Core Concept

A cache is a fast, temporary storage layer that serves repeated requests without hitting the slow origin. The origin might be a database, an external API, or a computation-heavy function. Caching trades storage for speed.
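As a minimal sketch of that trade, the snippet below wraps a function with an in-memory Map. The names slowSquare and withCache are hypothetical stand-ins invented for illustration; a real origin would be a database query, an API call, or a heavy computation.

```javascript
// Hypothetical slow origin: counts how often it is actually called.
let originCalls = 0;
function slowSquare(n) {
  originCalls += 1;
  return n * n; // stand-in for a database query or expensive computation
}

// Wrap any single-argument function with an in-memory cache.
function withCache(fn) {
  const store = new Map();
  return function cached(key) {
    if (store.has(key)) return store.get(key); // served from the cache
    const value = fn(key);                     // go to the origin
    store.set(key, value);                     // spend memory now to save time later
    return value;
  };
}

const cachedSquare = withCache(slowSquare);
cachedSquare(12); // origin is called
cachedSquare(12); // answered from the cache
cachedSquare(7);  // new key, origin is called again
console.log(originCalls); // 2
```

Three requests reach the origin only twice. This sketch never evicts or expires entries, so memory grows with the number of distinct keys.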

Cache Hit vs Cache Miss

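A cache hit occurs when the requested data is already in the cache, so the response skips the origin entirely: typically microseconds for an in-process cache, and around a millisecond or less for a networked cache such as Redis. A cache miss occurs when it is not: the system fetches from the origin, stores the result, then responds. Your goal is to maximize the hit ratio, the fraction of requests served from the cache: hits / (hits + misses).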
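To make the effect of the hit ratio concrete, here is a rough sketch that computes it and estimates the average latency per request. The function names and the latency figures (1 ms for a cache lookup, 50 ms for the origin) are assumptions chosen for illustration, not measurements.

```javascript
// Hit ratio: the fraction of requests served from the cache.
function hitRatio(hits, misses) {
  const total = hits + misses;
  return total === 0 ? 0 : hits / total;
}

// Average latency per request. A hit pays only the cache lookup;
// a miss pays the cache lookup plus the trip to the origin.
function averageLatencyMs(ratio, cacheMs, originMs) {
  return ratio * cacheMs + (1 - ratio) * (cacheMs + originMs);
}

const ratio = hitRatio(750, 250);               // 0.75
console.log(averageLatencyMs(ratio, 1, 50));    // 13.5
console.log(averageLatencyMs(0.5, 1, 50));      // 26
```

Under these assumed numbers, raising the hit ratio from 0.5 to 0.75 roughly halves the average latency, because the slow misses dominate the total.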

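// Redis cache-aside example (assumes an ioredis client `redis` and a node-postgres pool `db`)
async function getUser(id) {
  const cached = await redis.get(`user:${id}`);
  if (cached) return JSON.parse(cached); // cache hit

  // Cache miss: fetch from the origin, then store the result for one hour
  const { rows } = await db.query('SELECT * FROM users WHERE id = $1', [id]);
  const user = rows[0];
  if (!user) return null; // do not cache a missing user
  await redis.setex(`user:${id}`, 3600, JSON.stringify(user));
  return user;
}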

When Not to Cache

Caching adds complexity: stale data, invalidation bugs, and cold-start problems (an empty cache sends every request to the origin until it warms up). Avoid caching when data changes on every request, when accuracy is critical (financial balances, inventory counts), or when the fetch is already fast. Cache read-heavy, slow-to-compute, or expensive-to-fetch data. The 80/20 rule is a useful rule of thumb: often roughly 20% of your data receives about 80% of the requests, and that hot subset is your best cache candidate.
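One way to find that hot subset is to count requests per key and take the most-requested keys until they cover a target share of traffic. The sketch below assumes you already have per-key request counts; the hotKeys function and the sample numbers are made up for illustration.

```javascript
// Return the most-requested keys that together cover `targetShare` of all requests.
function hotKeys(accessCounts, targetShare = 0.8) {
  // Sort keys from most to least requested.
  const entries = Object.entries(accessCounts).sort((a, b) => b[1] - a[1]);
  const total = entries.reduce((sum, [, count]) => sum + count, 0);
  const selected = [];
  let covered = 0;
  for (const [key, count] of entries) {
    if (total === 0 || covered / total >= targetShare) break;
    selected.push(key);
    covered += count;
  }
  return selected;
}

// Made-up summary of an access log: requests per key.
const counts = {
  'user:1': 500, 'user:2': 320, 'user:3': 60,
  'user:4': 60, 'user:5': 40, 'user:6': 20,
};
console.log(hotKeys(counts)); // [ 'user:1', 'user:2' ]
```

In this sample, two of the six keys account for 82% of the requests, so caching just those two would absorb most of the load. Real traffic will not always be this skewed, which is why measuring it first is worthwhile.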
