Script Valley
Database Design: From Idea to Schema
Schema for Scale/Assessment

Practice & Assessment

Test your understanding of Schema for Scale

Multiple Choice Questions

5
1

A table with 500 million rows is partitioned by YEAR(created_at). A query filters by WHERE created_at >= '2024-01-01'. What performance benefit does partitioning provide?

2

Why is created_at a poor choice for a sharding key in a high-write system?

3

What is the correct order of operations when archiving old rows to prevent data loss?

4

You want dashboard queries to run fast without impacting production OLTP performance. Which pattern achieves this?

5

What must be true about the shard key for a sharded schema to work correctly?

Coding Challenges

1
1

Design a Partitioned Events Table

Write CREATE TABLE DDL for an events table (event_id BIGINT, user_id INT, event_type VARCHAR, payload JSON, created_at TIMESTAMP) partitioned by RANGE on YEAR(created_at) for years 2022 through 2025 plus a MAXVALUE partition. Then write the SQL for: (1) an archive job that copies all 2022 events to an events_archive table and deletes them from the live table in batches of 5000; (2) a query that returns a count of events per event_type for the year 2024, demonstrating partition pruning. Time estimate: 30 minutes.

Hard

Mini Project

1

Scale-Ready Schema for a Multi-Tenant SaaS Platform

Design the full schema for a multi-tenant project management SaaS (think Asana or Linear). Requirements: tenants (organizations) each have users, projects, tasks, and comments; schema must be shard-friendly with tenant_id as the shard key propagated into every table's PK; tasks table must be partitioned by RANGE on YEAR(created_at); implement soft deletes on tasks and projects; add an audit table for task status changes; create a read-optimized materialized view (or equivalent) for per-tenant task completion rate. Provide all DDL, index definitions, and a written justification (as SQL comments) for each sharding and partitioning decision.

Hard