MongoDB Foundations, Lesson 1.4

MongoDB schema design: embedding vs referencing documents

embedded documents, document references, one-to-many patterns, data duplication trade-offs, 16 MB document limit, denormalization

Two ways to model relationships

MongoDB does not enforce foreign key constraints, so how you model a relationship is up to you. You either embed the related data directly inside the parent document, or store a reference: an _id value pointing to another document in a separate collection. Choosing between the two is one of the most important decisions in MongoDB schema design.

Embed when the data always travels together

Embedding is ideal when child data has no independent lifecycle and is always read alongside the parent. A user's shipping address is a typical example: you rarely query addresses without the user. With the address embedded, a single read returns the user and the address together, with no additional round-trips.

// Fully embedded - one document, one read
// ("u1" is a placeholder; a real ObjectId takes a 24-character hex string)
{
  "_id": ObjectId("u1"), "name": "Turing",
  "address": { "city": "Wilmslow", "zip": "SK9" },
  "tags": ["cs", "math"]
}
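
To make the "one read" point concrete, here is a minimal sketch in plain JavaScript. It assumes an in-memory array as a stand-in for a users collection, so nothing here talks to a real database; the helper names (findUserById, findUsersByCity), the reads counter, and the second user are hypothetical. The mongosh calls named in the comments (findOne, and find with dot notation on an embedded field) are the real operations the helpers imitate.

```javascript
// In-memory stand-in for a `users` collection; string ids replace ObjectIds.
// The second user is made up purely for illustration.
const users = [
  { _id: "u1", name: "Turing", address: { city: "Wilmslow", zip: "SK9" }, tags: ["cs", "math"] },
  { _id: "u2", name: "Sample", address: { city: "Elsewhere", zip: "00000" }, tags: [] },
];

let reads = 0; // counts simulated round-trips to the database

// Roughly what db.users.findOne({ _id: id }) would return.
function findUserById(id) {
  reads += 1;
  return users.find((u) => u._id === id) ?? null;
}

// Roughly what db.users.find({ "address.city": city }) would match:
// embedded fields can be filtered with dot notation.
function findUsersByCity(city) {
  reads += 1;
  return users.filter((u) => u.address.city === city);
}

const turing = findUserById("u1");
const label = `${turing.name}, ${turing.address.city} ${turing.address.zip}`;
const readsForLabel = reads; // the address arrived with the user document

const inWilmslow = findUsersByCity("Wilmslow").map((u) => u.name);

console.log(label, readsForLabel, inWilmslow);
```

Building the shipping label needs only the one simulated read, because the address is part of the user document rather than a separate lookup.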

Reference when data is shared or unbounded

Reference when child data is accessed independently, shared across many parents, or can grow without limit. A user's order history can reach thousands of records, each with its own line items. Embedding every order inside the user document makes that document grow without bound: every user load becomes more expensive, and a large enough history could eventually exceed MongoDB's 16 MB document limit.
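
The arithmetic behind the size limit can be sketched as follows. This is a rough estimate under stated assumptions, not a measurement: the document shapes are hypothetical, and the byte length of the JSON form is used only as a proxy, since actual BSON sizes differ. The helper name approxBytes is invented for this example.

```javascript
// MongoDB's maximum BSON document size: 16 MiB (16 * 1024 * 1024 bytes).
const MAX_DOC_BYTES = 16 * 1024 * 1024;

// Rough size proxy: UTF-8 bytes of the JSON form. BSON encodes values
// differently, so treat the result as an order-of-magnitude estimate.
function approxBytes(doc) {
  return Buffer.byteLength(JSON.stringify(doc), "utf8");
}

// Hypothetical shapes, for illustration only.
const userWithoutOrders = { _id: "u1", name: "Turing", orders: [] };
const sampleOrder = {
  _id: "o1",
  total: 99.99,
  items: [{ sku: "ABC-123", qty: 2, price: 49.995 }],
  shippedTo: { city: "Wilmslow", zip: "SK9" },
};

const baseBytes = approxBytes(userWithoutOrders);
const perOrderBytes = approxBytes(sampleOrder) + 1; // +1 for the comma between array elements
const maxEmbeddedOrders = Math.floor((MAX_DOC_BYTES - baseBytes) / perOrderBytes);

console.log({ baseBytes, perOrderBytes, maxEmbeddedOrders });
```

Plugging in an order shape closer to your own data gives a feel for how many embedded orders it takes before the limit becomes a concern. The cost of reading and rewriting a large document on every access is often felt well before that point.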

// User stores only IDs ("u1", "o1", "o2" are placeholders; real ObjectIds are 24-character hex strings)
{ "_id": ObjectId("u1"), "orderIds": [ObjectId("o1"), ObjectId("o2")] }
// Orders live in their own collection
{ "_id": ObjectId("o1"), "userId": ObjectId("u1"), "total": 99.99 }
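
Reading referenced data takes a second step. The sketch below shows an application-level join in plain JavaScript, assuming in-memory arrays as stand-ins for the users and orders collections; the helper names (findUser, findOrdersForUser, loadUserWithOrders) and the extra sample orders are hypothetical. The mongosh calls in the comments are the real operations each helper imitates.

```javascript
// In-memory stand-ins for two collections; string ids replace ObjectIds.
const usersCollection = [{ _id: "u1", name: "Turing" }];
const ordersCollection = [
  { _id: "o1", userId: "u1", total: 99.99 },
  { _id: "o2", userId: "u1", total: 15.5 }, // made-up order for illustration
  { _id: "o3", userId: "u2", total: 42 },   // belongs to a different user
];

// Roughly db.users.findOne({ _id: userId }).
function findUser(userId) {
  return usersCollection.find((u) => u._id === userId) ?? null;
}

// Roughly db.orders.find({ userId }). The reference stored on each order
// is enough to locate a user's orders from the orders side.
function findOrdersForUser(userId) {
  return ordersCollection.filter((o) => o.userId === userId);
}

// Application-level join: two reads, combined in code.
function loadUserWithOrders(userId) {
  const user = findUser(userId);
  if (user === null) return null;
  return { ...user, orders: findOrdersForUser(userId) };
}

const result = loadUserWithOrders("u1");
const orderIds = result.orders.map((o) => o._id);

console.log(result.name, orderIds);
```

Against a real deployment, the same join can also be done server-side with the $lookup aggregation stage, and a query by userId would normally want an index on that field. Note that an ever-growing array of order IDs on the user is itself unbounded, so for very long histories the userId on each order may be the only reference worth keeping.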

In practice, schemas often mix both patterns: embed data that is read together on hot paths, and reference data that is unbounded, shared, or accessed independently.