Aggregation Pipeline
Lesson 4.5
How to write MongoDB aggregation pipelines for real analytics
$facet multi-pipeline, $out and $merge to persist results, allowDiskUse for large datasets, aggregation pipeline optimization, pipeline explain
$facet: multiple sub-pipelines in one query
$facet processes the same input documents through multiple independent aggregation sub-pipelines and returns all results in one query response. This is ideal for analytics dashboards that need several different metrics from the same collection without multiple database round-trips.
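const dashboard = await db.collection('orders').aggregate([{
  $facet: {
    // Order count and total value per status
    byStatus: [
      { $group: { _id: '$status', count: { $sum: 1 }, total: { $sum: '$total' } } }
    ],
    // Group by year and month so the same month of different years is not merged
    revenueByMonth: [
      { $group: {
        _id: { year: { $year: '$createdAt' }, month: { $month: '$createdAt' } },
        revenue: { $sum: '$total' }
      }},
      { $sort: { '_id.year': 1, '_id.month': 1 } }
    ],
    // Five customers with the highest total spend
    topCustomers: [
      { $group: { _id: '$userId', spent: { $sum: '$total' } } },
      { $sort: { spent: -1 } }, { $limit: 5 }
    ]
  }
}]).toArray()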
console.log(dashboard[0]) // { byStatus: [...], revenueByMonth: [...], topCustomers: [...] }
Persisting pipeline results to a collection
// $out: replace the entire target collection
{ $out: 'monthly_revenue_cache' }
// $merge: upsert results into an existing collection
{ $merge: { into: 'reports', on: '_id', whenMatched: 'replace' } }
Large dataset processing
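// Each pipeline stage may use at most 100 MB of RAM. Before MongoDB 6.0, exceeding that
// raises an error unless allowDiskUse is set; from 6.0 on, stages spill to disk by default.
db.collection('events').aggregate(pipeline, { allowDiskUse: true })
Use $out for scheduled report generation and $merge for incremental updates to existing reporting collections. The two differ in how they fail: $out builds its output in a temporary collection and swaps it in only when the pipeline succeeds, so a failed run leaves the existing target untouched. $merge writes documents into the target as it goes, so a run that fails mid-way can leave the target partially updated.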
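To make the choice between the two stages concrete, here is a minimal sketch of a pipeline builder that ends in either $out (full rebuild) or $merge (incremental update). The helper name buildRevenueReportPipeline and its since and incremental options are hypothetical, introduced only for illustration; the collection and field names are the ones used earlier in this lesson. It only builds the stage array, so it runs without a database connection.

```javascript
// Hypothetical helper (not part of the MongoDB driver): returns an array of
// aggregation stages for a monthly revenue report.
function buildRevenueReportPipeline({ since, incremental }) {
  const stages = [
    // Filter first so later stages process fewer documents.
    // Assumption: `since` is the first day of a month, so every month that
    // gets recomputed is complete before it replaces an existing report row.
    { $match: { createdAt: { $gte: since } } },
    { $group: {
      _id: { year: { $year: '$createdAt' }, month: { $month: '$createdAt' } },
      revenue: { $sum: '$total' },
      orders: { $sum: 1 }
    } }
  ];

  if (incremental) {
    // Incremental update: replace months that already exist, insert new ones.
    stages.push({ $merge: {
      into: 'reports',
      on: '_id',
      whenMatched: 'replace',
      whenNotMatched: 'insert'
    } });
  } else {
    // Full rebuild: the target collection is replaced as a whole.
    stages.push({ $out: 'monthly_revenue_cache' });
  }
  return stages;
}

// A scheduled full rebuild and a more frequent incremental refresh.
const nightly = buildRevenueReportPipeline({ since: new Date('2024-01-01'), incremental: false });
const hourly = buildRevenueReportPipeline({ since: new Date('2024-06-01'), incremental: true });
```

With a connected driver, either array would be passed to db.collection('orders').aggregate(...), followed by .toArray() so that the cursor is actually executed. A pipeline ending in $out or $merge returns no documents to the client, so the resulting array is empty.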
