MongoDB Aggregations

Process and analyze data with aggregation pipelines

🔄 What are Aggregations?

Aggregations process data records and return computed results. They group values, perform calculations, and transform documents through a pipeline of stages, making complex data analysis simple and efficient.


// Basic aggregation pipeline
db.orders.aggregate([
    { $match: { status: "completed" } },
    { $group: { _id: "$customer", total: { $sum: "$amount" } } }
])
                                    

Aggregation Pipeline Stages

Aggregation pipelines consist of stages that process documents sequentially. Each stage transforms documents and passes results to the next stage.

🔍

$match

Filters documents like find()

{ $match: { age: { $gte: 18 } } }
📊

$group

Groups documents by field

{ $group: { _id: "$city" } }
🎯

$project

Reshapes documents

{ $project: { name: 1, age: 1 } }
📈

$sort

Orders documents

{ $sort: { price: -1 } }

🔹 $match Stage

The $match stage filters documents, similar to find() queries. Place $match early in pipelines to reduce documents processed by later stages.

🔸 Basic Matching

// Find active users
db.users.aggregate([
    { $match: { status: "active" } }
])

// Multiple conditions
db.products.aggregate([
    { 
        $match: { 
            category: "Electronics",
            price: { $lt: 1000 },
            inStock: true
        } 
    }
])

Input:

{ "_id": 1, "name": "Laptop", "category": "Electronics", "price": 800, "inStock": true }
{ "_id": 2, "name": "Phone", "category": "Electronics", "price": 1200, "inStock": true }
{ "_id": 3, "name": "Mouse", "category": "Electronics", "price": 25, "inStock": true }

Output:

{ "_id": 1, "name": "Laptop", "category": "Electronics", "price": 800, "inStock": true }
{ "_id": 3, "name": "Mouse", "category": "Electronics", "price": 25, "inStock": true }

🔹 $group Stage

The $group stage groups documents by a specified field and performs calculations on grouped data using accumulator operators.

🔸 Group and Count

// Count users by country
db.users.aggregate([
    {
        $group: {
            _id: "$country",
            count: { $sum: 1 }
        }
    }
])

Input:

{ "name": "John", "country": "USA" }
{ "name": "Alice", "country": "USA" }
{ "name": "Bob", "country": "Canada" }

Output:

{ "_id": "USA", "count": 2 }
{ "_id": "Canada", "count": 1 }

🔸 Calculate Totals

// Total sales by category
db.sales.aggregate([
    {
        $group: {
            _id: "$category",
            totalRevenue: { $sum: "$amount" },
            avgSale: { $avg: "$amount" },
            maxSale: { $max: "$amount" },
            salesCount: { $sum: 1 }
        }
    }
])

Input:

{ "category": "Electronics", "amount": 1200 }
{ "category": "Electronics", "amount": 800 }
{ "category": "Books", "amount": 50 }

Output:

{ "_id": "Electronics", "totalRevenue": 2000, "avgSale": 1000, "maxSale": 1200, "salesCount": 2 }
{ "_id": "Books", "totalRevenue": 50, "avgSale": 50, "maxSale": 50, "salesCount": 1 }

🔹 $project Stage

The $project stage reshapes documents by including, excluding, or adding computed fields. Create new fields using expressions and operators.

🔸 Select and Transform Fields

// Create computed fields
db.products.aggregate([
    {
        $project: {
            name: 1,
            price: 1,
            discountedPrice: {
                $multiply: ["$price", 0.9]
            },
            priceWithTax: {
                $multiply: ["$price", 1.1]
            }
        }
    }
])

Input:

{ "_id": 1, "name": "Widget", "price": 100, "stock": 50 }

Output:

{ "_id": 1, "name": "Widget", "price": 100, "discountedPrice": 90, "priceWithTax": 110 }

🔹 $sort Stage

The $sort stage orders documents by one or more fields. Use 1 for ascending order and -1 for descending order.

🔸 Sort Documents

// Sort by price descending
db.products.aggregate([
    { $sort: { price: -1 } }
])

// Sort by multiple fields
db.users.aggregate([
    { 
        $sort: { 
            country: 1,
            age: -1
        } 
    }
])

Input:

{ "name": "Mouse", "price": 25 }
{ "name": "Laptop", "price": 1200 }
{ "name": "Keyboard", "price": 75 }

Output (sorted by price descending):

{ "name": "Laptop", "price": 1200 }
{ "name": "Keyboard", "price": 75 }
{ "name": "Mouse", "price": 25 }

🔹 $limit and $skip Stages

Use $limit to restrict the number of documents and $skip to skip a specified number of documents. Perfect for pagination.

🔸 Pagination Example

// Get page 2 (skip 10, limit 10)
db.products.aggregate([
    { $sort: { name: 1 } },
    { $skip: 10 },
    { $limit: 10 }
])

// Top 5 most expensive products
db.products.aggregate([
    { $sort: { price: -1 } },
    { $limit: 5 }
])

🔹 $lookup Stage

The $lookup stage performs left outer joins with other collections. It adds a new array field containing matching documents from the joined collection.

🔸 Join Collections

// Join orders with customer data
db.orders.aggregate([
    {
        $lookup: {
            from: "customers",
            localField: "customerId",
            foreignField: "_id",
            as: "customerInfo"
        }
    }
])

Orders Collection:

{ "_id": 1, "customerId": 101, "amount": 250 }

Customers Collection:

{ "_id": 101, "name": "John Doe", "email": "[email protected]" }

Output:

{ 
    "_id": 1, 
    "customerId": 101, 
    "amount": 250,
    "customerInfo": [
        { "_id": 101, "name": "John Doe", "email": "[email protected]" }
    ]
}

🔹 $unwind Stage

The $unwind stage deconstructs array fields, creating a separate document for each array element.

🔸 Unwind Arrays

// Unwind tags array
db.posts.aggregate([
    { $unwind: "$tags" }
])

Input:

{ "_id": 1, "title": "MongoDB Tutorial", "tags": ["database", "nosql", "mongodb"] }

Output:

{ "_id": 1, "title": "MongoDB Tutorial", "tags": "database" }
{ "_id": 1, "title": "MongoDB Tutorial", "tags": "nosql" }
{ "_id": 1, "title": "MongoDB Tutorial", "tags": "mongodb" }

🔹 Complete Pipeline Examples

Combine multiple stages for powerful data analysis:

Example 1: Sales Report

// Monthly sales report by category
db.sales.aggregate([
    {
        $match: {
            date: { 
                $gte: ISODate("2024-01-01"),
                $lt: ISODate("2024-12-31")
            }
        }
    },
    {
        $group: {
            _id: {
                category: "$category",
                month: { $month: "$date" }
            },
            totalSales: { $sum: "$amount" },
            avgSale: { $avg: "$amount" },
            count: { $sum: 1 }
        }
    },
    {
        $sort: { "_id.month": 1, totalSales: -1 }
    },
    {
        $project: {
            _id: 0,
            category: "$_id.category",
            month: "$_id.month",
            totalSales: 1,
            avgSale: { $round: ["$avgSale", 2] },
            count: 1
        }
    }
])

Example 2: Customer Analytics

// Top customers by total spending
db.orders.aggregate([
    {
        $match: { status: "completed" }
    },
    {
        $group: {
            _id: "$customerId",
            totalSpent: { $sum: "$amount" },
            orderCount: { $sum: 1 },
            avgOrderValue: { $avg: "$amount" }
        }
    },
    {
        $lookup: {
            from: "customers",
            localField: "_id",
            foreignField: "_id",
            as: "customerInfo"
        }
    },
    {
        $unwind: "$customerInfo"
    },
    {
        $project: {
            _id: 0,
            customerName: "$customerInfo.name",
            email: "$customerInfo.email",
            totalSpent: 1,
            orderCount: 1,
            avgOrderValue: { $round: ["$avgOrderValue", 2] }
        }
    },
    {
        $sort: { totalSpent: -1 }
    },
    {
        $limit: 10
    }
])

Example 3: Product Performance

// Best selling products with ratings
db.orderItems.aggregate([
    {
        $group: {
            _id: "$productId",
            totalSold: { $sum: "$quantity" },
            revenue: { $sum: { $multiply: ["$price", "$quantity"] } }
        }
    },
    {
        $lookup: {
            from: "products",
            localField: "_id",
            foreignField: "_id",
            as: "product"
        }
    },
    {
        $unwind: "$product"
    },
    {
        $project: {
            _id: 0,
            productName: "$product.name",
            category: "$product.category",
            totalSold: 1,
            revenue: 1,
            avgRating: "$product.rating"
        }
    },
    {
        $sort: { revenue: -1 }
    },
    {
        $limit: 20
    }
])

🧠 Test Your Knowledge

Which stage groups documents by a field?