MongoDB Performance Tuning

Optimize queries and improve database speed

⚡ What is Performance Tuning?

Performance tuning optimizes MongoDB queries and operations for speed and efficiency. It involves creating proper indexes, analyzing query execution, optimizing schema design, and configuring server settings for maximum throughput.


// Check query performance
db.users.find({ email: "[email protected]" }).explain("executionStats")
                                    

Result:

Detailed execution statistics and performance metrics

Key Performance Strategies

🔍

Indexing

Speed up queries with indexes

db.users.createIndex({ email: 1 })
📊

Query Analysis

Analyze query execution

db.collection.find().explain()
💾

Projection

Return only needed fields

db.users.find({}, { name: 1, email: 1 })
🎯

Aggregation

Optimize pipeline stages

db.orders.aggregate([{ $match: {...} }])

🔹 Creating Effective Indexes

Indexes dramatically improve query performance by allowing MongoDB to quickly locate documents without scanning entire collections. Create indexes on fields used in queries, sorts, and joins.

// Single field index
db.users.createIndex({ email: 1 })  // 1 = ascending, -1 = descending

// Compound index (multiple fields)
db.orders.createIndex({ userId: 1, createdAt: -1 })

// Unique index (enforce uniqueness)
db.users.createIndex({ username: 1 }, { unique: true })

// Partial index (index subset of documents)
db.orders.createIndex(
  { status: 1 },
  { partialFilterExpression: { status: "active" } }
)

// Text index for full-text search
db.articles.createIndex({ title: "text", content: "text" })

// Geospatial index
db.locations.createIndex({ coordinates: "2dsphere" })

// Check existing indexes
db.users.getIndexes()

// Drop an index
db.users.dropIndex("email_1")

Result:

Indexes created successfully. Queries on indexed fields are now faster.

🔹 Query Optimization with Explain

Use explain() to analyze query performance and identify bottlenecks. It shows execution time, documents scanned, index usage, and query plan details for optimization insights.

// Basic explain
db.users.find({ age: { $gt: 25 } }).explain()

// Execution statistics (detailed)
db.users.find({ email: "[email protected]" })
  .explain("executionStats")

// All plans (shows rejected plans too)
db.users.find({ age: 30 }).explain("allPlansExecution")

// Key metrics to check:
// - executionTimeMillis: Query execution time
// - totalDocsExamined: Documents scanned
// - totalKeysExamined: Index entries scanned
// - executionStages.stage: "IXSCAN" (good) vs "COLLSCAN" (bad)

// Example output analysis:
{
  executionStats: {
    executionTimeMillis: 2,      // Fast!
    totalDocsExamined: 1,         // Only 1 doc scanned
    totalKeysExamined: 1,         // Used index
    executionStages: {
      stage: "IXSCAN",            // Index scan (good!)
      indexName: "email_1"
    }
  }
}

// Bad performance indicator:
{
  executionStats: {
    executionTimeMillis: 450,     // Slow!
    totalDocsExamined: 100000,    // Scanned all docs
    executionStages: {
      stage: "COLLSCAN"           // Collection scan (bad!)
    }
  }
}

Goal:

IXSCAN (index scan) with low docsExamined and fast execution time

🔹 Projection for Efficiency

Reduce network transfer and memory usage by returning only required fields. Projection limits the data MongoDB sends to your application, improving response times and reducing bandwidth.

// BAD: Return entire document (wasteful)
db.users.find({ status: "active" })
// Returns: { _id, name, email, password, address, preferences, ... }

// GOOD: Return only needed fields
db.users.find(
  { status: "active" },
  { name: 1, email: 1, _id: 0 }  // 1 = include, 0 = exclude
)
// Returns: { name: "John", email: "[email protected]" }

// Exclude specific fields
db.users.find(
  { status: "active" },
  { password: 0, internalData: 0 }
)

// Projection with arrays (limit array elements)
db.posts.find(
  { _id: 1 },
  { 
    title: 1,
    comments: { $slice: 10 }  // Only first 10 comments
  }
)

// Projection in aggregation
db.orders.aggregate([
  { $match: { status: "completed" } },
  { $project: { 
      orderId: 1,
      total: 1,
      customerName: "$customer.name"  // Extract nested field
    }
  }
])

Benefits:

Faster queries, less memory, reduced network transfer

🔹 Compound Index Strategy

Compound indexes support queries on multiple fields efficiently. Field order matters: place equality filters first, then sort fields, then range filters for optimal performance.

// ESR Rule: Equality, Sort, Range

// Create compound index
db.orders.createIndex({ 
  status: 1,      // Equality filter
  createdAt: -1,  // Sort field
  total: 1        // Range filter
})

// This query uses the index efficiently
db.orders.find({
  status: "completed",           // Equality (uses index)
  total: { $gt: 100 }            // Range (uses index)
}).sort({ createdAt: -1 })       // Sort (uses index)

// Index prefixes work too
db.orders.find({ status: "completed" })  // Uses index
db.orders.find({ status: "completed" }).sort({ createdAt: -1 })  // Uses index

// This does NOT use the index well
db.orders.find({ total: { $gt: 100 } })  // Skips first field

// Multiple compound indexes for different queries
db.products.createIndex({ category: 1, price: 1 })
db.products.createIndex({ brand: 1, rating: -1 })

// Check index usage
db.orders.find({ status: "completed" })
  .sort({ createdAt: -1 })
  .explain("executionStats")

Best Practice:

Order fields: Equality → Sort → Range (ESR Rule)

🔹 Aggregation Pipeline Optimization

Optimize aggregation pipelines by filtering early, using indexes, limiting data, and ordering stages efficiently. Place $match and $limit stages early to reduce processed documents.

// BAD: Process all docs then filter
db.orders.aggregate([
  { $lookup: { /* expensive join */ } },
  { $unwind: "$items" },
  { $match: { status: "completed" } }  // Filter at end (slow!)
])

// GOOD: Filter first, then process
db.orders.aggregate([
  { $match: { status: "completed" } },  // Filter early (fast!)
  { $limit: 1000 },                     // Limit data
  { $lookup: { /* join only filtered docs */ } },
  { $unwind: "$items" }
])

// Use indexes in $match
db.orders.createIndex({ status: 1, createdAt: -1 })
db.orders.aggregate([
  { $match: { 
      status: "completed",
      createdAt: { $gte: ISODate("2024-01-01") }
    }
  },  // Uses index!
  { $group: { _id: "$userId", total: { $sum: "$amount" } } }
])

// Project early to reduce data size
db.orders.aggregate([
  { $match: { status: "completed" } },
  { $project: { userId: 1, amount: 1 } },  // Only needed fields
  { $group: { _id: "$userId", total: { $sum: "$amount" } } }
])

// Use $limit with $sort
db.products.aggregate([
  { $match: { category: "Electronics" } },
  { $sort: { price: -1 } },
  { $limit: 10 }  // Only top 10
])

// Avoid $lookup when possible (use embedded data)
// Instead of joining, embed frequently accessed data

Optimization Tips:

$match early, use indexes, $limit data, $project only needed fields

🔹 Connection Pooling

Reuse database connections with connection pooling to reduce overhead. Configure pool size based on your application's concurrency needs and available server resources.

// Configure connection pool
const { MongoClient } = require('mongodb');

const client = new MongoClient(uri, {
  maxPoolSize: 50,        // Max connections (default: 100)
  minPoolSize: 10,        // Min connections (default: 0)
  maxIdleTimeMS: 30000,   // Close idle connections after 30s
  waitQueueTimeoutMS: 5000 // Wait 5s for available connection
});

// Good practice: Reuse client
let cachedClient = null;

async function connectToDatabase() {
  if (cachedClient) {
    return cachedClient;
  }
  
  cachedClient = await MongoClient.connect(uri, {
    maxPoolSize: 50
  });
  
  return cachedClient;
}

// Use the connection
const client = await connectToDatabase();
const db = client.db("myapp");
const users = await db.collection("users").find().toArray();

// Don't close after each operation!
// Connection pool handles this automatically

// Pool size guidelines:
// - Small app: 10-20 connections
// - Medium app: 50-100 connections
// - Large app: 100-500 connections
// - Monitor with: db.serverStatus().connections

Result:

Faster queries with reused connections, reduced overhead

🔹 Performance Monitoring

Monitor database performance with built-in tools and commands. Track slow queries, index usage, connection stats, and resource utilization to identify and fix performance issues.

// Enable profiling (log slow queries)
db.setProfilingLevel(1, { slowms: 100 })  // Log queries > 100ms
// Levels: 0 = off, 1 = slow queries, 2 = all queries

// View slow queries
db.system.profile.find().sort({ ts: -1 }).limit(5)

// Current operations
db.currentOp()

// Server status
db.serverStatus()

// Collection statistics
db.users.stats()

// Index statistics
db.users.aggregate([{ $indexStats: {} }])

// Database statistics
db.stats()

// Monitor connections
db.serverStatus().connections
// { current: 52, available: 838808 }

// Check replication lag (if using replica set)
rs.printReplicationInfo()
rs.printSecondaryReplicationInfo()

// Useful metrics to monitor:
// - Query execution time
// - Index hit ratio
// - Connection pool usage
// - Memory usage
// - Disk I/O
// - Replication lag

// MongoDB Compass and Atlas provide visual monitoring

Tools:

MongoDB Compass, Atlas Monitoring, mongostat, mongotop

🔹 Performance Best Practices

Follow these proven strategies to maintain optimal MongoDB performance. Regular monitoring, proper indexing, and efficient queries are key to scalable applications.

✅ Do's:

  • Create indexes on frequently queried fields
  • Use projection to return only needed fields
  • Filter early in aggregation pipelines ($match first)
  • Use compound indexes for multi-field queries
  • Monitor slow queries with profiling
  • Use connection pooling
  • Limit result sets with $limit
  • Use covered queries (index-only)

❌ Don'ts:

  • Don't create too many indexes (slows writes)
  • Don't use $where or $regex without indexes
  • Don't return entire documents when not needed
  • Don't use collection scans on large collections
  • Don't ignore explain() output
  • Don't fetch data you won't use
// Performance checklist
// ✅ Indexed query with projection
db.users.find(
  { email: "[email protected]" },  // Indexed field
  { name: 1, email: 1, _id: 0 }   // Projection
).explain("executionStats")

// Check for:
// - stage: "IXSCAN" (not "COLLSCAN")
// - executionTimeMillis < 100ms
// - totalDocsExamined ≈ nReturned

🧠 Test Your Knowledge

What does COLLSCAN in explain() output indicate?