Why This Matters for Backend Engineers in North East India
As backend engineers working with MongoDB in production, especially those using aggregation pipelines and $lookup, it is crucial to understand how to optimize the database for efficient query execution. This guide offers essential principles and rules that can help reduce query times and improve the overall performance of MongoDB systems, benefiting projects in North East India and beyond.
Core Principles
Every Index Must Map to a Real Query
Indexes should be designed to support specific queries. An index that doesn't correspond to a real query is essentially wasted effort.
Indexes Must Have a Documented Reason
It's essential to document the reasoning behind creating each index. This helps ensure that indexes are created intentionally and are not added haphazardly.
Indexes are Validated Using explain()
The MongoDB explain() function provides insights into how a query is executed, helping developers understand whether an index is being used effectively.
Unused Indexes are Technical Debt
Indexes that are not utilized contribute to technical debt, slowing down the system and increasing maintenance costs.
Verifying Index Usage
Check Execution Stats
Before creating or blaming an index, always verify its usage by checking the execution stats. If totalDocsExamined and totalKeysExamined are low, and there's no COLLSCAN, it indicates that the index is being used.
Index Usage Counter
The indexUsageCounter aggregate operation can help identify indexes with zero active usage, which may be candidates for removal.
Index Design Rules
Entry Point Matters
MongoDB executes aggregation pipelines left to right. Indexes apply only to the collection where the pipeline starts. $lookup does not reorder execution like SQL joins.
Unique Equality Ends Index Traversal
If a query includes equality on a unique field, MongoDB uses only that unique index, and no compound index fields after it will be used.
$expr Prevents Index Optimization
While $expr may still touch an index, it prevents efficient scans, range bounds, and compound index usage. Avoid using $expr wherever possible.
$lookup Index Rules
Indexes on the foreign collection are used only if $match is inside $lookup and $match is the first stage with no $expr predicates or unique equality that dominates the lookup.
$match After $lookup Is In-Memory
Indexes on the joined collection will not be used after the $lookup and $unwind stages. Always filter before or inside $lookup.
Aggregation-Specific Guidelines
Date Filtering
Always filter dates using direct field predicates. Never use $expr for date ranges.
Sorting
If you sort, ensure that the sort field appears last in the index.
Golden Rules
- One index = one query pattern
- $match early or inside $lookup
- Unique equality dominates everything
- $expr kills optimization
- $lookup is not SQL JOIN
- ops = 0 means debt
Final Note
Indexes do not make queries fast. Correct query shape makes indexes usable. Creating templates lets you quickly answer FAQs or store snippets for re-use.