Understanding Document Relationships
In real-world applications, data is rarely isolated. Most information is related to other pieces of data in some way. For example, a blog post is written by a user, a product belongs to a category, or an order contains multiple products.
MongoDB is a document database, which means it's non-relational by design. However, applications often need to work with related data. Mongoose provides powerful tools to create and manage relationships between documents.
Analogy: Social Network
Think of MongoDB collections like different groups of people. Just as people in real life have connections and relationships with others (friends, family, colleagues), documents in MongoDB can have relationships with other documents. Mongoose's population feature is like having a personal assistant who can quickly find all your friends' information when needed, even though they're stored in different address books.
Types of Relationships in MongoDB
There are two primary ways to model relationships in MongoDB:
Embedded Documents (Subdocuments)
In this approach, related data is nested within the parent document. This is like keeping all your contact information (address, phone numbers, emails) on a single business card.
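// Example of embedded documents
const userSchema = new mongoose.Schema({
  name: String,
  email: String,
  address: {
    street: String,
    city: String,
    state: String,
    zipCode: String
  },
  // `type` is a reserved key in Mongoose schema definitions, so the
  // phone number itself is stored under `number` rather than `type`
  phoneNumbers: [{
    number: String,
    label: String
  }]
});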
References (Document References)
This approach stores the ID of a related document. It's similar to how a library catalog might reference a book's location rather than containing the book itself.
// Example of document references
const authorSchema = new mongoose.Schema({
  name: String,
  bio: String,
  website: String
});

const bookSchema = new mongoose.Schema({
  title: String,
  summary: String,
  author: {
    type: mongoose.Schema.Types.ObjectId,
    ref: 'Author' // This references the Author model
  }
});
When to Use Each Approach
Use Embedded Documents When:
- The embedded data is always loaded with the parent
- The embedded data is relatively small
- The embedded data is updated together with the parent
- You have a "contains" relationship (e.g., a user contains addresses)
- The embedded data is specific to the parent and not shared
Use References When:
- The referenced data is often accessed independently
- The referenced data is large
- The referenced data is shared among multiple documents
- You have many-to-many relationships
- The data grows unbounded (e.g., comments on a post)
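To make the trade-off concrete, here is a minimal sketch that uses plain JavaScript objects in place of stored documents. The sample data, the string IDs, and the `findAuthorById` helper are hypothetical stand-ins for real collections and queries; only the shapes matter.

```javascript
// Embedded: the address lives inside the user document,
// so reading the user also reads the address.
const embeddedUser = {
  _id: 'u1',
  name: 'John Doe',
  address: { street: '1 Main St', city: 'Springfield' }
};

// Referenced: the book stores only the author's ID.
const authors = [{ _id: 'a1', name: 'Jane Roe' }];
const book = { _id: 'b1', title: 'Sample Title', author: 'a1' };

// Hypothetical stand-in for a second query against the authors collection.
function findAuthorById(id) {
  return authors.find((author) => author._id === id) || null;
}

const city = embeddedUser.address.city;              // one read
const authorName = findAuthorById(book.author).name; // read plus a lookup

console.log(city, authorName);
```

The embedded shape answers the question in one read, while the referenced shape needs a second lookup but lets the same author be shared by many books.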
Real-World Example: E-commerce Application
In an e-commerce platform:
- Embedded: Product variants (sizes, colors) within a product document
- Referenced: Customer information referenced by orders
A large e-commerce platform would typically use a mix of both approaches. Product documents might embed their variations, while orders reference customer profiles that are stored separately.
Creating References in Mongoose
To create a reference in Mongoose, use the mongoose.Schema.Types.ObjectId type along with the ref option to specify the model being referenced.
Single Reference
const postSchema = new mongoose.Schema({
  title: String,
  content: String,
  author: {
    type: mongoose.Schema.Types.ObjectId,
    ref: 'User' // References the User model
  }
});

const Post = mongoose.model('Post', postSchema);
Array of References
const productSchema = new mongoose.Schema({
  name: String,
  price: Number,
  categories: [{
    type: mongoose.Schema.Types.ObjectId,
    ref: 'Category' // References the Category model
  }]
});

const Product = mongoose.model('Product', productSchema);
Creating Documents with References
// Create a user
const user = new User({
  name: 'John Doe',
  email: 'john@example.com'
});

// Save the user
await user.save();

// Create a post with a reference to the user
const post = new Post({
  title: 'Introduction to Mongoose',
  content: 'Mongoose is an ODM for MongoDB...',
  author: user._id // Store the user's ID as a reference
});

// Save the post
await post.save();
Population: Retrieving Referenced Documents
Population is the process of automatically replacing the specified path(s) in a document with document(s) from other collection(s). It is similar to a join in SQL, except that Mongoose performs it in application code by issuing additional queries rather than having the database join the data itself.
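As a rough mental model (not Mongoose's actual implementation), population can be pictured as two steps: look up the referenced document by its stored ID, then swap it into the path. The sketch below uses in-memory arrays and a hypothetical `populatePath` helper in place of real collections and queries.

```javascript
// In-memory stand-ins for the users collection and one stored post (hypothetical data).
const users = [
  { _id: 'u1', name: 'John Doe', email: 'john@example.com' }
];
const storedPost = { _id: 'p1', title: 'Introduction to Mongoose', author: 'u1' };

// Hypothetical helper: returns a copy of `doc` with the ID stored at `path`
// replaced by the matching document from `collection` (or null if none matches).
function populatePath(doc, path, collection) {
  const match = collection.find((candidate) => candidate._id === doc[path]);
  return { ...doc, [path]: match || null };
}

const populatedPost = populatePath(storedPost, 'author', users);

console.log(populatedPost.author.name);
```

The stored post is left untouched and still holds only the ID; the populated copy carries the full author object in its place.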
Basic Population
// Find a post and populate the author field
const post = await Post.findById('postId')
  .populate('author')
  .exec();

console.log(post);
/* Output:
{
  _id: ObjectId('postId'),
  title: 'Introduction to Mongoose',
  content: 'Mongoose is an ODM for MongoDB...',
  author: {
    _id: ObjectId('userId'),
    name: 'John Doe',
    email: 'john@example.com'
  }
}
*/
Populating Multiple Fields
const orderSchema = new mongoose.Schema({
  products: [{
    type: mongoose.Schema.Types.ObjectId,
    ref: 'Product'
  }],
  customer: {
    type: mongoose.Schema.Types.ObjectId,
    ref: 'User'
  },
  shippingAddress: {
    type: mongoose.Schema.Types.ObjectId,
    ref: 'Address'
  }
});

// Populate multiple fields
const order = await Order.findById('orderId')
  .populate('products')
  .populate('customer')
  .populate('shippingAddress')
  .exec();
Chaining Populate Method Calls
// Same query as above: populate() returns the query, so each chained call adds one more path
const order = await Order.findById('orderId')
  .populate('products')
  .populate('customer')
  .populate('shippingAddress')
  .exec();
Populating with an Array of Objects
// Populate multiple paths in a single call by passing an array of option objects
const order = await Order.findById('orderId')
  .populate([
    { path: 'products' },
    { path: 'customer' },
    { path: 'shippingAddress' }
  ])
  .exec();
Advanced Population Features
Selective Field Population
You can choose which fields from the referenced document to include or exclude:
// Only populate specific fields from the author
const post = await Post.findById('postId')
  .populate('author', 'name') // Only include name field
  .exec();

// Exclude specific fields (separate variable name, since `post` is already declared)
const postWithoutSecrets = await Post.findById('postId')
  .populate('author', '-email -password') // Exclude email and password
  .exec();
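The two forms behave differently: an inclusive list keeps only the named fields, while a `-` prefix drops the named fields and keeps the rest. The hypothetical `applyFieldSelection` helper below mimics that behavior on a plain object. It is an illustration only, under the assumption that `_id` is kept unless it is explicitly excluded.

```javascript
// Hypothetical helper that imitates a space-delimited field selection string.
function applyFieldSelection(doc, selection) {
  const names = selection.split(/\s+/).filter(Boolean);
  const excluding = names.every((name) => name.startsWith('-'));
  const result = {};

  if (excluding) {
    // Exclusive selection: keep everything except the listed fields
    const dropped = new Set(names.map((name) => name.slice(1)));
    for (const key of Object.keys(doc)) {
      if (!dropped.has(key)) result[key] = doc[key];
    }
  } else {
    // Inclusive selection: keep _id plus the listed fields
    for (const key of ['_id', ...names]) {
      if (key in doc) result[key] = doc[key];
    }
  }
  return result;
}

const author = {
  _id: 'u1',
  name: 'John Doe',
  email: 'john@example.com',
  password: 'secret',
  bio: 'Writer'
};

const nameOnly = applyFieldSelection(author, 'name');
const withoutSecrets = applyFieldSelection(author, '-email -password');

console.log(nameOnly, withoutSecrets);
```

Here `nameOnly` contains `_id` and `name`, while `withoutSecrets` contains `_id`, `name`, and `bio`.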
Query Conditions for Population
You can add query conditions to filter which documents are populated:
// Only populate active users as authors
const post = await Post.findById('postId')
  .populate({
    path: 'author',
    match: { isActive: true }
  })
  .exec();

// If no matching document is found, the field will be null
console.log(post.author); // null if the author's isActive is false
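Note that `match` only affects what gets populated; it does not filter the parent documents returned by the query. A common follow-up, sketched here on plain objects shaped like populated results (the sample data is hypothetical), is to drop the parents whose populated field came back as null.

```javascript
// Plain objects shaped like the result of a find() followed by
// populate({ path: 'author', match: { isActive: true } })
const populatedPosts = [
  { title: 'First post', author: { name: 'John Doe', isActive: true } },
  { title: 'Second post', author: null } // the author did not satisfy the match
];

// Keep only the posts whose author survived the match condition
const postsWithActiveAuthor = populatedPosts.filter((post) => post.author !== null);

console.log(postsWithActiveAuthor.map((post) => post.title));
```

If the parent documents themselves should be filtered in the database, the condition has to be expressed on a field the parent stores, not through `match`.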
Sorting Populated Data
For array references, you can sort the populated documents:
// Sort populated products by price (descending)
const order = await Order.findById('orderId')
  .populate({
    path: 'products',
    options: { sort: { price: -1 } }
  })
  .exec();
Limiting Populated Data
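For array references, you can limit the number of populated documents. When a query returns several parent documents, limit is not applied separately to each one; Mongoose offers a perDocumentLimit option for that case: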
// Limit populated products to 5
const order = await Order.findById('orderId')
  .populate({
    path: 'products',
    options: { limit: 5 }
  })
  .exec();
Populating Nested Paths
You can populate nested references:
const commentSchema = new mongoose.Schema({
  text: String,
  user: {
    type: mongoose.Schema.Types.ObjectId,
    ref: 'User'
  }
});

const postSchema = new mongoose.Schema({
  title: String,
  content: String,
  comments: [commentSchema],
  author: {
    type: mongoose.Schema.Types.ObjectId,
    ref: 'User'
  }
});

// Populate both the author and the user in each comment
const post = await Post.findById('postId')
  .populate('author')
  .populate('comments.user')
  .exec();
Multi-level Population
You can perform multi-level population:
const courseSchema = new mongoose.Schema({
  title: String,
  instructor: {
    type: mongoose.Schema.Types.ObjectId,
    ref: 'User'
  }
});

const studentSchema = new mongoose.Schema({
  name: String,
  courses: [{
    type: mongoose.Schema.Types.ObjectId,
    ref: 'Course'
  }]
});

// Populate courses for a student, then populate the instructor for each course
const student = await Student.findById('studentId')
  .populate({
    path: 'courses',
    populate: {
      path: 'instructor'
    }
  })
  .exec();
Virtual Populations
Virtual populations allow you to define a virtual property that gets populated by documents from another collection. This is especially useful for reverse references.
Setting Up Virtual Populations
// Author schema
const authorSchema = new mongoose.Schema({
  name: String,
  bio: String
});

// Virtual property for the author's books
authorSchema.virtual('books', {
  ref: 'Book',            // The model to use
  localField: '_id',      // Find books where `localField`
  foreignField: 'author', // is equal to `foreignField`
  justOne: false          // Set to true for one-to-one relationships
});
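// Include virtuals in toJSON() and toObject() output. These are schema
// options, so set them on the schema; the third argument of
// mongoose.model() is the collection name, not an options object.
authorSchema.set('toJSON', { virtuals: true });
authorSchema.set('toObject', { virtuals: true });

const Author = mongoose.model('Author', authorSchema);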
// Book schema
const bookSchema = new mongoose.Schema({
  title: String,
  author: {
    type: mongoose.Schema.Types.ObjectId,
    ref: 'Author'
  }
});

const Book = mongoose.model('Book', bookSchema);
Using Virtual Populations
// Find an author and populate their books
const author = await Author.findById('authorId')
  .populate('books')
  .exec();

console.log(author.books); // Array of books where author field is authorId
Analogy: Followers on Social Media
Virtual populations are like a followers list on social media. The relationship is stored in one direction (each follower records whom they follow), but the platform can compute and show the reverse direction (who follows you) without storing it explicitly.
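Conceptually, populating the `books` virtual amounts to a reverse lookup: find every book whose `foreignField` (`author`) equals the author's `localField` (`_id`). The sketch below models that with in-memory arrays. `populateVirtual` and its `from` option are hypothetical, not part of Mongoose.

```javascript
// Hypothetical in-memory data standing in for the authors and books collections.
const authorDoc = { _id: 'a1', name: 'Jane Roe' };
const bookDocs = [
  { _id: 'b1', title: 'First Book', author: 'a1' },
  { _id: 'b2', title: 'Second Book', author: 'a2' },
  { _id: 'b3', title: 'Third Book', author: 'a1' }
];

// Hypothetical helper mirroring the virtual's localField/foreignField/justOne options.
function populateVirtual(doc, { from, localField, foreignField, justOne }) {
  const matches = from.filter((other) => other[foreignField] === doc[localField]);
  return justOne ? (matches[0] || null) : matches;
}

const books = populateVirtual(authorDoc, {
  from: bookDocs,
  localField: '_id',
  foreignField: 'author',
  justOne: false
});

console.log(books.map((b) => b.title));
```

Nothing is stored on the author document itself; the list is computed from the references that the books already hold.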
Performance Considerations
While population is powerful, it's important to be mindful of its impact on performance:
- Multiple Database Queries: Each populate() call generally results in an additional query to the database.
- Large Result Sets: Be careful when populating arrays that may contain many documents.
- Deep Population: Multi-level population can lead to complex queries and large result sets.
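The cost is easiest to see by counting queries. The sketch below compares resolving authors one post at a time with a single batched lookup, which is roughly the pattern population follows for one path. The `db` object and its query counter are hypothetical stand-ins for a database.

```javascript
// Hypothetical in-memory "database" that counts how many queries it receives.
const db = {
  queryCount: 0,
  users: [{ _id: 'u1', name: 'Ann' }, { _id: 'u2', name: 'Bob' }],
  findUsersByIds(ids) {
    this.queryCount += 1;
    return this.users.filter((user) => ids.includes(user._id));
  }
};

const posts = [
  { title: 'A', author: 'u1' },
  { title: 'B', author: 'u2' },
  { title: 'C', author: 'u1' }
];

// Naive approach: one query per post
function resolveOneByOne(postList) {
  return postList.map((post) => ({
    ...post,
    author: db.findUsersByIds([post.author])[0] || null
  }));
}

// Batched approach: collect the distinct IDs, then issue a single query
function resolveBatched(postList) {
  const ids = [...new Set(postList.map((post) => post.author))];
  const byId = new Map(db.findUsersByIds(ids).map((user) => [user._id, user]));
  return postList.map((post) => ({ ...post, author: byId.get(post.author) || null }));
}

db.queryCount = 0;
resolveOneByOne(posts);
const naiveQueries = db.queryCount;   // 3 queries, one per post

db.queryCount = 0;
const batched = resolveBatched(posts);
const batchedQueries = db.queryCount; // 1 query for all posts

console.log(naiveQueries, batchedQueries);
```

Even with batching, every additional populated path or nesting level adds at least one more round trip, which is why deep population on large result sets deserves attention.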
Performance Tips
- Select only needed fields: Use field selection to limit the data retrieved.
- Use lean(): This returns plain JavaScript objects instead of Mongoose documents, reducing overhead.
- Consider denormalization: For frequently accessed data, consider embedding some fields instead of using references.
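- Use indexing: _id is indexed by default, so add indexes on the fields that store references when you query or virtual-populate by them (for example, the author field on books).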
// Optimized query with field selection and lean()
const post = await Post.findById('postId')
  .select('title author')     // Only select these fields
  .populate('author', 'name') // Only populate the name from author
  .lean()                     // Return plain JavaScript objects
  .exec();
Real-World Example: Blog Platform
Let's build a comprehensive example of a blog platform using Mongoose references and population:
const mongoose = require('mongoose');
const { Schema } = mongoose;

// User schema
const userSchema = new Schema({
  name: String,
  email: {
    type: String,
    unique: true,
    required: true
  },
  password: String, // In production, you'd hash this!
  bio: String,
  joinDate: {
    type: Date,
    default: Date.now
  },
  isAdmin: {
    type: Boolean,
    default: false
  }
});

// Category schema
const categorySchema = new Schema({
  name: {
    type: String,
    required: true,
    unique: true
  },
  description: String,
  slug: {
    type: String,
    required: true,
    unique: true
  }
});

// Comment schema
const commentSchema = new Schema({
  content: {
    type: String,
    required: true
  },
  author: {
    type: Schema.Types.ObjectId,
    ref: 'User',
    required: true
  },
  post: {
    type: Schema.Types.ObjectId,
    ref: 'Post',
    required: true
  },
  createdAt: {
    type: Date,
    default: Date.now
  },
  isApproved: {
    type: Boolean,
    default: true
  }
});

// Post schema
const postSchema = new Schema({
  title: {
    type: String,
    required: true
  },
  content: {
    type: String,
    required: true
  },
  excerpt: String,
  slug: {
    type: String,
    required: true,
    unique: true
  },
  author: {
    type: Schema.Types.ObjectId,
    ref: 'User',
    required: true
  },
  categories: [{
    type: Schema.Types.ObjectId,
    ref: 'Category'
  }],
  tags: [String],
  featuredImage: String,
  publishDate: {
    type: Date,
    default: Date.now
  },
  isPublished: {
    type: Boolean,
    default: false
  },
  viewCount: {
    type: Number,
    default: 0
  }
}, {
  timestamps: true,
  toJSON: { virtuals: true },
  toObject: { virtuals: true }
});

// Virtual for comments - reverse reference
postSchema.virtual('comments', {
  ref: 'Comment',
  localField: '_id',
  foreignField: 'post'
});
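
// Virtual for user's posts - reverse reference
userSchema.virtual('posts', {
  ref: 'Post',
  localField: '_id',
  foreignField: 'author'
});

// Include the posts virtual when users are serialized
// (postSchema enables this through its schema options)
userSchema.set('toJSON', { virtuals: true });
userSchema.set('toObject', { virtuals: true });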

// Create models
const User = mongoose.model('User', userSchema);
const Category = mongoose.model('Category', categorySchema);
const Post = mongoose.model('Post', postSchema);
const Comment = mongoose.model('Comment', commentSchema);

// Example usage: Getting a post with all related data
async function getCompletePost(postId) {
  const post = await Post.findById(postId)
    .populate('author', 'name email bio') // Populate author with selected fields
    .populate('categories') // Populate categories
    .populate({ // Populate comments and their authors
      path: 'comments',
      match: { isApproved: true }, // Only approved comments
      options: { sort: { createdAt: -1 } }, // Sort by newest first
      populate: {
        path: 'author',
        select: 'name'
      }
    })
    .exec();

  return post;
}

// Example usage: Getting a user with their posts
async function getUserWithPosts(userId) {
  const user = await User.findById(userId)
    .populate({
      path: 'posts',
      match: { isPublished: true }, // Only published posts
      options: { sort: { publishDate: -1 } }, // Sort by newest first
      select: 'title excerpt publishDate viewCount' // Only select these fields
    })
    .exec();

  return user;
}

// Example usage: Getting posts by category
async function getPostsByCategory(categorySlug) {
  // First find the category
  const category = await Category.findOne({ slug: categorySlug });
  if (!category) return [];

  // Then find posts in this category
  const posts = await Post.find({
    categories: category._id,
    isPublished: true
  })
    .sort({ publishDate: -1 })
    .populate('author', 'name')
    .populate('categories', 'name slug')
    .select('title excerpt slug publishDate author categories')
    .exec();

  return posts;
}

module.exports = {
  User,
  Category,
  Post,
  Comment,
  getCompletePost,
  getUserWithPosts,
  getPostsByCategory
};
This example demonstrates how to set up relationships between users, posts, categories, and comments, and how to use population to efficiently retrieve related data.
Practice Activities
Activity 1: E-commerce Product and Review System
Create schemas for an e-commerce system with products and reviews.
- Create User, Product, and Review schemas
- Set up references between them (a Product has Reviews, Reviews have a User author)
- Implement functions to:
- Get a product with all its reviews populated
- Get a user with all their reviews populated
- Create a new review for a product
Activity 2: Library Management System
Design schemas for a library system with books, authors, and borrowing records.
- Create Author, Book, User, and BorrowRecord schemas
- Set up appropriate references (Books have Authors, BorrowRecords reference Users and Books)
- Implement functions to:
- Get all books by a specific author
- Get a user's borrowing history
- Find all currently borrowed books
Activity 3: Social Media Application
Create schemas for a simple social media application.
- Create User, Post, and Comment schemas
- Implement a "friends" or "follows" relationship between users
- Use virtual populations to find:
- All posts by a user's friends
- All comments on a user's posts
- A user's activity feed (posts and comments)