Mongoose Population and References

Building Relationships Between Documents

Understanding Document Relationships

In real-world applications, data is rarely isolated. Most information is related to other pieces of data in some way. For example, a blog post is written by a user, a product belongs to a category, or an order contains multiple products.

MongoDB is a document database, which means it's non-relational by design. However, applications often need to work with related data. Mongoose provides powerful tools to create and manage relationships between documents.

Analogy: Social Network

Think of MongoDB collections like different groups of people. Just as people in real life have connections and relationships with others (friends, family, colleagues), documents in MongoDB can have relationships with other documents. Mongoose's population feature is like having a personal assistant who can quickly find all your friends' information when needed, even though they're stored in different address books.

Diagram: one document references another document, while another document embeds a subdocument.

Types of Relationships in MongoDB

There are two primary ways to model relationships in MongoDB:

Embedded Documents (Subdocuments)

In this approach, related data is nested within the parent document. This is like keeping all your contact information (address, phone numbers, emails) on a single business card.


// Example of embedded documents
const userSchema = new mongoose.Schema({
  name: String,
  email: String,
  address: {
    street: String,
    city: String,
    state: String,
    zipCode: String
  },
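  phoneNumbers: [{
    // Avoid a field literally named `type`: Mongoose reads `type` as the
    // type declaration, which would turn this into an array of strings
    number: String,
    label: String
  }]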
});
            

References (Document References)

This approach stores the ID of a related document. It's similar to how a library catalog might reference a book's location rather than containing the book itself.


// Example of document references
const authorSchema = new mongoose.Schema({
  name: String,
  bio: String,
  website: String
});

const bookSchema = new mongoose.Schema({
  title: String,
  summary: String,
  author: {
    type: mongoose.Schema.Types.ObjectId,
    ref: 'Author'  // This references the Author model
  }
});
            
Embedded documents: pros and cons

  • Pro: Fast reads
  • Pro: Atomic operations on a single document
  • Pro: Single query access
  • Con: Document size limit (16 MB per document)
  • Con: Subdocuments cannot be queried on their own, only through the parent
  • Con: Duplication for many-to-many relationships

Referenced documents: pros and cons

  • Pro: No duplication
  • Pro: Smaller documents
  • Pro: Independent queries
  • Con: Multiple queries to read related data
  • Con: No atomic updates across documents without transactions
  • Con: Consistency challenges

When to Use Each Approach

Use Embedded Documents When:

  • The embedded data is always loaded with the parent
  • The embedded data is relatively small
  • The embedded data is updated together with the parent
  • You have a "contains" relationship (e.g., a user contains addresses)
  • The embedded data is specific to the parent and not shared

Use References When:

  • The referenced data is often accessed independently
  • The referenced data is large
  • The referenced data is shared among multiple documents
  • You have many-to-many relationships
  • The data grows unbounded (e.g., comments on a post)
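
The two checklists above can be read as a rough decision rule. The following is a minimal sketch in plain JavaScript, assuming that any single "reference" signal is enough to tip the decision. The helper `chooseModeling` and its trait names are hypothetical and not part of Mongoose; real schema design usually weighs these factors rather than treating them as switches.

```javascript
// Hypothetical helper: a rough encoding of the two checklists above.
// The trait names are illustrative, not part of Mongoose.
function chooseModeling(traits = {}) {
  const referenceSignals = {
    accessedIndependently: 'often accessed on its own',
    large: 'large relative to the parent',
    sharedAcrossParents: 'shared among multiple documents',
    manyToMany: 'many-to-many relationship',
    growsUnbounded: 'grows without bound'
  };
  // Any single signal is treated as enough to prefer a reference.
  const reasons = Object.keys(referenceSignals)
    .filter(key => traits[key] === true)
    .map(key => referenceSignals[key]);
  return { strategy: reasons.length > 0 ? 'reference' : 'embed', reasons };
}

console.log(chooseModeling({ growsUnbounded: true }).strategy); // reference
console.log(chooseModeling({}).strategy);                       // embed
```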

Real-World Example: E-commerce Application

In an e-commerce platform:

  • Embedded: Product variants (sizes, colors) within a product document
  • Referenced: Customer information referenced by orders

Large e-commerce platforms typically mix both approaches. Product documents might embed their variations, while orders reference customer profiles that are stored separately.
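
To make the split concrete, here is a sketch of how the stored documents might look, written as plain JavaScript objects rather than Mongoose models. All field names and values are made up for illustration, and string IDs stand in for the ObjectId values MongoDB would normally generate.

```javascript
// Hypothetical documents, written as plain objects to show their stored shape.
const customer = { _id: 'cust-1', name: 'Ada', email: 'ada@example.com' };

const product = {
  _id: 'prod-1',
  name: 'T-shirt',
  // Embedded: variants live inside the product and are read together with it
  variants: [
    { size: 'M', color: 'blue', stock: 12 },
    { size: 'L', color: 'red', stock: 3 }
  ]
};

const order = {
  _id: 'order-1',
  customer: customer._id, // Referenced: only the customer's ID is stored
  items: [{ product: product._id, size: 'M', quantity: 2 }]
};

// Embedded data is available as soon as the parent is loaded...
const sizesInStock = product.variants.filter(v => v.stock > 0).map(v => v.size);

// ...while referenced data needs a separate lookup by ID.
const customersById = new Map([[customer._id, customer]]);
const orderCustomer = customersById.get(order.customer);

console.log(sizesInStock);       // [ 'M', 'L' ]
console.log(orderCustomer.name); // Ada
```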

Creating References in Mongoose

To create a reference in Mongoose, use the mongoose.Schema.Types.ObjectId type along with the ref option to specify the model being referenced.

Single Reference


const postSchema = new mongoose.Schema({
  title: String,
  content: String,
  author: {
    type: mongoose.Schema.Types.ObjectId,
    ref: 'User'  // References the User model
  }
});

const Post = mongoose.model('Post', postSchema);
            

Array of References


const productSchema = new mongoose.Schema({
  name: String,
  price: Number,
  categories: [{
    type: mongoose.Schema.Types.ObjectId,
    ref: 'Category'  // References the Category model
  }]
});

const Product = mongoose.model('Product', productSchema);
            

Creating Documents with References


// Create a user
const user = new User({
  name: 'John Doe',
  email: 'john@example.com'
});

// Save the user
await user.save();

// Create a post with a reference to the user
const post = new Post({
  title: 'Introduction to Mongoose',
  content: 'Mongoose is an ODM for MongoDB...',
  author: user._id  // Store the user's ID as a reference
});

// Save the post
await post.save();
            
Diagram: in the User collection, the user document has _id: ObjectId('123abc'), name: 'John Doe', and email: 'john@example.com'. In the Post collection, the post document has _id: ObjectId('456def'), title: 'Introduction...', and author: ObjectId('123abc'), which references the user document.

Population: Retrieving Referenced Documents

Population is the process of automatically replacing the specified path(s) in a document with document(s) from other collection(s). It's like joining tables in SQL databases, but done in your application code.
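
Conceptually, population boils down to three steps: collect the stored IDs, fetch the matching documents in one batched lookup, and swap each ID for its document. The sketch below mimics this with in-memory arrays in plain JavaScript. It is not Mongoose's implementation; `populatePath` and the sample data are hypothetical, and string IDs stand in for ObjectIds.

```javascript
// In-memory stand-ins for two collections (hypothetical data).
const users = [
  { _id: 'u1', name: 'John Doe' },
  { _id: 'u2', name: 'Jane Roe' }
];
const posts = [
  { _id: 'p1', title: 'Intro', author: 'u1' },
  { _id: 'p2', title: 'Part 2', author: 'u1' },
  { _id: 'p3', title: 'Guest post', author: 'u2' }
];

// Hypothetical helper that mimics populate for a single-reference path.
function populatePath(docs, path, foreignCollection) {
  // 1. Collect the distinct IDs stored at `path`.
  const ids = new Set(docs.map(doc => doc[path]));
  // 2. One batched lookup, comparable to a query with { _id: { $in: [...] } }.
  const byId = new Map(
    foreignCollection.filter(f => ids.has(f._id)).map(f => [f._id, f])
  );
  // 3. Return copies with the ID replaced by the document, or null if missing.
  return docs.map(doc => ({ ...doc, [path]: byId.get(doc[path]) ?? null }));
}

const populated = populatePath(posts, 'author', users);
console.log(populated[0].author.name); // John Doe
```

The stored data is untouched: `posts[0].author` is still the ID `'u1'`, and only the returned copies carry the full author object.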

Basic Population


// Find a post and populate the author field
const post = await Post.findById('postId')
  .populate('author')
  .exec();

console.log(post);
/* Output:
{
  _id: ObjectId('postId'),
  title: 'Introduction to Mongoose',
  content: 'Mongoose is an ODM for MongoDB...',
  author: {
    _id: ObjectId('userId'),
    name: 'John Doe',
    email: 'john@example.com'
  }
}
*/
            

Populating Multiple Fields


const orderSchema = new mongoose.Schema({
  products: [{ 
    type: mongoose.Schema.Types.ObjectId, 
    ref: 'Product' 
  }],
  customer: { 
    type: mongoose.Schema.Types.ObjectId, 
    ref: 'User' 
  },
  shippingAddress: { 
    type: mongoose.Schema.Types.ObjectId, 
    ref: 'Address' 
  }
});

// Populate multiple fields
const order = await Order.findById('orderId')
  .populate('products')
  .populate('customer')
  .populate('shippingAddress')
  .exec();
            

Chaining Populate Method Calls


// Each .populate() call in the chain adds another path to populate
const order = await Order.findById('orderId')
  .populate('products')
  .populate('customer')
  .populate('shippingAddress')
  .exec();
            

Populating with a Single Call


// Populate multiple paths by passing an array of option objects
const order = await Order.findById('orderId')
  .populate([
    { path: 'products' },
    { path: 'customer' },
    { path: 'shippingAddress' }
  ])
  .exec();
            

Advanced Population Features

Selective Field Population

You can choose which fields from the referenced document to include or exclude:


// Only populate specific fields from the author
const post = await Post.findById('postId')
  .populate('author', 'name') // Only include name field
  .exec();

// Exclude specific fields (a separate variable name avoids redeclaring `post`)
const postWithoutPrivateFields = await Post.findById('postId')
  .populate('author', '-email -password') // Exclude email and password
  .exec();
            

Query Conditions for Population

You can add query conditions to filter which documents are populated:


// Only populate active users as authors
const post = await Post.findById('postId')
  .populate({
    path: 'author',
    match: { isActive: true }
  })
  .exec();

// If no matching document is found, the field will be null
console.log(post.author); // null if the author's isActive is false
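
A point that is easy to miss: `match` filters the populated documents, not the parent documents. The sketch below simulates this for a single-reference path using in-memory arrays in plain JavaScript. `populateWithMatch` and the sample data are hypothetical stand-ins, not Mongoose APIs.

```javascript
// In-memory stand-ins for two collections (hypothetical data).
const users = [
  { _id: 'u1', name: 'John', isActive: true },
  { _id: 'u2', name: 'Jane', isActive: false }
];
const posts = [
  { _id: 'p1', title: 'A', author: 'u1' },
  { _id: 'p2', title: 'B', author: 'u2' }
];

// Hypothetical stand-in for populate({ path, match }) on a single reference.
function populateWithMatch(docs, path, foreignCollection, matches) {
  return docs.map(doc => {
    const found = foreignCollection.find(f => f._id === doc[path] && matches(f));
    return { ...doc, [path]: found ?? null };
  });
}

const result = populateWithMatch(posts, 'author', users, u => u.isActive);
console.log(result.length);    // 2: both posts are still returned
console.log(result[1].author); // null: the inactive author was not populated

// To drop such posts entirely, filter after populating.
const withActiveAuthor = result.filter(p => p.author !== null);
console.log(withActiveAuthor.length); // 1
```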
            

Sorting Populated Data

For array references, you can sort the populated documents:


// Sort populated products by price (descending)
const order = await Order.findById('orderId')
  .populate({
    path: 'products',
    options: { sort: { price: -1 } }
  })
  .exec();
            

Limiting Populated Data

For array references, you can limit the number of populated documents:


// Limit populated products to 5
const order = await Order.findById('orderId')
  .populate({
    path: 'products',
    options: { limit: 5 }
  })
  .exec();
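
One caveat applies when several parent documents are populated at once, for example with `Order.find()`. In Mongoose, `options.limit` caps the populate query as a whole rather than per parent, while `perDocumentLimit` (available in Mongoose 5.9 and later) caps each parent separately. The sketch below simulates the difference with in-memory arrays in plain JavaScript. The two functions are hypothetical illustrations, not Mongoose's implementation.

```javascript
// In-memory stand-ins for two collections (hypothetical data).
const products = [
  { _id: 'a', price: 5 }, { _id: 'b', price: 9 },
  { _id: 'c', price: 7 }, { _id: 'd', price: 3 }
];
const orders = [
  { _id: 'o1', products: ['a', 'b'] },
  { _id: 'o2', products: ['c', 'd'] }
];

// One shared lookup capped at `limit` results, then distributed to the parents.
function populateSharedLimit(docs, limit) {
  const ids = docs.flatMap(d => d.products);
  const fetched = products.filter(p => ids.includes(p._id)).slice(0, limit);
  return docs.map(d => ({
    ...d,
    products: fetched.filter(p => d.products.includes(p._id))
  }));
}

// One capped lookup per parent document.
function populatePerDocumentLimit(docs, limit) {
  return docs.map(d => ({
    ...d,
    products: products.filter(p => d.products.includes(p._id)).slice(0, limit)
  }));
}

const shared = populateSharedLimit(orders, 2);
const perDoc = populatePerDocumentLimit(orders, 1);
console.log(shared.map(o => o.products.length)); // [ 2, 0 ]
console.log(perDoc.map(o => o.products.length)); // [ 1, 1 ]
```

With the shared limit, the first order uses up the whole budget and the second order ends up with an empty array. For a single parent fetched with `findById`, as in the example above, the two behave the same.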
            

Populating Nested Paths

You can populate nested references:


const commentSchema = new mongoose.Schema({
  text: String,
  user: {
    type: mongoose.Schema.Types.ObjectId,
    ref: 'User'
  }
});

const postSchema = new mongoose.Schema({
  title: String,
  content: String,
  comments: [commentSchema],
  author: {
    type: mongoose.Schema.Types.ObjectId,
    ref: 'User'
  }
});

// Populate both the author and the user in each comment
const post = await Post.findById('postId')
  .populate('author')
  .populate('comments.user')
  .exec();
            

Multi-level Population

You can perform multi-level population:


const courseSchema = new mongoose.Schema({
  title: String,
  instructor: {
    type: mongoose.Schema.Types.ObjectId,
    ref: 'User'
  }
});

const studentSchema = new mongoose.Schema({
  name: String,
  courses: [{
    type: mongoose.Schema.Types.ObjectId,
    ref: 'Course'
  }]
});

// Populate courses for a student, then populate the instructor for each course
const student = await Student.findById('studentId')
  .populate({
    path: 'courses',
    populate: {
      path: 'instructor'
    }
  })
  .exec();
            

Virtual Populations

Virtual populations allow you to define a virtual property that gets populated by documents from another collection. This is especially useful for reverse references.

Setting Up Virtual Populations


// Author schema, with virtuals enabled in the schema options
// so that `books` appears in toJSON() and toObject() output
const authorSchema = new mongoose.Schema({
  name: String,
  bio: String
}, {
  toJSON: { virtuals: true },
  toObject: { virtuals: true }
});

// Virtual property for the author's books
authorSchema.virtual('books', {
  ref: 'Book',         // The model to use
  localField: '_id',   // Find books where `localField`
  foreignField: 'author', // is equal to `foreignField`
  justOne: false       // Set to true for one-to-one relationships
});

// The third argument of mongoose.model() is a collection name,
// so schema options belong in the Schema constructor above
const Author = mongoose.model('Author', authorSchema);

// Book schema
const bookSchema = new mongoose.Schema({
  title: String,
  author: {
    type: mongoose.Schema.Types.ObjectId,
    ref: 'Author'
  }
});

const Book = mongoose.model('Book', bookSchema);
            

Using Virtual Populations


// Find an author and populate their books
const author = await Author.findById('authorId')
  .populate('books')
  .exec();

console.log(author.books); // Array of books where author field is authorId
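
Under the hood, a virtual population is a reverse lookup: for each parent, find the foreign documents whose `foreignField` equals the parent's `localField`. The sketch below mimics this with in-memory arrays in plain JavaScript. `populateVirtual` and its `name` and `from` options are hypothetical and only loosely mirror the virtual's options; this is not Mongoose's implementation.

```javascript
// In-memory stand-ins for two collections (hypothetical data).
const authors = [{ _id: 'a1', name: 'Ursula' }, { _id: 'a2', name: 'Italo' }];
const books = [
  { _id: 'b1', title: 'First', author: 'a1' },
  { _id: 'b2', title: 'Second', author: 'a1' },
  { _id: 'b3', title: 'Third', author: 'a2' }
];

// Hypothetical helper that loosely mirrors the virtual's options.
function populateVirtual(docs, { name, from, localField, foreignField, justOne = false }) {
  return docs.map(doc => {
    // Reverse lookup: foreign documents that point back at this document.
    const related = from.filter(f => f[foreignField] === doc[localField]);
    return { ...doc, [name]: justOne ? (related[0] ?? null) : related };
  });
}

const [first] = populateVirtual(authors, {
  name: 'books', from: books, localField: '_id', foreignField: 'author'
});
console.log(first.books.map(b => b.title)); // [ 'First', 'Second' ]
console.log('books' in authors[0]);         // false: nothing is stored on the author
```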
            

Analogy: Mutual Friends on Social Media

Virtual populations are like seeing your mutual friends on social media. The direct relationship is stored in one direction (you've friended them), but the platform can compute and show the reverse relationship (they're friends with you) without storing it explicitly.

Performance Considerations

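While population is powerful, it has a cost: each populated path triggers an additional query, and the results are hydrated into full Mongoose documents unless you opt out with lean(). Selecting only the fields you need keeps that cost down: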

Performance Tips


// Optimized query with field selection and lean()
const post = await Post.findById('postId')
  .select('title author') // Only select these fields
  .populate('author', 'name') // Only populate the name from author
  .lean() // Return plain JavaScript objects
  .exec();
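
A related pitfall is resolving references by hand inside a loop, which costs one round trip per document (the "N+1" pattern), whereas a batched lookup resolves all of them in one. The sketch below counts simulated round trips in plain JavaScript. `findUserById` and `findUsersByIds` are hypothetical stand-ins for database calls, not Mongoose functions.

```javascript
// In-memory stand-ins for two collections (hypothetical data).
const users = new Map([
  ['u1', { _id: 'u1', name: 'John' }],
  ['u2', { _id: 'u2', name: 'Jane' }]
]);
const posts = [
  { title: 'A', author: 'u1' },
  { title: 'B', author: 'u2' },
  { title: 'C', author: 'u1' }
];

let queryCount = 0;
// Hypothetical stand-ins for database round trips.
function findUserById(id) {
  queryCount += 1;
  return users.get(id) ?? null;
}
function findUsersByIds(ids) {
  queryCount += 1;
  return ids.map(id => users.get(id)).filter(Boolean);
}

// Naive: one lookup per post.
queryCount = 0;
const naive = posts.map(p => ({ ...p, author: findUserById(p.author) }));
const naiveQueries = queryCount;

// Batched: one lookup for all distinct author IDs.
queryCount = 0;
const distinctIds = [...new Set(posts.map(p => p.author))];
const byId = new Map(findUsersByIds(distinctIds).map(u => [u._id, u]));
const batched = posts.map(p => ({ ...p, author: byId.get(p.author) ?? null }));
const batchedQueries = queryCount;

console.log(naiveQueries, batchedQueries); // 3 1
```

Both approaches produce the same result. Only the number of round trips differs, and the gap grows with the number of parent documents.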
            

Real-World Example: Blog Platform

Let's build a comprehensive example of a blog platform using Mongoose references and population:


const mongoose = require('mongoose');
const { Schema } = mongoose;

// User schema
const userSchema = new Schema({
  name: String,
  email: {
    type: String,
    unique: true,
    required: true
  },
  password: String, // In production, you'd hash this!
  bio: String,
  joinDate: {
    type: Date,
    default: Date.now
  },
  isAdmin: {
    type: Boolean,
    default: false
  }
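}, {
  // Include virtuals such as `posts` when converting to JSON or plain objects
  toJSON: { virtuals: true },
  toObject: { virtuals: true }
});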

// Category schema
const categorySchema = new Schema({
  name: {
    type: String,
    required: true,
    unique: true
  },
  description: String,
  slug: {
    type: String,
    required: true,
    unique: true
  }
});

// Comment schema
const commentSchema = new Schema({
  content: {
    type: String,
    required: true
  },
  author: {
    type: Schema.Types.ObjectId,
    ref: 'User',
    required: true
  },
  post: {
    type: Schema.Types.ObjectId,
    ref: 'Post',
    required: true
  },
  createdAt: {
    type: Date,
    default: Date.now
  },
  isApproved: {
    type: Boolean,
    default: true
  }
});

// Post schema
const postSchema = new Schema({
  title: {
    type: String,
    required: true
  },
  content: {
    type: String,
    required: true
  },
  excerpt: String,
  slug: {
    type: String,
    required: true,
    unique: true
  },
  author: {
    type: Schema.Types.ObjectId,
    ref: 'User',
    required: true
  },
  categories: [{
    type: Schema.Types.ObjectId,
    ref: 'Category'
  }],
  tags: [String],
  featuredImage: String,
  publishDate: {
    type: Date,
    default: Date.now
  },
  isPublished: {
    type: Boolean,
    default: false
  },
  viewCount: {
    type: Number,
    default: 0
  }
}, {
  timestamps: true,
  toJSON: { virtuals: true },
  toObject: { virtuals: true }
});

// Virtual for comments - reverse reference
postSchema.virtual('comments', {
  ref: 'Comment',
  localField: '_id',
  foreignField: 'post'
});

// Virtual for user's posts - reverse reference
userSchema.virtual('posts', {
  ref: 'Post',
  localField: '_id',
  foreignField: 'author'
});

// Create models
const User = mongoose.model('User', userSchema);
const Category = mongoose.model('Category', categorySchema);
const Post = mongoose.model('Post', postSchema);
const Comment = mongoose.model('Comment', commentSchema);

// Example usage: Getting a post with all related data
async function getCompletePost(postId) {
  const post = await Post.findById(postId)
    .populate('author', 'name email bio') // Populate author with selected fields
    .populate('categories') // Populate categories
    .populate({ // Populate comments and their authors
      path: 'comments',
      match: { isApproved: true }, // Only approved comments
      options: { sort: { createdAt: -1 } }, // Sort by newest first
      populate: {
        path: 'author',
        select: 'name'
      }
    })
    .exec();
  
  return post;
}

// Example usage: Getting a user with their posts
async function getUserWithPosts(userId) {
  const user = await User.findById(userId)
    .populate({
      path: 'posts',
      match: { isPublished: true }, // Only published posts
      options: { sort: { publishDate: -1 } }, // Sort by newest first
      select: 'title excerpt publishDate viewCount' // Only select these fields
    })
    .exec();
  
  return user;
}

// Example usage: Getting posts by category
async function getPostsByCategory(categorySlug) {
  // First find the category
  const category = await Category.findOne({ slug: categorySlug });
  if (!category) return [];
  
  // Then find posts in this category
  const posts = await Post.find({
    categories: category._id,
    isPublished: true
  })
    .sort({ publishDate: -1 })
    .populate('author', 'name')
    .populate('categories', 'name slug')
    .select('title excerpt slug publishDate author categories')
    .exec();
  
  return posts;
}

module.exports = {
  User,
  Category,
  Post,
  Comment,
  getCompletePost,
  getUserWithPosts,
  getPostsByCategory
};
            

This example demonstrates how to set up relationships between users, posts, categories, and comments, and how to use population to efficiently retrieve related data.

Practice Activities

Activity 1: E-commerce Product and Review System

Create schemas for an e-commerce system with products and reviews.

  1. Create User, Product, and Review schemas
  2. Set up references between them (a Product has Reviews, Reviews have a User author)
  3. Implement functions to:
    • Get a product with all its reviews populated
    • Get a user with all their reviews populated
    • Create a new review for a product

Activity 2: Library Management System

Design schemas for a library system with books, authors, and borrowing records.

  1. Create Author, Book, User, and BorrowRecord schemas
  2. Set up appropriate references (Books have Authors, BorrowRecords reference Users and Books)
  3. Implement functions to:
    • Get all books by a specific author
    • Get a user's borrowing history
    • Find all currently borrowed books

Activity 3: Social Media Application

Create schemas for a simple social media application.

  1. Create User, Post, and Comment schemas
  2. Implement a "friends" or "follows" relationship between users
  3. Use virtual populations to find:
    • All posts by a user's friends
    • All comments on a user's posts
    • A user's activity feed (posts and comments)

Further Reading