MongoDB: You're doing it wrong!

#mongodb #database

The main reason we use a NoSQL database, typically MongoDB, is to store and query large volumes of data in a scalable way.

Document Reference Pattern.

When we model NoSQL data the RDBMS way, we need to reference documents in other collections to link, or join, two pieces of related data.

// document in organization collection
{
   _id: "google",
   name: "Google"
}

// document in user collection
{
   _id: "john",
   name: "John Smith",
   organization_id: "google"
}

{
   _id: "jeff",
   name: "Jeff Brown",
   organization_id: "google"
}

So to find an organization and all of its users in one query, we need to use the aggregation framework:

db.getCollection('organization')
.aggregate([
  {
    $match: { _id: "google"}
  },
  {
    $lookup: {
        from: "user",
        localField: "_id",
        foreignField: "organization_id",
        as : "users"
    }
  }
])

Result

{
    "_id" : "google",
    "name" : "Google",
    "users" : [
        {
            "_id" : "john",
            "name" : "John Smith",
            "organization_id" : "google"
        },
        {
            "_id" : "jeff",
            "name" : "Jeff Brown",
            "organization_id" : "google"
        }
    ]
}
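To make the shape of that result concrete, here is a minimal in-memory sketch of what the $match plus $lookup stages compute. The arrays and the `lookupUsers` function are hypothetical stand-ins for the two collections, and this only mirrors the join's semantics, not how the server actually executes it.

```javascript
// Hypothetical in-memory stand-ins for the organization and user collections.
const organizations = [{ _id: "google", name: "Google" }];
const users = [
  { _id: "john", name: "John Smith", organization_id: "google" },
  { _id: "jeff", name: "Jeff Brown", organization_id: "google" },
];

// Mirrors the result shape of $match + $lookup: every matched organization
// gets a "users" array holding the user documents that reference it.
function lookupUsers(orgId) {
  return organizations
    .filter((org) => org._id === orgId)
    .map((org) => ({
      ...org,
      users: users.filter((user) => user.organization_id === org._id),
    }));
}

const joined = lookupUsers("google");
console.log(JSON.stringify(joined, null, 2));
```

Each matched organization requires a second pass over the user data, which is the extra work the join introduces.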

Queries that rely on joins tend not to scale well: the cost of computing the join rises as the data footprint grows.

Adjacency list pattern

Let's try the adjacency list pattern for storing data: use one collection for all data. Let's call it "DATA".

//organization document in DATA collection
{
    "_id": "org#google",
    "name": "Google",
}
{
    "_id": "org#microsoft",
    "name": "Microsoft",
}
{
    "_id": "org#apple",
    "name": "Apple",
}

//user document in DATA collection
{
   _id: "org#google#user#john",
   name: "John Smith"
}
{
   _id: "org#google#user#jeff",
   name: "Jeff Brown"
}
{
   _id: "org#apple#user#tim",
   name: "Tim Cook"
}
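The hierarchical _id values above follow a simple "type#id" convention, so they can be built and parsed with small helpers. The sketch below is one possible way to do that. The names `orgKey`, `userKey`, and `parseKey` are hypothetical, and it assumes that individual ids never contain the "#" separator.

```javascript
// Separator used between key segments (assumed never to appear inside an id).
const SEP = "#";

// Builds the key of an organization document, e.g. "org#google".
function orgKey(orgId) {
  return ["org", orgId].join(SEP);
}

// Builds the key of a user document nested under its organization.
function userKey(orgId, userId) {
  return [orgKey(orgId), "user", userId].join(SEP);
}

// Splits a key back into { type: id } pairs.
function parseKey(id) {
  const parts = id.split(SEP);
  const result = {};
  for (let i = 0; i + 1 < parts.length; i += 2) {
    result[parts[i]] = parts[i + 1];
  }
  return result;
}

console.log(userKey("google", "john"));
console.log(parseKey("org#apple#user#tim"));
```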

Let's try to find an organization and all the users in one query.

db.getCollection('DATA').find({_id: {$regex: /^org#google/}})

The query finds all documents in the DATA collection whose _id starts with "org#google".
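One caveat with a bare prefix is that it also matches any unrelated key that happens to begin with the same characters, such as a hypothetical "org#googleplex". A possible refinement, sketched below, is to build the filter from the root key and require either the end of the string or the "#" separator right after it. The helper names `escapeRegex` and `subtreeFilter` are assumptions for illustration, and the sketch only checks the pattern in memory.

```javascript
// Escapes characters that have a special meaning in a regular expression.
function escapeRegex(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Builds a filter matching the root document and everything nested under it.
function subtreeFilter(rootId) {
  const pattern = new RegExp("^" + escapeRegex(rootId) + "(#|$)");
  return { _id: { $regex: pattern } };
}

const filter = subtreeFilter("org#google");

// In the shell the filter would be used as:
//   db.getCollection('DATA').find(filter)
// Here we only test the pattern against some sample keys.
const sampleIds = [
  "org#google",
  "org#google#user#jeff",
  "org#googleplex",
  "org#apple#user#tim",
];
const matchedIds = sampleIds.filter((id) => filter._id.$regex.test(id));
console.log(matchedIds);
```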

Result


{
    "_id" : "org#google",
    "name" : "Google"
}

{
    "_id" : "org#google#user#jeff",
    "name" : "Jeff Brown"
}

{
    "_id" : "org#google#user#john",
    "name" : "John Smith"
}
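Unlike the $lookup result, this query returns a flat list of documents. If the application needs the nested shape, it can be rebuilt client-side. The following is a minimal sketch under that assumption. `nestUsers` is a hypothetical helper, and `docs` is a hard-coded copy of the result above.

```javascript
// Hard-coded copy of the documents returned by the prefix query.
const docs = [
  { _id: "org#google", name: "Google" },
  { _id: "org#google#user#jeff", name: "Jeff Brown" },
  { _id: "org#google#user#john", name: "John Smith" },
];

// Nests the user documents under their organization document.
function nestUsers(documents, organizationKey) {
  const org = documents.find((doc) => doc._id === organizationKey);
  if (!org) {
    return null;
  }
  const userPrefix = organizationKey + "#user#";
  const users = documents.filter((doc) => doc._id.startsWith(userPrefix));
  return { ...org, users };
}

const nested = nestUsers(docs, "org#google");
console.log(JSON.stringify(nested, null, 2));
```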

We retrieve the same data without a join, without the aggregation framework, and without adding any indexes: a case-sensitive regular expression anchored at the start of _id can use the default _id index.
