The main reason to use a NoSQL database, typically MongoDB, is to store and query big data in a scalable way.
Document Reference Pattern
When we model NoSQL data the way we would in an RDBMS, we need to reference documents in other collections to link, or join, two pieces of related data.
// document in organization collection
{
_id: "google",
name: "Google"
}
// document in user collection
{
_id: "john",
name: "John Smith",
organization_id: "google"
}
{
_id: "jeff",
name: "Jeff Brown",
organization_id: "google"
}
So, to find an organization and all of its users in one query, we need to use the aggregation framework:
db.getCollection('organization')
.aggregate([
{
$match: { _id: "google"}
},
{
$lookup: {
from: "user",
localField: "_id",
foreignField: "organization_id",
as : "users"
}
}
])
Result
{
"_id" : "google",
"name" : "Google",
"users" : [
{
"_id" : "john",
"name" : "John Smith",
"organization_id" : "google"
},
{
"_id" : "jeff",
"name" : "Jeff Brown",
"organization_id" : "google"
}
]
}
Queries that rely on joins like this don't scale well: $lookup has to match documents from the user collection at query time, so the computation cost rises as the data footprint increases.
Adjacency List Pattern
Let's try the adjacency list pattern for storing data instead:
Use one collection for all data. Let's call it "DATA".
// organization documents in DATA collection
{
"_id": "org#google",
"name": "Google"
}
{
"_id": "org#microsoft",
"name": "Microsoft"
}
{
"_id": "org#apple",
"name": "Apple"
}
// user documents in DATA collection
{
_id: "org#google#user#john",
name: "John Smith"
}
{
_id: "org#google#user#jeff",
name: "Jeff Brown"
}
{
_id: "org#apple#user#tim",
name: "Tim Cook"
}
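The _id values above follow a simple convention: alternating type and identifier segments joined with "#". Here is a minimal sketch of how such keys could be built and parsed in application code. The names SEP, orgKey, userKey and parseKey are hypothetical helpers for illustration, not part of MongoDB, and the check that identifiers must not contain "#" is an assumption of this sketch.

```javascript
// Hypothetical helpers for the "#"-separated key convention used in this post.
const SEP = "#";

// Reject identifiers that are empty or would break the key layout.
function checkId(id) {
  if (typeof id !== "string" || id.length === 0 || id.includes(SEP)) {
    throw new Error("invalid identifier: " + id);
  }
  return id;
}

// Key of an organization document, e.g. "org#google".
function orgKey(orgId) {
  return ["org", checkId(orgId)].join(SEP);
}

// Key of a user document nested under its organization,
// e.g. "org#google#user#john".
function userKey(orgId, userId) {
  return [orgKey(orgId), "user", checkId(userId)].join(SEP);
}

// Split a key back into its type/identifier pairs.
function parseKey(key) {
  const parts = key.split(SEP);
  const result = {};
  for (let i = 0; i + 1 < parts.length; i += 2) {
    result[parts[i]] = parts[i + 1];
  }
  return result;
}
```

With these helpers, userKey("google", "john") produces the same _id as the first user document above, so documents could be inserted with something like db.getCollection('DATA').insertOne({ _id: userKey("google", "john"), name: "John Smith" }).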
Let's try to find an organization and all the users in one query.
db.getCollection('DATA').find({_id: {$regex: /^org#google/}})
The query finds all documents in the DATA collection whose _id starts with "org#google".
Result
{
"_id" : "org#google",
"name" : "Google"
}
{
"_id" : "org#google#user#jeff",
"name" : "Jeff Brown"
}
{
"_id" : "org#google#user#john",
"name" : "John Smith"
}
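Unlike the $lookup output, this result is a flat list of documents. If the nested shape is needed, it can be rebuilt in application code. A minimal sketch, assuming the key layout used in this post (user keys contain "#user#"); groupOrganization is a hypothetical helper, not a MongoDB API.

```javascript
// Hypothetical helper: turn the flat result of the prefix query into
// one organization object with a nested "users" array.
function groupOrganization(docs) {
  let organization = null;
  const users = [];
  for (const doc of docs) {
    if (doc._id.includes("#user#")) {
      users.push(doc); // nested user document
    } else {
      organization = doc; // the organization document itself
    }
  }
  return organization ? { ...organization, users } : null;
}

// The documents returned by the query above.
const docs = [
  { _id: "org#google", name: "Google" },
  { _id: "org#google#user#jeff", name: "Jeff Brown" },
  { _id: "org#google#user#john", name: "John Smith" },
];

const grouped = groupOrganization(docs);
console.log(JSON.stringify(grouped, null, 2));
```

In the shell, the input could come from db.getCollection('DATA').find(...).toArray() instead of the hard-coded array.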
We can retrieve the same data without a join, without adding any indexes beyond the default one on _id, and without using the aggregation framework.
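One thing to watch with the prefix query: /^org#google/ would also match a key such as "org#googleplex", and an organization id containing regex metacharacters would change the meaning of the pattern. A minimal sketch of a stricter filter, assuming the "#" key layout from this post; escapeRegex and orgPrefixFilter are hypothetical helpers. It keeps the regex anchored and case-sensitive, which is the form MongoDB documents as able to use an index on the field.

```javascript
// Escape characters that have a special meaning in a regular expression.
function escapeRegex(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper: build a filter that matches the organization
// document itself, plus every document nested under it.
function orgPrefixFilter(orgId) {
  const key = "org#" + orgId;
  return {
    $or: [
      { _id: key }, // exact match on the organization document
      { _id: { $regex: new RegExp("^" + escapeRegex(key + "#")) } }, // children only
    ],
  };
}
```

Usage would look like db.getCollection('DATA').find(orgPrefixFilter("google")). The trailing "#" in the prefix is what keeps "org#googleplex#user#..." out of the result.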