If you have ever worked with MongoDB, chances are you've heard of or even come across the aggregation pipeline that comes with it. Aggregation pipeline in MongoDB is a very powerful feature to get insights out of your data that is not possible with simple find operation.
In this article, we're going to take at the project stage. We do have the projection support in find queries as well but it cannot achieve what project stage in the aggregation pipeline can. Today we're going to explore some of the common use-cases for the project stage:
- Projecting all fields
- Working with arrays
- Naming field in project stage
- Filtering fields
- Literal fields
- Conditionally removing fields
Establishing the data
Before we jump into the aggregation pipeline and the project stage, we would need some data to work with. I'm taking an example Restaurants
collection for understanding the concept here. I'll be linking to a MongoDB playground for each query so that you can play around and see how it affects the output.
Here's the Restaurants
collection with only 5 documents containing random data:
{
name: 'Thai Restaurant',
location: 'London',
notOpenOnDays: ['Monday', 'Thursday'],
entryFee: '$5',
rating: '3',
},
{
name: 'Indian.com',
location: 'Broughton',
notOpenOnDays: ['Wednesday'],
entryFee: '$0',
rating: '4',
},
{
name: '23rd Street Pizza',
location: 'Fallowfield',
notOpenOnDays: ['Sunday'],
entryFee: '$2',
rating: '4',
},
{
name: '7 STAR Pizza',
location: 'Newcastle',
notOpenOnDays: ['Saturday', 'Sunday'],
entryFee: '$10',
rating: '2',
},
{
name: 'A Cake A Shake',
location: 'London',
notOpenOnDays: ['Monday'],
entryFee: '$1',
rating: '4',
},
Projecting all fields
In the MongoDB aggregation pipeline, a pipeline stage has access to only those fields which are included in the previous stage.
So if you want to include all fields, you can include everything in the project stage so that it gets passed to the next one. Any field missed is not going to make it to the next stage of the pipeline.
MongoDB gives us a way to state which fields we want to include and exclude. So if you want to include only the name, the project stage would be:
{
$project: { name: 1 }
}
You can also put true
or any number greater than 1
to show inclusion of a field but let's just stick to 1
for simplicity & conciseness.
Similarly, if you want to exclude the field you'd do the opposite:
{
$project: { name: 0 }
}
And, if you want to include everything in the Restaurants
collection, here's what the project stage would look like:
{
$project: {
name: 1,
location: 1,
notOpenOnDays: 1,
entryFee: 1,
rating: 1,
}
}
NOTE: Including a field in the project stage implicitly means excluding all the other fields and vice-versa.
Working with arrays
Creating arrays
Creating a new array field is just as creating any non-array field, the only difference is that the field value is wrapped in []
to make it an array.
We can do it like this:
{
$project: {
review: [ "$rating" ]
}
}
Now review
field will be an array with rating values in it.
Extracting element(s) at specific index
To get an array element at a specific index, we use use $arrayElemAt
like shown below,
{
$project: {
fieldName: { $arrayElemAt: ["$arrayFieldName", index] }
}
}
With the latest MongoDB version 4.4, we now also have dedicated operators to get first and last elements in an array. The operators are $first
and $last
.
{
$project: {
review: [ "$rating" ]
}
},
{
$project: {
newReview: { $first: "$review" },
_id: 0
}
}
We are putting the restaurant rating in an array named review and in the next project stage we are extracting the first element of the array created in the stage before. This gives us:
{ "newReview" : 3 }
{ "newReview" : 4 }
{ "newReview" : 4 }
{ "newReview" : 2 }
{ "newReview" : 4 }
NOTE: We can chain as many project stages as we want in an aggregation pipeline.
{
$project: {
review: [ "$rating" ]
}
},
{
$project: {
newReview: { $last: "$review" },
_id: 0
}
}
This will give the same output as there is only one element in the array currently, let's add some values to the reviews
array.
{
$project: {
review: [ 1, 2, 3, "$rating" ]
}
},
{
$project: {
newReview: { $last: "$review" },
_id: 0
}
}
We would get the ratings as the output of this as we have it in the last position:
{ "newReview" : 3 }
{ "newReview" : 4 }
{ "newReview" : 4 }
{ "newReview" : 2 }
{ "newReview" : 4 }
Filtering values
Filter lets you write custom filter which is applied to every element in the array and only the elements which pass the filter are passed as output to next stage.
{
$project: {
_id: 0,
openOnWeekend: {
$filter: {
input: "$notOpenOnDays",
as: "notOpen",
cond: {
$and: [
{ $ne: ["$$notOpen", "Saturday"] },
{ $ne: ["$$notOpen","Sunday"] },
]
}
}
}
}
}
Output of the above aggregation query would only contain the restaurants which are open on weekends:
[
{
"openOnWeekend": [ "Monday", "Thursday" ]
},
{
"openOnWeekend": [ "Wednesday" ]
},
{
"openOnWeekend": []
},
{
"openOnWeekend": []
},
{
"openOnWeekend": [ "Monday" ]
}
]
NOTE: We use
$
for values present in our document and$$
for fields that are introduced within a pipeline stage.
Transforming data
$map
operator is just like the array.map()
in javascript, it takes an input array, applies an expression on each item and return the transformed array.
Here you can directly transform data as compared to filter where you need to provide a condition to handle the data appropriately.
Let's transform the notOpenOnDays
array and shorten the weekday names:
{
$project: {
_id: 0,
newPrice: {
$map: {
input: "$notOpenOnDays",
as: "noOpenOn",
in: {
$substrCP: [ "$$noOpenOn", 0, 3 ]
}
}
}
}
}
We have the substring operator to get only first 3 characters and remove the rest:
[
{
"newPrice": [ "Mon", "Thu" ]
},
{
"newPrice": [ "Wed" ]
},
{
"newPrice": [ "Sun" ]
},
{
"newPrice": [ "Sat", "Sun" ]
},
{
"newPrice": [ "Mon" ]
}
]
There are a bunch of other array expression operators, you can refer them here.
Naming field in project stage
The aggregation pipeline provides great flexibility when it comes to naming fields. Any name you specify in the project stage acts as the "new" name for that field from that point forwards.
You give an existing field a new name which to effectively rename it:
{
$project: {
closedOn: "$notOpenOnDays"
}
}
Now the output will contain closedOn
field with notOpenOnDays
array as value.
Literal fields
By default, whatever you put in the project stage is treated as an expression and it is evaluated to a value. You can use $literal
operator if you want MongoDB to treat any field's value like a constant or literal.
Let's say we want to find out the restaurants where the entry fee is exactly equal to "$1", we can leverage literal operator to achieve this:
{
"$project": {
affordable: {
$eq: [
"$entryFee",
{
$literal: "$1"
}
]
}
}
},
{
"$project": {
_id: 0,
status: {
"$cond": {
"if": "$affordable",
"then": "restaurant is affordable ππ",
"else": "Not affordable π"
}
}
}
}
You can chain as many project stages as you need in the aggregation pipeline. Output of the above pipeline evaluates to ππ»
[
{
"status": "Not affordable π"
},
{
"status": "Not affordable π"
},
{
"status": "Not affordable π"
},
{
"status": "Not affordable π"
},
{
"status": "restaurant is affordable ππ"
}
]
Conditionally excluding fields
In the above example, we saw a bunch of redundant information in the output. We are only interested in knowing the affordable restaurants so it makes sense to not output the other ones.
With MongoDB 3.6, you can remove fields from the output conditionally. Let's modify our above project stage to only output affordable restaurants:
{
"$project": {
name: 1,
affordable: {
$eq: [
"$entryFee",
{
$literal: "$1"
}
]
}
}
},
{
"$project": {
_id: 0,
status: {
"$cond": {
"if": "$affordable",
"then": "$name",
"else": "$$REMOVE"
}
}
}
}
- We're now including the name in the first project stage to make sure it is available in the next stage.
-
$cond
operator is updated to show the name of only the affordable restaurants.
Output:
[
{},
{},
{},
{},
{
"status": "A Cake A Shake"
}
]
Conclusion
So to summarize we have seen that the project stage in the MongoDB aggregation pipeline can do much more than just include & exclude fields. We can create arrays, modify its elements, perform filters, slice strings, conditionally remove fields and get data as per our needs.
If you think something is missing or have any questions, please post it down in the comments. I'd be happy to connect!
Top comments (0)