Intro
Let's look at a MongoDB query that uses the Mongo Aggregation Framework (AF) that retrieves a resultset of docs between two points in time.
mgArr(dbEnum.nlpdb, collEnum.users_actions,
{ $addFields: { ts: { $toDate: "$_id" } } },
{
$match: {
ts: {
$gte: new Date("2021-02-01T00:00:00.000Z"),
$lt: new Date("2021-02-15T00:00:00.000Z")
}
}
},
{ $sort: { ts: 1 } },
{ $skip: 0 },
{ $limit: 100 },
)
Notes
There are 5 stages here.
extract the timestamp from the _id field (PK), as a new field called "ts" (timestamp)
filter (match) by dates that are greater than the earliest date and less than the latest date
sort by the earliest date (the 1 means ascending; a -1 would mean descending)
let's add pagination, which requires two stages, $skip and $limit. $skip: 0 mean get the first page of results without skiping to the next page
limit the resultset to the first 100 results, as page 1 (no skip)
These 5 stages are the raw AF (Aggregation Framework) syntaxes.
Version 2 - using functional wrappers
mgArr(dbEnum.nlpdb, collEnum.users_actions,
addField_ts(),
matchBetweenDates("2021-02-01", "2021-02-04"),
sortAsc("ts"),
paginate(100, 1),
)
Notes
We've made the code more terse and increased its readability by abstracting away the tedious and repetitious curl braces. Now it looks more functional, since it's just one function call after another, passing in our argument that are particular to this queries use case.
Notice the naming convention for the addField_ts func. The underscore indicates a namespacing. I use namespacing to clearly specify a category of funcs. I have a number of "addField" funcs that add fields to a resultset for particular common use cases. Adding a new timestamp field is just one common use case.
Notice it appears we have 4 stages now instead of 5. The paginate func composes two stages, $skip and $limit to give us a simpler pagination behavior. The paginate func returns the $skip and $limit, so technically there are still 5 stages sent to MongoDB as a raw query.
I made the matchBetweenDates func to be inclusive of the start and end date.
Here is the source code:
/**
@func
get docs between two dates
inclusive of days
@param {string} start - the earliest day to include
@param {string} end - the latest day to include
@return {object}
*/
export const matchBetweenDates = (start, end) => {
return {
$match: {
ts: {
$gte: new Date(start + "T00:00:00.000Z"),
$lte: new Date(end + "T23:59:59.000Z")
}
}
};
};
What's Next
We'll be covering more advanced usages of partitioning your data on date dimensions, to allow you better insights into your data, and make better predictions about the future.
As always, if you have any questions on this let me know.
Top comments (0)