TL;DR: I am researching how to model MongoDB operators in PHP: from namespaced functions, to static methods on classes or enums, to closures stored in variables. I'm looking for the best developer experience.
About MongoDB aggregation pipeline operators
As a developer on the MongoDB PHP driver team, my goal is to provide the best developer experience to PHP developers using the document database.
MongoDB provides drivers for various languages, including PHP. To make it easier to create aggregation pipelines in PHP, we need to model all the stages and operators as functions that can be combined.
An aggregation pipeline is a list of "stage" documents. As an example, we will take a query with $match and a join with $lookup:
db.orders.aggregate([
    {
        $match: {
            $or: [
                { status: "shipped" },
                { created_at: { $gte: ISODate("2023-01-01T00:00:00Z") } }
            ]
        }
    },
    {
        $lookup: {
            from: "inventory",
            localField: "product_id",
            foreignField: "product_id",
            as: "inventory_docs"
        }
    }
])
Each key with a dollar prefix is an operator for which we want to provide a factory method.
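For reference, here is a sketch of the same pipeline written as a plain PHP array, which is roughly what the factory functions discussed below would have to produce. The date is kept as a plain string so the snippet runs without the MongoDB extension; real code would presumably use MongoDB\BSON\UTCDateTime instead.

```php
<?php
// Plain-array form of the pipeline; every '$'-prefixed key is an operator.
// Assumption: the date is a string placeholder, not a BSON date object.
$pipeline = [
    ['$match' => [
        '$or' => [
            ['status' => 'shipped'],
            ['created_at' => ['$gte' => '2023-01-01T00:00:00Z']],
        ],
    ]],
    ['$lookup' => [
        'from' => 'inventory',
        'localField' => 'product_id',
        'foreignField' => 'product_id',
        'as' => 'inventory_docs',
    ]],
];
```

Note the single quotes: in a double-quoted string, PHP would try to interpolate $match as a variable.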
Namespaced functions
The most obvious solution is to create namespaced functions, like MongoDB\Operator\eq for the $eq operator.
namespace MongoDB\Operator;

function eq(mixed $value): array {
    return ['$eq' => $value];
}

function lookup(string $from, string $localField, string $foreignField, string $as): array {
    return ['$lookup' => [
        'from' => $from,
        'localField' => $localField,
        'foreignField' => $foreignField,
        'as' => $as,
    ]];
}
Using these functions with named arguments, the pipeline could be written in PHP as follows:
pipeline(
    match(
        or(
            query(status: eq('shipped')),
            query(date: gte(new UTCDateTime())),
        ),
    ),
    lookup(
        from: 'inventory',
        localField: 'product_id',
        foreignField: 'product_id',
        as: 'inventory_docs',
    ),
);
However, some operator names conflict with reserved keywords in PHP. We cannot create functions (global or namespaced) named after operators such as $and, $or, $match, and $unset.
Adding a prefix or suffix to function names
To avoid this problem of reserved names, we can add a prefix or a suffix to the function names.
With the type of operator as suffix:
function andQuery(...) { /* ... */ }
function matchStage(...) { /* ... */ }
With an underscore:
function _and(...) { /* ... */ }
function _match(...) { /* ... */ }
Or with an emoji. Beautiful, but not practical:
function 🔍and(...) { /* ... */ }
function 💲match(...) { /* ... */ }
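To see the suffix variant end to end, here is a minimal runnable sketch. The namespace Example\Operator, the helper query(), and all function bodies are assumptions made for illustration, not the library's API. The date is a plain string so that the snippet does not require the MongoDB extension.

```php
<?php
// Hypothetical namespace and function names, for illustration only.
namespace Example\Operator;

function eq(mixed $value): array
{
    return ['$eq' => $value];
}

function gte(mixed $value): array
{
    return ['$gte' => $value];
}

// Unknown named arguments are collected by the variadic parameter with
// string keys, so query(status: ...) yields ['status' => ...].
function query(array ...$fields): array
{
    return $fields;
}

// "or" is a reserved keyword, hence the "Query" suffix.
function orQuery(array ...$queries): array
{
    return ['$or' => $queries];
}

// "match" is a reserved keyword since PHP 8.0, hence the "Stage" suffix.
function matchStage(array $query): array
{
    return ['$match' => $query];
}

$stage = matchStage(
    orQuery(
        query(status: eq('shipped')),
        query(created_at: gte('2023-01-01T00:00:00Z')),
    ),
);

echo json_encode($stage), PHP_EOL;
```

The suffix removes the keyword conflict, but the PHP names no longer match the operator names one to one, which is the trade-off this approach accepts.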
Static class methods
Fortunately, the list of reserved keywords is much shorter for method names: since PHP 7.0, most keywords are allowed there. We can create static methods on classes.
final class Stage {
    public static function lookup(...) { /* ... */ }
    public static function match(...) { /* ... */ }
}

final class Query {
    public static function and(...) { /* ... */ }
    public static function eq(...) { /* ... */ }
}
The code gets a little longer, but it's still readable.
new Pipeline(
    Stage::match(
        Query::or(
            Query::query(status: Query::eq('shipped')),
            Query::query(date: Query::gte(new UTCDateTime())),
        ),
    ),
    Stage::lookup(
        from: 'inventory',
        localField: 'product_id',
        foreignField: 'product_id',
        as: 'inventory_docs',
    ),
);
To prevent anyone from creating instances of these classes, we can make the constructor private.
final class Operator {
    // ...

    // This constructor can't be called
    private function __construct() {}
}
We can also use an enum without cases. Enums accept static methods and cannot be instantiated.
enum Query {
    public static function and() { /* ... */ }
    public static function eq() { /* ... */ }
}
Static methods are called the same way whether they live on a class or on an enum.
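The two variants can be compared in a small runnable sketch (PHP 8.1 or later, because of the enum). The names Stage and Query mirror the examples above; the method bodies and the sample fields are assumptions for illustration.

```php
<?php
// Hypothetical sketch: names mirror the article, bodies are assumptions.

final class Stage
{
    // Private constructor: the class is only a container for static factories.
    private function __construct()
    {
    }

    // "match" is a keyword, but it is allowed as a method name.
    public static function match(array $query): array
    {
        return ['$match' => $query];
    }
}

// An enum without cases: it cannot be instantiated and needs no constructor.
enum Query
{
    public static function or(array ...$queries): array
    {
        return ['$or' => $queries];
    }

    public static function eq(mixed $value): array
    {
        return ['$eq' => $value];
    }

    // Named arguments are collected with string keys by the variadic parameter.
    public static function query(array ...$fields): array
    {
        return $fields;
    }
}

$stage = Stage::match(
    Query::or(
        Query::query(status: Query::eq('shipped')),
        Query::query(qty: Query::eq(5)),
    ),
);
```

At the call site there is no visible difference between the class and the enum; the enum simply saves the private constructor.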
Closures in variables
Having failed to find the ideal solution, we start dreaming up improbable ones.
If we want a short syntax that looks very similar to the MongoDB syntax, without name restrictions, one idea is to store closures in variables. Note that (...) is the first-class callable syntax, introduced in PHP 8.1 to create closures.
$eq = Operator::eq(...);
$and = Operator::and(...);
Don't trust the syntax highlighter.
It's wonderful that PHP prefixes variables with the same dollar sign ($) that MongoDB uses for operators.
pipeline(
    $match(
        $or(
            $query(status: $eq('shipped')),
            $query(date: $gte(new UTCDateTime())),
        ),
    ),
    $lookup(
        from: 'inventory',
        localField: 'product_id',
        foreignField: 'product_id',
        as: 'inventory_docs',
    ),
);
These closures could be made available by the library as an array.
enum Query {
    public static function and(array ...$queries) { /* ... */ }
    public static function eq(mixed $value) { /* ... */ }
    public static function query(mixed ...$query) { /* ... */ }

    /** @return array{and:callable,eq:callable,query:callable} */
    public static function functions(): array {
        return [
            'and' => self::and(...),
            'eq' => self::eq(...),
            'query' => self::query(...),
        ];
    }
}
The syntax to get all the variables is a little bit verbose, but it's still readable.
['and' => $and, 'eq' => $eq, 'query' => $query] = Query::functions();
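A minimal runnable sketch of this destructuring approach might look like the following (PHP 8.1 or later). The enum name and method names mirror the example above; the method bodies are assumptions for illustration.

```php
<?php
// Hypothetical sketch: method bodies are assumptions, not the library's API.
enum Query
{
    public static function and(array ...$queries): array
    {
        return ['$and' => $queries];
    }

    public static function eq(mixed $value): array
    {
        return ['$eq' => $value];
    }

    // Named arguments are collected with string keys by the variadic parameter.
    public static function query(array ...$fields): array
    {
        return $fields;
    }

    /** @return array{and: Closure, eq: Closure, query: Closure} */
    public static function functions(): array
    {
        // First-class callable syntax turns each static method into a Closure.
        return [
            'and' => self::and(...),
            'eq' => self::eq(...),
            'query' => self::query(...),
        ];
    }
}

// Each variable is named explicitly, so tools can see where it comes from.
['and' => $and, 'eq' => $eq, 'query' => $query] = Query::functions();

$filter = $and(
    $query(foo: $eq(5)),
    $query(bar: $eq(10)),
);
```

Because every variable appears on the left-hand side of the assignment, the names in scope are explicit, at the cost of repeating each operator name twice.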
We can import all the variables into the current scope with the magical extract function, which is widely used in Laravel but disliked by PhpStorm and static analysis tools.
extract(Query::functions());

var_dump($and(
    $query(foo: $eq(5)),
    $query(bar: $eq(10))
));
// INFO: MixedFunctionCall - Cannot call function on mixed
See the Psalm analysis: https://psalm.dev/r/3bbbafd469
Conclusion
As you can see, naming functions in PHP isn't all that simple when it comes to using reserved keywords. We haven't yet decided how we're going to proceed. If you have any ideas or remarks, please leave a comment.
Please note that the code has been simplified, the functions will be more complex to offer more features and type safety.