DEV Community

Jérôme TAMARELLE

How to overcome PHP's naming constraints to model MongoDB operators

TL;DR: I'm researching how to model MongoDB operators in PHP: from namespaced functions, to static methods on classes or enums, to closures stored in variables. I'm looking for the best developer experience.

About MongoDB aggregation pipeline operators

As a developer on the MongoDB PHP driver team, my goal is to provide the best developer experience to PHP developers using the document database.

MongoDB provides drivers for various languages, including PHP. To make it easier to create aggregation pipelines in PHP, we need to model all the stages and operators as functions that can be combined.

An aggregation pipeline is a list of "stage" documents. Let's take an example that filters with $match and joins with $lookup:

db.orders.aggregate([
    {
        $match: {
            $or: [
                { status: "shipped" },
                { created_at: { $gte: ISODate("2023-01-01T00:00:00Z") } }
            ]
        }
    },
    {
        $lookup: {
            from: "inventory",
            localField: "product_id",
            foreignField: "product_id",
            as: "inventory_docs"
        }
    }
])

Each key with a dollar prefix is an operator for which we want to provide a factory method.
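For reference, here is the same pipeline written by hand as a plain PHP array, which is the shape our factory methods would have to produce. This is only a sketch: the date is kept as a string placeholder, whereas real code would presumably pass a BSON date object from the driver.

```php
<?php
// Hand-written equivalent of the pipeline above, without any helper.
// Assumption: the date is a plain string placeholder for illustration;
// with the driver you would use a BSON date type instead.
$pipeline = [
    ['$match' => [
        '$or' => [
            ['status' => 'shipped'],
            ['created_at' => ['$gte' => '2023-01-01T00:00:00Z']],
        ],
    ]],
    ['$lookup' => [
        'from' => 'inventory',
        'localField' => 'product_id',
        'foreignField' => 'product_id',
        'as' => 'inventory_docs',
    ]],
];

// The "$"-prefixed keys are single-quoted so that PHP does not try to
// interpolate them as variables, as it would in a double-quoted string.
echo json_encode($pipeline, JSON_PRETTY_PRINT), PHP_EOL;
```

Nested arrays like this work, but they offer no autocompletion and no validation of operator names or arguments, which is what the factory functions are meant to add.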

Namespaced functions

The most obvious solution is to create namespaced functions, like: MongoDB\Operator\eq for the $eq operator.

namespace MongoDB\Operator;

function eq(mixed $value): array {
    return ['$eq' => $value];
}

function lookup(string $from, string $localField, string $foreignField, string $as): array {
    return ['$lookup' => [
        'from' => $from,
        'localField' => $localField,
        'foreignField' => $foreignField,
        'as' => $as,
    ]];
}

Using these functions with named arguments, the pipeline would be written in PHP as:

pipeline(
    match(
        or(
            query(status: eq('shipped')),
            query(date: gte(new UTCDateTime())),
        ),
    ),
    lookup(
        from: 'inventory',
        localField: 'product_id',
        foreignField: 'product_id',
        as: 'inventory_docs',
    ),
);

But some operator names conflict with PHP's reserved keywords. We cannot create functions (global or namespaced) named after operators such as $and, $or, $match, and $unset.
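The conflict can be checked without defining anything, by asking PHP to parse a function declaration for each candidate name. This is a small sketch assuming PHP 8.0 or later (match only became a keyword in 8.0); the helper name canDeclareFunction and the Demo namespace are made up for the illustration.

```php
<?php
// Hypothetical helper: returns true when PHP can parse a namespaced
// function declaration with the given name. TOKEN_PARSE makes
// token_get_all() run the parser and throw a ParseError on invalid code,
// without declaring or executing anything.
function canDeclareFunction(string $name): bool
{
    try {
        token_get_all("<?php namespace Demo; function {$name}() {}", TOKEN_PARSE);
        return true;
    } catch (ParseError $e) {
        return false;
    }
}

foreach (['eq', 'lookup', 'and', 'or', 'match', 'unset'] as $name) {
    printf("%-7s %s\n", $name, canDeclareFunction($name) ? 'ok' : 'reserved');
}
```

On PHP 8.0 or later, the four names from the list above should be reported as reserved, while eq and lookup parse fine.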

Adding a prefix or suffix to function names

To avoid this problem of reserved names, we can add a prefix or a suffix to the function names.

With the type of operator as suffix:

function andQuery(...) { /* ... */ }
function matchStage(...) { /* ... */ }

With an underscore:

function _and(...) { /* ... */ }
function _match(...) { /* ... */ }

Or with an emoji. Beautiful, but not practical:

function 🔍and(...) { /* ... */ }
function 💲match(...) { /* ... */ }
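To get a feel for the suffix convention in use, here is a minimal runnable sketch. The function bodies and the qty field are assumptions made for the illustration, not the final library API.

```php
<?php
// Hypothetical helpers using the "operator type as suffix" convention.
function orQuery(array ...$queries): array
{
    // Positional arguments arrive as a list, which is what $or expects.
    return ['$or' => $queries];
}

function matchStage(array $query): array
{
    return ['$match' => $query];
}

// "gte" is not a reserved keyword, so it needs no suffix.
function gte(mixed $value): array
{
    return ['$gte' => $value];
}

$stage = matchStage(
    orQuery(
        ['status' => 'shipped'],
        ['qty' => gte(10)],
    ),
);

echo json_encode($stage), PHP_EOL;
// {"$match":{"$or":[{"status":"shipped"},{"qty":{"$gte":10}}]}}
```

It works, but the names no longer match the MongoDB operator names one to one, and only some functions carry a suffix unless we add it everywhere.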

Static class methods

Fortunately, the restrictions are much lighter for method names: since PHP 7.0, most reserved keywords can be used as method names. We can create static methods on classes.

final class Stage {
    public static function lookup(...) { /* ... */ }
    public static function match(...) { /* ... */ }
}
final class Query {
    public static function and(...) { /* ... */ }
    public static function eq(...) { /* ... */ }
}

The code gets a little longer, but it's still readable.

new Pipeline(
    Stage::match(
        Query::or(
            Query::query(status: Query::eq('shipped')),
            Query::query(date: Query::gte(new UTCDateTime())),
        ),
    ),
    Stage::lookup(
        from: 'inventory',
        localField: 'product_id',
        foreignField: 'product_id',
        as: 'inventory_docs',
    ),
);
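Here is a minimal sketch of what could sit behind those calls, assuming PHP 8.0 or later for the mixed type and named arguments. The method bodies are my simplification, and the second status value is only there for the example.

```php
<?php
final class Stage
{
    // "match" is a reserved keyword, but it is accepted as a method name.
    public static function match(array $query): array
    {
        return ['$match' => $query];
    }
}

final class Query
{
    public static function or(array ...$queries): array
    {
        return ['$or' => $queries];
    }

    public static function eq(mixed $value): array
    {
        return ['$eq' => $value];
    }

    // Named arguments are collected into $query with their names as keys,
    // so query(status: ...) becomes ['status' => ...].
    public static function query(mixed ...$query): array
    {
        return $query;
    }
}

$stage = Stage::match(
    Query::or(
        Query::query(status: Query::eq('shipped')),
        Query::query(status: Query::eq('pending')),
    ),
);

echo json_encode($stage), PHP_EOL;
// {"$match":{"$or":[{"status":{"$eq":"shipped"}},{"status":{"$eq":"pending"}}]}}
```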

To prevent anyone from creating instances of such a class, we can make the constructor private.

final class Operator {
    // ...
    // This constructor can't be called
    private function __construct() {}
}

We can also use an enum without cases (enums were introduced in PHP 8.1). Enums accept static methods and cannot be instantiated.

enum Query {
    public static function and() { /* ... */ }
    public static function eq() { /* ... */ }
}

The static methods of the class and of the enum are called the same way.
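A quick way to convince ourselves that both approaches behave the same is to try instantiating each one. This sketch assumes PHP 8.1 or later; isInstantiationBlocked is a made-up helper for the demonstration.

```php
<?php
final class Operator
{
    // Private constructor: "new Operator()" fails outside the class.
    private function __construct() {}

    public static function eq(mixed $value): array
    {
        return ['$eq' => $value];
    }
}

// An enum without cases can only be used through its static methods.
enum Query
{
    public static function eq(mixed $value): array
    {
        return ['$eq' => $value];
    }
}

// Hypothetical helper: true when instantiating the class throws an Error.
function isInstantiationBlocked(string $class): bool
{
    try {
        new $class();
        return false;
    } catch (Error $e) {
        return true;
    }
}

var_dump(Operator::eq(5) === Query::eq(5));        // same call syntax, same result
var_dump(isInstantiationBlocked(Operator::class)); // private constructor
var_dump(isInstantiationBlocked(Query::class));    // enums cannot be instantiated
```

The enum version needs no extra constructor boilerplate, at the cost of requiring PHP 8.1.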

Closures in variables

Having not found the ideal solution, we started dreaming up more improbable ones.

One idea, if we want a short syntax that looks very similar to the MongoDB syntax and has no name restrictions, is to store closures in variables. Note that (...) is the first-class callable syntax introduced in PHP 8.1 to create closures.

$eq = Operator::eq(...);
$and = Operator::and(...);

Don't trust the syntax highlighter.

It's so wonderful that PHP uses the same dollar $ sign to prefix variables that MongoDB uses for operators.

pipeline(
    $match(
        $or(
            $query(status: $eq('shipped')),
            $query(date: $gte(new UTCDateTime())),
        ),
    ),
    $lookup(
        from: 'inventory',
        localField: 'product_id',
        foreignField: 'product_id',
        as: 'inventory_docs',
    ),
);
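To check that this really runs, here is a reduced, self-contained version of the idea, assuming PHP 8.1 or later. The Query enum and its method bodies are a simplified stand-in, not the actual library code.

```php
<?php
enum Query
{
    public static function or(array ...$queries): array
    {
        return ['$or' => $queries];
    }

    public static function eq(mixed $value): array
    {
        return ['$eq' => $value];
    }

    // Named arguments end up as string keys of $query.
    public static function query(mixed ...$query): array
    {
        return $query;
    }
}

// The first-class callable syntax turns each static method into a Closure.
// Variable names have no reserved keywords, so $or is perfectly valid.
$or = Query::or(...);
$eq = Query::eq(...);
$query = Query::query(...);

$filter = $or(
    $query(status: $eq('shipped')),
    $query(status: $eq('pending')),
);

echo json_encode($filter), PHP_EOL;
// {"$or":[{"status":{"$eq":"shipped"}},{"status":{"$eq":"pending"}}]}
```

The closures keep the parameter names of the original methods, so named arguments still work when calling through the variables.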

These closures could be made available by the library as an array.

enum Query {
    public static function and(array ...$queries) { /* ... */ }
    public static function eq(mixed $value) { /* ... */ }
    public static function query(mixed ...$query) { /* ... */ }

    /** @return array{and:callable,eq:callable,query:callable} */
    public static function functions(): array {
        return [
            'and' => self::and(...),
            'eq' => self::eq(...),
            'query' => self::query(...),
        ];
    }
}

The syntax to get all the variables is a little bit verbose, but it's still readable.

['and' => $and, 'eq' => $eq, 'query' => $query] = Query::functions();

We can import all the variables into the current scope with the magical extract function, which is widely used in Laravel but disliked by PhpStorm and static analysis tools.

extract(Query::functions());

var_dump($and(
    $query(foo: $eq(5)),
    $query(bar: $eq(10))
));

// INFO: MixedFunctionCall - Cannot call function on mixed

See the Psalm analysis: https://psalm.dev/r/3bbbafd469

Conclusion

As you can see, naming functions in PHP isn't all that simple when it comes to using reserved keywords. We haven't yet decided how we're going to proceed. If you have any ideas or remarks, please leave a comment.

Please note that the code has been simplified; the real functions will be more complex to offer more features and type safety.
