Sebastian Clavijo Suero

Unlocking the Mystery: Deciphering the Enigmatic Code of URL's Glob Patterns in cy.intercept()

Glob Patterns: A Hilarious Journey through Slashes and Asterisks!

(Cover image from pexels.com by cottonbro studio)


ACT 1: EXPOSITION

I don't know about you, but I almost never manage to nail down a Glob Pattern URL on the first try when using it in cy.intercept().

Sure, it seems easy enough at first glance! But let's be honest—most of the time, you don't get it right on the first attempt, either!

And that's okay. I promise I won't judge you if you don't judge me. 😉

So, I decided to unravel this enigma once and for all, crafting the definitive solution with strategies and even a cheat sheet that will help us nail down these URL glob patterns in our Cypress interceptions—making this article the ultimate Baba Yaga of glob patterns, striking fear into the heart of any misplaced "path segment"!


ACT 2: CONFRONTATION

THE THEORY

Alright, in Cypress, URLs used in an intercept can be:

  • Full URL (e.g. https://example.com/users/admin)
  • Relative URL (e.g. /users/admin or users/admin; both refer to the same path)

Also, a URL can be specified as a String, a Glob, or a RegExp. In this article, we will focus on Glob patterns.

⚠️ IMPORTANT: Relative URLs in an intercept are resolved against the baseUrl Cypress configuration parameter, unless you specify the url and the API hostname in a RouteMatcher object.
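
As a rough sketch of those two options: the same glob can be registered relative to baseUrl, or pinned to another host through a RouteMatcher object. The helper name, the aliases, and the api.example.com hostname are all made up for illustration, and cy is passed in as a parameter only so the snippet stays self-contained.

```javascript
// Hypothetical helper; `cy` is the Cypress object available inside a spec.
function registerUserIntercepts(cy) {
  // String glob: per the note above, matched relative to baseUrl
  // (e.g. https://example.com/users/admin).
  cy.intercept('GET', '/users/*').as('usersOnBaseUrl');

  // RouteMatcher object: hostname and pathname are given explicitly.
  // 'api.example.com' is an assumed host, not one from this article.
  cy.intercept({
    method: 'GET',
    hostname: 'api.example.com',
    pathname: '/users/*',
  }).as('usersOnApiHost');
}
```

In a spec you would call registerUserIntercepts(cy) from a beforeEach hook or at the top of a test, before the page fires its requests.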

According to the Cypress documentation for URL glob patterns: "Under the hood, cy.intercept uses the minimatch library with the { matchBase: true } option".

If you're like me, you might be tempted to immediately explore the minimatch library. However, let's set this library aside, at least for now, and focus on the basics first. That's exactly why I haven't included a hyperlink to the library. 🤗

Before we continue, it's important to define and understand a key concept: path segment.

A path segment in a URL is the portion of the URL path located between slashes (/). In a relative URL like /users/admin, users is the first path segment, and admin is the second path segment.

And now, let's confront the two most notorious offenders in the shadowy realm of glob patterns: the elusive single *, and the daunting double **. 👹

In URL glob patterns:

  • *: Matches any characters except / within a single path segment.

    ⚠️ IMPORTANT:

    • /*/ signifies exactly one path segment.
    • /*/*/ indicates precisely two path segments, and so forth.
    • You can use * multiple times within a single path segment, such as in /*-us*/.

Examples:

  • /images/*.jpg will match /images/photo.jpg, but not /images/nature/photo.jpg
  • /images/*/photo.jpg will match /images/nature/photo.jpg, but not /images/photo.jpg
  • /*ag*/photo.jpg will match /images/photo.jpg
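
To make the single * rule concrete, here is a minimal sketch that turns a *-only glob into a regular expression and runs the three examples above through it. The function name singleStarToRegExp is made up, and this is a simplification, not what minimatch actually does internally (it ignores dotfile rules, braces, and every other option).

```javascript
// Simplified sketch, NOT minimatch: converts a glob that only uses `*` into a RegExp.
function singleStarToRegExp(glob) {
  const body = glob
    .split('*')
    // Escape RegExp metacharacters in the literal parts of the glob.
    .map((part) => part.replace(/[.+?^${}()|[\]\\]/g, '\\$&'))
    // `*` = any run of characters that never crosses a slash.
    .join('[^/]*');
  return new RegExp(`^${body}$`);
}

const matches = (glob, path) => singleStarToRegExp(glob).test(path);

console.log(matches('/images/*.jpg', '/images/photo.jpg'));              // true
console.log(matches('/images/*.jpg', '/images/nature/photo.jpg'));       // false
console.log(matches('/images/*/photo.jpg', '/images/nature/photo.jpg')); // true
console.log(matches('/images/*/photo.jpg', '/images/photo.jpg'));        // false
console.log(matches('/*ag*/photo.jpg', '/images/photo.jpg'));            // true
```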

 

  • **: Matches any characters including / across multiple path segments, allowing for a broader match.

    ⚠️ IMPORTANT:

    • ** has special significance only when it is the sole content of a path segment (as in /**/); otherwise, it behaves exactly like *.
    • /**/ matches zero or more path segments.
    • /**/**/ is treated as /**/.
    • Note that if you specify /** at the end of a pattern, there might be zero path segments after it, but the URL must end with a / (e.g., api/users/** will match api/users/admin and api/users/, but will not match api/users).

Examples:

  • ab/**/cd will match ab/wx/yz/cd, but ab/**cd will not
  • ab/w**/**z/cd will match ab/wx/yz/cd since ** behaves like * in this case, as it is not the only thing between two /
  • /images/**/photo.jpg will match both /images/photo.jpg and /images/nature/gallery/photo.jpg
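
If it helps to see the ** rules as code, the sketch below walks the glob and the path segment by segment. It is a toy model written for this article's examples, with made-up function names, not a reimplementation of minimatch: a lone ** may swallow zero or more whole segments, while ** mixed with other characters is treated like a single *.

```javascript
// Simplified sketch, NOT minimatch: matches one path segment against one glob segment.
function segmentToRegExp(segment) {
  const body = segment
    .split(/\*+/) // inside a segment, `*` and `**` behave the same
    .map((part) => part.replace(/[.+?^${}()|[\]\\]/g, '\\$&'))
    .join('[^/]*');
  return new RegExp(`^${body}$`);
}

function matchSegments(globParts, pathParts) {
  if (globParts.length === 0) return pathParts.length === 0;
  const [head, ...rest] = globParts;
  if (head === '**') {
    // Trailing `**`: mirrors the note above, where something must follow the
    // slash (even an empty segment, as in "api/users/").
    if (rest.length === 0) return pathParts.length > 0;
    // Otherwise a lone `**` may consume zero or more whole segments.
    for (let skip = 0; skip <= pathParts.length; skip += 1) {
      if (matchSegments(rest, pathParts.slice(skip))) return true;
    }
    return false;
  }
  if (pathParts.length === 0) return false;
  return segmentToRegExp(head).test(pathParts[0])
    && matchSegments(rest, pathParts.slice(1));
}

const globMatches = (glob, path) => matchSegments(glob.split('/'), path.split('/'));

console.log(globMatches('ab/**/cd', 'ab/wx/yz/cd'));      // true
console.log(globMatches('ab/**cd', 'ab/wx/yz/cd'));       // false
console.log(globMatches('ab/w**/**z/cd', 'ab/wx/yz/cd')); // true
console.log(globMatches('api/users/**', 'api/users'));    // false
```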

 

Easy peasy, right? Yet, like trying to track John Wick's next move, pinpointing those elusive URL paths remains a challenge when we try to intercept a request in our tests.

 

TEST YOUR GLOB ABILITIES: AN EXAMPLE

Suppose that in our cypress.config.js, we have defined baseUrl: 'https://example.com', and our glob pattern is */v2/**/images/*/*umb*. Which of the following URL requests would be intercepted successfully?

  1. https://example.com/api/files/v2/project/2022/gallery/images/small/thumb.png
  2. https://example.com/v2/project/2022/gallery/images/large/thumbs
  3. https://example.com/files/v2/project/2021/gallery/images/snapshot/small/thumb.png
  4. https://example.com/api/v2/project/2023/gallery/images/long_thumbnail
  5. https://example.com/files/v2/project/2022/images/small/my_thumb/pics
  6. https://example.com/files/v2/project/2022/gallery/images/04/umb
  7. https://example.com/api/v2/project/images/large/thumbnail

🤔...
🤔...
🤔...

Let's break down what the URL glob pattern */v2/**/images/*/*umb* signifies:

  • It's a relative URL, with the root path being baseUrl: 'https://example.com'.
  • Between the root path and the v2 path segment, there must be exactly one path segment.
  • Between the v2 path segment and the images path segment, there can be any number of path segments.
  • Following the images path segment, there must be one path segment, followed by another path segment that contains the string umb.

So the answer is: Requests from 1 to 5 will not be intercepted; however, requests 6 and 7 will. Did you get them right on your first try?

We will analyze each of these URLs one by one to determine why they do or do not match with the glob pattern */v2/**/images/*/*umb*, based on our understanding of * and **. Remember we defined baseUrl as 'https://example.com'. Additionally for clarity, we will stack the URL and the glob pattern vertically to unravel the mystery.

 

CASE 1

  URL:  api/files/v2/project/2022/gallery/images/small/thumb.png
  Glob: */v2/**/images/*/*umb*

The * before v2 in the glob indicates that exactly one path segment is expected between baseUrl and v2. However, the URL contains two path segments: api and files.

So, interception failed!

 

CASE 2

  URL:  v2/project/2022/gallery/images/large/thumbs
  Glob: */v2/**/images/*/*umb*

The * before v2 in the glob indicates that exactly one path segment is expected between baseUrl and v2. However, the URL does not contain any path segments in that position.

As a result, interception failed!

 

CASE 3

  URL:  files/v2/project/2021/gallery/images/snapshot/small/thumb.png
  Glob: */v2/**/images/*/*umb*

The * between images and *umb* in the glob indicates that exactly one path segment is expected between them. However, the URL contains two path segments: snapshot and small.

Hence, interception failed!

 

CASE 4

  URL:  api/v2/project/2023/gallery/images/long_thumbnail
  Glob: */v2/**/images/*/*umb*

The * between images and *umb* in the glob indicates that exactly one path segment is expected between them. However, the URL does not contain any path segments in that position.

Meaning, interception failed!

 

CASE 5

  URL:  files/v2/project/2022/images/small/my_thumb/pics
  Glob: */v2/**/images/*/*umb*

The *umb* at the end of the glob pattern indicates that it must be the last path segment and must contain the string umb. However, in the URL, the path segment my_thumb is not the final path segment.

As a consequence, interception failed!

 

CASE 6

  URL:  files/v2/project/2022/gallery/images/04/umb
  Glob: */v2/**/images/*/*umb*

Bingo, interception successful! ✔️

 

CASE 7

  URL:  api/v2/project/images/large/thumbnail
  Glob: */v2/**/images/*/*umb*

We're on a roll, interception successful! ✔️
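
If you would rather let the machine settle the argument, one way is to feed the seven paths to the matcher Cypress exposes as Cypress.minimatch. The sketch below assumes the paths are written relative to baseUrl without a leading slash (my choice for the demo, not something Cypress requires), and checkQuiz is a made-up helper that receives the matcher as a parameter so the snippet does not depend on the Cypress runtime.

```javascript
const quizPattern = '*/v2/**/images/*/*umb*';

// The seven quiz requests, written relative to baseUrl (assumption for this demo).
const quizPaths = [
  'api/files/v2/project/2022/gallery/images/small/thumb.png',
  'v2/project/2022/gallery/images/large/thumbs',
  'files/v2/project/2021/gallery/images/snapshot/small/thumb.png',
  'api/v2/project/2023/gallery/images/long_thumbnail',
  'files/v2/project/2022/images/small/my_thumb/pics',
  'files/v2/project/2022/gallery/images/04/umb',
  'api/v2/project/images/large/thumbnail',
];

// Hypothetical helper: `minimatch` is injected, e.g. Cypress.minimatch.
function checkQuiz(minimatch) {
  return quizPaths.map((path, index) => ({
    request: index + 1,
    // Same option the Cypress docs quote for cy.intercept.
    intercepted: minimatch(path, quizPattern, { matchBase: true }),
  }));
}

// Inside a Cypress spec, or in the browser console of the Cypress runner:
// console.table(checkQuiz(Cypress.minimatch));
```

If the case-by-case analysis above holds, only requests 6 and 7 should come back as intercepted.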

Okay, maybe things are a bit clearer now. But how about we create a cheat sheet so we can get this right every time?! 🎯
 

THE CHEAT SHEET

I have created a table where the first column lists the main use cases for URL Glob Patterns. The second column contains examples of URLs that match each glob pattern, with the matching portions highlighted in green. The third column provides examples of URLs that do not match, with the cause of the mismatch highlighted in red.

(Image: cheat sheet table of URL glob patterns, with matching and non-matching example URLs)

From these primary glob patterns, you can construct the rest and confidently apply what you've learned so far, along with the cheat sheet!

 

BONUS CONTENT: EXTGLOB PATTERNS

Remember when I mentioned at the beginning of the article to "set aside" the minimatch library for now?

Well, if you're feeling adventurous, we can now explore some concepts from the minimatch library. Specifically, we will focus on extglob patterns, adding yet another enlightening twist to our URL glob knowledge. 🦉

An extglob pattern is an extension of standard glob patterns that provides additional pattern-matching capabilities, familiar from Unix-like shells such as Bash. Extglobs allow for more complex matching conditions by using specific syntax to include or exclude alternatives, which makes them more flexible and more powerful than plain glob patterns.

Some of the most useful extglob patterns are:

  • !(pattern): Matches anything except the specified pattern.
  • @(pattern): Matches exactly one of the specified patterns.
  • +(pattern): Matches one or more occurrences of the specified pattern.

Let's look at a few examples of extglob patterns. Suppose we want to intercept requests to the URL https://owasp.org/www--site-theme/assets/sitedata/menus.json.

Will the glob pattern www--site-theme/**/menus.!(jpg) intercept that request?
The answer is yes, because it intercepts files named menus whose extension is NOT jpg. ✔️

How about the glob pattern www--site-theme/**/menus.+(json|png)?
In this case, the answer is also yes, because it will intercept files with the name menus and an extension of either json or png. ✔️
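
Since cy.intercept() also accepts a RegExp, one way to build intuition for these two extglobs is to write down regular expressions that behave roughly the same way. The expressions below are my own hand-written approximations, not what minimatch generates, and they only cover this one URL shape.

```javascript
const url = 'https://owasp.org/www--site-theme/assets/sitedata/menus.json';

// Rough equivalent of www--site-theme/**/menus.!(jpg):
// any extension except exactly "jpg".
const notJpg = /\/www--site-theme\/(?:[^/]+\/)*menus\.(?!jpg$)[^/]*$/;

// Rough equivalent of www--site-theme/**/menus.+(json|png):
// one or more repetitions of "json" or "png".
const jsonOrPng = /\/www--site-theme\/(?:[^/]+\/)*menus\.(?:json|png)+$/;

console.log(notJpg.test(url));                              // true
console.log(notJpg.test(url.replace('.json', '.jpg')));     // false
console.log(jsonOrPng.test(url));                           // true
console.log(jsonOrPng.test(url.replace('.json', '.yaml'))); // false
```

Note that + means "one or more", so the second expression would also accept an odd extension like jsonpng. If you want exactly one of the alternatives, @(json|png) is the tighter choice.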

You get the idea! 👍👍👍


ACT 3: RESOLUTION

Ah, we've journeyed together through the labyrinth of slashes and asterisks, becoming glob masters along the way! With our newfound knowledge and cheat sheet in hand, we can confidently commandeer those elusive URLs in cy.intercept(), leaving no stray path segment unturned.

Remember, with each glob pattern conquered, you are an inch closer to becoming the Baba Yaga of your test suites!

 

Revel in your mastery and share your triumph — follow, react, or comment if these insights have sparked joy or illuminated your path! ❤️ 🦄 🤯 🙌 🔥
