Calin Baenen

Posted on Jun 19, 2022

Making a regex engine in Rust from scratch.

#rust #regex #watercooler #javascript

So I'm starting a new project in Rust called LexRs that will be a Rust remake of my original (misnamed) ParseJS library.

There was this one feature I wanted in the original library that I couldn't figure out how to make.
To work around this, I decided that when making the Rust version I'd try my hand at writing my own regular expression engine based on JavaScript's.

What I have so far.

I have chars.rs, which contains REChar and REString, the former being a representation of a character in a regex.
I have pattern.rs, which contains information for patterns, such as PatternType and PatternSize.

`PatternType`

The following are the pattern types; only three patterns are planned to be supported by my engine (for now):

enum PatternType {
    NoneOf(REString),   // Same as JS  /[^xyz]/.
    AnyOf(REString),    // Same as JS  /[xyz]/.
    Char(REChar)        // Same as JS  /x/.
}

These are equivalent to [^xyz], [xyz], and x respectively.
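As a rough illustration of what these three variants mean when testing a single character, here is a minimal sketch. The post does not show how REChar and REString are defined, so the type aliases below are hypothetical stand-ins, and `matches_char` is an assumed helper name rather than part of LexRs.

```rust
// Hypothetical stand-ins: the real REChar and REString are not shown in the post.
type REChar = char;
type REString = String;

enum PatternType {
    NoneOf(REString), // Same as JS  /[^xyz]/.
    AnyOf(REString),  // Same as JS  /[xyz]/.
    Char(REChar),     // Same as JS  /x/.
}

impl PatternType {
    // Hypothetical helper: does this pattern type accept the character `c`?
    fn matches_char(&self, c: char) -> bool {
        match self {
            // Accept anything that is NOT in the set.
            PatternType::NoneOf(set) => !set.contains(c),
            // Accept anything that IS in the set.
            PatternType::AnyOf(set) => set.contains(c),
            // Accept exactly one specific character.
            PatternType::Char(expected) => *expected == c,
        }
    }
}
```

With these assumptions, `PatternType::AnyOf("xyz".to_string()).matches_char('y')` is true, while the same set wrapped in `NoneOf` rejects `'y'`.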

`PatternSize`

The following are the sizes a pattern can be; only four distinct sizes will be supported:

enum PatternSize {
    OoM,        // Short for "One or More".
                // Same as JS  /x+/.
    ZoM,        // Short for "Zero or More".
                // Same as JS  /x*/.
    ZoO,        // Short for "Zero or One".
                // Same as JS  /x?/.
    N(usize),   // Represents exactly N repetitions.
                // Same as JS  /x{N}/ where N is an integer > 0.
}

These are equivalent to /x+/, /x*/, /x?/, and /x{N}/ respectively.
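One way to think about these sizes is as a minimum and an optional maximum repetition count. The sketch below shows that mapping; the `bounds` method is a hypothetical helper of mine, not something the post says LexRs has.

```rust
enum PatternSize {
    OoM,      // One or more:  /x+/
    ZoM,      // Zero or more: /x*/
    ZoO,      // Zero or one:  /x?/
    N(usize), // Exactly N:    /x{N}/
}

impl PatternSize {
    // Hypothetical helper: returns (minimum, maximum) repetitions,
    // where a maximum of `None` means "unbounded".
    fn bounds(&self) -> (usize, Option<usize>) {
        match self {
            PatternSize::OoM => (1, None),
            PatternSize::ZoM => (0, None),
            PatternSize::ZoO => (0, Some(1)),
            PatternSize::N(n) => (*n, Some(*n)),
        }
    }
}
```

Expressing every size as a pair of bounds means a matcher only needs one code path for all four variants.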

`Flags`

The following are the flags that will be supported by the regex:

struct Flags {
    case_insensitive:bool,   // Same as JS  /x/i.
    multiline:bool,          // Same as JS  /x/m.
    global:bool,             // Same as JS  /x/g.
}

These are equivalent to the i, m, and g flags respectively.
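To make the effect of one of these flags concrete, here is a small sketch of how `case_insensitive` could change a character comparison. The `chars_equal` function is a hypothetical helper, and comparing lowercase forms is only an approximation of JavaScript's case folding for the i flag.

```rust
#[derive(Default)]
struct Flags {
    case_insensitive: bool, // Same as JS  /x/i.
    multiline: bool,        // Same as JS  /x/m.
    global: bool,           // Same as JS  /x/g.
}

// Hypothetical helper: compares two characters, honoring `case_insensitive`.
// Lowercasing both sides approximates, but does not exactly reproduce,
// the case folding JavaScript performs for the i flag.
fn chars_equal(a: char, b: char, flags: &Flags) -> bool {
    if flags.case_insensitive {
        // `to_lowercase` yields an iterator because one character
        // may lowercase to several characters.
        a.to_lowercase().eq(b.to_lowercase())
    } else {
        a == b
    }
}
```

The `multiline` and `global` flags are left unused here, since they affect anchors and repeated searching rather than single-character comparison.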

Before the analyzer.

Before I work on the lexer for the regular expressions, I think I want to build the matching system first, so that I can match a string against multiple Patterns.
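A matching system along these lines could look roughly like the sketch below. It is a simple greedy matcher with backtracking, which is my own choice of technique and not necessarily how LexRs will work. The `Pattern` struct, the type aliases for REChar and REString, and the function names are all hypothetical.

```rust
// Hypothetical stand-ins: the real REChar and REString are not shown in the post.
type REChar = char;
type REString = String;

enum PatternType {
    NoneOf(REString),
    AnyOf(REString),
    Char(REChar),
}

enum PatternSize {
    OoM,
    ZoM,
    ZoO,
    N(usize),
}

// Hypothetical pairing of "what to match" with "how many times".
struct Pattern {
    kind: PatternType,
    size: PatternSize,
}

fn matches_char(kind: &PatternType, c: char) -> bool {
    match kind {
        PatternType::NoneOf(set) => !set.contains(c),
        PatternType::AnyOf(set) => set.contains(c),
        PatternType::Char(expected) => *expected == c,
    }
}

// (minimum, maximum) repetitions; `None` means unbounded.
fn bounds(size: &PatternSize) -> (usize, Option<usize>) {
    match size {
        PatternSize::OoM => (1, None),
        PatternSize::ZoM => (0, None),
        PatternSize::ZoO => (0, Some(1)),
        PatternSize::N(n) => (*n, Some(*n)),
    }
}

// Tries to match all `patterns`, in order, at the start of `input`.
// Returns the number of characters consumed on success.
fn match_here(patterns: &[Pattern], input: &[char]) -> Option<usize> {
    let (first, rest) = match patterns.split_first() {
        Some(pair) => pair,
        None => return Some(0), // No patterns left: trivially matched.
    };
    let (min, max) = bounds(&first.size);
    // Never look at more characters than the size allows or the input has.
    let limit = max.unwrap_or(input.len()).min(input.len());
    // Count how many leading characters `first` could consume.
    let available = input
        .iter()
        .take(limit)
        .take_while(|&&c| matches_char(&first.kind, c))
        .count();
    if available < min {
        return None;
    }
    // Greedy: try the longest repetition first, then back off one at a time.
    for taken in (min..=available).rev() {
        if let Some(consumed) = match_here(rest, &input[taken..]) {
            return Some(taken + consumed);
        }
    }
    None
}
```

For example, the pattern list for /a+[bc]*c/ applied to the characters of "aabcbcd" consumes 6 characters: the `[bc]*` part first grabs "bcbc", then gives back one character so the final `c` can match.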

How it's going so far.

I think things are going well and I am really excited to do this.
I look forward to seeing the technology I make work in action!

Þanks for reading!
Cheers!
