So I'm starting a new project in Rust called LexRs that will be a Rust remake of my original (misnamed) ParseJS library.
There was one feature I wanted in the original library that I couldn't figure out how to implement.
To get around this, I decided that for the Rust version I'd try my hand at writing my own regular expression engine based on JavaScript's.
What I have so far.
I have chars.rs, which contains REChar and REString, the former being a representation of a character in a regex.
I have pattern.rs, which contains information for patterns, such as PatternType and PatternSize.
PatternType
The following are the pattern types; only three patterns are planned to be supported by my engine (for now):
enum PatternType {
    NoneOf(REString), // Same as JS /[^xyz]/.
    AnyOf(REString),  // Same as JS /[xyz]/.
    Char(REChar),     // Same as JS /x/.
}
These are equivalent to [^xyz], [xyz], and x respectively.
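As a minimal sketch of how each pattern type could test a single character: REChar and REString are not shown in this post, so the code below assumes plain `char` and `String` as stand-ins, and the `accepts` method is a hypothetical name, not part of the actual library.

```rust
// Hypothetical stand-ins: `char` replaces REChar and `String` replaces
// REString, since their real definitions are not shown in the post.
enum PatternType {
    NoneOf(String), // Like JS /[^xyz]/.
    AnyOf(String),  // Like JS /[xyz]/.
    Char(char),     // Like JS /x/.
}

impl PatternType {
    /// Returns true if the single character `c` is accepted by this pattern type.
    /// (`accepts` is an illustrative name, not an existing LexRs method.)
    fn accepts(&self, c: char) -> bool {
        match self {
            // Accept anything that is NOT in the set.
            PatternType::NoneOf(set) => !set.contains(c),
            // Accept anything that IS in the set.
            PatternType::AnyOf(set) => set.contains(c),
            // Accept exactly one specific character.
            PatternType::Char(expected) => *expected == c,
        }
    }
}
```

With this, `PatternType::AnyOf("xyz".to_string()).accepts('y')` is true, while the same call on `NoneOf` is false.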
PatternSize
The following are the sizes a pattern can be; only four distinct sizes will be supported:
enum PatternSize {
    OoM,      // Short for "One or More".
              // Same as JS /x+/.
    ZoM,      // Short for "Zero or More".
              // Same as JS /x*/.
    ZoO,      // Short for "Zero or One".
              // Same as JS /x?/.
    N(usize), // Exactly N repetitions.
              // Same as JS /x{N}/, where N is an integer > 0.
}
These are equivalent to x+, x*, x?, and x{N} respectively.
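One way to make these sizes usable by a matcher is to turn each into a (minimum, maximum) repetition count. This is only a sketch: the `bounds` method is a hypothetical helper, and `None` is used here to mean "no upper limit".

```rust
enum PatternSize {
    OoM,      // One or more, like JS /x+/.
    ZoM,      // Zero or more, like JS /x*/.
    ZoO,      // Zero or one, like JS /x?/.
    N(usize), // Exactly N, like JS /x{N}/.
}

impl PatternSize {
    /// Returns (min, max) repetitions; a `max` of `None` means unbounded.
    /// (`bounds` is an illustrative name, not an existing LexRs method.)
    fn bounds(&self) -> (usize, Option<usize>) {
        match self {
            PatternSize::OoM => (1, None),
            PatternSize::ZoM => (0, None),
            PatternSize::ZoO => (0, Some(1)),
            PatternSize::N(n) => (*n, Some(*n)),
        }
    }
}
```

Reducing all four sizes to one pair of numbers means the matching loop only has to handle a single case.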
Flags
The following are the flags that will be supported by the regex:
struct Flags {
    case_insensitive: bool, // Same as JS /x/i.
    multiline: bool,        // Same as JS /x/m.
    global: bool,           // Same as JS /x/g.
}
These are equivalent to the i, m, and g flags respectively.
Before the analyzer.
Before I work on the lexer for the regular expressions, I think I want to work on the matching system first, by matching a string against multiple Patterns.
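A rough sketch of what matching a string against a sequence of patterns could look like is below. Everything here is an assumption made for illustration: the `Pattern` struct and its fields, the `char`/`String` stand-ins for REChar/REString, and the `match_at` function. It uses simple greedy matching with backtracking, anchored at the start of the input, and ignores flags entirely.

```rust
// Hypothetical stand-ins for the post's types (REChar -> char, REString -> String).
enum PatternType { NoneOf(String), AnyOf(String), Char(char) }
enum PatternSize { OoM, ZoM, ZoO, N(usize) }

// Assumed shape of a pattern: what to match plus how many times.
struct Pattern { kind: PatternType, size: PatternSize }

impl Pattern {
    /// True if this pattern accepts the single character `c`.
    fn accepts(&self, c: char) -> bool {
        match &self.kind {
            PatternType::NoneOf(set) => !set.contains(c),
            PatternType::AnyOf(set) => set.contains(c),
            PatternType::Char(expected) => *expected == c,
        }
    }

    /// (min, max) repetitions; `None` means no upper limit.
    fn bounds(&self) -> (usize, Option<usize>) {
        match self.size {
            PatternSize::OoM => (1, None),
            PatternSize::ZoM => (0, None),
            PatternSize::ZoO => (0, Some(1)),
            PatternSize::N(n) => (n, Some(n)),
        }
    }
}

/// Tries to match `patterns`, in order, against the start of `input`.
/// Returns the number of characters consumed, or `None` if there is no match.
fn match_at(patterns: &[Pattern], input: &[char]) -> Option<usize> {
    let (first, rest) = match patterns.split_first() {
        Some(pair) => pair,
        None => return Some(0), // No patterns left: trivially matched.
    };
    let (min, max) = first.bounds();
    // Never look past the repetition limit or the end of the input.
    let limit = max.unwrap_or(input.len()).min(input.len());
    let available = input
        .iter()
        .take(limit)
        .take_while(|c| first.accepts(**c))
        .count();
    if available < min {
        return None;
    }
    // Greedy: try the longest run first, then back off one character at a time.
    for taken in (min..=available).rev() {
        if let Some(n) = match_at(rest, &input[taken..]) {
            return Some(taken + n);
        }
    }
    None
}
```

For example, the patterns for a+, [bc]*, and d? applied to "aabcd" (collected into a `Vec<char>`) consume all five characters. The back-off loop is what lets a* followed by a{1} still match "aaa": the first pattern gives one character back so the second can succeed.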
How it's going so far.
I think things are going well and I am really excited to do this.
I look forward to seeing the technology I make work in action!