Regular expressions are a code smell. Sometimes also a vulnerability
TL;DR: Try to minimize recursive rules in your regular expressions.
Problems
Security Issues
Readability
Premature Optimization
Solutions
Cover the cases with tests to see if they halt
Use algorithms instead of regular expressions
Add timeout handlers
Context
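A regular expression that takes too long to evaluate can be exploited in what is known as a ReDoS (Regular expression Denial of Service) attack, a subtype of a Denial of Service attack.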
ReDoS attacks can be divided into two types:
A string with an evil pattern is passed to an application. Then this string is used as a regex, which leads to ReDoS.
A string crafted as an attack vector is passed to an application. Then this string is evaluated by a vulnerable regex, which leads to ReDoS.
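For the first type, one possible mitigation is to never compile user input as a pattern. The sketch below assumes the user only needs a literal search, so it escapes the input with `regexp.QuoteMeta` before compiling. `compileLiteral` is a hypothetical helper name, not part of any library.

```go
package main

import (
	"fmt"
	"regexp"
)

// compileLiteral (hypothetical helper) escapes every regex metacharacter
// in userInput, so the compiled expression matches the text literally.
func compileLiteral(userInput string) (*regexp.Regexp, error) {
	return regexp.Compile(regexp.QuoteMeta(userInput))
}

func main() {
	userInput := `(a+)+$` // an "evil" pattern supplied by a user

	re, err := compileLiteral(userInput)
	if err != nil {
		fmt.Println("invalid input:", err)
		return
	}

	// The input is treated as plain text, not as a pattern.
	fmt.Println(re.MatchString("aaaa"))
	fmt.Println(re.MatchString("x(a+)+$y"))
}
```

If users really need to supply patterns, escaping is not enough, and the other solutions (timeouts, tests, a non-backtracking engine) still apply.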
Sample Code
Wrong
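package main

import (
	"fmt"
	"regexp"
)

func main() {
	// Nested quantifiers such as (([a-z])+.)+ can cause catastrophic
	// backtracking in backtracking regex engines.
	// Note: Go's regexp package is RE2-based and matches in linear time,
	// so the slowdown shows up mainly in engines that backtrack.
	re := regexp.MustCompile(`^(([a-z])+.)+[A-Z]([a-z])+$`)
	str := `aaaaaaaaaaaaaaaaaaaaaaaa!`
	for i, match := range re.FindAllString(str, -1) {
		fmt.Println(match, "found at index", i)
	}
}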
Right
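package main

import (
	"fmt"
	"strings"
)

func main() {
	str := `aaaaaaaaaaaaaaaaaaaaaaaa!`
	// Split on whitespace and inspect each word with plain comparisons.
	// Every check is a constant-time byte comparison, so it always halts.
	words := strings.Fields(str)
	for i, word := range words {
		// In Go, && must end the line; starting the next line with it
		// does not compile.
		if len(word) >= 2 && word[0] >= 'a' && word[0] <= 'z' &&
			word[len(word)-1] >= 'A' && word[len(word)-1] <= 'Z' {
			fmt.Println(word, "found at index", i)
		}
	}
}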
Detection
[X] Semi-Automatic
Some regex engines, such as RE2 (used by Go's regexp package), do not backtrack and guarantee linear-time matching, so they avoid this kind of problem.
We can also scan the code for this vulnerability.
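A scan can start with something as simple as the heuristic below. It is only a sketch under stated assumptions: `hasNestedQuantifier` is a hypothetical function that flags a group containing `+` or `*` when the group itself is followed by `+` or `*`, as in `(a+)+`. It ignores `{n,}` repetition and alternation, so it can miss vulnerable patterns and flag harmless ones.

```go
package main

import "fmt"

// hasNestedQuantifier is a hypothetical heuristic, not a complete analysis.
// It reports whether pattern has a group that contains + or * and is
// itself followed by + or *.
func hasNestedQuantifier(pattern string) bool {
	var stack []bool // one entry per open group: does it contain + or *?
	inClass := false // true while inside a [...] character class

	for i := 0; i < len(pattern); i++ {
		c := pattern[i]
		switch {
		case c == '\\':
			i++ // skip the escaped character
		case inClass:
			if c == ']' {
				inClass = false
			}
		case c == '[':
			inClass = true
		case c == '(':
			stack = append(stack, false)
		case c == ')':
			if len(stack) == 0 {
				continue // unbalanced pattern: ignore the stray parenthesis
			}
			inner := stack[len(stack)-1]
			stack = stack[:len(stack)-1]
			quantified := i+1 < len(pattern) &&
				(pattern[i+1] == '+' || pattern[i+1] == '*')
			if inner && quantified {
				return true
			}
			// A quantifier inside a nested group also counts for its parent.
			if inner && len(stack) > 0 {
				stack[len(stack)-1] = true
			}
		case c == '+' || c == '*':
			if len(stack) > 0 {
				stack[len(stack)-1] = true
			}
		}
	}
	return false
}

func main() {
	patterns := []string{
		`^(([a-z])+.)+[A-Z]([a-z])+$`, // the pattern from the sample above
		`^[a-z]+[A-Z][a-z]+$`,         // no nested repetition
	}
	for _, p := range patterns {
		fmt.Println(p, "suspicious:", hasNestedQuantifier(p))
	}
}
```

A check like this could run over the pattern literals found in a code base as part of a linter or a review step, with every hit reviewed by a person.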
Tags
- Security
Conclusion
Regular expressions are tricky and hard to debug.
We should avoid them as much as possible.
Relations
Code Smell 41 - Regular Expression Abusers
Maxi Contieri ・ Dec 3 '20
More Info
Catastrophic backtracking: how can a regular expression cause a ReDoS vulnerability?
https://owasp.org/www-community/attacks/Regular_expression_Denial_of_Service_-_ReDoS
Runaway Regular Expressions: Catastrophic Backtracking
Disclaimer
Code Smells are just my opinion.
Credits
Photo by engin akyurt on Unsplash
Thank you @unicorn_developer
Some people, when confronted with a problem, think “I know, I’ll use regular expressions.” Now they have two problems.
Jamie Zawinski
Software Engineering Great Quotes
Maxi Contieri ・ Dec 28 '20
This article is part of the CodeSmell Series.