Carrie
Five Free WAF Performance Comparison

Testing the Effectiveness of WAF Protection

Attack defense is the core capability of a WAF. This article describes how to test the effectiveness of WAF protection.

To ensure the fairness of the test results, all targets, testing tools, and test samples mentioned in this article are open-source projects.

Testing Metrics

The test results are based on four main metrics:

  • Detection Rate: Reflects how comprehensively the WAF detects attacks. Attacks that get through are counted as "false negatives".
  • False Positive Rate: Reflects how much the WAF interferes with normal traffic. Normal requests that are wrongly intercepted are counted as "false positives".
  • Accuracy: A composite of the detection rate and false positive rate, balancing false negatives against false positives.
  • Detection Time: Reflects the WAF's performance; the longer the detection time, the worse the performance.

Detection time can be measured directly with tools. The other three metrics map onto the standard binary-classification counts from statistics:

  • TP (True Positives): The number of attack samples intercepted.
  • TN (True Negatives): The number of normal samples correctly allowed.
  • FN (False Negatives): The number of attack samples allowed, i.e., missed detections.
  • FP (False Positives): The number of normal requests intercepted, i.e., false alarms.

The formulas for the three metrics are:

  • Detection Rate = TP / (TP + FN)
  • False Positive Rate = FP / (TP + FP)
  • Accuracy = (TP + TN) / (TP + TN + FP + FN)

Note that the false positive rate as defined here is the share of intercepted requests that were actually benign (i.e., 1 − precision), not the textbook FP / (FP + TN).
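The formulas can be wired up directly as a sanity check; plugging in a WAF's four counts reproduces its reported percentages (the function name is mine, not from any of the tools used later):

```python
def waf_metrics(tp, tn, fp, fn):
    """Compute the three metrics exactly as defined above."""
    detection_rate = tp / (tp + fn)
    # As defined in this article: share of intercepted requests that were benign.
    false_positive_rate = fp / (tp + fp)
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    return detection_rate, false_positive_rate, accuracy

# Example with SafeLine's counts from the results below:
dr, fpr, acc = waf_metrics(tp=426, tn=33056, fp=38, fn=149)
print(f"{dr:.2%} {fpr:.2%} {acc:.2%}")  # 74.09% 8.19% 99.44%
```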

To reduce the impact of randomness and minimize error, I break "Detection Time" down into two metrics: "90% Average Time" and "99% Average Time".
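The article does not spell out how these two numbers are computed; one plausible reading is a tail-trimmed average, i.e., average the fastest 90% (or 99%) of requests so a handful of slow outliers cannot skew the result. A minimal sketch under that assumption:

```python
def tail_trimmed_average(times_ms, keep_fraction):
    """Average of the fastest `keep_fraction` of samples.

    Assumed interpretation of "90% Average Time": sort the per-request
    timings, drop the slowest 10%, and average the rest.
    """
    n = max(1, int(len(times_ms) * keep_fraction))
    fastest = sorted(times_ms)[:n]
    return sum(fastest) / n
```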

Test Samples

  • Data Source: All test data comes from my own browser.
  • Packet Capture Method: Use Burp Suite as a proxy: point the browser's global proxy at Burp, export the capture as an XML file, then use a Python script to split it into individual requests.
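The splitting step can be sketched in a few lines. This assumes the usual layout of a Burp "Save items" XML export, where each `<item>` holds a base64-encoded `<request>` element (the function name is my own):

```python
import base64
import xml.etree.ElementTree as ET

def parse_burp_export(xml_text):
    """Yield raw HTTP requests from a Burp Suite XML export."""
    root = ET.fromstring(xml_text)
    for item in root.iter("item"):
        req = item.find("request")
        if req is None or req.text is None:
            continue
        raw = req.text
        # Burp marks base64-encoded bodies with base64="true".
        if req.get("base64") == "true":
            raw = base64.b64decode(raw).decode("utf-8", errors="replace")
        yield raw
```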

Based on past experience, the ratio of normal traffic to attack traffic for exposed services on the internet is usually around 100:1. We will use this ratio for sample allocation.

  • White Samples: Browsing Weibo, Zhihu, Bilibili, and various forums, collecting a total of 60,707 HTTP requests, totaling 2.7 GB (this process took 5 hours).
  • Black Samples: To ensure thorough testing, I collected black samples using four different methods, totaling 600 HTTP requests (this process took 5 hours).

The black sample collection methods are:

  1. Simple Generic Attack Traffic: Deploy a DVWA target machine and attack each generic vulnerability example.
  2. Common Attack Traffic: Use all attack payloads provided on the PortSwigger website.
  3. Targeted Vulnerability Traffic: Deploy a VulHub target machine and attack each classic vulnerability using default POCs.
  4. Countermeasure Attack Traffic: Raise DVWA's security level and attack it again at the medium and high settings.

Testing Method

With the test metrics and samples defined, we now need three things: a WAF, a target machine to receive the traffic, and testing tools.

  • WAF: All WAFs use initial configurations without any adjustments.
  • Target Machine: Use Nginx, configured to return a 200 status for any request as follows:
location / {
    return 200 'hello WAF!';
    default_type text/plain;
}
  • Testing Tools: The requirements for the testing tool are:
    • Parse Burp's export results.
    • Reassemble the HTTP requests.
    • Remove the Cookie header to ensure data can be open-sourced.
    • Modify the Host header field to ensure the target machine can receive the traffic correctly.
    • Determine if the request was intercepted by the WAF based on whether a 200 status was returned.
    • Mix black and white samples and send the requests evenly.
    • Automatically calculate the above "testing metrics".
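The checklist above maps almost line-for-line onto code. A minimal sketch of the rewrite-and-send step (my own illustration, not taken from either tool): strip the Cookie header, rewrite Host, send the raw request, and treat any non-200 response as an interception.

```python
import socket

def prepare_request(raw, host):
    """Rewrite a captured raw HTTP request: drop Cookie, set Host."""
    lines = raw.split("\r\n")
    out = [lines[0]]  # keep the request line as-is
    for line in lines[1:]:
        name = line.split(":", 1)[0].strip().lower()
        if name == "cookie":
            continue  # strip cookies so the dataset can be open-sourced
        if name == "host":
            line = "Host: " + host  # retarget at the machine behind the WAF
        out.append(line)
    return "\r\n".join(out).encode()

def is_blocked(raw, host="127.0.0.1", port=80, timeout=5.0):
    """Send the request through the WAF; a non-200 status means intercepted."""
    with socket.create_connection((host, port), timeout=timeout) as s:
        s.sendall(prepare_request(raw, host))
        status_line = s.recv(4096).split(b"\r\n", 1)[0]
    return status_line.split()[1] != b"200"
```

Since the Nginx target answers 200 to everything, any other status can only have come from the WAF.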

I found two open-source WAF testing tools that look solid and cover most of the requirements. Combined, with a few details filled in, they handle everything needed:

  • gotestwaf: An open-source WAF testing tool from Wallarm.
  • blazehttp: An open-source WAF testing tool from Chaitin Tech.

Start Testing

SafeLine Community Edition

  • TP: 426
  • TN: 33,056
  • FP: 38
  • FN: 149
  • Total Samples: 33,669
  • Success: 33,669
  • Errors: 0
  • Detection Rate: 74.09%
  • False Positive Rate: 8.19%
  • Accuracy: 99.44%
  • 90% Average Time: 0.73 ms
  • 99% Average Time: 0.89 ms

Coraza

  • TP: 404
  • TN: 27,912
  • FP: 5,182
  • FN: 171
  • Total Samples: 33,669
  • Success: 33,669
  • Errors: 0
  • Detection Rate: 70.26%
  • False Positive Rate: 92.77%
  • Accuracy: 84.10%
  • 90% Average Time: 3.09 ms
  • 99% Average Time: 5.10 ms

ModSecurity

  • TP: 400
  • TN: 25,713
  • FP: 7,381
  • FN: 175
  • Total Samples: 33,669
  • Success: 33,669
  • Errors: 0
  • Detection Rate: 69.57%
  • False Positive Rate: 94.86%
  • Accuracy: 77.56%
  • 90% Average Time: 1.36 ms
  • 99% Average Time: 1.71 ms

Nginx-Lua-WAF

  • TP: 213
  • TN: 32,619
  • FP: 475
  • FN: 362
  • Total Samples: 33,669
  • Success: 33,669
  • Errors: 0
  • Detection Rate: 37.04%
  • False Positive Rate: 69.04%
  • Accuracy: 97.51%
  • 90% Average Time: 0.41 ms
  • 99% Average Time: 0.49 ms

SuperWAF

  • TP: 138
  • TN: 33,048
  • FP: 46
  • FN: 437
  • Total Samples: 33,669
  • Success: 33,669
  • Errors: 0
  • Detection Rate: 24.00%
  • False Positive Rate: 25.00%
  • Accuracy: 98.57%
  • 90% Average Time: 0.34 ms
  • 99% Average Time: 0.41 ms

Comparison Table

| WAF           | Detection Rate | False Positive Rate | Accuracy | 90% Avg Time | 99% Avg Time |
| ------------- | -------------- | ------------------- | -------- | ------------ | ------------ |
| SafeLine CE   | 74.09%         | 8.19%               | 99.44%   | 0.73 ms      | 0.89 ms      |
| Coraza        | 70.26%         | 92.77%              | 84.10%   | 3.09 ms      | 5.10 ms      |
| ModSecurity   | 69.57%         | 94.86%              | 77.56%   | 1.36 ms      | 1.71 ms      |
| Nginx-Lua-WAF | 37.04%         | 69.04%              | 97.51%   | 0.41 ms      | 0.49 ms      |
| SuperWAF      | 24.00%         | 25.00%              | 98.57%   | 0.34 ms      | 0.41 ms      |

The SafeLine Community Edition performed the best overall, with the fewest false positives and false negatives.

Conclusion

To ensure fairness and impartiality, all testing tools and data used in this article are open-sourced and available at:

https://gitee.com/kxlxbb/testwaf

Different test samples and methods may lead to significant differences in test results. It is necessary to select appropriate test samples and methods based on the actual situation.

The results of this test are for reference only and should not be used as the sole standard for evaluating products, tools, algorithms, or models.
