Regular Expression Denial of Service (ReDoS) Attack: A Practical Guide

ReDoS stands for Regular Expression Denial of Service. It is a type of DoS attack that exploits how some programs evaluate complex regular expressions. By crafting a special input string, an attacker can force a vulnerable pattern to take an extremely long time to evaluate.

Regular expressions (regex) are a concise and powerful tool for programmers to match, search, and manipulate text. They define patterns of characters that can be used to identify specific text strings within a larger body of text. Let's take a closer look at how regex works:

  1. Patterns: Regex uses a combination of literal characters (like "a" or "@") and special characters with specific meanings. For example, "." (dot) matches any single character, "*" (star/asterisk) matches the preceding element zero or more times, and "+" (plus) matches it one or more times. You can combine these to create complex patterns.
  2. Matching: A regex engine attempts to match the entire pattern against a target string. If there's a complete match, it succeeds. The match can occur at the beginning of the string, anywhere within it, or even at the end.
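
The special characters described above can be tried out in a few lines of Python. This is a minimal sketch; the sample words and variable names are illustrative assumptions chosen only to show each symbol in isolation.

```python
import re

# "." matches any single character, so "c.t" matches "cat" and "cut" but not "ct".
dot_matches = [w for w in ["cat", "cut", "ct"] if re.fullmatch(r"c.t", w)]

# "*" allows zero or more of the preceding element: "ab*" matches "a", "ab", "abbb".
star_matches = [w for w in ["a", "ab", "abbb", "b"] if re.fullmatch(r"ab*", w)]

# "+" requires at least one of the preceding element, so "a" alone no longer matches.
plus_matches = [w for w in ["a", "ab", "abbb", "b"] if re.fullmatch(r"ab+", w)]

# re.search finds a match anywhere in the string, not only at the start.
found = re.search(r"b+", "aaabbbccc")

print(dot_matches, star_matches, plus_matches, found.group())
```

re.fullmatch requires the whole string to match the pattern, while re.search accepts a match at any position, which mirrors the two matching behaviours described in the list.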

Benefits Of Using Regex

Regex offers several advantages and some of them are explained below:

  1. Conciseness: Complex search patterns can be expressed in a short and efficient way compared to looping through every character.
  2. Power: It allows for powerful text manipulation and validation tasks beyond simple string comparisons.
  3. Flexibility: Regex patterns can be adapted to various text formats and needs by adjusting the combination of characters and special symbols.

Example - Login Form Validation

Let's consider a simple username validation in a login form using JavaScript and regex. The validation code looks like the following:

function validateUsername(username) {
  const usernameRegex = /^[a-zA-Z0-9_]{5,15}$/; // The regular expression pattern
  return usernameRegex.test(username); // Test the username against the pattern
}
const usernameInput = document.getElementById("username");
usernameInput.addEventListener("keyup", function() {
  const username = this.value;
  const isValid = validateUsername(username);
  if (isValid) {
    usernameInput.classList.remove("error"); // Remove error class if valid
  } else {
    usernameInput.classList.add("error"); // Add error class for invalid username
  }
});

In the above code, the "usernameRegex" variable holds the actual regex pattern. Below is the breakdown of the regex pattern:

  • ^: Matches the beginning of the string.
  • [a-zA-Z0-9_]: Matches any lowercase or uppercase letter, digit, or underscore.
  • {5,15}: Specifies that the username must be between 5 and 15 characters long.
  • $: Matches the end of the string.

The test method of the regex object checks whether the provided username matches the pattern (usernameRegex).

This code can be used to dynamically validate the username as the user types. If the username doesn't match the pattern (e.g., it has fewer than 5 characters or contains special symbols), an error class is added to the input field for visual feedback.
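
The same rule can also be checked outside the browser. The sketch below is a Python analogue of the pattern above; the function name validate_username and the sample usernames are illustrative assumptions, not part of the original form code.

```python
import re

# Same pattern as the JavaScript example: 5 to 15 letters, digits, or underscores.
USERNAME_REGEX = re.compile(r"^[a-zA-Z0-9_]{5,15}$")

def validate_username(username):
    """Return True when the whole username satisfies the pattern."""
    return USERNAME_REGEX.fullmatch(username) is not None

samples = ["alice_01", "bob", "this_name_is_far_too_long", "bad name!"]
results = {name: validate_username(name) for name in samples}

for name, ok in results.items():
    print(f"{name!r}: {'valid' if ok else 'invalid'}")
```

Only "alice_01" passes: "bob" is too short, the third sample is too long, and the last one contains a space and an exclamation mark.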


Regular Expression Denial of Service Attack

ReDoS stands for Regular Expression Denial of Service. It is a type of DoS that exploits how some programs handle complex regex.

ReDoS attacks exploit a weakness in how programs handle complex search patterns. By crafting a special input for a vulnerable pattern (called an "evil regex"), attackers can trick the program into taking an extremely long time to find a match. This can slow down the program or crash it entirely, making it unavailable to users (DoS). Regex is common on the web, so this attack can target many different parts of a website.

Causes of ReDoS Attack

The root cause is a vulnerable regex pattern combined with untrusted input and a backtracking regex engine. When a pattern is overly complex or contains certain constructs, such as nested or overlapping repetition, an attacker can craft a malicious input string that exploits it.

This string takes the program an abnormally long time to evaluate, consuming excessive resources and ultimately crashing the program or making it unresponsive.

This DoS effect is what defines a ReDoS attack.

  • Vulnerable regex: Poorly written patterns with excessive repetition (such as .*) or many backtracking options can be manipulated by attackers.
  • Malicious input: Attackers craft a special input string that the vulnerable regex takes an extremely long time to evaluate.

The program gets bogged down trying to process the malicious input, ultimately crashing or becoming unresponsive to legitimate users (DoS).

Example of code that is vulnerable to ReDoS:

import re

def validate_user_input(user_input):
  # Evil regex pattern: ten a* terms inside a repeated group, anchored with ^ and $
  pattern = r"^(a*a*a*a*a*a*a*a*a*a*)*$"
  match = re.match(pattern, user_input)
  if match:
    return True
  else:
    return False

# User input (malicious): many "a"s followed by a character the pattern cannot match
user_input = "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa!"
# Warning: this call can take an extremely long time to return
if validate_user_input(user_input):
  print("Valid input")
else:
  print("Invalid input")

Explanation of the vulnerable code:

  1. Evil Regex Pattern: The pattern variable defines a regex with nested quantifiers: ten consecutive a* terms inside a group that is itself repeated with *. The pattern is meant to accept strings made up only of "a" characters, but the same run of "a"s can be split among those terms in a huge number of ways.
  2. Vulnerability: The problem lies in the nested quantifiers. When a match attempt fails, a backtracking engine tries every alternative split before giving up, so the number of possibilities grows exponentially with the number of "a"s in the user input. The engine does not loop forever, but it can take an extremely long time to process the input.
  3. Malicious User Input: The user_input variable contains a string with many "a" characters followed by "!", which the pattern cannot match. The trailing character forces the match to fail and triggers the backtracking.
  4. Potential DoS: If this code is used in a real-world application and a malicious user submits such input, it could overload the program with the regex engine stuck in the matching process. This could lead to a DoS attack, making the application unresponsive to legitimate users.

The Algorithm Behind ReDoS Attacks

The Naive Matching Algorithm:

This is a basic backtracking algorithm used by many regex engines to match patterns in text. It works step by step, like trying each key on a keychain until one opens the lock: when one way of matching fails, the engine goes back and tries the next. The naive regex algorithm struggles with complex patterns.

  1. Exponential Growth: Repetition such as ".*" or nested quantifiers can make the number of matching attempts explode as the text gets longer.
  2. Runaway Backtracking: With patterns that offer many choices, the algorithm can spend so long backtracking that it appears to be stuck in an endless loop.

An evil regex, also known as a DoS-causing regex, is a malicious or poorly constructed regex pattern that triggers this behaviour and can cause problems in programs.

Examples of Evil Regex:

(a+)+
([a-zA-Z]+)*
(a|aa)+
(a|a?)+
(.*a){x} for x > 10
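
The effect of such patterns can be observed by timing one of them against inputs of growing length. The sketch below is a rough illustration rather than a benchmark: it assumes Python's built-in re module (a backtracking engine), uses (a+)+ from the list above with anchors added, and compares it with ^a+$, a pattern that accepts the same strings without nesting. The helper name time_match and the input sizes are assumptions chosen to keep the run short.

```python
import re
import time

def time_match(pattern, text):
    """Return the seconds re.match takes to evaluate pattern against text."""
    start = time.perf_counter()
    re.match(pattern, text)
    return time.perf_counter() - start

EVIL = r"^(a+)+$"  # nested quantifiers
SAFE = r"^a+$"     # accepts the same strings without nesting

timings = []
for n in range(10, 21, 2):
    text = "a" * n + "!"  # the trailing "!" forces the match to fail
    timings.append((n, time_match(EVIL, text), time_match(SAFE, text)))

for n, evil_seconds, safe_seconds in timings:
    print(f"n={n:2d}  evil={evil_seconds:.5f}s  safe={safe_seconds:.7f}s")
```

On a backtracking engine the time for the evil pattern grows roughly exponentially with n, while the safe pattern stays fast. Keep n small when experimenting, since a few extra characters can turn milliseconds into minutes.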

ReDoS Regex Example 

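import re

text = "a" * 30 + "!"  # Many "a"s followed by a character that makes the match fail
pattern = r"^a(a*)+$"  # This is the evil regex

# This will hang the program (or take an extremely long time)
if re.findall(pattern, text):
  print("Found a match!")

Explanation of the ReDoS Regex example:

  • ^: Anchors the match at the beginning of the string.
  • a: Matches the letter "a" at the beginning.
  • (a*): This is a capturing group that matches zero or more repetitions of the letter "a". The asterisk (*) makes the repetition greedy, meaning it will try to match as many "a"s as possible.
  • +: This operator matches one or more occurrences of the preceding element, which in this case is the capturing group (a*). Because the group can itself match any number of "a"s, the same run of "a"s can be divided between repetitions in exponentially many ways.
  • $: Anchors the match at the end of the string. The trailing "!" in text can never match, so the engine tries every possible division of the "a"s before giving up.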

Case Study: CVE-2024-6434 - ReDoS in Premium Addons for Elementor

This case study examines CVE-2024-6434, a ReDoS vulnerability discovered in the Premium Addons for Elementor plugin for WordPress. The vulnerability allows attackers to crash a WordPress website by crafting a specific malicious string for the post title. This case study will explore the technical details of the vulnerability, its potential impact, and the recommended remediation steps.

Functional aspects

The Premium Addons for Elementor plugin, up to version 4.10.35, is vulnerable to ReDoS due to its improper handling of user-supplied input in the form of post titles. The plugin uses regex to validate and process post titles. However, the implemented regex handling is flawed, and an attacker can craft a post title that takes an excessive amount of time and resources for the server to process. This excessive processing can lead to a DoS condition, rendering the website unresponsive to legitimate users.


Let's understand how a regex check can be bypassed using a demo lab. In this lab, we are not going to perform any DoS attacks; the aim is to understand how regex checks are bypassed and how attacks are performed.

Lab: BuggyLabs Regex-bypass

You can find our labs here: https://hub.docker.com/r/defensiumlabs/buggylabs_regex-bypass

Pull the Docker image from Docker Hub, from the official Defensium Labs page, and run it:

docker pull defensiumlabs/buggylabs_regex-bypass
docker run -p 8000:8000 defensiumlabs/buggylabs_regex-bypass

Now visit localhost:8000 in your browser.

You can see a search field that takes input. Because the lab is based on regex bypass, it is supposed to validate the input with a regex, so let's look for any files that perform that validation. The page does load a JS file:

Right-click → Inspect element  → Network tab → index.js


On analysis of the JS file, we can see that the code checks if the user input can be converted into a valid regular expression using new RegExp(inputValue). If it throws an error during this conversion (try...catch), it raises an alert saying "Regex bypassed!".

However, this doesn't indicate a true bypass of the intended validation. A malicious user can still provide invalid input that bypasses the second regular expression check (/^[a-zA-Z0-9_]+$/).

This is done on purpose because we have simulated an environment where we just want to showcase how a regex check is bypassed.

So here, we just need to craft a payload that is not a valid regex, for example by copying part of the given check (/^[a-zA-Z0-9_]+$/) and breaking its syntax. Such input will bypass the regex check.

Let's try the input “(!/^[a-zA-Z0-9_]”

As you can see, the crafted input allowed us to bypass the regex check successfully, whereas in a real scenario it could have been used to cause a ReDoS attack.
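
The behaviour observed in the lab can be approximated in Python. This is only a sketch under assumptions: the lab's real code is JavaScript, the order of the two checks is assumed from the description above, the function name check_input is hypothetical, and Python's regex syntax differs from JavaScript's in some details.

```python
import re

# The allow-list check quoted from the lab's JS file.
ALLOWED = re.compile(r"^[a-zA-Z0-9_]+$")

def check_input(value):
    """Hypothetical Python analogue of the lab's client-side check."""
    try:
        re.compile(value)  # analogous to new RegExp(inputValue) in the lab
    except re.error:
        return "Regex bypassed!"  # the input is not a valid regular expression
    if ALLOWED.match(value):
        return "valid"
    return "invalid"

payload_result = check_input("(!/^[a-zA-Z0-9_]")
print(payload_result)
print(check_input("user_01"))
print(check_input("hello world"))
```

The payload from the lab has an opening parenthesis that is never closed, so compiling it fails and the function reports the bypass message, while ordinary input is classified by the allow-list pattern.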

Please note that the lab only demonstrates how a regex filter can be bypassed; such a bypass can later be escalated to a DoS.


Mitigation for ReDoS Attack

1. Secure Regex Writing: Avoid overly complex patterns: Simpler regex with fewer repetitions and backtracking options are less vulnerable. The more complex the regex, the higher the vulnerability. Break down complex validations into smaller, simpler patterns.

Example of Username Validation:

Vulnerable Pattern:

r"^([a-zA-Z0-9_]+)*$" (Nested quantifiers: a + inside a repeated group)

Safe Pattern:

r"^[a-zA-Z0-9_]+$" (Single quantifier, no nesting)

2. Input Validation and Sanitization: Besides writing secure regex, make sure user input is safe before it reaches the regex, for example by checking its length first.

Example: Validating usernames with a regex to allow letters, numbers, and underscores (3-20 characters).

Vulnerable Code:

import re

def is_valid_username(username):
  pattern = r"^([a-zA-Z0-9_]+)*$"  # Nested quantifiers: this regex can be exploited
  return bool(re.match(pattern, username))  # No length check before matching

username = input("Enter username: ")
if is_valid_username(username):
  print("Processing login...")
else:
  print("Invalid username!")

Secure Code:

import re

def is_valid_username(username):
  if not (3 <= len(username) <= 20):
    return False  # Reject too-short or over-long input before the regex runs
  pattern = r"^[a-zA-Z0-9_]+$"  # Secure regex: single quantifier, no nesting, no .*
  return bool(re.match(pattern, username))

username = input("Enter username: ")
if is_valid_username(username):
  print("Processing login...")
else:
  print("Invalid username!")

3. Timeouts and Resource Limits: Set a maximum time limit for regex evaluation. If the engine takes too long, terminate the operation to stop runaway backtracking.

Example: Imagine a program solving a maze. A timeout is like an alarm clock. If the program takes too long (stuck in a loop with bad regex), the timeout cuts it off, preventing a crash.

# Example (pseudocode):
def regex_match(text, pattern, timeout):
  # ... regex matching logic ...
  if time_spent > timeout:
    raise TimeoutError  # Stop the program if it takes too long
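
Python's built-in re module has no timeout parameter, so the pseudocode above needs some outside help to become runnable. One possible approach, sketched below under assumptions, is to run the match in a child Python process and kill it when the limit is exceeded. The function name regex_match_with_timeout and the helper script _CHILD_CODE are hypothetical, and starting a process for every match is expensive, so treat this as an illustration of the idea rather than a production design.

```python
import subprocess
import sys

# Small script run in a child interpreter: exit code 0 = match, 1 = no match.
_CHILD_CODE = (
    "import re, sys\n"
    "pattern, text = sys.argv[1], sys.argv[2]\n"
    "sys.exit(0 if re.match(pattern, text) else 1)\n"
)

def regex_match_with_timeout(pattern, text, timeout_seconds):
    """Return True/False for a match, or raise TimeoutError if matching runs too long."""
    try:
        result = subprocess.run(
            [sys.executable, "-c", _CHILD_CODE, pattern, text],
            timeout=timeout_seconds,  # the child is killed once this expires
        )
    except subprocess.TimeoutExpired:
        raise TimeoutError("regex evaluation exceeded the time limit") from None
    # Any non-zero exit (including an invalid pattern) is treated as "no match".
    return result.returncode == 0

print(regex_match_with_timeout(r"^[a-z]+$", "hello", timeout_seconds=10))
try:
    regex_match_with_timeout(r"^(a+)+$", "a" * 40 + "!", timeout_seconds=1)
except TimeoutError as exc:
    print("Aborted:", exc)
```

Because the match runs in a separate process, killing it does not affect the main application, which can then reject the request or fall back to a stricter check.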
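4. Choose an Appropriate Regex Engine: Not all regex engines are created equal. Some engines employ more sophisticated algorithms that are less prone to ReDoS vulnerabilities. Consider the engine's capabilities when designing your application. Below are some engine options to consider:
  • PCRE: a backtracking engine, so vulnerable patterns can still be slow, but it offers configurable match limits that abort runaway matches
  • RE2: matches in time linear in the input length, at the cost of not supporting backreferences or lookaround
  • GNU grep: uses a DFA-based matcher for most patterns, which avoids backtracking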
5. Stay Informed and Update Regularly: Keep yourself updated on the latest ReDoS vulnerabilities and best practices. New attack vectors might emerge, so staying informed is crucial. Also, patch and update your software regularly. Software vendors often release patches to address security vulnerabilities, including ReDoS exploits.

Conclusion

Regular expressions are powerful tools for text manipulation, but like any tool, they can be misused. This blog explored regex and a hidden danger: ReDoS attacks. These attacks exploit poorly crafted regex patterns to overload servers, causing a denial of service.

We saw how to mitigate ReDoS by prioritising simpler patterns and using tools like timeouts. To illustrate the concept, we even built a mini-lab to bypass a vulnerable regex (remember, don't use this for malicious purposes!).

By understanding regex and ReDoS, you can leverage this powerful tool safely and keep your applications secure.


We hope you enjoyed reading this blog. If you did, please share it with your community and like-minded professionals. Let us know in the comments if you have any questions or which topic you want to learn about next.