Timing attacks
Timing attacks are a class of attacks in which the length of time your application takes to perform a task leaks information. Take, for example, an application that checks an email and password. If no user exists with the provided email address, it returns an error in 5ms, but when given a valid email with an incorrect password, it returns an error in 500ms.
To an attacker, the difference in response time between those two requests makes it relatively obvious whether an email is valid. If the difference were more subtle, an attacker could make many requests over a long period and average the timings to distinguish the two cases.
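To get a feel for why averaging works, here is a small simulation. It is only a sketch: the server is fake, and the 0.5ms gap, the 5ms of jitter, and the function names are all numbers and names I made up for illustration.

```python
import random
import statistics


def simulated_response_ms(email_exists: bool, rng: random.Random) -> float:
    """Hypothetical server: a tiny timing gap buried in much larger jitter."""
    base = 50.0 + (0.5 if email_exists else 0.0)  # assumed 0.5 ms gap
    return base + rng.gauss(0.0, 5.0)             # assumed 5 ms of network jitter


def mean_response_ms(email_exists: bool, samples: int, rng: random.Random) -> float:
    """Average many simulated requests to smooth out the jitter."""
    return statistics.fmean(
        simulated_response_ms(email_exists, rng) for _ in range(samples)
    )


rng = random.Random(42)

# A single pair of requests: the jitter usually swamps the 0.5 ms gap.
one_shot_gap = simulated_response_ms(True, rng) - simulated_response_ms(False, rng)

# Many requests per case: the averages settle near the true gap.
averaged_gap = mean_response_ms(True, 20_000, rng) - mean_response_ms(False, 20_000, rng)

print(f"single-request gap: {one_shot_gap:.2f} ms")
print(f"averaged gap:       {averaged_gap:.2f} ms")
```

A single measurement tells the attacker very little, but the averaged gap lands close to the real difference, which is enough to tell the two cases apart.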
Is it a big deal?
This might not seem like a big deal, but let’s say I’m trying to find someone’s personal email. I only have their name, and I know they have signed up for your site. I can try a bunch of variations of firstname.lastname@gmail.com or lastname{3digitnumber}@gmail.com and so on until I find their actual email address.
How can we fix it?
To fix this issue, we need to make sure that all code paths take roughly the same amount of time. This means avoiding early returns in sensitive parts of the codebase. When checking a user's email and password, instead of returning early if the email wasn't found, we should still check the provided password against a hardcoded hash and then return false.
In the email-checking example, a typical flow looks something like this:
Does a user exist with this email address? (1ms)
If yes, what is their password hash? (1ms)
Does the password hash match the password provided? (400ms)
This flow is fine when a correct email and password are provided, but it becomes vulnerable to a timing attack in the following scenario:
Does a user exist with this email address? (1ms)
If no, return (1ms)
One way to avoid this vulnerability, as mentioned above, is to make the correct and incorrect flows follow the same steps so that their timings align more closely:
Does a user exist with this email address? (1ms)
If no, compare the provided password against a hardcoded password hash (400ms)
Return false anyway (1ms)
This makes the function take roughly the same amount of time for all inputs, making it much harder for an attacker to extract information.
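Here is a minimal sketch of that flow in Python. The in-memory USERS store, the function names, and the PBKDF2 settings are assumptions of mine for illustration, not a recommendation for production parameters.

```python
import hashlib
import hmac
import os

# Hypothetical in-memory user store: email -> (salt, password hash).
USERS: dict[str, tuple[bytes, bytes]] = {}


def hash_password(password: str, salt: bytes) -> bytes:
    # A deliberately slow hash; the iteration count is illustrative only.
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)


def register(email: str, password: str) -> None:
    salt = os.urandom(16)
    USERS[email] = (salt, hash_password(password, salt))


# Dummy credentials computed once at start-up, used when the email is unknown.
DUMMY_SALT = os.urandom(16)
DUMMY_HASH = hash_password("not-a-real-password", DUMMY_SALT)


def check_login(email: str, password: str) -> bool:
    record = USERS.get(email)
    # No early return: unknown emails still pay for a full hash computation.
    salt, stored_hash = record if record is not None else (DUMMY_SALT, DUMMY_HASH)
    candidate = hash_password(password, salt)
    # compare_digest avoids short-circuiting on the first differing byte.
    matches = hmac.compare_digest(candidate, stored_hash)
    # An unknown email always fails, even if the dummy hash happened to match.
    return matches and record is not None


register("alice@example.com", "correct horse")
```

Both the known-email and unknown-email paths now run the slow hash and the comparison, so the expensive step no longer depends on whether the account exists.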
While you should do what you can to protect against timing attacks directly, you can also add further protections to be safe. Since subtle timing attacks rely on making a large number of requests, another defense here is rate limiting. By rate limiting requests, we can make it impractical for an attacker to collect enough measurements to distinguish between different cases.
When building authentication flows into applications, it can be easy to overlook subtle vulnerabilities like timing attacks. Although it might feel strange to intentionally slow down your code, stopping the potential leak of personal information is worth the trade-off.
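As a rough illustration of rate limiting, here is a small sliding-window limiter. The class name, the limits, and the idea of keying on an IP address or email are my own assumptions; a real deployment would more likely use a shared store or an existing middleware.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most max_requests per key within any window_seconds span."""

    def __init__(self, max_requests: int, window_seconds: float, clock=time.monotonic):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the limiter can be tested without sleeping
        self._hits = defaultdict(deque)

    def allow(self, key: str) -> bool:
        now = self.clock()
        hits = self._hits[key]
        # Drop timestamps that have aged out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False
        hits.append(now)
        return True


# Example: at most 5 login attempts per minute per client (key is hypothetical).
login_limiter = SlidingWindowLimiter(max_requests=5, window_seconds=60.0)
```

A login handler would call login_limiter.allow(client_key) before checking credentials and reject the request when it returns False, which caps how many timing samples an attacker can gather.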
Top comments (8)
This is a really interesting article. I can totally see how a hacker might make use of timing data to try and discover more information so that they might crack into someone's account.
Anyway, good post and appreciate ya sharing this series!
There's a different way to solve this by using rate limiting and slowdown (Requests past the rate limit get slowed down). This ensures that normal users still get rapid feedback from your API, while still making it hard for an attacker to guess which requests were valid paths and which weren't.
Absolutely, this is also a great way to solve it - as long as the timing attack requires more than a few attempts. There was a theoretical timing attack with Lobste.rs a while back and one of the reasons it wasn't that big of a deal was both rate limiting and a short window where it could be exploited.
Interesting point.
In most cases, we are taught that we should add guard statements and return early. However in this case, it seems we should run through all the steps regardless to avoid unintentionally revealing secrets.
There is a subtle flaw in the proposed solution, which for the most part is easily fixed. The issue is that information about which path was taken can be leaked via which CPU cache lines are present. The proposed solution should have CPU flush instructions both on entry and on all exits; otherwise a malicious user can tell which path was taken by making an additional call soon afterwards and measuring the time difference, which depends on whether the CPU has to fetch certain cache lines.
Besides what is described, there is also a form of timing attack where two or more code paths with incorrect locking are executed in parallel. Normally this attack targets code paths within the kernel, which is a common location for locking issues. During my professional career I created and used tools for the near-deterministic detection of such issues. See U.S. patents 7475385 Cooperating Test Triggers and 7310741 Phase Adjusted Delay Loop. Most issues can be discovered via the use of Cooperating Test Triggers. For the most part the phase adjusted delay loop is only needed to detect cases where the presence or absence of a single cache line is involved.
Indeed, with Node, I make my important checks with Node crypto's timingSafeEqual function to avoid short-circuiting timing clues.
What I always do when an invalid login happens is to increase the waiting time each time, so an attacker is slowed down after each attempt.
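The growing wait described here could be sketched roughly like this in Python. The doubling schedule, the one-second base, and the five-minute cap are assumptions of mine; the comment only says the wait gets longer with each failed attempt.

```python
def lockout_delay_seconds(failed_attempts: int, base: float = 1.0, cap: float = 300.0) -> float:
    """Return how long to wait before accepting the next login attempt.

    Assumed schedule: the delay doubles with every failed attempt, up to a cap.
    """
    if failed_attempts <= 0:
        return 0.0
    # Bound the exponent so huge counters cannot overflow the float math.
    exponent = min(failed_attempts - 1, 30)
    return min(base * 2 ** exponent, cap)
```

The failed-attempt counter would typically be tracked per account or per client and reset after a successful login.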
But then how do we know whether the user has actually given a password if we always check the password, even for empty values?