What is the best Distributed Brute Force countermeasure?
Combining methods 3 and 4 from the original post into a kind of 'fuzzy' or dynamic whitelist, and then - and here's the trick - not blocking non-whitelisted IPs, just throttling them to hell and back.
Note that this measure is only meant to thwart this very specific type of attack. In practice, of course, it would work in combination with other best-practices approaches to auth: fixed-username throttling, per-IP throttling, code-enforced strong password policy, unthrottled cookie login, hashing all password equivalents before saving them, never using security questions, etc.
Assumptions about the attack scenario
If an attacker is targeting variable usernames, our username throttling doesn't fire. If the attacker is using a botnet or has access to a large IP range, our IP throttling is powerless. If the attacker has pre-scraped our userlist (usually possible on open-registration web services), we can't detect an ongoing attack based on number of 'user not found' errors. And if we enforce a restrictive system-wide (all usernames, all IPs) throttling, any such attack will DoS our entire site for the duration of the attack plus the throttling period.
So we need to do something else.
The first part of the countermeasure: Whitelisting
What we can be fairly sure of, is that the attacker is not able to detect and dynamically spoof the IP addresses of several thousand of our users(+). Which makes whitelisting feasible. In other words: for each user, we store a list of the (hashed) IPs from where the user has previously (recently) logged in.
Thus, our whitelisting scheme will function as a locked 'front door', where a user must be connected from one of his recognized 'good' IPs in order to log in at all. A brute-force attack on this 'front door' would be practically impossible(+).
(+) unless the attacker 'owns' either the server, all our users' boxes, or the connection itself -- and in those cases, we no longer have an 'authentication' issue, we have a genuine franchise-sized pull-the-plug FUBAR situation
The second part of the countermeasure: System-wide throttling of unrecognized IPs
In order to make a whitelist work for an open-registration web service, where users switch computers frequently and/or connect from dynamic IP addresses, we need to keep a 'cat door' open for users connecting from unrecognized IPs. The trick is to design that door so botnets get stuck, and so legitimate users get bothered as little as possible.
In my scheme, this is achieved by setting a very restrictive maximum number of failed login attempts by unapproved IPs over, say, a 3-hour period (it may be wiser to use a shorter or longer period depending on type of service), and making that restriction global, ie. for all user accounts.
Even a slow (1-2 minutes between attempts) brute force would be detected and thwarted quickly and effectively using this method. Of course, a really slow brute force could still remain unnoticed, but too slow speeds defeat the very purpose of the brute force attack.
What I am hoping to accomplish with this throttling mechanism is that if the maximum limit is reached, our 'cat door' slams closed for a while, but our front door remains open to legitimate users connecting by usual means:
- Either by connecting from one of their recognized IPs
- Or by using a persistent login cookie (from anywhere)
The only legitimate users who would be affected during an attack - ie. while the throttling was activated - would be users without persistent login cookies who were logging in from an unknown location or with a dynamic IP. Those users would be unable to login until the throttling wore off (which could potentially take a while, if the attacker kept his botnet running despite the throttling).
To allow this small subset of users to squeeze through the otherwise-sealed cat door, even while bots were still hammering away at it, I would employ a 'backup' login form with a CAPTCHA. So that, when you display the "Sorry, but you can't login from this IP address at the moment" message, include a link that says "secure backup login - HUMANS ONLY (bots: no lying)". Joke aside, when they click that link, give them a reCAPTCHA-authenticated login form that bypasses the site-wide throttling. That way, IF they are human AND know the correct login+password (and are able to read CAPTCHAs), they will never be denied service, even if they are connecting from an unknown host and not using the autologin cookie.
Oh, and just to clarify: Since I do consider CAPTCHAs to be generally evil, the 'backup' login option would only appear while throttling was active.
There is no denying that a sustained attack like that would still constitute a form of DoS attack, but with the described system in place, it would only affect what I suspect to be a tiny subset of users, namely people who don't use the "remember me" cookie AND happen to be logging in while an attack is happening AND aren't logging in from any of their usual IPs AND who can't read CAPTCHAs. Only those who can say no to ALL of those criteria - specifically bots and really unlucky disabled people - will be turned away during a bot attack.
EDIT: Actully, I thought of a way to let even CAPTCHA-challenged users pass through during a 'lockdown': instead of, or as a supplement to, the backup CAPTCHA login, provide the user with an option to have a single-use, user-specific lockdown code sent to his email, that he can then use to bypass the throttling. This definitely crosses over my 'annoyance' threshold, but since it's only used as a last resort for a tiny subset of users, and since it still beats being locked out of your account, it would be acceptable.
(Also, note that none of this happens if the attack is any less sophisticated than the nasty distributed version I've described here. If the attack is coming from just a few IPs or only hitting a few usernames, it will be thwarted much earlier, and with no site-wide consequences)
So, that is the countermeasure I will be implementing in my auth library, once I'm convinced that it's sound and that there isn't a much simpler solution that I've missed. The fact is, there are so many subtle ways to do things wrong in security, and I'm not above making false assumptions or hopelessly flawed logic. So please, any and all feedback, criticism and improvements, subtleties etc. are highly appreciated.
A few simple steps:
Blacklist certain common usernames, and use them as a honeypot. Admin, guest, etc... Don't let anyone create accounts with these names, so if someone does try to log them in you know it's someone doing something they shouldn't.
Make sure anyone who has real power on the site has a secure password. Require admins/ moderators to have longer passwords with a mix of letters, numbers and symbols. Reject trivially simple passwords from regular users with an explanation.
One of the simplest things you can do is tell people when someone tried to log into their account, and give them a link to report the incident if it wasn't them. A simple message when they log in like "Someone tried to log into your account at 4:20AM Wednesday blah blah. Click here if this wasn't you." It lets you keep some statistics on attacks. You can step up monitoring and security measures if you see that there's a sudden increase in fraudulent accesses.
If I understand the MO of brute force attacks properly, then one or more usernames are tried continuously.
There are two suggestions which I don't think I've seen yet here:
- I always thought that the standard practice was to have a short delay (a second or so) after each wrong login for every user. This deters brute-force, but I don't know how long a one second delay would keep a dictionary attack at bay. (dictionary of 10,000 words == 10,000 seconds == about 3 hours. Hmm. Not good enough.)
- instead of a site-wide slow down, why not a user-name throttle. The throttle becomes increasingly harsh with each wrong attempt (up to a limit, I guess so the real user can still login)
Edit: In response to comments on a username throttle: this is a username specific throttle without regard to the source of the attack.
If the username is throttled, then even a coordinated username attack (multi IP, single guess per IP, same username) would be caught. Individual usernames are protected by the throttle, even if the attackers are free to try another user/pass during the timeout.
From an attackers point of view, during the timeout you may be able to take a first time guess at 100 passwords, and quickly discover one wrong password per account. You may only be able to make a 50 second guesses for the same time period.
From a user account point of view, it still takes the same average number of guesses to break the password, even if the guesses are coming from multiple sources.
For the attackers, at best, it will be the same effort to break 100 accounts as it would 1 account, but since you're not throttling on a site wide basis, you can ramp up the throttle quite quickly.
Extra refinements:
- detect IPs that are guessing multiple accounts - 408 Request Timeout
- detect IPs that are guessing the same account - 408 Request Timeout after a large (say 100) number of guesses.
UI ideas (may not be suitable in this context), which may also refine the above:
- if you are in control of the password setting, then showing the user how strong their password is encourages them to pick a better one.
- if you are in control of the login page, after a small (say 10) number of guesses of a single username, offer a CAPTCHA.