In my first post on Strong Password Hashing, we discussed how the most common way to secure passwords against brute-force attacks is to incorporate a computation-time component.  This technique makes the password hashing process computationally expensive, so an attacker using brute force would have to spend a relatively enormous amount of time attacking the passwords.  But even this isn’t the most secure technique possible, and that’s what I’ll talk about here.

Today, on the Apache Shiro list, one of our users asked a great question:

“Your post is careful to point out the need to tune for load, etc.; but slowing down logons via cpu spinning is a terrible thing to do to web applications that experience any kind of load. I wonder if there is a way to have just one of these two evils – either cpu drain or connection / throughput drag. IIUC the problem the slowing is trying to solve, that could be accomplished by forcing some kind of time dependency in the hashing algorithm itself without cpu drain. Is that possible somehow?”

I decided to discuss that here for the benefit of anyone who may not be on the user list. Let me first answer the question:

I’m personally not aware of a time-based strategy that doesn’t result in forcing the user to change their password so frequently that it would effectively be useless (this doesn’t mean there isn’t a way of course! – just that I’m not aware of such a thing).

The entire point of multiple iterations is to slow down an attacker that has direct access to your password data.  For example, they’ve cracked the relational database or have a file dump of it.

Aside: File Dumps

MySQL dump files (backup files) accessible over the internet are a major contributor to password attacks.  People (IT engineers) spend an awful lot of time making sure their RDBMS servers are physically secure on the network, but the second a database is dumped to a backup file, and that file isn’t kept just as secure as the database itself, attack vectors begin to open up.

Many companies fall into this trap: although the network infrastructure may be rock solid, once a developer dumps MySQL to a backup file, they often don’t encrypt it and may put it somewhere that can be compromised later on.

Because of this, BCrypt and the # of iterations techniques discussed in the previous article assume flat-out that the data will be accessible to attackers in some way (either unknowingly or surreptitiously), even if the attackers don’t necessarily breach network security.

Back on Track

Ok, back on track with the execution time issue.

So the easiest way to protect against brute force is to increase CPU time when the password hash is calculated.  
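As a minimal sketch of that idea in plain Java (not production code; the SHA-256 choice, iteration count, and class name are all illustrative, and in practice you’d reach for BCrypt or another vetted library rather than rolling your own):

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.security.SecureRandom;

public class IteratedHash {

    // Illustrative count: tune so a single hash takes noticeable
    // CPU time on your own hardware under expected load.
    private static final int ITERATIONS = 500_000;

    public static byte[] hash(char[] password, byte[] salt)
            throws NoSuchAlgorithmException {
        MessageDigest md = MessageDigest.getInstance("SHA-256");
        md.update(salt);
        byte[] digest = md.digest(
                new String(password).getBytes(StandardCharsets.UTF_8));
        // Re-hash the output repeatedly so every brute-force guess
        // costs real CPU time (digest() resets the MessageDigest
        // after each call):
        for (int i = 1; i < ITERATIONS; i++) {
            digest = md.digest(digest);
        }
        return digest;
    }

    // A random, per-user salt defeats precomputed (rainbow table) attacks:
    public static byte[] newSalt() {
        byte[] salt = new byte[16];
        new SecureRandom().nextBytes(salt);
        return salt;
    }
}
```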

But as the above comment points out, this is less than ideal for load tuning.  The only remedy is to fire up more app instances to absorb the increased load.  Most people are ok with that, especially in virtualized and/or cloud environments where this is easy and cheap.

The Cloud Computing Wrench

Unfortunately, modern cloud computing and compute grids reduce the benefits of a time-based component in a password hash: yes, each brute-force attempt will be slow due to the # of hash iterations, but if you can spread attempts across a grid of computers, the total attack time drops enough that the defense is no longer as solid as it once was.
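To put rough (purely illustrative) numbers on it: if your iteration count makes a single guess take 0.1 seconds, then testing 100 million candidate passwords takes about 10 million seconds (roughly 115 days) on one machine, but spread across 1,000 rented instances it drops to under 3 hours.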

A very public example of this was the recent Sony PSN data breach, in which the attackers used Amazon EC2 to execute the attack.

You could increase the number of hash iterations to make grid attacks less effective, but the number of iterations would probably have to be so high that your end users would complain due to the very slow-feeling login process.

The new world of cloud computing appears to call for an even stronger technique. I’ll talk about my preferred solution next.

Split It Up

There is another approach that I happen to be partial to.  Many people feel it is overkill, but it doesn’t require much CPU churn, alleviating the computational/time concern.  It does rely more on I/O, so the end result may not reduce overall load, but it distributes that load better.  With this approach your end users probably won’t notice a slowdown, and it is much more secure (even if it is more resource and I/O heavy).

My solution is to split up the data required to perform a password hash across multiple data stores (a concrete sketch follows the list):

  1. Store the password hash (aka digest) output in one data store
  2. Store the salt used to compute the hash in a different data store
  3. Store the # of hash iterations in yet another location (perhaps application config?)
  4. Have the value stored in #1 not be the password hash itself, but a MAC (Message Authentication Code, aka ‘keyed hash’) of it, computed with an application-private MAC key.  This key would be stored in yet another location, separate from #1, #2 and #3.
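Here’s a minimal sketch of what the verification path could look like under this scheme.  The store interfaces (PasswordMacStore, SaltStore) and the key lookup are hypothetical stand-ins for whatever four separate locations you actually choose:

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

public class SplitStoreVerifier {

    private final PasswordMacStore macStore; // #1: MAC output, primary DB
    private final SaltStore saltStore;       // #2: separate data store
    private final int iterations;            // #3: application config
    private final byte[] macKey;             // #4: e.g. a key-management service

    public SplitStoreVerifier(PasswordMacStore macStore, SaltStore saltStore,
                              int iterations, byte[] macKey) {
        this.macStore = macStore;
        this.saltStore = saltStore;
        this.iterations = iterations;
        this.macKey = macKey;
    }

    public boolean verify(String username, char[] submittedPassword)
            throws GeneralSecurityException {
        byte[] salt = saltStore.saltFor(username);    // from store #2
        byte[] storedMac = macStore.macFor(username); // from store #1

        // Recompute the iterated, salted hash of the submitted password:
        MessageDigest md = MessageDigest.getInstance("SHA-256");
        md.update(salt);
        byte[] digest = md.digest(
                new String(submittedPassword).getBytes(StandardCharsets.UTF_8));
        for (int i = 1; i < iterations; i++) {
            digest = md.digest(digest);
        }

        // MAC the hash with the application-private key (store #4):
        Mac hmac = Mac.getInstance("HmacSHA256");
        hmac.init(new SecretKeySpec(macKey, "HmacSHA256"));
        byte[] computedMac = hmac.doFinal(digest);

        // Constant-time comparison against the value from store #1:
        return MessageDigest.isEqual(storedMac, computedMac);
    }

    // Hypothetical store interfaces, each backed by a different location:
    interface PasswordMacStore { byte[] macFor(String username); }
    interface SaltStore { byte[] saltFor(String username); }
}
```

The point of the sketch is that a guess can’t even be tested unless the attacker holds all four inputs: the MAC from one store, the salt from another, the iteration count from config, and the private MAC key.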

In order for an attacker to compromise any passwords, they would need to crack all four mechanisms and storage locations.  Any 3 of the 4 could be compromised and the attacker still wouldn’t be able to perform a brute-force attack.

While not impossible to crack, successfully attacking all 4 locations and knowing how to associate them is orders of magnitude more difficult than cracking standard *nix crypt password output (which is susceptible to brute force when everything lives in a single storage location).

Wrapping Up

The issue with splitting up the data is that now we have a much more complex environment.  No one said security was simple! (In fact, I believe security is inversely proportional to simplicity, but such is life).

However, your end-user experience would be much faster without the artificial time restriction, and the solution is inherently more secure.  This is probably a better fit for really security-conscious organizations or products.