In the past few years there have been some very high-profile security breaches that ended up exposing users’ passwords. Although there’s no such thing as a ‘good’ outcome in these situations, some password losses are potentially more catastrophic than others.
The infamous clear-text password
In July of last year Yahoo had a breach that exposed 450,000 usernames and passwords, which were stored in clear text. As soon as the breach occurred, the attacker had the passwords for all of those users and had the opportunity to try them at other sites. Many users reuse passwords, after all, so why not try the password stolen from Yahoo over at Gmail? And Dropbox, and Salesforce? Or the credit card companies?
Clear-text password storage is banana-land bad. Unless it’s your first freshman programming project, there is absolutely no excuse for it. At the very least, every DevOps engineer and system administrator with access to your system can read the passwords, and in case of a breach there is obviously no protection at all.
The most basic step is to hash the passwords before storing them. Hashing means you put the original password through a one-way algorithm that maps it to some new value; the system never stores the password itself, only the hash. Later, when the user attempts to log in, the system hashes that input and compares it against the stored value. If they match, the system knows that the user has entered the same password. The hash itself can’t be used to log in, so exposing the hash isn’t as catastrophic as exposing the password.
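As a rough illustration of that store-the-hash, compare-on-login flow, here is a minimal Python sketch. The function names are made up for this example, and SHA-1 is used only because it is the algorithm discussed in this post; plain, unsalted hashing like this is not something to deploy.

```python
import hashlib
import hmac


def hash_password(password: str) -> str:
    """Map a password to a hex digest with a one-way hash (SHA-1, illustration only)."""
    return hashlib.sha1(password.encode("utf-8")).hexdigest()


def verify_password(attempt: str, stored_hash: str) -> bool:
    """Hash the login attempt and compare it with the stored hash."""
    # compare_digest does the comparison in constant time
    return hmac.compare_digest(hash_password(attempt), stored_hash)


# At signup, only the hash is stored; the password itself is thrown away.
stored = hash_password("swordfish")
```

At login, `verify_password("swordfish", stored)` succeeds and any other input fails, without the system ever having kept the original password.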
Hashing. A step in the right direction but…
In June of last year LinkedIn had a breach that exposed 6.5 million users’ information. LinkedIn came out slightly ahead of Yahoo – at least they hashed the passwords. So was LinkedIn in the clear? No – hashing by itself is not sufficient. Within days 60% of the passwords stolen from LinkedIn were decoded and available on the web.
Even though hashes are one-way, it’s fairly simple to run a bunch of password guesses through the same algorithm and see whether the results match the stolen hashes. For instance, the hash for the password “swordfish” is “4f57181dcaade980555f2ce6755ca425f00658be”. By running guesses for simple passwords through the hash and storing the results in a table, you can do a reverse lookup and crack a whole bunch of passwords at once. The stupid password “password”, for instance, accounts for almost 5% of all passwords. C’mon!
So when breaches like this occur, attackers build what’s commonly called a rainbow table – a reverse lookup table of precomputed hashes for common passwords, dictionary words, and short passwords. (Strictly speaking, a true rainbow table is a more compact variant of such a table, but the idea is the same.) By doing this they were able to extract the majority of the passwords from LinkedIn.
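To see why unsalted hashes fall so quickly, here is a small sketch of the reverse lookup idea in Python. The guess list and the "stolen" hashes are invented for the example; a real attacker would use lists with millions of entries.

```python
import hashlib


def sha1_hex(password: str) -> str:
    return hashlib.sha1(password.encode("utf-8")).hexdigest()


# Hypothetical, tiny list of guesses.
common_guesses = ["password", "123456", "swordfish", "letmein"]

# Precompute hash -> password once; the table works against every stolen hash.
lookup = {sha1_hex(guess): guess for guess in common_guesses}

# Pretend these unsalted hashes came out of a breached database.
stolen_hashes = [
    sha1_hex("swordfish"),
    sha1_hex("password"),
    sha1_hex("x9$Lq!72vTz"),  # not in the guess list, so it survives this attack
]

# One dictionary lookup per stolen hash, no per-user work at all.
cracked = {h: lookup[h] for h in stolen_hashes if h in lookup}
```

The expensive part, building the table, is done once and reused for every account, which is exactly what makes the attack cheap.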
How they could have done better
To make this more difficult, security-conscious companies use a ‘salt’ – a set of random characters that is added to the password before it’s hashed. Doing so completely changes the hash – with password “swordfish” and salt “B6kTHEr^XUBjI&f7?44aM:Ao0]7g$Zd4zNOF8c_nMrO5b)V%;bAE$iR4zgl?+GSz” the hashed value becomes “e4e9b86d8dd476a7c7008c1fe6f816e30902d5bf” – completely different from the hash above. So if each password gets its own salt before it is hashed, each stored hash value has to be broken individually. No rainbow tables, so the entire stolen set of passwords can’t be attacked at once.
A few weeks ago LivingSocial lost 50 million passwords. The good news: they were hashed and salted. Every single one of the 50 million hash values will need to be attacked independently. That’s good. What’s bad (or at least suboptimal) is the choice of hashing algorithm.
Above we talked about how we used a one-way algorithm to transform a password into a hash. LivingSocial used the hashing algorithm SHA1. The problem with SHA1 is it’s too fast – kind of surprising, but true. A determined attacker now has the hash and the salt used with each password and can just throw computing resources at each individual row to crack it. Algorithms meant for password hashing are specifically designed to make this difficult, because they run slowly (and/or take a lot of memory). We make it inconvenient for our adversary by hashing with an algorithm that is slow enough to frustrate them but not so slow that our own password checking suffers TOO much. Several algorithms are designed specifically for this – bcrypt, scrypt, and PBKDF2. Cracking a password hashed with scrypt, for example, can take about 5 orders of magnitude longer!
Evernote, by the way, had the same situation arise using MD5 hashes a few months ago.
What else when it comes to best practices and breaches?
Is that it for best practices? No – even with all of this in place, users with bad passwords are still vulnerable. Dictionary attacks are very effective (90% of passwords are within the top 1000 most common?!). Short passwords and those with silly substitutions (zero for the letter ‘o’) are all crackable in very short time periods. It’s important to have a truly strong password. Users hate this – heck, we all hate this. But the more complex a password is (length, mix of character types), the longer it will take to brute force any compromised hashes. The trouble is enforcing this.
We care a lot about identities and passwords here at JumpCloud. Our core Identity-as-a-Service platform is called Directory-as-a-Service® and serves as the core cloud-based directory service for an organization. Our multi-tenant virtual identity provider securely manages and connects user identities to the devices, applications, and networks that they need to access. And, by the way, we do salt and hash passwords with a very strong algorithm. That’s basic security in our minds and something that we take very seriously. If you would like to chat with us more about our Directory-as-a-Service platform, drop us a note. Or, feel free to give JumpCloud a try.
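One way to start enforcing this is a check at signup time. The sketch below is purely illustrative: the minimum length, the tiny list of common passwords, and the substitution map are all assumptions, and a real deployment would check against a much larger list.

```python
import string

# Hypothetical sample; a real list would contain many thousands of entries.
COMMON_PASSWORDS = {"password", "123456", "qwerty", "letmein", "swordfish"}

# Undo a few "silly substitutions" before checking the common-password list.
LEET = str.maketrans("0134$@", "oieasa")


def password_problems(password: str, min_length: int = 12) -> list[str]:
    """Return a list of reasons to reject the password (empty means acceptable)."""
    problems = []
    if len(password) < min_length:
        problems.append(f"shorter than {min_length} characters")
    classes = sum([
        any(c.islower() for c in password),
        any(c.isupper() for c in password),
        any(c.isdigit() for c in password),
        any(c in string.punctuation for c in password),
    ])
    if classes < 3:
        problems.append("uses fewer than 3 character types")
    if password.lower().translate(LEET) in COMMON_PASSWORDS:
        problems.append("is a common password, possibly with substitutions")
    return problems
```

With these assumed rules, `password_problems("p4ssw0rd")` flags all three problems, while a long password with mixed character types passes.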