Hashing Myths
Because our forum is being polluted with bad information.
Myth #1: When you hash something, you get a unique result that no other file or string or password can have.
Wroonnnnggggg. Let's attack this one with simple logic. Let's say your hash is 32 characters long. Now let's say you hash every possible 33-character string there is. You will have strings with matching hashes, or "collisions". It's simple logic -- there are far more combinations of 33-character strings than there are of 32-character strings, because for every 32-character string that exists, you can tack on every possible character to the end and make a bunch of 33-character strings. So, just making up some example numbers, if there are 90,000 33-character strings and 20,000 32-character strings, some of those 33'ers MUST have the exactly the same 32-character hash. The goal of hashing algorithms is to make collisions as rare as possible, but it is impossible to write to a hashing algorithm that has no collisions.
Myth #2: MD5 is insecure.
Wroonnnnggggg. MD5 is a less sophisticated (and therefore much faster) hashing algorithm than, say, SHA-256, but it is not insecure. An insecure hash would mean that the hash could be reversed -- or rather, that you could take a hash, and, using that and having no other information, produce a string that has the same hash. You cannot do that with MD5. In fact, the closest anyone has gotten to this is changing an existing, large file in a way that doesn't change the hash it already has. No one in the history of humankind has been able to produce a "reverse" MD5. This myth comes from the fact that MD5 is a common target of password-cracking attacks, which leads to our next myth...
Myth #3: MD5 is less secure for password hashing than other algorithms like SHA.
Wroonnnnggggg. There are exactly three attacks that can be used to find out someone's password if you have the hash of that password:
- Hash database lookup. You go to a super-large online database of short words and their hashes, plug in the hash, and see if a short word with that hash has ever been submitted before. A common prevention for this is to salt your passwords.
- Brute force. You run through all the words of a dictionary, hashing each one, to see if the hash matches what you have on hand. If that doesn't work, you just start hashing every possible combination of 5, 6, 7, or 8-character words to find a match. This takes for effing ever and rarely produces a result, and can easily be prevented with a salt.
- Rainbow tables. This is a method of hashing and re-hashing the data you have to find similarities in the hashes of other words, which can eventually lead to finding the password. Salts have minimal effect on these attacks.
Myth #4: Sophisticated, super-long hashes like SHA-256 are harder to attack because they're longer.
Wroonnnnggggg. Let's face it: the biggest threat to hash cracking attacks is the rainbow tables method. It is by far the most efficient tradeoff between processing time and storage space, and often times can find a password in under a minute if you have enough chains. But rainbow tables are not magical. That's the impression most people have because it's easier to believe that than to learn how they work, but seriously, the rainbow table attack isn't hard to understand. I suggest reading up on it if you have a spare 15 minutes.
So if you know how rainbow tables work, you know that the biggest weakness they have is hash collisions -- which, you'll remember from above, means more than one string that produces the same hash. So password hashing security is a tradeoff: You want an algorithm that has reasonably few collisions so that no two passwords are likely to generate the same hash, but you also want one that produces enough collisions to potentially send a rainbow table into an endless loop that can't be cracked. SHA-256 is horrible for this, because it's too good a hashing algorithm. You're extremely unlikely to get any collisions at all when using SHA-256 for a password (even a salted one), so using that makes your passwords -- salted or not -- easier to crack. SHA-256 is awesome for hashing large files. Not so awesome for passwords.
At the same time, though, you don't want to use a measly 32 or 48-bit hash, because collisions are extremely likely with those. You want to find a happy medium, which is around 128 bits. What's a fast 128-bit hashing algorithm? MD5.
Myth #5: SHA-1 is still better to use than MD5 because MD5 is more likely to be cracked.
Wroonnnnggggg. Again, you're being lulled into a false sense of security. Not only is SHA-1 160 bits (so you'll still get collisions, but fewer than MD5), hash databases and rainbow tables are just as easy and well-developed for SHA-1 as they are for MD5. This myth is one that is spread far and wide among developers who have never taken the time to research the facts. The fact that SHA produces a longer hash has little-to-no impact on the ability to store them in a database or produce rainbow chains with them. All it takes is a tiny bit more storage space, and that's true for any hashing algorithm in the world, whether it's 8 bits or 8 thousand bits. The length only helps for reducing collisions, which, past around 128 bits, doesn't help at all when you're hashing passwords rather than large files.
Myth #6: Hashing with multiple algorithms is more secure!
Wroonnnnggggg. No. Just no. Come on, now you're just pulling stuff out of your ass. I've seen this before: sha1(md5("Password")). That is ridiculous. You're feeding 128 bits into 160 bits, which is an easy easy crack. You can't make hashes more-hashy. You might add one more step to the process for someone cracking it, but it's going to end up with the same result. Don't guess at what's more secure, know what's more secure.
Myth #7: Using a global salt AND a user salt is more secure than using just a user salt.
There was a thread recently in which folks supported the idea of using a global salt in addition to a user salt, and I dismissed it as pointless. I was met with some fierce opposition, claiming that it makes passwords more secure if a hacker should be able to steal a copy of your database but not your source code. This is true, however the amount of extra security this offers is minimal at best. Here's why:
Out of our list of attacks on hashes that I wrote out earlier, there is only one attack that's made more difficult to crack by using a global salt, and that's the brute force attack -- the one that's already almost impossible to use successfully to begin with. Adding a global salt only makes this more-impossible-than-nearly-impossible. So I'll admit there is SOME benefit, but with the most common and most successful attack being rainbow tables, folks would be crazy to try a brute force attack anyway. They'd just rainbow table it, and the global salt won't make your password any more secure than a properly long user salt would. And hey, let's face it: If someone's able to steal a copy of your database, chances are that they can nab a copy of your source without too much trouble too.
If you're looking to protect against the case where someone gets a copy of your database, a far, far, far more effective solution would be to use symmetric encryption on your hashes. Choose a fast, simple symmetric encryption algorithm (RC4, Blowfish... your call), generate a key and store it somewhere in a configuration file. Use that to encrypt all your salted password hashes in the database. Then, when you pull them from the database, just decrypt them with that key when you read them. Now you have a hash that can't even be attacked with a rainbow table if someone happens to gain unauthorized access to either your database or your account on the server, and this is still effective if you distribute your software to the public. MUCH more effective than using a global salt.
So what are the best practices for password hashing?
The only two, good options for password hashing are MD5 and SHA-1, and you should let the language you're programming in dictate which one you use. For PHP (and most languages), MD5 is faster than SHA-1, so it's the better option. The key to making it secure is using a salt of the appropriate length. The sweet spot for securely storing a password that isn't susceptible to dictionary or brute force attacks AND is relatively safe from rainbow table attacks is to feed twice the number of bits of a hash into a new hash. So, for example, using MD5, you'd want to hash two MD5s. You can generate a 128-bit salt for this (which, for a lot of users, can take up some significant storage space), or you can generate a salt that's just a handful of characters long and hash that. My favorite method is this:
$finalHash = md5(md5($salt) . md5($password))
The benefit of that is that you're feeding twice the data of an MD5 into a new hash (hitting the collision sweet spot), it's ultra ultra portable because you can do this in any language including raw SQL, three MD5s is still a relatively fast process, and, for web use, you can have javascript hash the password before it gets sent to your server and you can still generate the final hash with that.
Whether you want to put the symmetric encryption layer on top of that is entirely up to you
So please, for the love of all that is holy, stop teaching these myths to other people! The world's programmers thank you :D






Cartoon Clouds
Mountains
Sunrise
Clouds
Green Clouds
None





















Help