Now that we’re all on the same page about how to create secure passwords for ourselves, I’m going to go into further detail about how companies store passwords on their websites, as well as more techniques hackers use to try to crack them. To start, I’m going to review what I said about hashes in the previous post.
Hashes and MD5
In a sentence, hashes are one-way functions. MD5 was at one point the most popular hash. It takes an input string (or file) and outputs a hash that is 32 hexadecimal characters long (128 bits). Even if the input string is two letters long, it gives a 32-character hash. The first question you may ask is “how is the MD5 function irreversible?” This is a complicated question, and the answer really depends on the algorithm. As a simple example, let’s pretend that every letter has a corresponding number: A-1, B-2, C-3, and so on, with spaces counting as 0. Now let’s just pretend that the MD5 hash takes your input string, say “I Like Bananas”, and converts it to numbers: {9, 0, 12, 9, 11, 5, 0, 2, 1, 14, 1, 14, 1, 19}. Then it just adds them all together! So the MD5 in our example would be 98! Now how many combinations of letters are there that add up to 98? Lots. Given only the total, there is no way to tell which one you started with. This is a toy example of why a hash can’t be run backwards, not of how MD5 actually works. In reality, the algorithm is much more complicated (you can read about it in detail here), but you get the picture.
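If you want to play with the letter-sum idea yourself, here is a minimal Python sketch. The name `toy_hash` is made up for this post and the function is not MD5 in any sense; the real MD5 is called through Python’s standard `hashlib` module purely for comparison.

```python
import hashlib


def toy_hash(text: str) -> int:
    """Toy 'hash' from the example: A=1, B=2, ..., Z=26, everything else counts as 0."""
    total = 0
    for ch in text.upper():
        if "A" <= ch <= "Z":
            total += ord(ch) - ord("A") + 1
    return total


# Two different inputs built from the same letters give the same total,
# so the total alone cannot tell you which one was typed in.
print(toy_hash("I Like Bananas"))
print(toy_hash("Bananas Like I"))

# The real MD5 always returns 32 hexadecimal characters, even for a 2-letter input.
digest = hashlib.md5(b"hi").hexdigest()
print(digest, len(digest))
```

Running it shows both phrases landing on the same number, while the real MD5 digest stays 32 characters long no matter how short the input is.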
You may ask if there are two input strings that give the same MD5 hash. The answer is yes! These are called collisions. The proof of this is easy: there are a lot more possible inputs than there are outputs, since the outputs are limited to 32 characters and inputs can be as long as one wants (the idea behind this is called the pigeonhole principle). Actually finding a collision is a lot harder, and for a long time they were few and far between, so collisions weren’t really a problem for most applications.
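The pigeonhole argument is easier to believe when you watch it happen. Finding a collision on a full MD5 hash is out of reach for a quick script, so the sketch below cheats on purpose: it keeps only the first 4 hexadecimal characters of each hash, which leaves just 65,536 possible outputs. The helper names `short_md5` and `find_collision` and the `password0`, `password1`, ... inputs are my own inventions for illustration.

```python
import hashlib
from itertools import count


def short_md5(text: str, hex_chars: int = 4) -> str:
    """First few hex characters of the MD5 hash (deliberately tiny output space)."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()[:hex_chars]


def find_collision(hex_chars: int = 4):
    """Hash password0, password1, ... until two inputs share the same short hash."""
    seen = {}  # short hash -> first input that produced it
    for i in count():
        candidate = f"password{i}"
        tag = short_md5(candidate, hex_chars)
        if tag in seen:
            return seen[tag], candidate, tag
        seen[tag] = candidate


first, second, tag = find_collision()
print(f"{first!r} and {second!r} both shorten to {tag}")
```

With only 65,536 possible outputs, the loop is guaranteed to stop by the 65,537th input at the latest, and in practice it stops far sooner. The same logic applies to the full 32-character hash; there are just vastly more pigeonholes.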
The problem with MD5 is directly related to how fast the algorithm works. Modern graphics cards can calculate billions of MD5 hashes per second. This is a problem, but I’ll come back to that. First let’s review what I said about MD5 hashes and passwords. Passwords should never be stored directly by a company on its servers. The reason is that if the server is compromised, then so is everyone’s password. Instead, the password hash is stored, so that if someone breaks into the website, all they have access to is the hashes, not the passwords themselves.
Here comes the problem with MD5. A list of hashes may be useless to the common user, but then again the common user is not one who could break into a website. Advanced users will start generating all the possible MD5 hashes by starting with “a”, hashing that, and continuing until they find that “bananas”, when hashed, matches one of the hashes in the database. They then know that “bananas” is the password for “John Davis”. This is the same brute forcing I talked about in the last post, only applied to a whole list of hashes at once.
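Here is roughly what that attack looks like as a Python sketch. Everything in it is an assumption made for illustration: the user names, the pretend stolen database, and the function names. The passwords are kept to three letters because plain Python is enormously slower than a graphics card; a seven-letter word like “bananas” would take far too long this way.

```python
import hashlib
from itertools import product
from string import ascii_lowercase


def md5_hex(text: str) -> str:
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def brute_force(stolen: dict, max_length: int) -> dict:
    """Try 'a', 'b', ..., 'zz', ... and match each hash against the whole stolen list."""
    wanted = set(stolen.values())
    found = {}  # hash -> password that produced it
    for length in range(1, max_length + 1):
        for letters in product(ascii_lowercase, repeat=length):
            guess = "".join(letters)
            digest = md5_hex(guess)
            if digest in wanted:
                found[digest] = guess
    return {user: found[h] for user, h in stolen.items() if h in found}


# Hypothetical stolen database: user name -> MD5 hash of the password.
stolen_hashes = {
    "john": md5_hex("cat"),
    "jane": md5_hex("dog"),
    "jim": md5_hex("bananas"),  # too long for this tiny search, so it survives
}

cracked = brute_force(stolen_hashes, max_length=3)
print(cracked)
```

Notice that each guess is hashed once and then checked against every user at the same time, which is why a big list of unsalted hashes is so much easier to attack than a single one.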
Rainbow Tables
This is a dire situation already, but then somebody came along and created rainbow tables. Rainbow tables are huge files that sometimes can barely fit on a computer’s hard drive, but they save a would-be hacker a lot of time. Rainbow tables are, in essence, pre-calculated MD5 hashes. This means that, even though there is no way to reverse the MD5 hash to recover the original password, the hacker can look up the hash in a rainbow table and discover what the original password was, as long as it is one of the many possibilities the table covers. At this point, it seems like no one stands a chance if a server gets compromised.
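The idea of precomputing can be sketched in a few lines of Python. One caveat: a real rainbow table uses chains of hashes to squeeze the data into less space, and that is not what this is. The sketch below is a plain lookup dictionary, the simplest form of the same “hash everything once, look it up later” trick. The word list and names are made up.

```python
import hashlib
from typing import Optional


def md5_hex(text: str) -> str:
    return hashlib.md5(text.encode("utf-8")).hexdigest()


# Hypothetical word list; a real table would cover an enormous number of candidates.
COMMON_PASSWORDS = ["password", "123456", "bananas", "letmein", "qwerty"]

# Built once, then reused for every stolen hash the attacker ever sees.
LOOKUP = {md5_hex(word): word for word in COMMON_PASSWORDS}


def look_up(stolen_hash: str) -> Optional[str]:
    """Return the original password if the table covers it, otherwise None."""
    return LOOKUP.get(stolen_hash)


print(look_up(md5_hex("bananas")))       # covered by the table
print(look_up(md5_hex("bananas12345")))  # not in the table, so nothing comes back
```

The expensive hashing happens once, up front. After that, each lookup is practically instant, which is exactly what makes precomputed tables so attractive to an attacker.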
Salts
Not to be outdone, computer security experts devised a way to get around the idea of a rainbow table. The rainbow table tries many of the combinations and compiles a dictionary of MD5 hashes and passwords. Clearly, however, the table couldn’t contain all the passwords that a hash could represent: there are simply too many passwords. This idea is the basis for a salt.
Salts are randomly generated, user-specific strings that are added to the end of a password before it is hashed. Let’s dig deeper into this. Let’s say I create an account on a website that uses salts. I want my password to be “bananas”. When I create the account, the website creates a random salt, say “12345”, and adds it to the end of my password, so that the entire password becomes “bananas12345”. It stores the salt for me (so that it knows to add “12345” when I log in, before authenticating me), along with the hash of “bananas12345”. The odds are much lower that the rainbow table contains “bananas12345” than they are of it just containing “bananas”.
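The sign-up and login steps just described could look something like the following Python sketch. The function names are my own, the salt comes from the standard `secrets` module, and MD5 is used only to stay consistent with the rest of this post; it is an illustration of the salting flow, not a recommendation for what to run in production.

```python
import hashlib
import hmac
import secrets


def hash_with_salt(password: str, salt: str) -> str:
    """Append the salt to the password, then hash the combined string."""
    return hashlib.md5((password + salt).encode("utf-8")).hexdigest()


def register(password: str):
    """Sign-up: pick a fresh random salt and keep (salt, hash), never the password."""
    salt = secrets.token_hex(8)  # 16 random hex characters
    return salt, hash_with_salt(password, salt)


def check_login(attempt: str, salt: str, stored_hash: str) -> bool:
    """Login: re-add the stored salt to the attempt and compare the hashes."""
    return hmac.compare_digest(hash_with_salt(attempt, salt), stored_hash)


salt, stored = register("bananas")
print(salt, stored)
print(check_login("bananas", salt, stored))  # right password
print(check_login("banana", salt, stored))   # wrong password
```

The website only ever keeps the salt and the hash of password-plus-salt. The salt does not need to be secret; its job is just to make sure the stored hash is not the plain hash of “bananas” that a precomputed table would contain.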
There is a reason that each user has a different salt. If every user shared the salt “12345”, a new rainbow table could be generated from the original one, with all the same words but with “12345” added to each. Once this table is generated, the entire database of users would be compromised and their passwords cracked. If each user has a different salt, however, rainbow tables could not be generated. Well, they could, but you would be generating a rainbow table for each user, which defeats the purpose of a rainbow table.
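To see why per-user salts matter, here is a small Python sketch with two hypothetical users who happen to pick the same password. The attacker builds a table for one salt and tries it on both accounts. The names, salts, and word list are all made up, and the “table” is again a plain dictionary rather than a true rainbow table.

```python
import hashlib


def salted_md5(password: str, salt: str) -> str:
    return hashlib.md5((password + salt).encode("utf-8")).hexdigest()


# Hypothetical attacker word list.
WORDS = ["password", "bananas", "qwerty"]


def table_for_salt(salt: str) -> dict:
    """Precompute hash -> word for ONE specific salt."""
    return {salted_md5(word, salt): word for word in WORDS}


# Two users with the same password but different salts: user -> (salt, stored hash).
users = {
    "alice": ("12345", salted_md5("bananas", "12345")),
    "bob": ("98765", salted_md5("bananas", "98765")),
}

table = table_for_salt("12345")  # only useful against salt "12345"
for name, (salt, stored) in users.items():
    print(name, table.get(stored))
```

The table cracks alice’s account and comes back empty for bob, even though both chose “bananas”. To get bob, the attacker has to redo all the hashing with bob’s salt, and again for every other user, so the precomputation no longer saves anything.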
I hope this all makes sense to you. I think it’s important to understand how passwords are secured on the company’s website itself, and how it is equally their responsibility to make sure our data is secured. In fact, a law may be made in the United States that makes it illegal to store passwords in plaintext (that is, without hashing and/or salting). There are a lot more things I could have gone into detail on, including some of the more secure hashes (such as SHA-256 and, even more secure, bcrypt), but this post was getting long enough as it is. Thanks for reading!
