
How does password salt help against a rainbow table attack?

I'm having some trouble understanding the purpose of a salt to a password. It's my understanding that the primary use is to hamper a rainbow table attack. However, the methods I've seen to implement this don't seem to really make the problem harder.

I've seen many tutorials suggesting that the salt be used as the following:

$hash =  md5($salt.$password)

The reasoning being that the hash now maps not to the original password, but a combination of the password and the salt. But say $salt=foo and $password=bar and $hash=3858f62230ac3c915f300c664312c63f. Now somebody with a rainbow table could reverse the hash and come up with the input "foobar". They could then try all combinations of passwords (f, fo, foo, ... oobar, obar, bar, ar, r). It might take a few more milliseconds to get the password, but not much else.

The other use I've seen is on my linux system. In the /etc/shadow the hashed passwords are actually stored with the salt. For example, a salt of "foo" and password of "bar" would hash to this: $1$foo$te5SBM.7C25fFDu6bIRbX1. If a hacker somehow were able to get his hands on this file, I don't see what purpose the salt serves, since the reverse hash of te5SBM.7C25fFDu6bIRbX is known to contain "foo".

Thanks for any light anybody can shed on this.

EDIT: Thanks for the help. To summarize what I understand, the salt makes the hashed password more complex, thus making it much less likely to exist in a precomputed rainbow table. What I misunderstood before was that I was assuming a rainbow table existed for ALL hashes.

Also, updated here - use of md5 hashing is no longer best practice. stackoverflow.com/questions/12724935/salt-and-passwords
Thanks for the edit. I had the same doubt, which is now clarified. So the point of the salt really is to make it highly unlikely for a rainbow table to contain the hash of the adulterated (salted) password in the first place. :D

Stef Heylen

A public salt will not make dictionary attacks harder when cracking a single password. As you've pointed out, the attacker has access to both the hashed password and the salt, so when running the dictionary attack, she can simply use the known salt when attempting to crack the password.

A public salt does two things: makes it more time-consuming to crack a large list of passwords, and makes it infeasible to use a rainbow table.

To understand the first one, imagine a single password file that contains hundreds of usernames and passwords. Without a salt, I could compute "md5(attempt[0])", and then scan through the file to see if that hash shows up anywhere. If salts are present, then I have to compute "md5(salt[a] . attempt[0])", compare against entry A, then "md5(salt[b] . attempt[0])", compare against entry B, etc. Now I have n times as much work to do, where n is the number of usernames and passwords contained in the file.
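The "n times as much work" point can be made concrete by counting hash computations. This is only a rough sketch: the function names, the two-entry "password file" and the guess list are made up for illustration, and MD5 is used solely because the question uses it.

```python
import hashlib


def md5_hex(text: str) -> str:
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def crack_unsalted(stored_hashes, guesses):
    """Hash each guess once and check it against every entry at the same time."""
    targets = set(stored_hashes)
    found, computed = {}, 0
    for guess in guesses:
        digest = md5_hex(guess)
        computed += 1
        if digest in targets:
            found[digest] = guess
    return found, computed


def crack_salted(entries, guesses):
    """entries is a list of (salt, hash); each guess must be re-hashed per entry."""
    found, computed = {}, 0
    for guess in guesses:
        for salt, stored in entries:
            computed += 1  # no early exit, to keep the counts easy to compare
            if md5_hex(salt + guess) == stored:
                found[salt] = guess
    return found, computed


guesses = ["hello", "foobar", "qwerty"]
unsalted_file = [md5_hex("qwerty"), md5_hex("hello")]
salted_file = [
    ("jX95psDZ", md5_hex("jX95psDZ" + "hello")),
    ("dZVUABJt", md5_hex("dZVUABJt" + "qwerty")),
]

found_u, work_u = crack_unsalted(unsalted_file, guesses)
found_s, work_s = crack_salted(salted_file, guesses)
print(work_u, work_s)  # 3 hashes versus 3 * 2 = 6 hashes
```

Both files are cracked here because the passwords are weak; the salt only multiplies the attacker's work by the number of entries.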

To understand the second one, you have to understand what a rainbow table is. A rainbow table is a large list of pre-computed hashes for commonly-used passwords. Imagine again the password file without salts. All I have to do is go through each line of the file, pull out the hashed password, and look it up in the rainbow table. I never have to compute a single hash. If the look-up is considerably faster than the hash function (which it probably is), this will considerably speed up cracking the file.

But if the password file is salted, then the rainbow table would have to contain "salt . password" pre-hashed. If the salt is sufficiently random, this is very unlikely. I'll probably have things like "hello" and "foobar" and "qwerty" in my list of commonly-used, pre-hashed passwords (the rainbow table), but I'm not going to have things like "jX95psDZhello" or "LPgB0sdgxfoobar" or "dZVUABJtqwerty" pre-computed. That would make the rainbow table prohibitively large.
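A minimal sketch of that lookup, assuming the precomputed list is a plain dictionary from hash to password (a simplification, as real rainbow tables are organised differently). The names `TABLE` and `lookup` are made up for illustration.

```python
import hashlib


def md5_hex(text: str) -> str:
    return hashlib.md5(text.encode("utf-8")).hexdigest()


# Precomputed once by the attacker, for commonly used passwords only.
COMMON_PASSWORDS = ["hello", "foobar", "qwerty"]
TABLE = {md5_hex(p): p for p in COMMON_PASSWORDS}


def lookup(stored_hash: str):
    """Return the password if its hash was precomputed, else None."""
    return TABLE.get(stored_hash)


unsalted_hit = lookup(md5_hex("foobar"))             # found without hashing anything new
salted_hit = lookup(md5_hex("LPgB0sdgx" + "foobar"))  # the salted hash is not in the table
print(unsalted_hit, salted_hit)
```

To get a hit on the salted entry, the table would need an entry for that exact salt and password combination, which is what makes it prohibitively large.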

So, the salt reduces the attacker back to one-computation-per-row-per-attempt, which, when coupled with a sufficiently long, sufficiently random password, is (generally speaking) uncrackable.


I'm not sure what I said in my answer to imply that they were?
erickson, I think the edit was confusing--I don't think most people consider a rainbow table attack to be a kind of dictionary attack. Let me know if there's something specific you think is confusing in my answer, and I'll try to correct it.
I wish I could give more than one upvote! Especially for the first paragraph. That one sums it all up, IMHO.
I know this is old, but your description of rainbow tables is incorrect. You're describing hash tables instead. For a rainbow table see security.stackexchange.com/questions/379/…. A hash table has 1 to 1 mapping of passwords to hashes (as you describe), but rainbow tables require a reducing function which transforms a hash back to plaintext, to then be rehashed thousands of times, storing only the initial plaintext and final hash. Searching is computationally longer than hash tables, but 'captures' many plaintexts per hash.
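For what it's worth, the chain idea from the comment above can be sketched roughly like this. Everything here is an assumption made for illustration: a toy password space of 4 lowercase letters, a made-up reduction function `reduce_digest`, and a chain length of 1000. Only the chain's start and end are stored; the lookup side is left out.

```python
import hashlib

ALPHABET = "abcdefghijklmnopqrstuvwxyz"
PW_LEN = 4          # toy password space: 4 lowercase letters
CHAIN_LEN = 1000    # hash/reduce steps per chain
SPACE = len(ALPHABET) ** PW_LEN


def md5_digest(password: str) -> bytes:
    return hashlib.md5(password.encode("ascii")).digest()


def reduce_digest(digest: bytes, step: int) -> str:
    """Map a digest back into the password space (this is NOT inverting the hash).

    Mixing in the step number gives each column of the table its own reduction.
    """
    n = (int.from_bytes(digest, "big") + step) % SPACE
    chars = []
    for _ in range(PW_LEN):
        n, r = divmod(n, len(ALPHABET))
        chars.append(ALPHABET[r])
    return "".join(chars)


def build_chain(start: str):
    """Walk hash -> reduce -> hash ... and keep only the two endpoints."""
    password = start
    digest = md5_digest(password)
    for step in range(CHAIN_LEN):
        password = reduce_digest(digest, step)
        digest = md5_digest(password)
    return start, digest.hex()


start, end_hash = build_chain("abcd")
print(start, end_hash)
```

One stored pair stands in for about a thousand plaintexts, which is where the space saving comes from, at the price of recomputing parts of the chain during a search.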
This answer misses the fact that not using a salt (bound to the creation of a password hash for a specific user) also exposes duplicate passwords, even over multiple tables storing these passwords. At the very minimum you would be able to identify passwords reused by a person, but even worse you would also identify passwords used by different persons, over different databases.
Community

The other answers don't seem to address your misunderstandings of the topic, so here goes:

Two different uses of salt

I've seen many tutorials suggesting that the salt be used as the following: $hash = md5($salt.$password) [...] The other use I've seen is on my linux system. In the /etc/shadow the hashed passwords are actually stored with the salt.

You always have to store the salt with the password, because in order to validate what the user entered against your password database, you have to combine the input with the salt, hash it and compare it to the stored hash.

Security of the hash

Now somebody with a rainbow table could reverse the hash and come up with the input "foobar". [...] since the reverse hash of te5SBM.7C25fFDu6bIRbX is known to contain "foo".

It is not possible to reverse the hash as such (in theory, at least). The hash of "foo" and the hash of "saltfoo" have nothing in common. Changing even one bit in the input of a cryptographic hash function should completely change the output.
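You can check this yourself with a few lines. The helper `differing_bits` is just something made up for this illustration; it counts how many of MD5's 128 output bits differ between two inputs.

```python
import hashlib


def md5_int(text: str) -> int:
    return int.from_bytes(hashlib.md5(text.encode("utf-8")).digest(), "big")


def differing_bits(a: str, b: str) -> int:
    """Number of output bits (out of 128) that differ between md5(a) and md5(b)."""
    return bin(md5_int(a) ^ md5_int(b)).count("1")


plain = hashlib.md5(b"foo").hexdigest()
salted = hashlib.md5(b"saltfoo").hexdigest()

print(plain)
print(salted)
print(differing_bits("foo", "saltfoo"), "of 128 bits differ")
```

For a well-behaved hash function you would expect roughly half the bits to differ, and nothing in the second digest tells you that its input ended in "foo".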

This means you cannot build a rainbow table with the common passwords and then later "update" it with some salt. You have to take the salt into account from the beginning.

This is the whole reason for why you need a rainbow table in the first place. Because you cannot get to the password from the hash, you precompute all the hashes of the most likely used passwords and then compare your hashes with their hashes.

Quality of the salt

But say $salt=foo

"foo" would be an extremely poor choice of salt. Normally you would use a random value, encoded in ASCII.

Also, each password has its own salt, different (hopefully) from all other salts on the system. This means that the attacker has to attack each password individually instead of hoping that one of the hashes matches one of the values in her database.

The attack

If a hacker somehow were able to get his hands on this file, I don't see what purpose the salt serves,

A rainbow table attack always needs /etc/passwd (or whatever password database is used), or else how would you compare the hashes in the rainbow table to the hashes of the actual passwords?

As for the purpose: let's say the attacker wants to build a rainbow table for 100,000 commonly used English words and typical passwords (think "secret"). Without salt she would have to precompute 100,000 hashes. Even with the traditional UNIX salt of 2 characters (each is one of 64 choices: [a-zA-Z0-9./]) she would have to compute and store 100,000 × 4,096 = 409,600,000 hashes... quite an improvement.
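The arithmetic is easy to reproduce. The variable names are only for illustration; the 64-character alphabet and the 100,000-word list are the ones described above.

```python
import string

# Traditional UNIX crypt salt alphabet: [a-zA-Z0-9./]
SALT_ALPHABET = string.ascii_letters + string.digits + "./"
SALT_LENGTH = 2
WORDS = 100_000

salt_count = len(SALT_ALPHABET) ** SALT_LENGTH   # possible 2-character salts
total_hashes = WORDS * salt_count                # one table entry per (salt, word)

print(len(SALT_ALPHABET), salt_count, f"{total_hashes:,}")
# prints: 64 4096 409,600,000
```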


Really nice answer. It helped me understand things so much better. +1
If a hacker had access to the salt and how it was used in the hashing function, couldn't they just use that to generate a table of salted hashes and compare those hashes with the rainbow table?
@Jonny there is no "the salt". the whole point is that the salt is different for every password entry.
Carl Seleborg

The idea with the salt is to make it much harder to guess with brute-force than a normal character-based password. Rainbow tables are often built with a special character set in mind, and don't always include all possible combinations (though they can).

So a good salt value would be a random 128-bit or longer integer. This is what makes rainbow-table attacks fail. By using a different salt value for each stored password, you also ensure that a rainbow table built for one particular salt value (as could be the case if you're a popular system with a single salt value) does not give you access to all passwords at once.


+1: Salt can be a portion of the hex digest of some random string built by the random number generator. Each bit is random.
"Rainbow tables are one form of dictionary attack that gives up some speed to save storage space." - its actually the opposite, a good rainbow table can take over GB to store, in order to save time re-hashing all possible values.
Agreed - @erickson, I think your edit is wrong there. A rainbow table requires huge amounts of storage, but makes it fast to get the message behind the hash.
Well, you are both right. Compared to a standard dictionary attack, rainbow tables sacrifice speed in order to save storage space. On the other hand, compared to a brute force attack, rainbow tables use (lots of) space to gain speed. Today, rainbow tables are almost synonymous with dictionary ...
... attacks, but you don't need rainbow tables for dictionary attacks.
Adam Liss

Yet another great question, with many very thoughtful answers -- +1 to SO!

One small point that I haven't seen mentioned explicitly is that, by adding a random salt to each password, you're virtually guaranteeing that two users who happened to choose the same password will produce different hashes.

Why is this important?

Imagine the password database at a large software company in the northwest US. Suppose it contains 30,000 entries, of which 500 have the password bluescreen. Suppose further that a hacker manages to obtain this password, say by reading it in an email from the user to the IT department. If the passwords are unsalted, the hacker can find the hashed value in the database, then simply pattern-match it to gain access to the other 499 accounts.

Salting the passwords ensures that each of the 500 accounts has a unique (salt+password), generating a different hash for each of them, and thereby reducing the breach to a single account. And let's hope, against all probability, that any user naive enough to write a plaintext password in an email message doesn't have access to the undocumented API for the next OS.
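Here is a small sketch of that effect. The user names are hypothetical, SHA-256 stands in for whatever hash the system uses, and the salt is 16 random bytes from Python's `secrets` module.

```python
import hashlib
import secrets


def hash_unsalted(password: str) -> str:
    return hashlib.sha256(password.encode("utf-8")).hexdigest()


def hash_salted(password: str):
    """Return (salt, hash) with a fresh random salt for every call."""
    salt = secrets.token_hex(16)
    digest = hashlib.sha256((salt + password).encode("utf-8")).hexdigest()
    return salt, digest


users = ["alice", "bob", "carol"]  # hypothetical accounts sharing one password

unsalted_db = {user: hash_unsalted("bluescreen") for user in users}
salted_db = {user: hash_salted("bluescreen") for user in users}

distinct_unsalted = len(set(unsalted_db.values()))
distinct_salted = len({digest for _salt, digest in salted_db.values()})
print(distinct_unsalted, distinct_salted)  # one shared hash versus one hash per user
```

With the unsalted column, knowing one user's password immediately identifies everyone else who picked it; with the salted column, the stored values give no such hint.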


Same for two users that choose a different password, and it is probable that they have the same hashed password stored in the db. (Useless...I know)
MytyMyky

I was searching for a good method to apply salts and found this excellent article with sample code:

http://crackstation.net/hashing-security.htm

The author recommends using random salts per user, so that gaining access to a salt won't render the entire list of hashes as easy to crack.

To store a password:
1. Generate a long random salt using a CSPRNG.
2. Prepend the salt to the password and hash it with a standard cryptographic hash function such as SHA256.
3. Save both the salt and the hash in the user's database record.

To validate a password:
1. Retrieve the user's salt and hash from the database.
2. Prepend the salt to the given password and hash it using the same hash function.
3. Compare the hash of the given password with the hash from the database. If they match, the password is correct. Otherwise, the password is incorrect.
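A minimal sketch of those steps, not the article's own code: the function names and the 16-byte salt length are my assumptions, and `hmac.compare_digest` is an addition of mine so the comparison takes constant time.

```python
import hashlib
import hmac
import secrets


def store_password(password: str):
    """Return (salt_hex, hash_hex) to save in the user's database record."""
    salt = secrets.token_bytes(16)  # random salt from a CSPRNG (length is an assumption)
    digest = hashlib.sha256(salt + password.encode("utf-8")).hexdigest()
    return salt.hex(), digest


def validate_password(password: str, salt_hex: str, stored_hash: str) -> bool:
    """Re-hash the given password with the stored salt and compare."""
    salt = bytes.fromhex(salt_hex)
    candidate = hashlib.sha256(salt + password.encode("utf-8")).hexdigest()
    return hmac.compare_digest(candidate, stored_hash)


salt_hex, stored_hash = store_password("correct horse")
print(validate_password("correct horse", salt_hex, stored_hash))  # True
print(validate_password("wrong guess", salt_hex, stored_hash))    # False
```

A single fast hash like this is only the baseline; a deliberately slow function such as PBKDF2, bcrypt or scrypt is the better choice for real systems.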


Hashcat can try almost 17 billion salted SHA256 hashes per second using a single PC. The author of the linked article talks about this under the heading "Making Password Cracking Harder: Slow Hash Functions". scrypt, bcrypt, and PBKDF2 are good choices and more than worth the extra CPU cycles on the server IMHO. Argon2 is currently the state of the art, but not as battle-tested as the others.
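As a rough illustration of the slow-hash approach, here is a sketch using PBKDF2 from Python's standard library (`hashlib.pbkdf2_hmac`). The function names are made up, and the iteration count is an assumption that you would tune to your own hardware and current guidance.

```python
import hashlib
import hmac
import secrets

ITERATIONS = 600_000  # assumed work factor, adjust for your own hardware


def hash_password(password: str, iterations: int = ITERATIONS):
    """Return (salt, iterations, derived_key); store all three with the user."""
    salt = secrets.token_bytes(16)
    key = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return salt, iterations, key


def verify_password(password: str, salt: bytes, iterations: int, expected: bytes) -> bool:
    """Repeat the derivation with the stored salt and iteration count."""
    key = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return hmac.compare_digest(key, expected)
```

Calling `hash_password("hunter2")` and then `verify_password("hunter2", salt, iterations, key)` with the returned values should give `True`. Every guess now costs the attacker the full iteration count, on top of the per-user salt.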
quamrana

The reason a salt can make a rainbow-table attack fail is that for n-bits of salt, the rainbow table has to be 2^n times larger than the table size without the salt.

Your example of using 'foo' as a salt could make the rainbow-table 16 million times larger.

Given Carl's example of a 128-bit salt, this makes the table 2^128 times larger - now that's big - or put another way, how long before someone has portable storage that big?
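The numbers can be checked in a couple of lines. This assumes the salt 'foo' is stored as 3 ASCII bytes (24 bits) and that every bit pattern counts as a possible salt; the function name is just for illustration.

```python
def table_growth(salt_bits: int) -> int:
    """Factor by which a precomputed table grows for a salt of the given size."""
    return 2 ** salt_bits


foo_bits = len("foo".encode("ascii")) * 8  # 3 bytes = 24 bits

print(foo_bits, f"{table_growth(foo_bits):,}")   # 24 16,777,216
print(f"{table_growth(128):.3e}")                # roughly 3.4e38
```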


Even if you use a single electron to store a bit, it will be quite a while before anyone produces portable storage with that capacity... unless you consider a solar system moving through the galaxy portable.
Wedge

Most methods of breaking hash-based encryption rely on brute force attacks. A rainbow attack is essentially a more efficient dictionary attack: it is designed to use the low cost of digital storage to create a map from a substantial subset of possible passwords to their hashes, and so facilitate the reverse mapping. This sort of attack works because many passwords tend to be either fairly short or use one of a few word-based patterns.

Such attacks are ineffective when passwords contain many more characters and do not conform to common word-based formats. A user with a strong password to start with won't be vulnerable to this style of attack. Unfortunately, many people do not pick good passwords. But there's a compromise: you can improve a user's password by adding random junk to it. So now, instead of "hunter2", their password could effectively become "hunter2908!fld2R75{R7/;508PEzoz^U430", which is a much stronger password. However, because you now have to store this additional password component, the effectiveness of the stronger composite password is reduced. As it turns out, there's still a net benefit to such a scheme, since each password, even a weak one, is no longer vulnerable to the same pre-computed hash / rainbow table. Instead, each password hash entry is vulnerable only to a unique hash table.

Say you have a site with weak password strength requirements. If you use no password salt at all, your hashes are vulnerable to pre-computed hash tables: someone with access to your hashes would have access to the passwords of a large percentage of your users (however many used vulnerable passwords, which would be a substantial percentage). If you use a constant password salt, pre-computed hash tables are no longer valuable, so someone would have to spend the time to compute a custom hash table for that salt. They could do so incrementally, though, computing tables that cover ever greater portions of the problem space. The most vulnerable passwords (e.g. simple word-based passwords, very short alphanumeric passwords) would be cracked in hours or days; less vulnerable passwords would be cracked after a few weeks or months. As time goes on, an attacker would gain access to passwords for an ever-growing percentage of your users. If you use a unique salt for every password, it would take days or months to gain access to each one of those vulnerable passwords.

As you can see, when you step up from no salt to a constant salt to a unique salt, you impose an increase of several orders of magnitude in the effort needed to crack vulnerable passwords at each step. Without a salt, the weakest of your users' passwords are trivially accessible. With a constant salt, those weak passwords are accessible to a determined attacker. With a unique salt, the cost of accessing passwords is raised so high that only the most determined attacker could gain access to a tiny subset of vulnerable passwords, and then only at great expense.

Which is precisely the situation to be in. You can never fully protect users from poor password choice, but you can raise the cost of compromising your users' passwords to a level that makes compromising even one user's password prohibitively expensive.


recursive

One purpose of salting is to defeat precomputed hash tables. If someone has a list of millions of pre-computed hashes, they aren't going to be able to look up $1$foo$te5SBM.7C25fFDu6bIRbX1 in their table even though they know the hash and the salt. They'll still have to brute force it.

Another purpose, as Carl S mentions is to make brute forcing a list of hashes more expensive. (give them all different salts)

Both of these objectives are still accomplished even if the salts are public.
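To show just how public the salt is, here is a small sketch that splits the shadow-style string from the question into its parts. It assumes the simple `$id$salt$hash` layout of the example (scheme id 1 is MD5-crypt); newer schemes may carry extra parameters, and `parse_crypt_field` is a made-up helper, not a library function.

```python
def parse_crypt_field(field: str) -> dict:
    """Split a '$id$salt$hash' string into its three components."""
    _empty, scheme, salt, digest = field.split("$", 3)
    return {"scheme": scheme, "salt": salt, "digest": digest}


parsed = parse_crypt_field("$1$foo$te5SBM.7C25fFDu6bIRbX1")
print(parsed)
# {'scheme': '1', 'salt': 'foo', 'digest': 'te5SBM.7C25fFDu6bIRbX1'}
```

The salt sits in plain view because the system needs it to verify a login; its job is to make precomputation useless, not to be a secret.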


rgargente

As far as I know, the salt is intended to make dictionary attacks harder.

It's a known fact that many people will use common words for passwords instead of seemingly random strings.

So, a hacker could use this to his advantage instead of using just brute force. He will not look for passwords like aaa, aab, aac... but instead use words and common passwords (like lord of the rings names! ;) )

So if my password is Legolas a hacker could try that and guess it with a "few" tries. However, if we salt the password and it becomes fooLegolas, the hash will be different, so a dictionary of hashes precomputed without the salt will no longer match.

Hope that helps!


daniel

I assume you are using PHP (the md5() function and the $-prefixed variables suggest so). If so, you could look at the Shadow Password HOWTO article, especially the 11th paragraph.

Also, if you are afraid of using message digest algorithms, you can try real cipher algorithms, such as the ones provided by the mcrypt module, or stronger message digest algorithms, such as the ones provided by the mhash module (sha1, sha256, and others).

I think that stronger message digest algorithms are a must. It's known that MD5 and SHA1 have collision problems.

