Phusion white papers Phusion overview

Securely store passwords with bcrypt-ruby; now compatible with JRuby and Ruby 1.9

By Hongli Lai on August 13th, 2009

When writing web applications, or any application for that manner, any passwords should be stored securely. As a rule of thumb, one should never store passwords as clear text in the database for the following reasons:

  • If the database ever gets leaked out, then all accounts are compromised until every single user resets his password. Imagine that you’re an MMORPG developer; leaking out the database with clear text passwords allows the attacker to delete every player’s characters.
  • Many people use the same password for multiple sites. Imagine that the password stored in your database is also used for the user’s online banking account. Even if the database does not get leaked out, the password is still visible to the system administrator; this can be a privacy breach.

There are several “obvious” alternatives, which aren’t quite secure enough:

Storing passwords as MD5/SHA1/$FAVORITE_ALGORITHM hashes
These days MD5 can be brute-force cracked with relatively little effort. SHA1, SHA2 and other algorithms are harder to brute-force, but the attacker can still crack these hashes by using rainbow tables: precomputed tables of hashes with which the attacker can look up the input for a hash with relative ease. This rainbow table does not have to be very large: it just has to contain words from the dictionary, because many people use dictionary words as passwords.

Using plain hashes also makes it possible for an attacker to determine whether two users have the same password.

Encrypting the password
This is not a good idea because if the attacker was able to steal the database, then there’s a possibility that he’s able to steal the key file as well. Plus, the system administrator is able to read everybody’s passwords, unless he’s restricted access to either the key file or the database.

The solution is to store passwords as salted hashes. One calculates a salted hash as follows:

salted_hash = hashing_algorithm(salt + cleartext_password)

Here, salt is a random string. After calculating the salted hash, one should store the salted hash in the database, along with the (cleartext) salt. It is not necessary to keep the salt secret or to obfuscate it.

When a user logs in, one can verify his password by re-computing the salted hash and comparing it with the salted hash in the database:

salted_hash = hashing_algorithm(salt_from_database + user_provided_password)
if (salted_hash == salted_hash_from_database):
    user is logged in
else:
    password incorrect

The usage of the salt forces the attacker to either brute-force the hash or to use a ridiculously large rainbow table. In case of the latter, the sheer size of the required rainbow table can make it unpractical to generate. The larger the salt, the more difficult it becomes for the cracker to use rainbow tables.

However, even with salting, one should still not use SHA1, SHA2, Whirlpool or most other hashing algorithms because these algorithms are designed to be fast. Although brute forcing SHA2 and Whirlpool is hard, it’s still possible given sufficient resources. Instead, one should pick a hashing algorithm that’s designed to be slow so that brute forcing becomes unfeasible. Bcrypt is such a slow hashing algorithm. A speed comparison on a MacBook Pro with 2 Ghz Intel Core 2 Duo:

  • SHA-1: 118600 hashes per second.
  • Bcrypt (with cost = 10): 7.7 hashes per second.

Theoretically it would take 4*10^35 years for a single MacBook Pro core to crack an SHA-1 hash, assuming that the attacker does not harness any weaknesses in SHA-1. To crack a bcrypt hash one would need 6*10^39 years, or 10000 more times. Therefore, we recommend the use of bcrypt to store passwords securely.

There’s even a nice Ruby implementation of this algorithm: bcrypt-ruby! Up until recently, bcrypt-ruby was only available for MRI (“Matz Ruby Interpreter”, the C implementation that most people use). However, we’ve made it compatible with JRuby! The code can be found in our fork at Github. The current version also has issues with Ruby 1.9, which we’ve fixed as well. The author of bcrypt-ruby has already accepted our changes and will soon release a new version with JRuby and Ruby 1.9 support.

Further recommended reading

How to Safely Store a Password by Coda Hale.

  • http://blog.youngbloods.org Carl Youngblood

    Just a quick question. Are bcrypt hashes designed to take the same amount of time no matter what CPU you are running on? If not, the only problem I see with your time estimates is that Moore’s Law will change them as time goes by.

  • http://www.phusion.nl/ hongli

    No. But you can increase the cost of hash generation via a simple parameter. Each increment doubles the necessary amount of CPU, so it grows exponentially. For example:

    • Cost 10 (default): 1 second
    • Cost 11: 2 seconds
    • Cost 12: 4 seconds
    • Cost 13: 8 seconds

    Even Moore’s law will have a hard time with this.

    The alternative is to use scrypt. It’s like bcrypt but it’s designed to force a certain amount of memory usage; this memory usage can be dictated by the creator of the hash. This will significantly increase the cost of cracking hardware, even compared with bcrypt. But the reason why I don’t use it is that there aren’t many good widespread implementations yet.

  • http://www.phusion.nl/ Ninh Bui

    I don’t think there is any hash algorithm out there that wouldn’t run faster on a faster CPU (more instructions per timeunit etc… but please correct me if I’m wrong). To “combat moore’s law” you essentially need to keep the cost higher than moore’s law for “a couple of years” seeing as it grows exponentially as hongli has shown. This would make it very infeasable to find a collision.

  • http://blog.youngbloods.org Carl Youngblood

    I’m thinking it would be cool if a hash algorithm could gather information about the system it’s running on so that it adjusted its cost based on the CPU speed. Basically, I’m looking for a way to future-proof (as much as possible) a hash algorithm. I have been using SHA-256 but perhaps I should start using bcrypt like you’re saying.

  • http://blog.youngbloods.org Carl Youngblood

    I should add, of course the hash would need to be updated periodically for this to work.

  • http://www.phusion.nl/ hongli

    The reference scrypt implementation actually detects the system’s speed and determines the corresponding resource usage that hash calculation should use.

    That said, I don’t think it’s possible to make an algorithm that runs equally fast on all CPUs. Whatever throttling code you add, others can remove. However bcrypt is already about 10000 times slower than SHA-256 so I think you’ll be fine for a while.

  • http://blog.youngbloods.org Carl Youngblood

    Yeah, thanks for the info. I’m not concerned about making it run exactly the same speed; it should just be within an order of magnitude so that future hackers are more easily kept at bay. scrypt sounds really cool. Does that mean that the same hash will consume more CPU cycles on future machines to compensate for their increased computing capacity?

  • http://www.phusion.nl/ hongli

    No, scrypt just allows one to increase the required memory usage, kind of like what the ‘cost’ parameter does for bcrypt. Future hardware can calculate scrypt hashes faster, but because scrypt can require more memory, it’s less economical for an attacker to crack scrypt hashes.

  • http://blog.youngbloods.org Carl Youngblood

    One potential solution would be to require users to update their passwords every two years and adjust scrypt’s memory parameter to always be a percentage of the total available server memory.

  • fractious

    Somewhat relatedly, does anyone know if ruby/rails web apps might be vulnerable to timing attacks, as outlined here:

    http://codahale.com/a-lesson-in-timing-attacks/

    I guess it’s all down to the individual authentication system/algorithm used but can anyone comment on whether the popular rails authentication plugins might be affected by this vector?

  • http://www.phusion.nl/ hongli

    @factious: I don’t think so. Web applications communicate with the client over a network, so the time taken to authenticate someone depends heavily on network latency. This makes timing attacks a lot harder to perform.

    Apparently I was wrong because some researchers succeeded.

  • Menno

    Is there any best practice to follow if you want to migrate from an old (md5 no salting) approach to a safer approach?

    The only method I can think of is replacing the old hash with the new hash when a user logs in, as that’s the only time you have the cleartext password available.

    Any experiences with this?

  • http://barkingiguana.com/ Craig Webster

    Inspired by seeing yet another site using MD5 to “secure” passwords I wrote an article that gives a brief overview of how easy it is to attack passwords that are secured using several common approaches:

    http://barkingiguana.com/blog/2009/08/03/securing-passwords-with-salt-pepper-and-rainbows/

  • http://marcostoledo.com Marcos Toledo

    There’s also the scrypt library, which has been created to be even more resistant to brute force attacks than bcrypt. From their website, http://www.tarsnap.com/scrypt/

    “We estimate that on modern (2009) hardware, if 5 seconds are spent computing a derived key, the cost of a hardware brute-force attack against scrypt is roughly 4000 times greater than the cost of a similar attack against bcrypt (to find the same password), and 20000 times greater than a similar attack against PBKDF2. ”

    coderrr has written ffi extensions wrapping calls to scrypt. They can be found here: http://github.com/coderrr/scrypt-ruby

  • John

    @11 not true:

    “Crosby et al. conclude: We have shown that, even though the Internet induces significant timing jitter, we can reliably distinguish remote timing differences as low as 20µs. A LAN environment has lower timing jitter, allowing us to reliably distinguish remote timing differences as small as 100ns (possibly even smaller). These precise timing differences can be distinguished with only hundreds or possibly thousands of measurements.” http://news.ycombinator.com/item?id=761636

  • http://coderrr.wordpress.com coderrr

    I did some bindings for scrypt when it first was announced: http://github.com/coderrr/scrypt (fixes to get it compiling on linux), http://github.com/coderrr/scrypt-ruby (ruby bindings)

    and @fractious, @hongli, yes timing attacks are still possible across even laggy networks, they just take more data points, (according to multiple people who have done research on the topic)

  • http://ra66i.org raggi

    Hey,

    Can you explain how you would use rainbow tables against salted SHA1?

    Also can you explain how you’d go about protecting your users if someone gained access to the hashes?

    How long do you need to protect this data for?

    Food for thought…

  • http://ra66i.org raggi

    Oh and anyone who uses this and doesn’t use SSL deserves to be laughed at!

  • http://www.jeffemminger.com/ jeff

    So is bcrypt supposed to be more secure than SHA*? Otherwise why not just throw a sleep(1) between login attempts? Or lock out an account after 5 bad attempts?

  • http://dancroak.com Dan Croak

    “Theoretically it would take 4*10^35 years for a single MacBook Pro core to crack an SHA-1 hash.” This seems like a reasonable amount of security for most web applications.

  • http://www.phusion.nl/ hongli

    @jeff: someone already answered this for you on Reddit:

    If an attacker gets hold of your database, he can use a modest GPU to crack SHA1 at about 70 million hashes per second. I’m not aware of any GPU bcrypt crackers; he might manage a few hundred per second with the same resources, and even if he has an FPGA to play with he’s not going to do much better than that.

    Sure, you can make the login delays fake, but why deliberately pick the solution that’s a million times weaker for a similar amount of implementation effort?

    scrypt is an even more secure alternative, since it’s not just processing-hard, but memory-hard, thus eating up even more gates on an FPGA and limiting possible concurrency on a GPU (if you can even implement it there).

    @Dan: That’s assuming there are no weaknesses in SHA-1 to exploit. In practice there are, and the exploits will probably become better over time.

  • http://cloudscaling.com Randy Bias

    Why are you letting your end users provide trivially guessable dictionary words for passwords? That doesn’t make any sense at all. You will probably get more mileage from simply enforcing semi-sane password picking behavior.

  • http://www.phusion.nl/ hongli

    @Randy: I don’t think forcing a password strength will work. Some people are lazy, or just not in the mood of coming up with a good password, or whatever. If you make the registration process harder by enforcing a password strength you may lose users. In the end I think users themselves are responsible for their password’s strength.

  • http://www.mccune.org.uk Rory McCune

    A couple of points on your article…

    – The goal of a salted hash is to protect against offline brute-force attacks (attacker has access to the database and the hashes). It’s worth noting that the ideally the salt should be different per user. Otherwise the attacker can generate a set of rainbow tables for that combination (@hash + word) for a dictionary, relatively easily. With per-user hashing they need to start from scratch for each user (so no point in creating a rainbow table as it’s no faster than a straight brute-force)

    Also the length of the hash doesn’t (IMO) increase the time required to create the rainbow table (assuming that the attacker has access to the hash, which if they’ve got access to the database is a reasonable assumption).

    – Also on the point of password strength, it’s worth noting that if people choose simple dictionary words for passwords then no matter how slow the algorithm you choose, if the attacker just has to create ~50,000 hashes (a typical large password dictionary) then they’ll be able to do that… One option might be to provide some advice to users about creating mnemonic passwords as a good route to create easy to remember passwords that aren’t going to fall to a dictionary attack.

  • Pingback: Double Shot #519 « A Fresh Cup

  • Josh

    I would simply like to say thanks. Phusion has done so much to help the Ruby community and we are all very grateful.

  • Pingback: Ruby on Rails: 9 Articles on Rails Authentication

  • http://www.twitter.com/ian_cal Ian

    You should think about how the hacker would recreate the hashes not about bruteforcing. As bcrypt is open source and if your hacker had a very large dictionary word or known password list, how hard would it be for him to make a bcrypt version of his list and look for matches?

    Good salt for sure is always needed no matter what the crypto when you look at this type of attack method. As really the salt is the only strength no matter what algorithm.

  • Pingback: Caffeine Driven Development » Blog Archive » L33t Links #2

  • Luca

    An easier way (and most likely safer), would just to be hand off the authentication to someone else. Use OpenID!

  • Pingback: Simple suggestions for implementing passwords correctly | Zen and the Art of Programming

  • http://www.23impressions.com 23 Impressions

    AUTHLOGIC and restful are far more easier to implement , we should try that .

  • Pingback: Ruby on Rails Programmer -

  • Pingback: Ruby on Rails: 9 Articles on Rails Authentication — Teach Me To Code