Sometimes You Don't Need to Store a Secret | Writing Secure Code, Second Edition

Sometimes You Don't Need to Store a Secret

If you store a secret for the purpose of verifying that another entity also knows the secret, you probably don't need to store the secret itself. Instead, you can store a verifier, which often takes the form of a cryptographic hash of the secret. For example, if an application needs to verify that a user knows a password, it can hash the value the user enters and compare the result with the hash it stored when the password was set. In this case, the secret is not stored by the application; only the hash is stored. This presents less risk because even if the system is compromised, the attacker can access only the hash and cannot retrieve the secret itself other than by brute force.

What Is a Hash?

A hash function, also called a digest function, is a cryptographic algorithm that produces a fixed-size output, called a message digest, that is, for practical purposes, different for each unique element of data. Identical data has the same message digest, but if even one of the bits of a document changes, the message digest changes. Message digests are usually 128 bits or 160 bits in length, depending on the algorithm used. For example, MD5, created by RSA Data Security, Inc., creates a 128-bit digest. SHA-1, developed by the National Institute of Standards and Technology (NIST) and the National Security Agency (NSA), creates a 160-bit digest. (When this was written, SHA-1 was the hash function of choice, and NIST had proposed three new variations of SHA-1: SHA-256, SHA-384, and SHA-512. Practical collision attacks have since been demonstrated against both MD5 and SHA-1, so prefer the newer SHA algorithms in new code. Microsoft CryptoAPI supports MD4, MD5, and SHA-1, and the .NET Framework supports MD5, SHA-1, SHA-256, SHA-384, and SHA-512. Go to csrc.ncsl.nist.gov/cryptval/shs.html for more information about the newer SHA algorithms.)
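To make the digest sizes and the one-bit sensitivity concrete, here is a minimal Python sketch built on the standard hashlib module. The helper names (digest_bits, bits_changed) and the two sample messages are invented for this illustration and are not part of the book's sample files.

```python
import hashlib


def digest_bits(algorithm: str, data: bytes) -> int:
    """Return the digest length, in bits, that `algorithm` produces for `data`."""
    return hashlib.new(algorithm, data).digest_size * 8


def bits_changed(algorithm: str, a: bytes, b: bytes) -> int:
    """Count the digest bits that differ between the hashes of `a` and `b`."""
    da = hashlib.new(algorithm, a).digest()
    db = hashlib.new(algorithm, b).digest()
    return sum(bin(x ^ y).count("1") for x, y in zip(da, db))


# Hypothetical messages: '0' (0x30) and '1' (0x31) differ in exactly one bit.
original = b"Pay Bob $100"
tampered = b"Pay Bob $101"
```

Calling digest_bits("md5", original) and digest_bits("sha1", original) reports the 128-bit and 160-bit lengths mentioned above, and bits_changed("sha1", original, tampered) shows that flipping a single input bit alters the digest.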

Not only is it computationally infeasible to determine the original data by knowing just its message digest, but it's also infeasible to create data that will match any given hash. (A good analogy is your thumbprint. Your thumbprint uniquely identifies you, but by itself it does not reveal anything about you.) Note that this is especially true for large sets of data; figuring out that a given hash represents a short word is fairly trivial, because an attacker can simply hash likely words until one matches.
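The weakness of hashing short words can be sketched in a few lines of Python. The function name, the word list, and the "stolen" hash below are all hypothetical; a real attacker would work from a far larger dictionary.

```python
import hashlib
from typing import Iterable, Optional


def recover_short_secret(target_hex: str, candidates: Iterable[str]) -> Optional[str]:
    """Return the candidate whose SHA-1 digest equals `target_hex`, or None."""
    for word in candidates:
        if hashlib.sha1(word.encode("utf-8")).hexdigest() == target_hex:
            return word
    return None


# Hypothetical word list and a hash standing in for one taken from a compromised system.
WORDS = ["apple", "tiger", "secret", "monkey"]
stolen = hashlib.sha1(b"tiger").hexdigest()
```

Here recover_short_secret(stolen, WORDS) finds the original word after only a handful of hash computations, which is why an unsalted hash of a short secret offers little protection.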

Creating a Salted Hash

To make things a little more difficult for an attacker, you can also salt the hash. A salt is a random number that is added to the hashed data to defeat precomputed dictionary attacks, making an attempt to recover the original secret extremely expensive. A dictionary attack is an attack in which the attacker tries each entry in a list of likely secrets, such as common words and passwords, and checks whether it produces the stored value; with a precomputed dictionary, the hashes of those entries are calculated once, in advance, and reused against every stolen hash. The salt is stored, unencrypted, with the hash. The salt should be cryptographically random and generated using good random number generation techniques, such as those outlined in Chapter 8, "Cryptographic Foibles."
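The following Python sketch shows why a salt defeats a precomputed table. It is an illustration under stated assumptions, not the book's listing: the helper names are invented, the 16-byte salt length is an arbitrary choice, and SHA-256 is used in place of SHA-1.

```python
import hashlib
import os


def new_salt(n_bytes: int = 16) -> bytes:
    """Return `n_bytes` from the operating system's cryptographic random source."""
    return os.urandom(n_bytes)


def salted_hash(secret: bytes, salt: bytes) -> bytes:
    """Hash the secret followed by the salt (SHA-256 is this sketch's choice)."""
    return hashlib.sha256(secret + salt).digest()


# A table an attacker might build in advance from unsalted hashes of likely passwords.
precomputed = {hashlib.sha256(w).digest(): w
               for w in (b"password", b"letmein", b"qwerty")}

salt = new_salt()
stored = salted_hash(b"password", salt)          # kept by the application, next to `salt`
unsalted = hashlib.sha256(b"password").digest()  # what an unsalted scheme would store
```

The unsalted value is found in the precomputed table immediately, whereas the salted value is not; the attacker would have to rebuild the table for each individual salt.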

Creating a salted hash, or a simple verifier, is easy with CryptoAPI. The following C/C++ code fragment shows how to do this:

// Create the hash; hash the secret data and the salt.
if (!CryptCreateHash(hProv, CALG_SHA1, 0, 0, &hHash))
    throw GetLastError();

if (!CryptHashData(hHash, (LPBYTE)bSecret, cbSecret, 0))
    throw GetLastError();

if (!CryptHashData(hHash, (LPBYTE)bSalt, cbSalt, 0))
    throw GetLastError();

// Get the size of the resulting salted hash.
DWORD cbSaltedHash = 0;
DWORD cbSaltedHashLen = sizeof (DWORD);
if (!CryptGetHashParam(hHash, HP_HASHSIZE, (BYTE*)&cbSaltedHash, &cbSaltedHashLen, 0))
    throw GetLastError();

// Get the salted hash.
BYTE *pbSaltedHash = new BYTE[cbSaltedHash];
// Check the pointer itself, not the byte it points to.
// (Some older compilers return NULL from new on failure instead of throwing.)
if (NULL == pbSaltedHash)
    throw (DWORD)ERROR_NOT_ENOUGH_MEMORY;

if (!CryptGetHashParam(hHash, HP_HASHVAL, pbSaltedHash, &cbSaltedHash, 0))
    throw GetLastError();

You can achieve the same goal in managed code using the following C# code:

using System;
using System.Security.Cryptography;
using System.IO;
using System.Text;
...
static byte[] HashPwd(byte[] pwd, byte[] salt) {
    SHA1 sha1 = SHA1.Create();
    // Stream.Null discards the written bytes; only the running hash is kept.
    CryptoStream cs = new CryptoStream(Stream.Null, sha1, CryptoStreamMode.Write);
    // Hash the password first, and then the salt.
    cs.Write(pwd, 0, pwd.Length);
    cs.Write(salt, 0, salt.Length);
    cs.FlushFinalBlock();
    return sha1.Hash;
}

The complete code listings are available with the book's sample files in the folder Secureco2\Chapter09\SaltedHash.

Determining whether the user knows the secret is easy. Take the user's secret, add the salt to it, hash them together, and compare the value you stored with the newly computed value. The Windows API CryptHashData adds data to an existing hash object, so calling it once with the secret and once with the salt is effectively the same thing. If the two values match, the user knows the secret. The good news is that you never stored the secret; you stored only a verifier. If an attacker accessed the data, he would have only the verifier, not the secret, and hence couldn't access your system, which requires the verifier to be computed from the secret. The attacker would have to attack the system by using a dictionary or brute-force attack. If the data (passwords) is well chosen, this type of attack is computationally infeasible.
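The verification step can be sketched in Python as follows. This is a minimal illustration rather than the book's sample code: make_verifier and knows_secret are invented names, SHA-256 stands in for SHA-1, and the constant-time comparison is an extra precaution that the text above does not discuss.

```python
import hashlib
import hmac
import os
from typing import Tuple


def make_verifier(secret: bytes) -> Tuple[bytes, bytes]:
    """Return (salt, verifier); the application stores both and discards the secret."""
    salt = os.urandom(16)
    return salt, hashlib.sha256(secret + salt).digest()


def knows_secret(candidate: bytes, salt: bytes, verifier: bytes) -> bool:
    """Recompute the salted hash from the candidate and compare it with the stored one."""
    computed = hashlib.sha256(candidate + salt).digest()
    # compare_digest avoids leaking, through timing, how many leading bytes matched.
    return hmac.compare_digest(computed, verifier)
```

At enrollment the application would call make_verifier once and keep the pair it returns; at logon it would pass the entered value, the stored salt, and the stored verifier to knows_secret.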

Using PKCS #5 to Make the Attacker's Job Harder

As I've demonstrated, many applications hash a password first and often apply a salt to the password before using the result as the encryption key or authenticator. However, there's a more formal way to derive a key from a human-readable password, a method called PKCS #5. Public-Key Cryptography Standard (PKCS) #5 is one of about a dozen standards defined by RSA Data Security and other industry leaders, including Microsoft, Apple, and Sun Microsystems. PKCS #5 is also published as RFC 2898 at http://www.ietf.org/rfc/rfc2898.txt.

PKCS #5 works by hashing a salted password a number of times; often, the iteration count is on the order of hundreds, if not thousands, of iterations. Or, more accurately, the most common mode of PKCS #5, named Password-Based Key Derivation Function #1 (PBKDF1), works this way. The other mode, PBKDF2, is a little different: it iterates a pseudorandom function, typically an HMAC, rather than a bare hash function. For the purposes of this book, I mean PBKDF1 when referring to PKCS #5 generically.
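The iteration at the heart of PBKDF1 is short enough to sketch directly. The Python below follows the construction in RFC 2898 (hash the password and salt, then repeatedly hash the result, then truncate), but it is only an illustration: it does not enforce the eight-octet salt that the RFC specifies for PBKDF1, the function names are invented, and it is not claimed to be byte-for-byte compatible with any particular library. For PBKDF2 it simply delegates to hashlib.pbkdf2_hmac from the standard library.

```python
import hashlib


def pbkdf1(password: bytes, salt: bytes, iterations: int,
           dk_len: int = 16, hash_name: str = "sha1") -> bytes:
    """PBKDF1-style derivation: hash password+salt, then re-hash the result."""
    if iterations < 1:
        raise ValueError("iteration count must be at least 1")
    if dk_len > hashlib.new(hash_name).digest_size:
        raise ValueError("PBKDF1 cannot produce more bytes than one digest")
    t = hashlib.new(hash_name, password + salt).digest()  # T_1 = Hash(P || S)
    for _ in range(iterations - 1):
        t = hashlib.new(hash_name, t).digest()            # T_i = Hash(T_(i-1))
    return t[:dk_len]


def pbkdf2_sha1(password: bytes, salt: bytes, iterations: int,
                dk_len: int = 16) -> bytes:
    """PBKDF2 with HMAC-SHA-1, delegated to the standard library."""
    return hashlib.pbkdf2_hmac("sha1", password, salt, iterations, dk_len)
```

Because PBKDF1 truncates a single digest, it can never return more bytes than the underlying hash produces (20 for SHA-1), whereas PBKDF2 can produce longer keys.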

The main threat PKCS #5 helps mitigate is dictionary attacks. It takes a great deal of CPU time and effort to perform a dictionary attack against a password when the password-cracking software must perform the millions of instructions required by PKCS #5 to determine whether a single password is what the attacker thinks it is. Many applications simply store a password by hashing it first and comparing the hash of the password entered by the user with the hash stored in the system. You can make the attacker's work substantially harder by storing the PKCS #5 output instead.

To determine the password, the attacker would have to perform the following steps:

1. Get a copy of the password file.
2. Generate a password (p) to check.
3. Choose a salt (s).
4. Choose an iteration count (n).
5. Perform n iterations of the hash function determined by PKCS #5 and compare the result with the stored value.
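The steps above can be sketched as a small offline attack loop in Python. The sketch assumes the salt and iteration count are stored in the clear next to the output, so "choosing" them reduces to reading them from the stolen record; pkcs5_hash is a simplified PBKDF1-style stand-in, and the record and word list are hypothetical.

```python
import hashlib
from typing import Iterable, Optional, Tuple


def pkcs5_hash(password: bytes, salt: bytes, iterations: int) -> bytes:
    """Iterated salted SHA-1 in the style of PBKDF1 (a simplified stand-in)."""
    t = hashlib.sha1(password + salt).digest()
    for _ in range(iterations - 1):
        t = hashlib.sha1(t).digest()
    return t


def offline_attack(salt: bytes, iterations: int, stored: bytes,
                   candidates: Iterable[bytes]) -> Tuple[Optional[bytes], int]:
    """Try each candidate; return (password or None, number of hash calls spent)."""
    hash_calls = 0
    for p in candidates:                         # generate a password to check
        guess = pkcs5_hash(p, salt, iterations)  # perform n iterations of the hash
        hash_calls += iterations
        if guess == stored:
            return p, hash_calls
    return None, hash_calls


# Hypothetical record from a stolen password file.
SALT, N = b"\x01\x02\x03\x04\x05\x06\x07\x08", 1000
STORED = pkcs5_hash(b"monkey", SALT, N)
found, cost = offline_attack(SALT, N, STORED,
                             [b"apple", b"tiger", b"monkey", b"zebra"])
```

The cost counter makes the effect of the iteration count visible: every guess costs n hash computations instead of one, so the total work grows in direct proportion to n.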

If the salt keyspace is large (say, at least 64 bits of random data), the attacker has to try potentially 2^64 (or 2^63 on average, assuming she finds the salt after searching half the keyspace) more keys to determine the password. And if the iteration count is high, the attacker has to perform a great deal of work to establish whether the password and salt combination is correct.
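The arithmetic behind those figures is easy to check. The Python helpers below are an illustrative sketch with invented names, and they apply only to the case described here, in which the attacker does not already know the salt.

```python
def salt_keyspace(salt_bits: int) -> int:
    """Number of distinct salts of the given length."""
    return 2 ** salt_bits


def average_salt_guesses(salt_bits: int) -> int:
    """On average, roughly half the keyspace is searched before the salt is found."""
    return salt_keyspace(salt_bits) // 2


def worst_case_hash_calls(salt_bits: int, iterations: int, passwords: int) -> int:
    """Every password candidate tried with every salt, at `iterations` hashes each."""
    return salt_keyspace(salt_bits) * iterations * passwords
```

For a 64-bit salt, salt_keyspace(64) is 2^64 and average_salt_guesses(64) is 2^63; multiplying by the iteration count and the number of candidate passwords gives the total number of hash computations in the worst case.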

Using PKCS #5, you can store the iteration count, the salt, and the output from PKCS #5. When the user enters her password, you compute the PKCS #5 output from the stored iteration count, the stored salt, and the password she entered. If the computed output matches the stored output, you can assume with confidence that the user knows the password.
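A record of this kind might look like the following Python sketch. Note the substitution: it uses PBKDF2 (the other PKCS #5 mode) through hashlib.pbkdf2_hmac rather than PBKDF1, because that is what the Python standard library provides. The PasswordRecord type, the function names, and the default iteration count are assumptions made for illustration.

```python
import hashlib
import hmac
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class PasswordRecord:
    """What the application stores: never the password itself."""
    iterations: int
    salt: bytes
    output: bytes


def enroll(password: str, iterations: int = 10_000) -> PasswordRecord:
    """Derive the output once and keep it with the salt and iteration count."""
    salt = os.urandom(16)
    output = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return PasswordRecord(iterations, salt, output)


def verify(password: str, record: PasswordRecord) -> bool:
    """Recompute the output from the stored parameters and compare the results."""
    computed = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"),
                                   record.salt, record.iterations)
    return hmac.compare_digest(computed, record.output)
```

Storing the iteration count in each record, rather than hard-coding it, lets the application raise the count for new records later without invalidating the old ones.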

The following sample code written in C# shows how to generate a key from a passphrase:

static byte[] DeriveBytes(string pwd, byte[] salt, int iter) {
    PasswordDeriveBytes p = new PasswordDeriveBytes(pwd, salt, "SHA1", iter);
    return p.GetBytes(16);
}

Note that the default CryptoAPI providers included with Windows do not support PKCS #5 directly; however, CryptDeriveKey offers similar levels of protection.

As you can see, you might be able to get away with not storing a secret, and this is always preferable to storing one.

IMPORTANT
There's a fly in the ointment: the salt value might be worthless! Imagine you decide to use PKCS #5 or a hash function to prove the user is who they say they are. To be highly secure, the application stores a large, random salt on behalf of the user in an authentication database. If the attacker can attempt to log on as a user, he need not attempt to guess the salt; he could simply guess the password. Why? Because the salt is applied by the application; it does not come from the user. The salt in this case protects against an attacker attacking the password database directly; it does not prevent an attack where the application performs some of the hashing on behalf of the user.
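The point can be illustrated with a small Python sketch of a logon routine and an attacker who uses nothing but that routine. Everything here is hypothetical: the in-memory database, the function names, the user name, and the candidate list.

```python
import hashlib
import os
from typing import Dict, Iterable, Optional, Tuple

# Hypothetical authentication database: user name -> (salt, salted hash).
_DB: Dict[str, Tuple[bytes, bytes]] = {}


def register(user: str, password: str) -> None:
    """Store a large random salt and the salted hash on the user's behalf."""
    salt = os.urandom(16)
    _DB[user] = (salt, hashlib.sha256(password.encode() + salt).digest())


def log_on(user: str, password: str) -> bool:
    """The application looks up the salt itself; the caller supplies only a password."""
    if user not in _DB:
        return False
    salt, stored = _DB[user]
    return hashlib.sha256(password.encode() + salt).digest() == stored


def online_guess(user: str, candidates: Iterable[str]) -> Optional[str]:
    """Attacker's view: call log_on repeatedly; the salt is never needed."""
    for guess in candidates:
        if log_on(user, guess):
            return guess
    return None


register("alice", "monkey")
```

Calling online_guess("alice", ["apple", "tiger", "monkey"]) succeeds without the attacker ever seeing the salt, because log_on applies it for him.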