Cryptography: Theory and Practice:Hash Functions

cryptography: theory and practice Cryptography: Theory and Practice
by Douglas Stinson
CRC Press, CRC Press LLC
ISBN: 0849385210   Pub Date: 03/17/95
  

Previous Table of Contents Next


THEOREM 7.1

Suppose h : XZ is a hash function where |X| and |Z| are finite and |X| ≥ 2|Z|. Suppose A is an inversion algorithm for h. Then there exists a probabilistic Las Vegas algorithm which finds a collision for h with probability at least 1/2.

PROOF  Consider the algorithm B presented in Figure 7.2. Clearly B is a probabilistic algorithm of the Las Vegas type, since it either finds a collision or returns no answer. Thus our main task is to compute the probability of success. For any xX, define x ~ x1 if h(x) = h(x1). It is easy to see that ~ is an equivalence relation. Define

Each equivalence class [x] consists of the inverse image of an element of Z, so the number of equivalence classes is at most |Z|. Denote the set of equivalence classes by C.

Now, suppose x is the element of X chosen in step 1. For this x, there are |[x]| possible x1’s that could be returned in step 3. |[x]| - 1 of these x1’s are different from x and thus lead to success in step 4. (Note that the algorithm A does not know the representative of the equivalence class [x] that was chosen in step 1.) So, given a particular choice xX, the probability of success is (|[x]| - 1)/|[x]|.


Figure 7.2  Using an inversion algorithm A to find collisions for a hash function h

The probability of success of the algorithm B is computed by averaging over all possible choices for x:

Hence we have constructed a Las Vegas algorithm with success probability at least 1/2.

Hence, it is sufficient that a hash function satisfy the strongly collision-free property, since it implies the other two properties. So in the remainder of this chapter we restrict our attention to strongly collision-free hash functions.

7.3 The Birthday Attack

In this section, we determine a necessary security condition for hash functions that depends only on the cardinality of the set Z (equivalently, on the size of the message digest). This necessary condition results from a simple method of finding collisions which is informally known as the birthday attack. This terminology arises from the so-called birthday paradox, which says that in a group of 23 random people, at least two will share a birthday with probability at least 1/2. (Of course this is not a paradox, but it is probably counter-intuitive). The reason for the terminology “birthday attack” will become clear as we progress.

As before, let us suppose that h : XZ is a hash function, X and Z are finite, and |X| ≥ 2|Z|. Denote |X| = m and |Z| = n. It is not hard to see that there are at least n collisions — the question is how to find them. A very naive approach is to choose k random distinct elements x1,...,xkX, compute zi = h(xi), 1 ≤ ik, and then determine if a collision has taken place (by sorting the zi’s, for example).

This process is analogous to throwing k balls randomly into n bins and then checking to see if some bin contains at least two balls. (The k balls correspond to the k random xi’s, and the n bins correspond to the n possible elements of Z.)

We will compute a lower bound on the probability of finding a collision by this method. This lower bound will depend on k and n, but not on m. Since we are interested in a lower bound on the collision probability, we will make the assumption that for all zZ. (This is a reasonable assumption: if the inverse images are not approximately equal, then the probability of finding a collision will increase.)

Since the inverse images are all (roughly) the same size and the xi’s are chosen at random, the resulting zi’s can be thought of as random (not necessarily distinct) elements of Z. But it is a simple matter to compute the probability that k random elements z1,...,zkZ are distinct. Consider the zi’s in the order z1,...,zk. The first choice z1 is arbitrary; the probability that z2z1 is 1 - 1/n; the probability that z3 is distinct from z1 and z2 is 1 - 2/n, etc.

Hence, we estimate the probability of no collisions to be

If x is a small real number, then . This estimate is derived by taking the first two terms of the series expansion

Then our estimated probability of no collisions is

So we estimate the probability of at least one collision to be

If we denote this probability by ∈, then we can solve for k as a function of n and ∈.

If we ignore the term -k, then we estimate

If we take ∈ = .5, then our estimate is

So this says that hashing just over random elements of X yields a collision with a probability of 50%. Note that a different choice of ∈ leads to a different constant factor, but k will still be proportional to

If X is the set of all human beings, Y is the set of 365 days in a non-leap year (i.e., excluding February 29), and h(x) denotes the birthday of person x, then we are dealing with the birthday paradox. Taking n = 365 in our estimate, we get Hence, as mentioned earlier, there will be at least one duplicated birthday among 23 random people with probability at least 1/2.


Previous Table of Contents Next

Copyright © CRC Press LLC



Cryptography. Theory and Practice
Modern Cryptography: Theory and Practice
ISBN: 0130669431
EAN: 2147483647
Year: 1995
Pages: 133
Authors: Wenbo Mao

flylib.com © 2008-2017.
If you may any questions please contact us: flylib@qtcs.net