Cryptography: Theory and Practice:Hash Functions | Modern Cryptography: Theory and Practice

Cryptography: Theory and Practice
by Douglas Stinson
CRC Press, CRC Press LLC
ISBN: 0849385210 Pub Date: 03/17/95

Table of Contents

Several hash functions of this type have been proposed, and many of them have been shown to be insecure (independent of whether or not the underlying cryptosystem is secure). However, four variations of this theme that appear to be secure are as follows:

Figure 7.6 Constructing M in MD4

7.7 The MD4 Hash Function

The MD4 Hash Function was proposed in 1990 by Rivest, and a strengthened version, called MD5, was presented in 1991. The Secure Hash Standard (or SHS) is more complicated, but it is based on the same underlying methods. It was published in the Federal Register on January 31, 1992, and adopted as a standard on May 11, 1993. (A proposed revision was put forward on July 11, 1994, to correct a “technical flaw” in the SHS.) All of the above hash functions are very fast, so they are practical for signing very long messages.

In this section, we will describe MD4 in detail, and discuss some of the modifications that are employed in MD5 and the SHS.

Given a bitstring x, we will first produce an array

where each M[i] is a bitstring of length 32 and N ≡ 0 mod 16. We will call each M[i] a word. M is constructed from x using the algorithm presented in Figure 7.6.

In the construction of M, we append a single 1 to x, then we concatenate enough 0’s so that the length becomes congruent to 448 modulo 512, and finally we concatenate 64 bits that contain the binary representation of the (original) length of x (reduced modulo 2⁶⁴, if necessary). The resulting string M has length divisible by 512. So when we break M up into 32-bit words, the resulting number of words, denoted by N, will be divisible by 16.

Now we proceed to construct a 128-bit message digest. A high-level description of the algorithm is presented in Figure 7.7. The message digest is constructed as the concatenation of the four words A, B, C and D, which we refer to as registers. The four registers are initialized in step 1. Now we process the array M 16 words at a time. In each iteration of the loop in step 2, we first take the “next” 16 words of M and store them in an array X (step 3). The values of the four registers are then stored (step 4). Then we perform three “rounds” of hashing. Each round consists of one operation on each of the 16 words in X (we will describe these operations in more detail shortly). The operations done in the three rounds produce new values in the four registers. Finally, the four registers are updated in step 8 by adding back the values that were stored in step 4. This addition is defined to be addition of positive integers, reduced modulo 2³².

Figure 7.7 The MD4 hash function

The three rounds in MD4 are different (unlike DES, say, where the 16 rounds are identical). We first describe several different operations that are employed in these three rounds. In the following description, X and Y denote input words, and each operation produces a word as output. Here are the operations employed:

Note that all of these operations are very fast, and the only arithmetic operation that is used is addition modulo 2³². If MD4 is actually implemented, it will be necessary to take into account the underlying architecture of the computer it is run on in order to perform addition correctly. Suppose a₁a₂a₃a₄ are the four bytes in a word. We think of each a_i as being an integer in the range 0,...,255, represented in binary. In a big-endian architecture (such as a Sun SPARCstation), this word represents the integer

In a little-endian architecture (such as the Intel 80xxx line), this word represents the integer

MD4 assumes a little-endian architecture. It is important that the message digest is independent of the underlying architecture. So if we wish to run MD4 on a big-endian computer, it will be necessary to perform the addition operation X + Y as follows:

1. Interchange x₁ and x₄; x₂ and x₃; y₁ and y₄; and y₂ and y₃.

2. Compute Z = X + Y mod 2³²

3. Interchange z₁ and z₄; and z₂ and z₃.

Rounds 1, 2, and 3 of MD4 respectively use three functions f, g and h. Each of f, g and h is a bitwise boolean function that takes three words as input and produces a word as output. They are defined as follows:

The complete description of Rounds 1, 2 and 3 of MD4 are presented in Figures 7.8-7.10.

MD4 was designed to be very fast, and indeed, software implementations on Sun SPARCstations attain speeds of 1.4 Mbytes/sec. On the other hand, it is difficult to say something concrete about the security of a hash function such as MD4 since it is not “based” on a well-studied problem such as factoring or the Discrete Log problem. So, as is the case with DES, confidence in the security of the system can only be attained over time, as the system is studied and (one hopes) not found to be insecure.

Figure 7.8 Round 1 of MD4

Although MD4 has not been broken, weakened versions that omit either the first or the third round can be broken without much difficulty. That is, it is easy to find collisions for these two-round versions of MD4. A strengthened version of MD4, called MD5, was proposed in 1991. MD5 uses four rounds instead of three, and runs about 30% slower than MD4 (about .9 Mbytes/sec on a SPARCstation).

Table of Contents