Section 9.2. The RSA Algorithm


The pioneering paper by Diffie and Hellman [DIFF76b] introduced a new approach to cryptography and, in effect, challenged cryptologists to come up with a cryptographic algorithm that met the requirements for public-key systems. One of the first of the responses to the challenge was developed in 1977 by Ron Rivest, Adi Shamir, and Len Adleman at MIT and first published in 1978 [RIVE78].^[4] The Rivest-Shamir-Adleman (RSA) scheme has since that time reigned supreme as the most widely accepted and implemented general-purpose approach to public-key encryption.

^[4] Apparently, the first workable public-key system for encryption/decryption was put forward by Clifford Cocks of Britain's CESG in 1973 [COCK73]; Cocks's method is virtually identical to RSA.

The RSA scheme is a block cipher in which the plaintext and ciphertext are integers between 0 and n − 1 for some n. A typical size for n is 1024 bits, or 309 decimal digits. That is, n is less than 2¹⁰²⁴. We examine RSA in this section in some detail, beginning with an explanation of the algorithm. Then we examine some of the computational and cryptanalytical implications of RSA.

Description of the Algorithm

The scheme developed by Rivest, Shamir, and Adleman makes use of an expression with exponentials. Plaintext is encrypted in blocks, with each block having a binary value less than some number n. That is, the block size must be less than or equal to log₂(n); in practice, the block size is i bits, where 2ⁱ < n ≤ 2ⁱ⁺¹. Encryption and decryption are of the following form, for some plaintext block M and ciphertext block C:

C = M^e mod n

M = C^d mod n = (M^e)^d mod n = M^ed mod n

Both sender and receiver must know the value of n. The sender knows the value of e, and only the receiver knows the value of d. Thus, this is a public-key encryption algorithm with a public key of PU = {e, n} and a private key of PR = {d, n}. For this algorithm to be satisfactory for public-key encryption, the following requirements must be met:

It is possible to find values of e, d, n such that M^ed mod n = M for all M < n.
It is relatively easy to calculate M^e mod n and C^d mod n for all values of M < n.
It is infeasible to determine d given e and n.

For now, we focus on the first requirement and consider the other questions later. We need to find a relationship of the form

M^ed mod n = M

The preceding relationship holds if e and d are multiplicative inverses modulo f(n), where f(n) is the Euler totient function. It is shown in Chapter 8 that for p, q prime, f(pq) = (p − 1)(q − 1). The relationship between e and d can be expressed as

Equation 9-1

ed mod f(n) = 1

This is equivalent to saying

ed ≡ 1 mod f(n)

d ≡ e⁻¹ mod f(n)

That is, e and d are multiplicative inverses mod f(n). Note that, according to the rules of modular arithmetic, this is true only if d (and therefore e) is relatively prime to f(n). Equivalently, gcd(f(n),d) = 1. See Appendix 9A for a proof that Equation (9.1) satisfies the requirement for RSA.

We are now ready to state the RSA scheme. The ingredients are the following:

p, q, two prime numbers	(private, chosen)
n = pq	(public, calculated)
e, with gcd(f(n), e) = 1; 1 < e < f(n)	(public, chosen)
d ≡ e⁻¹ (mod f(n))	(private, calculated)

The private key consists of {d, n} and the public key consists of {e, n}. Suppose that user A has published its public key and that user B wishes to send the message M to A. Then B calculates C = M^e mod n and transmits C. On receipt of this ciphertext, user A decrypts by calculating M = C^d mod n.

Figure 9.5 summarizes the RSA algorithm. An example, from [SING99], is shown in Figure 9.6. For this example, the keys were generated as follows:

Select two prime numbers, p = 17 and q = 11.
Calculate n = pq = 17 × 11 = 187.
Calculate f(n) = (p − 1)(q − 1) = 16 × 10 = 160.
Select e such that e is relatively prime to f(n) = 160 and less than f(n); we choose e = 7.
Determine d such that de ≡ 1 (mod 160) and d < 160. The correct value is d = 23, because 23 × 7 = 161 = 1 × 160 + 1; d can be calculated using the extended Euclid's algorithm (Chapter 4).

Figure 9.5. The RSA Algorithm

Figure 9.6. Example of RSA Algorithm

The resulting keys are public key PU = {7,187} and private key PR = {23,187}. The example shows the use of these keys for a plaintext input of M = 88. For encryption, we need to calculate C = 88⁷ mod 187. Exploiting the properties of modular arithmetic, we can do this as follows:

88⁷ mod 187 = [(88⁴ mod 187) x (88² mod 187) x (88¹ mod 187)] mod 187

88¹ mod 187 = 88

88² mod 187 = 7744 mod 187 = 77

88⁴ mod 187 = 59,969,536 mod 187 = 132

88⁷ mod 187 = (88 x 77 x 132) mod 187 = 894,432 mod 187 = 11

For decryption, we calculate M = 11²³ mod 187:

11²³ mod 187 = [(11¹ mod 187) x (11² mod 187) x (11⁴ mod 187) x (11⁸ mod 187) x (11⁸ mod 187)] mod 187

11¹ mod 187 = 11

11² mod 187 = 121

11⁴ mod 187 = 14,641 mod 187 = 55

11⁸ mod 187 = 214,358,881 mod 187 = 33

11²³ mod 187 = (11 x 121 x 55 x 33 x 33) mod 187 = 79,720,245 mod 187 = 88
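The worked example can be checked mechanically. The short Python sketch below recomputes the keys and the two exponentiations using the built-in three-argument pow; the variable names are ours, and the modular inverse via pow(e, -1, phi) assumes Python 3.8 or later.

```python
# Recompute the example: p = 17, q = 11, e = 7, M = 88.
p, q = 17, 11
n = p * q                    # public modulus, 187
phi = (p - 1) * (q - 1)      # Euler totient f(n), 160
e = 7                        # public exponent, relatively prime to 160
d = pow(e, -1, phi)          # private exponent: inverse of e modulo f(n)

plaintext = 88
ciphertext = pow(plaintext, e, n)    # C = M^e mod n
recovered = pow(ciphertext, d, n)    # M = C^d mod n

# The same encryption, assembled from the partial results used in the text.
by_parts = (pow(88, 4, n) * pow(88, 2, n) * pow(88, 1, n)) % n

print(n, phi, d, ciphertext, recovered, by_parts)
```

With these tiny parameters it is also feasible to confirm the first requirement exhaustively, that is, that M^ed mod n = M for every M < n.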

Computational Aspects

We now turn to the issue of the complexity of the computation required to use RSA. There are actually two issues to consider: encryption/decryption and key generation. Let us look first at the process of encryption and decryption and then consider key generation.

Exponentiation in Modular Arithmetic

Both encryption and decryption in RSA involve raising an integer to an integer power, mod n. If the exponentiation is done over the integers and then reduced modulo n, the intermediate values would be gargantuan. Fortunately, as the preceding example shows, we can make use of a property of modular arithmetic:

[(a mod n) x (b mod n)] mod n = (a x b) mod n

Thus, we can reduce intermediate results modulo n. This makes the calculation practical.

Another consideration is the efficiency of exponentiation, because with RSA we are dealing with potentially large exponents. To see how efficiency might be increased, consider that we wish to compute x¹⁶. A straightforward approach requires 15 multiplications:

x¹⁶ = x × x × x × x × x × x × x × x × x × x × x × x × x × x × x × x

However, we can achieve the same final result with only four multiplications if we repeatedly take the square of each partial result, successively forming x², x⁴, x⁸, x¹⁶. As another example, suppose we wish to calculate x¹¹ mod n for some integers x and n. Observe that x¹¹ = x¹⁺²⁺⁸ = (x)(x²)(x⁸). In this case we compute x mod n, x² mod n, x⁴ mod n, and x⁸ mod n and then calculate [(x mod n) × (x² mod n) × (x⁸ mod n)] mod n.

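More generally, suppose we wish to find the value a^b with a and b positive integers. If we express b as a binary number b_k b_(k−1) ... b₀, then we have

b = Σ 2^i, where the sum is taken over all i with b_i ≠ 0

Therefore,

a^b = a^(Σ 2^i) = Π a^(2^i), where the product is taken over all i with b_i ≠ 0

a^b mod n = [Π a^(2^i)] mod n = (Π [a^(2^i) mod n]) mod n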

We can therefore develop the algorithm^[5] for computing a^b mod n, shown in Figure 9.7. Table 9.3 shows an example of the execution of this algorithm. Note that the variable c is not needed; it is included for explanatory purposes. The final value of c is the value of the exponent.

^[5] The algorithm has a long history; this particular pseudocode expression is from [CORM01].

Figure 9.7. Algorithm for Computing a^b mod n

Note: The integer b is expressed as a binary number b_k b_(k−1) ... b₀

Table 9.3. Result of the Fast Modular Exponentiation Algorithm for a^b mod n, where a = 7, b = 560 = 1000110000, n = 561
i	9	8	7	6	5	4	3	2	1	0
b_i	1	0	0	0	1	1	0	0	0	0
c	1	2	4	8	17	35	70	140	280	560
f	7	49	157	526	160	241	298	166	67	1
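Because the pseudocode of Figure 9.7 is not reproduced here, the following Python sketch is our reconstruction of the square-and-multiply loop, inferred from the c and f rows of Table 9.3 and from the description in the text. The function name mod_exp_trace and the trace it returns are illustrative only.

```python
def mod_exp_trace(a, b, n):
    """Compute a^b mod n, scanning the bits of b from the leftmost bit b_k.

    Returns the result and a list of (c, f) pairs, one per bit, so the run
    can be compared with Table 9.3. The variable c only tracks the exponent
    accumulated so far; it is not needed to obtain the result.
    """
    c, f = 0, 1
    trace = []
    for bit in bin(b)[2:]:          # binary digits b_k ... b_0 as characters
        c = 2 * c
        f = (f * f) % n             # square at every iteration
        if bit == "1":
            c = c + 1
            f = (f * a) % n         # extra multiplication for each 1 bit
        trace.append((c, f))
    return f, trace


result, trace = mod_exp_trace(7, 560, 561)
for c, f in trace:
    print(c, f)
```

Run with a = 7, b = 560, n = 561, the printed pairs should match the c and f rows of the table column by column.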

Efficient Operation Using the Public Key

To speed up the operation of the RSA algorithm using the public key, a specific choice of e is usually made. The most common choice is 65537 (2¹⁶ + 1); two other popular choices are 3 and 17. Each of these choices has only two 1 bits and so the number of multiplications required to perform exponentiation is minimized.

However, with a very small public key, such as e = 3, RSA becomes vulnerable to a simple attack. Suppose we have three different RSA users who all use the value e = 3 but have unique values of n, namely n₁, n₂, n₃. If user A sends the same encrypted message M to all three users, then the three ciphertexts are C₁ = M³ mod n₁; C₂ = M³ mod n₂; C₃ = M³ mod n₃. It is likely that n₁, n₂, and n₃ are pairwise relatively prime. Therefore, one can use the Chinese remainder theorem (CRT) to compute M³ mod (n₁n₂n₃). By the rules of the RSA algorithm, M is less than each of the n_i; therefore M³ < n₁n₂n₃, so the value obtained from the CRT is M³ itself. Accordingly, the attacker need only compute the cube root of M³. This attack can be countered by adding a unique pseudorandom bit string as padding to each instance of M to be encrypted. This approach is discussed subsequently.
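The attack just described can be sketched in a few lines of Python. The three small moduli and the message value below are our own toy choices, and the helper names crt and integer_cube_root are illustrative; pow(x, -1, m) assumes Python 3.8 or later.

```python
def crt(residues, moduli):
    """Return x modulo the product of the moduli with x = r_i (mod n_i).

    Assumes the moduli are pairwise relatively prime.
    """
    product = 1
    for n_i in moduli:
        product *= n_i
    x = 0
    for r_i, n_i in zip(residues, moduli):
        partial = product // n_i
        x += r_i * partial * pow(partial, -1, n_i)
    return x % product


def integer_cube_root(value):
    """Largest integer r with r**3 <= value, found by bisection."""
    low, high = 0, 1
    while high ** 3 <= value:
        high *= 2
    while high - low > 1:            # invariant: low**3 <= value < high**3
        mid = (low + high) // 2
        if mid ** 3 <= value:
            low = mid
        else:
            high = mid
    return low


# Three toy users, all with e = 3 and distinct moduli (assumed values).
moduli = [89 * 101, 83 * 107, 71 * 113]
message = 1234                                   # smaller than every modulus
ciphertexts = [pow(message, 3, n_i) for n_i in moduli]

# The attacker sees only the ciphertexts and the public moduli.
m_cubed = crt(ciphertexts, moduli)
recovered = integer_cube_root(m_cubed)
print(recovered)
```

No private key is used anywhere in the recovery, which is the point of the attack.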

The reader may have noted that the definition of the RSA algorithm (Figure 9.5) requires that during key generation the user selects a value of e that is relatively prime to f(n). Thus, for example, if a user has preselected e = 65537 and then generated primes p and q, it may turn out that gcd(f(n), e) ≠ 1. Because 65537 is prime, this happens exactly when 65537 divides (p − 1) or (q − 1); thus, the user must reject any value of p or q that is congruent to 1 (mod 65537).

Efficient Operation Using the Private Key

We cannot similarly choose a small constant value of d for efficient operation. A small value of d is vulnerable to a brute-force attack and to other forms of cryptanalysis [WIEN90]. However, there is a way to speed up computation using the CRT. We wish to compute the value M = C^d mod n. Let us define the following intermediate results:

V_p = C^d mod p	V_q = C^d mod q

Following the CRT, Equation (8.8), define the quantities:

X_p = q × (q⁻¹ mod p)	X_q = p × (p⁻¹ mod q)

The CRT then shows, using Equation (8.9), that

M = (V_pX_p + V_qX_q) mod n

Further, we can simplify the calculation of V_p and V_q using Fermat's theorem, which states that a^(p − 1) ≡ 1 (mod p) if p and a are relatively prime. Some thought should convince you that the following are valid:

V_p = C^d mod p = C^{d mod (p − 1)} mod p	V_q = C^d mod q = C^{d mod (q − 1)} mod q

The quantities d mod (p − 1) and d mod (q − 1) can be precalculated. The end result is that the calculation is approximately four times as fast as evaluating M = C^d mod n directly [BONE02].
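As a rough illustration, the CRT computation above can be written as follows for the small example key of this section (p = 17, q = 11, d = 23). The function name crt_decrypt is ours; a real implementation would precompute the reduced exponents and the quantities X_p and X_q once per key rather than on every call. The inverse via pow(x, -1, m) assumes Python 3.8 or later.

```python
def crt_decrypt(c, d, p, q):
    """Compute M = C^d mod pq from two half-size exponentiations."""
    n = p * q
    # V_p and V_q, with the exponent reduced using Fermat's theorem.
    v_p = pow(c, d % (p - 1), p)
    v_q = pow(c, d % (q - 1), q)
    # X_p = q * (q^-1 mod p) and X_q = p * (p^-1 mod q).
    x_p = q * pow(q, -1, p)
    x_q = p * pow(p, -1, q)
    return (v_p * x_p + v_q * x_q) % n


print(crt_decrypt(11, 23, 17, 11))
```

For the ciphertext C = 11 of the earlier example, V_p = 3, V_q = 0, X_p = 154, and X_q = 34, so M = (3 × 154 + 0 × 34) mod 187 = 462 mod 187 = 88.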

Key Generation

Before the application of the public-key cryptosystem, each participant must generate a pair of keys. This involves the following tasks:

Determining two prime numbers, p and q
Selecting either e or d and calculating the other

First, consider the selection of p and q. Because the value of n = pq will be known to any potential adversary, to prevent the discovery of p and q by exhaustive methods, these primes must be chosen from a sufficiently large set (i.e., p and q must be large numbers). On the other hand, the method used for finding large primes must be reasonably efficient.

At present, there are no useful techniques that yield arbitrarily large primes, so some other means of tackling the problem is needed. The procedure that is generally used is to pick at random an odd number of the desired order of magnitude and test whether that number is prime. If not, pick successive random numbers until one is found that tests prime.

A variety of tests for primality have been developed (e.g., see [KNUT98] for a description of a number of such tests). Almost invariably, the tests are probabilistic. That is, the test will merely determine that a given integer is probably prime. Despite this lack of certainty, these tests can be run in such a way as to make the probability as close to 1.0 as desired. As an example, one of the more efficient and popular algorithms, the Miller-Rabin algorithm, is described in Chapter 8. With this algorithm and most such algorithms, the procedure for testing whether a given integer n is prime is to perform some calculation that involves n and a randomly chosen integer a. If n "fails" the test, then n is not prime. If n "passes" the test, then n may be prime or nonprime. If n passes many such tests with many different randomly chosen values for a, then we can have high confidence that n is, in fact, prime.

In summary, the procedure for picking a prime number is as follows.

1.	Pick an odd integer n at random (e.g., using a pseudorandom number generator).
2.	Pick an integer a < n at random.
3.	Perform the probabilistic primality test, such as Miller-Rabin, with a as a parameter. If n fails the test, reject the value n and go to step 1.
4.	If n has passed a sufficient number of tests, accept n; otherwise, go to step 2.
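The four steps above can be sketched as follows. This is a minimal illustration, not production key-generation code: it uses Python's random module for brevity (a real generator would need a cryptographically secure source such as the secrets module), the small-prime prefilter and the default of 20 rounds are our assumptions, and the names is_probable_prime and pick_prime are hypothetical.

```python
import random


def is_probable_prime(n, rounds=20, rng=random):
    """Miller-Rabin test: False means composite, True means probably prime."""
    if n < 2:
        return False
    for small in (2, 3, 5, 7, 11, 13):      # cheap prefilter (our addition)
        if n % small == 0:
            return n == small
    # Write n - 1 as 2^k * q with q odd.
    k, q = 0, n - 1
    while q % 2 == 0:
        k += 1
        q //= 2
    for _ in range(rounds):
        a = rng.randrange(2, n - 1)          # step 2: random integer a < n
        x = pow(a, q, n)
        if x == 1 or x == n - 1:
            continue                         # n passes for this a
        for _ in range(k - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break                        # n passes for this a
        else:
            return False                     # step 3: n fails, so reject it
    return True                              # step 4: enough tests passed


def pick_prime(bits, rounds=20, rng=random):
    """Step 1 repeated: draw random odd integers until one tests prime."""
    while True:
        candidate = rng.getrandbits(bits) | (1 << (bits - 1)) | 1
        if is_probable_prime(candidate, rounds, rng):
            return candidate


print(pick_prime(64))
```

Setting the top bit forces the candidate to have exactly the requested length, and setting the low bit makes it odd, so even integers are never tested.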

This is a somewhat tedious procedure. However, remember that this process is performed relatively infrequently: only when a new pair (PU, PR) is needed.

It is worth noting how many numbers are likely to be rejected before a prime number is found. A result from number theory, known as the prime number theorem, states that the primes near N are spaced on the average one every (ln N) integers. Thus, on average, one would have to test on the order of ln(N) integers before a prime is found. Actually, because all even integers can be immediately rejected, the correct figure is ln(N)/2. For example, if a prime on the order of magnitude of 2²⁰⁰ were sought, then about ln(2²⁰⁰)/2 ≈ 70 trials would be needed to find a prime.

Having determined prime numbers p and q, the process of key generation is completed by selecting a value of e and calculating d or, alternatively, selecting a value of d and calculating e. Assuming the former, then we need to select an e such that gcd(f(n), e) = 1 and then calculate d ≡ e⁻¹ (mod f(n)). Fortunately, there is a single algorithm that will, at the same time, calculate the greatest common divisor of two integers and, if the gcd is 1, determine the inverse of one of the integers modulo the other. The algorithm, referred to as the extended Euclid's algorithm, is explained in Chapter 8. Thus, the procedure is to generate a series of random numbers, testing each against f(n) until a number relatively prime to f(n) is found. Again, we can ask the question: How many random numbers must we test to find a usable number, that is, a number relatively prime to f(n)? It can be shown easily that the probability that two random numbers are relatively prime is about 0.6; thus, very few tests would be needed to find a suitable integer (see Problem 8.2).
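To make the role of the extended Euclid's algorithm concrete, here is a small iterative version in Python. It is a sketch in our own notation rather than the formulation given in the chapter cited above, and the helper name private_exponent is hypothetical.

```python
def extended_gcd(a, b):
    """Return (g, x, y) such that a*x + b*y == g == gcd(a, b)."""
    old_r, r = a, b
    old_x, x = 1, 0
    old_y, y = 0, 1
    while r != 0:
        quotient = old_r // r
        old_r, r = r, old_r - quotient * r
        old_x, x = x, old_x - quotient * x
        old_y, y = y, old_y - quotient * y
    return old_r, old_x, old_y


def private_exponent(e, phi):
    """Return d with e*d = 1 (mod phi), or None if gcd(e, phi) != 1."""
    g, x, _ = extended_gcd(e, phi)
    if g != 1:
        return None          # e is unusable; the caller tries another value
    return x % phi


print(private_exponent(7, 160))
print(private_exponent(8, 160))
```

One call therefore serves both purposes named in the text: it rejects a candidate e that shares a factor with f(n), and it yields d when the candidate is acceptable.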

The Security of RSA

Four possible approaches to attacking the RSA algorithm are as follows:

Brute force: This involves trying all possible private keys.
Mathematical attacks: There are several approaches, all equivalent in effort to factoring the product of two primes.
Timing attacks: These depend on the running time of the decryption algorithm.
Chosen ciphertext attacks: This type of attack exploits properties of the RSA algorithm.

The defense against the brute-force approach is the same for RSA as for other cryptosystems, namely, use a large key space. Thus, the larger the number of bits in d, the better. However, because the calculations involved, both in key generation and in encryption/decryption, are complex, the larger the size of the key, the slower the system will run.

In this subsection, we provide an overview of mathematical and timing attacks.

The Factoring Problem

We can identify three approaches to attacking RSA mathematically:

Factor n into its two prime factors. This enables calculation of f(n) = (p − 1) × (q − 1), which, in turn, enables determination of d ≡ e⁻¹ (mod f(n)).
Determine f(n) directly, without first determining p and q. Again, this enables determination of d ≡ e⁻¹ (mod f(n)).
Determine d directly, without first determining f(n).

Most discussions of the cryptanalysis of RSA have focused on the task of factoring n into its two prime factors. Determining f(n) given n is equivalent to factoring n [RIBE96]. With presently known algorithms, determining d given e and n appears to be at least as time-consuming as the factoring problem [KALI95]. Hence, we can use factoring performance as a benchmark against which to evaluate the security of RSA.
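The first two approaches can be illustrated on the toy modulus n = 187 used earlier. The sketch below factors n by trial division, which is feasible only because n is tiny, and also shows why knowing f(n) is as good as knowing the factors: since f(n) = n − (p + q) + 1, the sum p + q is known, and p and q are then the roots of a quadratic. The function names are ours.

```python
import math


def factor_by_trial_division(n):
    """Return (p, q) with p <= q and p*q == n; hopeless for real key sizes."""
    for p in range(2, math.isqrt(n) + 1):
        if n % p == 0:
            return p, n // p
    raise ValueError("n has no nontrivial factor")


def factors_from_phi(n, phi):
    """Recover p and q from n = pq and phi = (p - 1)(q - 1)."""
    total = n - phi + 1                      # p + q
    discriminant = total * total - 4 * n     # (p - q)^2
    root = math.isqrt(discriminant)
    if root * root != discriminant:
        raise ValueError("phi is not consistent with n")
    return (total + root) // 2, (total - root) // 2


p, q = factor_by_trial_division(187)
d = pow(7, -1, (p - 1) * (q - 1))            # private exponent for e = 7
print(p, q, d)
print(factors_from_phi(187, 160))
```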

For a large n with large prime factors, factoring is a hard problem, but not as hard as it used to be. A striking illustration of this is the following. In 1977, the three inventors of RSA dared Scientific American readers to decode a cipher they printed in Martin Gardner's "Mathematical Games" column [GARD77]. They offered a $100 reward for the return of a plaintext sentence, an event they predicted might not occur for some 40 quadrillion years. In April of 1994, a group working over the Internet claimed the prize after only eight months of work [LEUT94]. This challenge used a public key size (length of n) of 129 decimal digits, or around 428 bits. In the meantime, just as they had done for DES, RSA Laboratories had issued challenges for the RSA cipher with key sizes of 100, 110, 120, and so on, digits. The latest challenge to be met is the RSA-200 challenge with a key length of 200 decimal digits, or about 663 bits. Table 9.4 shows the results to date. The level of effort is measured in MIPS-years: a million-instructions-per-second processor running for one year, which is about 3 x 10¹³ instructions executed. A 1 GHz Pentium is about a 250-MIPS machine.

Table 9.4. Progress in Factorization
Number of Decimal Digits	Approximate Number of Bits	Date Achieved	MIPS-years	Algorithm
100	332	April 1991	7	Quadratic sieve
110	365	April 1992	75	Quadratic sieve
120	398	June 1993	830	Quadratic sieve
129	428	April 1994	5000	Quadratic sieve
130	431	April 1996	1000	Generalized number field sieve
140	465	February 1999	2000	Generalized number field sieve
155	512	August 1999	8000	Generalized number field sieve
160	530	April 2003		Lattice sieve
174	576	December 2003		Lattice sieve
200	663	May 2005		Lattice sieve

A striking fact about Table 9.4 concerns the method used. Until the mid-1990s, factoring attacks were made using an approach known as the quadratic sieve. The attack on RSA-130 used a newer algorithm, the generalized number field sieve (GNFS), and was able to factor a larger number than RSA-129 at only 20% of the computing effort.

The threat to larger key sizes is twofold: the continuing increase in computing power, and the continuing refinement of factoring algorithms. We have seen that the move to a different algorithm resulted in a tremendous speedup. We can expect further refinements in the GNFS, and the use of an even better algorithm is also a possibility. In fact, a related algorithm, the special number field sieve (SNFS), can factor numbers with a specialized form considerably faster than the generalized number field sieve. Figure 9.8 compares the performance of the two algorithms. It is reasonable to expect a breakthrough that would enable a general factoring performance in about the same time as SNFS, or even better [ODLY95]. Thus, we need to be careful in choosing a key size for RSA. For the near future, a key size in the range of 1024 to 2048 bits seems reasonable.

Figure 9.8. MIPS-years Needed to Factor

In addition to specifying the size of n, a number of other constraints have been suggested by researchers. To avoid values of n that may be factored more easily, the algorithm's inventors suggest the following constraints on p and q:

p and q should differ in length by only a few digits. Thus, for a 1024-bit key (309 decimal digits), both p and q should be on the order of magnitude of 10¹⁵⁴, so that their product has about 309 digits.
Both (p − 1) and (q − 1) should contain a large prime factor.
gcd(p − 1, q − 1) should be small.

In addition, it has been demonstrated that if e < n and d < n^¼, then d can be easily determined [WIEN90].

Timing Attacks

If one needed yet another lesson about how difficult it is to assess the security of a cryptographic algorithm, the appearance of timing attacks provides a stunning one. Paul Kocher, a cryptographic consultant, demonstrated that a snooper can determine a private key by keeping track of how long a computer takes to decipher messages [KOCH96, KALI96b]. Timing attacks are applicable not just to RSA, but to other public-key cryptography systems. This attack is alarming for two reasons: It comes from a completely unexpected direction and it is a ciphertext-only attack.

A timing attack is somewhat analogous to a burglar guessing the combination of a safe by observing how long it takes for someone to turn the dial from number to number. We can explain the attack using the modular exponentiation algorithm of Figure 9.7, but the attack can be adapted to work with any implementation that does not run in fixed time. In this algorithm, modular exponentiation is accomplished bit by bit, with one modular multiplication performed at each iteration and an additional modular multiplication performed for each 1 bit.
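The data dependence that the attack exploits is easy to see by counting operations. The sketch below is our own instrumented version of the bit-by-bit loop described above (the function name is illustrative); it reports how many modular multiplications are performed for a given exponent.

```python
def mod_exp_counting(a, b, n):
    """Return (a^b mod n, number of modular multiplications performed)."""
    f, multiplications = 1, 0
    for bit in bin(b)[2:]:
        f = (f * f) % n              # one multiplication at each iteration
        multiplications += 1
        if bit == "1":
            f = (f * a) % n          # one more for each 1 bit
            multiplications += 1
    return f, multiplications


# Three 10-bit exponents with different numbers of 1 bits.
for exponent in (0b1000000000, 0b1000110000, 0b1111111111):
    print(bin(exponent), mod_exp_counting(7, exponent, 561)[1])
```

Exponents of the same length thus require different amounts of work, so the running time carries information about the exponent bits unless the implementation takes steps to hide it.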

As Kocher points out in his paper, the attack is simplest to understand in an extreme case. Suppose the target system uses a modular multiplication function that is very fast in almost all cases but in a few cases takes much more time than an entire average modular exponentiation. The attack proceeds bit-by-bit starting with the leftmost bit, b_k. Suppose that the first j bits are known (to obtain the entire exponent, start with j = 0 and repeat the attack until the entire exponent is known). For a given ciphertext, the attacker can complete the first j iterations of the for loop. The operation of the subsequent step depends on the unknown exponent bit. If the bit is set, f ← (f × a) mod n will be executed. For a few values of a and f, the modular multiplication will be extremely slow, and the attacker knows which these are. Therefore, if the observed time to execute the decryption algorithm is always slow when this particular iteration is slow with a 1 bit, then this bit is assumed to be 1. If a number of observed execution times for the entire algorithm are fast, then this bit is assumed to be 0.

In practice, modular exponentiation implementations do not have such extreme timing variations, in which the execution time of a single iteration can exceed the mean execution time of the entire algorithm. Nevertheless, there is enough variation to make this attack practical. For details, see [KOCH96].

Although the timing attack is a serious threat, there are simple countermeasures that can be used, including the following:

Constant exponentiation time: Ensure that all exponentiations take the same amount of time before returning a result. This is a simple fix but does degrade performance.
Random delay: Better performance could be achieved by adding a random delay to the exponentiation algorithm to confuse the timing attack. Kocher points out that if defenders don't add enough noise, attackers could still succeed by collecting additional measurements to compensate for the random delays.
Blinding: Multiply the ciphertext by a random number before performing exponentiation. This process prevents the attacker from knowing what ciphertext bits are being processed inside the computer and therefore prevents the bit-by-bit analysis essential to the timing attack.

RSA Data Security incorporates a blinding feature into some of its products. The private-key operation M = C^d mod n is implemented as follows:

Generate a secret random number r between 0 and n − 1.
Compute C' = C(r^e) mod n, where e is the public exponent.
Compute M' = (C')^d mod n with the ordinary RSA implementation.
Compute M = M'r⁻¹ mod n. In this equation, r⁻¹ is the multiplicative inverse of r mod n; see Chapter 8 for a discussion of this concept. It can be demonstrated that this is the correct result by observing that r^ed mod n = r mod n.

RSA Data Security reports a 2 to 10% performance penalty for blinding.
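The blinding steps can be sketched as follows for the toy key used in this section. This is an illustration only and does not describe any vendor's implementation. The check that r is relatively prime to n is our addition, needed so that r⁻¹ exists; for a real modulus a random r fails this check only with negligible probability. The function name is hypothetical, and a real implementation would use a cryptographically secure random source.

```python
import math
import random


def blinded_decrypt(c, d, e, n, rng=random):
    """Compute C^d mod n without exponentiating the ciphertext C directly."""
    while True:
        r = rng.randrange(2, n)              # secret random blinding value
        if math.gcd(r, n) == 1:              # ensure r has an inverse mod n
            break
    c_blind = (c * pow(r, e, n)) % n         # C' = C * r^e mod n
    m_blind = pow(c_blind, d, n)             # M' = (C')^d mod n = M * r mod n
    return (m_blind * pow(r, -1, n)) % n     # M = M' * r^-1 mod n


print(blinded_decrypt(11, 23, 7, 187))
```

The value actually raised to the power d changes on every call, even for the same ciphertext, yet the returned plaintext is always the same.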

Chosen Ciphertext Attack and Optimal Asymmetric Encryption Padding

The basic RSA algorithm is vulnerable to a chosen ciphertext attack (CCA). CCA is defined as an attack in which the adversary chooses a number of ciphertexts and is then given the corresponding plaintexts, decrypted with the target's private key. Thus, the adversary could select a plaintext, encrypt it with the target's public key, and then be able to get the plaintext back by having it decrypted with the private key. Clearly, this provides the adversary with no new information. Instead, the adversary exploits properties of RSA and selects blocks of data that, when processed using the target's private key, yield information needed for cryptanalysis.

A simple example of a CCA against RSA takes advantage of the following property of RSA:

Equation 9-2

E(PU, M₁) × E(PU, M₂) = E(PU, [M₁ × M₂])

We can decrypt C = M^e mod n using a CCA as follows.

Compute X = (C x 2^e) mod n.
Submit X as a chosen ciphertext and receive back Y = X^d mod n.
But now note the following:

X	= (C mod n) × (2^e mod n)
	= (M^e mod n) × (2^e mod n)
	= (2M)^e mod n

Therefore, Y = (2M) mod n. From this, we can deduce M. To overcome this simple attack, practical RSA-based cryptosystems randomly pad the plaintext prior to encryption. This randomizes the ciphertext so that Equation (9.2) no longer holds. However, more sophisticated CCAs are possible and a simple padding with a random value has been shown to be insufficient to provide the desired security. To counter such attacks RSA Security Inc., a leading RSA vendor and former holder of the RSA patent, recommends modifying the plaintext using a procedure known as optimal asymmetric encryption padding (OAEP). A full discussion of the threats and OAEP is beyond our scope; see [POIN02] for an introduction and [BELL94a] for a thorough analysis. Here, we simply summarize the OAEP procedure.
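The simple chosen ciphertext attack that multiplies C by 2^e can be sketched as follows, using the toy key of this section. The decryption oracle stands in for a target that is willing to decrypt ciphertexts other than C itself; its name and the module-level constants are our assumptions. Recovering M from Y relies on 2 having an inverse mod n, which holds because n is odd.

```python
PUBLIC_E, PUBLIC_N = 7, 187
_PRIVATE_D = 23                      # known only to the target


def decryption_oracle(x):
    """The target decrypts any submitted ciphertext with its private key."""
    return pow(x, _PRIVATE_D, PUBLIC_N)


def recover_plaintext(c, oracle, e, n):
    """Recover M from C = M^e mod n without submitting C itself."""
    x = (c * pow(2, e, n)) % n       # X = (C * 2^e) mod n = (2M)^e mod n
    y = oracle(x)                    # Y = X^d mod n = 2M mod n
    return (y * pow(2, -1, n)) % n   # divide by 2 modulo n


intercepted = 11                     # ciphertext of M = 88 from the example
print(recover_plaintext(intercepted, decryption_oracle, PUBLIC_E, PUBLIC_N))
```

The oracle never sees the intercepted ciphertext, only the unrelated-looking value X, yet its answer reveals M.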

Figure 9.9 depicts OAEP encryption. As a first step, the message M to be encrypted is padded. A set of optional parameters P is passed through a hash function H.^[6] The output is then padded with zeros to get the desired length in the overall data block (DB). Next, a random seed is generated and passed through another hash function, called the mask generating function (MGF). The resulting hash value is bit-by-bit XORed with DB to produce a maskedDB. The maskedDB is in turn passed through the MGF to form a hash that is XORed with the seed to produce the maskedseed. The concatenation of the maskedseed and the maskedDB forms the encoded message EM. Note that the EM includes the padded message, masked by the seed, and the seed, masked by the maskedDB. The EM is then encrypted using RSA.
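The masking steps just described can be sketched as follows. This is a simplified illustration and not a standards-conformant OAEP implementation. It assumes SHA-256 for H, a mask generating function built by hashing its input together with a 4-byte counter, and a single 0x01 byte placed between the zero padding and M so that decoding can tell where the message begins. None of these choices is stated in the text above, and the names mgf, oaep_encode, and oaep_decode are ours.

```python
import hashlib
import os

HASH_LEN = 32                                    # SHA-256 output size (assumed H)


def mgf(seed, length):
    """Stretch seed to `length` bytes by hashing it with a running counter."""
    output, counter = b"", 0
    while len(output) < length:
        output += hashlib.sha256(seed + counter.to_bytes(4, "big")).digest()
        counter += 1
    return output[:length]


def xor_bytes(left, right):
    return bytes(a ^ b for a, b in zip(left, right))


def oaep_encode(message, em_len, params=b"", seed=None):
    """Build EM = maskedseed || maskedDB, ready to be encrypted with RSA."""
    pad_len = em_len - 2 * HASH_LEN - 1 - len(message)
    if pad_len < 0:
        raise ValueError("message too long for this encoded length")
    # DB = H(P) || zero padding || 0x01 separator (our assumption) || M
    db = hashlib.sha256(params).digest() + b"\x00" * pad_len + b"\x01" + message
    if seed is None:
        seed = os.urandom(HASH_LEN)              # fresh random seed per message
    masked_db = xor_bytes(db, mgf(seed, len(db)))
    masked_seed = xor_bytes(seed, mgf(masked_db, HASH_LEN))
    return masked_seed + masked_db


def oaep_decode(em, params=b""):
    """Invert oaep_encode after RSA decryption has returned EM."""
    masked_seed, masked_db = em[:HASH_LEN], em[HASH_LEN:]
    seed = xor_bytes(masked_seed, mgf(masked_db, HASH_LEN))
    db = xor_bytes(masked_db, mgf(seed, len(masked_db)))
    if db[:HASH_LEN] != hashlib.sha256(params).digest():
        raise ValueError("parameter hash mismatch")
    separator = db.index(b"\x01", HASH_LEN)      # raises ValueError if absent
    if any(db[HASH_LEN:separator]):
        raise ValueError("malformed padding")
    return db[separator + 1:]


encoded = oaep_encode(b"attack at dawn", 128)
print(len(encoded), oaep_decode(encoded))
```

Because a new seed is drawn for each call, encoding the same message twice yields different EM values, which is what defeats the multiplicative relationship exploited by the simple CCA.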

^[6] A hash function maps a variable-length data block or message into a fixed-length value called a hash code. Hash functions are discussed in depth in Chapters 11 and 12.
