16.2 The RSA Algorithm

Team-Fly

16.2 The RSA Algorithm

All that is merely probable is probably false.

René Descartes

We shall now present a brief outline of the mathematical properties of the RSA algorithm, and we shall see how the RSA procedure can be implemented both as an asymmetric cryptosystem and an asymmetric signature scheme. Following the mathematical principles of the RSA algorithm we shall develop C++ classes with RSA functions for encryption and decryption as well as for the generation and authentication of digital signatures. In this way we shall clarify the possibilities offered for the implementation of the methods of our LINT class.

The most important aspect of the RSA algorithm is the pair of keys, which have a particular mathematical form: An RSA key pair consists of three basic components: the modulus n, the public key component e (encryption), and the secret key component d (decryption). The pairs <e, n> and <d, n> then form a public and private key.

We first generate the modulus n as the product n = pq of two prime numbers p and q. If φ (n) = (p 1)(q 1) denotes the Euler phi function (cf. page 175), then for a given n the public key component e can be chosen such that e < φ(n) and gcd (e, φ(n)) = 1. The secret component d corresponding to n and e is obtained by calculating the inverse d = e 1 mod φ(n) (cf. Section 10.2).

We illustrate this principle with the help of a small example: We choose p = 7 and q = 11. Then we have n = 77 and φ(n) = 22 · 3 · 5 = 60. Due to the condition gcd (e, φ(n)) = 1, the least possible value for the key component e is 7, from which we derive the value d = 43 for the key component d, since 7 · 43 301 1 mod 60. With these values we can apply the RSA algorithm to a toy example, in which we calculate, say, that the "message" 5 is encrypted to 57 mod 77 = 47 and the decryption 4743 mod 77 = 5 restores the original message.

Equipped now with such keys (we shall soon discuss what constitutes a realistic size for the various key components) and appropriate software, communication partners can securely exchange information with each other. To demonstrate the procedure we consider the process by which a participant Ms. A sends an RSA-encoded message to a communication partner Mr. B:

  1. B generates his RSA key with components nB, dB, and eB. He then gives the public key <eB, nB> to A.

  2. Now A would like to send an encrypted message M to B (0 M < nB). Since A has received from B his public key <eB, nB>, A calculates

    and sends the encrypted value C to B.

  3. After B has obtained the encrypted message C from A, then B decodes this message by calculating

    using his secret key <dB, nB>. Now B possesses the plain text M of the message.

It is not difficult to see why this works. Because d · e 1 mod φ(n), there exists an integer k such that d · e = 1 + k · φ(n). We thus obtain

(16.3) click to expand

where we have made use of the theorem of Euler quoted on page 175, from which we deduce that Mφ(n) 1 mod n if gcd(M, n) = 1.

It is clear that the security of the RSA algorithm depends on the ease of factorability of n. If n can be factored into its components p and q, then the secret key d can be determined from the public key e. Conversely, the factorization of n can be easily accomplished if both key components d and e are known: If k := de 1, then k is a multiple of φ(n), and therefore we have k = r · 2t with odd r and t 1. For every g Î we have gk gde 1 gg1 1 mod n, and therefore gk/2 is a square root of 1 modulo n, of which there are four: In addition to ±1 there are the roots ±x with x 1 mod p and x 1 mod q. Thus p | (x 1) and q | (x + 1) (cf. Section 10.4.3). By calculating p = gcd(x 1, n) one thus obtains the factorization of n (cf. page 208).

Possibilities for attacking the RSA algorithm other than factorization of the modulus are either as expensive as this or rely on the special weaknesses of individual protocols used in implementing the RSA cryptosystem, but do not rely on the RSA algorithm itself. Based on the current state of know ledge the following conditions lead to opportunities to attack the RSA algorithm:

  1. Common modulus

    The use of a common modulus for several participants leads to an obvious weakness: Based on what we have just said, each participant can use his or her own key components e and d to factorize the common modulus n = pq. From a knowledge of the factors p and q as well as the public key components of other participants with the same modulus their secret keys can be calculated.

  2. Small public exponents

    Since the computational time for the RSA algorithm for a given modulus depends directly on the size of the exponents e and d, it would appear attractive to choose these as small as possible. For example, 3, as the smallest possible exponent, requires only one squaring and one multiplication modulo n, so why not save valuable computing time in this way?

    Let us assume that an attacker was able to capture three encoded messages C1, C2 and C3, each of which encodes the same plain text M, encoded with the keys <3, ni> of three different recipients:

    click to expand

    It is highly probable that gcd (ni, nj) = 1 for i j, so that the attacker can use the Chinese remainder theorem (cf. page 199) to find a value C for which

    Since it is also true that M3 < n1n2n3, we have that C is actually equal to M3, and the attacker can obtain M directly by calculating . Such assaults, known as broadcast attacks, can always be carried out when the number of cryptograms Ci is greater than the public exponent, and this holds even if the plain texts to be encoded are not identical but are merely linearly dependent on one another, that is, if relations like Mi = a + b · Mj hold (cf. [Bone]). To avoid such an attack it is therefore necessary to choose the public exponents not too small (in no case less than 216 + 1 = 65537) and in addition to add random redundancy to broadcast messages before their encryption. This can take place by filling in the message in some appropriate way up to a suitable value less than the modulus. Such a process is called padding (cf. page 335 and [Schn], Section 9.1).

  3. Small secret exponents

    Even more problematic than small public exponents are small secret exponents: M. Wiener [Wien] showed already in 1990 how, given a key <e, n> with e < φ(n), the associated private key component d can be calculated if d is too small. Wiener's result was recently sharpened by D. Boneh and G. Durfee [Bone], who showed that d can be computed from <e, n> if d < n0.292. However, it is conjectured that the result holds as well for d < n0.5. We thus have the following lower bounds for d for typical sizes of RSA moduli:

    n

    d

    2768

    2384

    21024

    2512

    22048

    21024

    24096

    22048

  4. Weaknesses in implementation

    In addition to the weaknesses caused by the choice of parameters there is a host of potential implementation problems that can adversely affect the security of the RSA algorithm, as is the case with any encryption procedure. Certainly, the greatest care must be taken with implementations completely in software that are not protected from outside attack by measures implemented in hardware. Reading from memory contents, observation of bus activity or CPU states, can lead to the disclosure of secret key information. At the minimum, all data in main memory that in any way are correlated with the secret components of the RSA algorithm (or any other cryptosystem) should be erased immediately after use with active overwriting (for example, with the function purge_l(), which appears on page 162).

    The functions in the FLINT/C library are already equipped for this purpose. In secure mode local variables and allocated memory are overwritten with zeros before termination of the function and are thereby deleted. Here a certain degree of care is necessary, since the optimization capabilities of the compiler are so highly developed that a simple instruction at the end of a function that can be seen to have no effect on the function's termination might simply be ignored. And there is more: One must take into account that calls to the function memset() in the C standard library are ignored if the compiler cannot recognize any useful purpose in calling it.

    The following example illustrates these effects. The function f_l() uses two automatic variables: CLINT key_l and USHORT secret. At the end of the function, whose contents are of no further interest here, memory should be overwritten by assigning 0 to secret and, respectively, by a call to memset() in the case of key_l. The C code looks like this:

       int   f_l (CLINT n_l)   {     CLINT key_l;     USHORT secret;      ...      ...      ...     /* overwrite the variables */     secret = 0;     memset (key_l, 0, sizeof (key_l));     return 0;   } 

    And what does the compiler make of this (Microsoft Visual C/C++ 6.0, compilation with cl -c -FAs -02)?

       PUBLIC   _f   ;        COMDAT _f   _TEXT    SEGMENT   _ key _l$ = -516   _ secret $ = -520   _f       PROC NEAR                                ; COMDAT   ; 5    :    CLINT key _l;   ; 6    :    USHORT secret;      ...      ...      ...   ; 18   :    /* overwrite the variables */   ; 19   :    secret = 0;   ; 20   :    memset (key_l, 0, sizeof (key _l));   ; 21   :   ; 22   :    return 0;            xor eax, eax   ; 23 : }            add esp, 532                             ; 00000214H            ret 0   _f       ENDP   _TEXT    ENDS 

    The assembler code generated by the compiler documents that the instructions to delete the variables key_l and secret are passed over without effect. From the point of view of optimization this is a desirable result. Even the inline version of the function memset() is simply optimized away. For security-critical applications, however, this strategy is simply too clever.

    The active deletion of security-critical variables by overwriting must therefore be implemented in such a way that it is actually carried out. One should note that in this case assertions can prevent the checking for effectiveness, since the presence of the assertions forces the compiler to execute the code. When the assertions are turned off, then optimization again goes into effect.

    For the FLINT/C package the following function is implemented, which accepts a variable number of arguments and treats them according to their size as standard integer types and sets them to 0, or for other data structures calls memset() and lets it do the overwriting:

       static void purgevars_l (int noofvars, ...)   {     va_list ap;     size_t size;     va_start (ap, noofvars);     for (; noofvars > 0; -noofvars)       {          switch (size = va_arg (ap, size_t))            {               case 1:*va_arg (ap, char *) = 0;                      break;               case 2:*va_arg (ap, short *) = 0;                      break;               case 4:*va_arg (ap, long *) = 0;                      break;               default: Assert (size >= CLINTMAXBYTE);                      memset (va_arg(ap, char *), 0, size);            }       }     va_end (ap);   } 

    The function expects pairs of the form (byte length of the variable, pointer to the variable) as arguments, prefixed in noofvars by the number of such pairs.

    As an extension of this function the macro PURGEVARS_L() is defined by

       #ifdef FLINT_SECURE   #define PURGEVARS_L(X) purgevars_l X   #else   #define PURGEVARS_L(X) (void)0   #endif /* FLINT_SECURE */ 

    so that security mode can be turned on and off as required. Deletion of the variables in f() can take place as follows:

       /* overwrite the variables */   PURGEVARS_L ((2, sizeof (secret), & secret,                   sizeof (key_l), key_l)); 

    The compiler cannot ignore the call to this function on the principle of optimization strategy, which could be accomplished only by an extraordinarily effective global optimization. In any case, the effect of such security measures should be checked by means of code inspection:

       PUBLIC   _f   EXTRN    _purgevars_l:NEAR   ;        COMDAT _f   _TEXT    SEGMENT   _key_l$ = -516   _secret$ = -520   _f       PROC NEAR                                    ; COMDAT   ; 9 : {            sub  esp, 520                                ; 00000208H   ; 10   :    CLINT key_l;   ; 11   :    USHORT secret;      ...      ...      ...   ; 18   :        /* overwrite the variables */   ; 19   :        PURGEVARS_L ((2, sizeof (secret), &secret,                                   sizeof (key_l), key_l));            lea  eax, DWORD PTR _key_l$[esp+532]            push eax            lea ecx, DWORD PTR _secret$[esp+536]            push 514                                     ; 00000202H            push ecx            push 2            push 2            call _purgevars_l   ; 20   :   ; 21   :        return 0;            xor  eax, eax   ; 22   : }            add  esp, 552                                ; 00000228H            ret  0   _f       ENDP   _TEXT    ENDS 

    As a further protective measure in connection with the implementation of security-critical applications we should mention a comprehensive mechanism for error handling that sees to it that even in the case of invalid arguments or other exceptional situations no sensitive information is divulged. Likewise, suitable measures should be considered for establishing the authenticity of the code of a cryptographic application, so that the insertion of Trojan horses is prevented or at least detected before the code can be executed. Taking a cue from the story of the Trojan War, Trojan horse is the name given to software that has been altered in such a way that it apparently functions correctly but has additional undesirable effects such as the transmittal of secret key information to an attacker via an Internet connection.

    To get a grip on such problems, frequently, in practice, for cryptographic operations, "security boxes," or "S-Boxes" for short, are implemented, whose hardware is protected against attack by encapsulation in conjunction with detectors or sensors.

If all these traps are avoided, then there remains only the risk that the modulus will be factored, and this risk can be effectively eliminated by choosing sufficiently large prime numbers. To be sure, it has not been proved that there is no easier method to break the RSA algorithm than factorization, and there is also no proof that factorization of large integers is truly as difficult a problem as it presently seems to be, but these issues have not adversely affected the practical application of the algorithm to date: The RSA algorithm is the most commonly implemented asymmetric cryptosystem worldwide, and its use over the Internet has steadily increased, particularly for authentication in the exchange of information.

Based on the current state of knowledge in reference to factorability, the important requirements in the generation of RSA keys are that the prime factors p and q have approximately the same number of digits, that they be sufficiently large, and that p and q be nonetheless different in a sufficient number of binary digits. This avoids the situation where the modulus n can be easily factored on account of p q by being divided by natural numbers close to .

There are many places in the literature where it is recommended that so-called strong primes p and q be used in order to protect the modulus against some of the simpler factorization methods. A prime p is called strong in this connection if

  1. p 1 has a large prime divisor r,

  2. p + 1 has a large prime divisor s,

  3. r 1 has a large prime divisor t.

There are varying opinions as to the importance of strong primes to the security of the RSA algorithm, and their necessity is not everywhere equally emphasized. Recently, the majority of opinions appear to state that while the use of strong prime numbers is not harmful, it also does not accomplish a great deal (cf. [MOV], Section 8.2.3, Note 8.8, as well as [RegT], Appendix 1.4) or even that they should not be used (see [Schn], Section 11.5). In the program example that follows we shall therefore do without the generation of strong primes. For those who are nonetheless interested, we sketch here a procedure for constructing such primes:

  1. The first step in the construction of a strong prime p with p binary digits consists in the search for primes s and t satisfying log2(s) log2(t) p log2p. Then we search for a prime r for which r 1 is divisible by t, by testing sequentially numbers of the form r = k · 2t + 1, k = 1, 2, . . . , for primality until we encounter a prime number. This almost always occurs in at most 2 ln 2t steps (cf. [HKW], page 418).

  2. We now invoke the Chinese remainder theorem (see page 199) to calculate a solution of the simultaneous congruences x 1 mod r and x 1 mod s, by setting x0 := 1 2r1 s mod rs, where r1 is the multiplicative inverse of r modulo s.

  3. For our prime number search we use an odd initial value: We generate a random number z with a number of digits close to but less than (sometimes denoted by ) the desired length of p, and set x0 x0 + z + rs (z mod rs). If x0 is even, then we set x0 x0 + rs. With x0 in hand we begin our determination of p. We test the values p = x0 + k · 2rs, k = 0, 1, . . . , until the desired number of digits p for p is reached and p is prime. If an RSA key is to contain a specified public exponent e, then it is worthwhile to ensure additionally that gcd(p 1, e) = 1. The above conditions on p have now been completely fulfilled. For the prime number tests we use the Miller-Rabin test implemented in the function prime_l().

Whether or not strong primes are used for keys, in every case it is practical to have available a function that generates primes of a specified length or within a specified interval. A procedure for this that additionally ensures that a prime p thus generated satisfies the further condition gcd(p 1, f) = 1 for a specified number f is given in [IEEE], page 73. Here is the algorithm in a slightly altered form.

Algorithm to generate a prime p such that pmin p pmax

  1. Generate a random number p, pmin p pmax.

  2. If p is even, set p p + 1.

  3. If p > pmax, set p pmin + p mod (pmax + 1) and go to step 2.

  4. Compute d := gcd(p 1, f) (cf. Section 10.1). If d = 1, test p for primality (cf. Section 10.5). If p is prime, then output p and terminate the algorithm. Otherwise, set p p + 2 and go to step 3.

A realization of this procedure as a C++ function is contained in the FLINT/C package (source file flintpp.cpp).

start sidebar

Function:

generation of a prime number p within an interval [pmin, pmax] satisfying the additional condition gcd(p 1, f) = 1, f an odd positive integer

Syntax:

 const LINT findprime(const LINT& pmin, const LINT& pmax, const LINT& f); 

Input:

pmin: smallest permissible value

pmax: largest permissible value

f: odd positive integer, which should be relatively prime to p 1

Return:

LINT prime p determined by a probabilistic test (cf. Section 10.5) with gcd(p 1, f)

end sidebar

   const LINT findprime (const LINT& pmin, const LINT& pmax, const LINT& f)   {     if (!pmin.init) LINT::panic (E_LINT_VAL, "findprime", 1, __LINE__);     if (!pmax.init) LINT::panic (E_LINT_VAL, "findprime", 2, __LINE__);     if (pmin > pmax) LINT::panic (E_LINT_VAL, "findprime", 1, __LINE__);     if (!f.init) LINT::panic (E_LINT_VAL, "findprime", 3, __LINE__);     // 0 < f must be odd     if (f.iseven()) LINT::panic (E_LINT_VAL, "findprime", 3, __LINE__);     LINT p = randBBS (pmin, pmax);     LINT t = pmax - pmin;     if (p.iseven())       {          ++p;       }     if (p > pmax)       {          p = pmin + p % (t + 1);       }     while ((gcd (p - 1, f) != 1) || !p.isprime())       {          ++p;          ++p;          while (p > pmax)            {               p = pmin + p % (t + 1);               if (p.iseven())                 {                   ++p;                 }            }       }     return p;   } 

Additionally, the function findprime() is overloaded so that instead of the interval boundaries pmin and pmax a binary length can be set.

start sidebar

Function:

generation of a prime number p within the interval [2ℓ−1, 2 1] satisfying the additional condition gcd(p 1, f) = 1, f an odd positive integer

Syntax:

 const LINT findprime(const USHORT l, const LINT& f); 

Input:

l: desired binary length

f: odd positive integer, which should be relatively prime to p 1

Return:

LINT prime p with gcd(p 1, f)

end sidebar

With regard to the key length to be chosen a look at the worldwide state of attempts at factorization is most informative: In April 1996 after months-long cooperative work at universities and research laboratories in the USA and Europe under the direction of A.K. Lenstra[3] the RSA modulus

click to expand

with 130 decimal places was factored as

click to expand

Then in February 1999 the modulus

click to expand

was factored into the two 70-digit factors

click to expand

This success was accomplished under the direction of Herman J. J. te Riele of CWI in the Netherlands with teams from the Netherlands, Australia, France, Great Britain, and the USA.[4] RSA-130 and RSA-140 came from a list of 42 RSA moduli published in 1991 by the firm RSA Data Security, Inc. as encouragement to the cryptographic research community.[5] The calculations that led to the factorization of RSA-130 and RSA-140 were divided among a large number of workstations and the results collated. The calculational effort to factor RSA-140 was estimated to be 2000 MIPS years[6] (for RSA-130 it was about 1000 MIPS years).

Only a short time later, namely at the end of August 1999, news of the factorization of RSA-155 flashed across the globe. At a cost of about 8000 MIPS years the next number in the list of RSA challenges had been laid to rest, again under the direction of Herman te Riele with international participation. With the factorization of

click to expand

into the two 78-digit factors

click to expand

the magical threshold of 512 was crossed, a length that for many years had been considered safe for key lengths.

Table 16.1: Recommended key lengths according to Lenstra and Verheul

Year

Key length (in bits)

2001

990

2005

1149

2010

1369

2015

1613

2020

1881

2025

2174

The question of what key length is to be considered adequate for the RSA algorithm is revised each time progress in factorization is made. A. K. Lenstra and Eric R. Verheul [LeVe] provide some concrete advice in this regard in their description of a model for the determination of recommended key lengths for many types of cryptosystems. Beginning with a set of well-founded and conservative assumptions, combined with current findings, they calculate some prognoses as to minimum key lengths to recommend in the future and display them in tabular form. The values shown in Table 16.1, which are valid for asymmetric procedures like RSA, El-Gamal, and Diffie-Hellman, are taken from their results.

We may conclude that an RSA key should possess no fewer than 1024 binary digits if it is to provide a comfortable margin of safety for critical applications. We may conclude as well, however, that successes in factorization are gradually approaching this value, and one must keep careful tabs on developments. It is therefore worthwhile to distinguish different application purposes and for particularly sensitive applications to consider using 2048 or more binary digits (cf. [Schn], Chapter 7, and [RegT], Appendix 1.4).[7] With the FLINT/C package we are well equipped to produce such key lengths. We need not be too concerned that the expense of factorization decreases in proportion to the speed of new hardware, since this same hardware allows us to create longer keys as required. The security of the RSA algorithm can thus always be ensured by keeping sufficiently ahead of the progress in factorization.

How many such keys are available? Are there enough so that every man, woman, and child on the planet (and perhaps even their pet cats and dogs) can be provided one or more RSA keys? To this the prime number theorem provides the answer, according to which the number of prime numbers less than an integer x is asymptotically approximated by x/ ln x (cf. page 216): Moduli of length 1024 bits are generated as products of two prime numbers each of length approximately 512 bits. There are about 2512/512 such primes, that is, about 10151, each pair of which forms a modulus. If we let N = 10151, then there are N(N 1)/2 such pairs, which comes to about 10300 different moduli for which additionally that number again can be chosen for secret key components. This overwhelmingly large number is difficult to grasp, but consider, for example, that the entire visible universe contains "only" about 1080 elementary particles (cf. [Saga], Chapter 9). To put it another way, if every person on Earth were given ten new moduli every day, then the process could continue for 10287 years without a modulus being reused, and to date Earth has existed "only" a few billion years.

Finally, that an arbitrary text can be represented as a positive integer is obvious: By associating a unique integer to each letter of an alphabet texts can be interpreted in any number of ways as integers. A common example is the numerical representation of characters via ASCII (American Standard Code for Information Interchange) code. An ASCII-encoded text can be turned into a single integer by considering the code value of an individual character as a digit to base 256. The probability that such a process would result in an integer M for which gcd(M, n) > 1, that is, such that M contains as a factor one of the factors p, q, of an RSA key n, is vanishingly small. If M is too large for the modulus n of an RSA key, that is, larger than n 1, then the text can be divided into blocks whose numerical representations M1, M2, M3, . . . all are less than n. These blocks must then be individually encrypted.

For texts of considerable length this becomes tiresome, and therefore one seldom uses the RSA algorithm for the encryption of long texts. For this purpose one may employ symmetric cryptosystems (such as Triple-DES, IDEA, or Rijndael; see Chapter 19 and [Schn], Chapters 12, 13, 14), with which the process goes much more quickly and with equivalent security. For an encrypted transmittal of the necessary key, which must be kept secret with symmetric procedures, the RSA algorithm is perfectly suited.

[3]Lenstra: Arjen K.: Factorization of RSA-130 using the Number Field Sieve, http://dbs.cwi.nl.herman.NFSrecords/RSA-130; see also [Cowi].

[4]E-mail from <Herman.te.Riele@cwi.nl> in Number Theory Network of 4 February 1999. See also http://www.rsasecurity.com/rsalabs/html/status.HTML.

[5]http://www.rsasecurity.com/rsalabs/html/factoring.html.

[6]MIPS = mega instructions per second measures the speed of a computer. A computer works at 1 MIPS if is can execute 700,000 additions and 300,000 multiplications per second.

[7]It is useful to choose RSA keys of binary length a multiple of 8, in conformity with the convention that such keys should end at the end of a byte.


Team-Fly


Cryptography in C and C++
Cryptography in C and C++
ISBN: 189311595X
EAN: 2147483647
Year: 2001
Pages: 127

flylib.com © 2008-2017.
If you may any questions please contact us: flylib@qtcs.net