6.2 M-ary Exponentiation

Team-Fly

6.2 M-ary Exponentiation

Through a generalization of the binary algorithm on page 80 the number of modular multiplications for exponentiation can be reduced even further. The idea is to represent the exponent in a base greater than 2 and to replace multiplication by a in step 3 by multiplication by powers of a. Thus let the exponent e be given by e = (en1 en2...e0)M, to a base M yet to be determined. The following algorithm calculates the powers ae mod m.

M-ary algorithm for exponentiation ae mod m

  1. Calculate and store a2 mod m, a3 mod m,..., aM1 mod m as a table.

  2. Set p mod m and i n 2.

  3. Set p pM mod m.

  4. If ei 0, set p mod m.

  5. Set i i 1; if i 0, go to step 3.

  6. Output p.

The number of necessary multiplications evidently depends on the number of digits of the exponent e and thus on the choice of M. Therefore, we would like to determine M such that the exponentiation in step 3 can be computed to the greatest extent possible by means of squaring, as in the example above for 216, and such that the number of multiplications by the precomputed powers of a be minimized to a justifiable cost of storage space for the table.

The first condition suggests that we choose M as a power of 2: M = 2k. In view of the second condition we consider the number of modular multiplications as a function of M:

We require

(6.1) 

squares in step 3 and on average

(6.2) click to expand

modular multiplications in step 4, where

is the probability that a digit ei of e is nonzero. If we include the M 2 multiplications for the computation of the table, then the M-ary algorithm requires on average

(6.3) click to expand

(6.4) 

modular squarings and multiplications.

For exponents e and moduli m of, say, 512 binary places and M = 2k we obtain the numbers of modular multiplications for the calculation of ae mod m as shown in Table 6.1. The table shows as well the memory requirement for the precomputed powers of a mod m, which result from the product (2k 2)CLINTMAXSHORT · sizeof(USHORT).

Table 6.1: Requirements for exponentiation

k

Multiplications

Memory (in bytes)

1

766

0

2

704

1028

3

666

3084

4

644

7196

5

640

15420

6

656

31868

We see from the table that the average number of multiplications reaches a minimum of 640 at k = 5, where the required memory for each larger k grows by approximately a factor of 2. But what are the time requirements for other orders of magnitude of the exponents?

Table 6.2 gives information about this. It displays the requirements for modular multiplication for exponentiation with exponents with various numbers of binary digits and for various values of M = 2k. The exponent length of 768 digits was included because it is a frequently used key length for the RSA cryptosystem (see Chapter 16). The favorable numbers of multiplications appear in boldface type.

Table 6.2: Numbers of multiplications for typical sizes of exponents and various bases 2k

Number of binary digits in the exponent

k

32

64

128

512

768

1024

2048

4096

1

45

93

190

766

1150

1534

3070

6142

2

44

88

176

704

1056

1408

2816

5632

3

46

87

170

666

996

1327

2650

5295

4

52

91

170

644

960

1276

2540

5068

5

67

105

181

640

945

1251

2473

4918

6

98

135

209

656

954

1252

2444

4828

7

161

197

271

709

1001

1294

2463

4801

8

288

324

396

828

1116

1404

2555

4858

In consideration of the ranges of numbers for which the FLINT/C package was developed, it appears that with k = 5 we have found the universal base M = 2k, with which, however, there is a rather high memory requirement of 15 kilobytes for the powers a2, a3,..., a31 to base a that are to be precomputed. The M-ary algorithm can be improved, however, according to [Cohe], Section 1.2, to the extent that we can employ not M 2, but only M/2, premultiplications and thus require only half the memory. The task now additionally consists in the calculation of the power ae mod m, where e = (en1en2...e0)M is the representation of the exponent to the base M = 2k.

M-ary Algorithm for exponentiation with reduced number of premultiplications

  1. Compute and store a3 mod m, a5 mod m, a7 mod m,..., mod m.

  2. If en1 = 0, set p 1.

    If en1 0, factor en1 = 2tu with odd u. Set p au mod m.

    In each case set i n 2.

  3. If ei = 0, set p mod m by calculating mod m (k-fold squaring modulo m).

    If ei 0, factor ei = 2tu with odd u; set p mod m and then p pau mod m; now set p mod m.

  4. Set i i 1; if i 0, go to step 3.

  5. Output p.

The trick of this algorithm consists in dividing up the squarings required in step 3 in a clever way, such that the exponentiation of a is taken care of together with the even part 2t of ei. Within the squaring process the exponentiation of a by the odd part u of ei remains. The balance between multiplication and squaring is shifted to the more favorable squaring, and only the powers of a with odd exponent need to be precomputed and stored.

For this splitting one requires the uniquely determined representation ei = 2tu, u odd, of the exponent digit ei. For rapid access to t and u a table is used, which, for example, for k = 5 is displayed in Table 6.3.

Table 6.3: Values for the factorization of the exponent digits into products of a power of 2 and an odd factor

ei

t

u

ei

t

u

ei

t

u

0

0

0

11

0

11

22

1

11

1

0

1

12

2

3

23

0

23

2

1

1

13

0

13

24

3

3

3

0

3

14

1

7

25

0

25

4

2

1

15

0

15

26

1

13

5

0

5

16

4

1

27

0

27

6

1

3

17

0

17

28

2

7

7

0

7

18

1

9

29

0

29

8

3

1

19

0

19

30

1

15

9

0

9

20

2

5

31

0

31

10

1

5

21

0

21

   

To calculate these values we can use the auxiliary function twofact_l(), which will be introduced in Section 10.4.1. Before we can program the improved M-ary algorithm there remains one problem to be solved: How, beginning with the binary representation of the exponent or the representation to base B = 216, do we efficiently obtain its representation to base M = 2k for a variable k > 0? It will be of use here to do a bit of juggling with the various indices, and we can "mask out" the required digits ei to base M from the representation of e to base B. For this we set the following: Let (εr1εr2...ε0)2 be the representation of the exponent e to base 2 (we need this on account of the number r of binary digits). Let (eu1eu2...e0)B be the representation of e as a CLINT type to base B = 216, and let be the representation of e to the base M = 2k, k 16 (M should not be greater than our base B). The representation of e in memory as a CLINT object e_l corresponds to the sequence [u + 1],[e0],[e1],...,[eu1],[0] of USHORT values e_l[i] for i = 0,..., u + 1; one should note that we have added a leading zero.

Let f := , and for i = 0,..., f let si := and di := ki mod 16. With these settings the following statements hold:

  1. There are f + 1 digits in ; that is, n 1 = f.

  2. contains the least-significant bit of the digit .

  3. di specifies the position of the least-significant bit of in (counting of positions begins with 0). If i < f and di > 16 k, then not all the binary digits of are in ; the remaining (higher-valued) bits of are in +1. The desired digit thus corresponds to the k least-significant binary digits of

As a result we have for i Î {0,..., f} the following expression for determining :

(6.5) click to expand

Since for the sake of simplicity we set e_l [sf + 2] 0, this expression holds as well for i = f.

We have thus found an efficient method for accessing the digits of the exponent in its CLINT representation, which arise from its representation in a power-of-two base 2k with k 16, whereby we are saved an explicit transformation of the exponent. The number of necessary multiplications and squarings for the exponentiation is now

(6.6) click to expand

where in comparison to μ1(k) (see page 86) the expenditure for the precomputations has been reduced by half. The table for determining the favorable values of k (Table 6.4) now has a somewhat different appearance.

Table 6.4: Numbers of multiplications for typical sizes of exponents and various bases 2k

Number of binary digits in the exponent

k

32

64

128

512

768

1024

2048

4096

1

47

95

191

767

1151

1535

3071

6143

2

44

88

176

704

1056

1408

2816

5632

3

44

85

168

664

994

1325

2648

5293

4

46

85

164

638

954

1270

2534

5066

5

53

91

167

626

931

1237

2459

4904

6

68

105

179

626

924

1222

2414

4798

7

99

135

209

647

939

1232

2401

4739

8

162

198

270

702

990

1278

2429

4732

Starting with 768 binary digits of the exponent, the favorable values of k are larger by 1 than those given in the table for the previous version of the exponentiation algorithm, while the number of required modular multiplications has easily been reduced. It is to be expected that this procedure is on the whole more favorable than the variant considered previously. Nothing now stands in the way of an implementation of the algorithm.

To demonstrate the implementation of these principles we select an adaptive procedure that uses the appropriate optimal value for k. To accomplish this we rely again on [Cohe] and look for, as is specified there, the smallest integer value k that satisfies the inequality

(6.7) 

which comes from the formula μ2(k) given previously for the number of necessary multiplications based on the condition μ2(k + 1) μ2(k) 0. The constant number of modular squarings log2 e for all algorithms for exponentiation introduced thus far is eliminated; here only the "real" modular multiplications, that is, those that are not squarings, are considered.

The implementation of exponentiation with variable k requires a large amount of main memory for storing the precomputed powers of a; for k = 8 we require about 64 Kbyte for 127 CLINT variables (this is arrived at via (27 1) * sizeof(USHORT) * CLINTMAXSHORT), where two additional automatic CLINT fields were not counted. For applications with processors or memory models with segmented 16-bit architecture this already has reached the limit of what is possible (see in this regard, for example, [Dunc], Chapter 12, or [Petz], Chapter 7).

Depending on the system platform there are thus various strategies appropriate for making memory available. While the necessary memory for the function mexp5_l() is taken from the stack (as automatic CLINT variables), with each call of the following function mexpk_l() memory is allocated from the heap. To save the expenditure associated with this, one may imagine a variant in which the maximum needed memory is reserved during a one-time initialization and is released only at the end of the entire program. In each case it is possible to fit memory management to the concrete requirements and to orient oneself to this in the commentaries on the following code.

One further note for applications: It is recommended always to check whether it suffices to employ the algorithm with the base M = 25. The savings in time that comes with larger values of k is relatively not so large in comparison to the total calculation time so as to justify in all cases the greater demand on memory and the thereby requisite memory management. Typical calculation times for various exponentiation algorithms, on the basis of which one can decide whether to use them, are given in Appendix D.

The algorithm, implemented with M = 25, is contained in the FLINT/C package as the function mexp5_l(). With the macro EXP_L() contained in flint.h one can set the exponentiation function to be used: mexp5_l() or the following function mexpk_l() with variable k.

start sidebar

Function:

modular exponentiation

Syntax:

 int mexpk_l (CLINT bas_l, CLINT exp_l,                CLINT p_l, CLINT m_l); 

Input:

bas_l (base)

exp_l (exponent)

m_l (modulus)

Output:

p_l (power residue)

Return:

E_CLINT_OK if all is ok

E_CLINT_DBZ if division by 0

E_CLINT_MAL if malloc() error

end sidebar

start sidebar

We begin with a segment of the table for representing ei = 2tu, u odd, 0 ei < 28. The table is represented in the form of two vectors. The first, twotab[], contains the exponents t of the two-factor 2t, while the second, oddtab[], holds the odd part u of a digit 0 ei < 25. The complete table is contained, of course, in the FLINT/C source code.

end sidebar

 static int twotab[] = {0,0,1,0,2,0,1,0,3,0,1,0,2,0,1,0,4,0,1,0,2,0,1,0,3,0,1,0,2,0,1,0,5, ...}; static USHORT oddtab[]= {0,1,1,3,1,5,3,7,1,9,5,11,3,13,7,15,1,17,9,19,5,21,11,23,3,25,13, ...}; int mexpk_l (CLINT bas_l, CLINT exp_l, CLINT p_l, CLINT m_l) { 

start sidebar

The definitions reserve memory for the exponents plus the leading zero, as well as a pointer clint **aptr_l to the memory still to be allocated, which will take pointers to the powers of bas_l to be precomputed. In acc_l the intermediate results of the exponentiation will be stored.

end sidebar

   CLINT a_l, a2_l;   clint e_l[CLINTMAXSHORT + 1];   CLINTD acc_l;   clint **aptr_l, *ptr_l;   int noofdigits, s, t, i;   unsigned int k, lge, bit, digit, fk, word, pow2k, k_mask; 

start sidebar

Then comes the usual checking for division by 0 and reduction by 1.

end sidebar

   if (EQZ_L (m_l))     {       return E_CLINT_DBZ;     }   if (EQONE_L (m_l))     {       SETZERO_L (p_l);    /* modulus = 1 ==> residue = 0 */       return E_CLINT_OK;     } 

start sidebar

Base and exponent are copied to the working variables a_l and e_l, and any leading zeros are purged.

end sidebar

   cpy_l (a_l, bas_l);   cpy_l (e_l, exp_l); 

start sidebar

Now we process the simple cases a0 = 1 and 0e = 0 (e > 0).

end sidebar

   if (EQZ_L (e_l))     {       SETONE_L (p_l);       return E_CLINT_OK;     }   if (EQZ_L (a_l))     {       SETZERO_L (p_l);       return E_CLINT_OK;     } 

start sidebar

Next, the optimal value for k is determined; the value 2k is stored in pow2k, and in k_mask the value 2k1. For this the function ld_l() is used, which returns the number of binary digits of its argument.

end sidebar

   lge = ld_l (e_l);   k = 8;   while (k > 1 && ((k - 1) * (k << ((k - 1) << 1))/((1 << k ) - k - 1)) >= lge - 1)     {       --k;     }   pow2k = 1U << k;   k_mask = pow2k - 1U; 

start sidebar

Memory is allocated for the pointers to the powers of a_l to be computed. The base a_l is reduced modulo m_l.

end sidebar

   if ((aptr_l = (clint **) malloc (sizeof(clint *) * pow2k)) == NULL)     {       return E_CLINT_MAL;     }   mod_l (a_l, m_l, a_l);   aptr_l[1] = a_l; 

start sidebar

If k > 1, then memory is allocated for the powers to be computed. This is not necessary for k = 1, since then no powers have to be precomputed. In the following setting of the pointer aptr_l[i] one should note that in the addition of an offset to a pointer p a scaling of the offset by the compiler takes place, so that it counts objects of the pointer type of p.

We have already mentioned that the allocation of working memory can be carried out alternatively in a one-time initialization. The pointers to the CLINT objects would in this case be contained in global variables outside of the function or in static variables within mexpk_l().

end sidebar

     if (k > 1)       {          if ((ptr_l = (clint *) malloc (sizeof(CLINT) * ((pow2k >> 1) - 1))) == NULL)            {               return E_CLINT_MAL;            }          aptr_l[2] = a2_l;          for (aptr_l[3] = ptr_l, i = 5; i < (int)pow2k; i+=2)            {               aptr_l[i] = aptr_l[i - 2] + CLINTMAXSHORT;            } 

start sidebar

Now comes the precomputation of the powers of the value a stored in a_l. The values a3, a5, a7,...,ak 1 are computed (a2 is needed only in an auxiliary role).

end sidebar

     msqr_l (a_l, aptr_l[2], m_l);     for (i = 3; i < (int)pow2k; i += 2)        {          mmul_l (aptr_l[2], aptr_l[i - 2], aptr_l[i], m_l);        }    } 

start sidebar

This ends the case distinction for k > 1. The exponent is lengthened by the leading zero.

end sidebar

   *(MSDPTR_L (e_l) + 1) = 0; 

start sidebar

The determination of the value f (represented by the variable noofdigits).

end sidebar

   noofdigits = (lge - 1)/k;   fk = noofdigits * k; 

start sidebar

Word position si and bit position di of the digit ei in the variables word and bit.

end sidebar

   word = fk >> LDBITPERDGT;    /* fk div 16 */   bit = fk & (BITPERDGT-1U);    /* fk mod 16 */ 

start sidebar

Calculation of the digit en1 with the above-derived formula; en1 is represented by the variable digit.

end sidebar

   switch (k)     {       case 1:       case 2:       case 4:       case 8:          digit = ((ULONG)(e_l[word + 1] ) >> bit) & k_mask;          break;       default:          digit = ((ULONG)(e_l[word + 1] | ((ULONG)e_l[word + 2]                                                << BITPERDGT)) >> bit) & k_mask;     } 

start sidebar

First run through step 3 of the algorithm, the case digit = en1 0.

end sidebar

     if (digit != 0)    /* k-digit > 0 */        {          cpy_l (acc_l, aptr_l[oddtab[digit]]); 

start sidebar

Calculation of ; t is set to the two-part of en1 via twotab[en1]; p is represented by acc_l.

end sidebar

          t = twotab[digit];          for (; t > 0; t--)             {               msqr_l (acc_l, acc_l, m_l);             }             }          else    /* k-digit == 0 */             {               SETONE_L (acc_l);             } 

start sidebar

Loop over noofdigits beginning with f - 1.

end sidebar

   for (--noofdigits, fk -= k; noofdigits >= 0; noofdigits--, fk -= k)     { 

start sidebar

Word position si and bit position di of the digit ei in the variables word and bit.

end sidebar

      word = fk >> LDBITPERDGT;     /* fk div 16 */      bit = fk & (BITPERDGT - 1U);     /* fk mod 16 */ 

start sidebar

Computation of the digit ei with the formula derived above; ei is represented by the variable digit.

end sidebar

      switch (k)         {           case 1:           case 2:           case 4:           case 8:              digit = ((ULONG)(e_l[word + 1] ) >> bit) & k_mask;              break;           default:              digit = ((ULONG)(e_l[word + 1] | ((ULONG)e_l[word + 2]                                                    << BITPERDGT)) >> bit) & k_mask;         } 

start sidebar

Step 3 of the algorithm, the case digit = ei 0; t is set via the table twotab[ei] to the two-part of ei.

end sidebar

      if (digit != 0)    /* k-digit > 0 */         {           t = twotab[digit]; 

start sidebar

Calculation of au in acc_l. Access to au with the odd part u of ei is via aptr_l [oddtab [ei]].

end sidebar

          for (s = k - t; s > 0; s--)             {               msqr_l (acc_l, acc_l, m_l);             }          mmul_l (acc_l, aptr_l[oddtab[digit]], acc_l, m_l); 

start sidebar

Calculation of ; p is still represented by acc_l.

end sidebar

          for (; t > 0; t--)             {               msqr_l (acc_l, acc_l, m_l);             }        }     else    /* k-digit == 0 */        { 

start sidebar

Step 3 of the algorithm, case ei = 0: Calculate .

end sidebar

          for (s = k; s > 0; s--)             {               msqr_l (acc_l, acc_l, m_l);             }         }     } 

start sidebar

End of the loop; output of acc_l as power residue modulo m_l.

end sidebar

   cpy_l (p_l, acc_l); 

start sidebar

At the end, allocated memory is released.

end sidebar

   free (aptr_l);   if (ptr_l != NULL)     free (ptr_l);   return E_CLINT_OK; } 

The various processes of M-ary exponentiation can be clarified with the help of a numerical example. To this end let us examine the calculation of the power 1234667 mod 18577, which will be carried out by the function mexpk_l() in the following steps:

  1. Precomputations

    The representation of the exponent e = 667 can be expressed to the base 2k with k = 2 (cf. the algorithm for M-ary exponentiation on page 86), whereby the exponent e has the representation e = .

    The power a3 mod 18577 has the value 17354. Further powers of a do not arise in the precomputation because of the small size of the exponent.

  2. Exponentiation loop

    exponent digit ei = 2tu

    21 · 1

    21 · 1

    20 · 1

    21 · 1

    20 · 3

    p p2 mod n

    -

    14132

    13261

    17616

    13599

    p mod n

    -

    -

    4239

    -

    17343

    p pau mod n

    1234

    13662

    10789

    3054

    4445

    p p2 mod n

    18019

    7125

    -

    1262

    -

  3. Result

    p = 1234667 mod 18577 = 4445.

As an extension to the general case we shall introduce a special version of exponentiation with a power of two 2k as exponent. From the above considerations we know that this function can be implemented very easily by means of k-fold exponentiation. The exponent 2k will be specified by k.

start sidebar

Function:

modular exponentiation with exponent a power of 2

Syntax:

 int mexp2_l (CLINT a_l, USHORT k, CLINT p_l,                CLINT m_l); 

Input:

a_l (base)

k (exponent of 2)

m_l (modulus)

Output:

p_l (residue of mod m_l)

Return:

E_CLINT_OK if all is ok

E_CLINT_DBZ if division by 0

end sidebar

 int mexp2_l (CLINT a_l, USHORT k, CLINT p_l, CLINT m_l) {   CLINT tmp_l;   if (EQZ_L (m_l))     {       return E_CLINT_DBZ;     } 

start sidebar

If k > 0, then a_l is squared k times modulo m_l.

end sidebar

   if (k > 0)     {       cpy_l (tmp_l, a_l);       while (k-- > 0)          {            msqr_l (tmp_l, tmp_l, m_l);          }       cpy_l (p_l, tmp_l);     }   else 

start sidebar

Otherwise, if k = 0, we need only to reduce modulo m_l.

end sidebar

     {       mod_l (a_l, m_l, p_l);     }   return E_CLINT_OK; } 


Team-Fly


Cryptography in C and C++
Cryptography in C and C++
ISBN: 189311595X
EAN: 2147483647
Year: 2001
Pages: 127

flylib.com © 2008-2017.
If you may any questions please contact us: flylib@qtcs.net