Section 5.2. The AES Cipher


[Page 140]

5.2. The AES Cipher[2]

[2] Much of the material in this section originally appeared in [STAL02].

The Rijndael proposal for AES defined a cipher in which the block length and the key length can be independently specified to be 128, 192, or 256 bits. The AES specification uses the same three key size alternatives but limits the block length to 128 bits. A number of AES parameters depend on the key length (Table 5.3). In the description of this section, we assume a key length of 128 bits, which is likely to be the one most commonly implemented.

Table 5.3. AES Parameters

Key size (words/bytes/bits)

4/16/128

6/24/192

8/32/256

Plaintext block size (words/bytes/bits)

4/16/128

4/16/128

4/16/128

Number of rounds

10

12

14

Round key size (words/bytes/bits)

4/16/128

4/16/128

4/16/128

Expanded key size (words/bytes)

44/176

52/208

60/240


Rijndael was designed to have the following characteristics:

  • Resistance against all known attacks

  • Speed and code compactness on a wide range of platforms

  • Design simplicity

Figure 5.1 shows the overall structure of AES. The input to the encryption and decryption algorithms is a single 128-bit block. In FIPS PUB 197, this block is depicted as a square matrix of bytes. This block is copied into the State array, which is modified at each stage of encryption or decryption. After the final stage, State is copied to an output matrix. These operations are depicted in Figure 5.2a. Similarly, the 128-bit key is depicted as a square matrix of bytes. This key is then expanded into an array of key schedule words; each word is four bytes and the total key schedule is 44 words for the 128-bit key (Figure 5.2b). Note that the ordering of bytes within a matrix is by column. So, for example, the first four bytes of a 128-bit plaintext input to the encryption cipher occupy the first column of the in matrix, the second four bytes occupy the second column, and so on. Similarly, the first four bytes of the expanded key, which form a word, occupy the first column of the w matrix.


[Page 141]

Figure 5.1. AES Encryption and Decryption



[Page 142]

Figure 5.2. AES Data Structures



[Page 143]

Before delving into details, we can make several comments about the overall AES structure:

  1. One noteworthy feature of this structure is that it is not a Feistel structure. Recall that in the classic Feistel structure, half of the data block is used to modify the other half of the data block, and then the halves are swapped. Two of the AES finalists, including Rijndael, do not use a Feistel structure but process the entire data block in parallel during each round using substitutions and permutation.

  2. The key that is provided as input is expanded into an array of forty-four 32-bit words, w[i]. Four distinct words (128 bits) serve as a round key for each round; these are indicated in Figure 5.1.

  3. Four different stages are used, one of permutation and three of substitution:

    • Substitute bytes: Uses an S-box to perform a byte-by-byte substitution of the block

    • ShiftRows: A simple permutation

    • MixColumns: A substitution that makes use of arithmetic over GF(28)

    • AddRoundKey: A simple bitwise XOR of the current block with a portion of the expanded key

  4. The structure is quite simple. For both encryption and decryption, the cipher begins with an AddRoundKey stage, followed by nine rounds that each includes all four stages, followed by a tenth round of three stages. Figure 5.3 depicts the structure of a full encryption round.

    Figure 5.3. AES Encryption Round
    (This item is displayed on page 144 in the print version)

  5. Only the AddRoundKey stage makes use of the key. For this reason, the cipher begins and ends with an AddRoundKey stage. Any other stage, applied at the beginning or end, is reversible without knowledge of the key and so would add no security.

  6. The AddRoundKey stage is, in effect, a form of Vernam cipher and by itself would not be formidable. The other three stages together provide confusion, diffusion, and nonlinearity, but by themselves would provide no security because they do not use the key. We can view the cipher as alternating operations of XOR encryption (AddRoundKey) of a block, followed by scrambling of the block (the other three stages), followed by XOR encryption, and so on. This scheme is both efficient and highly secure.

  7. Each stage is easily reversible. For the Substitute Byte, ShiftRows, and MixColumns stages, an inverse function is used in the decryption algorithm. For the AddRoundKey stage, the inverse is achieved by XORing the same round key to the block, using the result that A A B = B.

  8. Once it is established that all four stages are reversible, it is easy to verify that decryption does recover the plaintext. Figure 5.1 lays out encryption and decryption going in opposite vertical directions. At each horizontal point (e.g., the dashed line in the figure), State is the same for both encryption and decryption.


    [Page 145]

  9. The final round of both encryption and decryption consists of only three stages. Again, this is a consequence of the particular structure of AES and is required to make the cipher reversible.

We now turn to a discussion of each of the four stages used in AES. For each stage, we describe the forward (encryption) algorithm, the inverse (decryption) algorithm, and the rationale for the stage. This is followed by a discussion of key expansion.

As was mentioned in Chapter 4, AES uses arithmetic in the finite field GF(28), with the irreducible polynomial[3] m(x) = x8 + x4 + x3 + x + 1. The developers of Rijndael give as their motivation for selecting this one of the 30 possible irreducible polynomials of degree 8 that it is the first one on the list given in [LIDL94].

[3] In the remainder of this discussion, references to GF(28) refer to the finite field defined with this polynomial.

Substitute Bytes Transformation

Forward and Inverse Transformations

The forward substitute byte transformation, called SubBytes, is a simple table lookup (Figure 5.4a). AES defines a 16 x 16 matrix of byte values, called an S-box (Table 5.4a), that contains a permutation of all possible 256 8-bit values. Each individual byte of State is mapped into a new byte in the following way: The leftmost 4 bits of the byte are used as a row value and the rightmost 4 bits are used as a column value. These row and column values serve as indexes into the S-box to select a unique 8-bit output value. For example, the hexadecimal value[4] {95} references row 9, column 5 of the S-box, which contains the value {2A}. Accordingly, the value {95} is mapped into the value {2A}.

[4] In FIPS PUB 197, a hexadecimal number is indicated by enclosing it in curly brackets. We use that convention in this chapter.


[Page 147]

Figure 5.4. AES Byte-Level Operations
(This item is displayed on page 145 in the print version)


Table 5.4. AES S-Boxes
(This item is displayed on page 146 in the print version)


Here is an example of the SubBytes transformation:

The S-box is constructed in the following fashion:

  1. Initialize the S-box with the byte values in ascending sequence row by row. The first row contains {00}, {01}, {02},.... {0F}; the second row contains {10}, {11}, etc.; and so on. Thus, the value of the byte at row x, column y is {xy}.

  2. Map each byte in the S-box to its multiplicative inverse in the finite field GF(28); the value {00} is mapped to itself.

  3. Consider that each byte in the S-box consists of 8 bits labeled (b7, b6, b5, b4, b3, b2, b1, b0). Apply the following transformation to each bit of each byte in the S-box:

    Equation 5-1


    where ci is the ith bit of byte c with the value {63}; that is, (c7c6c5c4c3c2c1c0) = (01100011). The prime (') indicates that the variable is to be updated by the value on the right. The AES standard depicts this transformation in matrix form as follows:

    Equation 5-2



[Page 148]

Equation (5.2) has to be interpreted carefully. In ordinary matrix multiplication,[5] each element in the product matrix is the sum of products of the elements or one row and one column. In this case, each element in the product matrix is the bitwise XOR of products of elements of one row and one column. Further, the final addition shown in Equation (5.2) is a bitwise XOR.

[5] For a brief review of the rules of matrix and vector multiplication, see the Math Refresher document and the Computer Science Student Resource site at williamstallings.com/StudentSupport.html.

As an example, consider the input value {95}. The multiplicative inverse in GF(28) is {95}1 = {8A}, which is 10001010 in binary. Using Equation (5.2),


The result is {2A}, which should appear in row {09} column {05} of the S-box. This is verified by checking Table 5.4a.

The inverse substitute byte transformation, called InvSubBytes, makes use of the inverse S-box shown in Table 5.4b. Note, for example, that the input {2A} produces the output {95} and the input {95} to the S-box produces {2A}. The inverse S-box is constructed by applying the inverse of the transformation in Equation (5.1) followed by taking the multiplicative inverse in GF(28). The inverse transformation is:

bi' = b(i + 2) mod 8 i + 5) mod 8 i + 7) mod 8 i

where byte d = {05}, or 00000101. We can depict this transformation as follows:


To see that InvSubBytes is the inverse of SubBytes, label the matrices in SubBytes and InvSubBytes as X and Y, respectively, and the vector versions of constants c and d as C and D, respectively. For some 8-bit vector B, Equation (5.2) becomes B' = XB Y(XB B. Multiply out, we must show YXB B. This becomes


[Page 149]


We have demonstrated that YX equals the identity matrix, and the YC = D, so that YC Rationale

The S-box is designed to be resistant to known cryptanalytic attacks. Specifically, the Rijndael developers sought a design that has a low correlation between input bits and output bits, and the property that the output cannot be described as a simple mathematical function of the input [DAEM01]. In addition, the constant in Equation (5.1) was chosen so that the S-box has no fixed points [S-box(a) = a] and no "opposite fixed points" [S-box(a) = ], where is the bitwise complement of Of course, the S-box must be invertible, that is, IS-box[S-box(a)] = a. However, the S-box is not self-inverse in the sense that it is not true that S-box(a) = IS-box(a). For example, [S-box({95}) = {2A}, but IS-box({95}) = {AD}.


[Page 150]

ShiftRows Transformation

Forward and Inverse Transformations

The forward shift row transformation, called ShiftRows, is depicted in Figure 5.5a. The first row of State is not altered. For the second row, a 1-byte circular left shift is performed. For the third row, a 2-byte circular left shift is performed. For the fourth row, a 3-byte circular left shift is performed. The following is an example of ShiftRows:

Figure 5.5. AES Row and Column Operations


The inverse shift row transformation, called InvShiftRows, performs the circular shifts in the opposite direction for each of the last three rows, with a one-byte circular right shift for the second row, and so on.

Rationale

The shift row transformation is more substantial than it may first appear. This is because the State, as well as the cipher input and output, is treated as an array of four 4-byte columns. Thus, on encryption, the first 4 bytes of the plaintext are copied to the first column of State, and so on. Further, as will be seen, the round key is applied to State column by column. Thus, a row shift moves an individual byte from one column to another, which is a linear distance of a multiple of 4 bytes. Also note that the transformation ensures that the 4 bytes of one column are spread out to four different columns. Figure 5.3 illustrates the effect.


[Page 151]

MixColumns Transformation

Forward and Inverse Transformations

The forward mix column transformation, called MixColumns, operates on each column individually. Each byte of a column is mapped into a new value that is a function of all four bytes in that column. The transformation can be defined by the following matrix multiplication on State (Figure 5.5b):

Equation 5-3


Each element in the product matrix is the sum of products of elements of one row and one column. In this case, the individual additions and multiplications[6] are performed in GF(28). The MixColumns transformation on a single column j(0 3) of [6] We follow the convention of FIPS PUB 197 and use the symbol · to indicate multiplication over the finite field GF(28) and to indicate bitwise XOR, which corresponds to addition in GF(28).

Equation 5-4


The following is an example of MixColumns:

Let us verify the first column of this example. Recall from Section 4.6 that, in GF(28), addition is the bitwise XOR operation and that multiplication can be performed according to the rule established in Equation (4.10). In particular, multiplication of a value by x (i.e., by {02}) can be implemented as a 1-bit left shift followed by a conditional bitwise XOR with (0001 1011) if the leftmost bit of the original value (prior to the shift) is 1. Thus, to verify the MixColumns transformation on the first column, we need to show that

({02} · {87})

({03} · {6E})

{46}

{A6}

= {47}

{87}

({02} · {6E})

({03} · {46})

{A6}

= {37}

{87}

{6E}

({02} · {46}

({03} · {A6})

= {94}

({03} · {87})

{6E}

{46}

({02} · {A6}

= {ED}



[Page 152]

For the first equation, we have {02} · {87} = (0000 1110) (0001 1011) = (0001 0101); and {03} · {6E} = {6E} ({02} · {6E}) = (0110 1110) (1101 1100) = (1011 0010). Then

 

{02} · {87}

=

0001 0101

 
 

{03} · {6E}

=

1011 0010

 
 

{46}

=

0100 0110

 
 

{A6}

=

1010 0110

   

0100 0111 = {47}


The other equations can be similarly verified.

The inverse mix column transformation, called InvMixColumns, is defined by the following matrix multiplication:

Equation 5-5


It is not immediately clear that Equation (5.5) is the inverse of Equation (5.3). We need to show that:


which is equivalent to showing that:

Equation 5-6


That is, the inverse transformation matrix times the forward transformation matrix equals the identity matrix. To verify the first column of Equation (5.6), we need to show that:

({0E} · {02}) {0B} {0D} ({09} · {03}) = {01}

({09} · {02}) {0E} {0B} ({0D} · {03}) = {00}

({0D} · {02}) {09} {0E} ({0B} · {03}) = {00}

({0B} · {02}) {0D} {09} ({0E} · {03}) = {00}


[Page 153]

For the first equation, we have {0E} · {02}) 00011100; and {09} · {03} = {09} ({09} · {02}) = 00001001 00010010 = 00011011. Then

 

{0E} · {02}

=

00011100

 

{0B}

=

00001011

 

{0D}

=

00001101

 

{09} · {03}

=

00011011

   

00000001


The other equations can be similarly verified.

The AES document describes another way of characterizing the MixColumns transformation, which is in terms of polynomial arithmetic. In the standard, MixColumns is defined by considering each column of State to be a four-term polynomial with coefficients in GF(28). Each column is multiplied modulo (x4 + 1) by the fixed polynomial a(x), given by

Equation 5-7


Appendix 5A demonstrates that multiplication of each column of State by a(x) can be written as the matrix multiplication of Equation (5.3). Similarly, it can be seen that the transformation in Equation (5.5) corresponds to treating each column as a four-term polynomial and multiplying each column by b(x), given by

Equation 5-8


It can readily be shown that b(x) = a1 (x) mod (x4 + 1).

Rationale

The coefficients of the matrix in Equation (5.3) are based on a linear code with maximal distance between code words, which ensures a good mixing among the bytes of each column. The mix column transformation combined with the shift row transformation ensures that after a few rounds, all output bits depend on all input bits. See [DAEM99] for a discussion.

In addition, the choice of coefficients in MixColumns, which are all {01}, {02}, or {03}, was influenced by implementation considerations. As was discussed, multiplication by these coefficients involves at most a shift and an XOR. The coefficients in InvMixColumns are more formidable to implement. However, encryption was deemed more important than decryption for two reasons:

  1. For the CFB and OFB cipher modes (Figures 6.5 and 6.6; described in Chapter 6), only encryption is used.

  2. As with any block cipher, AES can be used to construct a message authentication code (Part Two), and for this only encryption is used.

AddRoundKey Transformation

Forward and Inverse Transformations

In the forward add round key transformation, called AddRoundKey, the 128 bits of State are bitwise XORed with the 128 bits of the round key. As shown in Figure 5.4b, the operation is viewed as a columnwise operation between the 4 bytes of a State column and one word of the round key; it can also be viewed as a byte-level operation. The following is an example of AddRoundKey:


[Page 154]

The first matrix is State, and the second matrix is the round key.

The inverse add round key transformation is identical to the forward add round key transformation, because the XOR operation is its own inverse.

Rationale

The add round key transformation is as simple as possible and affects every bit of State. The complexity of the round key expansion, plus the complexity of the other stages of AES, ensure security.

AES Key Expansion

Key Expansion Algorithm

The AES key expansion algorithm takes as input a 4-word (16-byte) key and produces a linear array of 44 words (176 bytes). This is sufficient to provide a 4-word round key for the initial AddRoundKey stage and each of the 10 rounds of the cipher. The following pseudocode describes the expansion:

[View full width]

KeyExpansion (byte key[16], word w[44]) { word temp for (i = 0; i < 4; i++) w[i] = (key[4*i], key[4*i+1], key[4*i+2], key[4*i+3]); for (i = 4; i < 44; i++) { temp = w[i 1]; if (i mod 4 = 0) temp = SubWord (RotWord (temp)) Rcon[i/4]; w[i] = w[i4] temp } }



The key is copied into the first four words of the expanded key. The remainder of the expanded key is filled in four words at a time. Each added word w[i] depends on the immediately preceding word, w[i 1], and the word four positions back,w[i 4]. In three out of four cases, a simple XOR is used. For a word whose position in the w array is a multiple of 4, a more complex function is used. Figure 5.6 illustrates the generation of the first eight words of the expanded key, using the symbol g to represent that complex function. The function g consists of the following subfunctions:


    [Page 155]
  1. RotWord performs a one-byte circular left shift on a word. This means that an input word [b0, b1, b2, b3] is transformed into [b1, b2, b3, b0].

  2. SubWord performs a byte substitution on each byte of its input word, using the S-box (Table 5.4a).

  3. The result of steps 1 and 2 is XORed with a round constant, Rcon[j].

Figure 5.6. AES Key Expansion


The round constant is a word in which the three rightmost bytes are always 0. Thus the effect of an XOR of a word with Rcon is to only perform an XOR on the leftmost byte of the word. The round constant is different for each round and is defined as Rcon[j] = (RC[j], 0, 0, 0), with RC[1] = 1, RC[j] = 2 · RC[j - 1] and with multiplication defined over the field GF(28). The values of RC[j] in hexadecimal are

j

1

2

3

4

5

6

7

8

9

10

RC[j]

01

02

04

08

10

20

40

80

1B

36


For example, suppose that the round key for round 8 is

EA D2 73 21 B5 8D BA D2 31 2B F5 60 7F 8D 29 2F

Then the first 4 bytes (first column) of the round key for round 9 are calculated as follows:

i (decimal)

temp

After RotWord

After SubWord

Rcon (9)

After XOR with Rcon

w[i 4]

w[i] = temp w[i 4]

36

7F8D292F

8D292F7F

5DA515D2

1B000000

46A515D2

EAD27321

AC7766F3


Rationale

The Rijndael developers designed the expansion key algorithm to be resistant to known cryptanalytic attacks. The inclusion of a round-dependent round constant eliminates the symmetry, or similarity, between the ways in which round keys are generated in different rounds. The specific criteria that were used are as follows [DAEM99]:


[Page 156]
  • Knowledge of a part of the cipher key or round key does not enable calculation of many other round key bits

  • An invertible transformation [i.e., knowledge of any Nk consecutive words of the Expanded Key enables regeneration the entire expanded key (Nk = key size in words)]

  • Speed on a wide range of processors

  • Usage of round constants to eliminate symmetries

  • Diffusion of cipher key differences into the round keys; that is, each key bit affects many round key bits

  • Enough nonlinearity to prohibit the full determination of round key differences from cipher key differences only

  • Simplicity of description

The authors do not quantify the first point on the preceding list, but the idea is that if you know less than Nk consecutive words of either the cipher key or one of the round keys, then it is difficult to reconstruct the remaining unknown bits. The fewer bits one knows, the more difficult it is to do the reconstruction or to determine other bits in the key expansion.

Equivalent Inverse Cipher

As was mentioned, the AES decryption cipher is not identical to the encryption cipher (Figure 5.1). That is, the sequence of transformations for decryption differs from that for encryption, although the form of the key schedules for encryption and decryption is the same. This has the disadvantage that two separate software or firmware modules are needed for applications that require both encryption and decryption. There is, however, an equivalent version of the decryption algorithm that has the same structure as the encryption algorithm. The equivalent version has the same sequence of transformations as the encryption algorithm (with transformations replaced by their inverses). To achieve this equivalence, a change in key schedule is needed.

Two separate changes are needed to bring the decryption structure in line with the encryption structure. An encryption round has the structure SubBytes, ShiftRows, MixColumns, AddRoundKey. The standard decryption round has the structure InvShiftRows, InvSubBytes, AddRoundKey, InvMixColumns. Thus, the first two stages of the decryption round need to be interchanged, and the second two stages of the decryption round need to be interchanged.

Interchanging InvShiftRows and InvSubBytes

InvShiftRows affects the sequence of bytes in State but does not alter byte contents and does not depend on byte contents to perform its transformation. InvSubBytes affects the contents of bytes in State but does not alter byte sequence and does not depend on byte sequence to perform its transformation. Thus, these two operations commute and can be interchanged. For a given State Si,

InvShiftRows [InvSubBytes (Si)] = InvSubBytes [InvShiftRows (Si)]


[Page 157]
Interchanging AddRoundKey and InvMixColumns

The transformations AddRoundKey and InvMixColumns do not alter the sequence of bytes in State. If we view the key as a sequence of words, then both AddRoundKey and InvMixColumns operate on State one column at a time. These two operations are linear with respect to the column input. That is, for a given State Si and a given round key wj:

InvMixColumns (Si j) = [InvMixColumns (Si)] [InvMixColumns (j)]

To see this, suppose that the first column of State Si is the sequence (y0, y1, y2, y3) and the first column of the round key wj is (k0, k1, k2, k3). Then we need to show that


Let us demonstrate that for the first column entry. We need to show that:

[{0E} · (y0 k0)] [{0B} · (y1 k1)] [{0D} · (y2 k2)] [{09} · (y3 k3)]

= [{0E} · y0] [{0B} · y1] [{0D} · y2] [{09} · y3]

[[{0E} · k0] ] [{0B} · k1] [{0D} · k2] [{09} · k3]

This equation is valid by inspection. Thus, we can interchange AddRoundKey and InvMixColumns, provided that we first apply InvMixColumns to the round key. Note that we do not need to apply InvMixColumns to the round key for the input to the first AddRoundKey transformation (preceding the first round) nor to the last AddRoundKey transformation (in round 10). This is because these two AddRoundKey transformations are not interchanged with InvMixColumns to produce the equivalent decryption algorithm.

Figure 5.7 illustrates the equivalent decryption algorithm.

Figure 5.7. Equivalent Inverse Cipher
(This item is displayed on page 158 in the print version)


Implementation Aspects

The Rijndael proposal [DAEM99] provides some suggestions for efficient implementation on 8-bit processors, typical for current smart cards, and on 32-bit processors, typical for PCs.

8-Bit Processor

AES can be implemented very efficiently on an 8-bit processor. AddRoundKey is a bytewise XOR operation. ShiftRows is a simple byte shifting operation. SubBytes operates at the byte level and only requires a table of 256 bytes.

The transformation MixColumns requires matrix multiplication in the field GF(28), which means that all operations are carried out on bytes. MixColumns only requires multiplication by {02} and {03}, which, as we have seen, involved simple shifts, conditional XORs, and XORs. This can be implemented in a more efficient way that eliminates the shifts and conditional XORs. Equation Set (5.4) shows the equations for the MixColumns transformation on a single column. Using the identity {03} · x = ({02} · x)


[Page 158]

Equation 5-9


Equation Set (5.9) is verified by expanding and eliminating terms.


[Page 159]

The multiplication by {02} involves a shift and a conditional XOR. Such an implementation may be vulnerable to a timing attack of the sort described in Section 3.4. To counter this attack and to increase processing efficiency at the cost of some storage, the multiplication can be replaced by a table lookup. Define the 256-byte table X2, such that X2[i] = {02} · i. Then Equation Set (5.9) can be rewritten as

Tmp = so, j j s2, s3, s'0, j = s0, j Tmp so, j j]

s'1, c = s1, j Tmp s1, j j]

s'2, c = s2, j Tmp s2, j j]

s'3, j = s3, j Tmp s3, j j]

32-Bit Processor

The implementation described in the preceding subsection uses only 8-bit operations. For a 32-bit processor, a more efficient implementation can be achieved if operations are defined on 32-bit words. To show this, we first define the four transformations of a round in algebraic form. Suppose we begin with a State matrix consisting of elements ai,j and a round key matrix consisting of elements ki,j. Then the transformations can be expressed as follows:

SubBytes

bi,j = S[ai,j]

ShiftRows

MixColumns

AddRoundKey


In the ShiftRows equation, the column indices are taken mod 4. We can combine all of these expressions into a single equation:



[Page 160]

In the second equation, we are expressing the matrix multiplication as a linear combination of vectors. We define four 256-word (1024-byte) tables as follows:


Thus, each table takes as input a byte value and produces a column vector (a 32-bit word) that is a function of the S-box entry for that byte value. These tables can be calculated in advance.

We can define a round function operating on a column in the following fashion:


As a result, an implementation based on the preceding equation requires only four table lookups and four XORs per column per round, plus 4 Kbytes to store the table. The developers of Rijndael believe that this compact, efficient implementation was probably one of the most important factors in the selection of Rijndael for AES.




Cryptography and Network Security Principles and Practices
Cryptography and Network Security (4th Edition)
ISBN: 0131873164
EAN: 2147483647
Year: 2005
Pages: 209

flylib.com © 2008-2017.
If you may any questions please contact us: flylib@qtcs.net