The key-mixing function creates a new key for every packet transmitted. It was introduced for two reasons:
The approach is to combine the session key, the IV and the source MAC address together with a hash function to produce a mixed key. Including the source MAC address provides added protection against forgery and also separates out the key space of the two communicating devices that share the session key.
Performing a hash operation for every packet to be encrypted or decrypted is a major overhead and hard work for the low-power MAC processor typical in earlier Wi-Fi LAN systems. To ease the burden, the mixing has been divided into two phases. During phase 1, the session key and source address are hashed. The result remains constant during the session. In phase 2, performed for every packet, the IV is mixed with the result of the first phase to produce an encryption key. This key is then used to initialize the RC4 encryption hardware.
Note that even the second part of the key-mixing computation can be performed in advance because the IV increases monotonically. Therefore, a station knows which values of IV will be coming up shortly and could precompute a number of keys in advance. This step avoids the need for a real-time computation when a packet is received. The two phases of key mixing are shown in Figure 11.3. As you can see in this figure, 104 bits of mixed key material are needed to form the total RC4 key of 128 bits when the IV is added.
There is nothing very glamorous about the computations for the key mixing but let's quickly look at the algorithm.
The inputs to the computation are abbreviated:
The two phases can be written as the following functions:
When the system starts up or a new key exchange occurs, the TSC is reset to 0. The system typically computes the value of P1K right away and stores it for use in generating P2K. P1K needs to be recomputed every time TSCU changes, which is every 65536 packets. There is no reason why the next value of P1K can't be computed in advance so there will be no delay when TCSU actually changes.
Substitution Table or S-Box
Both phase 1 and phase 2 require a byte substitution table or S-box. The substitution table is to the computer what logarithm tables used to be to schoolchildren before calculators. You take a value and look up a corresponding value in a table. The calculations have been done in advance to determine the correct values in the table. A typical substitution table for a byte value is 256 bytes long so there is one entry for each value of the byte.
However, key mixing uses 16-bit word values. A full substitution table for a 16-bit value would be 216 or 65,336 words long a total of 128Kbytes! However, this full table is needed only if you want to be able to have a reversible function and create a second table that will "undo" the substitution and get you back where you started. You do not need such a table for hashing functions; in fact, this type of reversal is something you want to prevent. Therefore, the key-mixing algorithm uses a partial substitution table with 512 word entries, which you can think of as two tables, each with 256 words. To make the substitution for a 16-bit word X, we use the upper byte of X as an index into the first table and the lower byte of X as an index into the second table. Then we exclusive OR the two words from the tables to produce a final 16-bit word substitution. This is denoted in the algorithm by the function:
i = S[j]
where i is the result of substituting for j.
The 512 values for the substitution table are listed in the standard. The same substation tables must be used by everyone for this approach to work.
Phase 1 Computation
The output of phase 1 is only 80 bits long (not 128), but it uses all 128 bits of the temporal key in the computation. The result is computed in an array of five 16-bit words called P1K0, P1K1, P1K2, P1K3, and P1K4. The following terminology is used in the algorithm:
the expression XY is used to denote combining two bytes into a 16-bit word so that:
S[ ] denotes the result from the 16-bit substitution table:
PHASE1_STEP1 P1K0 = TSC1 P1K1 = TSC2 P1K2 = TA1 TA0 P1K3 = TA3 TA2 P1K4 = TA5 TA4 PHASE1_STEP2: FOR i = 0 to 3 BEGIN P1K0 = P1K0 + S[ P1K4 (TK1 TK0) ] P1K1 = P1K1 + S[ P1K0 (TK5 TK4) ] P1K2 = P1K2 + S[ P1K1 (TK9 TK8) ] P1K3 = P1K3 + S[ P1K2 (TK13 TK12) ] P1K4 = P1K4 + S[ P1K3 (TK1 TK0) ] + i P1K0 = P1K0 + S[ P1K4 (TK3 TK2) ] P1K1 = P1K1 + S[ P1K0 (TK7 TK6) ] P1K2 = P1K2 + S[ P1K1 (TK11 TK10) ] P1K3 = P1K3 + S[ P1K2 (TK15 TK14) ] P1K4 = P1K4 + S[ P1K3 (TK3 TK2) ] + 2*i + 1 END
Although this is quite a bit of computation certainly more than in phase 2 the arithmetic comprises entirely shifts, adds, and exclusive OR operations.
Phase 2 Computation
On the face of it, phase 2 looks more complicated than phase 1. However, although there are more steps, there is no repeating loop in the computation. The result is computed in an array of six 16-bits words called: PPK0, PPK1, PPK2, PPK3, and PPK4. The following terminology is used in the algorithm:
The expression XY is used to denote combining two bytes into a 16-bit word so that:
The expression >>>(word) means that the 16-bit word is rotated one place right.
The expression (word) means that the 16-bit word is shifted one place right.
S[ ] denotes the result from the 16-bit substitution table.
RC4Keyn means the nth byte of the RC4 key used for encryption.
PHASE2,STEP1: PPK0 = P1K0 PPK1 = P1K1 PPK2 = P1K2 PPK3 = P1K3 PPK4 = P1K4 PPK5 = P1K5 + TSC0 PHASE2,STEP2: PPK0 = PPK0 + S[ PPK5 (TK1 TK0) ] PPK1 = PPK1 + S[ PPK0 (TK3 TK2) ] PPK2 = PPK2 + S[ PPK1 (TK5 TK4) ] PPK3 = PPK3 + S[ PPK2 (TK7 TK6) ] PPK4 = PPK4 + S[ PPK3 (TK9 TK8) ] PPK5 = PPK5 + S[ PPK4 (TK11 TK10) ] PPK0 = PPK0 + >>>(PPK5 (TK13 TK12)) PPK1 = PPK1 + >>>(PPK0 (TK15 TK14)) PPK2 = PPK2 + >>>(PPK1) PPK3 = PPK3 + >>>(PPK2) PPK4 = PPK4 + >>>(PPK3) PPK5 = PPK5 + >>>(PPK4) PHASE2,STEP3: RC4Key0 = UpperByte(TSC0) RC4Key1 = (UpperByte (TSC0) | 0x20) & 0x7F RC4Key2 = LowerByte(TSC0) RC4Key3 = LowerByte ((PPK5 (TK1 TK0)) RC4Key4 = LowerByte (PPK0) RC4Key5 = UpperByte (PPK0) RC4Key6 = LowerByte (PPK1) RC4Key7 = UpperByte (PPK1) RC4Key8 = LowerByte (PPK2) RC4Key9 = UpperByte (PPK2) RC4Key10 = LowerByte (PPK3) RC4Key11 = UpperByte (PPK3) RC4Key12 = LowerByte (PPK4) RC4Key13 = UpperByte (PPK4) RC4Key14 = LowerByte (PPK5) RC4Key15 = UpperByte (PPK5)
The final output of phase 2 is an array of 16 bytes containing the RC4 key to be used in encryption. This can be loaded into the legacy WEP encryption engine prior to processing the MPDU for transmission. The first 3 bytes of this key are transmitted as the WEP IV field (24 bits) and contain the lower 16 bits of the TKIP IV value and the TSC. The second byte of the WEP IV is a repeat of the first byte, except that bit 5 is forced to 1 and bit 4 is forced to 0. Forcing these bits prevents generation of the major class of weak keys. This byte is ignored on receipt.