Both PGP and S/MIME make use of an encoding technique referred to as radix-64 conversion. This technique maps arbitrary binary input into printable character output. The form of encoding has the following relevant characteristics:

- The range of the function is a character set that is universally representable at all sites, not a specific binary encoding of that character set. Thus, the characters themselves can be encoded into whatever form is needed by a specific system. For example, the character "E" is represented in an ASCII-based system as hexadecimal 45 and in an EBCDIC-based system as hexadecimal C5.
- The character set consists of 65 printable characters, one of which is used for padding. With 2^6 = 64 available characters, each character can be used to represent 6 bits of input.
- No control characters are included in the set. Thus, a message encoded in radix 64 can traverse mail-handling systems that scan the data stream for control characters.
- The hyphen character ("-") is not used. This character has significance in the RFC 822 format and should therefore be avoided.

Table 15.9 shows the mapping of 6-bit input values to characters. The character set consists of the alphanumeric characters plus "+" and "/". The "=" character is used as the padding character.

Table 15.9. Radix-64 Encoding

| 6-Bit Value | Character Encoding | 6-Bit Value | Character Encoding | 6-Bit Value | Character Encoding | 6-Bit Value | Character Encoding |
|---|---|---|---|---|---|---|---|
| 0 | A | 16 | Q | 32 | g | 48 | w |
| 1 | B | 17 | R | 33 | h | 49 | x |
| 2 | C | 18 | S | 34 | i | 50 | y |
| 3 | D | 19 | T | 35 | j | 51 | z |
| 4 | E | 20 | U | 36 | k | 52 | 0 |
| 5 | F | 21 | V | 37 | l | 53 | 1 |
| 6 | G | 22 | W | 38 | m | 54 | 2 |
| 7 | H | 23 | X | 39 | n | 55 | 3 |
| 8 | I | 24 | Y | 40 | o | 56 | 4 |
| 9 | J | 25 | Z | 41 | p | 57 | 5 |
| 10 | K | 26 | a | 42 | q | 58 | 6 |
| 11 | L | 27 | b | 43 | r | 59 | 7 |
| 12 | M | 28 | c | 44 | s | 60 | 8 |
| 13 | N | 29 | d | 45 | t | 61 | 9 |
| 14 | O | 30 | e | 46 | u | 62 | + |
| 15 | P | 31 | f | 47 | v | 63 | / |
| (pad) | = | | | | | | |
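The mapping in Table 15.9 is regular enough to be generated rather than typed out. The following is a minimal Python sketch, not taken from the text: the names `RADIX64_ALPHABET`, `PAD_CHAR`, `RADIX64_VALUES`, and `check_alphabet` are illustrative assumptions. It builds the 64-character set in table order and checks the properties listed above (64 data characters plus one pad character, all printable, no hyphen).

```python
import string

# Characters for 6-bit values 0..63, in the order given in Table 15.9.
# (Names in this block are illustrative, not from the text.)
RADIX64_ALPHABET = (
    string.ascii_uppercase     # values 0-25:  A-Z
    + string.ascii_lowercase   # values 26-51: a-z
    + string.digits            # values 52-61: 0-9
    + "+/"                     # values 62 and 63
)
PAD_CHAR = "="

# Reverse lookup used when decoding: character -> 6-bit value.
RADIX64_VALUES = {ch: value for value, ch in enumerate(RADIX64_ALPHABET)}


def check_alphabet() -> bool:
    """Return True if the character set has the properties stated in the text."""
    all_chars = RADIX64_ALPHABET + PAD_CHAR
    return (
        len(RADIX64_ALPHABET) == 2 ** 6                  # one character per 6-bit value
        and len(set(all_chars)) == 65                    # 64 data characters + padding
        and all(ch.isprintable() for ch in all_chars)    # no control characters
        and "-" not in all_chars                         # hyphen is avoided
    )


print(check_alphabet())
print(RADIX64_ALPHABET[8], RADIX64_VALUES["/"])
```

Indexing the string by a 6-bit value gives the encoding direction of the table, and the dictionary gives the decoding direction.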
Figure 15.11 illustrates the simple mapping scheme. Binary input is processed in blocks of 3 octets, or 24 bits. Each set of 6 bits in the 24-bit block is mapped into a character. In the figure, the characters are shown encoded as 8-bit quantities. In this typical case, each 24-bit input is expanded to 32 bits of output.

Figure 15.11. Printable Encoding of Binary Data into Radix-64 Format

For example, consider the 24-bit raw text sequence 00100011 01011100 10010001, which can be expressed in hexadecimal as 235C91. We arrange this input in blocks of 6 bits:

001000 110101 110010 010001

The extracted 6-bit decimal values are 8, 53, 50, 17. Looking these up in Table 15.9 yields the radix-64 encoding as the following characters: I1yR. If these characters are stored in 8-bit ASCII format with parity bit set to zero, we have

01001001 00110001 01111001 01010010

In hexadecimal, this is 49317952. To summarize,

| | Representation | Value |
|---|---|---|
| Input Data | Binary representation | 00100011 01011100 10010001 |
| | Hexadecimal representation | 235C91 |
| Radix-64 Encoding of Input Data | Character representation | I1yR |
| | ASCII code (8 bit, zero parity) | 01001001 00110001 01111001 01010010 |
| | Hexadecimal representation | 49317952 |
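The worked example can be reproduced with a short Python sketch. The function name `radix64_encode` is an assumption made for illustration. The handling of a final block shorter than 3 octets (zero-filling, then replacing unused output positions with "=") follows the common Base64 convention and goes beyond what the text above spells out. The sketch covers only the character mapping, not line wrapping or any other framing a mail format may add.

```python
import base64
import string

# Table 15.9 in index order: A-Z, a-z, 0-9, "+", "/".
ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"


def radix64_encode(data: bytes) -> str:
    """Map binary input to radix-64 characters, 3 octets (24 bits) at a time."""
    out = []
    for i in range(0, len(data), 3):
        block = data[i:i + 3]
        # Assumption: a short final block is left-aligned by appending zero octets.
        n = int.from_bytes(block.ljust(3, b"\x00"), "big")
        # Split the 24-bit value into four 6-bit groups, most significant first.
        chars = [ALPHABET[(n >> shift) & 0x3F] for shift in (18, 12, 6, 0)]
        # k input octets produce k + 1 meaningful characters; the rest is padding.
        keep = len(block) + 1
        out.append("".join(chars[:keep]) + "=" * (4 - keep))
    return "".join(out)


encoded = radix64_encode(bytes.fromhex("235C91"))
print(encoded)                                # I1yR
print(encoded.encode("ascii").hex().upper())  # 49317952

# Cross-check against the standard library's Base64 encoder.
assert encoded == base64.b64encode(bytes.fromhex("235C91")).decode("ascii")
```

Treating each block as one 24-bit integer keeps the 6-bit extraction to a shift and a mask, which mirrors the grouping 001000 110101 110010 010001 used in the example.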