3.4 Data Encoding | Designing Storage Area Networks: A Practical Reference for Implementing Fibre Channel and IP SANs (2nd Edition)

Proper transmission and reception of a digital bit stream would not be possible if raw bytes of data were simply herded into a shift register and shoved one bit at a time onto the transport. Among all the possible combinations of 1's and 0's in an 8-bit byte, and the possible combinations of bytes that could be sent one after the other, bit streams with sustained 1's or 0's could occur. A raw stream of hex FF bytes, for example, would appear as a sustained DC positive voltage on the transmission line. Without a byte encoding scheme to force transitions from positive to negative, it would be impossible to detect when one bit ended and another began in other words, it would be impossible to recover the clocking used to issue the stream.

This problem is exacerbated at gigabit and multigigabit speeds. At more than a thousand million bits per second and a standards-enforced bit error rate of 10¹², the Fibre Channel physical transport must ensure that every bit is properly recovered for reassembly into valid bytes. This is accomplished with an efficient encoding scheme that ensures that no sustained highs or lows will occur in the transport signal.

First developed by IBM, the 8b/10b encoding algorithm converts each 8-bit byte into two possible 10-bit characters. Each of the two 10-bit products of this conversion should have no more than six total 1's or 0's, and about half will have an equal number of 1's and 0's. The 10-bit characters with more 1's than 0's have positive disparity, whereas those with more 0's than 1's have negative disparity. An even balance of 1's and 0's results in neutral disparity.

In processing all 256 possible data bytes, the 8b/10b mechanism allows a maximum of only six bits of the same type per character, with no more than four of the same bit type to occur sequentially (see Figure 3-3). The hex byte x'FF', for example, is encoded from its original binary 1111 1111 into either 101011 0001 or 010100 1110. Both encoding results have neutral disparity, and neither has more than four of the same bit type in sequence.

Figure 3-3. 8b/10b encoding logic

graphics/03fig03.gif

Within the possible combinations that can be generated with 10 bits, a number have no relation to the standard 256 data bytes. Fibre Channel uses one of these nondata characters as a special command character. Known as the K28.5 command character, it has a positive disparity composition of 001111 1010 and a negative disparity of 110000 0101. In both instances, the special character has two bits of one type followed by five of the opposite, something that immediately distinguishes the command character from ordinary encoded data characters.

This unique comma sequence is used by data recovery circuitry as a trigger for defining the boundaries of 10-bit characters. When the comma sequence is detected in a serial bit stream, data recovery can begin picking up entire characters simply by counting the bits to the beginning of the next 10-bit character.

Given that the conversion of 8-bit data bytes produces two 10-bit results, how does the 8b/10b encoder decide which result to use? If the choice were arbitrary, the sequential shipment of some 10-bit characters would, in combination, result in an imbalance of the same bit types (including instances of five bits in sequence) appearing on the bit stream. As a result, the clock and data recovery circuitry at the receiving end would not be able to retrieve valid characters.

The 8b/10b encoder resolves this problem by monitoring the disparity of the previously processed character. If a positive disparity character is transmitted, the next character issued should have negative disparity. By monitoring the running disparity of the preceding event, the encoder can ensure that an overall balance of 1's and 0's is maintained in the serial stream. Data bytes that encode to two equally balanced results (an equal number of 1's and 0's) have one result that is used when the current running disparity is positive, and another when the current running disparity is negative.

8b/10b notation is expressed in dotted decimal subsets of the original data byte before it undergoes encoding. A hex xD7, for example, has a binary bit pattern of 1101 0111. The notation scheme divides this pattern into the first three bits, 110 (decimal 6), followed by the byte's last five bits, 10111 (decimal 23). This division reflects the internal operation of the 8b/10b encoder, because these units are processed in smaller 3b/4b and 5b/6b modules and then swapped to produce the 10-bit results. Consequently, 8b/10b notation refers to xD7 as D23.6. The only intuitive aspect of this notation is the D, which stands for "data." The K in the unique K28.5 command character indicates that it is a control character (Kontrol in German). It is distinguished from D28.5 (xBC) by virtue of its exclusive comma sequence.

The practical impact of 8b/10b encoding on general SAN design becomes clear when we examine Fibre Channel protocol, addressing, and hardware issues. For all the effort to maintain a more even distribution of 1's and 0's in the gigabit stream, some quite valid sequences of encoded characters nonetheless stress the ability of the recovery circuitry to capture data. The 8b/10b algorithm is also responsible for the address space allocated to arbitrated loop, as discussed later.

Gigabit Ethernet adopted the physical layer and byte encoding scheme initially developed by Fibre Channel engineers. Below the link layer, then, Fibre Channel and Gigabit Ethernet are very similar and so share common characteristics such as jitter, CDR, and 8b/10b encoding. Gigabit Ethernet, however, has elected not to follow Fibre Channel's lead on 2Gbps transport and instead has focused on 10Gbps. Having originally done the hard pioneering work on 1Gbps technology, Fibre Channel vendors are now letting Ethernet engineers pave the way to 10Gbps and will produce the Fibre Channel variant after the 10Gbps Ethernet standard has proved successful.