The following section summarizes the required CRC generation/checking as well as the optional protocol, receive buffer overflow, end-of-chain, chain down, and response error handling.

CRC Errors

The Cyclic Redundancy Code (CRC) is used to detect transmission errors on all enabled byte lanes on each link. The 32-bit CRC value is calculated and sent at prescribed intervals by each transmitter, then checked against the CRC value calculated by the corresponding receiver as packets arrive. CRC is calculated by finding the remainder when the sum of packet data (CAD bits plus the CTL signal during each bit time) is divided by the CRC polynomial. The polynomial used is:

X^32 + X^26 + X^23 + X^22 + X^16 + X^12 + X^11 + X^10 + X^8 + X^7 + X^5 + X^4 + X^2 + X + 1

CRC On 8, 16, or 32 Bit Interfaces

For interfaces which are 8, 16, or 32 bits wide, CRC is independently generated and checked for each byte of CAD width. Figure 10-1 on page 232 illustrates CRC "stuffing" into the CAD packet stream on each 8-bit CAD interface.

Figure 10-1. 8/16/32 Bit Interfaces: CRC Inserted Into CAD Stream Every 512 Bit Times
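As a rough illustration of the remainder calculation, the sketch below turns the polynomial's exponents into the usual 32-bit constant and performs a bit-by-bit division. The function name `crc32_remainder`, the MSB-first bit order, and the zero starting value are assumptions made for illustration; they are not taken from the HyperTransport specification, which also folds the CTL signal into the calculation.

```python
# Exponents of the non-zero terms of the CRC polynomial quoted in the text.
EXPONENTS = [32, 26, 23, 22, 16, 12, 11, 10, 8, 7, 5, 4, 2, 1, 0]

# Dropping the implicit X^32 term gives the conventional 32-bit constant.
POLY = sum(1 << e for e in EXPONENTS if e < 32)


def crc32_remainder(data: bytes, init: int = 0) -> int:
    """Remainder of data(x) * x^32 divided by the polynomial, MSB first.

    Bit order and the zero initial value are illustrative assumptions.
    """
    crc = init
    for byte in data:
        crc ^= byte << 24              # bring the next byte into the top bits
        for _ in range(8):
            if crc & 0x80000000:       # top bit set: subtract (XOR) the polynomial
                crc = ((crc << 1) ^ POLY) & 0xFFFFFFFF
            else:
                crc = (crc << 1) & 0xFFFFFFFF
    return crc


payload = bytes(range(16))             # stand-in for one lane's packet bytes
crc = crc32_remainder(payload)
print(f"polynomial constant: 0x{POLY:08X}")
print(f"remainder:           0x{crc:08X}")

# A receiver running the same division over payload + CRC ends with zero.
print("receiver check passes:", crc32_remainder(payload + crc.to_bytes(4, "big")) == 0)
```

With this convention, a receiver can either compare its own remainder against the received CRC or divide the data and CRC together and expect zero; the two checks are equivalent.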
CRC Generation/Checking: 8/16/32 bit links

(Refer to Figure 10-1 on page 232.)
CRC Generation/Checking: 2/4 bit links

On links narrower than 8 bits, the CRC value is generated in the same way as for 8-bit links carrying the same value. It simply takes longer to move the packets and CRC value across the link, causing the calculation window and stuffing point for the CRC value to be stretched accordingly. The extra assertions of the CTL signal (after the first bit time in each byte) are not used by the transmitter or receiver in the CRC calculation.

4 Bit CAD Width

A CAD width of four bits requires twice as many bit times as an 8-bit bus for moving information across the link. Therefore:
2 Bit CAD Width

A CAD width of two bits requires four times as many bit times as an 8-bit bus for moving information across the link. Therefore:
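The stretching described for 4-bit and 2-bit links can be checked with a few lines of arithmetic. This is only a sketch: the 512 bit-time figure comes from the Figure 10-1 caption, the four bit times for the CRC follow from sending 32 bits over an 8-bit lane, and the helper name `scale_for_width` is hypothetical.

```python
BITS_PER_BYTE = 8
WINDOW_BIT_TIMES_8BIT = 512                 # from the Figure 10-1 caption
CRC_BIT_TIMES_8BIT = 32 // BITS_PER_BYTE    # 32-bit CRC on one 8-bit lane


def scale_for_width(bit_times_8bit: int, cad_width: int) -> int:
    """Bit times needed on a narrow link for what takes
    bit_times_8bit on an 8-bit link."""
    if cad_width not in (2, 4, 8):
        raise ValueError("this sketch only covers 2-, 4-, and 8-bit links")
    return bit_times_8bit * BITS_PER_BYTE // cad_width


for width in (8, 4, 2):
    window = scale_for_width(WINDOW_BIT_TIMES_8BIT, width)
    crc_time = scale_for_width(CRC_BIT_TIMES_8BIT, width)
    print(f"{width}-bit link: window {window} bit times, CRC {crc_time} bit times")
```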
Logging CRC Errors

CRC errors impact both control and data information; if these errors occur on any CAD byte lane, the corresponding error bit(s) will be set in the HyperTransport Advanced Capability block Link Control CSR. The four bits (one for each byte lane) are illustrated in Figure 10-2 on page 234 below.

Figure 10-2. Link Control CSR: CRC Error Logging Bits
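Software reading the Link Control CSR might decode the per-lane error bits along the following lines. This is a minimal sketch: the text above does not give bit positions, so placing the four bits at bits 11:8 is an assumption here, and the function name `crc_error_lanes` is hypothetical. Check Figure 10-2 for the actual layout.

```python
from typing import List

CRC_ERROR_SHIFT = 8    # ASSUMED position of the lowest CRC error bit
CRC_ERROR_MASK = 0xF   # four bits, one per byte lane


def crc_error_lanes(link_control: int) -> List[int]:
    """Return the byte lanes whose CRC error bit is set."""
    field = (link_control >> CRC_ERROR_SHIFT) & CRC_ERROR_MASK
    return [lane for lane in range(4) if field & (1 << lane)]


# Example register value with the assumed error bits for lanes 0 and 2 set.
print(crc_error_lanes(0x0500))
```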
Programming The CRC Error Reporting Policy

How the system is informed of a CRC error on one or more of the links is programmed at boot time in the Advanced Capability Error Handling and Link Control registers. Options include sending a fatal interrupt message, sending a non-fatal interrupt message, or initiating a sync flood.

CRC Interrupts

Figure 10-3 on page 235 illustrates the CRC interrupt error enable bits in the Error Handling register. If the fatal or non-fatal interrupt enable bit is set (bit 6 in byte 0 and byte 1), the device will generate a WrSized interrupt message into the address range reserved for interrupts.

Figure 10-3. Error Handling CSR: CRC Error Interrupt Enables
CRC Sync Flood

An alternative to sending fatal or non-fatal interrupts in response to a CRC error is the use of Sync flood. The bit to enable CRC Sync flood is contained in the Link Control register, as depicted in Figure 10-4 on page 236. When the CFIE bit is set, Sync flood will be issued on any CRC error and the Link Fail bit (bit 4) in this register will also be set.

Figure 10-4. Link Control Register: CRC Sync Flood Enable bit
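The CRC reporting options can be summarized as a small decision function. This is a sketch under stated assumptions: the class `CrcPolicy`, its field names, and the action strings are hypothetical, and treating the three enables as independent of one another is an assumption the text does not confirm.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class CrcPolicy:
    sync_flood_enable: bool = False    # the CRC Sync flood enable bit
    fatal_int_enable: bool = False     # CRC fatal interrupt enable
    nonfatal_int_enable: bool = False  # CRC non-fatal interrupt enable


def crc_error_actions(policy: CrcPolicy) -> List[str]:
    """List what a device would do on a CRC error under this policy."""
    actions = []
    if policy.sync_flood_enable:
        # Per the text, a Sync flood is issued and Link Fail is also set.
        actions += ["sync flood", "set Link Fail"]
    if policy.fatal_int_enable:
        actions.append("fatal interrupt message")
    if policy.nonfatal_int_enable:
        actions.append("non-fatal interrupt message")
    return actions


print(crc_error_actions(CrcPolicy(sync_flood_enable=True)))
print(crc_error_actions(CrcPolicy(nonfatal_int_enable=True)))
```

With every enable clear, the function returns an empty list: the error is still logged in the Link Control CSR but nothing is reported.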
CRC Test Mode

If both devices on a link support the CRC diagnostic testing mode (determined by checking bit 2 in the Feature Capability register for each device), then software may enable a test sequence that allows stress tests of CRC generation and checking. The basic events involved in link CRC testing include:
Protocol Errors

Protocol errors are failures on the link involving low-level packet violations. These include the following:

CTL Signal Four-Byte Boundary Violation

The CTL signal may only transition between low and high on four-byte boundaries. The exception to this rule is during the CRC diagnostic test mode. If an illegal transition is detected, then either the transmitter or the receiver has lost track of packet start and ending boundaries.

CTL Deassertion Violation

Other than when CRC diagnostic test mode is in use, a transmitter only deasserts the CTL signal during data packets associated with earlier requests requiring them. Deasserting CTL when data packets are not in transit is another protocol violation.

CTL/Data Interleaving Violation

A transmitter is allowed to interleave new control packets into the data packet associated with an earlier request if the new control packet does not have any immediate data of its own. If an attempt is made to interleave a control packet with immediate data (e.g., a write request) into a data packet already in transit, this is a protocol violation.

Bad Command Code In Control Packet

Control packets (request, response, information) have a 6-bit command field in the first byte to encode the intended operation. Some codes are unused and reserved. Sending an illegal command code is another protocol violation.

CTL Deassertion Timeout Violation

The HyperTransport specification limits the amount of time the CTL signal may be deasserted. There are two maximum timeout options (1 millisecond or 1 second), and the one in effect is programmed in bit 15 of the Link Error Register (see Figure 10-5). If the transmitter exceeds the programmed maximum CTL deassertion timeout, it is a protocol violation.

Figure 10-5. Link Error Register: Protocol Error Logging Bits
CTL Deasserted During CRC Transmission

CTL is always asserted during the transmission of the 32-bit CRC code in each calculation window. If a receiver detects CTL deasserted during a CRC stuffing period, it is a protocol violation.

Logging Protocol Errors

Protocol error checking is optional. If protocol violations are checked, the Link Error register logs the errors; refer to Figure 10-5 on page 239.

Programming The Protocol Error Reporting Policy

Informing the system of a protocol error on one or more of the links is handled in much the same way as for CRC errors. Protocol errors may be mapped to a fatal or non-fatal interrupt message, or to a sync flood. The reporting strategy is programmed in the Error Handling CSR, as shown in Figure 10-6 on page 240.

Figure 10-6. Error Handling CSR: Protocol Error Reporting Enables
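The four-byte boundary rule for CTL described earlier in this section lends itself to a simple receiver-side check. The sketch below assumes an 8-bit link (one byte and one CTL sample per bit time), a stream that begins on a four-byte boundary, and that CRC diagnostic test mode is not active; the function name `ctl_boundary_violations` is hypothetical.

```python
from typing import List, Sequence

BYTES_PER_BOUNDARY = 4


def ctl_boundary_violations(ctl_per_byte: Sequence[int]) -> List[int]:
    """Return byte indices where CTL changes off a four-byte boundary."""
    violations = []
    for i in range(1, len(ctl_per_byte)):
        changed = ctl_per_byte[i] != ctl_per_byte[i - 1]
        if changed and i % BYTES_PER_BOUNDARY != 0:
            violations.append(i)
    return violations


# Legal: 4 control bytes, 8 data bytes, then 4 control bytes.
legal = [1] * 4 + [0] * 8 + [1] * 4
# Illegal: CTL reasserts after only 6 data bytes (byte index 10).
illegal = [1] * 4 + [0] * 6 + [1] * 6

print(ctl_boundary_violations(legal))
print(ctl_boundary_violations(illegal))
```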
Receive Buffer Overflow Errors

Receive buffer overflow errors can occur if a link transmitter no longer maintains an accurate count of available flow control buffers at the receiver. If a flow-controlled packet (posted request, non-posted request, or response) is sent without an available receiver flow control buffer to accept it, the packet will be lost.

Logging Receive Buffer Overflow Errors

In the event a receive buffer overflow is detected, the Overflow Error bit will be set in the Link Error CSR.

Figure 10-7. Link Error Register: Receive Buffer Overflow Error Logging Bits
Programming The Buffer Overflow Error Reporting Policy

As in the cases of CRC and protocol errors, buffer overflow errors may be mapped to fatal/non-fatal interrupts, or a sync flood. The reporting strategy is programmed in the Error Handling CSR, as shown in Figure 10-8 on page 242.

Figure 10-8. Error Handling CSR: Receive Buffer Overflow Error Reporting Enables
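To see how an inaccurate buffer count leads to an overflow, consider the following toy model of one flow-controlled channel. It is a sketch only: the classes `ReceiveBuffers` and `Transmitter` are hypothetical, a single counter stands in for the separate posted, non-posted, and response buffers, and the `overflow_error` flag merely mimics the Overflow Error log bit.

```python
class ReceiveBuffers:
    """Receiver side: a fixed number of buffers for one channel."""

    def __init__(self, capacity: int) -> None:
        self.free = capacity
        self.overflow_error = False   # stands in for the Overflow Error bit

    def accept(self) -> bool:
        if self.free == 0:
            self.overflow_error = True  # no buffer available: packet is lost
            return False
        self.free -= 1
        return True


class Transmitter:
    """Transmitter side: sends only while it believes buffers are free."""

    def __init__(self, believed_free: int) -> None:
        self.believed_free = believed_free

    def send(self, rx: ReceiveBuffers) -> bool:
        if self.believed_free == 0:
            return False              # correct behavior: wait for a buffer
        self.believed_free -= 1
        return rx.accept()


# Accurate count: the third packet is held back, nothing is lost.
rx = ReceiveBuffers(capacity=2)
tx = Transmitter(believed_free=2)
print([tx.send(rx) for _ in range(3)], rx.overflow_error)

# Corrupted count: the transmitter believes there are 3 buffers, not 2.
rx = ReceiveBuffers(capacity=2)
tx = Transmitter(believed_free=3)
print([tx.send(rx) for _ in range(3)], rx.overflow_error)
```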
End-Of-Chain Errors

End-Of-Chain (EOC) errors result when a packet moving through HyperTransport is either not claimed by, or does not reach, the intended recipient. Other devices which see the packet forward it, and eventually it reaches the device at the end of the chain, where the packet must be handled. Some of the possible reasons for EOC errors include an improper address in a request, an invalid UnitID in a response, a broken target device, or a target device that has not been programmed properly with its UnitID or target base address range.

EOC errors are analogous to the master abort event in PCI. Unlike PCI, however, "misdirected" transactions must be handled by the EOC device rather than simply having the initiator of the transaction time out after a prescribed amount of time. This is important in HyperTransport because it is a series of point-to-point connections rather than a shared bus, and an initiator simply sends packets to the neighboring device and has no way of immediately "knowing" whether the ultimate recipient receives them. The EOC error handling mechanism helps with link management in two ways:
How A Device Knows It Is At The End Of A Chain

Single-link peripherals (also known as End or Cave devices) are always end-of-chain devices. Any packets reaching these devices that they are not programmed to accept (by Command type, UnitID, or Address range) are considered lost. No software programming is required for these devices to carry out their EOC function other than setting up the error reporting mechanism to be used.

Tunnel and HyperTransport bridge devices have multiple links. PC board layout determines whether they are physically at the end of a chain or not. At reset, each device receives a bit pattern from the device on the other end of the link. If one of the tunnel links or one of the bridge secondary interfaces does not have a device attached to it, all CAD inputs are seen as logic "0". After reset, if a device senses one of its links is unconnected, it sets the EOC bit in the corresponding Link Control register. This is shown in Figure 10-9 on page 244.

Figure 10-9. End-Of-Chain Device Determination
After initialization, links may also be enabled and disabled under software control by setting the EOC bit in the Link Control register(s).

Logging End-Of-Chain Errors

In the event a packet reaches an EOC device in error, the EOC Err bit will be set in the Link Error CSR. Refer to Figure 10-10 on page 245. Note that this figure shows the Advanced Capability register format for a tunnel device; it has two sets of Link Control, Link Configuration, Link Frequency, Link Error, and Link Frequency capability registers, one set for each link. An end (cave) device implements only one set of each of these registers.

Figure 10-10. Link Error Register: End-Of-Chain Error Logging Bits
Programming The EOC Error Reporting Policy

As in the cases of CRC, protocol, and receive buffer overflow errors, EOC errors may be mapped to fatal or non-fatal interrupts. There is no option for Sync flood reporting for EOC errors. The reporting strategy is programmed in the Error Handling CSR, as shown in Figure 10-11 on page 246.

Figure 10-11. Error Handling CSR: End-Of-Chain Error Reporting Enables
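The accept, forward, or end-of-chain decision each device makes can be sketched as follows. This is a simplified model, not the specification's algorithm: the `Device` class and `route` function are hypothetical, packets are claimed by address range only (the text also mentions Command type and UnitID), and `eoc_error` stands in for the EOC Err log bit.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Device:
    name: str
    base: int                 # start of the address range the device claims
    limit: int                # end of that range (inclusive)
    next_link_eoc: bool       # True if the outgoing link is marked EOC
    eoc_error: bool = False   # stands in for the EOC Err log bit

    def handle(self, address: int) -> str:
        if self.base <= address <= self.limit:
            return "accept"
        if not self.next_link_eoc:
            return "forward"          # not ours: pass it down the chain
        self.eoc_error = True         # nowhere left to forward the packet
        return "eoc_error"


def route(chain: List[Device], address: int) -> Tuple[str, str]:
    """Walk the chain until a device accepts the packet or flags an EOC error."""
    for device in chain:
        result = device.handle(address)
        if result != "forward":
            return device.name, result
    raise RuntimeError("last device in the chain must have next_link_eoc set")


chain = [
    Device("tunnel", 0x1000, 0x1FFF, next_link_eoc=False),
    Device("cave", 0x2000, 0x2FFF, next_link_eoc=True),
]
print(route(chain, 0x2004))   # claimed by the cave device
print(route(chain, 0x9000))   # claimed by no one: EOC error at the cave
```

Note that the initiator never learns the outcome by itself; in this model only the last device records the error, which mirrors why the EOC device must handle the packet.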
Chain Down Errors

If a device detects a Sync flood or an error that would cause a Sync flood, it sets the Chain Fail bit in its Error Handling register and waits for a bus reset. The action taken when the chain goes down depends on the device type:
Figure 10-12. Error Handling Register: Chain Fail Bit
Response Errors

All non-posted requests that are issued require either a Read or Target Done response. The requester programs UnitID and source tag information into each request packet it issues so that when the response is returned it may be tagged with the same information and find its way back to the original requester. When a downstream response is detected, each device compares the UnitID to its own to see if it should claim the response; if so, it then checks the source tag to determine which of its outstanding transactions is being completed.

It is possible a response may return and be claimed by a requester (UnitID is OK), but not be recognized as being valid. Some of the reasons this might happen include:
Response Error Logging And Reporting Policy

The logging of response errors, as well as the reporting policy to be used when they occur, is handled in the Error Handling CSR. These errors may be mapped into fatal or non-fatal interrupts. There is no Sync flood option for response errors. Figure 10-13 on page 249 depicts the response error logging bit and the fatal and non-fatal interrupt enables.

Figure 10-13. Error Handling CSR: Response Error Logging And Reporting Policy Bits
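The UnitID and source tag matching described for responses can be modeled with a small table of outstanding transactions. This is a sketch under stated assumptions: the `Requester` class is hypothetical, and a source tag with no outstanding request is used as one illustrative case of a claimed but invalid response; it is not presented as the specification's list of causes.

```python
from typing import Dict


class Requester:
    def __init__(self, unit_id: int) -> None:
        self.unit_id = unit_id
        self.outstanding: Dict[int, str] = {}   # source tag -> request description
        self.response_error = False             # stands in for the log bit

    def issue(self, src_tag: int, description: str) -> None:
        """Record a non-posted request that now awaits a response."""
        self.outstanding[src_tag] = description

    def on_response(self, unit_id: int, src_tag: int) -> str:
        if unit_id != self.unit_id:
            return "not claimed"                # response is for another device
        if src_tag not in self.outstanding:
            self.response_error = True          # claimed, but matches nothing
            return "response error"
        del self.outstanding[src_tag]           # transaction is now complete
        return "completed"


dev = Requester(unit_id=3)
dev.issue(src_tag=7, description="read of 64 bytes")
print(dev.on_response(unit_id=5, src_tag=7))    # different UnitID
print(dev.on_response(unit_id=3, src_tag=7))    # matches the outstanding read
print(dev.on_response(unit_id=3, src_tag=7))    # tag no longer outstanding
```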