The Error Types


The following section summarizes the required CRC generation/checking as well as the optional protocol, receive buffer overflow, end of chain, chain down, and response error handling.

CRC Errors

The Cyclic Redundancy Code (CRC) is used to detect transmission errors on all enabled byte lanes on each link. The 32-bit CRC value is calculated and sent at prescribed intervals by each transmitter, then checked against the CRC value calculated by the corresponding receiver as packets arrive. CRC is calculated by finding the remainder when the packet data stream (CAD bits plus the CTL signal during each bit time) is divided by the CRC polynomial. The polynomial used is:

$$X^{32} + X^{26} + X^{23} + X^{22} + X^{16} + X^{12} + X^{11} + X^{10} + X^{8} + X^{7} + X^{5} + X^{4} + X^{2} + X + 1$$
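
As a rough sketch of the remainder calculation (not the specification's reference algorithm), the polynomial above can be written as a 32-bit constant and applied one bit at a time. The function names, the order in which the nine CAD/CTL bits are shifted in, and the starting register value are assumptions made for illustration only.

```python
# Exponents of the polynomial terms below X^32 (the X^32 term is implicit
# in a 32-bit remainder register).
EXPONENTS = [26, 23, 22, 16, 12, 11, 10, 8, 7, 5, 4, 2, 1, 0]
POLY = sum(1 << e for e in EXPONENTS)
MASK32 = 0xFFFFFFFF


def crc_shift_bit(crc, bit):
    """Shift one bit into the 32-bit remainder (one polynomial division step)."""
    carry = (crc >> 31) & 1
    crc = ((crc << 1) | (bit & 1)) & MASK32
    return crc ^ POLY if carry else crc


def crc_bit_time(crc, cad_byte, ctl):
    """Fold one bit time (8 CAD bits plus CTL) into the remainder.

    Assumed order: CAD bit 0 first, CTL last.
    """
    for i in range(8):
        crc = crc_shift_bit(crc, (cad_byte >> i) & 1)
    return crc_shift_bit(crc, ctl)
```

Shifting the 33 coefficient bits of the polynomial itself through crc_shift_bit leaves a remainder of zero, which is a quick sanity check that the constant and the division step agree.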

CRC On 8, 16, or 32 bit Interfaces

For interfaces that are 8, 16, or 32 bits wide, CRC is independently generated and checked for each byte of CAD width. Figure 10-1 on page 232 illustrates CRC "stuffing" into the CAD packet stream on each 8-bit CAD interface.

Figure 10-1. 8/16/32 Bit Interfaces: CRC Inserted Into CAD Stream Every 512 Bit Times


CRC Generation/Checking: 8/16/32 bit links

(Refer to Figure 10-1 on page 232)

  1. After link initialization, each transmitter begins sending packets (NOP, etc.). CRC calculation is based on "raw" CAD/CTL bit patterns on each CAD byte without regard to the packet types being sent.

  2. 512 bit times after initialization, the first 32-bit CRC value has been calculated for each byte lane. The window for "stuffing" the 32-bit CRC value into its CAD stream is 64 bit times into the next "window". Note: because of this delay, there is no CRC sent during the first window.

  3. Although each window for CRC calculation is 512 bit times, every window after the first actually spans 516 bit times, because the CRC for each window is inserted into the following one and occupies four additional bit times. Note that the CRC value stuffed into each window is not included in the subsequent CRC calculation for that window.

  4. There is no special signalling associated with CRC transmission; both devices simply count the bit times starting with link initialization and "know" where the CRC payload falls in each window.

  5. CRC is calculated and sent independently for each 8 bits of CAD width. The CTL signal itself is included in the CRC calculation for the lowest byte of CAD (bits 0-7). On a bus wider than 8 bits, the CTL signal is also factored into the CRC calculation for each of the upper CAD bytes, but is assumed to be 0 during all bit times.

  6. While the CRC value itself is being driven, the transmitter drives the CTL signal to 1 (Control). The CRC bits are inverted before being transmitted onto the link.
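
The bit-time counting described in steps 2 through 4 can be sketched as follows for one 8-bit byte lane. This is an illustrative model only: the function name is hypothetical, and it assumes bit time 0 is the first bit time after link initialization.

```python
FIRST_WINDOW = 512   # bit times in the first window (no CRC is sent in it)
LATER_WINDOW = 516   # 512 bit times of packet traffic plus 4 of CRC
CRC_OFFSET = 64      # CRC starts 64 bit times into the following window
CRC_LENGTH = 4       # a 32-bit CRC on one 8-bit lane takes 4 bit times


def is_crc_bit_time(t):
    """Return True if bit time t (0 = first after link init) carries CRC."""
    if t < FIRST_WINDOW:
        return False
    offset = (t - FIRST_WINDOW) % LATER_WINDOW
    return CRC_OFFSET <= offset < CRC_OFFSET + CRC_LENGTH
```

Because transmitter and receiver both run the same count, neither side needs a marker on the link to locate the CRC bit times.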

CRC Generation/Checking: 2/4 bit links

On links narrower than 8 bits, the CRC value is generated in the same way as for 8-bit links carrying the same values. It simply takes longer to move the packets and CRC value across the link, causing the calculation window and stuffing point for the CRC value to be stretched accordingly. The extra assertions of the CTL signal (after the first bit time in each byte) are not used by the transmitter or receiver in the CRC calculation.

4 Bit CAD Width

A CAD width of four bits requires twice as many bit times as an 8 bit bus for moving information across the link. Therefore:

  • The CRC window size is 1024 bit times.

  • The CRC stuffing point starts 128 bit times after the start of a window.

  • It takes 8 bit times to transfer the 32-bit CRC value.

2 Bit CAD Width

A CAD width of two bits requires four times as many bit times as an eight bit bus for moving information across the link. Therefore:

  • The CRC window size is 2048 bit times.

  • The CRC stuffing point starts 256 bit times after the start of a window.

  • It takes 16 bit times to transfer the 32-bit CRC value.
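
The stretching for narrow links is simple arithmetic on the 8-bit figures. A small sketch, with a hypothetical helper name, that reproduces the numbers listed above:

```python
def crc_timing(cad_width):
    """Return (window, stuff_offset, crc_transfer) in bit times for a CAD width."""
    if cad_width not in (2, 4, 8, 16, 32):
        raise ValueError("unsupported CAD width")
    # Links of 8 bits or wider run CRC per byte lane, so they share the
    # 8-bit timing; narrower links stretch it by 8 / width.
    stretch = max(1, 8 // cad_width)
    return 512 * stretch, 64 * stretch, 4 * stretch
```

For example, crc_timing(4) gives (1024, 128, 8) and crc_timing(2) gives (2048, 256, 16).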

Logging CRC Errors

CRC errors impact both control and data information; if these errors occur on any CAD byte lane, the corresponding error bit(s) will be set in the HyperTransport Advanced Capability block Link Control CSR. The four bits (one for each byte lane) are illustrated in Figure 10-2 on page 234 below.

Figure 10-2. Link Control CSR: CRC Error Logging Bits


Programming The CRC Error Reporting Policy

Informing the system of a CRC error on one or more of the links is handled in the manner programmed at boot time in the Advanced Capability Error Handling and Link Control Registers. Options include sending a fatal interrupt message, non-fatal interrupt message, or initiation of a sync flood.

CRC Interrupts

Figure 10-3 on page 235 below illustrates the CRC interrupt error enable bits in the Error Handling register. If the fatal or non-fatal interrupt enable bit is set (bit 6 in byte 0 and byte 1), the device will generate a WrSized interrupt message into the address range reserved for interrupts.

Figure 10-3. Error Handling CSR: CRC Error Interrupt Enables


CRC Sync Flood

An alternative to sending fatal or non-fatal interrupts in response to a CRC error is the use of Sync flood. The bit to enable CRC Sync flood is contained in the Link Control Register as depicted in Figure 10-4 on page 236. When the CFIE bit is set, Sync flood will be issued on any CRC error and the Link Fail bit (bit 4) in this register will also be set.

Figure 10-4. Link Control Register: CRC Sync Flood Enable bit


CRC Test Mode

If both devices on a link support the CRC diagnostic testing mode (determined by checking bit 2 in the Feature Capability register for each device), then software may enable a test sequence that allows stress tests of CRC generation and checking. The basic events involved in link CRC testing include:

  1. Software writes a "1" to the CRC Start Test bit of the Link Control register (refer to Figure 10-4 on page 236). Setting this bit informs the transmitter interface that it should enter the CRC diagnostic mode for the following 512 bit times on each enabled byte lane. For 4- or 2-bit CAD widths, this time is stretched to 1024 or 2048 bit times, respectively.

  2. The transmitter sends a NOP packet with the Diag bit set; this informs the receiver that it should ignore CAD and CTL signals for the next 512 bit times but must still check CRC. Again, for 4- or 2-bit CAD widths, this time is stretched to 1024 or 2048 bit times, respectively.

  3. With the normal buffers suspended, the transmitter may generate any test pattern it wants; CRC is still stuffed into the CAD test pattern stream in the normal way.

  4. CRC errors detected during this time will be logged normally, and if the Sync flood is enabled, it will be performed. All data content is "don't care" during this time and is dropped.

  5. If the CRC Force Error (CFE) bit is also set during the test (see bit 3 in Figure 10-4 on page 236), then the test pattern sent by the transmitter will contain at least one CRC error in each of the active byte lanes.

  6. When the test is complete, hardware automatically clears the CRC Start Test bit. This bit may be polled by software to check completion.

  7. At the end of the CRC Diagnostic test, normal packet transfer resumes.
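
From the software side, the sequence above amounts to setting a bit and polling until hardware clears it. The sketch below models that against a stand-in register object. The class, the function name, and the bit 2 position of CRC Start Test are assumptions for illustration; only the bit 3 position of CRC Force Error comes from the text.

```python
CRC_START_TEST = 1 << 2    # assumed bit position; not stated in the text
CRC_FORCE_ERROR = 1 << 3   # bit 3, per the text


class FakeLinkControl:
    """Stand-in for a Link Control register; clears Start Test after N reads."""

    def __init__(self, reads_until_done=3):
        self.value = 0
        self._reads_left = reads_until_done

    def read(self):
        if self.value & CRC_START_TEST:
            self._reads_left -= 1
            if self._reads_left <= 0:
                self.value &= ~CRC_START_TEST   # hardware clears the bit
        return self.value

    def write(self, value):
        self.value = value & 0xFFFF


def run_crc_test(reg, force_error=False, max_polls=1000):
    """Start the CRC diagnostic test and poll until Start Test reads back 0."""
    value = reg.read() | CRC_START_TEST
    if force_error:
        value |= CRC_FORCE_ERROR
    reg.write(value)
    for polls in range(1, max_polls + 1):
        if not reg.read() & CRC_START_TEST:
            return polls
    raise TimeoutError("CRC Start Test bit never cleared")
```

A real driver would replace FakeLinkControl with configuration-space accesses and would check the CRC error log bits after the test completes.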

Protocol Errors

Protocol errors are failures on the link involving low-level packet violations. These include the following:

CTL Signal Four-Byte Boundary Violation

The CTL signal may only transition between low and high on four-byte boundaries. The exception to this rule is during the CRC diagnostic test mode. If an illegal transition is detected, then either the transmitter or the receiver has lost track of packet start and end boundaries.
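
A receiver-side check for this rule might look like the sketch below. It is illustrative only: the function name is hypothetical, it assumes an 8-bit link (so four bytes take four bit times), it assumes sample 0 sits on a four-byte boundary, and it does not model CRC stuffing or the diagnostic mode exception.

```python
def ctl_boundary_violations(ctl_samples, bit_times_per_dword=4):
    """Return the bit-time indexes where CTL changes off a four-byte boundary.

    ctl_samples holds one CTL value (0 or 1) per bit time.
    """
    violations = []
    for t in range(1, len(ctl_samples)):
        changed = ctl_samples[t] != ctl_samples[t - 1]
        if changed and t % bit_times_per_dword != 0:
            violations.append(t)
    return violations
```

A CTL trace that changes only at bit times 4 and 8 yields an empty list, while a change at bit time 3 or 6 is reported.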

CTL Deassertion Violation

Other than when CRC diagnostic test mode is in use, a transmitter only deasserts the CTL signal during data packets associated with earlier requests requiring them. Deasserting CTL when data packets are not in transit is another protocol violation.

CTL/Data Interleaving Violation

A transmitter is allowed to interleave new control packets into the data packet associated with an earlier request if the new control packet does not have any immediate data of its own. If an attempt is made to interleave a control packet with immediate data (e.g. a write request) into a data packet already in transit, this is a protocol violation.

Bad Command Code In Control Packet

Control packets (request, response, information) have a 6-bit command field in the first byte to encode the intended operation. Some codes are not used, and are reserved. Sending an illegal command code is another protocol violation.

CTL Deassertion Timeout Violation

The HyperTransport specification limits the amount of time the CTL signal may be deasserted. There are two maximum timeout options (1 millisecond or 1 second) and the one in effect is programmed in bit 15 of the Link Error Register (see Figure 10-5). If the transmitter exceeds the programmed maximum CTL deassertion timeout, it is a protocol violation.

Figure 10-5. Link Error Register: Protocol Error Logging Bits


CTL Deasserted During CRC Transmission

CTL is always asserted during the transmission of the 32-bit CRC code in each calculation window. If a receiver detects CTL deasserted during a CRC stuffing period, it is a protocol violation.

Logging Protocol Errors

Protocol error checking is optional. If protocol violations are checked, the Link Error register logs the errors; refer to Figure 10-5 on page 239.

Programming The Protocol Error Reporting Policy

Informing the system of a protocol error on one or more of the links is handled in much the same way as for CRC errors. They may be mapped to a fatal or non-fatal interrupt message, or a sync flood. The reporting strategy is programmed in the Error handling CSR, as shown in Figure 10-6 on page 240.

Figure 10-6. Error Handling CSR: Protocol Error Reporting Enables


Receive Buffer Overflow Errors

Receive buffer overflow errors can occur if a link transmitter no longer maintains an accurate count of available flow control buffers at the receiver. If a flow-controlled packet (posted request, non-posted request, or response) is sent without an available receiver flow control buffer to accept it, the packet will be lost.
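
The failure mode is easiest to see in a toy model of buffer-count flow control. The classes below are a simplified sketch, not the HyperTransport flow control protocol: they use a single buffer pool, never return buffers, and all names are hypothetical.

```python
class Receiver:
    def __init__(self, buffers):
        self.free = buffers
        self.overflow_error = False   # models an overflow error log bit

    def accept(self):
        if self.free == 0:
            self.overflow_error = True   # no buffer available: packet is lost
            return False
        self.free -= 1
        return True


class Transmitter:
    def __init__(self, credits):
        self.credits = credits   # transmitter's view of free receiver buffers

    def send(self, receiver):
        if self.credits == 0:
            return False   # correctly stalls; nothing is put on the link
        self.credits -= 1
        return receiver.accept()
```

When the transmitter's count matches the receiver's buffers, it stalls before any overflow can occur. If the count is one too high, the extra packet is sent, dropped, and logged at the receiver.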

Logging Receive Buffer Overflow Errors

In the event a receive buffer overflow is detected, the Overflow Error bit will be set in the Link Error CSR.

Figure 10-7. Link Error Register: Receive Buffer Overflow Error Logging Bits


Programming The Buffer Overflow Error Reporting Policy

As in the cases of CRC and Protocol errors, buffer overflow errors may be mapped to fatal/non-fatal interrupts, or a sync flood. The reporting strategy is programmed in the Error handling CSR, as shown in Figure 10-8 on page 242.

Figure 10-8. Error Handling CSR: Receive Buffer Overflow Error Reporting Enables


End-Of-Chain Errors

End-Of-Chain (EOC) errors result when a packet moving through HyperTransport is either not claimed by, or does not reach, the intended recipient. Other devices that see the packet forward it, and eventually it reaches the device at the end of the chain, where the packet must be handled. Possible reasons for EOC errors include: an improper address in a request, an invalid UnitID in a response, a broken target device, or a target that has not been programmed properly with its UnitID or target base address range.

EOC errors are analogous to the master abort event in PCI. Unlike PCI, however, "misdirected" transactions must be handled by the EOC device rather than simply having the initiator of the transaction time out after a prescribed amount of time. This is important in HyperTransport because it is a series of point-to-point connections rather than a shared bus, and an initiator simply sends packets to the neighboring device and has no way of immediately "knowing" whether the ultimate recipient receives it. The EOC error handling mechanism helps with link management in two ways:

  1. For posted requests and responses which inadvertently reach an EOC device, the EOC error bit and reporting mechanism may be used to let the system know a packet never reached its destination, information that otherwise would be unknown.

  2. For non-posted requests which reach an EOC device in error, the error logging and reporting can also be used. In addition, the EOC device will act as a surrogate for the target and send back a Read or Target Done response to the requester (with error bits set). For read requests, all of the requested data is also sent back by the EOC device, although it is obviously invalid (all data values are driven to FFh). Sending back the responses (and data) allows all devices in the path back to the requester to deallocate internal buffer space and retire the outstanding transaction. The original requester examines the response, decodes the error bits, and takes whatever action is appropriate.
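
The two cases above can be summarized as a small decision function. This is a sketch only: the function name, the dictionary fields, and the packet kind labels are illustrative and are not packet formats from the specification.

```python
def handle_at_end_of_chain(packet):
    """Return (eoc_error_logged, response) for a packet no device claimed.

    packet is a dict with 'kind' in {'posted', 'response', 'read',
    'nonposted_write'} and, for reads, 'count' = data bytes requested.
    """
    kind = packet["kind"]
    if kind in ("posted", "response"):
        # Nothing can be sent back; only the error log records the loss.
        return True, None
    if kind == "read":
        data = bytes([0xFF]) * packet["count"]   # invalid data, all FFh
        return True, {"type": "Read", "error": True, "data": data}
    if kind == "nonposted_write":
        return True, {"type": "Target Done", "error": True}
    raise ValueError("unknown packet kind")
```

In every case the error is logged; the only difference is whether a surrogate response travels back toward the requester.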

How A Device Knows It Is At The End Of A Chain

Single link peripherals (also known as End or Cave devices) are always end-of-chain devices. Any packets reaching these devices that they are not programmed to accept (by Command type, UnitID, or Address range) are considered lost. No software programming is required for these devices to carry out their EOC function other than setting up the error reporting mechanism to be used.

Tunnel and HyperTransport bridge devices have multiple links. PC board layout determines whether they are physically at the end of a chain or not. At reset, each device receives a bit pattern from the device on the other end of the link. If one of the tunnel links or one of the bridge secondary interfaces does not have a device attached to it, all CAD inputs are seen as logic "0". After reset, if a device senses one of its links is unconnected, it sets the EOC bit in the corresponding Link Control register. This is shown in Figure 10-9 on page 244.

Figure 10-9. End-Of-Chain Device Determination


After initialization, links may also be enabled and disabled under software control by setting the EOC bit in the Link Control register(s).

Logging End-Of-Chain Errors

In the event a packet reaches an EOC device in error, the EOC Err bit will be set in the Link Error CSR. Refer to Figure 10-10 on page 245. Note that the figure shows the Advanced Capability register format for a tunnel device; it has two sets of Link Control, Link Configuration, Link Frequency, Link Error, and Link Frequency capability registers, one set for each link. An end (cave) device implements only one set of each of these registers.

Figure 10-10. Link Error Register: End-Of-Chain Error Logging Bits


Programming The EOC Error Reporting Policy

As in the cases of CRC, protocol, and receive buffer overflow errors, EOC errors may be mapped to fatal or non-fatal interrupts. There is no option for Sync flood reporting for EOC errors. The reporting strategy is programmed in the Error Handling CSR, as shown in Figure 10-11 on page 246.

Figure 10-11. Error Handling CSR: End-Of-Chain Error Reporting Enables


Chain Down Errors

If a device detects a Sync flood or an error that would cause a Sync flood, it sets the Chain Fail bit in its Error Handling register and waits for a bus reset. The action taken when the chain goes down depends on the device type:

  • Host interfaces track outstanding non-posted requests for devices below them. On chain down errors, they flush the state of all internal non-posted requests and return non-NXA error responses to the requesters for each one that is pending.

  • Slave devices have their internal states re-initialized when the RESET# occurs after a chain goes down; there is generally no need for a flush operation of non-posted requests by these devices. If a slave device were implemented that maintained its state through a HyperTransport RESET#, it would need to perform the non-posted request flush operation after the chain goes down as well.

Figure 10-12. Error Handling Register: Chain Fail Bit


Response Errors

All non-posted requests that are issued require either a Read or Target Done response. The requester programs UnitID and source tag information into each request packet it issues so that when the response is returned it may be tagged with the same information and find its way back to the original requester. When a downstream response is detected, each device compares the UnitID to its own to see if it should claim the response; if so, it then checks the source tag to determine which of its outstanding transactions is being completed.

It is possible a response may return and be claimed by a requester (UnitID is OK), but not be recognized as being valid. Some of the reasons this might happen include:

  1. A read response (RdResponse) is received by a device which carries the correct UnitID, but has an invalid source tag (SrcTag field). The recipient cannot associate the response with any of its outstanding transactions.

  2. A read response (with data) is received with the correct UnitID and SrcTag fields, but the response type is incorrect (requester is expecting a Target Done response).

  3. A Target Done response is received for a RdSized or Atomic RMW request.

  4. A read response (with data) is received for a RdSized or Atomic RMW, but the (data) count field doesn't match what the requester originally asked for.
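
The checks in this list can be sketched as a lookup against the requester's table of outstanding transactions. The function name, the table layout, and the string labels are hypothetical; the sketch assumes the UnitID has already matched.

```python
def check_response(outstanding, src_tag, resp_type, count=None):
    """Return None if the response is valid, else a short reason string.

    outstanding maps SrcTag -> (expected_type, expected_count), where
    expected_type is 'RdResponse' or 'TgtDone' and expected_count is
    None for Target Done.
    """
    if src_tag not in outstanding:
        return "unknown SrcTag"
    expected_type, expected_count = outstanding[src_tag]
    if resp_type != expected_type:
        return "wrong response type"
    if resp_type == "RdResponse" and count != expected_count:
        return "count mismatch"
    return None
```

Any non-None result corresponds to one of the response error cases listed above.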

Response Error Logging And Reporting Policy

The logging of response errors, as well as the reporting policy to be used when they occur, is handled in the Error Handling CSR. These errors may be mapped into fatal or non-fatal interrupts. There is no Sync flood option for response errors. Figure 10-13 on page 249 depicts the response error logging bit and the fatal and non-fatal interrupt enables.

Figure 10-13. Error Handling CSR: Response Error Logging And Reporting Policy Bits
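
The reporting options described for each error type in this section can be collected into one lookup table. The dictionary keys and the helper name below are illustrative labels, not register field names.

```python
# Reporting options per error type, as described in the sections above.
REPORTING_OPTIONS = {
    "crc":              {"fatal", "nonfatal", "sync_flood"},
    "protocol":         {"fatal", "nonfatal", "sync_flood"},
    "receive_overflow": {"fatal", "nonfatal", "sync_flood"},
    "end_of_chain":     {"fatal", "nonfatal"},
    "response":         {"fatal", "nonfatal"},
}


def can_report(error_type, method):
    """Return True if the given error type can be mapped to the method."""
    return method in REPORTING_OPTIONS[error_type]
```

Only the three link-level error types offer Sync flood; EOC and response errors are limited to interrupts.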




HyperTransport System Architecture
ISBN: 0321168453
EAN: 2147483647
Year: 2003
Pages: 182
