HyperTransport Protocol Concepts | HyperTransport™ System Architecture

Channels and Streams

In HyperTransport, as in other protocols, ordering rules are needed for reads, posted and non-posted writes, and responses returning from earlier requests. In a point-to-point fabric, all of these travel over the same link, and transactions from different devices also merge onto the same links. HyperTransport implements Virtual Channels and I/O Streams to differentiate a device's posted requests, non-posted requests, and responses from each other and from those originating from different sources.

Virtual Channels

HyperTransport defines a set of three required virtual channels that dictate transaction management and ordering:

Posted Requests: Posted write transactions belong to this channel.
Non-Posted Requests: Reads, non-posted writes, and flushes belong to this channel.
Responses: Read responses and target done packets belong to this channel.
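
The channel assignment above can be sketched as a simple lookup. This is only an illustration: the command names used as dictionary keys are hypothetical labels, not the command encodings carried in actual HT packets.

```python
from enum import Enum


class VirtualChannel(Enum):
    POSTED = "posted requests"
    NON_POSTED = "non-posted requests"
    RESPONSE = "responses"


# Hypothetical command labels chosen for illustration; real HT packets
# encode the command in a bit field, which is not modeled here.
CHANNEL_OF = {
    "posted_write": VirtualChannel.POSTED,
    "read": VirtualChannel.NON_POSTED,
    "non_posted_write": VirtualChannel.NON_POSTED,
    "flush": VirtualChannel.NON_POSTED,
    "read_response": VirtualChannel.RESPONSE,
    "target_done": VirtualChannel.RESPONSE,
}


def channel_for(command: str) -> VirtualChannel:
    """Return the virtual channel that a command travels in."""
    return CHANNEL_OF[command]
```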

An additional set of Posted, Non-Posted, and Response virtual channels is required for isochronous transactions, if supported. This dedicated set of virtual channels assists in guaranteeing the bandwidth required by isochronous transactions.

When packets are sent over a link, they are sent in one of the virtual channels. Attribute bits in each packet identify the channel in which it travels. Each device is responsible for maintaining queues and buffers for managing the virtual channels and enforcing ordering rules.

Each device implements separate command/data buffers for each of the three required virtual channels, as pictured in Figure 2-14 on page 38. Doing so ensures that transactions moving in one virtual channel do not block transactions moving in another virtual channel. There are I/O ordering rules covering interactions between the three virtual channels of the same I/O stream. Transactions in different I/O streams have no ordering rules (with the exception of ordering rules associated with Fence requests). Enforcing ordering rules between transactions in the same I/O stream prevents deadlocks from occurring and guarantees that data is transferred correctly. Based on ordering requirements, nodes may not:

Make accepting a request dependent on the ability of that node to issue an outgoing request.
Make accepting a request dependent on the receipt of a response due to a request previously issued by that node.
Make issuing a response dependent on the ability to issue a request.
Make issuing a response dependent upon receipt of a response due to a previous request.

Figure 2-14. HT Virtual Channels

I/O Streams

In addition to virtual channels, HyperTransport also defines I/O streams. An I/O stream consists of the requests, responses, and data associated with a particular UnitID and HyperTransport link. Ordering rules require that I/O streams be treated independently from each other. When a request/response packet is sent, it is tagged with sender attributes (UnitID, Source Tag, and Sequence ID) that are used by other devices to identify the transaction stream in use, and the required ordering within it. Entries within the virtual channel buffers include the transaction stream identifiers (attributes).

Used properly, the independent I/O streams create the effect of separate connections between devices and the host bridge above them, much as a shared bus connection appears.
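
As a rough sketch of stream separation, the following groups packets by the (UnitID, link) pair that defines an I/O stream, keeping arrival order within each stream. The `Packet` type, its field names, and the link labels are hypothetical and exist only for this illustration.

```python
from collections import defaultdict
from typing import NamedTuple


class Packet(NamedTuple):
    link: str      # hypothetical label for the HyperTransport link
    unit_id: int   # UnitID of the sender
    payload: str   # stand-in for the rest of the packet


def split_into_streams(packets):
    """Group packets by (UnitID, link), preserving arrival order per stream."""
    streams = defaultdict(list)
    for pkt in packets:
        streams[(pkt.unit_id, pkt.link)].append(pkt.payload)
    return dict(streams)
```

Ordering then only needs to be reasoned about inside each list; packets in different lists belong to different streams and can be handled independently.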

Transactions (Requests, Responses, and Data)

Transfers initiated by HT devices require one or more transactions to complete. These devices may need to perform a variety of operations that include:

sending or forwarding data (write)
requesting that a target return data to it (read)
performing an atomic read/modify/write operation
gaining additional control over the ordering of its posted transactions (using Flush and Fence commands)
broadcasting a message to all downstream agents (done by bridges only)

The format of these transactions also varies depending on the type of operation (request) specified, as listed below:

Requests that behave like reads and that require a read response and data (i.e., Sized Read, Atomic RMW)
Requests that behave like writes and require a target done response to confirm completion (i.e., Non-posted Sized Writes)
Posted Requests that behave like writes but don't require any target response or data (i.e., Posted Sized Writes, Broadcast Message, or Fence)
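
The three categories above can be summarized as a table from request type to the response the requester should expect. This is a sketch with hypothetical request labels; Flush is left out because it is not one of the request types named in the list.

```python
# Hypothetical request labels; the grouping follows the three categories above.
EXPECTED_RESPONSE = {
    "sized_read": "read response with data",
    "atomic_rmw": "read response with data",
    "non_posted_sized_write": "target done",
    "posted_sized_write": None,   # posted: no response is returned
    "broadcast_message": None,
    "fence": None,
}


def expected_response(request: str):
    """Return the response a requester waits for, or None for posted requests."""
    return EXPECTED_RESPONSE[request]


def is_posted(request: str) -> bool:
    """A request is posted when no response is expected for it."""
    return EXPECTED_RESPONSE[request] is None
```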

Transaction Requests

Every transaction begins with the transmission of a Request Packet. Note that the actual format of a request packet varies depending on the particular request, but in general each request contains the following information:

Target address within HyperTransport memory space
The request type (command)
Sender's transaction stream ID (UnitID, SeqID)
The amount of data to be transferred (if any)
Other attributes: virtual channel to use, etc.

HT defines seven basic request types. The characteristics of each request type are discussed in the following sections.

Transaction Responses

Responses are generated by the target device in cases where data is to be returned from the target device, or when confirmation of transaction completion is required. Specifically, in HyperTransport, a response follows all non-posted requests. A target responds to:

Return data to satisfy an earlier read or Atomic Read-Modify Write (RMW) request
Confirm the arrival of non-posted write data
Confirm the completion of a Flush operation
Report errors

The information in a response varies both with the Request that causes it, and with the direction the response is traveling in the HyperTransport fabric. However, content of an HT response generally includes:

Response type (command)
Response direction (upstream or downstream)
Transaction stream (UnitID, Source Tag)
Misc. info : virtual channel to use, error, etc.

Transaction Types

As discussed earlier, HT defines seven basic transaction types. This section introduces the characteristics of each type and defines any sub-types that exist.

Sized Read Transactions

Sized Read transactions permit remote access to a device memory or memory-mapped I/O (MMIO) address space. The operation may be initiated on HT from the host bridge (PIO operation), or an HT device may wish to read data from memory (DMA operation) or from another HT device (peer-to-peer operation). Two types of Sized Read transactions define the different quantities of data to be read.

Sized (Byte) Read: this request defines an aligned 4-byte block of address space from which 0 to 4 bytes can be read. Any single byte location or any group of bytes within the 4-byte block can be accessed. The typical use of this transaction is for reading MMIO registers.
Sized (DW) Read: this request identifies an aligned 64-byte block of address space from which 4 to 64 bytes can be read. Any contiguous group of aligned 4-byte groups (DWs) can be accessed.
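
The size and alignment constraints for the two read types can be checked with a few lines of arithmetic. The sketch below assumes, as the description implies, that a DW read must stay inside its aligned 64-byte block, and it represents the byte selection of a Byte read as a 4-bit mask in which bit i selects byte i. Both the function names and that mask layout are assumptions made for illustration.

```python
from typing import List

DW_BYTES = 4          # one doubleword
DW_BLOCK_BYTES = 64   # aligned block addressed by a Sized (DW) Read


def dw_read_is_valid(address: int, dword_count: int) -> bool:
    """Check a Sized (DW) Read against the constraints described above."""
    if address % DW_BYTES != 0:
        return False                      # must start on a DW boundary
    if not 1 <= dword_count <= DW_BLOCK_BYTES // DW_BYTES:
        return False                      # 4 to 64 bytes, i.e. 1 to 16 DWs
    offset = address % DW_BLOCK_BYTES     # position inside the 64-byte block
    return offset + dword_count * DW_BYTES <= DW_BLOCK_BYTES


def bytes_selected(byte_mask: int) -> List[int]:
    """List the byte offsets (0-3) selected by a 4-bit mask for a Byte Read."""
    if not 0 <= byte_mask <= 0b1111:
        raise ValueError("mask must cover exactly one aligned 4-byte block")
    return [i for i in range(4) if byte_mask & (1 << i)]
```

For example, `bytes_selected(0b0101)` selects offsets 0 and 2, a non-contiguous pair within the same 4-byte block.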

The protocol associated with Sized Read transactions is illustrated in Figure 2-15 on page 41. These transactions begin with the delivery of a Sized Read Request packet and complete when the target device returns a corresponding response packet followed by data.

Figure 2-15. Example Protocol: Receiving Data from Target

The basic rules for maintaining high performance of HT reads include:

For reads, the requester won't issue the request until it has buffers available to receive all requested data without wait states.
The requester won't issue the request until it knows the target has room in its transaction queue to accept it (Flow Control).
Upon receiving the read request, the target won't issue the read response until it has all requested data and status available to send. Once it starts the response, there will be no wait states until the read response packet and all data (up to 16 dwords) have been sent.
Upon receiving the response, the requester will check the error bits to make certain the data is valid.
The target and any bridges in the path de-allocate buffers and queue entries as soon as the response has been sent.
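
The first two rules amount to a pair of preconditions the requester evaluates before sending a read. The sketch below models them with hypothetical counters: free receive-buffer space measured in dwords, and a count of request slots the target is known to have free. How a device actually learns that count is the subject of Flow Control and is not modeled here.

```python
def can_issue_read(free_rx_dwords: int, requested_dwords: int,
                   target_request_slots: int) -> bool:
    """Requester-side check performed before sending a read request."""
    # Rule 1: room to receive all requested data without wait states.
    has_buffer = free_rx_dwords >= requested_dwords
    # Rule 2: the target is known to have queue space for the request.
    target_ready = target_request_slots > 0
    return has_buffer and target_ready
```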

Sized Write Transactions

Sized Write transactions permit the host bridge to send data to a HyperTransport device (PIO operation), or permit a HyperTransport device to send data to memory (DMA operation) or to another device (peer-to-peer operation). Two types of Sized Write requests permit different sizes of memory or MMIO space to be accessed.

Sized (Byte) Write: this request identifies an aligned block of 32 bytes of address space into which data is to be written. The amount of data to be written can be from 0 to 32 bytes. Note that the maximum transfer size of 32 bytes only occurs if the start address is 32-byte aligned. If the start address is not on a 32-byte boundary, the transfer will be less than 32 bytes. Furthermore, no Byte Write transaction crosses a 32-byte address boundary. Any combination of bytes (not necessarily contiguous) can be written from the start address up to the next aligned 32-byte boundary.
Sized (DW) Write: this request identifies an aligned block of 64 bytes of address space into which data can be written. The start address must be aligned on a 4-byte boundary, and data to be written is always aligned in contiguous 4-byte groups (DWs). The amount of data written can be from 1 to 16 DWs.
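
The Byte Write boundary rule can be expressed as simple arithmetic. The sketch below follows the description above: it computes how many bytes remain before the next 32-byte boundary and rejects a write that would cross it. The function names and the use of one enable flag per byte are assumptions for illustration, not the packet format.

```python
BYTE_WRITE_BLOCK = 32  # a Byte Write never crosses a 32-byte boundary


def max_byte_write_length(start_address: int) -> int:
    """Bytes available from start_address up to the next 32-byte boundary."""
    return BYTE_WRITE_BLOCK - (start_address % BYTE_WRITE_BLOCK)


def byte_write_targets(start_address: int, byte_enables: list) -> list:
    """Return the addresses written, given one enable flag per byte."""
    if len(byte_enables) > max_byte_write_length(start_address):
        raise ValueError("Byte Write would cross a 32-byte boundary")
    # Enabled bytes need not be contiguous.
    return [start_address + i for i, en in enumerate(byte_enables) if en]
```

A write starting at an aligned address can cover the full 32 bytes, while one starting at offset 5 within a block can cover at most 27.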

Non-Posted Sized Writes

The packet protocol associated with Sized Write transactions depends on whether the Sized Write is posted or not. Figure 2-16 on page 43 illustrates the case of a non-posted Sized Write. This diagram illustrates the basic HT split-transaction sequence: a request followed by a target done response.

Figure 2-16. Example Protocol: Non-Posted Sized Write

The basic rules for maintaining high performance in HT writes include:

The requester won't issue the non-posted write request until it knows the target can accommodate the request and all of the data to be sent. Refer to the section on Flow Control to see how this is managed for writes.
Upon receiving the write request and data, the target won't issue the target done response until it has properly delivered all data. Once it starts the response, there will be no wait states until the four bytes of the target done response packet have been sent.
Upon receiving the response, the requester will check the error bits to make certain delivery is complete.
The target and any bridges in the path de-allocate request queue entries as soon as the target done response has been sent.

Posted Sized Writes

Figure 2-17 on page 43 depicts a posted Sized Write. In both cases, posted and non-posted, the transaction begins with the Sized Write request followed by the data. Non-posted operations include a response packet that is delivered back to the requester as verification that the operation has completed, whereas posted writes end once the data is sent.

Figure 2-17. Example Protocol: Posted Sized Write

Flush

Flush is useful in cases where a device must be certain that its posted writes are "visible" in host memory before it takes subsequent action. Flush is an upstream, non-posted "dummy" read command that pushes all posted requests ahead of it to memory. Note that only previously posted writes within the same transaction stream as the Flush transaction need be flushed to memory. When an intermediate bridge receives a Flush transaction, it generates the one or more Sized Write transactions necessary to forward all data in its upstream posted-write buffer toward the host bridge. Ultimately, the host bridge receives the command and flushes the previously posted writes to memory. Receipt of the Target Done response from the host bridge is confirmation that the flush operation has completed.

The protocol used when performing a Flush transaction is depicted in Figure 2-18. When the Flush request reaches the host bridge it completes previously-posted writes to memory. In this example two previously-posted writes are flushed to memory, after which the Target Done (TgtDone) response is returned to the requester.
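
The host bridge side of this sequence can be sketched with a toy model. It assumes a single link, so a stream is identified by UnitID alone, and it commits only the posted writes from the flushing stream, which is the minimum the text requires. The class and method names are hypothetical.

```python
from collections import deque


class HostBridgeModel:
    """Toy host bridge holding posted writes not yet visible in memory."""

    def __init__(self):
        self.pending = deque()   # (unit_id, address, value) in arrival order
        self.memory = {}

    def posted_write(self, unit_id, address, value):
        """Accept a posted write; no response is generated."""
        self.pending.append((unit_id, address, value))

    def flush(self, unit_id):
        """Commit pending posted writes from the same stream, then respond."""
        remaining = deque()
        for uid, address, value in self.pending:
            if uid == unit_id:
                self.memory[address] = value          # now visible in memory
            else:
                remaining.append((uid, address, value))
        self.pending = remaining
        return "TgtDone"   # response returned to the requester
```

After two posted writes from UnitID 1 and one from UnitID 2, calling `flush(1)` commits the two writes from UnitID 1 and leaves the other pending, mirroring the two-write example in the figure.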

Figure 2-18. Example Protocol: Flush Transaction

Fence

Fence provides a barrier between posted writes that applies across all UnitIDs, and therefore across all I/O streams and all virtual channels; in that sense the Fence command is global. The Fence command travels in the posted request virtual channel and has no response. The behavior of a Fence is as follows:

The PassPW bit must be clear so that the Fence pushes all requests in the posted channel ahead of it.
Packets with their PassPW bit clear will not pass a Fence regardless of UnitID.
Packets with their PassPW bit set may pass a Fence.
A non-posted request with PassPW clear will not pass a Fence as it is forwarded through the chain, but it may do so after it reaches a host bridge.

Fence requests are never issued as part of an ordered sequence, so their SeqID will always be 0. Fence requests with PassPW set, or with a nonzero SeqID, are legal but may have an unpredictable effect. Fence is only issued from a device to a host bridge or from one host bridge to another. Devices are never the target of a Fence, so they do not need to perform the intended function. If a device at the end of the chain receives a Fence, it must decode it properly to maintain proper operation of the flow control buffers. The device should then drop it.
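
The passing rules reduce to a test on the PassPW bit alone, since UnitID plays no part. A minimal sketch, with hypothetical function names, covering forwarding through the chain only (the host bridge exception for non-posted requests is not modeled):

```python
def may_pass_fence(pass_pw: bool, unit_id: int) -> bool:
    """Decide whether a packet may overtake a Fence while being forwarded.

    unit_id is accepted but deliberately ignored: the Fence applies
    regardless of which UnitID issued the packet.
    """
    return pass_pw


def fence_is_well_formed(pass_pw: bool, seq_id: int) -> bool:
    """A Fence is expected to carry PassPW clear and a SeqID of 0.

    Other combinations are legal but may behave unpredictably.
    """
    return (not pass_pw) and seq_id == 0
```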

Figure 2-19. Example Protocol: Fence Transaction

Atomic

Atomic Read-Modify-Write (ARMW) is used so that a memory location may be read, evaluated and modified, then conditionally written back, all without a race condition caused by another device attempting the same thing at the same time. HT defines two types of Atomic operation:

Fetch & Add
Compare & Swap

The protocol associated with an Atomic Transaction is shown in Figure 2-20 on page 46. The request is followed by a data packet that contains the argument of the atomic operation. The target device performs the requested operation and returns the original data read from the target location.
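
The two operations can be sketched as functions that modify a location and return its original contents, which is what the target sends back in the response. Memory is modeled as a plain dictionary, and the 64-bit operand width used for wraparound is an assumption of this sketch rather than something stated above.

```python
def fetch_and_add(memory: dict, address: int, addend: int,
                  width_bits: int = 64) -> int:
    """Add addend to the location and return the value it held before."""
    original = memory[address]
    # Wrap on overflow; the 64-bit default width is an assumption.
    memory[address] = (original + addend) % (1 << width_bits)
    return original


def compare_and_swap(memory: dict, address: int,
                     expected: int, new_value: int) -> int:
    """Write new_value only if the location holds expected; return the old value."""
    original = memory[address]
    if original == expected:
        memory[address] = new_value
    return original
```

In both cases the requester learns the outcome from the returned original value: for Compare & Swap, the swap took place only if that value equals the compare argument it sent.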

Figure 2-20. Example Protocol: Atomic Operation

Broadcast

Broadcast Message requests are sent downstream by host bridges, and are used to send messages to all devices. They are accepted and forwarded by all agents onto all links.

Figure 2-21 illustrates the operation of a Broadcast transaction. This example shows a broadcast request working its way down the HT fabric. All devices recognize the Broadcast Message request type and the reserved address, accept the message, and pass it along. Examples of Broadcast Messages include Halt, Shutdown, and the End-Of-Interrupt (EOI) message.
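
The accept-and-forward behavior can be sketched as a walk over the fabric below the host bridge. The topology is modeled as a dictionary mapping each node to the nodes directly downstream of it; the node names and the function are hypothetical.

```python
def broadcast(topology: dict, source: str, message: str) -> dict:
    """Deliver message to every node downstream of source.

    Returns a mapping from each node reached to the message it accepted.
    """
    received = {}
    pending = list(topology.get(source, []))
    while pending:
        node = pending.pop(0)
        received[node] = message                 # the agent accepts the message
        pending.extend(topology.get(node, []))   # and forwards it downstream
    return received
```

With a host bridge feeding a tunnel that in turn feeds another tunnel and a bridge with a device below it, a single call from the host bridge reaches all four downstream agents.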

Figure 2-21. Example of Packet Flow During Broadcast Transaction