PCI Bus Issues


Several features of the PCI bus must be handled in the correct fashion when interfacing with the HT bus. For background information and details regarding PCI ordering, refer to MindShare's PCI System Architecture book, 4th edition.

PCI Ordering Requirements

Transaction ordering on the PCI bus is based on the Producer/Consumer programming model. This model involves 5 elements:

  1. Producer: PCI master that sources data to a memory target

  2. Target: main memory or any PCI device containing memory

  3. Consumer: PCI master that reads and processes the Producer data from the target

  4. Flag element: a memory or I/O location updated by the Producer to indicate that all data has been delivered to the target, and checked by the Consumer to determine when it can begin to read and process the data.

  5. Status element: a memory or I/O location updated by the Consumer to indicate that it has processed all of the Producer data, and checked by the Producer to determine when the next batch of data can be sent.

This model works flawlessly in PCI when all elements reside on the same shared PCI bus. When these elements reside on different PCI buses (i.e., across PCI-to-PCI bridges), the model can fail unless the PCI ordering rules are followed.
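The failure mode can be illustrated with a minimal Python sketch. It is a toy model, not bridge behavior taken from the specification: the `run` function, the `data` and `flag` locations, and the "stale"/"fresh" values are all hypothetical. It assumes the Consumer reads the data as soon as it sees the flag set.

```python
def run(writes):
    """Apply posted writes in arrival order; the Consumer polls after each one.

    Hypothetical model: 'data' is the target location and 'flag' is the
    flag element of the Producer/Consumer model.
    """
    memory = {"data": "stale", "flag": 0}
    for location, value in writes:
        memory[location] = value
        if memory["flag"] == 1:       # Consumer sees the flag set...
            return memory["data"]     # ...and reads the data immediately
    return None


# The Producer writes the data first, then sets the flag.
producer_writes = [("data", "fresh"), ("flag", 1)]

# Writes arrive in the order they were issued: the Consumer reads valid data.
in_order = run(producer_writes)

# The flag write passes the data write (an ordering violation):
# the Consumer reads the old contents of the target.
reordered = run(list(reversed(producer_writes)))
```

In this sketch `in_order` holds the new data, while `reordered` holds the old contents of the target. The ordering rules exist to prevent the second outcome.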

The PCI specification, versions 2.2 and 2.3, defines the required transaction ordering rules. These ordering rules are included in this section as review and to identify rules that may have no purpose in some HT designs. Table 20-1 on page 459 defines the ordering rules for PCI bridges. When reading the table, please note the following:

  • PMW stands for posted memory write.

  • DRR and DRC stand for Delayed Read Request and Delayed Read Completion, respectively.

  • DWR and DWC stand for Delayed Write Request and Delayed Write Completion, respectively.

  • "Yes" specifies that the transaction just latched must be ordered ahead of the previously latched transaction indicated in the column heading.

  • "No" specifies that the transaction just latched must never be ordered ahead of the previously latched transaction indicated in the column heading.

  • "Yes/No" entries mean that the transaction just latched is allowed to be ordered ahead of the previously latched transaction indicated in the column heading, but such reordering is not required. The Producer/Consumer model works correctly either way.

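Table 20-1. PCI Ordering Rules

| Transaction just latched | Posted Memory Write: PMW (column 1) | Delayed Request: DRR (column 2) | Delayed Request: DWR (column 3) | Delayed Completion: DRC (column 4) | Delayed Completion: DWC (column 5) |
| --- | --- | --- | --- | --- | --- |
| PMW (row 1) | No | Yes | Yes | Yes | Yes |
| DRR (row 2) | No | Yes/No | Yes/No | Yes/No | Yes/No |
| DWR (row 3) | No | Yes/No | Yes/No | Yes/No | Yes/No |
| DRC (row 4) | No | Yes | Yes | Yes/No | Yes/No |
| DWC (row 5) | Yes/No | Yes | Yes | Yes/No | Yes/No |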

Note that all of the transaction types listed under the heading "Transaction just latched" (except Delayed Write Completions, because the write has already completed) must never be reordered ahead of a previously posted memory write transaction (column 1). These rules are present to enforce proper operation of the Producer/Consumer model. HT supports these rules provided that transactions originating from or targeting PCI devices do not use the PassPW feature in HT.
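Table 20-1 can be encoded as a small lookup, shown here as a Python sketch. The names `TYPES`, `ORDERING`, `rule`, and `may_pass` are hypothetical. The sketch assumes that each cell spanning several rows or columns in the printed table holds the same value for every position it covers, as in the PCI 2.2 bridge ordering table.

```python
# Column order: the previously latched transaction (columns 1 to 5).
TYPES = ("PMW", "DRR", "DWR", "DRC", "DWC")

# Rows: the transaction just latched (rows 1 to 5).
ORDERING = {
    "PMW": ("No", "Yes", "Yes", "Yes", "Yes"),
    "DRR": ("No", "Yes/No", "Yes/No", "Yes/No", "Yes/No"),
    "DWR": ("No", "Yes/No", "Yes/No", "Yes/No", "Yes/No"),
    "DRC": ("No", "Yes", "Yes", "Yes/No", "Yes/No"),
    "DWC": ("Yes/No", "Yes", "Yes", "Yes/No", "Yes/No"),
}


def rule(just_latched, previous):
    """Return the table entry for a (row, column) pair."""
    return ORDERING[just_latched][TYPES.index(previous)]


def may_pass(just_latched, previous):
    """True unless the table forbids ordering ahead of the earlier transaction."""
    return rule(just_latched, previous) != "No"


# Every transaction type except DWC is barred from passing a posted write.
blocked_by_pmw = [t for t in TYPES if not may_pass(t, "PMW")]
```

Here `blocked_by_pmw` lists the row types that must stay behind a previously posted memory write, which corresponds to the "No" entries in column 1.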

Avoiding Deadlocks

PCI ordering rules require that Posted Memory Writes (PMWs) in row 1 be ordered ahead of the delayed requests and delayed completions listed in columns 2-5. This requirement is based on avoiding potential deadlocks. Each of these deadlocks involves scenarios arising from the use of PCI bridges based on earlier versions of the specification. If all PCI bridge designs used in HT platforms are based on 2.1 and later versions of the PCI specification, the PCI ordering rules with "Yes" entries in row 1 can be treated as "Yes/No."

Table 20-1 also specifies that Delayed Read Completions and Delayed Write Completions (rows 4 and 5) must be ordered ahead of the Delayed Requests in columns 2 and 3. These ordering rules arise from potential deadlocks that can occur when two hierarchical bridges are implemented as illustrated in Figure 20-1 on page 460. Refer to MindShare's PCI System Architecture book for a detailed explanation of this deadlock. If a given platform avoids this topology, then the "Yes" entries in rows 4 and 5 can be treated as "Yes/No."

Figure 20-1. Topology Causing Deadlock Scenario for Rows 4 and 5


Subtractive Decode

PCI employs a technique referred to as subtractive decode to handle devices that are mapped into memory or I/O address space by user selection of switches and jumpers (e.g. ISA devices). Consequently, configuration software has no knowledge of the resources assigned to these devices. Fortunately, these PC legacy devices are mapped into relatively small ranges of address space that can be reserved by platform configuration software.

Subtractive Decode: The PCI Method

Subtractive decode is a process of elimination. Since configuration software allocates and assigns address space for PCI, HT, AGP, and other devices, any access to an address that has not been assigned can be presumed to target a legacy device, or may be an errant address.

All PCI devices must perform a positive decode to determine if they are being targeted by the current request. This decode must be performed as a fast, medium, or slow decode. The device targeted must indicate that it will respond to the request by signaling device select (DEVSEL#) across the shared bus. When device driver software issues a request with an address that has not been assigned by configuration software, no PCI device is targeted (i.e., no DEVSEL# is asserted within the time allowed). By process of elimination, the subtractive decode agent recognizes that no PCI device has responded; it therefore asserts DEVSEL# and forwards the transaction to the ISA bus, where the request is completed.

Subtractive Decode: The Simple HT Method

An HT system with a single chain can possibly implement subtractive decode without extra host support required of more complex HT systems. Figure 20-2 on page 462 illustrates a simple system with a single-hosted chain. Note that the subtractive decode agent is at the end of the chain. If a request initiated at the host reaches the South Bridge, then the bridge knows that no other HT agents have claimed the transaction based on positive decode; therefore, a subtractive decode is safe.

Figure 20-2. Subtractive Decode in a Simple HT System


Subtractive Decode: HT Systems Requiring Extra Support

When the subtractive decode agent is not at the end of a single-hosted chain, or when more than one HT I/O chain is implemented in a system, subtractive decode becomes more difficult.

The Problem

HyperTransport devices in a chain do not share the same bus as in PCI, so a subtractive decode agent cannot detect if a request has not been claimed by other devices on the chain.

The Solution

As described previously, configuration software assigns addresses to all HT, PCI, and AGP devices. Therefore, the host knows when a request will result in a positive decode and when it will not. The specification requires that all hosts connecting to HyperTransport I/O chains implement registers that identify the positive decode ranges for all HyperTransport technology I/O devices and bridges (except as noted in the simple method). One of these I/O chains may also include a subtractive bridge (typically leading to an ISA or LPC bus). Requests that do not match any of the positive ranges must be issued with the Compat bit set, and must be routed to the chain containing the subtractive decode bridge. This chain is referred to as the compatibility chain.

The Compat bit indicates to the subtractive decode bridge that it should claim the request, regardless of address. Requests that fall within the positive decode ranges must not have the Compat bit set, and are passed to the I/O chain upon which the target device resides. The target chain may be the compatibility or any other I/O chain.
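The host-side routing decision described above can be sketched in Python. This is an illustration only: the `Range` record, the `route` function, and the example addresses and chain numbers are hypothetical and do not correspond to registers or values defined by the HT specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Range:
    """One positive decode range held by the host (hypothetical layout)."""
    base: int
    limit: int   # inclusive upper bound
    chain: int   # I/O chain on which the target device resides


def route(address, ranges, compat_chain):
    """Return (chain, compat_bit) for a host-initiated request."""
    for r in ranges:
        if r.base <= address <= r.limit:
            # Positive decode: Compat bit clear, send to the owning chain.
            return r.chain, 0
    # No positive range matched: set Compat and use the compatibility chain.
    return compat_chain, 1


# Example values, chosen only for illustration.
ranges = [
    Range(0xE000_0000, 0xE0FF_FFFF, chain=0),
    Range(0xF000_0000, 0xF7FF_FFFF, chain=1),
]
positive_hit = route(0xF000_1000, ranges, compat_chain=0)
unclaimed = route(0x0000_03F8, ranges, compat_chain=0)
```

Note that a positive decode may still target the compatibility chain (the first range above); only the Compat bit distinguishes a subtractive request from a positively decoded one.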

Subtractive Decode: Behind PCI Bridge

Figure 20-3 on page 463 illustrates the subtractive decode agent residing on a PCI bus behind an HT-to-PCI bridge. When the host initiates a request that falls outside the assigned positive address ranges, it sets the Compat bit and delivers the request to the HT-to-PCI bridge. The bridge detects that the Compat bit is set and forwards the transaction to the PCI bus, where the South Bridge performs the subtractive decode.

Figure 20-3. Subtractive Decode Agent Behind PCI Bridge


Subtractive Decode: Legacy System Considerations

Some legacy software requires that the subtractive decoder reside on Bus 0. Figure 20-4 on page 464 illustrates a system where the subtractive decode agent resides on the PCI bus. Normally, this topology would assign Bus 0 to the HT chain, and the PCI bus would be numbered as Bus 1. This would violate the software requirement that the compatibility bus be Bus 0. To solve this problem, the HT-to-PCI bridge could be implemented with a regular device configuration header rather than a bridge configuration header. In this way, configuration software will view the devices on the PCI bus as residing on Bus 0 along with the HT devices on chain 0. The bridge will simply forward all transactions to the PCI bus as well as downstream to other devices residing on the chain. Any transaction with the Compat bit set would be sent to the PCI bus.

Figure 20-4. Subtractive Decode Agent on PCI Bus 0


Subtractive Decode: Without Software Initialization

Systems may optionally set up the subtractive decode path in hardware so that software initialization is not required to enable subtractive decode. This remains true even though devices on the HyperTransport chain do not yet have their UnitIDs programmed. When the Compat bit is set, accesses are routed to the subtractive decode device. This approach may be useful when accessing the boot ROM following power-up.

HT-to-PCI Address Remapping

Address remapping between HT address space and PCI space is discussed in Chapter 21, entitled "Address Remapping," on page 477.

Transaction Translation

When transactions are passed between the HT and PCI buses, the command type must be translated for the other protocol. Table 20-2 lists the PCI to HT command conversion that must take place when bridging from PCI to HT.

Table 20-2. PCI to HT Command Conversion

| PCI Transaction Type | HyperTransport Packet Type |
| --- | --- |
| Posted Memory Write | WrSized, Posted, PassPW = 0, Data Error = PERR# [1] |
| Delayed Read Request | RdSized, PassPW = 0, RespPassPW = 0 |
| Delayed Write Request [2] | WrSized, Nonposted, PassPW = 0 |
| Delayed Read Completion | RdResponse, PassPW = 0 (from request packet), Data Error = PERR# [3] |
| Delayed Write Completion | TgtDone, PassPW = 1 [4] |

[1] DataError is set if the bridge detected a data parity error.

[2] DataError is set if a data parity error is detected in a non-posted write, and the write is discarded (only if Parity Error Response Enable is set).

[3] DataError is set when PERR# is detected during a Delayed Write Completion (only if the Parity Error Response Enable is set for the PCI interface).

[4] To ensure correct ordering of some message sequences (e.g., Interrupt and STPCLK virtual signaling), the PassPW bit must be cleared in the TgtDone.

Table 20-3 lists the command conversion required when bridging from HT to PCI.

Table 20-3. HT to PCI Command Conversion

| HyperTransport Packet Type | PCI Transaction Type |
| --- | --- |
| WrSized to Memory Space [1] | Posted Memory Write [3] |
| WrSized to Configuration or I/O Space | Delayed Write Request [2] |
| RdSized from Memory, Configuration, or I/O Space | Delayed Read Request [2] |
| RdSized due to X86 Interrupt Acknowledge Cycle | Interrupt Acknowledge |
| RdResponse | Delayed Read Completion [3] |
| TgtDone | Delayed Write Completion |

[1] PCI requires that all memory writes be posted; thus, HT non-posted writes will be treated as posted operations on PCI.

[2] Delayed Read Requests are never posted operations and will always result in a Read Response or Target Done.

[3] If the DataError bit is set, the bridge should send incorrect parity to alert the receiver that the data is corrupt.

[4] Assertion of PERR# for a data error is gated by the Data Error Response Enable for the HyperTransport interface.
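The conversions in Table 20-3 can be expressed as a small lookup function. The Python sketch below is illustrative: the function name `ht_to_pci` and its `space` and `interrupt_ack` parameters are hypothetical, and the sketch models only the transaction type, not the PassPW or DataError handling.

```python
def ht_to_pci(packet, space="memory", interrupt_ack=False):
    """Map an HT packet type to the PCI transaction type of Table 20-3.

    packet:        "WrSized", "RdSized", "RdResponse", or "TgtDone"
    space:         "memory", "config", or "io" (used for WrSized only)
    interrupt_ack: True for a RdSized caused by an x86 interrupt acknowledge
    """
    if packet == "WrSized":
        # Memory writes are always posted on PCI, even if the HT write
        # was non-posted; configuration and I/O writes are delayed.
        if space == "memory":
            return "Posted Memory Write"
        return "Delayed Write Request"
    if packet == "RdSized":
        if interrupt_ack:
            return "Interrupt Acknowledge"
        return "Delayed Read Request"
    if packet == "RdResponse":
        return "Delayed Read Completion"
    if packet == "TgtDone":
        return "Delayed Write Completion"
    raise ValueError(f"unsupported HT packet type: {packet}")
```

For example, `ht_to_pci("WrSized", space="io")` yields a Delayed Write Request, while the same packet type targeting memory space yields a Posted Memory Write.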

PCI Burst Transactions

PCI permits long burst transactions with either contiguous or discontiguous byte masks (byte enables) that may not be supported by HT. These long bursts must be broken into multiple requests to support the HT protocol, as follows:

  • PCI read requests with discontiguous byte masks that cross aligned 4-byte boundaries must be broken into multiple 4-byte HT RdSized (byte) requests.

  • PCI write requests with discontiguous byte masks that cross 32-byte boundaries must be broken into multiple 32-byte HT WrSized (byte) requests. Note that the resulting sequence of write requests must be strongly ordered in ascending address order.

  • PCI write requests with contiguous byte masks that cross 64-byte boundaries must be broken into multiple 64-byte HT WrSized (dword) requests.
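The boundary splitting in the last rule can be sketched in Python. This is a simplified illustration: the function `split_at_boundaries` is hypothetical, works on byte addresses and lengths only, and assumes the burst is already dword aligned; it does not model byte masks or packet encoding.

```python
def split_at_boundaries(start, length, boundary=64):
    """Split [start, start + length) so no piece crosses an aligned boundary.

    Returns a list of (address, byte_count) pairs in ascending address order.
    """
    pieces = []
    addr, end = start, start + length
    while addr < end:
        # First aligned boundary strictly above the current address.
        next_boundary = (addr // boundary + 1) * boundary
        piece_end = min(end, next_boundary)
        pieces.append((addr, piece_end - addr))
        addr = piece_end
    return pieces


# A 96-byte contiguous write starting at 0x30 crosses two 64-byte boundaries.
requests = split_at_boundaries(0x30, 96)
```

The result is three requests: 16 bytes at 0x30, 64 bytes at 0x40, and 16 bytes at 0x80. Passing `boundary=32` gives the corresponding split for the 32-byte rule, again without modeling the byte masks themselves.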



HyperTransport System Architecture
ISBN: 0321168453
Year: 2003
Pages: 182
