Fibre Channel


Compared to Ethernet, FC is a complex technology. FC attempts to provide functionality equivalent to that provided by Ethernet plus elements of IP, UDP, and TCP. So, it is difficult to compare FC to just Ethernet. FC promises to continue maturing at a rapid pace and is currently considered the switching technology of choice for block-level storage protocols. As block-level SANs proliferate, FC is expected to maintain its market share dominance in the short term. The long term is difficult to predict in the face of rapidly maturing IPS protocols, but FC already enjoys a sufficiently large installed base to justify a detailed examination of FC's inner workings. This section explores the details of FC operation at OSI Layers 1 and 2.

FC Media, Connectors, Transceivers, and Operating Ranges

FC supports a very broad range of media, connectors, and transceivers. Today, most FC deployments are based on fiber media for end node and ISL connectivity. Most 1-Gbps FC products employ the SC style connector, and most 2-Gbps FC products employ the LC style. Copper media is used primarily for intra-enclosure connectivity. For this reason, copper media is not discussed herein. Table 5-7 summarizes the media, connectors, transceivers, and operating ranges that are specified in ANSI T11 FC-PH, FC-PH-2, FC-PI, FC-PI-2, and 10GFC. Data rates under 100 MBps are excluded from Table 5-7 because they are considered historical. The nomenclature used to represent each defined FC implementation is [data rate expressed in MBps]-[medium]-[transceiver]-[distance capability]. The distance capability designator represents a superset of the defined operating range for each implementation.
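To make the nomenclature concrete, the following Python sketch splits a variant designator into its four components. This is an illustrative helper, not part of any FC specification; the medium expansions come from Table 5-7, and the distance-capability expansions noted in the comment (I, L, V) are assumptions for illustration.

```python
# Minimal sketch: split an FC variant designator of the form
# [data rate in MBps]-[medium]-[transceiver]-[distance capability]
# into its labeled parts, as described in the text above.

# Medium codes inferred from Table 5-7; treat these expansions as illustrative.
MEDIUM_CODES = {
    "SM": "9 micron single-mode fiber",
    "M5": "50 micron multimode fiber",
    "M5E": "50 micron enhanced multimode fiber",
    "M6": "62.5 micron multimode fiber",
}

def parse_fc_variant(variant: str) -> dict:
    """Split a designator such as '200-M5-SN-I' into its four designators."""
    rate, medium, transceiver, distance = variant.split("-", 3)
    return {
        "data_rate_MBps": int(rate),
        "medium": MEDIUM_CODES.get(medium, medium),
        "transceiver": transceiver,
        # Assumed expansions: I = intermediate, L = long, V = very long
        "distance_capability": distance,
    }

if __name__ == "__main__":
    print(parse_fc_variant("100-SM-LL-V"))
    print(parse_fc_variant("1200-M5E-SN4P-I"))
```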

Table 5-7. Fibre Channel Media, Connectors, Transceivers, and Operating Ranges

FC Variant | Medium | Modal Bandwidth | Connectors | Transceiver | Operating Range (m)
100-SM-LL-V | 9 µm SMF | N/A | Duplex SC, Duplex SG, Duplex LC | 1550nm laser | 2-50k
100-SM-LL-L | 9 µm SMF | N/A | Duplex SC | 1300nm laser | 2-10k
100-SM-LC-L | 9 µm SMF | N/A | Duplex SC, Duplex SG, Duplex LC, Duplex MT-RJ | 1300nm laser (Cost Reduced) | 2-10k
100-SM-LL-I | 9 µm SMF | N/A | Duplex SC | 1300nm laser | 2-2k
100-M5-SN-I | 50 µm MMF | 500 MHz*km | Duplex SC, Duplex SG, Duplex LC, Duplex MT-RJ | 850nm laser | 0.5-500
100-M5-SN-I | 50 µm MMF | 400 MHz*km | Duplex SC, Duplex SG, Duplex LC, Duplex MT-RJ | 850nm laser | 0.5-450
100-M5-SL-I | 50 µm MMF | 500 MHz*km | Duplex SC | 780nm laser | 2-500
100-M6-SN-I | 62.5 µm MMF | 200 MHz*km | Duplex SC, Duplex SG, Duplex LC, Duplex MT-RJ | 850nm laser | 0.5-300
100-M6-SN-I | 62.5 µm MMF | 160 MHz*km | Duplex SC, Duplex SG, Duplex LC, Duplex MT-RJ | 850nm laser | 2-300
100-M6-SL-I | 62.5 µm MMF | 160 MHz*km | Duplex SC | 780nm laser | 2-175
200-SM-LL-V | 9 µm SMF | N/A | Duplex SC, Duplex SG, Duplex LC | 1550nm laser | 2-50k
200-SM-LC-L | 9 µm SMF | N/A | Duplex SC, Duplex SG, Duplex LC, Duplex MT-RJ | 1300nm laser (Cost Reduced) | 2-10k
200-SM-LL-I | 9 µm SMF | N/A | Duplex SC | 1300nm laser | 2-2k
200-M5-SN-I | 50 µm MMF | 500 MHz*km | Duplex SC, Duplex SG, Duplex LC, Duplex MT-RJ | 850nm laser | 0.5-300
200-M5-SN-I | 50 µm MMF | 400 MHz*km | Duplex SC, Duplex SG, Duplex LC, Duplex MT-RJ | 850nm laser | 0.5-260
200-M6-SN-I | 62.5 µm MMF | 200 MHz*km | Duplex SC, Duplex SG, Duplex LC, Duplex MT-RJ | 850nm laser | 0.5-150
200-M6-SN-I | 62.5 µm MMF | 160 MHz*km | Duplex SC, Duplex SG, Duplex LC, Duplex MT-RJ | 850nm laser | 0.5-120
400-SM-LL-V | 9 µm SMF | N/A | Duplex SC, Duplex SG, Duplex LC | 1550nm laser | 2-50k
400-SM-LC-L | 9 µm SMF | N/A | Duplex SC, Duplex SG, Duplex LC, Duplex MT-RJ | 1300nm laser (Cost Reduced) | 2-10k
400-SM-LL-I | 9 µm SMF | N/A | Duplex SC | 1300nm laser | 2-2k
400-M5-SN-I | 50 µm MMF | 500 MHz*km | Duplex SC, Duplex SG, Duplex LC, Duplex MT-RJ | 850nm laser | 2-175
400-M5-SN-I | 50 µm MMF | 400 MHz*km | Duplex SC, Duplex SG, Duplex LC, Duplex MT-RJ | 850nm laser | 0.5-130
400-M6-SN-I | 62.5 µm MMF | 200 MHz*km | Duplex SC, Duplex SG, Duplex LC, Duplex MT-RJ | 850nm laser | 0.5-70
400-M6-SN-I | 62.5 µm MMF | 160 MHz*km | Duplex SC, Duplex SG, Duplex LC, Duplex MT-RJ | 850nm laser | 0.5-55
1200-SM-LL-L | 9 µm SMF | N/A | Duplex SC, Duplex SG, Duplex LC, Duplex MT-RJ | 1310nm laser | 2-10k
1200-SM-LC4-L | 9 µm SMF | N/A | Duplex SC, Duplex SG, Duplex LC, Duplex MT-RJ | 1269-1356nm CWDM lasers | 2-10k
1200-M5E-SN4-I | 50 µm Enhanced MMF | 1500 MHz*km | Duplex SC, Duplex SG, Duplex LC, Duplex MT-RJ | 772-857nm CWDM lasers | 0.5-550
1200-M5E-SN4P-I | 50 µm Enhanced MMF | 2000 MHz*km | MPO | 850nm Parallel lasers | 0.5-300
1200-M5E-SN-I | 50 µm Enhanced MMF | 2000 MHz*km | Duplex SC, Duplex SG, Duplex LC, Duplex MT-RJ | 850nm laser | 0.5-300
1200-M5-LC4-L | 50 µm MMF | 500 MHz*km | Duplex SC, Duplex SG, Duplex LC, Duplex MT-RJ | 1269-1356nm CWDM lasers | 0.5-290
1200-M5-LC4-L | 50 µm MMF | 400 MHz*km | Duplex SC, Duplex SG, Duplex LC, Duplex MT-RJ | 1269-1356nm CWDM lasers | 0.5-230
1200-M5-SN4-I | 50 µm MMF | 500 MHz*km | Duplex SC, Duplex SG, Duplex LC, Duplex MT-RJ | 772-857nm CWDM lasers | 0.5-290
1200-M5-SN4P-I | 50 µm MMF | 500 MHz*km | MPO | 850nm Parallel lasers | 0.5-150
1200-M5-SN-I | 50 µm MMF | 500 MHz*km | Duplex SC, Duplex SG, Duplex LC, Duplex MT-RJ | 850nm laser | 0.5-82
1200-M5-SN-I | 50 µm MMF | 400 MHz*km | Duplex SC, Duplex SG, Duplex LC, Duplex MT-RJ | 850nm laser | 0.5-66
1200-M6-LC4-L | 62.5 µm MMF | 500 MHz*km | Duplex SC, Duplex SG, Duplex LC, Duplex MT-RJ | 1269-1356nm CWDM lasers | 0.5-290
1200-M6-SN4-I | 62.5 µm MMF | 200 MHz*km | Duplex SC, Duplex SG, Duplex LC, Duplex MT-RJ | 772-857nm CWDM lasers | 0.5-118
1200-M6-SN4P-I | 62.5 µm MMF | 200 MHz*km | MPO | 850nm Parallel lasers | 0.5-75
1200-M6-SN-I | 62.5 µm MMF | 200 MHz*km | Duplex SC, Duplex SG, Duplex LC, Duplex MT-RJ | 850nm laser | 0.5-33
1200-M6-SN-I | 62.5 µm MMF | 160 MHz*km | Duplex SC, Duplex SG, Duplex LC, Duplex MT-RJ | 850nm laser | 0.5-26


FC Encoding and Signaling

The following definitions and rules apply only to switched FC implementations. FC-AL is not discussed herein. All FC implementations use one of two encoding schemes. Table 5-8 lists the encoding scheme used by each FC and 10GFC implementation and the associated BER objective.

Table 5-8. FC and 10GFC Encoding Schemes and BER Objectives

FC Variant | Encoding Scheme | BER Objective
100-SM-LL-V | 8B/10B | 10^-12
100-SM-LL-L | 8B/10B | 10^-12
100-SM-LC-L | 8B/10B | 10^-12
100-SM-LL-I | 8B/10B | 10^-12
100-M5-SN-I | 8B/10B | 10^-12
100-M5-SL-I | 8B/10B | 10^-12
100-M6-SN-I | 8B/10B | 10^-12
100-M6-SL-I | 8B/10B | 10^-12
200-SM-LL-V | 8B/10B | 10^-12
200-SM-LC-L | 8B/10B | 10^-12
200-SM-LL-I | 8B/10B | 10^-12
200-M5-SN-I | 8B/10B | 10^-12
200-M6-SN-I | 8B/10B | 10^-12
400-SM-LL-V | 8B/10B | 10^-12
400-SM-LC-L | 8B/10B | 10^-12
400-SM-LL-I | 8B/10B | 10^-12
400-M5-SN-I | 8B/10B | 10^-12
400-M6-SN-I | 8B/10B | 10^-12
1200-SM-LL-L | 64B/66B | 10^-12
1200-SM-LC4-L | 8B/10B | 10^-12
1200-M5E-SN4-I | 8B/10B | 10^-12
1200-M5E-SN4P-I | 8B/10B | 10^-12
1200-M5E-SN-I | 64B/66B | 10^-12
1200-M5-LC4-L | 8B/10B | 10^-12
1200-M5-SN4-I | 8B/10B | 10^-12
1200-M5-SN4P-I | 8B/10B | 10^-12
1200-M5-SN-I | 64B/66B | 10^-12
1200-M6-LC4-L | 8B/10B | 10^-12
1200-M6-SN4-I | 8B/10B | 10^-12
1200-M6-SN4P-I | 8B/10B | 10^-12
1200-M6-SN-I | 64B/66B | 10^-12


FC implementations operating at 100-MBps, 200-MBps, and 400-MBps use the 8B/10B encoding scheme. Only one of the control characters defined by the 8B/10B encoding scheme is used: K28.5. FC uses fixed-length ordered sets consisting of four characters. Each ordered set begins with K28.5. FC defines 31 ordered sets. FC uses ordered sets as frame delimiters, Primitive Signals, and Primitive Sequences. Multiple frame delimiters are defined so that additional information can be communicated. The start-of-frame (SOF) delimiters indicate the class of service being requested and the position of the frame within the sequence (first or subsequent). The end-of-frame (EOF) delimiters indicate the position of the frame within the sequence (intermediate or last) and whether the frame is valid or invalid or corrupt.

Primitive Signals include idle, receiver_ready (R_RDY), virtual_circuit_ready (VC_RDY), buffer-to-buffer_state_change (BB_SC), and clock synchronization (SYN). Idles are transmitted in the absence of data traffic to maintain clock synchronization and in the presence of data traffic to maintain inter-frame spacing. An R_RDY is transmitted after processing a received frame to increment the receive buffer counter (the number of Buffer-to-Buffer_Credits [BB_Credits]) used for link-level flow control. A VC_RDY is transmitted only by a switch port after forwarding a received frame to increment the transmitting node's buffer counter used for link-level flow control within a virtual circuit. BB_SC permits BB_Credit recovery. SYN enables time synchronization of the internal clocks of attached nodes (similar to Network Time Protocol [NTP]). Primitive Signals are transmitted for as long as the transmitting device deems appropriate.

By contrast, Primitive Sequences are transmitted continuously until the receiving device responds. Primitive Sequences are used to convey the state of a port, recover from certain types of errors, establish bit-level synchronization, and achieve word alignment. Primitive Sequences include offline state (OLS), not operational state (NOS), link reset (LR), and link reset response (LRR).
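The R_RDY/BB_Credit behavior described above amounts to a simple credit counter at the transmitting port. The following is a minimal sketch of that accounting, not an implementation of the FC-FS link state machines; the class name and the starting credit value are illustrative, and BB_SC-based credit recovery is omitted.

```python
# Minimal sketch of transmit-side BB_Credit accounting: one credit equals
# one receive buffer at the peer, credits are consumed when frames are sent
# and replenished when R_RDY Primitive Signals return.

class BBCreditCounter:
    def __init__(self, initial_credit: int):
        # Credit granted by the peer at login.
        self.available = initial_credit

    def can_transmit(self) -> bool:
        return self.available > 0

    def on_frame_sent(self) -> None:
        if self.available == 0:
            raise RuntimeError("must wait for R_RDY before sending another frame")
        self.available -= 1          # one receive buffer consumed at the peer

    def on_r_rdy_received(self) -> None:
        self.available += 1          # peer freed a receive buffer

# Example: a port granted 4 credits may send 4 frames back to back,
# then must pause until at least one R_RDY returns.
port = BBCreditCounter(initial_credit=4)
for _ in range(4):
    port.on_frame_sent()
assert not port.can_transmit()
port.on_r_rdy_received()
assert port.can_transmit()
```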

The encoding scheme of CWDM and parallel implementations of 10GFC is a combination of definitions and rules taken from the CWDM and parallel implementations of 10GE and the lower-speed FC implementations. 10GFC uses the same seven control characters as 10GE. The rules for their use in 10GFC are the same as in 10GE. However, 10GFC ordered set definitions closely match those of lower-speed FC implementations. There are only six differences in the ordered set definitions. The 10GE Sync_Column, Skip_Column, Align_Column, Local_Fault, and Remote_Fault ordered sets are used in 10GFC. The NOS ordered set used in lower-speed FC implementations is not used in 10GFC (replaced by Remote_Fault). In total, 10GFC uses 35 fixed-length ordered sets. Each consists of four characters, but the composition of each 10GFC ordered set is different than the equivalent ordered set in lower-speed FC implementations.

Serial implementations of 10GFC use the 64B/66B encoding scheme. The definitions and rules are unchanged from the 10GE implementation.

Further details of each encoding scheme are outside the scope of this book. The 8B/10B encoding scheme is well documented in clause 5 of the ANSI T11 FC-FS-2 specification, and in clauses 9 and 12 of the ANSI T11 10GFC specification. The 64B/66B encoding scheme is well documented in clause 49 of the IEEE 802.3ae-2002 specification and in clause 13 of the ANSI T11 10GFC specification.

FC Addressing Scheme

FC employs an addressing scheme that directly maps to the SAM addressing scheme. FC uses WWNs to positively identify each HBA and port, which represent the equivalent of SAM device and port names, respectively. An FC WWN is a 64-bit value expressed in colon-separated hexadecimal notation such as 21:00:00:e0:8b:08:a5:44. There are many formats for FC WWNs, most of which provide universal uniqueness. Figure 5-13 illustrates the basic ANSI T11 WWN address format.

Figure 5-13. Basic ANSI T11 WWN Address Format


A brief description of each field follows:

  • NAA 4 bits long. It indicates the type of address contained within the Name field and the format of the Name field.

  • Name 60 bits long. It contains the actual name of the node.

The Name field can contain a locally assigned address in any format, a mapped external address in the format defined by the NAA responsible for that address type, or a mapped external address in a modified format defined by the ANSI T11 subcommittee. External addresses are mapped into the Name field according to the rules defined in the FC-PH and FC-FS series of specifications. Six mappings are defined: IEEE MAC-48, IEEE extended, IEEE registered, IEEE registered extended, IEEE EUI-64, and IETF IPv4. The FC-DA specification series mandates the use of the IEEE MAC-48, IEEE extended, IEEE registered, or IEEE EUI-64 format to ensure universal uniqueness and interoperability. All six formats are described herein for the sake of completeness. Figure 5-14 illustrates the format of the Name field for containment of a locally assigned address.

Figure 5-14. ANSI T11 Name Field Format for Locally Assigned Addresses


A brief description of each field follows:

  • NAA 4 bits long. It is set to 0011.

  • Vendor Assigned 60 bits long. It can contain any series of bits in any format as defined by the vendor. Therefore, universal uniqueness cannot be guaranteed.

Figure 5-15 illustrates the format of the Name field for containment of an IEEE MAC-48 address.

Figure 5-15. ANSI T11 Name Field Format for IEEE MAC-48 Addresses


A brief description of each field follows:

  • NAA 4 bits long. It is set to 0001.

  • Reserved 12 bits long. It may not be used. These bits are set to 0.

  • MAC-48 48 bits long. It contains an IEEE MAC-48 address generated in accordance with the rules set forth by the IEEE. The U/L and I/G bits have no significance and are set to 0.

Figure 5-16 illustrates the format of the Name field for containment of an IEEE extended address.

Figure 5-16. ANSI T11 Name Field Format for IEEE Extended Addresses


A brief description of each field follows:

  • NAA 4 bits long. It is set to 0010.

  • Vendor Assigned 12 bits long. It can contain any series of bits in any format as defined by the vendor.

  • MAC-48 48 bits long. It contains an IEEE MAC-48 address generated in accordance with the rules set forth by the IEEE. The U/L and I/G bits have no significance and are set to 0.

Figure 5-17 illustrates the format of the Name field for containment of an IEEE Registered address.

Figure 5-17. ANSI T11 Name Field Format for IEEE Registered Addresses


A brief description of each field follows:

  • NAA 4 bits long. It is set to 0101.

  • OUI 24 bits long. It contains the vendor's IEEE assigned identifier. The U/L and I/G bits have no significance and are set to 0.

  • Vendor Assigned 36 bits long. It can contain any series of bits in any format as defined by the vendor.

The IEEE registered extended format is atypical because it is the only WWN format that is not 64 bits long. An extra 64-bit field is appended, yielding a total WWN length of 128 bits. The extra length creates some interoperability issues. Figure 5-18 illustrates the format of the Name field for containment of an IEEE registered extended address.

Figure 5-18. ANSI T11 Name Field Format for IEEE Registered Extended Addresses


A brief description of each field follows:

  • NAA 4 bits long. It is set to 0110.

  • OUI 24 bits long. It contains the vendor's IEEE assigned identifier. The U/L and I/G bits have no significance and are set to 0.

  • Vendor Assigned 36 bits long. It can contain any series of bits in any format as defined by the vendor.

  • Vendor Assigned Extension 64 bits long. It can contain any series of bits in any format as defined by the vendor.

Figure 5-19 illustrates the format of the Name field for containment of an IEEE EUI-64 address.

Figure 5-19. ANSI T11 Name Field Format for IEEE EUI-64 Addresses


A brief description of each field follows:

  • NAA 2 bits long. It is set to 11. Because the EUI-64 format is the same length as the FC WWN format, the NAA bits must be taken from the EUI-64 address. To make this easier to accomplish, all NAA values beginning with 11 are designated as EUI-64 indicators. This has the effect of shortening the NAA field to 2 bits. Therefore, only 2 bits need to be taken from the EUI-64 address.

  • OUI 22 bits long. It contains a modified version of the vendor's IEEE assigned identifier. The U/L and I/G bits are omitted from the first byte of the OUI, and the remaining 6 bits of the first byte are right-shifted two bit positions to make room for the 2 NAA bits.

  • Vendor Assigned 40 bits long. It can contain any series of bits in any format as defined by the vendor.
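The bit manipulation described for the OUI can be sketched as follows. This assumes, per common IEEE convention, that the I/G and U/L bits are the two low-order bits of the first OUI octet as written; the function name and the example EUI-64 value are hypothetical.

```python
# Illustrative sketch: map an 8-byte IEEE EUI-64 into the 64-bit FC Name
# field with NAA 11, following the description above.

def eui64_to_fc_name(eui64_bytes: bytes) -> int:
    """Return the 64-bit Name value derived from an EUI-64 address."""
    assert len(eui64_bytes) == 8
    first = eui64_bytes[0]
    # Drop the U/L and I/G bits (assumed to be the two low-order bits of the
    # first OUI octet) and right-shift the remaining 6 bits two positions to
    # make room for the 2 NAA bits (11) in the high-order positions.
    new_first = 0b11000000 | (first >> 2)
    return int.from_bytes(bytes([new_first]) + eui64_bytes[1:], "big")

# Example with a hypothetical EUI-64 (OUI AC-DE-48, extension 11-22-33-44-55):
name = eui64_to_fc_name(bytes.fromhex("ACDE48" + "1122334455"))
print(f"{name:016x}")   # the top two bits are 11, so the value begins with 0xE or 0xF
```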

Figure 5-20 illustrates the format of the Name field for containment of an IETF IPv4 address.

Figure 5-20. ANSI T11 Name Field Format for IETF IPv4 Addresses


A brief description of each field follows:

  • NAA 4 bits long. It is set to 0100.

  • Reserved 28 bits long.

  • IPv4 Address 32 bits long. It contains the IP address assigned to the node. The party responsible for assigning the IP address (for example, the product manufacturer or the network administrator) is unspecified. Also unspecified is the manner in which the IP address should be assigned. For example, the product manufacturer could assign an IP address from an Internet Assigned Numbers Authority (IANA) registered address block, or the network administrator could assign an IP address from the RFC 1918 address space. Therefore, this WWN format does not guarantee universal uniqueness.
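The NAA values described above determine how the remainder of the Name field is interpreted. As a quick illustration, the following sketch extracts the NAA nibble from a WWN written in the colon-separated notation shown earlier and names the corresponding format. The helper name and lookup table are illustrative only, not part of any FC specification.

```python
# Illustrative sketch: extract the NAA field (the high-order 4 bits of a
# 64-bit WWN) and map it to the Name field formats described above.

NAA_FORMATS = {
    0b0001: "IEEE MAC-48",
    0b0010: "IEEE extended",
    0b0011: "Locally assigned",
    0b0100: "IETF IPv4",
    0b0101: "IEEE registered",
    0b0110: "IEEE registered extended (128-bit WWN)",
}

def wwn_naa(wwn: str) -> str:
    value = int(wwn.replace(":", ""), 16)
    naa = (value >> 60) & 0xF              # top nibble of the 64-bit WWN
    if (naa & 0b1100) == 0b1100:           # any NAA beginning with 11 indicates EUI-64
        return "IEEE EUI-64"
    return NAA_FORMATS.get(naa, "unknown/reserved")

print(wwn_naa("21:00:00:e0:8b:08:a5:44"))   # NAA 0010 -> IEEE extended
```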

The FC equivalent of the SAM port identifier is the FC port identifier (Port_ID). The FC Port_ID is embedded in the FC Address Identifier. The FC Address Identifier consists of a hierarchical 24-bit value, and the lower 8 bits make up the Port_ID. The entire FC Address Identifier is sometimes referred to as the FCID (depending on the context). In this book, the phrase FC Address Identifier and the term FCID are used interchangeably except in the context of address assignment. The format of the 24-bit FCID remains unchanged from its original specification. This simplifies communication between FC devices operating at different speeds and preserves the legacy FC frame format. FCIDs are expressed in space-separated hexadecimal notation such as '0x64 03 E8'. Some devices omit the spaces when displaying FCIDs. Figure 5-21 illustrates the ANSI T11 Address Identifier format.

Figure 5-21. ANSI T11 Address Identifier Format


A brief description of each field follows:

  • Domain ID is the first level of hierarchy. This field is 8 bits long. It identifies one or more FC switches.

  • Area ID is the second level of hierarchy. This field is 8 bits long. It identifies one or more end nodes.

  • Port ID is the third level of hierarchy. This field is 8 bits long. It identifies a single end node. This field is sometimes called the FCID, which is why the term FCID can be confusing if the context is not clear.

The SAM defines the concept of domain as the entire system of SCSI devices that interact with one another via a service delivery subsystem. FC implements this concept of domain via the first level of hierarchy in the FCID (Domain_ID). In a single-switch fabric, the Domain_ID represents the single switch. In a multi-switch fabric, the Domain_ID should represent all interconnected switches according to the SAM definition of domain. However, FC forwards frames between switches using the Domain_ID. So, each switch must be assigned a unique Domain_ID. To comply with the SAM definition of domain, the ANSI T11 FC-SW-3 specification explicitly allows multiple interconnected switches to share a single Domain_ID. However, the Domain_ID is not implemented in this manner by any FC switch currently on the market.

The Area_ID can identify a group of ports attached to a single switch. The Area_ID may not span FC switches. The FC-SW-3 specification does not mandate how fabric ports should be grouped into an Area_ID. One common technique is to assign all ports in a single slot of a switch chassis to the same Area_ID. Other techniques can be implemented.

The Port_ID provides a unique identity to each HBA port within each Area_ID. Alternately, the Area_ID field can be concatenated with the Port_ID field to create a 16-bit Port_ID. In this case, no port groupings exist.
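The following sketch shows the straightforward arithmetic implied by this hierarchy: splitting a 24-bit FC Address Identifier into its Domain_ID, Area_ID, and Port_ID, and rendering it in the space-separated notation used earlier (0x64 03 E8). The helper names are illustrative.

```python
# Minimal sketch of the 24-bit FC Address Identifier hierarchy described
# above: 8-bit Domain_ID, 8-bit Area_ID, 8-bit Port_ID.

def split_fcid(fcid: int) -> tuple[int, int, int]:
    domain_id = (fcid >> 16) & 0xFF
    area_id = (fcid >> 8) & 0xFF
    port_id = fcid & 0xFF
    return domain_id, area_id, port_id

def format_fcid(fcid: int) -> str:
    """Render an FCID in the space-separated notation used in this chapter."""
    return "0x{:02X} {:02X} {:02X}".format(*split_fcid(fcid))

domain, area, port = split_fcid(0x6403E8)
print(domain, area, port)        # 100 3 232
print(format_fcid(0x6403E8))     # 0x64 03 E8
```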

The FC standards allow multiple FC Address Identifiers to be associated with a single HBA. This is known as N_Port_ID virtualization (NPIV). NPIV enables multiple virtual initiators to share a single HBA by assigning each virtual initiator its own FC Address Identifier. The normal FLOGI procedure is used to acquire the first FC Address Identifier. Additional FC Address Identifiers are acquired using the discover F_port service parameters (FDISC) ELS. When using NPIV, all virtual initiators must share the receive buffers on the HBA. NPIV enhances server virtualization techniques by enabling FC-SAN security policies (such as zoning) and QoS policies to be enforced independently for each virtual server. Note that some HBA vendors call their NPIV implementation "virtual HBA technology."

FC Name Assignment and Resolution

FC WWNs are "burned in" during the interface manufacturing process. Each HBA is assigned a node WWN (NWWN). Each port within an HBA is assigned a port WWN (PWWN). Some HBAs allow these values to be overridden by the network administrator. Locally administered WWNs are not guaranteed to be globally unique, so the factory-assigned WWNs are used in the vast majority of deployments.

FC name resolution occurs immediately following link initialization. As discussed in chapter 3, "Overview of Network Operating Principles," each initiator and target registers with the FCNS. Following registration, each initiator queries the FCNS to discover accessible targets. Depending on the number and type of queries issued, the FCNS replies can contain the NWWN, PWWN, or FCID of some or all of the targets accessible by the initiator. Initiators may subsequently query the FCNS as needed. For example, when a new target comes online, an RSCN is sent to registered nodes. Upon receiving an RSCN, each initiator queries the FCNS for the details of the change. In doing so, the initiator discovers the NWWN, PWWN, or FCID of the new target.

Two alternate methods are defined to enable nodes to directly query other nodes to resolve or update name-to-address mappings. This is accomplished using extended link service (ELS) commands. An ELS may comprise one or more frames per direction transmitted as a single sequence within a new Exchange. Most ELSs are defined in the FC-LS specification. The first ELS is called discover address (ADISC). ADISC may be used only after completion of the PLOGI process. Because the FCID of the destination node is required to initiate PLOGI, ADISC cannot be used to resolve name-to-address mappings. ADISC can be used to update a peer regarding a local FCID change during an active PLOGI session. Such a change is treated as informational and has no effect on the current active PLOGI session.

The second ELS is called Fibre Channel Address Resolution Protocol (FARP). FARP may be used before using PLOGI. Thus, FARP can be used to resolve a NWWN or PWWN to an FCID. This enables an initiator that queries only for NWWN or PWWN following FCNS registration to issue a PLOGI to a target without querying the FCNS again. In theory, this can be useful in fabrics containing large numbers of targets that use dynamic-fluid FCIDs. However, the actual benefits are negligible. FARP can also be useful as a secondary mechanism to the FCNS in case the FCNS becomes unavailable. For example, some FC switches support a feature called hot code load that allows network administrators to upgrade the switch operating system without disrupting the flow of data frames. However, this feature halts all fabric services (including the FCNS) for an extended period of time. Thus, initiators that are dependent upon the FCNS cannot resolve FCIDs during the switch upgrade. In reality, this is not a concern because most initiators query the FCNS for NWWN, PWWN, and FCID following FCNS registration. To use FARP, the requestor must already know the NWWN or PWWN of the destination node so that the destination node can recognize itself as the intended responder upon receipt of a FARP request. Even though this is practical for some ULPs, FCP generally relies on the FCNS.

FC Address Assignment and Resolution

By default, a Domain_ID is dynamically assigned to each switch when a fabric comes online via the Domain_ID Distribution process. Each switch then assigns an FCID to each of its attached nodes via the Area_ID and Port_ID fields of the FC Address Identifier. A single switch, known as the principal switch (PS), controls the Domain_ID Distribution process. The PS is dynamically selected via the principal switch selection (PSS) process. The PSS process occurs automatically upon completion of the extended link initialization process. The PSS process involves the following events, which occur in the order listed:

  1. Upon entering the non-disruptive fabric reconfiguration state machine, each switch clears its internal Domain_ID list. The Domain_ID list is a cached record of all Domain_IDs that have been assigned and the switch NWWN associated with each. Clearing the Domain_ID list has no effect during the initial configuration of a fabric because each switch's Domain_ID list is already empty.

  2. Each switch transmits a build fabric (BF) switch fabric internal link service (SW_ILS) frame on each ISL. A SW_ILS is an ELS that may be transmitted only between fabric elements (such as FC switches). Most SW_ILSs are defined in the FC-SW specification series. If a BF SW_ILS frame is received on an ISL before transmission of a BF SW_ILS frame on that ISL, the recipient switch does not transmit a BF SW_ILS frame on that ISL.

  3. Each switch waits for the fabric stability time-out value (F_S_TOV) to expire before originating exchange fabric parameters (EFP) SW_ILS frames. This allows the BF SW_ILS frames to flood throughout the entire fabric before any subsequent action.

  4. Each switch transmits an EFP SW_ILS frame on each ISL. If an EFP SW_ILS frame is received on an ISL before transmission of an EFP SW_ILS frame on that ISL, the recipient switch transmits a switch accept (SW_ACC) SW_ILS frame on that ISL instead of transmitting an EFP SW_ILS frame. Each EFP and associated SW_ACC SW_ILS frame contains a PS_Priority field, a PS_Name field and a Domain_ID_List field. The PS_Priority and PS_Name fields of an EFP SW_ILS frame initially contain the priority and NWWN of the transmitting switch. The priority and NWWN are concatenated to select the PS. The lowest concatenated value wins. Upon receipt of an EFP or SW_ACC SW_ILS frame containing a priority-NWWN value lower than the recipient switch's value, the F_S_TOV timer is reset, the new priority-NWWN value is cached, and the recipient switch transmits an updated EFP SW_ILS frame containing the cached priority-NWWN value on all ISLs except the ISL on which the lower value was received. This flooding continues until all switches agree on the PS. Each switch determines there is PS agreement upon expiration of F_S_TOV. The Domain_ID_List field remains empty during the PSS process but is used during the subsequent Domain_ID Distribution process.
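Step 4 reduces to a comparison of concatenated priority-NWWN values, as the following sketch illustrates. The switch names, priorities, and NWWNs are hypothetical, and the 8-bit priority width is an assumption for illustration.

```python
# Illustrative sketch of the principal switch selection rule described in
# step 4: the lowest concatenated (PS_Priority, NWWN) value wins.

def selection_key(priority: int, nwwn: str) -> int:
    """Concatenate an assumed 8-bit priority with the 64-bit NWWN."""
    return (priority << 64) | int(nwwn.replace(":", ""), 16)

candidates = {
    "switch-A": (0x02, "20:00:00:0d:ec:01:02:03"),
    "switch-B": (0x02, "20:00:00:0d:ec:0a:0b:0c"),
    "switch-C": (0x80, "20:00:00:0d:ec:00:00:01"),
}

principal = min(candidates, key=lambda sw: selection_key(*candidates[sw]))
print("principal switch:", principal)   # switch-A: equal priority, lower NWWN
```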

Upon successful completion of the PSS process, the Domain_ID Distribution process ensues. Domain_IDs can be manually assigned by the network administrator, but the Domain_ID Distribution process still executes so the PS (also known as the domain address manager) can compile a list of all assigned Domain_IDs, ensure there are no overlapping Domain_IDs, and distribute the complete Domain_ID_List to all other switches in the fabric. The Domain_ID Distribution process involves the following events, which occur in the order listed:

  1. The PS assigns itself a Domain_ID.

  2. The PS transmits a Domain_ID Assigned (DIA) SW_ILS frame on all ISLs. The DIA SW_ILS frame indicates that the transmitting switch has been assigned a Domain_ID. A received DIA SW_ILS frame is never forwarded by the recipient switch.

  3. Each recipient switch replies to the DIA SW_ILS frame with an SW_ACC SW_ILS frame.

  4. Each switch that replied to the DIA SW_ILS frame transmits a Request Domain_ID (RDI) SW_ILS frame to the PS. The RDI SW_ILS frame may optionally contain one or more preferred Domain_IDs. During reconfiguration of a previously operational fabric, each switch may list its previous Domain_ID as its preferred Domain_ID. Alternatively, a preferred or static Domain_ID can be manually assigned to each switch by the network administrator. If the transmitting switch does not have a preferred or static Domain_ID, it indicates this in the RDI SW_ILS frame by listing its preferred Domain_ID as 0x00.

  5. The PS assigns a Domain_ID to each switch that transmitted an RDI SW_ILS frame. If available, each switch's requested Domain_ID is assigned. If a requested Domain_ID is not available, the PS may assign a different Domain_ID or reject the request. Each assigned Domain_ID is communicated to the associated switch via an SW_ACC SW_ILS frame.

  6. Upon receiving its Domain_ID, each switch transmits a DIA SW_ILS frame on all ISLs except the ISL that connects to the PS (called the upstream principal ISL).

  7. Each recipient switch replies to the DIA SW_ILS frame with an SW_ACC SW_ILS frame.

  8. Each switch that replied to the DIA SW_ILS frame transmits an RDI SW_ILS frame on its upstream principal ISL.

  9. Each intermediate switch forwards the RDI SW_ILS frame on its upstream principal ISL.

  10. The PS assigns a Domain_ID to each switch that transmitted an RDI SW_ILS frame and replies with an SW_ACC SW_ILS frame.

  11. Each intermediate switch forwards the SW_ACC SW_ILS frame(s) on its downstream principal ISL(s) to the requesting switch(es).

  12. The Domain_ID assignment process repeats in this manner until all switches have been assigned a Domain_ID. Thus, Domain_ID assignment propagates outward from the PS.

  13. Each time the PS assigns a Domain_ID, it transmits an EFP SW_ILS frame containing the updated Domain_ID_List on all ISLs.

  14. Each switch directly connected to the PS replies to the EFP SW_ILS frame with an SW_ACC SW_ILS frame and forwards the EFP SW_ILS frame on all ISLs except the Upstream Principal ISL. Thus, the EFP SW_ILS frame propagates outward from the PS until all switches have received it.
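The Domain_ID assignment behavior in steps 5 and 10 can be pictured as a simple allocator. The fallback policy shown below (grant the lowest free Domain_ID when the preferred one is taken) is an illustrative assumption; as noted above, a real principal switch may instead assign a different Domain_ID of its choosing or reject the request.

```python
# Simplified sketch of the domain address manager behavior: grant the
# requested (preferred) Domain_ID when free, otherwise fall back to the
# lowest free Domain_ID in the usable 0x01-0xEF range.

class DomainAddressManager:
    AVAILABLE = range(0x01, 0xEF + 1)        # usable switch Domain_IDs

    def __init__(self):
        self.assigned = {}                    # Domain_ID -> switch NWWN

    def request(self, nwwn: str, preferred: int = 0x00) -> int:
        if preferred in self.AVAILABLE and preferred not in self.assigned:
            granted = preferred
        else:                                 # 0x00 means "no preference"
            granted = next(d for d in self.AVAILABLE if d not in self.assigned)
        self.assigned[granted] = nwwn
        return granted

dam = DomainAddressManager()
print(hex(dam.request("20:00:00:0d:ec:01:02:03", preferred=0x64)))  # 0x64
print(hex(dam.request("20:00:00:0d:ec:0a:0b:0c", preferred=0x64)))  # conflict -> 0x1
print(hex(dam.request("20:00:00:0d:ec:00:00:01")))                  # no preference -> 0x2
```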

The preceding descriptions of the PSS and Domain_ID Distribution processes are simplified to exclude error conditions and other contingent scenarios. For more information about these processes, readers are encouraged to consult the ANSI T11 FC-SW-3 specification. The eight-bit Domain_ID field mathematically accommodates 256 Domain_IDs, but some Domain_IDs are reserved. Only 239 Domain_IDs are available for use as FC switch identifiers. Table 5-9 lists all FC Domain_ID values and the status and usage of each.

Table 5-9. FC Domain_IDs, Status, and Usage

Domain_ID | Status | Usage
0x00 | Reserved | FC-AL Environments
0x01-EF | Available | Switch Domain_IDs
0xF0-FE | Reserved | None
0xFF | Reserved | WKAs, Multicast, Broadcast, Domain Controllers


As the preceding table indicates, some Domain_IDs are reserved for use in WKAs. Some WKAs facilitate access to fabric services. Table 5-10 lists the currently defined FC WKAs and the fabric service associated with each.

Table 5-10. FC WKAs and Associated Fabric Services

Well Known Address | Fabric Service
0x'FF FF F5' | Multicast Server
0x'FF FF F6' | Clock Synchronization Server
0x'FF FF F7' | Security Key Distribution Server
0x'FF FF F8' | Alias Server
0x'FF FF F9' | Quality of Service Facilitator (Class 4)
0x'FF FF FA' | Management Server
0x'FF FF FB' | Time Server
0x'FF FF FC' | Directory Server
0x'FF FF FD' | Fabric Controller
0x'FF FF FE' | Fabric Login Server


In the context of address assignment mechanisms, the term FCID refers only to the Area_ID and Port_ID fields of the FC Address Identifier. These two values can be assigned dynamically by the FC switch or statically by either the FC switch or the network administrator. Dynamic FCID assignment can be fluid or persistent. With dynamic-fluid assignment, FCID assignments may be completely randomized each time an HBA port boots or resets. With dynamic-persistent assignment, the first assignment of an FCID to an HBA port may be completely randomized, but each subsequent boot or reset of that HBA port will result in reassignment of the same FCID. With static assignment, the first assignment of an FCID to an HBA port is predetermined by the software design of the FC switch or by the network administrator, and persistence is inherent in both cases.
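A dynamic-persistent assignment scheme behaves like a lookup table keyed by PWWN, as the following sketch illustrates. The allocation policy (sequential Port_IDs within a single Area_ID of one Domain_ID) and the class name are assumptions for illustration only.

```python
# Illustrative sketch of dynamic-persistent FCID assignment: the first FCID
# handed to a PWWN may be arbitrary, but every later login by the same PWWN
# receives the same FCID.

class PersistentFcidAllocator:
    def __init__(self, domain_id: int, area_id: int = 0x00):
        self.base = (domain_id << 16) | (area_id << 8)
        self.next_port_id = 0x00
        self.table = {}                        # PWWN -> FCID (persists across logins)

    def login(self, pwwn: str) -> int:
        if pwwn not in self.table:             # first login: allocate a new Port_ID
            self.table[pwwn] = self.base | self.next_port_id
            self.next_port_id += 1
        return self.table[pwwn]                # later logins: same FCID every time

alloc = PersistentFcidAllocator(domain_id=0x64)
first = alloc.login("21:00:00:e0:8b:08:a5:44")
again = alloc.login("21:00:00:e0:8b:08:a5:44")
assert first == again == 0x640000
```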

FC Address Identifiers are not required to be universally unique. In fact, the entire FC address space is available for use within each physical fabric. Likewise, the entire FC address space is available for use within each VSAN. This increases the scalability of each physical fabric that contains multiple VSANs. However, reusing the entire FC address space can prevent physical fabrics or VSANs from being non-disruptively merged due to potential address conflicts. Reusing the entire FC address space also prevents communication between physical fabrics via SAN routers and between VSANs via inter-VSAN routing (IVR) unless network address translation (NAT) is employed. NAT improves scalability by allowing reuse of the entire FC address space while simultaneously facilitating communication across physical fabric boundaries and across VSAN boundaries. However, because NAT negates universal FC Address Identifier uniqueness, potential address conflicts can still exist, and physical fabric/VSAN mergers can still be disruptive. NAT also increases configuration complexity, processing overhead and management overhead. So, NAT represents a tradeoff between communication flexibility and configuration simplicity. Address reservation schemes facilitate communication between physical fabrics or VSANs without using NAT by ensuring that there is no overlap between the addresses assigned within each physical fabric or VSAN. A unique portion of the FC address space is used within each physical fabric or VSAN. This has the effect of limiting the scalability of all interconnected physical fabrics or VSANs to a single instance of the FC address space. However, address reservation schemes eliminate potential address conflicts, so physical fabrics or VSANs can be merged non-disruptively.

Note that some host operating systems use the FC Address Identifier of target ports to positively identify target ports, which is the stated purpose of PWWNs. Such operating systems require the use of dynamic-persistent or static FCIDs in combination with dynamic-persistent or static Domain_IDs. Note also that the processing of preferred Domain_IDs during the PSS process guarantees Domain_ID persistence in most cases without administrative intervention. In other words, the PSS process employs a dynamic-persistent Domain_ID assignment mechanism by default. However, merging two physical fabrics (or two VSANs) into one can result in Domain_ID conflicts. Thus, static Domain_ID assignment is required to achieve the highest availability of targets in the presence of host operating systems that use FC Address Identifiers to positively identify target ports. As long as static Domain_IDs are used, and the network administrator takes care to assign unique Domain_IDs across physical fabrics (or VSANs) via an address reservation scheme, dynamic-persistent FCID assignment can be used in place of static FCIDs without risk of address conflicts during physical fabric (or VSAN) mergers.

An HBA's FC Address Identifier is used as the destination address in all unicast frames sent to that HBA and as the source address in all frames (unicast, multicast or broadcast) transmitted from that HBA. Two exceptions to the source address rule are defined: one related to FCID assignment (see the FC Link Initialization section) and another related to Class 6 multicast frames. FC multicast addressing is currently outside the scope of this book. Broadcast traffic is sent to the reserved FC Address Identifier 0x'FF FF FF'. Broadcast traffic delivery is subject to operational parameters such as zoning policy and class of service. All FC devices that receive a frame sent to the broadcast address accept the frame and process it accordingly.

Note

In FC, multicast addresses are also called Alias addresses. This should not be confused with PWWN aliases that are optionally used during zoning operations. Another potential point of confusion is Hunt Group addressing, which involves the use of Alias addresses in a particular manner. Hunt Groups are currently outside the scope of this book.


FC implements only OSI Layer 2 addresses, so address resolution is not required to transport SCSI. Note that address resolution is required to transport IP. RFC 2625, IP and ARP over Fibre Channel (IPFC), defines ARP operation in FC environments. ARP over FC can complement FARP, or FARP can be used independently. ARP over FC facilitates dynamic resolution of an IP address to a PWWN. The PWWN is then resolved to an FC Address Identifier using FARP. Alternately, FARP can be used to directly resolve an IP address to an FC Address Identifier. FARP operation is very similar to ARP operation, but FARP can also solicit a PLOGI from the destination node instead of a FARP reply. Regardless of how ARP and FARP are used, the IP address of the destination node must be known to the requestor before transmitting the resolution request. This is consistent with the ARP over Ethernet model described in the preceding Ethernet section of this chapter. As with ARP, system administrators can create static mappings in the FARP table on each host. Typically, we use static mappings only in special situations to accomplish a particular goal.
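The two-stage resolution described above (IP address to PWWN via ARP over FC, then PWWN to FC Address Identifier via FARP) can be pictured as two table lookups. The sketch below substitutes static dictionaries for the actual protocol exchanges; all addresses are hypothetical.

```python
# Conceptual sketch of IPFC ARP followed by FARP resolution. The dictionaries
# stand in for the on-the-wire request/reply exchanges.

arp_table = {"10.1.1.20": "21:00:00:e0:8b:08:a5:44"}    # IP -> PWWN (ARP over FC)
farp_table = {"21:00:00:e0:8b:08:a5:44": 0x6403E8}      # PWWN -> FCID (FARP)

def resolve_ip_to_fcid(ip: str) -> int:
    pwwn = arp_table[ip]          # stage 1: IPFC ARP resolution
    return farp_table[pwwn]       # stage 2: FARP resolution

print(hex(resolve_ip_to_fcid("10.1.1.20")))   # 0x6403e8
```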

FC Media Access

As stated in chapter 3, "Overview of Network Operating Principles," FC-AL is a shared media implementation, so it requires some form of media access control. However, we use FC-AL primarily for embedded applications (such as connectivity inside a tape library) today, so the FC-AL arbitration mechanism is currently outside the scope of this book. In switched FC implementations, arbitration is not required because full-duplex communication is employed. Likewise, the FC PTP topology used for DAS configurations supports full-duplex communication and does not require arbitration.

FC Network Boundaries

Traditional FC-SANs are physically bounded by media terminations (for example, unused switch ports) and end node interfaces (for example, HBAs). No control information or user data can be transmitted between FC-SANs across physical boundaries. Figure 5-22 illustrates the physical boundaries of a traditional FC-SAN.

Figure 5-22. Traditional FC-SAN Boundaries


FC-SANs also have logical boundaries, but the definition of a logical boundary in Ethernet networks does not apply to FC-SANs. Like the Ethernet architecture, the FC architecture does not define any native functionality at OSI Layer 3. However, Ethernet is used in conjunction with autonomous OSI Layer 3 protocols as a matter of course, so logical boundaries can be easily identified at each OSI Layer 3 entity. By contrast, normal FC communication does not employ autonomous OSI Layer 3 protocols. So, some OSI Layer 2 control information must be transmitted between FC-SANs across logical boundaries to facilitate native communication of user data between FC-SANs. Currently, there is no standard method of facilitating native communication between FC-SANs. Leading FC switch vendors have created several proprietary methods. The ANSI T11 subcommittee is considering all methods, and a standard method is expected in 2006 or 2007. Because of the proprietary and transitory nature of the current methods, further exploration of this topic is currently outside the scope of this book. Note that network technologies autonomous from FC can be employed to facilitate communication between FC-SANs. Non-native FC transports are defined in the FC-BB specification series. Chapter 8, "OSI Session, Presentation and Application Layers," discusses one such transport (Fibre Channel over TCP/IP [FCIP]) in detail.

FC-SANs also can have virtual boundaries. There is currently only one method of creating virtual FC-SAN boundaries. Invented in 2002 by Cisco Systems, VSANs are now widely deployed in the FC-SAN market. In 2004, ANSI began researching alternative solutions for virtualization of FC-SAN boundaries. In 2005, ANSI selected Cisco's VSAN technology as the basis for the only standards-based solution (called Virtual Fabrics). The new standards (FC-SW-4, FC-FS-2, and FC-LS) are expected to be finalized in 2006. VSANs are similar to VLANs in the way traffic isolation is provided. Typically, each switch port is statically assigned to a single VSAN by the network administrator. Alternately, each switch port can be dynamically assigned to a VSAN via Cisco's dynamic port VSAN membership (DPVM) technology. DPVM is similar in function to Ethernet's GVRP. Like Ethernet, an FC switch port can belong to multiple VSANs. However, this is used exclusively on ISLs; HBAs do not currently support VSAN trunking. As frames enter a switch from an end node, the switch prepends a tag to indicate the VSAN membership of the ingress port. The tag remains intact until the frame reaches the egress switch port that connects the destination end node. The switch removes the tag and transmits the frame to the destination end node. FC switches made by Cisco Systems use VSAN tags to ensure that no frames are forwarded between VSANs. Thus, VSAN boundaries mimic physical FC-SAN boundaries. User data can be forwarded between VSANs only via IVR. IVR is one of the native FC logical boundaries alluded to in the preceding paragraph. IVR can be used with all of the non-native FC transports defined in the FC-BB specification series.
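The following sketch models the tagging behavior just described: tag at the ingress edge port, forward only within the VSAN, strip at the egress edge port. It is a conceptual illustration only; port names, VSAN numbers, and the frame representation are hypothetical, and real switches implement this tagging in hardware.

```python
# Simplified sketch of VSAN tagging at the fabric edge.

port_vsan = {"fc1/1": 10, "fc1/2": 10, "fc1/3": 20}      # static port-to-VSAN map

def ingress(port, frame):
    """Prepend a VSAN tag based on the ingress port's membership."""
    return {"vsan_tag": port_vsan[port], **frame}

def egress(port, tagged):
    """Deliver only within the same VSAN; strip the tag toward the end node."""
    if tagged["vsan_tag"] != port_vsan[port]:             # never cross VSANs
        return None
    untagged = dict(tagged)
    untagged.pop("vsan_tag")
    return untagged

frame = {"d_id": 0x6403E8, "s_id": 0x640101}
tagged = ingress("fc1/1", frame)
print(egress("fc1/2", tagged))   # same VSAN: delivered without the tag
print(egress("fc1/3", tagged))   # different VSAN: dropped (None)
```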

VSANs provide additional functionality not provided by VLANs. The FC specifications outline a model in which all network services (for example, the zone server) may run on one or more FC switches. This contrasts the TCP/IP model, in which network services other than routing protocols typically run on one or more hosts attached to the network (for example, a DHCP server). The FC service model enables switch vendors to instantiate independent network services within each VSAN during the VSAN creation process. This is the case with FC switches made by Cisco Systems. A multi-VSAN FC switch has an instance of each network service operating independently within each VSAN. This enables network administrators to achieve higher availability, security, and flexibility by providing complete isolation between VSANs. When facilitating communication between VSANs, IVR selectively exports control information bidirectionally between services in the affected VSANs without fusing the services. This is similar in concept to route redistribution between dissimilar IP routing protocols. The result is preservation of the service isolation model.

FC Frame Formats

FC uses one general frame format for many purposes. The general frame format has not changed since the inception of FC. The specific format of an FC frame is determined by the function of the frame. FC frames are word-oriented, and an FC word is 4 bytes. Figure 5-23 illustrates the general FC frame format.

Figure 5-23. General FC Frame Format


A brief description of each field follows:

  • Start of Frame (SOF) ordered set 4 bytes long. It delimits the beginning of a frame, indicates the Class of Service, and indicates whether the frame is the first frame of a new sequence or a subsequent frame in an active sequence.

  • Header first field of a frame. It is 24 bytes long and contains multiple subfields (see the following subsection).

  • Data variable in length. It contains optional headers and ULP data.

  • Optional Encapsulating Security Payload (ESP) header 8 bytes long. It contains the security parameter index (SPI) and the ESP sequence number. ESP, defined in RFC 2406, provides confidentiality, authentication, and anti-replay protection. ESP usage in FC environments is defined by ANSI in the FC-SP specification. Security is discussed in chapter 12, "Storage Network Security."

  • Optional Network header 16 bytes long. It is used by devices that connect FC-SANs to non-native FC networks.

  • Optional Association header 32 bytes long. It is used to identify a process or group of processes within an initiator or target node. Thus, the Association Header represents an alternative to the Routing Control and Type sub-fields within the Header field. FCP does not use the Association Header.

  • Optional Device header 16, 32, or 64 bytes long. It is used by some ULPs. The format of the Device Header is variable and is specified by each ULP that makes use of the header. FCP does not use this header.

  • Payload variable in length. It contains ULP data. The presence of optional headers in the Data field reduces the maximum size of the payload.

  • Optional Fill Bytes variable in length. It ensures that the variable-length payload field ends on a word boundary. The Fill Data Bytes sub-field in the F_CTL field in the Header indicates how many fill bytes are present. This field is not used if the frame contains an ESP header. ESP processing ensures that the payload field is padded if needed. The ESP payload pad field ranges from 0 to 255 bytes.

  • Optional ESP trailer variable in length. It contains the Integrity Check Value (ICV) calculated on the FC Header field (excluding the D_ID, S_ID, and CS_CTL/Priority fields), the ESP Header field and Data field.

  • CRC 4 bytes long. It contains a CRC value calculated on the FC Header field and Data field.

  • End of Frame (EOF) ordered set 4 bytes long. It delimits the end of a frame, indicates whether the frame is the last frame of an active sequence, indicates the termination status of an Exchange that is being closed, and sets the running disparity to negative.

As mentioned in chapter 3, "Overview of Network Operating Principles," the header in the general FC frame format provides functionality at multiple OSI layers. This contrasts the layered header model used in TCP/IP/Ethernet networks wherein a distinct header is present for each protocol operating at OSI Layers 2-4. Figure 5-24 illustrates the FC Header format.

Figure 5-24. FC Header Format


A brief description of each field follows:

  • Routing Control (R_CTL) 1 byte long. It contains two sub-fields: Routing and Information. The Routing sub-field is 4 bits and indicates whether the frame is a data frame or a link-control frame. This aids the receiving node in routing the frame to the appropriate internal process. Two types of data frames can be indicated: frame type zero (FT_0) and frame type one (FT_1). Two types of link-control frames can be indicated: Acknowledge (ACK) and Link_Response. The value of the Routing sub-field determines how the Information sub-field and Type field are interpreted. The Information sub-field is 4 bits. It indicates the category of data contained within a data frame or the specific type of control operation contained within a link-control frame.

  • Destination ID (D_ID) 3 bytes long. It contains the FC Address Identifier of the destination node.

  • Class Specific Control (CS_CTL)/Priority 1 byte long. It can be interpreted as either CS_CTL or Priority. The interpretation of this field is determined by the CS_CTL/Priority Enable bit in the Frame Control field. When used as CS_CTL, this field contains control information such as the connection request policy, virtual circuit identifier, frame preference, and differentiated services codepoint (DSCP) value that is relevant to the CoS indicated by the SOF. When used as Priority in Class 1, 2, 3, or 6 environments, this field indicates the priority of the frame relative to other frames. A minor difference exists between the CS_CTL interpretation and the Priority interpretation regarding QoS. The DSCP values used in the CS_CTL interpretation are defined in the IETF DiffServ RFCs (2597, 3246 and 3260), whereas the priority values used in the Priority interpretation are defined in the ANSI T11 FC-FS specification series. The Priority field also facilitates preemption of a Class 1 or Class 6 connection in favor of a new Class 1 or 6 connection, or Class 2 or 3 frames. Class 4 frames use the Priority field to indicate the virtual circuit identifier.

  • Source ID (S_ID) 3 bytes long. It contains the FC Address Identifier of the source node.

  • Type 1 byte long. It contains operation-specific control information when the Routing sub-field of the R_CTL field indicates a control frame. The Type field indicates the ULP when the Routing sub-field of the R_CTL field indicates a data frame.

  • Frame Control (F_CTL) 3 bytes long. It contains extensive control information. Most notable are the exchange context, sequence context, first_sequence, last_sequence, end_sequence, CS_CTL/priority enable, sequence initiative, retransmitted sequence, continue sequence condition, abort sequence condition, relative offset present, exchange reassembly, and fill data bytes sub-fields. The sequence context bit indicates whether the initiator or target is the source of a sequence. The sequence initiative bit determines which device (initiator or target) may initiate a new sequence. Either the initiator or target possesses the sequence initiative at each point in time. In FC vernacular, streamed sequences are simultaneously outstanding sequences transmitted during a single possession of the sequence initiative, and consecutive non-streamed sequences are successive sequences transmitted during a single possession of the sequence initiative. If a device transmits only one sequence during a single possession of the sequence initiative, that sequence is simply called a sequence.

  • Sequence ID (SEQ_ID) 1 byte long. It is used to group frames belonging to a series. The SEQ_ID is set by the initiator for sequences transmitted by the initiator. Likewise, the SEQ_ID is set by the target for sequences transmitted by the target. The initiator and target each maintain a SEQ_ID counter that is incremented independently. SEQ_ID values can be incremented sequentially or randomly. SEQ_ID values are meaningful only within the context of a S_ID/D_ID pair and may be reused by a source device during simultaneous communication with other destination devices. However, each series of frames transmitted by a source device to a given destination device must have a unique SEQ_ID relative to other frame series that are simultaneously outstanding with that destination device. This requirement applies even when each frame series is associated with a unique OX_ID.

  • Data Field Control (DF_CTL) 1 byte long. It indicates the presence or absence of each optional header. In the case of the Device Header, this field also indicates the size of the optional header.

  • Sequence Count (SEQ_CNT) 2 bytes long. It is used to indicate the sequential order of frames within a SEQ_ID. SEQ_CNT values are meaningful only within the context of a SEQ_ID or consecutive series of SEQ_IDs between a S_ID/D_ID pair. Because SEQ_ID values are unidirectional between each S_ID/D_ID pair, the SEQ_CNT field must be used with the sequence context bit in the F_CTL field to uniquely identify each frame within an Exchange. The SEQ_CNT field is incremented by one for each frame transmitted within a sequence. The SEQ_CNT field is reset after each consecutive non-streamed sequence and after transferring the sequence initiative. The SEQ_CNT field is not reset after each sequence in a set of streamed sequences.

  • Originator Exchange ID (OX_ID) 2 bytes long. It is used to group related sequences (that is, sequences associated with a single ULP operation). The OX_ID is assigned by the initiator. Each OX_ID may be unique for the source node across all destination nodes or may be unique only between the source node and a given destination node. FCP maps each I/O operation to an OX_ID.

  • Responder Exchange ID (RX_ID) 2 bytes long. It is similar to the OX_ID field but is assigned by the target.

  • Parameter 4 bytes long. When the Routing sub-field of the R_CTL field indicates a control frame, the Parameter field contains operation-specific control information. When the Routing sub-field of the R_CTL field indicates a data frame, the interpretation of this field is determined by the Relative Offset Present bit in the F_CTL field. When the Relative Offset Present bit is set to 1, this field indicates the position of the first byte of data carried in this frame relative to the first byte of all the data transferred by the associated SCSI command. This facilitates payload segmentation and reassembly. When the Relative Offset Present bit is set to 0, this field may contain ULP parameters that are passed to the ULP indicated in the Type field.

The S_ID, D_ID, OX_ID, and RX_ID fields are collectively referred to as the fully qualified exchange identifier (FQXID). The S_ID, D_ID, OX_ID, RX_ID, and SEQ_ID fields are collectively referred to as the sequence qualifier. The fields of the sequence qualifier can be used together in several ways. The preceding descriptions of these fields are highly simplified and apply only to FCP. FC implements many control frames to facilitate link, fabric, and session management. Many of the control frames carry additional information within the Data field. Comprehensive exploration of all the control frames and their payloads is outside the scope of this book, but certain control frames are explored in subsequent chapters. For more information about the general FC frame format, readers are encouraged to consult the ANSI T11 FC-FS-2 specification. For more information about control frame formats, readers are encouraged to consult the ANSI T11 FC-FS-2, FC-SW-3, FC-GS-3, FC-LS, and FC-SP specifications.
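The field layout in Figure 5-24 adds up to a fixed 24-byte header, which the following sketch packs and unpacks with Python's struct module. The example field values are hypothetical (Type 0x08 is the FC-4 type commonly associated with FCP), and this is an illustrative sketch rather than a complete frame builder: SOF/EOF ordered sets, optional headers, and the CRC are omitted.

```python
import struct

# Illustrative sketch: pack and unpack the 24-byte FC Header laid out in
# Figure 5-24. Byte order is big-endian; the 3-byte D_ID, S_ID, and F_CTL
# fields are handled as raw byte strings.

FC_HEADER = struct.Struct(">B3sB3sB3sBBHHHI")

def pack_header(r_ctl, d_id, cs_ctl, s_id, type_, f_ctl,
                seq_id, df_ctl, seq_cnt, ox_id, rx_id, parameter):
    return FC_HEADER.pack(
        r_ctl, d_id.to_bytes(3, "big"), cs_ctl, s_id.to_bytes(3, "big"),
        type_, f_ctl.to_bytes(3, "big"), seq_id, df_ctl,
        seq_cnt, ox_id, rx_id, parameter)

def unpack_header(raw: bytes) -> dict:
    (r_ctl, d_id, cs_ctl, s_id, type_, f_ctl,
     seq_id, df_ctl, seq_cnt, ox_id, rx_id, parameter) = FC_HEADER.unpack(raw)
    return {
        "r_ctl": r_ctl, "d_id": int.from_bytes(d_id, "big"), "cs_ctl": cs_ctl,
        "s_id": int.from_bytes(s_id, "big"), "type": type_,
        "f_ctl": int.from_bytes(f_ctl, "big"), "seq_id": seq_id,
        "df_ctl": df_ctl, "seq_cnt": seq_cnt, "ox_id": ox_id,
        "rx_id": rx_id, "parameter": parameter}

# Hypothetical example values; Type 0x08 is commonly used for FCP.
hdr = pack_header(r_ctl=0x01, d_id=0x6403E8, cs_ctl=0x00, s_id=0x640101,
                  type_=0x08, f_ctl=0x000000, seq_id=0x01, df_ctl=0x00,
                  seq_cnt=0, ox_id=0x1234, rx_id=0xFFFF, parameter=0)
assert len(hdr) == 24
print(unpack_header(hdr)["d_id"] == 0x6403E8)
```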

FC Delivery Mechanisms

Like Ethernet, FC supports several delivery mechanisms. Each set of delivery mechanisms is called a class of service (CoS). Currently, there are six CoS definitions:

  • Class 1 Acknowledged, connection-oriented, full bandwidth service

  • Class 2 Acknowledged, connectionless service

  • Class 3 Unacknowledged, connectionless service

  • Class 4 Acknowledged, connection-oriented, partial bandwidth service

  • Class 6 Acknowledged, connection-oriented, full bandwidth, multicast service

  • Class F Acknowledged, connectionless service

Note

Class 5 was abandoned before completion. Class 5 was never included in any ANSI standard.


Classes 1, 2, 3, 4, and 6 are referred to collectively as Class N services; the N stands for node. The F in Class F stands for fabric because Class F traffic can never leave the fabric. In other words, Class F traffic can never be accepted from or transmitted to a node and may be exchanged only between fabric infrastructure devices such as switches and bridges. FC devices are not required to support all six classes. Classes 1, 4, and 6 are not currently supported on any modern FC switches. Classes 2 and 3 are supported on all modern FC switches. Class 3 is currently the default service on all modern FC switches, and most FC-SANs operate in Class 3 mode. Class F support is mandatory on all FC switches.

Class 1 provides a dedicated circuit between two end nodes (conceptually similar to ATM CES). Class 1 guarantees full bandwidth end-to-end. Class 2 provides reliable delivery without requiring a circuit to be established. All delivered frames are acknowledged, and all delivery failures are detected and indicated to the source node. Class 3 provides unreliable delivery that is roughly equivalent to Ethernet Type 1 service. Class 4 provides a virtual circuit between two end nodes. Class 4 is similar to Class 1 but guarantees only fractional bandwidth end-to-end. Class 6 essentially provides multiple Class 1 circuits between a single initiator and multiple targets. Only the initiator transmits data frames, and targets transmit acknowledgements. Class F is essentially the same as Class 2 but is reserved for fabric-control traffic.

Class 3 is currently the focus of this book. The following paragraphs describe Class 3 in terms applicable to all ULPs. For details of how Class 3 delivery mechanisms are used by FCP, see chapter 8, "OSI Session, Presentation, and Application Layers." Class 3 implements the following delivery mechanisms:

  • Destination nodes can detect frames dropped in transit. This is accomplished via the SEQ_CNT field and the error detect time-out value (E_D_TOV). When a drop is detected, all subsequently received frames within that sequence (and possibly within that exchange) are discarded, and the ULP within the destination node is notified of the error. The source node is not notified. Source node notification is the responsibility of the ULP. For this reason, ULPs that do not implement their own delivery failure notification or delivery acknowledgement schemes should not be deployed in Class 3 networks. (FCP supports delivery failure detection via timeouts and Exchange status monitoring.) The frames of a sequence are buffered to be delivered to the ULP as a group. So, it is not possible for the ULP to receive only part of a sequence. The decision to retransmit just the affected sequence or the entire Exchange is made by the ULP within the initiator before originating the Exchange. The decision is conveyed to the target via the Abort Sequence Condition sub-field in the F_CTL field in the FC Header. It is called the Exchange Error Policy.

  • Destination nodes can detect duplicate frames. However, the current specifications do not explicitly state how duplicate frames should be handled. Duplicates can result only from actions taken by a sequence initiator or from frame forwarding errors within the network. A timer, the resource allocation time-out value (R_A_TOV), is used to avoid transmission of duplicate frames after a node is unexpectedly reset. However, a node that has a software bug, virus, or other errant condition could transmit duplicate frames. It is also possible for frame-forwarding errors caused by software bugs or other errant conditions within an FC switch to result in frame duplication. Recipient behavior in these scenarios is currently subject to vendor interpretation of the specifications.

  • FC devices can detect corrupt frames via the CRC field. Upon detection of a corrupt frame, the frame is dropped. If the frame is dropped by the destination node, the ULP is notified within the destination node, but the source node is not notified. If the frame is dropped by a switch, no notification is sent to the source or destination node. Some FC switches employ cut-through switching techniques and are unable to detect corrupt frames. Thus, corrupt frames are forwarded to the destination node and subsequently dropped. All FC switches produced by Cisco Systems employ a store-and-forward architecture capable of detecting and dropping corrupt frames.

  • Acknowledgement of successful frame delivery is not supported. (Note that some other Classes of Service support acknowledgement.)

  • Retransmission is not supported. (Note that some other Classes of Service support retransmission.) ULPs are expected to retransmit any data lost in transit. FCP supports retransmission. Likewise, SCSI supports retransmission by reissuing failed commands.

  • Link-level flow control is supported in a proactive manner. End-to-end flow control is not supported. (Note that some other Classes of Service support end-to-end flow control.) See chapter 9, "Flow Control and Quality of Service," for more information about flow control.

  • Bandwidth is not guaranteed. Monitoring and trending of bandwidth utilization on shared links is required to ensure optimal network operation. Oversubscription on shared links must be carefully calculated to avoid bandwidth starvation during peak periods. (Note that some other Classes of Service support bandwidth guarantees.)

  • Consistent latency is not guaranteed.

  • The specifications do not define methods for fragmentation or reassembly because the necessary header fields do not exist. An MTU mismatch results in frame drop. To avoid MTU mismatches, end nodes discover the MTU of intermediate network links via fabric login (FLOGI) during link initialization. A single MTU value is provided to end nodes during FLOGI, so all network links must use a common MTU size. End nodes also exchange MTU information during PLOGI (see chapter 7, "OSI Transport Layer"). Based on this information, transmitters do not send frames that exceed the MTU of any intermediate network link or the destination node.

  • Guaranteed in-order delivery of frames within a sequence is not required. Likewise, guaranteed in-order delivery of sequences within an exchange is not required. However, end nodes can request in-order delivery of frames during FLOGI. FC switches are not required to honor such requests. If honored, the entire network must support in-order delivery of frames within each sequence and in-order delivery of sequences within each exchange. This requires FC switch architectures, port channel load-balancing algorithms, and FSPF load-balancing algorithms to be specially developed to ensure in-order delivery of frames across load-balanced port channels and equal-cost FSPF paths, even during port-channel membership changes and network topology changes. All FC switches produced by Cisco Systems architecturally guarantee in-order delivery within a single switch. For multi-switch networks, all FC switches produced by Cisco Systems employ port-channel load-balancing algorithms and FSPF load-balancing algorithms that inherently ensure in-order delivery in a stable topology. For unstable networks, Cisco Systems provides an optional feature (disabled by default) called In-Order Delivery that ensures in-order frame delivery during topology changes. This is accomplished by intentionally delaying in-flight frames during topology changes. The feature must be enabled to honor (in a manner fully compliant with the FC-FS series of specifications) in-order delivery requests made by end nodes during FLOGI. Many modern HBA drivers do not request in-order delivery during FLOGI, so out-of-order frame delivery is possible in many Class 3 networks. End nodes can detect out-of-order frames via the Sequence Qualifier fields in combination with the SEQ_CNT field. The E_D_TOV timer begins immediately upon detection of an out-of-order frame. A frame error occurs if the missing frame is not received before E_D_TOV expiration. The events that follow a frame error are determined by the error-handling capabilities of each node, which are discovered during PLOGI between each pair of nodes (see chapter 7, "OSI Transport Layer"). Subject to these capabilities, an Exchange Error Policy is specified by the source node on a per-exchange basis. This policy determines whether the exchange recipient ignores frame errors (called the process policy), discards only the affected sequence upon detection of a frame error (called the single sequence discard policy) or discards the entire affected exchange upon detection of a frame error (called the multiple sequence discard policy). Receipt of an out-of-order frame is not considered a frame error if the missing frame is received before E_D_TOV expiration. Unfortunately, the specifications do not explicitly require or prohibit frame reordering within the destination node in this scenario. So, recipient behavior is determined by the HBA vendor's interpretation of the specifications. Currently, no HBA produced by any of the three leading HBA vendors (Emulex, QLogic, and JNI/AMCC) reorders frames in this scenario. Instead, this scenario is treated the same as the frame error scenario.
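
The gap-detection behavior described in the first and last items above can be modeled with a short sketch. The following Python fragment is a simplified, hypothetical illustration (the names and timer handling are not taken from any specification): the receiver tracks the next expected SEQ_CNT within a sequence, starts a notional E_D_TOV timer when a gap appears, and reports a frame error if the missing frame does not arrive before the timer expires.

import time

E_D_TOV_SECONDS = 2.0  # illustrative value only; the operational value is negotiated at login

class SequenceTracker:
    """Simplified model of Class 3 gap detection within a single sequence."""

    def __init__(self):
        self.expected_seq_cnt = 0
        self.gap_started_at = None  # set when an out-of-order frame reveals a gap

    def receive(self, seq_cnt):
        if seq_cnt == self.expected_seq_cnt:
            self.expected_seq_cnt += 1
            self.gap_started_at = None       # the missing frame arrived in time
            return "accept"
        if self.gap_started_at is None:      # out-of-order frame; start the E_D_TOV clock
            self.gap_started_at = time.monotonic()
        return "hold"

    def poll(self):
        """Return 'frame_error' if a detected gap has persisted longer than E_D_TOV."""
        if self.gap_started_at is not None:
            if time.monotonic() - self.gap_started_at > E_D_TOV_SECONDS:
                return "frame_error"  # ULP is notified; the Exchange Error Policy applies
        return "ok"

In a real implementation the tracker would be keyed by the Sequence Qualifier fields, and the action taken on a frame error would depend on the Exchange Error Policy negotiated for that exchange.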

Note

Note that the ability of a destination node to reorder frames is present in every CoS because the Sequence Qualifier fields and SEQ_CNT field are contained in the general header format used by every CoS. However, the requirement for a recipient to reorder frames is established per CoS. This contrasts with the IP model, wherein each transport layer protocol uses a different header format. Thus, in the IP model, the choice of transport layer protocol determines the recipient's ability and requirement to reorder packets.


FC Link Aggregation

Currently, no standard exists for aggregation of multiple FC links into a port channel. Consequently, some FC switch vendors have developed proprietary methods. Link aggregation between FC switches produced by different vendors would be limited by the dissimilar nature of the load-balancing algorithms, and in practice no FC switch vendor currently allows port channels between heterogeneous switches. Cisco Systems supports FC port channels and also automates link aggregation via its FC Port Channel Protocol (PCP). PCP is functionally similar to LACP and PAgP. PCP employs two sub-protocols: the bringup protocol and the autocreation protocol. The bringup protocol validates the configuration of the ports at each end of an ISL (for compatibility) and synchronizes Exchange status across each ISL to ensure symmetric data flow. The autocreation protocol aggregates compatible ISLs into a port channel. The full details of PCP have not been published, so further disclosure of PCP within this book is not possible. As with Ethernet, network administrators must be wary of several operational requirements. The following restrictions apply to FC port channels connecting two switches produced by Cisco Systems:

  • All links in a port channel must connect a single pair of devices. In other words, only point-to-point configurations are permitted.

  • All links in a port channel must operate at the same transmission rate.

  • If any link in a port channel is configured as non-trunking, all links in that port channel must be configured as non-trunking. Likewise, if any link in a port channel is configured as trunking, all links in that port channel must be configured as trunking.

  • All links in a non-trunking port channel must belong to the same VSAN.

  • All links in a trunking port channel must trunk the same set of VSANs.

The first two restrictions also apply to other FC switch vendors. The three VSAN-related restrictions only apply to Cisco Systems because VSANs are currently supported only by Cisco Systems. Several additional restrictions that do not apply to Cisco Systems do apply to other FC switch vendors. For example, one FC switch vendor mandates that only contiguous ports can be aggregated, and distance limitations apply because of the possibility of out-of-order frame delivery. Similar to Ethernet, the maximum number of links that may be grouped into a single port channel and the maximum number of port channels that may be configured on a single switch are determined by product design. Cisco Systems supports 16 links per FC port channel and 128 FC port channels per switch. These numbers currently exceed the limits of all other FC switch vendors.
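
The configuration restrictions above lend themselves to a simple validation routine. The sketch below is hypothetical (the field names are invented for illustration) and merely checks a list of candidate member links against the rules listed for Cisco Systems switches.

def validate_port_channel(members):
    """members: list of dicts with hypothetical keys 'speed_gbps',
    'trunking' (bool), and 'vsans' (set of VSAN IDs)."""
    if len({m["speed_gbps"] for m in members}) != 1:
        return False, "all links must operate at the same transmission rate"
    if len({m["trunking"] for m in members}) != 1:
        return False, "links must be uniformly trunking or non-trunking"
    if len({frozenset(m["vsans"]) for m in members}) != 1:
        return False, "links must carry the same VSAN or the same set of VSANs"
    return True, "compatible"

ok, reason = validate_port_channel([
    {"speed_gbps": 2, "trunking": True, "vsans": {10, 20}},
    {"speed_gbps": 2, "trunking": True, "vsans": {10, 20}},
])
assert ok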

FC Link Initialization

When an FC device is powered on, it begins the basic FC link initialization procedure. Unlike Ethernet, the media type is irrelevant to basic FC link initialization procedures. Like Ethernet, FC links may be manually configured or dynamically configured via auto-negotiation. 10GFC does not currently support auto-negotiation. Most HBAs and switch ports default to auto-negotiation mode. FC auto-negotiation is implemented in a peer-to-peer fashion. Following basic FC link initialization, one of several extended FC link initialization procedures occurs. The sequence of events that transpires is determined by the device types that are connected. The sequence of events differs for node-to-node, node-to-switch, switch-to-switch, and switch-to-bridge connections. Node-to-node connections are used for DAS configurations and are not discussed in this book.

The following basic FC link initialization procedure applies to all FC device types. Three state machines govern the basic FC link-initialization procedure: speed negotiation state machine (SNSM), loop port state machine (LPSM), and FC_Port state machine (FPSM). The SNSM executes first, followed by the LPSM, followed by the FPSM. This book does not discuss the LPSM. When a port (port A) is powered on, it starts its receiver transmitter time-out value (R_T_TOV) timer and begins transmitting OLS at its maximum supported transmission rate. If no receive signal is detected before R_T_TOV expiration, port A begins transmitting NOS at its maximum supported transmission rate and continues until another port is connected and powered on. When another port (port B) is connected and powered on, auto-negotiation of the transmission rate begins. The duplex mode is not auto-negotiated because switch-attached FC devices always operate in full-duplex mode. Port B begins transmitting OLS at its maximum supported transmission rate. Port A continues transmitting NOS at its maximum supported transmission rate. This continues for a specified period of time, then each port drops its transmission rate to the next lower supported rate and continues transmitting for the same period of time. This cycle repeats until a transmission rate match is found or all supported transmission rates (up to a maximum of four) have been attempted by each port.

During each transmission rate cycle, each port attempts to achieve bit-level synchronization and word alignment at each of its supported reception rates. Reception rates are cycled at least five times as fast as transmission rates so that every supported reception rate can be attempted during each transmission cycle. Each port selects its transmission rate based on the highest reception rate at which word alignment is achieved and continues transmission of OLS/NOS at the newly selected transmission rate. When both ports achieve word alignment at the new reception rate, auto-negotiation is complete. When a port is manually configured to operate at a single transmission rate, auto-negotiation remains enabled, but only the configured transmission rate is attempted. Thus, the peer port can achieve bit-level synchronization and word alignment at only one rate. If the configured transmission rate is not supported by the peer device, the network administrator must intervene. After auto-negotiation successfully completes, both ports begin listening for a Primitive Sequence. Upon recognition of three consecutive OLS ordered sets without error, port A begins transmitting LR. Upon recognition of three consecutive LR ordered sets without error, port B begins transmitting LRR to acknowledge recognition of the LR ordered sets. Upon recognition of three consecutive LRR ordered sets without error, port A begins transmitting Idle ordered sets. Upon recognition of the first Idle ordered set, port B begins transmitting Idle ordered sets. At this point, both ports are able to begin normal communication.
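
The Primitive Sequence handshake just described can be summarized as a small decision table. The following sketch is a simplified, hypothetical model of one port's behavior; real ports also manage timers, word alignment, and the three-consecutive-ordered-set rule within their state machines.

def next_transmit(role, recognized):
    """role: 'A' (the port that was transmitting NOS) or 'B' (the port transmitting OLS).
    recognized: the Primitive Sequence recognized three consecutive times
    ('OLS', 'LR', 'LRR'), or 'Idle' recognized once.
    Returns what the port should transmit next, or None to keep transmitting
    its current Primitive Sequence."""
    transitions = {
        ("A", "OLS"):  "LR",    # port A answers OLS with Link Reset
        ("B", "LR"):   "LRR",   # port B acknowledges the Link Reset
        ("A", "LRR"):  "Idle",  # port A begins transmitting Idles
        ("B", "Idle"): "Idle",  # port B follows with Idles; the link is usable
    }
    return transitions.get((role, recognized))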

To understand the extended FC link initialization procedures, first we must understand FC port types. The ANSI T11 specifications define many port types. Each port type displays a specific behavior during initialization and normal operation. An end node HBA port is called a node port (N_Port). A switch port is called a fabric port (F_Port) when it is connected to an N_Port. A switch port is called an expansion port (E_Port) or a trunking E_Port (TE_Port) when it is connected to another switch port or to a bridge port (B_Port). A device that provides backbone connectivity as defined in the FC-BB specification series is called a bridge. Each bridge device contains at least one B_Port. A B_Port can only be connected to an E_Port or a TE_Port. A TE_Port is a VSAN-aware E_Port capable of conveying VSAN membership information on a frame-by-frame basis. TE_Ports were once proprietary to Cisco Systems but are now included in ANSI's new Virtual Fabric (VF) standard. Switch ports are often called generic ports or G_Ports because they can assume the behavior of more than one port type. If a node is connected, the switch port behaves as an F_Port; if a bridge is connected, the switch port behaves as an E_Port; if another switch is connected, the switch port behaves as an E_Port or TE_Port. To determine the appropriate port type, a switch port may adapt its behavior based on the observed behavior of the connected device during extended link initialization.

All FC switches produced by Cisco Systems behave in this manner. If the connected device does not display any specific behavior (that is, only Idle ordered sets are received), a Cisco Systems FC switch cannot determine which port type is appropriate. So, a wait timer is implemented to bound the wait period. Upon expiration of the wait timer, a Cisco Systems FC switch assumes the role of E_Port. Alternately, a switch port may sequentially assume the behavior of multiple port types during extended link initialization until it discovers the appropriate port type based on the reaction of the connected device. However, this approach can prevent some HBAs from initializing properly. Additional port types are defined for FC-AL environments, but those port types are not discussed in this book.
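
The adaptive port-type behavior described above amounts to a simple decision based on what, if anything, arrives after basic link initialization. The sketch below is hypothetical and reflects only the behavior described in this section, including the default assumption made when the wait timer expires.

def resolve_switch_port_type(first_event):
    """first_event: the first meaningful observation on the link after basic
    initialization: 'FLOGI' (a node is attached), 'ELP' (another switch or a
    bridge is attached), or 'timeout' (only Idles were received and the wait
    timer expired)."""
    if first_event == "FLOGI":
        return "F_Port"
    if first_event == "ELP":
        return "E_Port"   # may become a TE_Port if VSAN trunking is negotiated
    if first_event == "timeout":
        return "E_Port"   # default assumed when the connected device stays silent
    return "unknown"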

Note

Cisco Systems supports a feature called switch port analyzer (SPAN) on its Ethernet and FC switches. On FC switches, SPAN makes use of SPAN destination (SD) and SPAN trunk (ST) ports. These port types are currently proprietary to Cisco Systems. The SD port type and the SPAN feature are discussed in chapter 14, "Storage Protocol Decoding and Analysis."


When an N_Port is attached to a switch port, the extended link initialization procedure known as FLOGI is employed. FLOGI is mandatory for all N_Ports regardless of CoS, and communication with other N_Ports is not permitted until FLOGI completes. FLOGI is accomplished with a single-frame request followed by a single-frame response. In switched FC environments, FLOGI accomplishes the following tasks:

  • Determines the presence or absence of a switch

  • Provides the operating characteristics of the requesting N_Port to the switch

  • Provides the operating characteristics of the entire network to the requesting N_Port

  • Assigns an FCID to the requesting N_Port

  • Initializes the BB_Credit mechanism for link-level flow control

The ANSI T11 specifications do not explicitly state a required minimum or maximum number of Idles that must be transmitted before transmission of the FLOGI request. So, the amount of delay varies widely (between 200 microseconds and 1500 milliseconds) from one HBA model to the next. When the N_Port is ready to begin the FLOGI procedure, it transmits a FLOGI ELS frame with the S_ID field set to 0. Upon recognition of the FLOGI request, the switch port assumes the role of F_Port and responds with a FLOGI Link Services Accept (LS_ACC) ELS frame. The FLOGI LS_ACC ELS frame specifies the N_Port's newly assigned FCID via the D_ID field. Upon recognition of the FLOGI LS_ACC ELS frame, the FLOGI procedure is complete, and the N_Port is ready to communicate with other N_Ports. The FLOGI ELS and associated LS_ACC ELS use the same frame format, which is a standard FC frame containing link parameters in the data field. Figure 5-25 illustrates the data field format of an FLOGI/LS_ACC ELS frame.

Figure 5-25. Data Field Format of an FC FLOGI/LS_ACC ELS Frame


A brief description of each field follows:

  • LS Command Code: 4 bytes long. It contains the 1-byte FLOGI command code (0x04) followed by 3 bytes of zeros when transmitted by an N_Port. This field contains the 1-byte LS_ACC command code (0x02) followed by 3 bytes of zeros when transmitted by an F_Port.

  • Common Service Parameters: 16 bytes long. It contains parameters that affect network operation regardless of the CoS. Key parameters include the number of BB_Credits, the BB_Credit management policy, the BB_SC interval, the MTU, the R_T_TOV, the E_D_TOV, the R_A_TOV, the length of the FLOGI payload, and the transmitter's port type (N_Port or F_Port). Some parameters can be manually configured by the network administrator. If manually configured, only the values configured by the administrator will be advertised to the peer device.

  • N_Port Name: 8 bytes long. It contains the PWWN of the N_Port. This field is not used by the responding F_Port.

  • Node Name/Fabric Name: 8 bytes long. It contains the NWWN associated with the N_Port (FLOGI) or the switch (LS_ACC).

  • Class 1/6, 2, 3, and 4 Service Parameters: Each is 16 bytes long. They contain class-specific parameters that affect network operation. Key parameters relevant to Class 3 include indication of support for Class 3, in-order delivery, priority/preemption, CS_CTL preference, DiffServ, and clock synchronization. Some parameters can be manually configured by the network administrator. If manually configured, only the values configured by the administrator will be advertised to the peer device.

  • Vendor Version Level: 16 bytes long. It contains vendor-specific information.

  • Services Availability: 8 bytes long. It is used only in LS_ACC frames. It indicates the availability of fabric services at the defined WKAs.

  • Login Extension Data Length: 4 bytes long. It indicates the length of the Login Extension Data field expressed in 4-byte words.

  • Login Extension Data: 120 bytes long. It contains the vendor identity and other vendor-specific information.

  • Clock Synchronization QoS: 8 bytes long. It contains operational parameters relevant to the fabric's ability to deliver clock synchronization data. It is used only if the Clock Synchronization service is supported by the switch.
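
The field sizes listed above add up to a 256-byte payload. The following sketch lays out a zero-filled FLOGI payload to make the sizes and offsets concrete; it is a structural illustration only, not a spec-complete encoder, and the field names are invented for readability.

import struct

# (name, size in bytes) in the order described above
FLOGI_FIELDS = [
    ("ls_command_code", 4), ("common_service_params", 16),
    ("nport_name", 8), ("node_name_fabric_name", 8),
    ("class1_6_service_params", 16), ("class2_service_params", 16),
    ("class3_service_params", 16), ("class4_service_params", 16),
    ("vendor_version_level", 16), ("services_availability", 8),
    ("login_ext_data_length", 4), ("login_ext_data", 120),
    ("clock_sync_qos", 8),
]

def flogi_skeleton():
    """Return a zero-filled FLOGI payload with only the command code set."""
    payload = bytearray(sum(size for _, size in FLOGI_FIELDS))  # 256 bytes
    struct.pack_into(">B", payload, 0, 0x04)  # FLOGI command code; 3 zero bytes follow
    return bytes(payload)

assert len(flogi_skeleton()) == 256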

If the operating characteristics of an N_Port change after the N_Port completes FLOGI, the N_Port can update the switch via the FDISC ELS command. The FDISC ELS and associated LS_ACC ELS use the exact same frame format as FLOGI. The meaning of each field is also identical. The LS Command Code field contains the FDISC command code (0x51). The FDISC ELS enables N_Ports to update the switch without affecting any sequences or exchanges that are currently open. For the new operating characteristics to take effect, the N_Port must log out of the fabric and perform FLOGI again. An N_Port may also use FDISC to request assignment of additional FC Address Identifiers.

When a switch port is attached to another switch port, the switch port mode initialization state machine (SPMISM) governs the extended link initialization procedure. The SPMISM cannot be invoked until the LPSM and FPSM determine that there is no FC-AL or N_Port attached. Because the delay between basic link initialization and FLOGI request transmission is unspecified, each switch vendor must decide how its switches will determine whether an FC-AL or N_Port is attached to a newly initialized link. All FC switches produced by Cisco Systems wait 700 ms after link initialization for a FLOGI request. If no FLOGI request is received within that time, the LPSM and FPSM relinquish control to the SPMISM.

All FC switches behave the same once the SPMISM takes control. An exchange link parameters (ELP) SW_ILS frame is transmitted by one of the connected switch ports (the requestor). Upon recognition of the ELP SW_ILS frame, the receiving switch port (the responder) transmits an SW_ACC SW_ILS frame. Upon recognition of the SW_ACC SW_ILS frame, the requestor transmits an ACK frame. The ELP SW_ILS and SW_ACC SW_ILS both use the same frame format, which is a standard FC frame containing link parameters in the data field. The data field of an ELP/SW_ACC SW_ILS frame is illustrated in Figure 5-26.

Figure 5-26. Data Field Format of an FC ELP/SW_ACC SW_ILS Frame


Note

Each SW_ILS command that expects an SW_ACC response defines the format of the SW_ACC payload. Thus, there are many SW_ACC SW_ILS frame formats.


A brief description of each field follows:

  • SW_ILS Command Code: 4 bytes long. This field contains the ELP command code (0x10000000) when transmitted by a requestor. This field contains the SW_ACC command code (0x02000000) when transmitted by a responder.

  • Revision: 1 byte long. It indicates the ELP/SW_ACC protocol revision.

  • Flags: 2 bytes long. It contains bit-oriented sub-fields that provide additional information about the transmitting port. Currently, only one flag is defined: the Bridge Port flag. If this flag is set to 1, the transmitting port is a B_Port. Otherwise, the transmitting port is an E_Port.

  • BB_SC_N: 1 byte long. It indicates the BB_SC interval. The value of this field is meaningful only if the ISL Flow Control Mode field indicates that the R_RDY mechanism is to be used.

  • R_A_TOV: 4 bytes long. It indicates the transmitter's required value for resource allocation timeout. All devices within a physical SAN or VSAN must agree upon a common R_A_TOV. Some FC switches allow the network administrator to configure this value manually.

  • E_D_TOV: 4 bytes long. It indicates the transmitter's required value for error-detection timeout. All devices within a physical SAN or VSAN must agree upon a common E_D_TOV. Some FC switches allow the network administrator to configure this value manually.

  • Requestor/Responder Interconnect Port Name: 8 bytes long. It indicates the PWWN of the transmitting switch port.

  • Requestor/Responder Switch Name: 8 bytes long. It indicates the NWWN of the transmitting switch.

  • Class F Service Parameters: 6 bytes long. It contains various E_Port operating parameters. Key parameters include the class-specific MTU (ULP buffer size), the maximum number of concurrent Class F sequences, the maximum number of concurrent sequences within each exchange, and the number of End-to-End_Credits (EE_Credits) supported. Some FC switches allow the network administrator to configure one or more of these values manually.

  • Class 1, 2, and 3 Interconnect Port Parameters: Each is 4 bytes long. They contain class-specific parameters that affect network operation. Key parameters relevant to Class 3 include indication of support for Class 3 and in-order delivery. Also included is the class-specific MTU (ULP buffer size). Some FC switches allow the network administrator to configure one or more of these values manually.

  • Reserved: 20 bytes long.

  • ISL Flow Control Mode: 2 bytes long. It indicates whether the R_RDY mechanism or a vendor-specific mechanism is supported. On some FC switches, the flow-control mechanism is determined by the switch operating mode (native or interoperable).

  • Flow Control Parameter Length: 2 bytes long. It indicates the length of the Flow Control Parameters field expressed in bytes.

  • Flow Control Parameters: Variable in length as determined by the value of the ISL Flow Control Mode field. If the R_RDY mechanism is used, this field has a fixed length of 20 bytes and contains two sub-fields: BB_Credit (4 bytes) and compatibility parameters (16 bytes). The BB_Credit sub-field indicates the total number of BB_Credits available to all Classes of Service. The compatibility parameters sub-field contains four parameters required for backward compatibility.
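
Several of the ELP parameters must agree between the two switches; a mismatch causes the ELP to be rejected and the E_Port to become isolated. The sketch below is a minimal, hypothetical check of the two timeout values (the dictionary keys are invented for illustration); compare the SW_RJT reason code explanation 0x0F later in this section.

def check_elp_timeouts(local, remote):
    """local/remote: dicts with keys 'r_a_tov' and 'e_d_tov' (in milliseconds)."""
    if local["r_a_tov"] != remote["r_a_tov"]:
        return "reject: R_A_TOV mismatch"
    if local["e_d_tov"] != remote["e_d_tov"]:
        return "reject: E_D_TOV mismatch"
    return "accept"

assert check_elp_timeouts({"r_a_tov": 10000, "e_d_tov": 2000},
                          {"r_a_tov": 10000, "e_d_tov": 2000}) == "accept"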

Following ELP, the ISL is reset to activate the new operating parameters. The ELP requestor begins transmitting LR ordered sets. Upon recognition of three consecutive LR ordered sets without error, the ELP responder begins transmitting LRR to acknowledge recognition of the LR ordered sets. Upon recognition of three consecutive LRR ordered sets without error, the ELP requestor begins transmitting Idle ordered sets. Upon recognition of the first Idle ordered set, the ELP responder begins transmitting Idle ordered sets. At this point, the switches are ready to exchange information about the routing protocols that they support via the exchange switch capabilities (ESC) procedure. The ESC procedure is optional, but all modern FC switches perform ESC. The ELP requestor transmits an ESC SW_ILS frame with the S_ID and D_ID fields each set to 0xFFFFFD. The ESC payload contains a list of routing protocols supported by the transmitter. Upon recognition of the ESC SW_ILS frame, the receiver selects a single routing protocol and transmits an SW_ACC SW_ILS frame indicating its selection in the payload. The S_ID and D_ID fields of the SW_ACC SW_ILS frame are each set to 0xFFFFFD. The ESC SW_ILS frame format is a standard FC frame containing a protocol list in the data field. The data field of an ESC SW_ILS frame is illustrated in Figure 5-27.

Figure 5-27. Data Field Format of an FC ESC SW_ILS Frame


A brief description of each field follows:

  • SW_ILS Command Code: 1 byte long. It contains the first byte of the ESC command code (0x30). The first byte of the ESC command code is unique to the ESC command, so the remaining 3 bytes are truncated.

  • Reserved: 1 byte long.

  • Payload Length: 2 bytes long. It indicates the total length of all payload fields expressed in bytes.

  • Vendor ID String: 8 bytes long. It contains the unique vendor identification string assigned by ANSI T10 to the manufacturer of the transmitting switch.

  • Protocol Descriptor #n: Multiple Protocol Descriptor fields may be present in the ESC SW_ILS frame. Each Protocol Descriptor field is 12 bytes long and indicates a protocol supported by the transmitter. Three sub-fields are included: vendor ID string (8 bytes), reserved (2 bytes), and protocol ID (2 bytes). The vendor ID string sub-field indicates whether the protocol is standard or proprietary. This sub-field contains all zeros if the protocol is a standard. Otherwise, this sub-field contains the unique vendor identification string assigned by ANSI T10 to the vendor that created the protocol. The reserved sub-field is reserved for future use. The protocol ID sub-field identifies the protocol. Values from 0x0000-0x7FFF indicate standard protocols. Values from 0x8000-0xFFFF indicate proprietary protocols. Currently, only two standard protocols are defined: FSPF (0x0002) and FSPF-Backbone (0x0001).

The corresponding SW_ACC SW_ILS frame format is a standard FC frame containing a single protocol descriptor in the data field. The data field of an SW_ACC SW_ILS frame corresponding to an ESC SW_ILS command is illustrated in Figure 5-28.

Figure 5-28. Data Field Format of an FC SW_ACC SW_ILS Frame for ESC


A brief description of each field follows:

  • SW_ILS Command Code: 1 byte long. It contains the first byte of the SW_ACC command code (0x02). The first byte of the SW_ACC command code is unique to the SW_ACC command, so the remaining 3 bytes are truncated.

  • Reserved: 3 bytes long.

  • Vendor ID String: 8 bytes long. It contains the unique vendor identification string assigned by ANSI T10 to the manufacturer of the transmitting switch.

  • Accepted Protocol Descriptor: 12 bytes long. It indicates the routing protocol selected by the transmitter. This field is copied from the ESC SW_ILS frame.
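
The ESC exchange reduces to a simple selection: the requestor offers a list of protocol descriptors, and the responder accepts exactly one. The following sketch is a simplified model; treating the offered list as an order of preference is an assumption made here purely for illustration.

FSPF = 0x0002
FSPF_BACKBONE = 0x0001

def select_routing_protocol(offered_ids, supported_ids=(FSPF,)):
    """Return the first offered protocol ID that the responder supports,
    or None if there is no common protocol."""
    for protocol_id in offered_ids:
        if protocol_id in supported_ids:
            return protocol_id
    return None

assert select_routing_protocol([FSPF_BACKBONE, FSPF]) == FSPF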

Following ESC, the switch ports optionally authenticate each other (see chapter 12, "Storage Network Security"). The port-level authentication procedure is relatively new. Thus, few modern FC switches support port-level authentication. That said, all FC switches produced by Cisco Systems support port-level authentication. Upon successful authentication (if supported), the ISL becomes active. Next, the PSS process ensues, followed by domain address assignment. After all Domain_IDs have been assigned, the zone exchange and merge procedure begins. Next, the FSPF routing protocol converges. Finally, RSCNs are generated. To summarize:

  • ELP is exchanged between E_Ports

  • The new ISL is reset to enable the operating parameters

  • ESC is exchanged between E_Ports

  • Optional E_Port authentication occurs

  • PSS occurs

  • Domain_IDs are re-assigned if a Domain_ID conflict exists

  • Zone merge occurs

  • The FSPF protocol converges

  • An SW_RSCN is broadcast to announce the new ISL

  • Name server updates are distributed

  • RSCNs are generated to announce the new name server records

When a switch port is attached to a bridge port, the switch-to-switch extended link initialization procedure is followed, but ELP is exchanged between the switch port and the bridge port. An equivalent SW_ILS, called exchange B_access parameters (EBP), is exchanged between the bridge ports across the WAN. For details about the EBP SW_ILS, see chapter 8, "OSI Session, Presentation, and Application Layers." Bridge ports are transparent to all inter-switch operations after ELP. Following ELP, the link is reset. ESC is then performed between the switch ports. Likewise, port-level authentication is optionally performed between the switch ports. The resulting ISL is called a virtual ISL (VISL). Figure 5-29 illustrates this topology.

Figure 5-29. FC VISL Across Bridge Devices


Any SW_ILS command may be rejected by the responding port via the switch internal link service reject (SW_RJT). A common SW_RJT format is used for all SW_ILS. Figure 5-30 illustrates the data field format of an SW_RJT frame.

Figure 5-30. Data Field Format of an FC SW_RJT Frame


A brief description of each field follows:

  • SW_ILS Command Code: 4 bytes long. This field contains the SW_RJT command code (0x01000000).

  • Reserved: 1 byte long.

  • Reason Code: 1 byte long. It indicates why the SW_ILS command was rejected. A common set of reason codes is used for all SW_ILS commands. Table 5-11 summarizes the reason codes defined by the FC-SW-4 specification. All reason codes excluded from Table 5-11 are reserved.

  • Reason Code Explanation: 1 byte long. It provides additional diagnostic information that complements the Reason Code field. A common set of reason code explanations is used for all SW_ILS commands. Table 5-12 summarizes the reason code explanations defined by the FC-SW-4 specification. All reason code explanations excluded from Table 5-12 are reserved.

  • Vendor Specific: 1 byte long. When the Reason Code field is set to 0xFF, this field provides a vendor-specific reason code. When the Reason Code field is set to any value other than 0xFF, this field is ignored.
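
A receiver that gets an SW_RJT typically turns the three diagnostic bytes into a human-readable message. The sketch below is hypothetical and uses only a handful of the reason codes from Table 5-11, which follows.

# Subset of the reason codes in Table 5-11, for illustration only
SW_RJT_REASONS = {
    0x01: "Invalid SW_ILS Command Code",
    0x03: "Logical Error",
    0x05: "Logical Busy",
    0x0B: "Command Not Supported",
    0xFF: "Vendor Specific Error",
}

def describe_sw_rjt(reason, explanation, vendor_specific):
    text = SW_RJT_REASONS.get(reason, "Reserved/unknown reason")
    if reason == 0xFF:
        return f"{text} (vendor-specific code 0x{vendor_specific:02X})"
    return f"{text} (explanation 0x{explanation:02X})"

assert "Logical Busy" in describe_sw_rjt(0x05, 0x00, 0x00)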

Table 5-11. SW_RJT Reason Codes

Reason Code    Description
0x01           Invalid SW_ILS Command Code
0x02           Invalid Revision Level
0x03           Logical Error
0x04           Invalid Payload Size
0x05           Logical Busy
0x07           Protocol Error
0x09           Unable To Perform Command Request
0x0B           Command Not Supported
0x0C           Invalid Attachment
0xFF           Vendor Specific Error


Table 5-12. SW_RJT Reason Code Explanations

Reason Code Explanation    Description
0x00                       No Additional Explanation
0x01                       Class F Service Parameter Error
0x03                       Class "n" Service Parameter Error
0x04                       Unknown Flow Control Code
0x05                       Invalid Flow Control Parameters
0x0D                       Invalid Port_Name
0x0E                       Invalid Switch_Name
0x0F                       R_A_TOV Or E_D_TOV Mismatch
0x10                       Invalid Domain_ID_List
0x19                       Command Already In Progress
0x29                       Insufficient Resources Available
0x2A                       Domain_ID Not Available
0x2B                       Invalid Domain_ID
0x2C                       Request Not Supported
0x2D                       Link Parameters Not Yet Established
0x2E                       Requested Domain_IDs Not Available
0x2F                       E_Port Is Isolated
0x31                       Authorization Failed
0x32                       Authentication Failed
0x33                       Incompatible Security Attribute
0x34                       Checks In Progress
0x35                       Policy Summary Not Equal
0x36                       FC-SP Zoning Summary Not Equal
0x41                       Invalid Data Length
0x42                       Unsupported Command
0x44                       Not Authorized
0x45                       Invalid Request
0x46                       Fabric Changing
0x47                       Update Not Staged
0x48                       Invalid Zone Set Format
0x49                       Invalid Data
0x4A                       Unable To Merge
0x4B                       Zone Set Size Not Supported
0x50                       Unable To Verify Connection
0x58                       Requested Application Not Supported


The preceding descriptions of the FC link initialization procedures are simplified for the sake of clarity. For more detail about Primitive Sequence usage, speed negotiation states, the FPSM, port types, the SPMISM, frame formats, or B_Port operation, readers are encouraged to consult the ANSI T11 FC-FS, FC-LS, and FC-SW specification series.



