Compared to Ethernet, FC is a complex technology. FC attempts to provide functionality equivalent to that provided by Ethernet plus elements of IP, UDP, and TCP. So, it is difficult to compare FC to just Ethernet. FC promises to continue maturing at a rapid pace and is currently considered the switching technology of choice for block-level storage protocols. As block-level SANs proliferate, FC is expected to maintain its market share dominance in the short term. The long term is difficult to predict in the face of rapidly maturing IPS protocols, but FC already enjoys a sufficiently large installed base to justify a detailed examination of FC's inner workings. This section explores the details of FC operation at OSI Layers 1 and 2.
FC Media, Connectors, Transceivers, and Operating Ranges
FC supports a very broad range of media, connectors, and transceivers. Today, most FC deployments are based on fiber media for end node and ISL connectivity. Most 1-Gbps FC products employ the SC style connector, and most 2-Gbps FC products employ the LC style. Copper media is used primarily for intra-enclosure connectivity. For this reason, copper media is not discussed herein. Table 5-7 summarizes the media, connectors, transceivers, and operating ranges that are specified in ANSI T11 FC-PH, FC-PH-2, FC-PI, FC-PI-2, and 10GFC. Data rates under 100 MBps are excluded from Table 5-7 because they are considered historical. The nomenclature used to represent each defined FC implementation is [data rate expressed in MBps]-[medium]-[transceiver]-[distance capability]. The distance capability designator represents a superset of the defined operating range for each implementation.
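The four-part nomenclature can be illustrated with a short parsing sketch. The function and type names are hypothetical, and the sample designator 100-SM-LL-L is used only as an illustrative input; the sketch splits the fields and does not validate the medium, transceiver, or distance codes against Table 5-7.

```python
from typing import NamedTuple


class FcVariant(NamedTuple):
    """One FC implementation designator, split into its four fields."""
    data_rate_MBps: int   # data rate expressed in megabytes per second
    medium: str           # medium code (not validated in this sketch)
    transceiver: str      # transceiver code (not validated in this sketch)
    distance: str         # distance capability code (not validated)


def parse_designator(designator: str) -> FcVariant:
    """Split '[rate]-[medium]-[transceiver]-[distance]' into named fields."""
    parts = designator.strip().split("-")
    if len(parts) != 4:
        raise ValueError(f"expected four hyphen-separated fields: {designator!r}")
    rate, medium, transceiver, distance = parts
    return FcVariant(int(rate), medium, transceiver, distance)


# Illustrative input only; the codes are passed through unchanged.
variant = parse_designator("100-SM-LL-L")
print(variant)
```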
FC Encoding and Signaling
The following definitions and rules apply only to switched FC implementations. FC-AL is not discussed herein. All FC implementations use one of two encoding schemes. Table 5-8 lists the encoding scheme used by each FC and 10GFC implementation and the associated BER objective.
FC implementations operating at 100-MBps, 200-MBps, and 400-MBps use the 8B/10B encoding scheme. Only one of the control characters defined by the 8B/10B encoding scheme is used: K28.5. FC uses fixed-length ordered sets consisting of four characters. Each ordered set begins with K28.5. FC defines 31 ordered sets. FC uses ordered sets as frame delimiters, Primitive Signals, and Primitive Sequences. Multiple frame delimiters are defined so that additional information can be communicated. The start-of-frame (SOF) delimiters indicate the class of service being requested and the position of the frame within the sequence (first or subsequent). The end-of-frame (EOF) delimiters indicate the position of the frame within the sequence (intermediate or last) and whether the frame is valid or invalid or corrupt. Primitive Signals include idle, receiver_ready (R_RDY), virtual_circuit_ready (VC_RDY), buffer-to-buffer_state_change (BB_SC), and clock synchronization (SYN). Idles are transmitted in the absence of data traffic to maintain clock synchronization and in the presence of data traffic to maintain inter-frame spacing. An R_RDY is transmitted after processing a received frame to increment the receive buffer counter (the number of Buffer-to-Buffer_Credits [BB_Credits]) used for link-level flow control. A VC_RDY is transmitted only by a switch port after forwarding a received frame to increment the transmitting node's buffer counter used for link-level flow control within a virtual circuit. BB_SC permits BB_Credit recovery. SYN enables time synchronization of the internal clocks of attached nodes (similar to Network Time Protocol [NTP]). Primitive Signals are transmitted for as long as the transmitting device deems appropriate. By contrast, Primitive Sequences are transmitted continuously until the receiving device responds. 
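The role of R_RDY in link-level flow control can be sketched with a toy model. This is a simplification under stated assumptions: the class and method names are hypothetical, the sender is modeled as tracking its remaining credits (a real port may count in the opposite direction), and only the send and R_RDY events are represented.

```python
class BbCreditTransmitter:
    """Toy model of a sender governed by Buffer-to-Buffer_Credits."""

    def __init__(self, bb_credit: int) -> None:
        if bb_credit < 1:
            raise ValueError("BB_Credit must be at least 1")
        self.max_credit = bb_credit   # credits granted by the link peer
        self.available = bb_credit    # credits not yet consumed

    def can_send(self) -> bool:
        return self.available > 0

    def send_frame(self) -> None:
        """Consume one credit for each frame placed on the link."""
        if not self.can_send():
            raise RuntimeError("no BB_Credit available; wait for an R_RDY")
        self.available -= 1

    def receive_r_rdy(self) -> None:
        """The peer processed a frame and freed a buffer: restore a credit."""
        if self.available < self.max_credit:
            self.available += 1


tx = BbCreditTransmitter(bb_credit=2)
tx.send_frame()
tx.send_frame()
blocked = not tx.can_send()   # both credits are consumed at this point
tx.receive_r_rdy()            # one buffer is freed by the receiver
print(blocked, tx.available)
```

In this model the sender stalls as soon as its credits reach zero and resumes only when an R_RDY arrives, which is the behavior the text attributes to link-level flow control.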
Primitive Sequences are used to convey the state of a port, recover from certain types of errors, establish bit-level synchronization, and achieve word alignment. Primitive Sequences include offline state (OLS), not operational state (NOS), link reset (LR), and link reset response (LRR).
The encoding scheme of CWDM and parallel implementations of 10GFC is a combination of definitions and rules taken from the CWDM and parallel implementations of 10GE and the lower-speed FC implementations. 10GFC uses the same seven control characters as 10GE. The rules for their use in 10GFC are the same as in 10GE. However, 10GFC ordered set definitions closely match those of lower-speed FC implementations. There are only six differences in the ordered set definitions. The 10GE Sync_Column, Skip_Column, Align_Column, Local_Fault, and Remote_Fault ordered sets are used in 10GFC. The NOS ordered set used in lower-speed FC implementations is not used in 10GFC (replaced by Remote_Fault). In total, 10GFC uses 35 fixed-length ordered sets. Each consists of four characters, but the composition of each 10GFC ordered set is different than the equivalent ordered set in lower-speed FC implementations. Serial implementations of 10GFC use the 64B/66B encoding scheme. The definitions and rules are unchanged from the 10GE implementation.
Further details of each encoding scheme are outside the scope of this book. The 8B/10B encoding scheme is well documented in clause 5 of the ANSI T11 FC-FS-2 specification, and in clauses 9 and 12 of the ANSI T11 10GFC specification. The 64B/66B encoding scheme is well documented in clause 49 of the IEEE 802.3ae-2002 specification and in clause 13 of the ANSI T11 10GFC specification.
FC Addressing Scheme
FC employs an addressing scheme that directly maps to the SAM addressing scheme. FC uses WWNs to positively identify each HBA and port, which represent the equivalent of SAM device and port names, respectively.
An FC WWN is a 64-bit value expressed in colon-separated hexadecimal notation such as 21:00:00:e0:8b:08:a5:44. There are many formats for FC WWNs, most of which provide universal uniqueness. Figure 5-13 illustrates the basic ANSI T11 WWN address format.
Figure 5-13. Basic ANSI T11 WWN Address Format
A brief description of each field follows:
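The colon-separated notation can be illustrated with a small conversion sketch. The helper names are hypothetical. The sketch also assumes that the NAA value occupies the four most significant bits of the WWN, as in the basic T11 layout; the field descriptions are not reproduced in this text, so treat that detail as an assumption.

```python
import string


def parse_wwn(text: str) -> int:
    """Convert 'xx:xx:xx:xx:xx:xx:xx:xx' into a 64-bit integer."""
    octets = text.strip().split(":")
    valid = len(octets) == 8 and all(
        len(o) == 2 and all(c in string.hexdigits for c in o) for o in octets
    )
    if not valid:
        raise ValueError(f"expected eight colon-separated hex octets: {text!r}")
    return int("".join(octets), 16)


def format_wwn(value: int) -> str:
    """Render a 64-bit integer in colon-separated hexadecimal notation."""
    if not 0 <= value < (1 << 64):
        raise ValueError("a WWN in this format must fit in 64 bits")
    return ":".join(f"{b:02x}" for b in value.to_bytes(8, "big"))


def naa_field(value: int) -> int:
    """Return the top 4 bits (assumed here to be the NAA field)."""
    return value >> 60


wwn = parse_wwn("21:00:00:e0:8b:08:a5:44")
print(hex(wwn), naa_field(wwn), format_wwn(wwn))
```

The remaining 60 bits correspond to the Name field, whose interpretation depends on the format in use.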
The Name field can contain a locally assigned address in any format, a mapped external address in the format defined by the NAA responsible for that address type, or a mapped external address in a modified format defined by the ANSI T11 subcommittee. External addresses are mapped into the Name field according to the rules defined in the FC-PH and FC-FS series of specifications. Six mappings are defined: IEEE MAC-48, IEEE extended, IEEE registered, IEEE registered extended, IEEE EUI-64, and IETF IPv4. The FC-DA specification series mandates the use of the IEEE MAC-48, IEEE extended, IEEE registered, or IEEE EUI-64 format to ensure universal uniqueness and interoperability. All six formats are described herein for the sake of completeness. Figure 5-14 illustrates the format of the Name field for containment of a locally assigned address.
Figure 5-14. ANSI T11 Name Field Format for Locally Assigned Addresses
A brief description of each field follows:
Figure 5-15 illustrates the format of the Name field for containment of an IEEE MAC-48 address.
Figure 5-15. ANSI T11 Name Field Format for IEEE MAC-48 Addresses
A brief description of each field follows:
Figure 5-16 illustrates the format of the Name field for containment of an IEEE extended address.
Figure 5-16. ANSI T11 Name Field Format for IEEE Extended Addresses
A brief description of each field follows:
Figure 5-17 illustrates the format of the Name field for containment of an IEEE registered address.
Figure 5-17. ANSI T11 Name Field Format for IEEE Registered Addresses
A brief description of each field follows:
The IEEE registered extended format is atypical because it is the only WWN format that is not 64 bits long. An extra 64-bit field is appended, yielding a total WWN length of 128 bits. The extra length creates some interoperability issues. Figure 5-18 illustrates the format of the Name field for containment of an IEEE registered extended address.
Figure 5-18. ANSI T11 Name Field Format for IEEE Registered Extended Addresses
A brief description of each field follows:
Figure 5-19 illustrates the format of the Name field for containment of an IEEE EUI-64 address.
Figure 5-19. ANSI T11 Name Field Format for IEEE EUI-64 Addresses
A brief description of each field follows:
Figure 5-20 illustrates the format of the Name field for containment of an IETF IPv4 address.
Figure 5-20. ANSI T11 Name Field Format for IETF IPv4 Addresses
A brief description of each field follows:
The FC equivalent of the SAM port identifier is the FC port identifier (Port_ID). The FC Port_ID is embedded in the FC Address Identifier. The FC Address Identifier consists of a hierarchical 24-bit value, and the lower 8 bits make up the Port_ID. The entire FC Address Identifier is sometimes referred to as the FCID (depending on the context). In this book, the phrase FC Address Identifier and the term FCID are used interchangeably except in the context of address assignment. The format of the 24-bit FCID remains unchanged from its original specification. This simplifies communication between FC devices operating at different speeds and preserves the legacy FC frame format. FCIDs are expressed in space-separated hexadecimal notation such as '0x64 03 E8'. Some devices omit the spaces when displaying FCIDs. Figure 5-21 illustrates the ANSI T11 Address Identifier format.
Figure 5-21. ANSI T11 Address Identifier Format
A brief description of each field follows:
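The hierarchical 24-bit layout can be sketched as three 8-bit fields, with the Domain_ID in the upper 8 bits, the Area_ID in the middle 8 bits, and the Port_ID in the lower 8 bits. The type and function names below are hypothetical; the parser accepts the notation with or without spaces because some devices omit them.

```python
from typing import NamedTuple


class Fcid(NamedTuple):
    domain_id: int   # upper 8 bits of the FC Address Identifier
    area_id: int     # middle 8 bits
    port_id: int     # lower 8 bits


def parse_fcid(text: str) -> Fcid:
    """Parse '0x64 03 E8' or '6403E8' into Domain_ID, Area_ID, Port_ID."""
    cleaned = text.replace(" ", "").lower()
    if cleaned.startswith("0x"):
        cleaned = cleaned[2:]
    if len(cleaned) != 6:
        raise ValueError(f"expected six hexadecimal digits: {text!r}")
    value = int(cleaned, 16)
    return Fcid((value >> 16) & 0xFF, (value >> 8) & 0xFF, value & 0xFF)


def format_fcid(fcid: Fcid) -> str:
    """Render an FCID in space-separated hexadecimal notation."""
    return f"0x{fcid.domain_id:02X} {fcid.area_id:02X} {fcid.port_id:02X}"


fcid = parse_fcid("0x64 03 E8")
print(fcid, format_fcid(fcid))
```

For the sample value, the Domain_ID is 0x64 (decimal 100), the Area_ID is 0x03, and the Port_ID is 0xE8 (decimal 232).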
The SAM defines the concept of domain as the entire system of SCSI devices that interact with one another via a service delivery subsystem. FC implements this concept of domain via the first level of hierarchy in the FCID (Domain_ID). In a single-switch fabric, the Domain_ID represents the single switch. In a multi-switch fabric, the Domain_ID should represent all interconnected switches according to the SAM definition of domain. However, FC forwards frames between switches using the Domain_ID. So, each switch must be assigned a unique Domain_ID. To comply with the SAM definition of domain, the ANSI T11 FC-SW-3 specification explicitly allows multiple interconnected switches to share a single Domain_ID. However, the Domain_ID is not implemented in this manner by any FC switch currently on the market. The Area_ID can identify a group of ports attached to a single switch. The Area_ID may not span FC switches. The FC-SW-3 specification does not mandate how fabric ports should be grouped into an Area_ID. One common technique is to assign all ports in a single slot of a switch chassis to the same Area_ID. Other techniques can be implemented. The Port_ID provides a unique identity to each HBA port within each Area_ID. Alternately, the Area_ID field can be concatenated with the Port_ID field to create a 16-bit Port_ID. In this case, no port groupings exist. The FC standards allow multiple FC Address Identifiers to be associated with a single HBA. This is known as N_Port_ID virtualization (NPIV). NPIV enables multiple virtual initiators to share a single HBA by assigning each virtual initiator its own FC Address Identifier. The normal FLOGI procedure is used to acquire the first FC Address Identifier. Additional FC Address Identifiers are acquired using the discover F_port service parameters (FDISC) ELS. When using NPIV, all virtual initiators must share the receive buffers on the HBA. 
NPIV enhances server virtualization techniques by enabling FC-SAN security policies (such as zoning) and QoS policies to be enforced independently for each virtual server. Note that some HBA vendors call their NPIV implementation "virtual HBA technology."
FC Name Assignment and Resolution
FC WWNs are "burned in" during the interface manufacturing process. Each HBA is assigned a node WWN (NWWN). Each port within an HBA is assigned a port WWN (PWWN). Some HBAs allow these values to be overridden by the network administrator. Locally administered WWNs are not guaranteed to be globally unique, so the factory-assigned WWNs are used in the vast majority of deployments.
FC name resolution occurs immediately following link initialization. As discussed in Chapter 3, "Overview of Network Operating Principles," each initiator and target registers with the FCNS. Following registration, each initiator queries the FCNS to discover accessible targets. Depending on the number and type of queries issued, the FCNS replies can contain the NWWN, PWWN, or FCID of some or all of the targets accessible by the initiator. Initiators may subsequently query the FCNS as needed. For example, when a new target comes online, an RSCN is sent to registered nodes. Upon receiving an RSCN, each initiator queries the FCNS for the details of the change. In doing so, the initiator discovers the NWWN, PWWN, or FCID of the new target.
Two alternate methods are defined to enable nodes to directly query other nodes to resolve or update name-to-address mappings. This is accomplished using extended link service (ELS) commands. An ELS may comprise one or more frames per direction transmitted as a single sequence within a new Exchange. Most ELSs are defined in the FC-LS specification. The first ELS is called discover address (ADISC). ADISC may be used only after completion of the PLOGI process.
Because the FCID of the destination node is required to initiate PLOGI, ADISC cannot be used to resolve name-to-address mappings. ADISC can be used to update a peer regarding a local FCID change during an active PLOGI session. Such a change is treated as informational and has no effect on the current active PLOGI session.
The second ELS is called Fibre Channel Address Resolution Protocol (FARP). FARP may be used before using PLOGI. Thus, FARP can be used to resolve a NWWN or PWWN to an FCID. This enables an initiator that queries only for NWWN or PWWN following FCNS registration to issue a PLOGI to a target without querying the FCNS again. In theory, this can be useful in fabrics containing large numbers of targets that use dynamic-fluid FCIDs. However, the actual benefits are negligible. FARP can also be useful as a secondary mechanism to the FCNS in case the FCNS becomes unavailable. For example, some FC switches support a feature called hot code load that allows network administrators to upgrade the switch operating system without disrupting the flow of data frames. However, this feature halts all fabric services (including the FCNS) for an extended period of time. Thus, initiators that are dependent upon the FCNS cannot resolve FCIDs during the switch upgrade. In reality, this is not a concern because most initiators query the FCNS for NWWN, PWWN, and FCID following FCNS registration. To use FARP, the requestor must already know the NWWN or PWWN of the destination node so that the destination node can recognize itself as the intended responder upon receipt of a FARP request. Even though this is practical for some ULPs, FCP generally relies on the FCNS.
FC Address Assignment and Resolution
By default, a Domain_ID is dynamically assigned to each switch when a fabric comes online via the Domain_ID Distribution process. Each switch then assigns an FCID to each of its attached nodes via the Area_ID and Port_ID fields of the FC Address Identifier.
A single switch, known as the principal switch (PS), controls the Domain_ID Distribution process. The PS is dynamically selected via the principal switch selection (PSS) process. The PSS process occurs automatically upon completion of the extended link initialization process. The PSS process involves the following events, which occur in the order listed:
Upon successful completion of the PSS process, the Domain_ID Distribution process ensues. Domain_IDs can be manually assigned by the network administrator, but the Domain_ID Distribution process still executes so the PS (also known as the domain address manager) can compile a list of all assigned Domain_IDs, ensure there are no overlapping Domain_IDs, and distribute the complete Domain_ID_List to all other switches in the fabric. The Domain_ID Distribution process involves the following events, which occur in the order listed:
The preceding descriptions of the PSS and Domain_ID Distribution processes are simplified to exclude error conditions and other contingent scenarios. For more information about these processes, readers are encouraged to consult the ANSI T11 FC-SW-3 specification. The eight-bit Domain_ID field mathematically accommodates 256 Domain_IDs, but some Domain_IDs are reserved. Only 239 Domain_IDs are available for use as FC switch identifiers. Table 5-9 lists all FC Domain_ID values and the status and usage of each.
As the preceding table indicates, some Domain_IDs are reserved for use in WKAs. Some WKAs facilitate access to fabric services. Table 5-10 lists the currently defined FC WKAs and the fabric service associated with each.
In the context of address assignment mechanisms, the term FCID refers only to the Area_ID and Port_ID fields of the FC Address Identifier. These two values can be assigned dynamically by the FC switch or statically by either the FC switch or the network administrator. Dynamic FCID assignment can be fluid or persistent. With dynamic-fluid assignment, FCID assignments may be completely randomized each time an HBA port boots or resets. With dynamic-persistent assignment, the first assignment of an FCID to an HBA port may be completely randomized, but each subsequent boot or reset of that HBA port will result in reassignment of the same FCID. With static assignment, the first assignment of an FCID to an HBA port is predetermined by the software design of the FC switch or by the network administrator, and persistence is inherent in both cases. FC Address Identifiers are not required to be universally unique. In fact, the entire FC address space is available for use within each physical fabric. Likewise, the entire FC address space is available for use within each VSAN. This increases the scalability of each physical fabric that contains multiple VSANs. However, reusing the entire FC address space can prevent physical fabrics or VSANs from being non-disruptively merged due to potential address conflicts. Reusing the entire FC address space also prevents communication between physical fabrics via SAN routers and between VSANs via inter-VSAN routing (IVR) unless network address translation (NAT) is employed. NAT improves scalability by allowing reuse of the entire FC address space while simultaneously facilitating communication across physical fabric boundaries and across VSAN boundaries. However, because NAT negates universal FC Address Identifier uniqueness, potential address conflicts can still exist, and physical fabric/VSAN mergers can still be disruptive. NAT also increases configuration complexity, processing overhead and management overhead. 
So, NAT represents a tradeoff between communication flexibility and configuration simplicity. Address reservation schemes facilitate communication between physical fabrics or VSANs without using NAT by ensuring that there is no overlap between the addresses assigned within each physical fabric or VSAN. A unique portion of the FC address space is used within each physical fabric or VSAN. This has the effect of limiting the scalability of all interconnected physical fabrics or VSANs to a single instance of the FC address space. However, address reservation schemes eliminate potential address conflicts, so physical fabrics or VSANs can be merged non-disruptively. Note that some host operating systems use the FC Address Identifier of target ports to positively identify target ports, which is the stated purpose of PWWNs. Such operating systems require the use of dynamic-persistent or static FCIDs in combination with dynamic-persistent or static Domain_IDs. Note also that the processing of preferred Domain_IDs during the PSS process guarantees Domain_ID persistence in most cases without administrative intervention. In other words, the PSS process employs a dynamic-persistent Domain_ID assignment mechanism by default. However, merging two physical fabrics (or two VSANs) into one can result in Domain_ID conflicts. Thus, static Domain_ID assignment is required to achieve the highest availability of targets in the presence of host operating systems that use FC Address Identifiers to positively identify target ports. As long as static Domain_IDs are used, and the network administrator takes care to assign unique Domain_IDs across physical fabrics (or VSANs) via an address reservation scheme, dynamic-persistent FCID assignment can be used in place of static FCIDs without risk of address conflicts during physical fabric (or VSAN) mergers. 
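The difference between dynamic-fluid and dynamic-persistent assignment can be illustrated with a minimal allocator sketch. The class name is hypothetical, the PWWN is assumed to be the key a switch would use to recognize a returning HBA port, and values are handed out sequentially (rather than randomly) only to keep the example deterministic. As in the address assignment context above, the assigned value covers only the Area_ID and Port_ID fields.

```python
from typing import Dict, Tuple


class PersistentFcidAllocator:
    """Assigns (Area_ID, Port_ID) pairs and remembers them per PWWN."""

    def __init__(self) -> None:
        self._by_pwwn: Dict[str, Tuple[int, int]] = {}
        self._next = 0   # next unused 16-bit Area_ID/Port_ID value

    def assign(self, pwwn: str) -> Tuple[int, int]:
        key = pwwn.lower()
        # A port that has logged in before receives the same value again.
        if key in self._by_pwwn:
            return self._by_pwwn[key]
        if self._next > 0xFFFF:
            raise RuntimeError("Area_ID/Port_ID space exhausted")
        area_id, port_id = divmod(self._next, 256)
        self._next += 1
        self._by_pwwn[key] = (area_id, port_id)
        return (area_id, port_id)


allocator = PersistentFcidAllocator()
first = allocator.assign("21:00:00:e0:8b:08:a5:44")
other = allocator.assign("21:00:00:e0:8b:08:a5:45")   # hypothetical second port
again = allocator.assign("21:00:00:E0:8B:08:A5:44")   # same port after a reset
print(first, other, again)
```

A dynamic-fluid allocator would simply omit the lookup table, so the value returned after a reset could differ from the earlier one.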
An HBA's FC Address Identifier is used as the destination address in all unicast frames sent to that HBA and as the source address in all frames (unicast, multicast or broadcast) transmitted from that HBA. Two exceptions to the source address rule are defined: one related to FCID assignment (see the FC Link Initialization section) and another related to Class 6 multicast frames. FC multicast addressing is currently outside the scope of this book. Broadcast traffic is sent to the reserved FC Address Identifier 0xFF FF FF. Broadcast traffic delivery is subject to operational parameters such as zoning policy and class of service. All FC devices that receive a frame sent to the broadcast address accept the frame and process it accordingly.
Note: In FC, multicast addresses are also called Alias addresses. This should not be confused with PWWN aliases that are optionally used during zoning operations. Another potential point of confusion is Hunt Group addressing, which involves the use of Alias addresses in a particular manner. Hunt Groups are currently outside the scope of this book.
FC implements only OSI Layer 2 addresses, so address resolution is not required to transport SCSI. Note that address resolution is required to transport IP. RFC 2625, IP and ARP over Fibre Channel (IPFC), defines ARP operation in FC environments. ARP over FC can complement FARP, or FARP can be used independently. ARP over FC facilitates dynamic resolution of an IP address to a PWWN. The PWWN is then resolved to an FC Address Identifier using FARP. Alternately, FARP can be used to directly resolve an IP address to an FC Address Identifier. FARP operation is very similar to ARP operation, but FARP can also solicit a PLOGI from the destination node instead of a FARP reply. Regardless of how ARP and FARP are used, the IP address of the destination node must be known to the requestor before transmitting the resolution request.
This is consistent with the ARP over Ethernet model described in the preceding Ethernet section of this chapter. As with ARP, system administrators can create static mappings in the FARP table on each host. Typically, static mappings are used only in special situations to accomplish a particular goal.
FC Media Access
As stated in Chapter 3, "Overview of Network Operating Principles," FC-AL is a shared media implementation, so it requires some form of media access control. However, FC-AL is used primarily for embedded applications (such as connectivity inside a tape library) today, so the FC-AL arbitration mechanism is currently outside the scope of this book. In switched FC implementations, arbitration is not required because full-duplex communication is employed. Likewise, the FC PTP topology used for DAS configurations supports full-duplex communication and does not require arbitration.
FC Network Boundaries
Traditional FC-SANs are physically bounded by media terminations (for example, unused switch ports) and end node interfaces (for example, HBAs). No control information or user data can be transmitted between FC-SANs across physical boundaries. Figure 5-22 illustrates the physical boundaries of a traditional FC-SAN.
Figure 5-22. Traditional FC-SAN Boundaries
FC-SANs also have logical boundaries, but the definition of a logical boundary in Ethernet networks does not apply to FC-SANs. Like the Ethernet architecture, the FC architecture does not define any native functionality at OSI Layer 3. However, Ethernet is used in conjunction with autonomous OSI Layer 3 protocols as a matter of course, so logical boundaries can be easily identified at each OSI Layer 3 entity. By contrast, normal FC communication does not employ autonomous OSI Layer 3 protocols. So, some OSI Layer 2 control information must be transmitted between FC-SANs across logical boundaries to facilitate native communication of user data between FC-SANs. Currently, there is no standard method of facilitating native communication between FC-SANs. Leading FC switch vendors have created several proprietary methods. The ANSI T11 subcommittee is considering all methods, and a standard method is expected in 2006 or 2007. Because of the proprietary and transitory nature of the current methods, further exploration of this topic is currently outside the scope of this book.
Note that network technologies autonomous from FC can be employed to facilitate communication between FC-SANs. Non-native FC transports are defined in the FC-BB specification series. Chapter 8, "OSI Session, Presentation, and Application Layers," discusses one such transport (Fibre Channel over TCP/IP [FCIP]) in detail.
FC-SANs also can have virtual boundaries. There is currently only one method of creating virtual FC-SAN boundaries. Invented in 2002 by Cisco Systems, VSANs are now widely deployed in the FC-SAN market. In 2004, ANSI began researching alternative solutions for virtualization of FC-SAN boundaries. In 2005, ANSI selected Cisco's VSAN technology as the basis for the only standards-based solution (called Virtual Fabrics). The new standards (FC-SW-4, FC-FS-2, and FC-LS) are expected to be finalized in 2006. VSANs are similar to VLANs in the way traffic isolation is provided.
Typically, each switch port is statically assigned to a single VSAN by the network administrator. Alternately, each switch port can be dynamically assigned to a VSAN via Cisco's dynamic port VSAN membership (DPVM) technology. DPVM is similar in function to Ethernet's GVRP. Like Ethernet, an FC switch port can belong to multiple VSANs. However, this is used exclusively on ISLs; HBAs do not currently support VSAN trunking. As frames enter a switch from an end node, the switch prepends a tag to indicate the VSAN membership of the ingress port. The tag remains intact until the frame reaches the egress switch port that connects the destination end node. The switch removes the tag and transmits the frame to the destination end node. FC switches made by Cisco Systems use VSAN tags to ensure that no frames are forwarded between VSANs. Thus, VSAN boundaries mimic physical FC-SAN boundaries. User data can be forwarded between VSANs only via IVR. IVR is one of the native FC logical boundaries alluded to in the preceding paragraph. IVR can be used with all of the non-native FC transports defined in the FC-BB specification series. VSANs provide additional functionality not provided by VLANs. The FC specifications outline a model in which all network services (for example, the zone server) may run on one or more FC switches. This contrasts the TCP/IP model, in which network services other than routing protocols typically run on one or more hosts attached to the network (for example, a DHCP server). The FC service model enables switch vendors to instantiate independent network services within each VSAN during the VSAN creation process. This is the case with FC switches made by Cisco Systems. A multi-VSAN FC switch has an instance of each network service operating independently within each VSAN. This enables network administrators to achieve higher availability, security, and flexibility by providing complete isolation between VSANs. 
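The tagging behavior described above can be sketched with a toy model. This is not the actual tag format or switch logic; the class names and the port names (fc1/1 and so on) are hypothetical, each port is statically assigned to one VSAN, and IVR is not modeled.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class TaggedFrame:
    vsan: int        # VSAN membership of the ingress port
    payload: bytes   # original frame, carried unchanged


class VsanSwitch:
    """Toy switch in which every port belongs to exactly one VSAN."""

    def __init__(self, port_vsan: Dict[str, int]) -> None:
        self.port_vsan = dict(port_vsan)

    def ingress(self, port: str, payload: bytes) -> TaggedFrame:
        """Prepend a tag that records the VSAN of the ingress port."""
        return TaggedFrame(self.port_vsan[port], payload)

    def egress(self, port: str, frame: TaggedFrame) -> Optional[bytes]:
        """Strip the tag and deliver, but only within the same VSAN."""
        if self.port_vsan[port] != frame.vsan:
            return None   # never forwarded between VSANs
        return frame.payload


switch = VsanSwitch({"fc1/1": 10, "fc1/2": 10, "fc1/3": 20})
frame = switch.ingress("fc1/1", b"data")
same_vsan = switch.egress("fc1/2", frame)    # delivered without the tag
other_vsan = switch.egress("fc1/3", frame)   # dropped: different VSAN
print(same_vsan, other_vsan)
```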
When facilitating communication between VSANs, IVR selectively exports control information bidirectionally between services in the affected VSANs without fusing the services. This is similar in concept to route redistribution between dissimilar IP routing protocols. The result is preservation of the service isolation model.
FC Frame Formats
FC uses one general frame format for many purposes. The general frame format has not changed since the inception of FC. The specific format of an FC frame is determined by the function of the frame. FC frames are word-oriented, and an FC word is 4 bytes. Figure 5-23 illustrates the general FC frame format.
Figure 5-23. General FC Frame Format
A brief description of each field follows:
As mentioned in Chapter 3, "Overview of Network Operating Principles," the header in the general FC frame format provides functionality at multiple OSI layers. This contrasts the layered header model used in TCP/IP/Ethernet networks wherein a distinct header is present for each protocol operating at OSI Layers 2-4. Figure 5-24 illustrates the FC Header format.
Figure 5-24. FC Header Format
A brief description of each field follows:
The S_ID, D_ID, OX_ID, and RX_ID fields are collectively referred to as the fully qualified exchange identifier (FQXID). The S_ID, D_ID, OX_ID, RX_ID, and SEQ_ID fields are collectively referred to as the sequence qualifier. The fields of the sequence qualifier can be used together in several ways. The preceding descriptions of these fields are highly simplified and apply only to FCP. FC implements many control frames to facilitate link, fabric, and session management. Many of the control frames carry additional information within the Data field. Comprehensive exploration of all the control frames and their payloads is outside the scope of this book, but certain control frames are explored in subsequent chapters. For more information about the general FC frame format, readers are encouraged to consult the ANSI T11 FC-FS-2 specification. For more information about control frame formats, readers are encouraged to consult the ANSI T11 FC-FS-2, FC-SW-3, FC-GS-3, FC-LS, and FC-SP specifications.
FC Delivery Mechanisms
Like Ethernet, FC supports several delivery mechanisms. Each set of delivery mechanisms is called a class of service (CoS). Currently, there are six CoS definitions:
Note: Class 5 was abandoned before completion. Class 5 was never included in any ANSI standard.
Classes 1, 2, 3, 4, and 6 are referred to collectively as Class N services; the N stands for node. The F in Class F stands for fabric because Class F traffic can never leave the fabric. In other words, Class F traffic can never be accepted from or transmitted to a node and may be exchanged only between fabric infrastructure devices such as switches and bridges. FC devices are not required to support all six classes. Classes 1, 4, and 6 are not currently supported on any modern FC switches. Classes 2 and 3 are supported on all modern FC switches. Class 3 is currently the default service on all modern FC switches, and most FC-SANs operate in Class 3 mode. Class F support is mandatory on all FC switches.
Class 1 provides a dedicated circuit between two end nodes (conceptually similar to ATM CES). Class 1 guarantees full bandwidth end-to-end. Class 2 provides reliable delivery without requiring a circuit to be established. All delivered frames are acknowledged, and all delivery failures are detected and indicated to the source node. Class 3 provides unreliable delivery that is roughly equivalent to Ethernet Type 1 service. Class 4 provides a virtual circuit between two end nodes. Class 4 is similar to Class 1 but guarantees only fractional bandwidth end-to-end. Class 6 essentially provides multiple Class 1 circuits between a single initiator and multiple targets. Only the initiator transmits data frames, and targets transmit acknowledgements. Class F is essentially the same as Class 2 but is reserved for fabric-control traffic.
Class 3 is currently the focus of this book. The following paragraphs describe Class 3 in terms applicable to all ULPs. For details of how Class 3 delivery mechanisms are used by FCP, see Chapter 8, "OSI Session, Presentation, and Application Layers." Class 3 implements the following delivery mechanisms:
Note: The ability of a destination node to reorder frames is present in every CoS because the Sequence Qualifier fields and SEQ_CNT field are contained in the general header format used by every CoS. However, the requirement for a recipient to reorder frames is established per CoS. This contrasts with the IP model, wherein each transport layer protocol uses a different header format. Thus, in the IP model, the choice of transport layer protocol determines the recipient's ability and requirement to reorder packets.
FC Link Aggregation
Currently, no standard exists for aggregation of multiple FC links into a port channel. Consequently, some FC switch vendors have developed proprietary methods. Link aggregation between FC switches produced by different vendors is possible, but functionality is limited by the dissimilar nature of the load-balancing algorithms. No FC switch vendors currently allow port channels between heterogeneous switches. Cisco Systems supports FC port channels in addition to automation of link aggregation. Automation of link aggregation is accomplished via Cisco's FC Port Channel Protocol (PCP). PCP is functionally similar to LACP and PAgP. PCP employs two sub-protocols: the bringup protocol and the autocreation protocol. The bringup protocol validates the configuration of the ports at each end of an ISL (for compatibility) and synchronizes Exchange status across each ISL to ensure symmetric data flow. The autocreation protocol aggregates compatible ISLs into a port channel. The full details of PCP have not been published, so further disclosure of PCP within this book is not possible.
As with Ethernet, network administrators must be wary of several operational requirements. The following restrictions apply to FC port channels connecting two switches produced by Cisco Systems:
The first two restrictions also apply to other FC switch vendors. The three VSAN-related restrictions apply only to Cisco Systems because VSANs are currently supported only by Cisco Systems. Several additional restrictions that do not apply to Cisco Systems do apply to other FC switch vendors. For example, one FC switch vendor mandates that only contiguous ports can be aggregated, and distance limitations apply because of the possibility of out-of-order frame delivery. Similar to Ethernet, the maximum number of links that may be grouped into a single port channel and the maximum number of port channels that may be configured on a single switch are determined by product design. Cisco Systems supports 16 links per FC port channel and 128 FC port channels per switch. These numbers currently exceed the limits of all other FC switch vendors.
FC Link Initialization
When an FC device is powered on, it begins the basic FC link initialization procedure. Unlike Ethernet, the media type is irrelevant to basic FC link initialization procedures. Like Ethernet, FC links may be manually configured or dynamically configured via auto-negotiation. 10GFC does not currently support auto-negotiation. Most HBAs and switch ports default to auto-negotiation mode. FC auto-negotiation is implemented in a peer-to-peer fashion. Following basic FC link initialization, one of several extended FC link initialization procedures occurs. The sequence of events that transpires is determined by the device types that are connected. The sequence of events differs for node-to-node, node-to-switch, switch-to-switch, and switch-to-bridge connections. Node-to-node connections are used for DAS configurations and are not discussed in this book.
The following basic FC link initialization procedure applies to all FC device types. Three state machines govern the basic FC link-initialization procedure: speed negotiation state machine (SNSM), loop port state machine (LPSM), and FC_Port state machine (FPSM).
The SNSM executes first, followed by the LPSM, followed by the FPSM. This book does not discuss the LPSM. When a port (port A) is powered on, it starts its receiver transmitter time-out value (R_T_TOV) timer and begins transmitting OLS at its maximum supported transmission rate. If no receive signal is detected before R_T_TOV expiration, port A begins transmitting NOS at its maximum supported transmission rate and continues until another port is connected and powered on. When another port (port B) is connected and powered on, auto-negotiation of the transmission rate begins. The duplex mode is not auto-negotiated because switch-attached FC devices always operate in full-duplex mode. Port B begins transmitting OLS at its maximum supported transmission rate. Port A continues transmitting NOS at its maximum supported transmission rate. This continues for a specified period of time, then each port drops its transmission rate to the next lower supported rate and continues transmitting for the same period of time. This cycle repeats until a transmission rate match is found or all supported transmission rates (up to a maximum of four) have been attempted by each port. During each transmission rate cycle, each port attempts to achieve bit-level synchronization and word alignment at each of its supported reception rates. Reception rates are cycled at least five times as fast as transmission rates so that five or more reception rates can be attempted during each transmission cycle. Each port selects its transmission rate based on the highest reception rate at which word alignment is achieved and continues transmission of OLS/NOS at the newly selected transmission rate. When both ports achieve word alignment at the new reception rate, auto-negotiation is complete. When a port is manually configured to operate at a single transmission rate, auto-negotiation remains enabled, but only the configured transmission rate is attempted. 
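The step-down search at the heart of speed negotiation can be illustrated with a short Python sketch. This is a simplification and not the SNSM itself: the function name negotiate_rate is hypothetical, timers and OLS/NOS transmission are omitted, and the sketch assumes that a port achieves word alignment exactly when its peer transmits at a rate the port supports.

```python
from typing import Iterable, Optional

MAX_RATES = 4  # the text caps the number of rates attempted per port at four


def negotiate_rate(local_rates: Iterable[int],
                   peer_rates: Iterable[int]) -> Optional[int]:
    """Return the negotiated rate in MBps, or None if no rate matches.

    Hypothetical helper: models only the outcome of the rate search.
    """
    local = sorted(set(local_rates), reverse=True)
    peer = set(peer_rates)
    if len(local) > MAX_RATES or len(peer) > MAX_RATES:
        raise ValueError("at most four rates may be attempted per port")
    # Step down from the highest supported rate, one rate per cycle.
    for rate in local:
        # Assumption: the peer achieves word alignment only at rates it supports.
        if rate in peer:
            return rate
    return None


print(negotiate_rate([100, 200, 400], [100, 200]))  # both ports auto-negotiate
print(negotiate_rate([200], [100, 400]))            # manually configured, no match
```

Under these assumptions, the result is simply the highest rate common to both ports. A manually configured port is modeled as a list containing only the configured rate.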
Thus, the peer port can achieve bit-level synchronization and word alignment at only one rate. If the configured transmission rate is not supported by the peer device, the network administrator must intervene. After auto-negotiation successfully completes, both ports begin listening for a Primitive Sequence. Upon recognition of three consecutive OLS ordered sets without error, port A begins transmitting LR. Upon recognition of three consecutive LR ordered sets without error, port B begins transmitting LRR to acknowledge recognition of the LR ordered sets. Upon recognition of three consecutive LRR ordered sets without error, port A begins transmitting Idle ordered sets. Upon recognition of the first Idle ordered set, port B begins transmitting Idle ordered sets. At this point, both ports are able to begin normal communication. To understand the extended FC link initialization procedures, first we must understand FC port types. The ANSI T11 specifications define many port types. Each port type displays a specific behavior during initialization and normal operation. An end node HBA port is called a node port (N_Port). A switch port is called a fabric port (F_Port) when it is connected to an N_Port. A switch port is called an expansion port (E_Port) or a trunking E_Port (TE_Port) when it is connected to another switch port or to a bridge port (B_Port). A device that provides backbone connectivity as defined in the FC-BB specification series is called a bridge. Each bridge device contains at least one B_Port. A B_Port can only be connected to an E_Port or a TE_Port. A TE_Port is a VSAN-aware E_Port capable of conveying VSAN membership information on a frame-by-frame basis. TE_Ports were once proprietary to Cisco Systems but are now included in ANSI's new Virtual Fabric (VF) standard. Switch ports are often called generic ports or G_Ports because they can assume the behavior of more than one port type. 
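The Primitive Sequence exchange described above (OLS, then LR, then LRR, then Idle) can be sketched in Python. This is a minimal model under stated assumptions, not the FPSM: the function names recognized and basic_handshake are hypothetical, each port's receive stream is given as a list, and an ordered set received with an error is represented by None.

```python
from typing import Iterable, List, Optional


def recognized(received: Iterable[Optional[str]], ordered_set: str,
               needed: int = 3) -> bool:
    """True once `needed` consecutive error-free copies of `ordered_set` arrive.

    Any other value (including None for an errored ordered set) resets the run.
    """
    run = 0
    for item in received:
        run = run + 1 if item == ordered_set else 0
        if run >= needed:
            return True
    return False


def basic_handshake(a_rx_ols: List[Optional[str]],
                    b_rx_lr: List[Optional[str]],
                    a_rx_lrr: List[Optional[str]],
                    b_rx_idle: List[Optional[str]]) -> List[str]:
    """Return the transmit actions taken, in order, until the handshake stalls."""
    steps: List[str] = []
    if not recognized(a_rx_ols, "OLS"):
        return steps
    steps.append("A:LR")        # port A saw three OLS, begins transmitting LR
    if not recognized(b_rx_lr, "LR"):
        return steps
    steps.append("B:LRR")       # port B saw three LR, acknowledges with LRR
    if not recognized(a_rx_lrr, "LRR"):
        return steps
    steps.append("A:Idle")      # port A saw three LRR, begins transmitting Idle
    if not recognized(b_rx_idle, "Idle", needed=1):
        return steps
    steps.append("B:Idle")      # port B needs only the first Idle
    return steps


print(basic_handshake(["OLS"] * 3, ["LR"] * 3, ["LRR"] * 3, ["Idle"]))
```

The asymmetry in the last step mirrors the text: three consecutive ordered sets are required for OLS, LR, and LRR, but a single Idle is sufficient.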
If a node is connected, the switch port behaves as an F_Port; if a bridge is connected, the switch port behaves as an E_Port; if another switch is connected, the switch port behaves as an E_Port or TE_Port. To determine the appropriate port type, a switch port may adapt its behavior based on the observed behavior of the connected device during extended link initialization. All FC switches produced by Cisco Systems behave in this manner. If the connected device does not display any specific behavior (that is, only Idle ordered sets are received), a Cisco Systems FC switch cannot determine which port type is appropriate. So, a wait timer is implemented to bound the wait period. Upon expiration of the wait timer, a Cisco Systems FC switch assumes the role of E_Port. Alternatively, a switch port may sequentially assume the behavior of multiple port types during extended link initialization until it discovers the appropriate port type based on the reaction of the connected device. However, this approach can prevent some HBAs from initializing properly. Additional port types are defined for FC-AL environments, but those port types are not discussed in this book.
Note: Cisco Systems supports a feature called switch port analyzer (SPAN) on its Ethernet and FC switches. On FC switches, SPAN makes use of SPAN destination (SD) and SPAN trunk (ST) ports. These port types are currently proprietary to Cisco Systems. The SD port type and the SPAN feature are discussed in Chapter 14, "Storage Protocol Decoding and Analysis."
When an N_Port is attached to a switch port, the extended link initialization procedure known as FLOGI is employed. FLOGI is mandatory for all N_Ports regardless of CoS, and communication with other N_Ports is not permitted until FLOGI completes. FLOGI is accomplished with a single-frame request followed by a single-frame response. In switched FC environments, FLOGI accomplishes the following tasks:
The ANSI T11 specifications do not explicitly state a required minimum or maximum number of Idles that must be transmitted before transmission of the FLOGI request. So, the amount of delay varies widely (between 200 microseconds and 1500 milliseconds) from one HBA model to the next. When the N_Port is ready to begin the FLOGI procedure, it transmits a FLOGI ELS frame with the S_ID field set to 0. Upon recognition of the FLOGI request, the switch port assumes the role of F_Port and responds with a FLOGI Link Services Accept (LS_ACC) ELS frame. The FLOGI LS_ACC ELS frame specifies the N_Port's newly assigned FCID via the D_ID field. Upon recognition of the FLOGI LS_ACC ELS frame, the FLOGI procedure is complete, and the N_Port is ready to communicate with other N_Ports. The FLOGI ELS and associated LS_ACC ELS use the same frame format, which is a standard FC frame containing link parameters in the data field. Figure 5-25 illustrates the data field format of an FLOGI/LS_ACC ELS frame.
Figure 5-25. Data Field Format of an FC FLOGI/LS_ACC ELS Frame
A brief description of each field follows:
If the operating characteristics of an N_Port change after the N_Port completes FLOGI, the N_Port can update the switch via the FDISC ELS command. The FDISC ELS and associated LS_ACC ELS use the exact same frame format as FLOGI. The meaning of each field is also identical. The LS Command Code field contains the FDISC command code (0x51). The FDISC ELS enables N_Ports to update the switch without affecting any sequences or exchanges that are currently open. For the new operating characteristics to take effect, the N_Port must log out of the fabric and perform FLOGI again. An N_Port may also use FDISC to request assignment of additional FC Address Identifiers.
When a switch port is attached to another switch port, the switch port mode initialization state machine (SPMISM) governs the extended link initialization procedure. The SPMISM cannot be invoked until the LPSM and FPSM determine that there is no FC-AL or N_Port attached. Because the delay between basic link initialization and FLOGI request transmission is unspecified, each switch vendor must decide how its switches will determine whether an FC-AL or N_Port is attached to a newly initialized link. All FC switches produced by Cisco Systems wait 700 ms after link initialization for a FLOGI request. If no FLOGI request is received within that time, the LPSM and FPSM relinquish control to the SPMISM.
All FC switches behave the same once the SPMISM takes control. An exchange link parameters (ELP) SW_ILS frame is transmitted by one of the connected switch ports (the requestor). Upon recognition of the ELP SW_ILS frame, the receiving switch port (the responder) transmits an SW_ACC SW_ILS frame. Upon recognition of the SW_ACC SW_ILS frame, the requestor transmits an ACK frame. The ELP SW_ILS and SW_ACC SW_ILS both use the same frame format, which is a standard FC frame containing link parameters in the data field. The data field of an ELP/SW_ACC SW_ILS frame is illustrated in Figure 5-26.
Figure 5-26. Data Field Format of an FC ELP/SW_ACC SW_ILS Frame
Note: Each SW_ILS command that expects an SW_ACC response defines the format of the SW_ACC payload. Thus, there are many SW_ACC SW_ILS frame formats.
A brief description of each field follows:
Following ELP, the ISL is reset to activate the new operating parameters. The ELP requestor begins transmitting LR ordered sets. Upon recognition of three consecutive LR ordered sets without error, the ELP responder begins transmitting LRR to acknowledge recognition of the LR ordered sets. Upon recognition of three consecutive LRR ordered sets without error, the ELP requestor begins transmitting Idle ordered sets. Upon recognition of the first Idle ordered set, the ELP responder begins transmitting Idle ordered sets. At this point, the switches are ready to exchange information about the routing protocols that they support via the exchange switch capabilities (ESC) procedure.
The ESC procedure is optional, but all modern FC switches perform ESC. The ELP requestor transmits an ESC SW_ILS frame with the S_ID and D_ID fields each set to 0xFFFFFD. The ESC payload contains a list of routing protocols supported by the transmitter. Upon recognition of the ESC SW_ILS frame, the receiver selects a single routing protocol and transmits an SW_ACC SW_ILS frame indicating its selection in the payload. The S_ID and D_ID fields of the SW_ACC SW_ILS frame are each set to 0xFFFFFD. The ESC SW_ILS frame format is a standard FC frame containing a protocol list in the data field. The data field of an ESC SW_ILS frame is illustrated in Figure 5-27.
Figure 5-27. Data Field Format of an FC ESC SW_ILS Frame
A brief description of each field follows:
The corresponding SW_ACC SW_ILS frame format is a standard FC frame containing a single protocol descriptor in the data field. The data field of an SW_ACC SW_ILS frame corresponding to an ESC SW_ILS command is illustrated in Figure 5-28.
Figure 5-28. Data Field Format of an FC SW_ACC SW_ILS Frame for ESC
A brief description of each field follows:
Following ESC, the switch ports optionally authenticate each other (see Chapter 12, "Storage Network Security"). The port-level authentication procedure is relatively new. Thus, few modern FC switches support port-level authentication. That said, all FC switches produced by Cisco Systems support port-level authentication. Upon successful authentication (if supported), the ISL becomes active. Next, the PSS process ensues, followed by domain address assignment. After all Domain_IDs have been assigned, the zone exchange and merge procedure begins. Next, the FSPF routing protocol converges. Finally, RSCNs are generated. To summarize, the sequence is ELP, link reset, ESC, optional port-level authentication, PSS, domain address assignment, zone exchange and merge, FSPF convergence, and RSCN generation.
When a switch port is attached to a bridge port, the switch-to-switch extended link initialization procedure is followed, but ELP is exchanged between the switch port and the bridge port. An equivalent SW_ILS, called exchange B_access parameters (EBP), is exchanged between the bridge ports across the WAN. For details about the EBP SW_ILS, see Chapter 8, "OSI Session, Presentation, and Application Layers." Bridge ports are transparent to all inter-switch operations after ELP. Following ELP, the link is reset. ESC is then performed between the switch ports. Likewise, port-level authentication is optionally performed between the switch ports. The resulting ISL is called a virtual ISL (VISL). Figure 5-29 illustrates this topology.
Figure 5-29. FC VISL Across Bridge Devices
Any SW_ILS command may be rejected by the responding port via the switch internal link service reject (SW_RJT). A common SW_RJT format is used for all SW_ILS. Figure 5-30 illustrates the data field format of an SW_RJT frame.
Figure 5-30. Data Field Format of an FC SW_RJT Frame
A brief description of each field follows:
The preceding descriptions of the FC link initialization procedures are simplified for the sake of clarity. For more detail about Primitive Sequence usage, speed negotiation states, the FPSM, port types, the SPMISM, frame formats, or B_Port operation, readers are encouraged to consult the ANSI T11 FC-FS, FC-LS, and FC-SW specification series.
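As one concrete illustration of the port-type decision described in this section, the following Python sketch models the wait-timer behavior attributed to Cisco Systems FC switches: a FLOGI request received within 700 ms of link initialization makes the switch port an F_Port; otherwise control passes to the SPMISM and the port assumes the E_Port role. The function name switch_port_role is hypothetical, and treating a FLOGI that arrives at exactly 700 ms as being within the window is an assumption.

```python
from typing import Optional

FLOGI_WAIT_MS = 700.0  # wait period the text attributes to Cisco Systems FC switches


def switch_port_role(flogi_arrival_ms: Optional[float],
                     wait_ms: float = FLOGI_WAIT_MS) -> str:
    """Return the role a switch port assumes after basic link initialization.

    flogi_arrival_ms is the time, in milliseconds after link initialization,
    at which a FLOGI request arrives; None means only Idles were received.
    """
    # Assumption: an arrival exactly at the timer value counts as within the window.
    if flogi_arrival_ms is not None and flogi_arrival_ms <= wait_ms:
        return "F_Port"
    # Timer expired: the LPSM and FPSM relinquish control to the SPMISM.
    return "E_Port"


print(switch_port_role(0.2))   # FLOGI sent 200 microseconds after initialization
print(switch_port_role(None))  # no FLOGI at all, for example another switch
```

This models only the initial decision. The handling of a FLOGI request that arrives after the timer expires is not described in the text, so the sketch does not attempt to represent it.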