Ethernet


Many people misunderstand the current capabilities of Ethernet because of lingering preconceptions formed during the early days of Ethernet. In its pre-switching era, Ethernet had some severe limitations. However, most of Ethernet's major drawbacks have been eliminated by the widespread adoption of switching and other technological advances. This section explains how Ethernet has evolved to become the most broadly applicable LAN technology in history, and how Ethernet provides the foundation for new services (like storage networking) that might not be considered LAN-friendly.

Low Overhead Paradigm

To fully appreciate Ethernet, you need to understand that Ethernet began as a low overhead, high efficiency alternative to competing communication technologies. Simplicity was at the heart of the original Ethernet specification developed by Robert Metcalfe and David Boggs in 1973. Ethernet really began to penetrate the LAN market after the IEEE became involved and produced the 802.3 specification. ARCNET and Token Ring were Ethernet's primary competitors when 802.3 debuted. For the remainder of this book, the term Ethernet shall refer to the IEEE 802.3 specification unless otherwise noted.

Ethernet was not designed to outperform competing LAN technologies in every way, but rather to provide an alternative that was inexpensive and easy to deploy and support. At 10 Mbps, Ethernet was much faster than ARCNET (then 2.5 Mbps) and Token Ring (then 4 Mbps). This was perceived as an advantage, though the 10 Mbps transmission rate was rarely achieved in real-world deployments. Fortunately for Ethernet, very few people realized that at the time. Ethernet had another advantage over ARCNET. Unlike ARCNET, which used switch blocks to physically configure the address of each node, Ethernet addresses were "burned" into NICs at the factory. This made adds, moves, and changes relatively painless in Ethernet networks compared to ARCNET. A third advantage was Ethernet's prodigious address space. ARCNET's limited address space created challenges for companies as their LANs grew. ARCNET address conflicts often occurred long before a LAN neared its maximum number of nodes unless administrators carefully tracked address assignments. Ethernet was (and still is) carefree in this regard, because every Ethernet address is globally unique.

Ethernet also had several advantages over Token Ring. Unlike Token Ring, Ethernet equipment could be purchased from several vendors. IBM was the only vendor of Token Ring equipment in the early days of LANs. Cabling was another advantage for Ethernet. Most Ethernet networks were based on standard coaxial cabling (commonly called Thinnet and Thicknet) at that time. This made Ethernet cable plants less expensive than Token Ring cable plants, which were based on proprietary IBM Type I cabling. The bus topology also simplified Ethernet cabling compared to Token Ring cabling, which required each ring to be physically looped. A Token Ring MAU enabled the creation of a collapsed ring topology. As mentioned previously, a collapsed ring simplifies the cable plant installation and lowers the cabling costs by centralizing all cable runs. The same could be accomplished with Ethernet by using hubs (sometimes called concentrators) and twisted pair cabling to create a star topology (a configuration called 10BASE-T). The cost of a Token Ring MAU was typically higher than the cost of an Ethernet hub. So, the total cost of a Token Ring solution (with or without MAUs) was typically higher than a comparable Ethernet solution. In the early days of LANs, most companies considered cost more important than other factors.

The Ethernet bus and star topologies required the medium to be shared, and a drawback of shared media is that it supports only half-duplex communication. So, one Ethernet node would transmit while all other nodes received. This limited aggregate throughput and created the need to arbitrate for media access. The arbitration method employed by Ethernet is based on the principle of fairness, which seeks to ensure that all nodes have equal access to the medium. The method of arbitration is called carrier sense multiple access with collision detection (CSMA/CD). ARCNET and Token Ring both employed half-duplex communication at that time, because they too were shared media implementations. ARCNET and Token Ring both employed arbitration based on token passing schemes in which some nodes could have higher priority than others. The CSMA/CD mechanism has very low overhead compared to token passing schemes. The tradeoffs for this low overhead are indeterminate throughput because of unpredictable collision rates and (as mentioned previously) the inability to achieve the maximum theoretical throughput.

As time passed, ARCNET and Token Ring lost market share to Ethernet. That shift in market demand was motivated primarily by the desire to avoid unnecessary complexity, to achieve higher throughput, and (in the case of Token Ring) to reduce costs. As Ethernet displaced ARCNET and Token Ring, it expanded into new deployment scenarios. As more companies became dependent upon Ethernet, its weaknesses became more apparent. The unreliability of Ethernet's coax cabling became intolerable, so Thinnet and Thicknet gave way to 10BASE-T. The indeterminate throughput of CSMA/CD also came into sharper focus. This and other factors created demand for deterministic throughput and line-rate performance. Naturally, the cost savings of Ethernet became less important to consumers as their demand for increased functionality rose in importance. So, 10BASE-T switches were introduced, and Ethernet's 10 Mbps line rate was increased to 100 Mbps (officially called 100BASE-T but commonly called Fast Ethernet and abbreviated as FE). FE hubs were less expensive than 10BASE-T switches and somewhat masked the drawbacks of CSMA/CD. 10BASE-T switches and FE hubs temporarily satiated 10BASE-T users' demand for improved performance.

FE hubs made it possible for Fiber Distributed Data Interface (FDDI) users to begin migrating to Ethernet. Compared to FDDI, FE was inexpensive for two reasons. First, the rapidly expanding Ethernet market kept prices in check as enhancements were introduced. Second, FE used copper cabling that was (at the time) significantly less expensive than FDDI's fiber optic cabling. FE hubs proliferated quickly as 10BASE-T users upgraded and FDDI users migrated. Newly converted FDDI users renewed the demand for deterministic throughput and line rate performance. They were accustomed to high performance because FDDI used a token passing scheme to ensure deterministic throughput. Around the same time, businesses of all sizes in all industries began to see LANs as mandatory rather than optional. Centralized file servers and e-mail servers were proliferating and changing the way businesses operated. So, FE hubs eventually gave way to switches capable of supporting both 10 Mbps and 100 Mbps nodes (commonly called 10/100 auto-sensing switches).

Tip

Auto-sensing is a common term that refers to the ability of Ethernet peers to exchange link level capabilities via a process officially named auto-negotiation. Note that Fibre Channel also supports auto-sensing.


10BASE-T switches had been adopted by some companies, but uncertainty about switching technology and the comparatively high cost of 10BASE-T switches prevented their widespread adoption. Ethernet switches cost more than their hub counterparts because they enable full-duplex communication. Full-duplex communication enables each node to transmit and receive simultaneously, which eliminates the need to arbitrate, which eliminates Ethernet collisions, which enables deterministic throughput and full line-rate performance. Another benefit of switching is the decoupling of transmission rate from aggregate throughput. In switched technologies, the aggregate throughput per port is twice the transmission rate. The aggregate throughput per switch is limited only by the internal switch design (the crossbar implementation, queuing mechanisms, forwarding decision capabilities, and so on) and the number of ports. So, as people began to fully appreciate the benefits of switching, the adoption rate for 10/100 auto-sensing switches began to rise. As switching became the norm, Ethernet left many of its limitations behind. Today, the vast majority of Ethernet deployments are switch-based.

Ethernet's basic media access control (MAC) frame format contains little more than the bare essentials. Ethernet assumes that most protocol functions are handled by upper layer protocols. This too reflects Ethernet's low overhead, high efficiency philosophy. As switching became the norm, some protocol enhancements became necessary and took the form of MAC frame format changes, new frame types, and new control signals.

One of the more notable enhancements is support for link-level flow control. Many people are unaware that Ethernet supports link-level flow control. Again reflecting its low overhead philosophy, Ethernet supports a simple back-pressure flow-control mechanism rather than a credit-based mechanism. Some perceive this as a drawback because back-pressure mechanisms can result in frame loss during periods of congestion. This is true of some back-pressure mechanisms, but does not apply to Ethernet's current mechanism if it is implemented properly. Some people remember the early days of Ethernet when some Ethernet switch vendors used a different back-pressure mechanism (but not Cisco). Many switches initially were deployed as high-speed interconnects between hub segments. If a switch began to experience transmit congestion on a port, it would create back pressure on the port to which the source node was connected. The switch did so by intentionally generating a collision on the source node's segment. That action resulted in dropped frames and lowered the effective throughput of all nodes on the source segment. The modern Ethernet back-pressure mechanism is implemented only on full-duplex links, and uses explicit signaling to control the transmission of frames intelligently on a per-link basis. It is still possible to drop Ethernet frames during periods of congestion, but it is far less likely in modern Ethernet implementations. We can further reduce the likelihood of dropped frames through proper design of the MAC sublayer components. In fact, the IEEE 802.3-2002 specification explicitly advises system designers to account for processing and link-propagation delays when implementing flow control. In other words, system designers can and should proactively invoke the flow-control mechanism rather than waiting for all receive buffers to fill before transmitting a flow control frame. A receive buffer high-water mark can be established to trigger flow-control invocation. 
Because Ethernet provides no mechanism for determining the round-trip time (RTT) of a link, system designers should take a conservative approach when setting the high-water mark. The unknown RTT leaves some residual risk of dropped frames. When frames are dropped, Ethernet maintains its low-overhead philosophy by assuming that an upper layer protocol will handle detection and retransmission of the dropped frames.
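The high-water mark reasoning above can be sketched in code. This is a hypothetical illustration, not an excerpt from any implementation or from the 802.3 specification: the buffer size, the assumed RTT, and the send_pause() hook are all invented for the example.

```python
# Hypothetical sketch: proactive flow-control invocation via a receive-buffer
# high-water mark. All constants and the send_pause() hook are assumptions
# made for illustration, not values from any specification.

MAX_FRAME_BYTES = 1538          # standard max frame plus inter-frame gap
LINK_RATE_BPS = 1_000_000_000   # 1 Gbps link
ASSUMED_RTT_SECONDS = 50e-6     # conservative worst-case RTT chosen by designer

# Bytes that can still arrive after a PAUSE frame is sent, while the PAUSE
# is in flight and being processed by the peer:
inflight_bytes = int(LINK_RATE_BPS / 8 * ASSUMED_RTT_SECONDS)

BUFFER_BYTES = 64 * 1024

# Invoke flow control early enough that all in-flight data still fits,
# including one maximum-size frame already being transmitted.
HIGH_WATER = BUFFER_BYTES - inflight_bytes - MAX_FRAME_BYTES

def on_frame_received(buffer_fill, frame_len, send_pause):
    """Account for a received frame; pause the peer at the high-water mark."""
    buffer_fill += frame_len
    if buffer_fill >= HIGH_WATER:
        send_pause()            # transmit a PAUSE frame before buffers overflow
    return buffer_fill
```

Because the designer cannot measure the RTT, overestimating ASSUMED_RTT_SECONDS simply lowers the high-water mark, trading a little buffer headroom for a lower probability of frame loss.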

Ethernet Throughput

As mentioned previously, Ethernet initially operated at only 10 Mbps. Over time, faster transmission rates came to market beginning with 100 Mbps (also called Fast Ethernet [FE]) followed by 1000 Mbps (GE). Each time the transmission rate increased, the auto-sensing capabilities of NICs and switch ports adapted. Today, 10/100/1000 auto-sensing NICs and switches are common. Ethernet achieved a transmission rate of 10 Gbps (called 10Gig-E and abbreviated as 10GE) in early 2003. Currently, 10GE interfaces do not interoperate with 10/100/1000 interfaces because 10GE does not support auto-negotiation. 10GE supports only full-duplex mode and does not implement CSMA/CD.

As previously mentioned, a detailed analysis of Ethernet data transfer rates is germane to modern storage networking. This is because the iSCSI, Fibre Channel over TCP/IP (FCIP), and Internet Fibre Channel Protocol (iFCP) protocols (collectively referred to as IP storage (IPS) protocols) are being deployed on FE and GE today, and the vast majority of IPS deployments will likely run on GE and 10GE as IPS protocols proliferate. So, it is useful to understand the throughput of GE and 10GE when calculating throughput for IPS protocols. The names GE and 10GE refer to their respective data bit rates, not to their respective raw bit rates. This contrasts with some other technologies, whose common names refer to their raw bit rates. The fiber optic variants of GE operate at 1.25 GBaud and encode 1 bit per baud to provide a raw bit rate of 1.25 Gbps. The control bits reduce the data bit rate to 1 Gbps. To derive ULP throughput, we must make some assumptions regarding framing options. Using a standard frame size (no jumbo frames), the maximum payload (1500 bytes), the 802.3 basic MAC frame format, no 802.2 header, and minimum inter-frame spacing (96 bit times), a total of 38 bytes of framing overhead is incurred. The ULP throughput rate is 975.293 Mbps.
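The 975.293 Mbps figure can be reproduced with a few lines of arithmetic. The constants below restate the framing assumptions from the text; the 38 bytes of overhead break down as preamble/SFD, two MAC addresses, the length/type field, the FCS, and the minimum inter-frame gap (96 bit times, or 12 bytes):

```python
# Reproduces the GE upper-layer-protocol (ULP) throughput figure from the
# text: 1500-byte payload, 38 bytes of per-frame overhead, 1 Gbps data bit rate.

PAYLOAD = 1500
# preamble/SFD + dest MAC + src MAC + length/type + FCS + inter-frame gap
OVERHEAD = 8 + 6 + 6 + 2 + 4 + 12    # = 38 bytes

DATA_BIT_RATE_MBPS = 1000

ulp_mbps = DATA_BIT_RATE_MBPS * PAYLOAD / (PAYLOAD + OVERHEAD)
print(f"{ulp_mbps:.3f} Mbps")        # 975.293 Mbps
```

The same 1500/1538 frame efficiency applies to the 10 Gbps variants discussed next, which is why their ULP throughput figures are simple multiples of this result.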

The copper variant of GE is somewhat more complex. It simultaneously uses all four pairs of wires in a Category 5 (Cat5) cable. Signals are transmitted on all four pairs in a striped manner. Signals are also received on all four pairs in a striped manner. Implementing dual function drivers makes full-duplex communication possible. Each signal operates at 125 MBaud. Two bits are encoded per baud to provide a raw bit rate of 1 Gbps. There are no dedicated control bits, so the data bit rate is also 1 Gbps. This yields the same ULP throughput rate as the fiber optic variants.

The numerous variants of 10GE each fall into one of three categories:

  • 10GBASE-X

  • 10GBASE-R

  • 10GBASE-W

10GBASE-X is a WDM-based technology that simultaneously uses four lambdas, operates at 3.125 GBaud per lambda, and encodes 1 bit per baud to provide a raw bit rate of 12.5 Gbps. The control bits reduce the data bit rate to 10 Gbps. Using the same framing assumptions used for GE, this yields an ULP throughput rate of 9.75293 Gbps.

10GBASE-R operates at 10.3125 GBaud and encodes 1 bit per baud to provide a raw bit rate of 10.3125 Gbps. The control bits reduce the data bit rate to 10 Gbps. Using the same framing assumptions used for GE, this yields the same ULP throughput rate as 10GBASE-X.

10GBASE-W is an exception. 10GBASE-W maps 10GE onto SONET STS-192c and SDH VC-4-64c for service provider applications. Only SONET STS-192c is discussed herein, on the basis that SDH VC-4-64c is not significantly different in the context of ULP throughput. The STS-192c baud rate of 9.95328 GBaud is lower than the 10GBASE-R PHY baud rate. As mentioned previously, SONET is an OSI Layer 1 technology. However, SONET incorporates robust framing to enable transport of many disparate OSI Layer 2 technologies. The 10GBASE-R PHY merely encodes data received from the MAC, whereas SONET introduces additional framing overhead not present in 10GBASE-R. So, in 10GBASE-W, the STS-192c payload rate represents the raw bit rate. The WAN interface sublayer (WIS) of the 10GBASE-W PHY is responsible for STS-192c framing. The WIS presents a raw bit rate of 9.5846 Gbps to the physical coding sublayer (PCS) of the 10GBASE-W PHY. 10GBASE-W control bits reduce the data bit rate to 9.2942 Gbps. Using the same framing assumptions used for GE, this yields an ULP throughput rate of 9.06456 Gbps.

Table 3-2 summarizes the baud, bit, and ULP throughput rates of the GE and 10GE variants.

Table 3-2. Ethernet Baud, Bit, and ULP Throughput Rates

Ethernet Variant    Baud Rate          Raw Bit Rate    Data Bit Rate   ULP Throughput
GE Fiber Optic      1.25 GBaud         1.25 Gbps       1 Gbps          975.293 Mbps
GE Copper           125 MBaud x 4      1 Gbps          1 Gbps          975.293 Mbps
10GBASE-X           3.125 GBaud x 4    12.5 Gbps       10 Gbps         9.75293 Gbps
10GBASE-R           10.3125 GBaud      10.3125 Gbps    10 Gbps         9.75293 Gbps
10GBASE-W           9.95328 GBaud      9.5846 Gbps     9.2942 Gbps     9.06456 Gbps
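As a cross-check, the ULP throughput column of Table 3-2 can be recomputed directly from the data bit rates, using the same 1500/1538 frame efficiency derived from the framing assumptions in the text:

```python
# Recomputes the ULP throughput column of Table 3-2 from the data bit rates,
# using the framing assumptions in the text: 1500-byte payload plus 38 bytes
# of framing overhead per frame.

FRAME_EFFICIENCY = 1500 / 1538

data_bit_rates_gbps = {
    "GE (fiber and copper)": 1.0,
    "10GBASE-X": 10.0,
    "10GBASE-R": 10.0,
    "10GBASE-W": 9.2942,
}

for variant, rate in data_bit_rates_gbps.items():
    print(f"{variant}: {rate * FRAME_EFFICIENCY:.5f} Gbps ULP throughput")
```

The recomputed values agree with the table to the precision shown, confirming that the only difference among the variants is the data bit rate delivered by the PHY.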


Note

The IEEE 802.3-2002 specification includes a 1 Mbps variant called 1BASE5 that was derived from an obsolete network technology called StarLAN. Discussion of 1BASE5 is omitted herein. StarLAN evolved to support 10 Mbps operation and was called StarLAN 10. The IEEE's 10BASE-T specification was partially derived from StarLAN 10.


To transport IPS protocols, additional framing overhead must be incurred. Taking iSCSI as an example, three additional headers are required: IP, TCP, and iSCSI. The TCP/IP section of this chapter discusses IPS protocol overhead.

Ethernet Topologies

Today, Ethernet supports all physical topologies. The original Ethernet I and Ethernet II specifications supported only the bus topology. When the IEEE began development of the first 802.3 specification, it decided to include support for the star topology. However, the communication model remained bus-oriented. Although the 802.3-2002 specification still includes the Thinnet and Thicknet bus topologies (officially called 10BASE2 and 10BASE5, respectively), they are obsolete in the real world. No other bus topologies are specified in 802.3-2002. The star topology is considered superior to the bus topology for various reasons. The primary reasons are easier cable plant installation, improved fault isolation, and (when using a switch) the ability to support full-duplex communication. The star topology can be extended by cascading Ethernet hubs (resulting in a hybrid topology). There are two classes of FE hub. Class I FE hubs cannot be cascaded because of CSMA/CD timing restrictions. Class II FE hubs can be cascaded, but no more than two are allowed per collision domain. Cascading is the only means of connecting Ethernet hubs. FE hubs were largely deprecated in favor of 10/100 switches. GE also supports half-duplex operation, but consumers have not embraced GE hubs. Instead, 10/100/1000 switches have become the preferred upgrade path. While it is technically possible to operate IPS protocols on half-duplex Ethernet segments, it is not practical because collisions can (and do) occur. So, the remaining chapters of this book focus on switch-based Ethernet deployments.

Unlike Ethernet hubs, which merely repeat signals between ports, Ethernet switches bridge signals between ports. For this reason, Ethernet switches are sometimes called multiport bridges. In Ethernet switches, each port is a collision domain unto itself. Also, collisions can occur only on ports operating in half-duplex mode. Because most Ethernet devices (NICs and switches alike) support full-duplex mode and auto-negotiation, most switch ports operate in full-duplex mode today. Without the restrictions imposed by CSMA/CD, all topologies become possible. There is no restriction on the manner in which Ethernet switches may be interconnected. Likewise, there is no restriction on the number of Ethernet switches that may be interconnected. Ethernet is a broadcast capable technology; therefore loops must be suppressed to avoid broadcast storms. As mentioned previously, this is accomplished via STP. The physical inter-switch topology will always be reduced to a logical cascade or tree topology if STP is enabled. Most switch-based Ethernet deployments have STP enabled by default. The remainder of this book assumes that STP is enabled unless stated otherwise.

A pair of modern Ethernet nodes can be directly connected using a twisted pair or fiber optic cable (crossover cable). The result is a PTP topology in which auto-negotiation occurs directly between the nodes. The PTP topology is obviously not useful for mainstream storage networking, but is useful for various niche situations. For example, dedicated heartbeat connections between clustered devices are commonly implemented via Ethernet. If the cluster contains only two devices, a crossover cable is the simplest and most reliable solution.

Ethernet Service and Device Discovery

There is no service discovery mechanism in Ethernet. No service location protocol is defined by the 802.2-1998, 802.3-2002, or 802.3ae-2002 specifications. Likewise, no procedure is defined for a node to probe other nodes to discover supported services. Reflecting its low-overhead philosophy, Ethernet assumes that ULPs will handle service discovery. Ethernet assumes the same for device discovery. ULPs typically determine the existence of other nodes via their own mechanisms. The ULP address is then resolved to an Ethernet address. Each ULP has its own method for resolving the Ethernet address of another node. The TCP/IP suite uses the Address Resolution Protocol (ARP) specified in RFC 826.

Note

A conceptually related protocol is the Reverse Address Resolution Protocol (RARP) specified in RFC 903. RARP provides a means for a device to resolve its own ULP address using its Ethernet address. RARP is operationally similar to ARP but is functionally different. RARP has no bearing on modern storage networks but is still referenced in some product documentation and is still used in some niche environments. RARP is mentioned here only for the sake of completeness, so you will not be confused if you encounter RARP in documentation or in operation. RARP was replaced by BOOTP and DHCP.


Each ULP has a reserved protocol identifier called an Ethertype. Originally, the Ethertype was not part of the 802.3 header, but it is now carried in the 802.3 header to identify the intended ULP within the destination node. The Ethertype field enables multiplexing of ULPs on Ethernet. Ethertypes could serve as well-known ports (WKPs) for the purpose of ULP discovery, but each ULP would need to define its own probe/reply mechanism to exploit the Ethertype field, and this is not the intended purpose of the field. Ethertypes are assigned and administered by the IEEE to ensure global uniqueness.
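For illustration, the sketch below parses the 14-byte MAC header of a frame to extract the Ethertype and thereby identify the intended ULP. The frame bytes are fabricated for the example; 0x0806 is the IEEE-assigned Ethertype for ARP. (Note that a value below 0x0600 in this field is interpreted as an 802.3 length rather than an Ethertype.)

```python
# Minimal sketch: extract the Ethertype from an Ethernet MAC header to
# demultiplex the frame to the intended ULP. The frame contents below are
# fabricated for illustration.

import struct

def parse_header(frame: bytes):
    """Return (destination MAC, source MAC, Ethertype) from a raw frame."""
    dst, src, ethertype = struct.unpack("!6s6sH", frame[:14])
    return dst.hex(":"), src.hex(":"), ethertype

# Fabricated frame: broadcast destination, example source, ARP Ethertype,
# padded to the 64-byte minimum frame size (FCS omitted for simplicity).
frame = bytes.fromhex("ffffffffffff" "0200deadbeef" "0806") + b"\x00" * 46

dst, src, ethertype = parse_header(frame)
print(hex(ethertype))   # 0x806 -> ARP
```

A receiving node would use this value to hand the payload to the matching protocol handler, for example dispatching 0x0800 to IPv4 and 0x0806 to ARP.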




Storage Networking Protocol Fundamentals
ISBN: 1587051605
Year: 2007
Pages: 196
Authors: James Long