Serial Storage Interconnects


Two major serial interconnects are used for storage today: Fibre Channel and SATA. In the rest of this chapter we'll explore how these interconnects work in storage networking products.

Fibre Channel Loops as an Interconnect for Network Storage

While most people think of Fibre Channel strictly as a storage networking technology, it has been implemented very successfully as a device interconnect inside Fibre Channel SAN subsystems. This section does not discuss Fibre Channel as a networking technology, but looks at its use as an interconnect technology.

Fibre Channel Topologies

Fibre Channel has three topologies: fabric, loop, and point-to-point. The most common Fibre Channel topology used in disk subsystems is the loop technology. Fibre Channel disk drives have interfaces designed to operate in loop topologies.

A Brief History of Fibre Channel

Fibre Channel was initially a joint research project started in the late 1980s by IBM, Sun Microsystems, and Hewlett Packard in an effort to develop a new switched high-speed networking backbone technology. From its inception, some of the early developers thought the technology could also be used to transport storage traffic.

In the early 1990s IBM started to develop a new storage interconnect called serial storage architecture (SSA) that would have greater scalability and throughput than parallel SCSI and other storage interconnects. IBM's goal was to own the technology and license it for a fee to the rest of the industry. Led by Seagate Technology, the leader in SCSI disk drives, a number of prominent storage companies started looking for an alternative technology to compete with IBM's SSA.

At roughly the same time, the research effort on Fibre Channel was losing traction. Gigabit Ethernet was emerging as the next obvious high-speed backbone technology, and the chances for Fibre Channel were not looking very good. When the Seagate consortium approached the Fibre Channel organization, they jointly agreed to work on adapting Fibre Channel to the needs of the storage industry. Fibre Channel's arbitrated loop resembles the operations of parallel SCSI fairly closely. By engaging a greater number of industry partners around the new, open Fibre Channel standard, Seagate was able to successfully isolate IBM's SSA initiative in the market.

NOTE

In my humble opinion, Seagate's efforts to work with the floundering Fibre Channel industry were pure genius. One of the main reasons its strategy worked was the fact that Fibre Channel was already on its way to becoming a standard, even though it did not have strong market backing. Regardless, a standard is a standard, and using an open industry standard gave Seagate and its allies an incredibly powerful marketing weapon against IBM and its proprietary technology (which was very good technology, by the way).


The first Fibre Channel standards were published in 1994. Unlike the SCSI and ATA standards bodies, the Fibre Channel standards organization had a fairly broad scope from the beginning. Most of the standards have very little to do with storage operations. Instead they deal with such topics as network signaling, switch operations, fabric services, and network addressing methods. Appendix B, "INCITS Storage Standards," lists many of the various subject areas under development by the INCITS T11 committee at http://www.t11.org.

Fibre Channel Interconnect Connections

Unlike parallel SCSI and ATA, the Fibre Channel interconnect was never designed to use cable connections. Instead, the physical interfaces on Fibre Channel disk drives were designed to connect directly to connectors on backplanes inside subsystems. Fibre Channel disk interfaces include both data and power connections in a single integrated connector, facilitating accurate removal and insertion. The electrical engineering challenges involved with hot swapping devices on Fibre Channel are far less than on parallel SCSI or ATA interconnects.

Whenever a device is removed from or added to the loop, the loop must reinitialize to establish addresses and priority. The loop topology in Fibre Channel is called an arbitrated loop, which means the devices on it arbitrate for access, just as they would on a parallel SCSI bus, as discussed in the earlier section "Parallel SCSI Addresses, Arbitration, and Priority."
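Conceptually, loop arbitration works much like the bus arbitration just mentioned: the ports requesting the loop compare priorities, and the highest-priority port wins. The short Python sketch below illustrates only that idea, assuming the convention that a numerically lower arbitrated loop physical address (AL_PA) carries higher priority; it does not model the actual arbitration primitives or initialization signaling.

# Illustrative sketch only: priority-based arbitration among requesting ports.
# Assumption: a numerically lower AL_PA value means higher arbitration priority.

def arbitrate(requesting_al_pas):
    """Return the AL_PA that wins arbitration, or None if no port is requesting."""
    if not requesting_al_pas:
        return None
    return min(requesting_al_pas)   # the lowest AL_PA value wins

# Three ports arbitrate at the same time; the port with AL_PA 0x01 wins.
print(hex(arbitrate([0xE8, 0x01, 0x72])))   # -> 0x1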

With any arbitration scheme, there has to be a way to break deadlocks between competing devices. Like parallel SCSI buses, Fibre Channel loops use address priorities to break these deadlocks. However, unlike parallel SCSI, there are several ways addresses can be assigned in a Fibre Channel loop. The Fibre Channel loop initialization process (LIP) is designed to ensure loop addresses are correctly established.

Because Fibre Channel disk drive interfaces were designed to plug into subsystem backplanes, Fibre Channel HBAs are not designed to attach to internal drives within a computer system. That is one of the reasons why internal parallel SCSI disks are often used to store the operating system on servers that use the SAN for application data only.

NOTE

Depending on the HBA, you might actually be able to connect a few Fibre Channel disk drives inside a server system. Some HBAs have a little three-pin connector on them for connecting a set of wires to a small surrogate backplane doohickey called a riser card. It merges the signal from the HBA with a standard PC power connector on one side and connects to the disk drive with the standard backplane connector on the other side. Oh yes, you'll also need a terminator to stick on the riser card. Oh yes, you can also chain these things together to make a loop. Oh yes, you will never get any support for any of this, but it might be a lot of fun if you don't have anything better to do with your time.


The Fibre Channel loop topology supports a maximum of 127 target addresses. Given SCSI's target/LUN addressing, there can theoretically be more than a thousand devices on a single loop. This would probably not make sense for any application, but it points out the scalability advantages of Fibre Channel compared to other interconnects.
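As a rough back-of-the-envelope illustration (both numbers below are assumptions chosen only for the example, not values from the standard):

# Hypothetical scaling example for a single loop.
usable_targets = 126    # assumes one of the 127 loop addresses is taken by the controller
luns_per_target = 8     # assumed number of logical units presented by each target
print(usable_targets * luns_per_target)   # -> 1008 addressable logical units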

Fibre Channel disk drives are dual-ported, which means they can connect to two different internal controllers for high availability. This feature is not obvious when looking at the drive because both ports are incorporated into the drive's single connector.

Fibre Channel Performance and Applications

Fibre Channel disk drives have unrivaled performance characteristics, having the fastest rotational speeds and the lowest seek times. While most Fibre Channel disk drives have 3.5-inch form factors, the fastest 15,000 rpm drives are sometimes designed for smaller 2.5-inch form factors. This is done to keep the drives from overheating.

Applications for the Fibre Channel interconnect are similar to those for parallel SCSI: transaction processing, science and engineering, enterprise-level data sharing, and film, multimedia, and graphics applications.

Fibre Channel Interconnect Configurations

Because Fibre Channel drives are not designed to be connected inside systems, the only meaningful configuration for the Fibre Channel interconnect is inside a SAN disk subsystem.

NOTE

Of course, it is possible to use Fibre Channel to connect storage to a NAS system. The issue is whether this connection should be called a SAN or an internal storage interconnect. I would argue that a NAS system with a Fibre Channel "back end" uses a SAN and not an interconnect, but this is simply word shuffling. For instance, the Network Appliance Filers use Fibre Channel connectivity in the form of optical and copper cabling to connect their storage shelves. While all this is integrated as a complete package, it is a "head" with attached storage. In fact, their ATA-based NAS systems use Fibre Channel cables to connect to ATA storage shelves. I don't know what you'd call it, but I call that a small Fibre Channel SAN with an ATA device interconnect.


Fibre Channel Disk Drives in JBOD and RAID Subsystems

Fibre Channel JBOD subsystems are fairly easy to understand. As with parallel SCSI, the host initiator communicates directly with a logical unit on a Fibre Channel drive inside the subsystem. However, unlike parallel SCSI, the protocol payload does not have to be converted from a serial to a parallel transmission. The only work the controller has to do is map external SAN addresses to internal interconnect addresses.

As with parallel SCSI, Fibre Channel RAID subsystems use virtualization to subdivide and aggregate storage on internal disk drives. Again, the SCSI addresses known to the internal controllers likely won't be the same as those used by host initiators. The host initiator establishes a connection with a logical unit in the subsystem controller, and an internal initiator in the subsystem controller finishes the work through multiple connections to Fibre Channel drives.
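As a purely hypothetical sketch of this address-mapping idea, a controller might translate a host-visible logical block address into a member drive and an internal block address along the following lines. The stripe depth, drive names, and simple striping scheme are illustrative assumptions, not anything taken from the Fibre Channel or SCSI standards.

# Hypothetical sketch: translating a host-visible LBA to an internal drive and LBA
# in a simple striped virtual volume. Real RAID controllers are far more
# sophisticated; this only illustrates the address-mapping concept.

STRIPE_BLOCKS = 128                     # assumed stripe depth, in blocks
member_drives = ["al_pa_0xE8", "al_pa_0xE4", "al_pa_0xE2", "al_pa_0xE1"]

def map_host_lba(host_lba):
    stripe = host_lba // STRIPE_BLOCKS                    # which stripe the block falls in
    offset = host_lba % STRIPE_BLOCKS                     # offset within that stripe
    drive = member_drives[stripe % len(member_drives)]    # rotate stripes across drives
    internal_lba = (stripe // len(member_drives)) * STRIPE_BLOCKS + offset
    return drive, internal_lba

print(map_host_lba(1000))   # -> ('al_pa_0xE1', 232)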

Figure 7-11 illustrates the basic traffic flows between a SAN host and a Fibre Channel RAID subsystem.

Figure 7-11. A Fibre Channel RAID Subsystem


Retooling ATA Storage with Serial ATA

One of the most interesting developments in storage device interconnects is SATA. The basic concept of SATA is simple: it is a revision of ATA that retains the logical storage protocols and access methods of ATA but replaces the parallel bus with serial point-to-point connections. In a way it's like using SCSI-3 logic with Fibre Channel, except in this case the ATA logic is not abstracted from the connecting layer as a separate standard. SATA logic and connectivity are very much joined at the hip.

SATA is intended to be implemented as PC system storage that connects primarily to PC motherboards. In fact, the primary driver for the development of SATA is Intel, which would benefit from streamlined I/O processing, smaller motherboard connectors, and reduced power requirements for PC systems. SATA takes a major step in the continued miniaturization and efficiency of PC systems by significantly shrinking the connectors and cables used.

Point-to-Point Connections

SATA gets rid of the arcane addressing of ATA's master/slave designation by allowing only one device per channel in a point-to-point connection. While this simplifies the matter of configuring disk drives, it's not optimal for storage networking operations. With parallel ATA, two channels support four devices, whereas SATA requires four channels for the same four drives.

Initially, most SATA-equipped PC motherboards had only two SATA channels, which made it virtually impossible to expand SATA storage in a PC system. If SATA drives are being used strictly as boot drives, expandability is not necessarily a problem, but it could certainly be a problem if they are being used to store application data. The dearth of SATA channels on motherboards shouldn't last too long, because motherboard manufacturers, especially Intel, will likely start increasing the number of onboard SATA connections relatively soon.

SAN subsystems with SATA drives also need to design around the one-to-one channel limitation. The trick is understanding how much excess channel capacity to put in a subsystem if each additional drive needs an additional channel. This might not be a major problem for small to medium-sized subsystems, but it could be a serious obstacle for larger subsystems. All new storage technologies have problems initially before working out the details, and SATA will be no different.

The Short, Confusing History of SATA

Intel announced the formation of the Serial ATA Working Group early in 2000. Besides Intel, invited participants included APT Technologies, Dell, IBM, Maxtor, Quantum, and Seagate. In December of 2000 they announced the availability of what was referred to as Draft Specification 1.0, defining the SATA interface. In February of 2002, they announced the formation of another Working Group referred to as SATA II, intended to extend SATA 1.0 technology for server and networking applications.

Then in June 2002, Intel, HP, and Dell announced a plan to work together on compatible interfaces between Serial Attached SCSI (SAS) and SATA II. In January 2003, the SCSI Trade Association (a consortium of parallel SCSI vendors) and the SATA II Working Group announced a collaboration to enable compatible operations between SAS and SATA.

NOTE

SAS is not discussed in this book other than its relationship to SATA in this chapter. I've made the mistake of writing about "the next great technology" before (InfiniBand or DAFS, anybody?). It's not clear whether the world needs and will buy products using an additional and redundant storage interconnect, although there do seem to be advantages to mixing SATA and SAS on the same interconnect.


In February 2003 the Serial ATA 1.0a specification was released. Then, in November 2003, the SATA II Working Group released Revision 1.1 of the Serial ATA 1.0a specification, which included such features as Native Command Queuing and the Port Multiplier.

Despite all these releases from the two SATA Working Groups, at the time this chapter was written, no SATA standards had been published by an accredited industry standards body. That said, it appears that significant work has been done by the T13 ATA standards committee to add Serial ATA to the anticipated ATA/ATAPI-7 standard as the third of three volumes in that publication.

NOTE

It's only natural that companies involved in the development of new technology want to promote it and engage other industry members to participate. However, in this case it appears the promotional strategy was "Ready, fire, aim!" At least they don't try to hide it in their specification. When you download the 1.0a spec, the very first thing you read on page 1 is the following:

"This 1.0a revision of the Serial ATA/ High Speed Serialized AT Attachment specification consists of the 1.0 revision of the specification with the following errata incorporated: 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 29, 30, 31, 32, 33, 34, 35, 36, 38, 40"

It's amazing what a little time can do for the development of new technology! A risk for SATA moving forward through both the T10 SCSI committee (with SAS) and the T13 ATA committee is that there will be inconsistencies between two independent standards that are not easily resolved, forcing breakage in one place or another.


Performance Expectations for SATA

The first version of SATA has burst transfer rates of 150 MBps, which is not far off Fibre Channel's 200 MBps or parallel SCSI's 160 MBps. Future versions are expected to have burst transfer rates of 300 MBps. While burst transfer rates are important, they are only part of the performance landscape. To have competitive performance with Fibre Channel and parallel SCSI, SATA needs to overcome its ATA legacy of lacking command queuing and overlapped I/Os.

Fortunately, SATA developers seem to have solved some of these problems. For starters, when each drive has a dedicated channel, there is no shared bus to contend for: if all storage access goes over independent channels, I/Os to different drives are inherently overlapped.

The SATA II 1.1 enhancements revision includes something called Native Command Queuing (NCQ), apparently named to distinguish it from ATA's mediocre command queuing capabilities. Similar to SCSI's command queuing mechanism (see Chapter 6, "SCSI Storage Fundamentals and SAN Adapters"), NCQ reorders commands for more efficient operations in addition to aggregating responses and reducing the number of interrupts on host systems.
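As a simplified illustration of why reordering helps (actual NCQ is implemented in drive firmware and also accounts for rotational position), the sketch below reorders queued requests into an elevator-style sweep by logical block address instead of servicing them in arrival order:

# Illustrative sketch only: service queued requests in LBA order (an elevator-style
# sweep) rather than arrival order, reducing total seek distance.

def reorder_elevator(queued_lbas, head_position):
    ahead = sorted(lba for lba in queued_lbas if lba >= head_position)
    behind = sorted((lba for lba in queued_lbas if lba < head_position), reverse=True)
    return ahead + behind   # sweep forward first, then service the remainder

print(reorder_elevator([9000, 150, 5200, 400, 7700], head_position=1000))
# -> [5200, 7700, 9000, 400, 150]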

Native command queuing will likely make a significant difference over time. The gating factor for now appears to be host software. Most existing Windows software does not take advantage of tagged I/O. Some Linux systems apparently do, however. Intel believes customers who purchase systems with its hyper-threading technology will be able to use command queuing to a limited degree.

SATA Port Multiplier

Another enhancement to SATA is a device called a port multiplier. The idea is fairly simple: the port multiplier provides connectivity to multiple devices over a single channel. The specification provides support for up to 15 devices to be connected to a single port multiplier. There is no provision in the specification for establishing priorities for drives connected through a port multiplier; those decisions are explicitly left to the implementation.

At first this sounds like it would solve most of the scalability problems SATA might have, but there are always a few pesky implementation details to work through. For instance, two modes of switching are defined for SATA host controllers working with port multipliers. One kind of switching is called command-based switching. It allows the host to have outstanding commands on only one device at a time. This is essentially a mode that restricts overlapped I/O and could have a detrimental impact on the system's I/O performance. NCQ would probably still work, but on only one drive at a time.

The other type of switching is called FIS-based switching. (FIS is the data payload in SATA communications and stands for frame information structure.) It is obvious from reading the specification that this is one of those things that can be described facetiously as "a simple matter of programming." In other words, there are no estimates of when this can be implemented on a system in your business. Just as command queuing went unused in ATA since the ATA/ATAPI-4 specification, FIS-based switching could get the same kind of response from the industry, especially if it involves getting the attention of large software developers that have other dragons to slay.
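To make the distinction concrete, the hypothetical sketch below models only the queuing restriction; the class and its behavior are illustrative, not anything defined in the SATA specifications. Under command-based switching a new command is held off whenever a different device behind the multiplier already has commands outstanding, whereas FIS-based switching lets commands be outstanding on several devices at once.

# Hypothetical model contrasting the two port multiplier switching modes.

class PortMultiplierModel:
    def __init__(self, fis_based_switching):
        self.fis_based = fis_based_switching
        self.outstanding = {}   # device port number -> commands currently outstanding

    def issue(self, device_port):
        busy_elsewhere = any(port != device_port and count > 0
                             for port, count in self.outstanding.items())
        if not self.fis_based and busy_elsewhere:
            return "blocked: another device already has outstanding commands"
        self.outstanding[device_port] = self.outstanding.get(device_port, 0) + 1
        return "command issued to device %d" % device_port

pm = PortMultiplierModel(fis_based_switching=False)   # command-based switching
print(pm.issue(0))   # -> command issued to device 0
print(pm.issue(3))   # -> blocked: another device already has outstanding commands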

SATA Port Selector

The converse of supporting multiple devices with a port multiplier is allowing multiple controllers to access a single disk drive for redundancy purposes. An enhancement to SATA called the port selector provides a mechanism for doing that. The port selector is not the same as dual porting in Fibre Channel in that the drive itself does not have two ports, but the port selector acts as a miniature "front end" to provide that capability.

It's not clear yet whether this technology will be used or how much it would cost to implement, but it does show the lengths to which the SATA Working Group has gone to make its technology applicable for applications outside the desktop PC. With the port selector in place, the SATA interconnect would be second only to the Fibre Channel interconnect for supporting fault-tolerant dual pathing inside the subsystem.
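As a conceptual sketch only (a real port selector operates at the physical and protocol level, not in host software), the idea can be modeled as a small front end that forwards commands from whichever controller is currently active and switches paths when that controller fails:

# Illustrative model of the port selector concept: two controller-side ports,
# one drive, and a fail-over that changes which path is active.

class PortSelectorModel:
    def __init__(self):
        self.active = "controller_a"   # assume controller A starts as the active path

    def fail_over(self):
        self.active = "controller_b" if self.active == "controller_a" else "controller_a"

    def route(self, controller, command):
        if controller != self.active:
            return controller + ": inactive path, command not forwarded"
        return controller + ": '" + command + "' forwarded to the drive"

selector = PortSelectorModel()
print(selector.route("controller_a", "read LBA 2048"))   # forwarded to the drive
selector.fail_over()                                      # controller A fails
print(selector.route("controller_b", "read LBA 2048"))   # forwarded on the new path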

SATA Storage Network Applications and Configurations

SATA is a technology that will be much more capable than its predecessor, ATA, for many storage network applications. It is expected to cost less than either Fibre Channel or parallel SCSI. It also will deliver performance capable of handling moderately heavy I/O workloads such as web serving, non-transaction-processing databases, and streaming media. SATA is expected to be used effectively to support the I/O requirements of most Linux and Windows servers.

SATA Disk Drives Within Systems

Like parallel SCSI, SATA will be deployed both within systems, as boot disks and NAS storage, and in SAN disk subsystems. Figure 7-12 shows a NAS system with a four-port internal SATA RAID controller running RAID 5.

Figure 7-12. A SATA-Based NAS System
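As a side note on the RAID 5 arrangement in Figure 7-12, the minimal sketch below shows the idea behind parity protection, assuming simple XOR parity; real controllers rotate parity across the member drives and operate on much larger blocks.

# Illustrative sketch: RAID 5-style parity is the XOR of the corresponding data
# blocks, so a failed drive's block can be rebuilt by XOR-ing the survivors.

def xor_blocks(blocks):
    result = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            result[i] ^= byte
    return bytes(result)

d0, d1, d2 = b"\x11\x22", b"\x33\x44", b"\x55\x66"
parity = xor_blocks([d0, d1, d2])          # parity block for a 3+1 stripe
rebuilt_d1 = xor_blocks([d0, d2, parity])  # reconstruct d1 after a drive failure
print(rebuilt_d1 == d1)                    # -> True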


SATA in SAN Disk Subsystems

In addition to the port multiplier and port selector functions that are intended to give SATA more of an entry in storage networking applications, the SATA Working Groups have worked on defining backplane interfaces to work with SATA drives. There is little doubt that these drives will be better suited to adoption into disk subsystems than either parallel SCSI or ATA.

Because they are not SCSI drives, physically or logically, it is not possible to implement SATA in a way that allows SAN initiators to establish SCSI connections directly with SATA drives the way parallel SCSI and Fibre Channel can. SATA drives could still be used in a JBOD-like way, where individual drives are exported for use by individual servers, but the disk subsystem controller would have to perform the full target/logical unit functions before reinitiating commands internally. There is no way the performance of such a system could rival either parallel SCSI or Fibre Channel.

The ability to use backplane connector technology with SATA makes hot swapping much easier than with either ATA or parallel SCSI drives. In addition, SATA port selector technology could potentially be integrated into the backplane circuitry to provide integrated multipath protection from controller failures.

Figure 7-13 shows a hypothetical SATA disk subsystem with port selector technology integrated into a five-disk RAID backplane.

Figure 7-13. A SATA RAID Subsystem with Integrated Port Selector Technology for Dual Pathing



