A storage protocol defines the parameters of communication between storage devices and storage controllers. Storage protocols fall into one of two categories: file-oriented or block-oriented. File-oriented protocols (also known as file-level protocols) read and write variable-length files. Files are segmented into blocks before being stored on disk or tape. Block-oriented protocols (also known as block-level protocols) read and write individual fixed-length blocks of data. For example, when a client computer uses a file-level storage protocol to write a file to a disk contained in a server, the server first receives the file via the file-level protocol and then invokes the block-level protocol to segment the file into blocks and write the blocks to disk. File-level protocols are discussed in more detail in a subsequent section of this chapter. The three principal block-level storage protocols in use today are advanced technology attachment (ATA), small computer system interface (SCSI), and single-byte command code set (SBCCS).

ATA

ATA is an open-systems standard originally started by the Common Access Method (CAM) committee and later standardized by the ANSI X3 committee in 1994. Several subsequent ATA standards have been published by ANSI. The Small Form Factor (SFF) Committee has published several enhancements, which have been included in subsequent ANSI standards. Each ANSI ATA standard specifies a block-level protocol, a parallel electrical interface, and a parallel physical interface. ATA operates as a bus topology and allows up to two devices per bus. Many computers contain two or more ATA buses. The first ANSI ATA standard is sometimes referred to as ATA-1 or Integrated Drive Electronics (IDE). Many updates to ATA-1 have focused on electrical and physical interface enhancements. The ANSI ATA standards include ATA-1, ATA-2 (also known as enhanced IDE [EIDE]), ATA-3, ATA/ATAPI-4, ATA/ATAPI-5, and ATA/ATAPI-6. The current work in progress is ATA/ATAPI-7.
Early ATA standards supported only hard disk commands, but the ATA Packet Interface (ATAPI) introduced SCSI-like commands that allow CD-ROM and tape drives to operate on an ATA electrical interface. These parallel-interface standards are sometimes referred to collectively as parallel ATA (PATA). The serial ATA (SATA) Working Group, an industry consortium, published the SATA 1.0 specification in 2001. ANSI is incorporating SATA 1.0 into ATA/ATAPI-7. The SATA Working Group continues other efforts, including minor enhancements to SATA 1.0 that might not be included in ATA/ATAPI-7, the development of SATA II, and, most notably, a collaborative effort with the ANSI T10 subcommittee to "align" SATA II with the Serial Attached SCSI (SAS) specification. The future of serial ATA standards is unclear in light of so many efforts, but it is clear that serial ATA technologies will proliferate.

ATA devices have integrated controller functionality. Computers that contain ATA devices communicate with the devices via an unintelligent electrical interface (sometimes mistakenly called a controller) that essentially converts electrical signals between the system bus and the ATA bus. The ATA/ATAPI command set is implemented in software. This means that the host central processing unit (CPU) shoulders the processing burden associated with storage I/O. The hard disks in most desktop and laptop computers implement ATA. ATA does not support as many device types or as many advanced communication features as SCSI.

The ATA protocol typically is not used in storage networks, but ATA disks often are used in storage subsystems that connect to storage networks. These storage devices act as storage protocol converters by speaking SCSI on their SAN interfaces and ATA on their internal ATA bus interfaces. The primary benefit of using ATA disks in storage subsystems is cost savings. ATA disk drives historically have cost less than SCSI disk drives for several reasons.
SCSI disk drives typically have a higher mean time between failures (MTBF) rating, which means that they are more reliable. Also, SCSI disk drives historically have provided higher performance and higher capacity. ATA disk drives have gained significant ground in these areas, but still tend to lag behind SCSI disk drives. Of course, these features drive the cost of SCSI disk drives higher. Because the value of data varies from one application to the next, it makes good business sense to store less valuable data on less costly storage devices. Thus, ATA disks increasingly are deployed in SAN environments to provide primary storage to comparatively low-value applications. ATA disks also are being used as first-tier media in new backup/restore solutions, whereby tapes are used as second-tier media for long-term archival or off-site storage. The Enhanced Backup Solutions Initiative (EBSI) is an industry effort to develop advanced backup techniques that leverage the relatively low cost of ATA disks. Figure 1-4 illustrates an ATA-based storage subsystem connected to an FC-SAN.

Figure 1-4. ATA-Based Storage Connected to FC-SAN
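To make the idea of a storage protocol converter more concrete, the following is a minimal Python sketch of how a subsystem might map a SCSI read or write request arriving on its SAN interface onto an ATA request for an internal disk. The function name `scsi10_to_ata` and the returned dictionary are hypothetical, and the sketch assumes that the SCSI logical block size equals the ATA sector size. Only the READ(10) and WRITE(10) commands are handled; a real converter must also translate status, errors, and many other commands.

```python
import struct

# Opcodes from the SCSI block command set and the ATA command set.
SCSI_READ_10, SCSI_WRITE_10 = 0x28, 0x2A
ATA_READ_DMA_EXT, ATA_WRITE_DMA_EXT = 0x25, 0x35

OPCODE_MAP = {SCSI_READ_10: ATA_READ_DMA_EXT, SCSI_WRITE_10: ATA_WRITE_DMA_EXT}


def scsi10_to_ata(cdb: bytes):
    """Translate a 10-byte READ(10)/WRITE(10) CDB into a simplified ATA request.

    Hypothetical helper: returns a dict, or None when no ATA command is needed.
    Assumes the SCSI logical block size equals the ATA sector size.
    """
    if len(cdb) != 10:
        raise ValueError("expected a 10-byte CDB")
    # CDB layout: opcode, flags, 32-bit LBA, group number,
    # 16-bit transfer length, control (all big-endian).
    opcode, _flags, lba, _group, length, _control = struct.unpack(">BBIBHB", cdb)
    if opcode not in OPCODE_MAP:
        raise ValueError(f"unsupported opcode 0x{opcode:02X}")
    if length == 0:
        # A zero transfer length means "transfer no blocks" in READ(10) and
        # WRITE(10), whereas a zero sector count means 65,536 sectors in the
        # ATA EXT commands, so the value must not be passed through as-is.
        return None
    return {"command": OPCODE_MAP[opcode], "lba": lba, "sector_count": length}


# Example: read 8 blocks starting at logical block 2048.
read_cdb = struct.pack(">BBIBHB", SCSI_READ_10, 0, 2048, 0, 8, 0)
print(scsi10_to_ata(read_cdb))
```

The zero-length case shows why such a device is a converter and not a simple pass-through: the same field value carries different meanings on the two sides.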
SCSI

SCSI is an open-systems standard originally started by Shugart Associates as the Shugart Associates Systems Interface (SASI) in 1981 and later standardized by the ANSI X3 committee in 1986. Each early SCSI standard specified a block-level protocol, a parallel electrical interface, and a parallel physical interface. These standards are known as SCSI-1 and SCSI-2. Each operates as a bus topology capable of connecting 8 and 16 devices, respectively. The SCSI-3 family of standards separated the physical interface, electrical interface, and protocol into separate specifications. The protocol commands are separated into two categories: primary and device-specific. Primary commands are common to all types of devices, whereas device-specific commands enable operations unique to each type of device. The SCSI-3 protocol supports a wide variety of device types and transmission technologies. The supported transmission technologies include updated versions of the SCSI-2 parallel electrical and physical interfaces in addition to many serial interfaces. Even though most of the mapping specifications for transport of SCSI-3 over a given transmission technology are included in the SCSI-3 family of standards, some are not. An example is the iSCSI protocol, which is specified by the IETF.

Most server and workstation computers that employ the DAS model either contain SCSI devices attached via an internal SCSI bus or access SCSI devices contained in specialized external enclosures. In the case of external DAS, SCSI bus and Fibre Channel point-to-point connections are common. Computers that access SCSI devices typically implement the SCSI protocol in specialized hardware generically referred to as a storage controller. When the SCSI protocol is transported over a traditional parallel SCSI bus, a storage controller is called a SCSI adapter or SCSI controller.
If a SCSI adapter has an onboard CPU and memory, it can control the system bus temporarily, and is called a SCSI host bus adapter (HBA). When the SCSI protocol is transported over a Fibre Channel connection, the storage controller always has a CPU and memory, and is called a Fibre Channel HBA. When a SCSI HBA or Fibre Channel HBA is used, most storage I/O processing is offloaded from the host CPU. When the SCSI protocol is transported over TCP/IP, the storage controller may be implemented via software drivers using a standard network interface card (NIC) or a specialized NIC called a TCP offload engine (TOE), which has a CPU and memory. As its name implies, a TOE offloads TCP processing from the host CPU. Some TOEs also implement iSCSI logic to offload storage I/O processing from the host CPU.

Note: A Fibre Channel point-to-point connection is very similar to a Fibre Channel arbitrated loop with only two attached nodes.

All SCSI devices are intelligent, but SCSI operates as a master/slave model. One SCSI device (the initiator) initiates communication with another SCSI device (the target) by issuing a command, to which a response is expected. Thus, the SCSI protocol is half-duplex by design and is considered a command/response protocol. The initiating device is usually a SCSI controller, so SCSI controllers typically are called initiators. SCSI storage devices typically are called targets. That said, a SCSI controller in a modern storage array acts as a target externally and as an initiator internally. Also note that array-based replication software requires a storage controller in the initiating storage array to act as initiator both externally and internally. So it is important to consider the context when discussing SCSI controllers. The SCSI parallel bus topology is a shared medium implementation, so only one initiator/target session can use the bus at any one time. Separate sessions must alternate accessing the bus.
This limitation is removed by newer serial transmission facilities that employ switching techniques. Moreover, the full-duplex nature of switched transmission facilities enables each initiator to participate in multiple simultaneous sessions with one or more targets. Each session employs half-duplex communication, but sessions are multiplexed by the transmission facilities. So, an initiator can issue a command to one target while simultaneously receiving a response from another target. The majority of open-systems storage networks being deployed are Fibre Channel and TCP/IP networks used to transport SCSI traffic. Those environments are the primary focus of this book. Figure 1-5 illustrates a SCSI-based storage subsystem connected to an FC-SAN.

Figure 1-5. SCSI-Based Storage Connected to FC-SAN
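The initiator and target roles described above, including the dual role of a storage array controller, can be illustrated with a small Python model. This is only a sketch under stated assumptions: the class names (`Disk`, `ArrayController`, `Host`), the `Command` and `Response` types, and the simple striping rule used to pick an internal disk are all hypothetical and are not taken from any SCSI standard.

```python
from dataclasses import dataclass


@dataclass
class Command:
    op: str            # "READ" or "WRITE"
    lba: int           # logical block address
    data: bytes = b""  # payload for writes


@dataclass
class Response:
    status: str
    data: bytes = b""


class Disk:
    """Pure target: it answers commands and never initiates any."""

    def __init__(self, block_size: int = 512):
        self.block_size = block_size
        self.blocks = {}

    def handle(self, cmd: Command) -> Response:
        if cmd.op == "WRITE":
            self.blocks[cmd.lba] = cmd.data
            return Response("GOOD")
        if cmd.op == "READ":
            # Unwritten blocks read back as zeros in this model.
            return Response("GOOD", self.blocks.get(cmd.lba, bytes(self.block_size)))
        return Response("CHECK CONDITION")


class ArrayController:
    """Target toward the host, initiator toward the internal disks."""

    def __init__(self, disks):
        self.disks = disks

    def handle(self, cmd: Command) -> Response:
        # Acting as target: accept the host's command.
        # Acting as initiator: issue a new command to one internal disk.
        # The modulo striping below is an illustrative assumption.
        disk = self.disks[cmd.lba % len(self.disks)]
        inner = Command(cmd.op, cmd.lba // len(self.disks), cmd.data)
        return disk.handle(inner)


class Host:
    """Initiator: issues one command and waits for its response."""

    def __init__(self, target):
        self.target = target

    def issue(self, cmd: Command) -> Response:
        return self.target.handle(cmd)


disks = [Disk(), Disk()]
host = Host(ArrayController(disks))
host.issue(Command("WRITE", 5, b"payload"))
print(host.issue(Command("READ", 5)).data)
```

Each call to `issue` is one command/response exchange, which mirrors the half-duplex nature of a single session. The host never sees the internal disks; it sees only the controller in its target role.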
SCSI parallel bus interfaces have one important characteristic: their ability to operate asynchronously or synchronously. Asynchronous mode requires an acknowledgment for each outstanding command before another command can be sent. Synchronous mode allows multiple commands to be issued before receiving an acknowledgment for the first command issued. The maximum number of outstanding commands is negotiated between the initiator and the target. Synchronous mode allows much higher throughput. Despite the similarity of this mechanism to the windowing mechanism of TCP, this mechanism is implemented by the SCSI electrical interface (not the SCSI protocol).

Another important point is the contrasting meaning of the word synchronous in the context of the SCSI parallel bus versus its meaning in the context of long-distance storage replication. In the latter context, synchronous refers not to the mode of communication, but to the states of the primary and secondary disk images. The states are guaranteed to be synchronized when the replication software is operating in synchronous mode. When a host (acting as SCSI initiator) sends a write command to the primary storage device, the primary storage device (acting as SCSI target) caches the data. The primary storage device (acting as SCSI initiator) then forwards the data to the secondary storage device at another site. The secondary storage device (acting as SCSI target) writes the data to disk and then sends acknowledgment to the primary storage device, indicating that the command completed successfully. Only after receiving acknowledgment does the primary storage device (acting as SCSI target) write the data to disk and send acknowledgment of successful completion to the initiating host. Because packets can be lost or damaged in transit over long distances, the best way to ensure that both disk images are synchronized is to expect an acknowledgment for each request before sending another request.
Using this method, the two disk images can never be more than one request out of sync at any point in time.

Tip: The terms synchronous and asynchronous should always be interpreted in context. The meanings of these terms often reverse from one context to another.

SBCCS

SBCCS is a generic term describing the mechanism by which IBM mainframe computers perform I/O using single-byte commands. IBM mainframes conduct I/O via a channel architecture. A channel architecture comprises many hardware and software components including channel adapters, adapter cables, interface assemblies, device drivers, I/O programming interfaces, I/O units, channel protocols, and so on. IBM channels come in two flavors: byte multiplexer and block multiplexer. Channels used for storage employ block multiplexer communication. I/O units used for storage are called disk control units (CU). Mainframes communicate with CUs via the channel protocol, and CUs translate channel protocol commands into storage I/O commands that the storage device (for example, disk or tape drive) can understand. This contrasts with ATA and SCSI operations, wherein the host initiates the storage I/O commands understood by the storage devices. Figure 1-6 illustrates this contrast.

Figure 1-6. Mainframe Storage Channel Communication
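The asynchronous and synchronous modes of the SCSI parallel bus described earlier, and the analogous DCI and DS channel protocols covered next, differ in how many commands may be outstanding before an acknowledgment is required. The following Python sketch is a simple timing model of that difference, not a representation of any electrical interface. The function name `completion_time`, its parameters, and the abstract time units are assumptions chosen for illustration.

```python
from collections import deque


def completion_time(commands: int, window: int, send_time: int, ack_delay: int) -> int:
    """Return the time at which the last acknowledgment arrives.

    window=1 models the interlocked case (one outstanding command at a time);
    window>1 models the case where several commands may be outstanding.
    All times are in arbitrary units.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    clock = 0
    last_ack = 0
    pending = deque()  # arrival times of acknowledgments not yet consumed
    for _ in range(commands):
        if len(pending) == window:
            # The window is full: wait for the oldest acknowledgment.
            clock = max(clock, pending.popleft())
        clock += send_time              # time spent sending this command
        last_ack = clock + ack_delay    # when its acknowledgment will arrive
        pending.append(last_ack)
    return last_ack


# Four commands, 1 unit to send each, 3 units until each acknowledgment.
for w in (1, 2, 4):
    print(w, completion_time(4, w, send_time=1, ack_delay=3))
```

With a window of one, every command pays the full acknowledgment delay before the next can be sent. A larger window overlaps those delays, which is why the mode that allows multiple outstanding commands achieves higher throughput.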
Two block channel protocols follow: direct-coupled interlock (DCI) and data streaming (DS).
DCI requires a response for each outstanding command before another command can be sent. This is conceptually analogous to asynchronous mode on a SCSI parallel bus interface. DS can issue multiple commands while waiting for responses. This is conceptually analogous to synchronous mode on a SCSI parallel bus interface.

A block channel protocol consists of command, control, and status frames. Commands are known as channel command words (CCW), and each CCW is a single byte. Supported CCWs vary depending on which CU hardware model is used. Some configurations allow an AIX host to appear as a CU to the mainframe. Control frames are exchanged between the channel adapter and the CU during the execution of each command. Upon completion of data transfer, the CU sends ending status to the channel adapter in the mainframe. Command and status frames are acknowledged. Applications must inform the operating system of the I/O request details via operation request blocks (ORB). ORBs contain the memory address where the CCWs to be executed have been stored. The operating system then initiates each I/O operation by invoking the channel subsystem (CSS). The CSS determines which channel adapter to use for communication to the designated CU and then transmits CCWs to the designated CU.

SBCCS is a published protocol that may be implemented without paying royalties to IBM. However, IBM might not support SBCCS if implemented without the control unit port (CUP) feature, and CUP is not available for royalty-free implementation. Furthermore, only IBM and IBM-compatible mainframes, peripherals, and channel extension devices implement SBCCS. So, the open nature of SBCCS is cloudy, and the term pseudo-proprietary seems appropriate. There are two versions of SBCCS: one for enterprise systems connection (ESCON) and one for Fibre Channel connection (FICON).

Mainframes have always played a significant role in the computing industry. In response to advances made in open computing systems and to customer demands, mainframes have evolved by incorporating new hardware, software, and networking technologies. Because this book does not cover mainframes in detail, the following section provides a brief overview of mainframe storage networking for the sake of completeness.