The term Direct Attach Storage, or DAS, did not exist before there was networked storage. Until then, it was simply storage, and all of it was attached directly to a mainframe or server. Only after Storage Area Networks became popular did the term DAS become common.

DAS refers to any storage that is attached locally to a computer. In the case of large arrays of devices, the computer may be connected to a controller, which then provides a single interface for many devices. The hard drive in a PC is an example of DAS. A multiterabyte disk array that is connected to a server via a SCSI cable is also DAS. Size does not determine whether storage is DAS; only its architecture does.

Most open-system DAS devices use the SCSI or ATA standards to communicate with the storage devices or with a controller. ATA is more commonly used for slower, less reliable storage such as desktop storage, whereas SCSI is used for high-performance, high-reliability systems. That said, SCSI has shown up in desktop computers and ATA in enterprise-class arrays. Mainframes use their own protocols and specifications, most of which are proprietary.

SCSI

The term SCSI (pronounced "scuzzy") stands for Small Computer System Interface. It defines a specification for both hardware and software protocols, used to transfer data between peripheral devices and the peripheral bus in a computer. Although SCSI is not used exclusively for data storage, the most common usage of the technology is for mass storage devices. More expensive and complex than many other methods of storing data, SCSI tends to be deployed in situations where high performance is necessary.

The SCSI standards (Table 2-3) define both a set of hardware specifications and a software protocol. The hardware specifications include how many wires are used to move data and control information, addressing, basic topology, voltage, clock speed, and error correction methods.
The software protocol defines how requests for data are made, how devices respond, and how information about devices can be retrieved.
Parallel SCSI is the predominant form of SCSI today. It is called this because it transmits all of its data and control bits at the same time. The SCSI software protocol has also been adapted for use over Fibre Channel interfaces and designated as FCP.

Other forms of the SCSI protocol are also available, although they are new and not yet widely deployed. iSCSI is a networked version of SCSI that transmits data over an IP network. Serial Attached SCSI (SAS) is used for Direct Attach Storage but sends information one bit at a time for faster, more reliable data transfers.

The type of Parallel SCSI implementation is usually denoted by the width of the data path (normal or Wide) and the error correction method (Single Ended or SE, Low Voltage Differential or LVD, or High Voltage Differential or HVD). There are also several types of serial SCSI.

Targets and Initiators

The SCSI command protocol is based on a client-server architecture. With SCSI, the device that requests data is the initiator, and the device that returns data is the target. Most often, the initiator is a host bus adapter (HBA). An HBA is a peripheral board or embedded processor used to connect a host's peripheral bus to the SCSI bus. The target is the storage device or device controller, such as a RAID controller or tape drive.

In all cases, the initiator is the master, and the target is the slave. The initiator begins all conversations and requests all data. The target provides whatever the initiator requests unless there is an error.

A device can be both a target and an initiator. This is unlikely in the DAS situation, but there are Fibre Channel and management devices that use this capability.
SCSI Addressing

Parallel SCSI allows for either 8 or 16 addresses. Each device has an address with an ID from 0 to 7 or 0 to 15, depending on the SCSI implementation. The number of addresses is related to the number of control lines available, which is the same as the size of the data path. This is not a very large address space.

To expand this limited address space, an additional addressing layer was added to SCSI. Each SCSI address can also be broken down into sub-addresses, called Logical Unit Numbers (LUNs). A LUN represents a logical, rather than a physical, address. LUNs can be assigned to portions of a physical device, and multiple LUNs can be assigned to the same device. In Parallel SCSI, there can be 16 LUNs for each SCSI address, for a total of 256 device addresses. Other implementations of SCSI use LUNs to provide a very large address space.

The different Serial SCSI implementations maintain the same overall SCSI addressing scheme by mapping native addresses, such as Fibre Channel or IP addresses, to SCSI addresses and LUNs. Extensions to the SCSI addressing model allow for hierarchical addressing and a much larger address space. This is usually used to accommodate networked SCSI implementations.
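The two-level addressing scheme can be sketched in a few lines of Python. This is only an illustration using the figures quoted in the text (16 SCSI IDs on a Wide bus, 16 LUNs per ID). The function name `enumerate_addresses` and the (ID, LUN) tuple representation are assumptions made for the example, not part of any SCSI standard.

```python
from itertools import product


def enumerate_addresses(num_ids=16, luns_per_id=16):
    """Return every (SCSI ID, LUN) pair in a two-level address space.

    The defaults follow the figures quoted in the text for Wide Parallel
    SCSI; both parameters are illustrative, not taken from a standard.
    """
    return list(product(range(num_ids), range(luns_per_id)))


addresses = enumerate_addresses()
print(len(addresses))   # 16 IDs x 16 LUNs = 256
print(addresses[0])     # (0, 0): first LUN on SCSI ID 0
print(addresses[-1])    # (15, 15): last LUN on SCSI ID 15
```

Calling `enumerate_addresses(num_ids=8)` models the narrower 8-address bus in the same way.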
Parallel SCSI

Until there were serial implementations of SCSI, Parallel SCSI was simply SCSI. The "parallel" part was added to differentiate it from the new serial forms of SCSI, especially Fibre Channel SCSI (FCP). The name refers to the fact that data is transferred in parallel, on all wires in the cable at once. For Ultra and Ultra2 types of SCSI, 8 bits are sent at once, with 16 bits sent for Wide versions. Starting with Ultra3 SCSI, all implementations are Wide.

Note: Parallel SCSI is a hardware standard. It is separate from the SCSI software protocol, which operates over several different hardware architectures.

Parallel SCSI varies by three main characteristics: the data transfer rate, the data path width, and the error correction method.
The data transfer rate indicates how much data can be transferred in a given time period and is given in megabytes per second. This is a theoretical maximum rate based on the standardized signal speeds, not the throughput actually achieved. What is most confusing about SCSI nomenclature is that most implementations do not actually say what the data transfer rate is numerically. Instead, Parallel SCSI uses the terms Ultra SCSI, Ultra2 SCSI, and Ultra3 SCSI. Only with the advent of Ultra320 was a real data transfer rate mentioned.

Many elements can affect the real throughput of a SCSI data transfer. These include software overhead in the SCSI host bus adapter and storage controller, the speed of the media itself (especially tape drives), and the condition and quality of the connector cables used.

The second major characteristic is the data path width. As previously discussed, Parallel SCSI has two different options: an 8-bit path and Wide (16 bits).

The last characteristic is the error correction method. Originally, SCSI had no hardware error correction. All bits were sent down a single set of wires. This type of SCSI is known as Single Ended, or SE for short. SE hardware does not detect and resend individual bits when they are corrupt. The upper-level protocols have to detect corrupt data, and the target then has to resend all the data in the requested block. One corrupted bit could cause an entire block, which might be megabytes in length, to be resent. This reduces the actual throughput of the device.

One of the major causes of lost or corrupted data is noise. The effects of noise become more pronounced as cable lengths increase and signal rates go up. After a certain length and speed, bit errors due to noise will occur frequently. Because of this, SE SCSI requires very short cables. The cables, however, are cheaper, as is the rest of the hardware. Single Ended SCSI can be used in places where the cable lengths are very short and noise is well managed, such as inside a server.
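Returning to the first two characteristics, the theoretical rate is the bus transfer rate multiplied by the width of the data path. The sketch below works through that arithmetic. The bus rates in millions of transfers per second (20, 40, 80, 160) are the commonly published figures for these generations and are not given in this text; treat them, and the helper name `peak_rate_mb_per_s`, as assumptions of the example.

```python
# Commonly published bus rates in millions of transfers per second.
# These figures are not from the text; they are assumed here.
MEGATRANSFERS_PER_S = {
    "Ultra": 20,
    "Ultra2": 40,
    "Ultra3": 80,
    "Ultra320": 160,
}


def peak_rate_mb_per_s(generation, wide):
    """Theoretical peak in MB/s: transfers per second x bytes per transfer."""
    bytes_per_transfer = 2 if wide else 1   # Wide = 16 bits, narrow = 8 bits
    return MEGATRANSFERS_PER_S[generation] * bytes_per_transfer


print(peak_rate_mb_per_s("Ultra", wide=False))    # 20
print(peak_rate_mb_per_s("Ultra2", wide=True))    # 80
print(peak_rate_mb_per_s("Ultra320", wide=True))  # 320
```

These are ceilings only; real throughput will be lower for the reasons given above.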
It was quickly realized that the short cable lengths were a real hindrance to external system implementations of SCSI. Although SE was fine for attaching a few hard drives inside a file server, it was often impractical for longer external connections to storage devices such as disk arrays. Moreover, as disk arrays and tape libraries became larger, internal cable lengths became a serious issue, because the cable length limitations affect the entire data path, including the cabling inside the arrays.

The answer was Differential SCSI. With Differential SCSI, two wires are used for each signal. The voltage on one wire carries the data, and the voltage on the other is the exact opposite. One is represented as +SIGNAL and the other as -SIGNAL. If you add the two voltages together, you should get 0. If not, something is wrong, and just the most recent byte or two of data needs to be resent. This allows SCSI to operate in much noisier environments without significant loss of throughput. The upshot is that much longer cables can be used. The downside of Differential SCSI is that the hardware and cables are more expensive.

To make matters more confusing, there are two versions of Differential SCSI: High Voltage Differential (HVD) and Low Voltage Differential (LVD). The major difference between the two is obvious: the voltage. LVD uses lower voltage differences to make its comparisons and detect lost bits. With the introduction of the Ultra2 standard, SE was supplanted by LVD. LVD is less expensive than HVD. Its cable lengths are somewhat shorter than HVD's (though much longer than SE's), and it can operate at data transfer rates that SE could never achieve. LVD worked well enough that it is the only method used for Ultra3 and Ultra320.

There has been considerable debate as to which is better: serial implementations of SCSI (especially Fibre Channel) or the parallel ones. It is a silly argument, because each is better in some ways and poorer in others.
Parallel SCSI has the advantage of being very cheap and very fast. It is used extensively for connections inside disk arrays and tape libraries, because it is tried-and-true technology. The best place to use Parallel SCSI is in instances where distances are short, and speed and reliability are important.

Serial Attached SCSI (SAS)

Serial Attached SCSI, or SAS, is relatively new, but the premise is simple: use the same SCSI protocol, but rethink the hardware layer. Instead of sending 8 or 16 bits at a time, send only one, like Ethernet. Connectors and cables are kept small, and reliability is kept high. This makes for a very inexpensive, yet very fast, way of connecting storage devices within a computer or array.

SAS has been envisioned as an in-the-box technology. A typical way to make use of SAS is inside a disk array that has external Fibre Channel or Ethernet connections. SAS is also likely to become popular as a replacement for internal Parallel SCSI drives.
The SCSI Protocol

No matter which hardware architecture is used, be it SAS, a flavor of Parallel SCSI, or Fibre Channel SCSI, all architectures use some form of the SCSI protocol. At a software level, they all are very similar, with differences due mostly to addressing. From the perspective of the applications and operating systems that interface with SCSI devices, they are all the same. The same software architecture and commands are used by all types of SCSI devices.

This helps to account for the longevity of SCSI as a protocol. It has been adapted to a variety of platforms and architectures without causing major changes in applications and operating system interfaces.

SCSI uses a client-server architecture, though it doesn't use the terms client or server. The initiator sends commands to the target, which is then expected to respond. All commands follow a standard format called the Command Descriptor Block (CDB), which contains the command plus the target address. The target device responds with what was requested or an error block.

There are a number of phases to the protocol. The first few (Bus Free, Arbitration, Selection, and Reselection) are used to allow initiators to gain control of the SCSI bus so they can send a command. These phases do not necessarily apply to all types of SCSI. Next is the Command phase, where the target requests a command from the initiator. The Data phase occurs when the initiator sends its command and gets a response from the target. The final phases are used by the target to request messages and information from the initiator.

SCSI has a lot of commands. The most common are used to read and write data to block devices, such as tapes and disks. Other commands exist that allow an initiator to request addressing information, error messages, and device configurations. Specialized commands exist for media changers on tape and CD-ROM libraries, and so do a host of other read and write commands.
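As a concrete illustration of a command block, the sketch below packs a 10-byte READ(10) CDB, one of the common block read commands. The field layout (operation code 0x28, a 4-byte logical block address, a 2-byte transfer length, and a control byte) follows the widely documented SCSI block command format rather than anything stated in this text, and the function name `build_read10` is an assumption of the example. It only builds the bytes; it does not send them to a device.

```python
import struct

READ_10 = 0x28  # operation code for READ(10)


def build_read10(lba, num_blocks):
    """Pack a READ(10) Command Descriptor Block (10 bytes, big-endian)."""
    if not 0 <= lba <= 0xFFFFFFFF:
        raise ValueError("logical block address must fit in 32 bits")
    if not 0 <= num_blocks <= 0xFFFF:
        raise ValueError("transfer length must fit in 16 bits")
    return struct.pack(
        ">BBIBHB",
        READ_10,     # byte 0: operation code
        0,           # byte 1: flags, left clear in this sketch
        lba,         # bytes 2-5: logical block address
        0,           # byte 6: group number, unused here
        num_blocks,  # bytes 7-8: transfer length in blocks
        0,           # byte 9: control
    )


cdb = build_read10(lba=2048, num_blocks=8)
print(len(cdb))    # 10
print(cdb.hex())   # 28000000080000000800
```

Because the same block format is carried over every transport, a command built this way looks identical to the target whether it arrives over Parallel SCSI, SAS, or Fibre Channel.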
LUN Masking

LUN masking is a technique that hides LUNs from certain initiators. When a target implements LUN masking, it will respond to the SCSI Inquiry command only from select initiators. Other initiators will believe that no device exists at that LUN address. In this way, only select initiators will know of the existence of a device at certain LUN addresses.

There are advantages to this approach. In large disk arrays, which are shared among several different hosts, the array can be partitioned among hosts. It is often used in environments where the hosts have different operating systems and could corrupt other hosts' disks.

It is even used as a crude form of security, but it is a weak one. LUN masking only hides the LUN from the SCSI Inquiry command. If an initiator continues to send commands to that LUN anyway, the target will respond normally.
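The behavior described above, including its weakness, can be modeled with a toy target. This is a sketch only: the class name `MaskingTarget`, the host names, and the string responses are all hypothetical, and real targets implement masking in firmware or array software rather than like this.

```python
class MaskingTarget:
    """Toy target that hides LUNs from the Inquiry command only."""

    def __init__(self, luns, visible_to):
        self.luns = set(luns)
        # visible_to maps a LUN to the set of initiators allowed to see it.
        self.visible_to = visible_to

    def inquiry(self, initiator, lun):
        """Report the LUN only to initiators on its allow list."""
        if lun in self.luns and initiator in self.visible_to.get(lun, set()):
            return "device present"
        return "no device"

    def read(self, initiator, lun):
        """Other commands are not filtered, mirroring the weakness above."""
        if lun in self.luns:
            return "data"
        return "no device"


target = MaskingTarget(luns=[0, 1], visible_to={0: {"hostA"}, 1: {"hostB"}})
print(target.inquiry("hostA", 0))  # device present
print(target.inquiry("hostB", 0))  # no device: LUN 0 is masked from hostB
print(target.read("hostB", 0))     # data: masking did not block the read
```

The last call shows why masking of this kind partitions well-behaved hosts but does not stop an initiator that addresses a hidden LUN directly.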
ATA

ATA is the most popular mass storage device interface specification in use today. ATA stands for AT Attachment, as in the IBM PC AT, which was introduced in 1984. It is the interface of choice for desktop hard drives. It is also used extensively for CD-ROM/RWs in desktop and laptop PCs. Currently, ATA is used mostly inside computers and similar devices.

Parallel ATA, like Parallel SCSI, uses a bus architecture with a limited address space. Each ATA channel can address only two devices: a primary and a secondary, also called the master and slave. Most desktop computers have controllers with two Parallel ATA channels. It is not uncommon to have an ATA interface with one channel for two hard drives and another channel for the CD-ROM/RW and DVD-ROM/RW drives.

As is the case with Parallel SCSI, Parallel ATA transmits bits in parallel, 16 bits at a time. Data can be transferred to and from only one device at a time. Whichever device has control of the bus keeps it until it is finished with the transfer. This means that two high-usage devices will often be in contention for the bus, and performance will suffer. It is also why it is common to put the CD-ROM and other slow devices on a separate channel from the hard drives.

ATA also uses a protocol layer that is similar to SCSI, making it independent, in some respects, from the hardware specification. This is one of the reasons that it has been adapted for Serial ATA and that other protocols can run over the ATA hardware. ATAPI, an offshoot of ATA, is used mostly for removable media, such as CD-ROM and tape drives. It employs SCSI commands over the ATA hardware to communicate with these devices.
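The cost of sharing a Parallel ATA channel, noted earlier in this section, can be shown with simple arithmetic. In the sketch below, transfers on the same channel run one after another, while transfers on separate channels overlap. The transfer times and the function name `total_time` are made up for illustration, and the model ignores everything except bus ownership.

```python
def total_time(channels):
    """Time until all transfers finish.

    `channels` is a list of channels, each a list of transfer times in
    seconds. Devices on one channel take turns, so their times add up;
    separate channels work in parallel, so the slowest channel decides.
    """
    return max(sum(transfers) for transfers in channels)


disk, cdrom = 2.0, 6.0   # hypothetical transfer times in seconds

print(total_time([[disk, cdrom]]))    # 8.0: both devices share one channel
print(total_time([[disk], [cdrom]]))  # 6.0: each device has its own channel
```

On its own channel, the hard drive also never has to wait behind the slower CD-ROM transfer, which is the practical reason for splitting them.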
Different versions of ATA are usually referred to by their data transfer speed. Common implementations are ATA/33 (33 megabytes per second), ATA/66, ATA/100, and ATA/133.

Serial ATA (SATA)

Serial ATA was created for much the same reasons as Serial Attached SCSI. The two are, in fact, linked, because the hardware layer is the same for both. The difference between SAS and SATA is the protocol that each uses: SAS carries the SCSI protocol, and SATA carries the ATA protocol. This provides backward compatibility with the maximum number of applications and operating systems while keeping costs low.

Like SAS, SATA is viewed best as an in-the-box technology, used to create very inexpensive, yet high-performance, disk arrays. SATA works well for hot backups, staging data, and snapshots, and as storage for less important data. Performance should approximate that of SAS, and the reliability of SATA disk arrays should be nearly as good as that of a SCSI or Fibre Channel array.