Hard drives became a commodity product in the early 1990s, and the resulting low margins forced many hard disk manufacturers out of business or into mergers. Currently, there are four major manufacturers of hard drives in the United States, and between them, they control in excess of 90% of the worldwide market:
Other vendors you may encounter include Fujitsu, Samsung, and Toshiba.

Note: Regardless of who you buy your disk drives from, the entire industry generally produces reliable products. Consumer drives typically come with anywhere from a three-month to a three-year warranty; enterprise-class drives usually start at a three-year warranty. The mean time between failures (MTBF) is usually advertised as around seven years. That may seem like a lot, but remember that this number is a mean. When you purchase drives for a disk array of, say, eight drives, you should expect roughly eight times as many failures as with a single drive. You should expect a drive to fail in the first year or two. That's why you want to have redundancy built into your system, by including extra drives.

Whereas it is common to find a single disk drive in a workstation, servers are a different beast entirely. Modern server motherboards don't look all that different from PC motherboards, except that they tend to be larger in order to accommodate more expansion slots. Other differences are that they often support multiple processors, have onboard RAID and network support, and come in an extended ATX or extended BTX form factor, with features geared toward controlling internal case temperatures. Most server deployments opt to use faster drives and more drives, and they combine those drives in the form of external disk arrays. Arrays provide both enhanced speed (as measured by data throughput) and higher fault tolerance.
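The MTBF argument above can be made concrete with a small calculation. The following is a minimal sketch, not a vendor reliability model: it assumes that drives fail independently and that lifetimes are exponentially distributed with a mean equal to the advertised MTBF. Both are simplifying assumptions that the text does not state, and the function names are hypothetical.

```python
import math

def p_any_failure(n_drives, years, mtbf_years):
    """Probability that at least one of n_drives fails within `years`.

    Assumption (not from the text): independent drives with
    exponentially distributed lifetimes whose mean is mtbf_years.
    """
    rate_per_drive = 1.0 / mtbf_years
    return 1.0 - math.exp(-n_drives * rate_per_drive * years)

def expected_failures(n_drives, years, mtbf_years):
    """Expected number of failures, assuming failed drives are replaced."""
    return n_drives * years / mtbf_years

# Compare a single drive with an eight-drive array over one year,
# using the seven-year MTBF quoted in the text.
for n in (1, 8):
    p = p_any_failure(n, 1.0, 7.0)
    e = expected_failures(n, 1.0, 7.0)
    print(f"{n} drive(s): P(at least one failure) = {p:.2f}, expected failures = {e:.2f}")
```

Under these assumptions, a single drive has about a 13% chance of failing in its first year, while an eight-drive array has about a 68% chance of losing at least one drive and averages a little more than one failure per year. The expected number of failures scales exactly eightfold; the probability of seeing at least one failure grows by less than that, because it cannot exceed 100%.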
You generally only find the IDE bus and ATA drives used in the low end of the server market, typically in single- or dual-processor systems. You might find that a server's boot drive is an enhanced ATA drive or a two-drive EIDE RAID array, but the business end, with larger amounts of storage, typically uses other technologies. Most of the server industry is currently using one of the following four standard types of 3.5-inch disk drives:
Each of these types is described in the sections that follow. Because all drives have the same size and form factor, the best way to tell them apart is to look at the back of the drive to see what connectors each has.

IDE and ATA Drives

Integrated Drive Electronics (IDE) has been the storage interface on PC motherboards for many years. The "integrated" part refers to an onboard controller (an ASIC) that offloads processing to the disk, as well as an onboard drive cache. In Chapter 6, "The ATA/IDE Interface," we already detailed just about all there is to know about the technology behind the IDE interface. In this section, however, we'll recap some of the core details as they pertain to this chapter.
These days when people refer to IDE drives, what they really mean is a later version of the ATA standard, probably implemented in a proprietary form. ATA-2 is also marketed as Enhanced IDE (EIDE), a term Western Digital coined for its hard drives. Seagate was another vendor with an "enhanced" IDE standard; it called its version Fast ATA or Fast IDE. Essentially, these names all refer to minor variations on the same theme (ATA-2), and the term EIDE is in wide usage. You can expect EIDE drives to be about three or four times faster than the original IDE standard, with data transfer rates between 4MBps and 16.6MBps. The next version of the ATA standard was ATA-3, a minor revision that isn't differentiated from EIDE.

Usually, a server motherboard comes with two or four EIDE connections, with each channel supporting one master device and one slave device. The devices that connect to IDE are, for the most part, hard drives, optical drives, and tape drives. Motherboards with four IDE connections often connect a pair of them to an onboard EIDE RAID chip. You can also purchase EIDE RAID controllers from several vendors. EIDE RAID is discussed later in this chapter, in the section "RAID Controllers."

From the standpoint of using ATA in servers, the versions that followed ATA-3 are of the most interest. There are currently seven ATA versions. Those that are currently in use are recapped in the following list:
Note: ATA as implemented in real drives is the work of an ad hoc group called the Small Form Factor (SFF) Committee. You can read more about the group at www.sffcommittee.com, but the standard itself is specified by Technical Committee T13 of INCITS (see http://t13.org).

Let's look at some of the features offered by the latest versions of EIDE drives, as these are the ones that are most desirable to use in server deployments. The specs following Figure 11.1 are an abbreviated spec sheet for the Samsung SP1604N, a 160GB, 7200rpm EIDE hard drive in the SpinPoint P80 series.

Figure 11.1. Most EIDE drives used today are ATA-7 (Ultra ATA/133) drives that spin at 7200rpm or faster.
The SP1604N features the following:
It's very important that server drives be quiet, cool, and power efficient because you multiply these issues many times in server deployments. From the standpoint of performance, here's what Samsung advertises:
It's still possible to find EIDE models with 2MB of onboard cache or ones that operate at the older 5400rpm speed, but their cost advantage over drives such as the one from Samsung is no longer what it used to be. Consequently, these drives are slowly being phased out.

Serial ATA

Serial ATA (SATA) is the technology that is meant to replace the EIDE interface in desktop systems and workstations. You find SATA connections on nearly all modern motherboards, with two or four onboard connections being common. Some new server boards have as many as eight SATA connections. The SATA bus standard is similar to Fibre Channel in regard to its ability to fan out and connect not only hard drives but peripherals as well.

In terms of the physical drive, a SATA drive is identical to a standard parallel ATA (PATA) drive, except that it uses the serial interface. The ATA or EIDE drives you learned about in the preceding section use a parallel data transfer method, with many wires carrying data during each bus cycle. The ATA/100 standard uses a 16-bit channel and requires the ribbon cables we are so familiar with in desktop PCs. It's not uncommon to see ribbon cables with 40 or 80 wires. The cable is flat and wide to reduce crosstalk between channels. The rounded cables that replace ribbon IDE cables contain a significant amount of shielding.

SATA uses a serial connection, sending data down a single channel; this works because the serial data stream runs at much greater speeds than the EIDE bus. When SATA was first introduced, the link was sending data down the wire at 1.5GHz, compared to the clock rate of around 100MHz for Ultra ATA/100. That is roughly 15 times as fast, and the higher clock rate is what lets a narrow serial link outperform the wide parallel bus: first-generation SATA delivers 150MBps, versus 100MBps for Ultra ATA/100. SATA suffers from fewer data transmission errors, and there's less need to re-send data over a SATA cable than there typically is over an ATA cable.
All that, coupled with the fact that the run lengths of SATA can be longer than ATA, adds up to a significant design advantage.

Note: The website of the Serial ATA International Organization, which manages the SATA standard, is www.serialata.org.

Another one of EIDE/ATA's thorny configuration issues, the master/slave relationship, disappears with SATA. In an ATA master/slave configuration, both devices share the same wire, effectively halving the performance available to each device. In SATA data connections, there is no master or slave. Each device gets a full dedicated 150MBps or 300MBps connection to the SATA host controller, with future transfer rates of 600MBps and faster planned as part of the SATA road map.

SATA allows for hot-swappable drives, and more of them than EIDE. This means SATA can compete with SCSI and Fibre Channel when it comes to creating fault-tolerant RAID configurations. Although SATA was initially thought of as a technology for gamers and for enhanced workstations, it is a very interesting server storage technology, and the industry is keeping a sharp eye on its development.
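The SATA bandwidth figures quoted in this section (150MBps, 300MBps, and 600MBps) follow from the link's signaling rate. The sketch below shows the arithmetic. It relies on one fact that the text does not mention: SATA uses 8b/10b encoding, so every data byte occupies 10 bits on the wire. The function names are hypothetical, and the shared-bus function is only a rough model of the master/slave halving described above.

```python
def serial_payload_mbps(line_rate_gbps):
    """Usable throughput in MBps for a serial link with 8b/10b encoding.

    8b/10b sends 10 line bits for every 8-bit data byte, so the byte
    rate is the line bit rate divided by 10.
    """
    bits_per_second = line_rate_gbps * 1e9
    return bits_per_second / 10 / 1e6

def shared_bus_per_device_mbps(bus_mbps, active_devices):
    """Rough per-device share when devices contend for one parallel ATA channel."""
    return bus_mbps / active_devices

# Each SATA generation doubles the line rate.
for rate in (1.5, 3.0, 6.0):
    print(f"{rate} Gbps link -> {serial_payload_mbps(rate):.0f} MBps per device")

# A master and a slave that are both busy split an Ultra ATA/100 channel.
print(f"Ultra ATA/100, two busy devices -> {shared_bus_per_device_mbps(100, 2):.0f} MBps each")
```

The contrast is the design point: a SATA device keeps its full link rate no matter how many other drives are attached to the controller, whereas two busy devices on one parallel ATA channel divide the channel between them.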
SCSI Drives

SCSI has been the drive interface of choice for small computer systems that require fast data transfers and can live with short cable runs. SCSI is notable for being a self-configuring bidirectional bus, with what is now a long history. The first SCSI standard was ratified by ANSI in 1986, shortly before the Macintosh II personal computer appeared; it grew out of the earlier SASI interface from Shugart Associates, and Apple was among the first personal computer vendors to adopt it. Server vendors have made heavy use of the SCSI interface in applications ranging from drives inside the server itself (so-called "captive disk") to DAS applications, as well as the internal drives used in some storage servers (smaller ones, generally) and disk arrays.
Demystifying SCSI

You learned a lot about the SCSI bus in Chapter 7, "The SCSI Bus," so in this chapter we simply review in brief the different standards and look at the types of SCSI drives that are preferred in server deployments today. You saw seven different iterations of the ATA standard earlier (depending on how you count them); with SCSI, there are even more. Many of these standard revision levels are backward compatible with others, and some are not. The key feature to look for is the pin count of the connector. SCSI has appeared in the following revision levels or types:
These standards define the command set that any variant of that level of SCSI uses. Chances are that these days, if someone refers to "SCSI," they are talking about some form of the SCSI-2 specification. The original SCSI was an 8-bit bus standard operating at 4MBps, using a 25-pin connector. The 8-bit bus offered eight device IDs, 0 through 7, which allows seven connected devices plus the host adapter. ID 7 usually is the host adapter, by default. (SCSI gives priority to the highest-numbered device on a chain, which is why 7 is used.) Moving to a 16-bit standard (SCSI-2) doubled the number of IDs, raising the number of possible connected devices to 15. A 32-bit 40MBps version of SCSI-2 has been codified but is not yet in use.

Note: A lot of information has been published on the SCSI standard. To begin with, SCSI is defined by the T10 Technical Committee (www.t10.org) of INCITS (InterNational Committee for Information Technology Standards), under the auspices of ANSI (American National Standards Institute; www.ansi.org). The relevant trade organization is the SCSI Trade Association (www.scsita.org). Many vendors maintain an open library of information to aid their customers in buying SCSI disks. Three other sites you might want to look at are Adaptec's (www.adaptec.com/worldwide/support/supportindex.jsp?sess=no&language=English+US&source=home_tab), Seagate's (www.seagate.com/support/kb/disc/index_faq.html), and Paralan's (www.paralan.com/scsi.html), but there are many others as well.

All of the early SCSI standards up to SCSI-2 were 5-volt single-ended standards that used TTL voltage levels to determine the signal. That architecture used only a single wire per signal but required termination and high power. A recent variant of SCSI-3, Ultra 2 SCSI, offered a lower-voltage (3-volt) differential alternative called Low Voltage Differential (LVD) signaling; you may find it referred to as LVD SCSI.
The low voltage signal in LVD SCSI allows for a longer cable run (35 feet), and the signal is measured as the difference between the two wires of each channel inside the LVD cable. The advantage of Ultra 2 is that it has modest power requirements and is cheaper to make, and because it has two wires, twice as much data can be transferred. You may see the terms HVD (High Voltage Differential) SCSI and LVD SCSI bandied about, and this is what they refer to. You can't directly connect HVD and LVD SCSI, but it is possible to find an adapter called an LVD-to-HVD converter that makes the necessary conversions. You may also encounter a hybrid called "multimode LVD," or LVD/MSE SCSI. That implementation can switch automatically between LVD and single-ended mode.

Things started to get weird when SCSI-2 substituted a 50-pin connector in place of the original 25-pin connector introduced by SCSI-1. The 8-bit standard of SCSI-2's 50-pin connector is sometimes referred to as Narrow SCSI, to differentiate it from the 16-bit 68-pin connector standard called Wide SCSI. Thus the terms Narrow SCSI and Wide SCSI identify not only the width of the bus but to some extent the size of the SCSI connector used.

SCSI Performance

The speed of the SCSI implementation determines the throughput the manufacturer advertises for its SCSI products. After SCSI-1, the standards allowed for different speed implementations. After SCSI came Fast SCSI, just like after Ethernet came Fast Ethernet. You will see the Fast standards Fast-10, Fast-20, Fast-40, and Fast-80 SCSI in the literature, where the numbers represent a metric called MegaTransfers/second. For an 8-bit-wide bus, 20 MegaTransfers/second corresponds to 20MBps, and for the 16-bit bus, 40MBps. That's why each of the Fast standards has two quoted speeds. These so-called "Fast" speed standards are advertised and known by other names:
Note: Ultra 3 (or Ultra160) is often promoted as an alternative to Fibre Channel because the throughput speeds are similar. It's a debate that will go on for a while. Fibre Channel has higher numbers of nodes (126 versus 15) and a much longer run length (10km versus 35 feet). Ultra 3 offers better manufacturer interoperability as well as lower costs. These characteristics define where each is used. You don't typically find Fibre Channel inside servers, connecting just a few drives, or connecting arrays that are close to a server, all of which are characteristics of small business or departmental servers. For larger deployments with many more nodes and longer cable lengths, Fibre Channel has the advantage.

Let's break this down just a little bit further to see how all these terms come together to form some of the slang surrounding SCSI. Fast-Wide SCSI is the Fast-10 standard using a 16-bit (wide) bus. Wide Ultra SCSI is the 16-bit version of the Fast-20 or Ultra standard. It's really easy to see how people get confused when talking about SCSI. Let's simplify this by getting down to what you really need to know from a server perspective when it comes to disk drives.

The Suitability of SCSI

These days, when people refer to SCSI, they are usually referring to the SCSI-2 standard. Ultra SCSI and Ultra 2 SCSI are both in wide use and are usually referred to by their correct names. It is very rare to run into terms such as Fast-20 or wide, and when SCSI is HVD, people don't generally differentiate it from all the other common garden varieties of HVD around. Ultra 3 and Ultra 4, on the other hand, are newer and are used in high-performance applications. People who use Ultra 3 generally refer to their implementation as LVD SCSI or U160 (and its variants). Ultra 4 users generally use the term Ultra 320. For the most part, SCSI standards are interoperable, but not always.
If you have two pin-compatible forms of SCSI and plug one into the other, the combination will run at the slower standard's speed. Ultra 2 and Ultra 3, for example, are backward compatible with any of the earlier single-ended versions of SCSI but revert from LVD to single-ended mode when they sense a single-ended device. You need to pay attention to the types of controllers you are using, the cables and terminators they require, and the drives you are purchasing if you want to minimize problems. Let's now consider suitability to task as it relates to SCSI and servers. Table 11.1 lists SCSI cable lengths, speeds, and connectivity.
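The naming arithmetic behind the Fast-N standards, and the rule that mixed devices run at the slower standard's speed, can be sketched in a few lines. This is an illustration under simplifying assumptions rather than a model of real SCSI negotiation: the function names are hypothetical, and the "slower device wins" rule is reduced to taking the minimum of the two peak rates.

```python
# MegaTransfers per second for each Fast-N level named in the text.
FAST_RATES = {"Fast-10": 10, "Fast-20": 20, "Fast-40": 40, "Fast-80": 80}

def scsi_mbps(fast_name, bus_bits):
    """Peak MBps: transfers per second times bytes moved per transfer."""
    if bus_bits not in (8, 16):
        raise ValueError("bus width must be 8 (narrow) or 16 (wide)")
    return FAST_RATES[fast_name] * (bus_bits // 8)

def combined_mbps(device_a, device_b):
    """Simplified mixing rule: the pair runs at the slower device's peak rate."""
    return min(scsi_mbps(*device_a), scsi_mbps(*device_b))

print(scsi_mbps("Fast-10", 16))   # Fast-Wide SCSI
print(scsi_mbps("Fast-20", 16))   # Wide Ultra SCSI
print(scsi_mbps("Fast-80", 16))   # Ultra 3, which is why it is sold as Ultra160
print(combined_mbps(("Fast-80", 16), ("Fast-20", 8)))
```

Doubling either the transfer rate or the bus width doubles throughput, which is why every Fast-N level is quoted with two speeds.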
From Table 11.1, we see that the following is true:
Indeed, this is just the pattern of usage you find in the industry. Now that we've reprised and hopefully demystified the various SCSI standards, let's take a look at the disk drives in common usage.

SCSI Drives

SCSI drives are favored for captive disks inside servers, for smaller disk arrays, and in some high-performance disk arrays. SCSI drives are much more expensive than PATA or SATA drives; you can expect to pay a premium that can be as much as two to three times the price of SATA or PATA drives. Two speeds of drives are currently in use: 10,000rpm, or 10k, disks marketed for mainstream servers, and 15,000rpm, or 15k, disks marketed for high-performance servers. As of this writing, you can purchase SCSI drives in the range of 37GB to 300GB, or you can purchase PATA or SATA drives up to 500GB. SCSI tends to lag a little behind in terms of capacity despite being the leader in performance.

The most common types of SCSI drives currently on the market are Ultra320 LVD SCSI drives in either the 10k or 15k speeds, and these Ultra320 drives come with either 68-pin Wide SCSI or 80-pin SCA-2 (hot-swappable) connectors. Most of the drives sold on the market today ship with 8MB of onboard cache. As a general rule, the 10k drives have average seek times of around 5ms (milliseconds), and the 15k drives have average seek times of about 2.75ms to 3.5ms.

Earlier, we looked at a 160GB Samsung PATA drive. Now let's compare that drive to the characteristics you would find for a Seagate Cheetah Ultra320 ST3146854LW drive of similar size. Here are some of the specifications:
This level of performance is particularly impressive given that this drive is two to three times as fast as the Samsung drive detailed earlier. Perhaps the most important comparison is price. The Samsung SP1604N's full retail price is a little more than $100, while the Seagate drive is approximately $1,200. Both of these are new drives, but whereas the Samsung will maintain its price over time (it's still a commodity), you will see a large price drop over the next 18 months in a drive like the Seagate one.

Fibre Channel

Fibre Channel is a high-performance serial drive interface that borrows from the SCSI command set to attach to drives using either copper or high-speed optical cabling. Fibre Channel has some very desirable features from the standpoint of server/storage deployment. Originally deployed in mainframes around 1988, Fibre Channel was meant to replace HIPPI.

Fibre Channel is the interconnect of choice for large storage devices and SANs. It allows a large number of disks to be connected. You tend to find Fibre Channel employed as the predominant interface type in enterprise-class storage servers such as EMC Symmetrix or Network Appliance NAS servers. There are three different Fibre Channel topologies or architectures:
The wiring used in Fibre Channel from the backplane to the HBA or from the backplane to the switch is either an optical (that is, fiber) or twisted-pair copper cable. Fibre Channel, like many other transport technologies, borrows from and owes a lot to the SCSI standard. Normally, you see the protocol broken down into a five-layer scheme:
Note: For more information about the Fibre Channel protocol layers, see http://hsi.web.cern.ch/HSI/fcs/spec/overview.htm.

Because Fibre Channel is expensive to deploy and requires separate networking skills, it isn't used in smaller disk deployments. For a server or an array with five or fewer drives, there is no significant performance boost in using Fibre Channel versus SCSI.

Characteristics of Fibre Channel

You learned a lot about Fibre Channel when you learned about SCSI. Fibre Channel borrows many things from the SCSI world. The following factors separate Fibre Channel technology from SCSI and the other drive standards:
Fibre Channel is reliable and well established. Many factors have given Fibre Channel a distinct advantage over other technologies in the past. However, there is so much investment and expertise in Ethernet infrastructure that developing storage-over-TCP/IP technologies may steal a lot of Fibre Channel's market share over time.

Arbitrated Loop Topology

One of the major reasons Fibre Channel is so widely deployed is that you can create a topology called an arbitrated loop. As shown in the previous section, FC-AL isn't a loop at all (except topologically speaking); rather, it's a connection through the backplane of drives to the HBA. On the backplane is a port bypass circuit (PBC) logic board that allows for the fast operation of the loop and for the ability to hot-swap drives. Although there are 126 assignable addresses (the 127th is the host), in practical terms, you can attach from 45 to 55 disks to an FC-AL before you start getting performance degradation. Performance in these storage systems is typically measured in terms of I/O operations per second. Often the diagnostic utility of choice is the freeware Intel IOmeter.

FC-AL can use coaxial, twin-axial, or optical cabling, and there are no drive switches or jumpers to set because the interface is self-configuring. In terms of fault tolerance, a backplane is both hot-swappable and dual ported. The current standard runs at 100MBps, with a frequency of 1.062GHz.

When Fibre Channel is configured into the dual-loop topology, it picks up not only speed but fault tolerance as well. A dual loop uses two Fibre Channel HBAs connected to each disk drive, allowing each bus to access the same drive. Although only one of the two loops may access a drive at a time, the net result is that you get an overall doubling of throughput. Not only that, but if one of the bus's connections to a disk fails, that disk is still available through the other bus.
The ability to share data between two systems provides built-in redundancy that saves you from having to duplicate a data set. Fibre Channel really has a lot to recommend it, especially when you are trying to build a large shared storage system in a storage area network (SAN).

Installing a Fibre Channel Drive

To install a Fibre Channel drive, follow these steps:
If your Fibre Channel drive doesn't spin up, the most likely problem is that you don't have a good physical connection to the backplane. You should remove the drive and check its mounting and then reinsert it carefully and firmly. If the drive does spin up but isn't recognized by the Fibre Channel HBA, you need to run the host adapter setup utility that ships with your adapter to enable the drive through software.