Storage Area Networks


A Storage Area Network (SAN) is not a specific technology. SANs have become so closely associated with Fibre Channel that some people think the two are the same, but they are not. A SAN is a storage system architecture, and Fibre Channel is one way of implementing it. Other technologies can be used to build a SAN as well. iSCSI is emerging as a way of creating a SAN without the expense and aggravation of dealing with Fibre Channel; it uses the more common Ethernet and IP infrastructure instead.

The SAN architecture replaces the I/O channel bus (Figure 2-6), common to technologies such as SCSI and ATA, with a network. Whereas NAS is file storage on a network, SANs are a network architecture for storage, capable of block I/O.

Figure 2-6. The SAN architecture replaces the DAS architecture


The storage bus architecture, developed for SCSI, suffers from a number of restrictions. Limited distance, a small address space, a low ceiling on the number of supported devices, and a one-host/several-devices topology make scalable, highly resilient storage systems difficult (and costly) to build and maintain. SANs, being network based, overcome these limitations. Maximum distance is measured in kilometers, address spaces run into the millions, and the number of devices is usually limited only by the address space. SANs are many-to-many systems and can easily support multiple paths between devices, which makes them less susceptible to outages caused by the failure of a single path.

A related benefit is better cable plant management. Because SANs use serial networking technology, the cables are thin and flexible. SAN cables can be dressed inside cabinets. They can be laid under floors and in ceilings, something that is very difficult with big, fat, rigid SCSI cables. This is a major concern in large installations.

DAS is difficult and expensive to scale. There is a one-to-one relationship between servers and storage: if more storage is needed, more servers often need to be added as well. In a SAN, adding more storage does not mean adding more processors, and adding more servers does not mean adding more storage. The storage and server architectures can be scaled independently.

The implementation and topology of a SAN vary greatly by the technology used. An IP-based SAN will be designed differently from a Fibre Channel one. SANs may also use more than one type of technology, mixing Fibre Channel, iSCSI, and WAN or MAN elements, depending on system needs and cost constraints. Whatever is used to build the SAN, the effect is the same: the servers and storage are decoupled, with a multipoint network placed between them.

SAN Components

A SAN needs to have only three components: a host, a network, and storage. Hosts can be computers of any type but are usually servers. Storage can be of any type, such as disk arrays, tape libraries, or CD-ROM jukeboxes. Most often, a SAN is deployed when there is a need to share storage or hosts. If the goal is to consolidate a data center, for example, there is a need to share equipment. In this instance, a SAN may be justified. For a high availability infrastructure, the ability to perform failover to a similar device often makes SANs an important option.

That said, there are a lot of variables when it comes to SAN components. The applications involved drive the type of network needed, which in turn defines the components.

Is It Possible to Just Add More Storage?

It would seem that scaling a SAN should be a relatively easy thing. If something more is needed, simply add it. It's never that simple. To begin with, adding more of anything to a SAN means more network ports are consumed. That presents a problem when adding one more port means buying an entire switch at considerable cost. This is a problem typical of all networks.

The bigger issue is with provisioning, or allocating resources. Depending on the type of network, there may not be enough bandwidth available for the application in question. Allocation of the storage space itself is a problem. Just because the storage is available doesn't mean that a server can use it effectively. If the volume manager doesn't allow a system administrator to change the size of the volume to encompass the new disk space, it may have to be set up as a separate volume. That's often fine for files but not for a database, which may not be able to span volumes. In any event, the application is likely to have to be taken offline for the changes to be made. That is not an attractive option in a high availability environment.


Fibre Channel (FC)

The most common technology used to build SANs today is Fibre Channel (FC). Fibre Channel combines serial network and I/O channel technology to create a high-performance, low-latency interconnect. Although there are copper wire implementations of Fibre Channel, fiber optic support was included from the beginning, giving it excellent long distance capabilities (Table 2-4). Today, fiber optic cables are preferred for Fibre Channel networks.

Table 2-4. Fibre Channel Cable Distances

    Cable Type                  Maximum Effective Distance
    ----------                  --------------------------
    Multimode fiber optic       500 meters
    Single-mode fiber optic     2 kilometers
    Copper                      30 meters


Fibre Channel is very fast. Current implementations have signaling speeds of 1 or 2 Gigabits per second; speeds of 4 Gigabits per second are also available, with 8 and 10 Gigabit implementations in development. The majority of the installed base of Fibre Channel is 1 Gigabit, with 2 Gigabit likely to be implemented in new SANs; 4 Gigabit FC is deployed mostly for switch-to-switch links and inside very-high-performance arrays.

Links, whether optical or copper, are full duplex, with each port capable of transmitting and receiving at the same time. Data is packaged in frames, and frames from different channels are interleaved. Because of these characteristics, data from one channel does not have to wait for all the data on another channel to be sent before performing its I/O. This is very important when writing to several disks or devices simultaneously. Unlike Parallel SCSI, there is no contention for a bus, though as in Ethernet, there can be network congestion issues.
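
A rough sketch of the effect, assuming nothing about real FC framing (the channel and frame labels below are made up): frames from several in-progress transfers are multiplexed onto the link in turn, so no channel has to wait for another channel's entire transfer to complete.

    from collections import deque

    def interleave_frames(channels):
        """Round-robin multiplexing of frames from several channels onto one link."""
        queues = [deque(frames) for frames in channels]
        while any(queues):
            for q in queues:
                if q:
                    yield q.popleft()  # one frame from this channel, then move on

    # Three channels, each with a multiframe transfer in progress.
    channels = [["A1", "A2", "A3"], ["B1", "B2"], ["C1", "C2", "C3", "C4"]]
    print(list(interleave_frames(channels)))
    # ['A1', 'B1', 'C1', 'A2', 'B2', 'C2', 'A3', 'C3', 'C4']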

Fibre Channel Network Stack

Like most network technologies, Fibre Channel has a stack that defines the various layers of functionality. Each layer is defined by a set of standards and specifications that are maintained and expanded by the INCITS T11 committee (www.t11.org) in the same way that the T10 committee manages the standards for SCSI.

Each layer in the stack (Table 2-5) describes the behavior of Fibre Channel at a certain level, starting with the FC-0 level and progressing to FC-4.

Table 2-5. Fibre Channel Stack

    Fibre Channel Layer         Description
    -------------------         -----------
    FC-0                        Physical
    FC-1                        Transmission
    FC-2                        Framing and signaling
    FC-3                        Common services
    FC-4                        Upper-level protocol mapping

FC-0 defines the physical layer, including signaling, media, cable requirements, and receiver and transmitter specifications. FC-1 specifies the transmission encoding scheme. It uses 10 bits to encode 8 bits of data; the two extra bits keep the signal electrically balanced, provide enough transitions for clock recovery, and allow many transmission errors to be detected. This is why it is called 8b/10b encoding. The FC-1 layer also includes a set of specifications that define link layer behavior.
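
The encoding overhead explains the commonly quoted throughput figures. A quick back-of-the-envelope calculation, using the standard 1 Gbps Fibre Channel line rate of 1.0625 Gbaud:

    # Effective data rate of a 1 Gbps Fibre Channel link after 8b/10b encoding.
    signaling_rate = 1.0625e9     # line rate: bits on the wire per second
    encoding_efficiency = 8 / 10  # 8 data bits carried in every 10 line bits

    data_bits_per_sec = signaling_rate * encoding_efficiency  # 850,000,000
    data_bytes_per_sec = data_bits_per_sec / 8                # 106,250,000

    print(f"{data_bytes_per_sec / 1e6:.2f} MB/s")  # 106.25 MB/s, the familiar "100 MB/s"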

The FC-2 specification defines the format of the FC frame, flow control, fabric login and logout, classes of service, and frame delivery characteristics. FC-3 provides specifications for common services, such as multicasting.

The top layer, FC-4, is most responsible for the success of Fibre Channel as a technology. FC-4 defines the mappings of common protocols onto the FC infrastructure. From the beginning, FC was designed to allow all types of networking and application protocols to ride on top of the network infrastructure. The most common is the Fibre Channel implementation of SCSI (FCP), but many others are also available. Internet Protocol, or IP, is another common protocol readily available in an FC environment. IP is often used to carry management information in-band and has been used in specialized applications in the same fashion as IP over Ethernet. Fibre Channel is capable of accommodating storage protocols over a network while maintaining a high-performance, low-latency environment.

Fibre Channel Topology: Point-to-Point, Fabrics, and Loops

Fibre Channel supports three topologies: switched fabric, point-to-point, and arbitrated loop. Point-to-point refers to connecting two devices without the benefit of a network. Although fairly uncommon, it is sometimes used when there is no need to share devices but distance or performance is a problem. It has become less common since the advent of Ultra3 and Ultra320 SCSI.

Switched fabric, usually referred to simply as fabric, uses a Fibre Channel switch to provide full-bandwidth connections between nodes in the network. Fabrics may consist of one or many switches, depending on scale, availability, and cost considerations. Fibre Channel fabrics also provide other network services: switches implement naming, discovery, and time services as part of the fabric. This is different from many other network architectures. In an IP network, for example, DNS does not have to be part of a switch or router and is often run on a separate device; in a Fibre Channel fabric, these services are required to be implemented in the switch and are integrated into the fabric.

The terms fabric and switch are often used interchangeably. They are not the same and should not be used as such. Fabric refers to the topology and services, but not a device. A switch is a network device that implements the fabric.

Switches and Directors

Another type of Fibre Channel switch is called a director. Directors are switches, just large ones. They are loaded with high-reliability features, but at a higher cost. Switches can be quite small, as small as 8 ports, although 16-port versions are more common. Directors tend to have 64 ports or more; the largest have hundreds of ports. All directors have high-end reliability features such as two or more power supplies, automatic failover of critical components, and the ability to perform firmware upgrades without powering down the unit.


The third Fibre Channel topology is called Fibre Channel Arbitrated Loop, or just loop. Arbitrated loops were conceived of as an inexpensive way to implement a SAN by eliminating the relatively expensive switch, along with its integrated services. In an Arbitrated Loop, all nodes are connected in a loop, with frames passing from one node to the next. All nodes on the loop share the available bandwidth, which inhibits the scalability of Arbitrated Loop. The more nodes that are transmitting on the network, the less bandwidth is available for any individual node. As fabric switches have become less costly, Arbitrated Loop has fallen from favor. It is used mostly inside storage devices such as arrays.

Fibre Channel Addressing

All Fibre Channel nodes carry an address called a World Wide Name (WWN). The WWN is a unique 64-bit identifier. Much like an Ethernet MAC address, part of the WWN is unique to the manufacturer of the equipment, and the rest is usually a serialized number. However it is created, the WWN is unique and specific to a physical port.
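
As a rough illustration (the WWN below is made up, and real WWNs come in several layouts), the manufacturer portion can be pulled out much as an OUI is read from a MAC address. This sketch assumes the common IEEE "format 1" layout, where a 0x1000 prefix is followed by a 48-bit MAC-style address:

    def wwn_oui(wwn: str) -> str:
        """Extract the manufacturer ID (OUI) from a format-1 WWN.

        Other WWN formats place the OUI elsewhere, so this is illustrative only.
        """
        octets = wwn.lower().split(":")
        assert len(octets) == 8, "a WWN is 64 bits = 8 octets"
        assert octets[:2] == ["10", "00"], "this sketch handles only format-1 WWNs"
        return ":".join(octets[2:5])  # the remaining bytes are the serialized part

    print(wwn_oui("10:00:00:00:c9:7f:12:34"))  # hypothetical WWN -> 00:00:c9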

Changing the World Wide Name

Although it would seem that the WWN is immutable, there are some instances in which it can be changed. WWNs are often placed in nonvolatile memory (NVRAM) and as such can be changed, given the right utility. A utility such as this would need to be available at boot time, before the port was fully initialized. In the early days of Fibre Channel, it was not uncommon to find host bus adapters with this capability.

Why would anyone want to do such a thing? Two ports with the same WWN would have an effect similar to two devices with the same MAC address in an Ethernet environment: at least one of the devices would not be able to log in to the network, or devices would become confused as to the origin of a frame.

The reason this facility sometimes exists is that the NVRAM that holds the World Wide Name can become corrupted, making the port unusable. This used to happen with host bus adapters placed in poorly shielded computers. It is a dangerous utility to have around (as will be evident in Chapter 6: Storage System Security) and should be locked away, where no one can get at it.

Some devices do this on purpose. In certain multiport devices, some ports are kept offline in case a port fails. If one does, the spare port becomes active with the original port's WWN. This makes it look like the original port to the network. This method of failover has the advantage of not requiring hosts or applications to do anything. On the other hand, I/O is usually lost, and some applications may fail during the changeover.


A 64-bit address is very large. Needing to have a frame carry two of them, one for the source and one for the destination, makes routing frames between ports cumbersome. To combat this problem, Fibre Channel uses an alternative addressing scheme within switched fabrics: each port is assigned a 24-bit port address when it logs in to the fabric. There are two advantages to this. First, it is faster to route frames on a smaller address, and doing so takes less processor time. Second, the addresses are dynamically assigned and managed by the fabric operating system, which makes it easier and faster for the fabric to deal with changes.
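
The 24-bit fabric address still allows 2^24, or roughly 16.7 million, addresses per fabric, and it has a well-known internal structure: 8 bits each for the Domain ID (the switch), the Area ID (typically a group of ports), and the Port ID. A minimal sketch of decoding one:

    def decode_fcid(fcid: int) -> dict:
        """Break a 24-bit fabric port address into its three 8-bit fields."""
        assert 0 <= fcid <= 0xFFFFFF, "fabric addresses are 24 bits"
        return {
            "domain": (fcid >> 16) & 0xFF,  # identifies the switch
            "area": (fcid >> 8) & 0xFF,     # typically a group of ports
            "port": fcid & 0xFF,            # the individual port
        }

    print(decode_fcid(0x010F00))  # {'domain': 1, 'area': 15, 'port': 0}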

Fibre Channel SAN Components

Like other network technologies, Fibre Channel needs certain basic components to create the network infrastructure and connect to it. In addition, a Fibre Channel SAN needs to have storage; otherwise, it's not really a SAN.

There are basically three major components to a SAN. The first is the host bus adapter (HBA). Deriving its name from the SCSI HBA, the FC HBA allows host devices to connect to the Fibre Channel network. Almost all FC HBAs support fabric, point-to-point, and arbitrated loop topologies. The host connector depends on the type of computer that it is intended for, but most support the PCI bus. Newer HBAs support PCI-X, and a smaller number of older adapters have support for Sun Microsystems' S-bus technology.

The second piece of the Fibre Channel SAN is the network equipment. Because most FC networks are installed as fabrics, switches are needed. The type of switch will depend on the nature of the SAN applications. A small SAN will likely use one or more 16-port switches, whereas a large, high availability SAN is a good candidate for a director.

Finally, it would not be a SAN without storage, so Fibre Channel-enabled disk arrays, tape libraries, and other storage devices will be deployed in the network. There are other components that can be added, including device management software, storage resource management (SRM) software, gateways to other networks, and various appliances. However, with HBAs, a switch, and some storage, the basics of a SAN are in place.

Zoning

A concept important to Fibre Channel SANs is zoning. In a nutshell, zoning is a crude method of limiting access to SAN resources. It is specific to fabrics (it is a fabric service) and allows a set of nodes to see only other nodes in the same zone. Similar to LUN masking, zoning controls what nodes can discover; it doesn't, by itself, restrict frames from being sent from one node to another. Originally conceived as a way of keeping operating systems from seeing and grabbing hold of resources already claimed by other hosts, zoning has become an important tool for managing large SANs.

There are two types of zoning. The first is based on the World Wide Name of a Fibre Channel port. Called soft zoning, it has the advantage of flexibility. Zones do not need to change when you move a device to a different port. The disadvantage is that it is a less secure method of zoning. This is discussed in detail in Chapter 6, Storage System Security.

The second type of zoning is called hard zoning. Hard zoning creates zones based on the switch port address. In some ways, this is less amenable to changing environments, because the zone has to be changed when devices are moved from one port to another. There are some security benefits to hard zoning.

A new device belongs to no zone by default; there is no default zone that unassigned ports fall into. It is considered good practice to zone all ports, even ones not in use.
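
To make the distinction concrete, here is a toy model of the two zoning styles, with made-up WWNs, switch names, and zone names. Soft zones key on a device's WWN, so membership follows the device; hard zones key on the physical switch port, so membership stays with the port:

    # Illustrative only: soft (WWN-based) vs. hard (port-based) zone membership.
    soft_zones = {
        "oracle_zone": {"10:00:00:00:c9:7f:12:34", "21:00:00:e0:8b:05:05:04"},
    }
    hard_zones = {
        "backup_zone": {("switch1", 3), ("switch1", 7)},  # (switch, port) pairs
    }

    def same_soft_zone(zone, wwn_a, wwn_b):
        # Moving a device to another port does not require a zone change.
        return {wwn_a, wwn_b} <= soft_zones[zone]

    def same_hard_zone(zone, port_a, port_b):
        # Moving a device to another port drops it out of the zone.
        return {port_a, port_b} <= hard_zones[zone]

    print(same_soft_zone("oracle_zone",
                         "10:00:00:00:c9:7f:12:34",
                         "21:00:00:e0:8b:05:05:04"))                 # True
    print(same_hard_zone("backup_zone", ("switch1", 3), ("switch1", 5)))  # False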

Port Blocking

Fibre Channel switches do not have unlimited bandwidth. The amount of bandwidth available to service connections is a function of the amount of memory in the switch, the speed of the internal processors, and the design of the switch chips. It is hardly ever cost effective to provide enough bandwidth to handle full-bandwidth connections to all ports at the same time.

Instead, switches deploy a strategy of port blocking. When the devices on the Fibre Channel SAN are trying to transmit a quantity of data that exceeds what the switch is capable of handling, some ports will be temporarily blocked and applications forced to wait until system resources are available. This is not usually much of a problem. It is unlikely that all ports will want to receive and send at full bandwidth at the exact same time. Even when that does happen, which can occur in some very demanding environments, the amount of time that a port is blocked is small enough not to be noticed by an application.
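
The arithmetic behind blocking is simple oversubscription. As a hypothetical example, sixteen 2 Gbps ports can demand 32 Gbps at once; if the switch internals can carry only 20 Gbps, some ports must occasionally be blocked:

    # Hypothetical figures: oversubscription of a Fibre Channel switch.
    ports = 16
    port_speed_gbps = 2.0   # per-direction line rate of each port
    backplane_gbps = 20.0   # what the switch internals can actually carry

    demand_gbps = ports * port_speed_gbps   # 32 Gbps if all ports burst at once
    ratio = demand_gbps / backplane_gbps

    print(f"worst-case demand: {demand_gbps:.0f} Gbps")
    print(f"oversubscription: {ratio:.1f}:1")  # 1.6:1 -> occasional port blocking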

Tip

Some director-class switches claim to have all ports nonblocking. Read the fine print to see if this is really the case. It may be true for limited configurations but not for a full port count.


IP-Based SANs

For many years, the terms SAN and Fibre Channel were nearly synonymous. Although there were crude SCSI SANs using specialized and quirky devices, any real SAN was a Fibre Channel SAN. That has since changed. SANs using IP over Ethernet have proved to be a viable alternative for many applications.

Because Fibre Channel SANs already exist, this raises the question "Why bother?" FC SANs are proven technology, and IP networks are not nearly as deterministic and, hence, not as conducive to block I/O. Why introduce the SAN architecture into a new environment? The answer is cost and stability. Fibre Channel is still very expensive to implement. The components are specialized and very costly for many applications. It requires a specialized skill set within the IT department, as well as specialized management software that is largely not integrated with other networking suites.

An IP SAN, in comparison, leverages existing skills and knowledge in the IT department. It uses equipment common to other networking needs, including switches, NICs, test equipment, and software. Having one homogeneous network architecture also reduces management costs by leaving only a single problem set to deal with. For many IT professionals, maintaining an entirely separate network just for storage is difficult to justify.

Many of the components, especially switches, are less costly on a per-port basis. Even where equipment costs are on par, Ethernet prices are expected to drop faster, because Ethernet products sell in much higher volumes than Fibre Channel products do. Over time, the cost of an IP network should be less than that of a Fibre Channel one.

Fibre Channel has been plagued by incompatibility issues since its inception, and IT professionals are tired of having to deal with these problems. There is the perception that Ethernet and IP products are more stable. There is some truth to this. Ethernet and IP have been around much longer than Fibre Channel (by more than 20 years) and have very strong standards.

Simply put, Fibre Channel gives some people sticker shock, is overkill for many applications, and has a history of incompatibility problems. IP is a known problem set. Most IT departments understand IP and are comfortable with it.

IP SAN Performance: When Is It Good Enough?

The biggest drawback to implementing a SAN using IP networks is performance. Fibre Channel fabrics are deterministic, meaning that they can be relied on to deliver frames within a certain amount of time. Block I/O was originally built around a bus architecture that delivers all the data requested right away, as soon as the data transfer starts. Thus, block I/O depends on deterministic behavior to some degree. Fibre Channel can deliver that.

IP networks are by nature nondeterministic. Packets are delivered based on best effort. Congestion, loss of connection, and other disruptions in data delivery are common. This mode of operation can give high-performance storage applications fits, causing timeouts and other undesired behavior. It is fine for file I/O but can cause problems with block I/O.

That doesn't mean that IP networks are unusable for block I/O. Instead, there are limitations as to what can be reasonably expected in terms of performance. For many applications, it won't matter at all. Tape backup, for example, benefits from a SAN because of connectivity, not performance. Most backup is already done over IP networks, and the slow speed of most tape drives does not put demands on the network.

In situations where performance is not the chief concern, but cost and connectivity are, IP SANs are good enough.


iSCSI

Many protocols have been introduced for building SANs over IP networks. Most were brought out by now-defunct companies. The industry settled instead on iSCSI, which has emerged as the standard for SANs on IP networks. iSCSI, as its name implies, is a form of the SCSI protocol that operates over an IP network instead of the Parallel SCSI physical layer. It has the same general architecture, with initiators sending commands to targets, which deliver data and responses.

Like Fibre Channel, iSCSI allows for many-to-many configurations, very long distances (though how long remains to be seen), and a much larger address space. iSCSI maps SCSI LUN addresses to IP addresses and port numbers, allowing for millions of potential addresses.

iSCSI uses TCP/IP as its transport, which means it can work on any physical network that supports IP (including Fibre Channel). Gigabit Ethernet will be needed to do almost anything useful. Storage applications are heavily affected by bandwidth and latency, and the high speed of Gigabit Ethernet will be necessary for applications to perform effectively. New generations of NICs that place the TCP/IP stack on a chip will also help with high-performance applications.
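
In practice, an iSCSI target is named with an IQN and reached at an IP address and TCP port (3260 is the registered default); LUNs then hang off the target. A toy model of the addressing, with all names and addresses invented for illustration:

    from dataclasses import dataclass

    @dataclass(frozen=True)
    class IscsiTarget:
        iqn: str          # iSCSI Qualified Name: iqn.<year-mo>.<reversed-domain>:<label>
        ip: str           # portal address
        port: int = 3260  # IANA-registered default iSCSI port

    # One target exposing two LUNs. The (IP, port, IQN, LUN) combination is
    # what replaces the old bus/target/LUN addressing of Parallel SCSI.
    target = IscsiTarget("iqn.2005-01.com.example:array1", "192.168.10.20")
    luns = {0: "boot volume", 1: "database volume"}

    for lun, desc in luns.items():
        print(f"{target.iqn} @ {target.ip}:{target.port}, LUN {lun}: {desc}")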

Storage NICs

Some iSCSI interface cards use the term storage NIC instead of host bus adapter. This is, in part, a recognition that iSCSI runs over Ethernet and IP, where NIC is the term of choice. A storage NIC differs from a standard Ethernet NIC only in that the iSCSI protocol is onboard and not in the OS network stack. Even then, some manufacturers have chosen to call storage NICs host bus adapters.

