Network Storage Architectures That Overcome DAS Limitations


The Internet makes it clear that we live in the Information Age, where data access is paramount to an organization's success. Many organizations have realized that the most valuable asset in their entire IT infrastructure is their data. The efficiency of automated business functions depends on storing data and providing access to it. In some cases, an organization's data was collected over decades and represents irreplaceable business intelligence.

One-to-Many Relationships with DAS

The primary problem with DAS is that it has single points of failure that can block access to data. Bus topologies for data access have never reflected the central role that data plays in the overall scheme of data processing.

DAS uses a one-server-to-many-storage relationship structure, in which a single server controls one or more storage devices, as shown in Figure 1-5.

Figure 1-5. One-to-Many Relationship Structure of DAS


A subtle detail in DAS's relationships is that a single bus controller manages all bus activity. In general, storage entities on a DAS bus do not communicate directly with other storage entities. Even though the technology might support storage-to-storage communications, those implementations are not common.

Many-to-Many Relationships with Storage Networking

If data is the most important asset, the connectivity providing access to it must also be highly valued. Communication topologies for data access need to be flexible, resilient, scalable, and secure.

In traditional client/server environments, server systems have been the center of attention. Client systems located throughout the organization can access applications running on centralized servers. However, with the DAS one-server-to-many-storage relationship structure, there is no way to build a flexible connectivity infrastructure between servers and storage that provides the necessary options for alternative paths to data.

The missing link is clear: The communication topology between servers and storage has to change.

Network storage changes the relationships between servers and storage by creating peer-level communications between all entities in the network, servers and storage alike. Whereas DAS uses a one-to-many relationship structure, network storage has a many-to-many relationship structure where multiple servers and multiple storage devices can all communicate with each other.

This new arrangement creates all kinds of new opportunities for storage applications. Servers can communicate with storage but can also communicate with each other over a storage channel. Storage can communicate with servers and other storage to provide a variety of management functions, including copying data from one location to another.
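The difference between the two relationship structures can be sketched as simple connectivity maps. This is an illustrative model only, not anything from the book; all entity names are hypothetical.

```python
# Modeling DAS's one-to-many topology versus a storage network's
# many-to-many topology as reachability maps (entity -> reachable peers).

def das_topology(storage_per_server):
    """DAS: each server owns its storage; no device is reachable from any other entity."""
    return {srv: set(devs) for srv, devs in storage_per_server.items()}

def san_topology(servers, storage):
    """Storage network: every server and every storage device can reach every other entity."""
    entities = set(servers) | set(storage)
    return {e: entities - {e} for e in entities}

das = das_topology({"srv1": ["disk1"], "srv2": ["disk2"]})
san = san_topology(["srv1", "srv2"], ["disk1", "disk2"])

print("disk2" in das["srv1"])   # False: srv1 cannot reach srv2's captive storage
print("disk2" in san["srv1"])   # True: any-to-any connectivity
print("disk2" in san["disk1"])  # True: storage can talk to storage (e.g., remote copy)
```

The last line reflects the storage-to-storage communication mentioned above, which enables functions such as copying data from one location to another without a server in the data path.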

Beyond traditional server/storage relationships, network storage introduces new network entities into the mix. Switches, routers, and gateways can all have active roles as peers in storage communications. There is no question they are essential parts of the plumbing, but they might also be used as active participants in the management of storage.

Connection Flexibility of Storage Networks

When SANs were first introduced, they were commonly described as the storage network "behind" the servers in a client/server network. Sometimes this was illustrated by placing a LAN between clients and servers and a SAN between servers and storage, as shown in Figure 1-6.

Figure 1-6. A LAN "Front End" and a SAN "Back End" Network


One of the main differences between the simple SAN model illustrated in Figure 1-6 and the DAS data-access model is the presence of a spare server in the SAN model. The spare server is the "+1" part of an N+1 deployment. If an application server needs maintenance or fails, the spare server can assume its IP address and take its place in the client/server network.
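The N+1 takeover described above can be simulated in a few lines. This is a hedged sketch of the idea, not a real cluster manager; the `Cluster` class, server names, and addresses are all invented for illustration.

```python
# Simulating N+1 failover: a spare server assumes the failed server's
# IP address and takes its place in the client/server network.

class Cluster:
    def __init__(self, active, spare_name):
        self.active = dict(active)   # server name -> IP address clients use
        self.spare = spare_name      # the idle "+1" server

    def fail_over(self, failed_name):
        """Spare assumes the failed server's IP address and client-facing role."""
        ip = self.active.pop(failed_name)
        self.active[self.spare] = ip
        self.spare = None            # the spare is now in service
        return ip

cluster = Cluster({"app1": "10.0.0.11", "app2": "10.0.0.12"}, spare_name="spare1")
ip = cluster.fail_over("app1")
print(ip)                        # 10.0.0.11 -- clients keep using the same address
print(cluster.active["spare1"])  # the spare now answers at that address
```

Because clients address the service by IP, they are unaware that a different physical machine is now behind it; the spare then logs in to the same storage to resume data access.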

More importantly, where data access is concerned, the spare server avoids the DAS-style cabling changes that necessitate downtime: it simply logs in to the appropriate storage and immediately starts accessing data. In the case of a failure, some data integrity checking must be performed first, but there is no need to power off the storage or the spare server to make wiring changes.

The concept of a spare server can easily be extended to incorporate server upgrades. A server system lacking CPU resources for an increasing workload can be replaced with a different, more powerful machine. The new system can noninvasively be plugged into the storage network and tested before it connects to the storage. Depending on the capabilities of the operating system, the cutover from the old server to the new one can take seconds, as opposed to hours. Figure 1-7 shows a new server taking over for an aging server that is being retired.

Figure 1-7. Upgrading Servers in a Storage Network Environment


Similarly, storage can also be upgraded. New storage can be connected to the storage network, tested, and prepared for use. Upgrading storage is more complicated than upgrading a server because the data has to be moved from one storage location to another. There are several ways to accomplish this, but the key point is that nothing has to be powered off and no cabling changes have to be made during the cutover process.
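The storage-upgrade sequence above can be sketched as three steps: attach and test the new array, migrate the data while the old array stays online, then cut servers over to the new array. The dictionaries below are hypothetical stand-ins for real storage arrays, used only to illustrate the ordering of the steps.

```python
# Illustrative online storage-upgrade sequence: nothing is powered off
# and no cabling changes are made during the cutover.

old_array = {"vol1": b"customer records", "vol2": b"order history"}
new_array = {}

# Step 1: the new array is attached to the storage network and tested
#         noninvasively (represented here by its mere existence).

# Step 2: data is migrated volume by volume while the old array stays online.
for name, data in old_array.items():
    new_array[name] = data

# Step 3: cutover -- servers simply start addressing the new array.
active_array = new_array

print(active_array["vol1"] == old_array["vol1"])  # True: data intact after cutover
```

In a real environment, step 2 would be performed by replication or migration software while applications continue running, and step 3 would be a change of storage addressing rather than a variable assignment.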

Architectures for Data-Centric Computing

In today's computing environment, it is virtually certain that systems and storage products will come and go, but the data itself will remain useful and important for far longer.

One of the most significant contributions storage networking has made is the introduction of new data-centric computing architectures. Whereas client/server computing uses an application-centric architecture, with many client systems accessing centralized application servers, storage networking facilitates data-centric architectures with multiple servers accessing centrally located storage subsystems.

Figure 1-8 depicts a data-centric computing architecture implemented with a storage network.

Figure 1-8. A Data-Centric Computing Architecture


One of the most important aspects of Figure 1-8 is the presence of multiple access points for storage and systems in the storage network. Obviously, if there is going to be many-to-many connectivity, the individual components have to be able to support it. These access points can be physical network ports with unique addresses, such as network interface cards (NICs) and host bus adapters (HBAs), or they could be logical in nature, such as service access points (SAPs), subaddresses, and virtual network addresses.

With plentiful connection options, it is possible to build storage networks in which data, and the access to it, are given the highest priority by the architecture. Data-centric computing is not possible without robust, fault-tolerant, multiported storage that can easily be connected to many different servers simultaneously.

The Mighty Notebook Weighs In

Sometimes it's easier to understand concepts like "data-centric computing" by changing the context a bit. This time, notebook computer storage is used to illustrate data-centric computing.

Notebooks are self-contained, complete computing solutions that give you everything you need. The problem is that strange things happen to notebooks. They become lost, get stolen, get dropped, and become moody from travel abuse. If access to data is the goal, it's clear that the notebook's data needs to be protected in a way that makes it readily available.

A system-centric storage approach would back up the notebook's data to one or more CDs that are kept with the laptop. Obviously, if you lose your notebook, there is a good chance that the CDs and your data will also be lost. In the best-case scenario, a physically damaged notebook can be replaced with another using the first machine's hard drive. It's also possible that you might be lucky enough to restore data from CDs that survived. Otherwise, you might be out of data and out of luck. In any case, there is a definite loss of productivity (and opportunity).

The data-centric model does not assume that data is part of the notebook, but rather that it exists to be processed by applications running on some system. The data could be stored on an external disk or memory device. In fact, multiple copies could be made on multiple devices for safety and convenient access. Now, if something destroys the notebook, the data is still immediately available on any machine that has the software that can use it.

Of course, most of us do something like this with our own notebooks and desktops (we DO, don't we?). The rationale for building storage networks and data-centric computing environments is the same; it's just that the scale and stakes are different.

To make this concrete, the notebook's data could be stored on a small, portable, external disk drive with a universally common interface such as USB. This way, when the notebook goes AWOL, the data can still be accessed almost immediately; all that is needed is any compatible computer with the same applications and a connecting port for the external disk drive. This other system could be located at work, home, school, a friend's house, or a commercial business that has systems available for use, such as a Kinko's Copy Center or an Internet café.

The real power of the external drive is that it provides almost limitless connection options to systems. A USB portable disk drive can connect to millions of computers around the world.

Obviously, network storage subsystems designed for data center environments are not portable, nor do they use USB as an interface. But they can have the same powerful ability to connect storage to many potential servers that could be called upon to work with the data stored within.


Enterprise Storage

Data-centric architectures were initially promoted by EMC, IBM, and Hitachi as a benefit of large storage subsystem products classified as enterprise storage products.

Enterprise storage is a product concept that is something like a data "fortress." It has broad connectivity to support many servers in addition to having many redundant components and technologies for withstanding component failures and environmental problems. Data is stored with redundant protection within the enterprise storage subsystem.

Enterprise storage subsystems are discussed in greater detail in Chapter 5, "Storage Subsystems."

Redundancy and Data-Centric Topologies

This chapter closes with a glimpse of a topology that provides redundant data access paths to centralized storage. Figure 1-9 shows a data-centric environment with storage at the center of the infrastructure.

Figure 1-9. A Topology for Centralized Storage with Redundant Paths


Both servers in Figure 1-9 have the ability to access centralized storage, and all the data stored there, through two different paths. Chapter 11, "Connection Redundancy in Storage Networks and Dynamic Multipathing," covers redundant pathing in much greater detail.
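The essence of redundant pathing can be sketched as a path-selection routine: I/O goes down a preferred path, and if that path fails, it is retried transparently on the alternate path. This is a minimal illustration of the idea only, not the multipathing software covered in Chapter 11; the path names are invented.

```python
# Minimal dynamic-multipathing sketch: try each configured path in
# order and use the first one that is up.

def issue_io(paths, path_state, request):
    """Return (path_used, request) for the first available path."""
    for path in paths:
        if path_state[path] == "up":
            return path, request
    raise IOError("all paths to storage are down")

paths = ["hba0:port1", "hba1:port2"]  # two redundant paths to the same storage
state = {"hba0:port1": "up", "hba1:port2": "up"}

print(issue_io(paths, state, "read block 42")[0])  # hba0:port1 (preferred path)

state["hba0:port1"] = "down"                       # a path fails
print(issue_io(paths, state, "read block 42")[0])  # hba1:port2 (transparent failover)
```

Real multipathing drivers add policies beyond this simple priority order, such as round-robin load balancing across the available paths.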

In general, the goal of universal storage connectivity is the potential for any-to-any data access, not necessarily the reality of it. As long as a connection can be created instantly when it is needed, the storage network is doing its job. In practice, there are many reasons to restrict access between servers and systems that have nothing to do with each other under normal operating conditions.



Storage Networking Fundamentals: An Introduction to Storage Devices, Subsystems, Applications, Management, and File Systems (Vol 1)
ISBN: 1587051621
Year: 2006
Pages: 184
Authors: Marc Farley
