Connectivity refers to how the SAN as a whole integrates into the data center. In this discussion of connectivity, we will draw a theoretical line between SAN components and server processing. As such, the SAN as a whole includes the components that connect to the FC switch with a dotted line drawn at the HBA. This allows us to position the FC SAN in a set of server populations and existing configurations. Even if only used as a logical model, it helps us visualize the inevitable problems. Figure 16-1 shows the logical demarcation line we will use throughout the chapter.
Implementing a SAN configuration into a data center presents three challenges right off the bat. The first challenge is the need to configure a multiswitch SAN. As I/O workloads, driven by supporting applications, are applied, the need to provide multiple connections, discrete zones, and paths for performance drives the configuration into multiple FC switches. There's just no way around it. Over and above this is the need to provide some level of recovery and redundancy for data availability and protection scenarios. Existing policies and practices will drive this challenge; if not, it's good to start thinking about recovery, redundancy, and single points of failure, of which a single-switch strategy presents an obvious one.
The second challenge is providing support to the server population within the data center, which can be driven by the justification for the SAN in terms of server consolidations and supported applications. Note, though, that it also moves into the realm of data ownership, production cycles and processing, and data sensitivity. More importantly, it pushes the need for a heterogeneous SAN supporting more than one operating system at the attached server level. In many cases, this becomes the big SAN imbroglio, in which implementation becomes political and ceases being productive or relatively anxiety-free, or, for that matter, fun in any way.
The third challenge is supporting external networks, which encompasses all of the issues mentioned previously. The ability to provide the data stored within the SAN to corporate networks is dependent on the external network points within the data center, as well as the capability of the attached server to distribute this data within the infrastructure it is connected to.
The need to configure a multiswitch SAN solution will become evident when designing your first solution. Any solution should include a rationale for the configuration orientation used (for example, cascading, core/edge, or mesh). Initial implementations often move from a simple cascading architecture to a multilayered cascading architecture without the benefit of, or options for, moving into a more complementary configuration. Keep in mind that SAN configurations are driven by supported I/O workloads. As such, the need to provide recovery and redundancy should play a significant role in the design. Here are a few ideas to chew on.
OLTP Workloads Using a core/edge configuration enhances performance through storage access at the edge, while reducing instances of single points of failure for high-availability transactions accessing multiple I/Os through the SAN configuration (shown in Figure 16-2).
Figure 16-2: OLTP workloads supported by a core/edge configuration
Web Applications/Messaging Workloads Using a mesh design provides alternate paths for high-traffic, asynchronous I/O processing, while reducing instances of single points of failure. It relies on effective switching methods for high-traffic management within the SAN configuration (shown in Figure 16-3).
Figure 16-3: Web and messaging applications supported by a mesh configuration
Data Warehouse/Datacentric Workloads Using a cascading design provides the necessary access and performance for datacentric transactional workloads. Here, availability requirements are reduced, but access to large databases remains paramount when it comes to processing time and I/O-intensive transactions (shown in Figure 16-4).
Figure 16-4: A data warehouse application being supported by a cascading configuration
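The workload-to-topology guidelines above can be condensed into a simple decision sketch. This is an illustrative heuristic only; the workload class names and the mapping are a restatement of the guidelines in this section, not a vendor sizing rule.

```python
# Illustrative heuristic: encode the workload-to-topology guidance from
# this section. Class names and mappings are assumptions for illustration.

def suggest_topology(workload: str) -> str:
    """Map a workload class to a candidate multiswitch configuration."""
    guidelines = {
        "oltp": "core/edge",            # edge storage access, high availability
        "web": "mesh",                  # alternate paths for high traffic
        "messaging": "mesh",            # asynchronous I/O, path redundancy
        "data_warehouse": "cascading",  # throughput over availability
    }
    return guidelines.get(workload.lower(), "unknown")

print(suggest_topology("OLTP"))            # core/edge
print(suggest_topology("data_warehouse"))  # cascading
```

An unrecognized workload falls through to "unknown", reflecting the point that the rationale for a configuration orientation has to be made explicitly, case by case.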
Coming up with the appropriate SAN design for I/O workloads also requires some thought about recovery. Though driven by the application's needs, a working design should encompass the current backup/recovery processes that are in place within the data center. It's good to reconsider recovery issues relative to the storage capacities of the SAN and the recovery requirements of the data. In a normal backup situation, data is copied from the storage arrays to an attached server and shuttled onto the network so it can get to the backup servers attached to tape drives. In all likelihood, these are located close to the tape library. However, if the SAN starts out with capacities over 500GB of user data, this scenario might just overwhelm existing backup practices.
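A back-of-envelope calculation shows why the 500GB figure strains LAN-based backup. The link speed (Fast Ethernet, 100 Mb/s) and the 50 percent effective utilization are assumed figures for illustration, not measurements.

```python
# Back-of-envelope backup-window estimate for LAN-based backup of the
# 500GB example above. Link speed and utilization are assumed figures.

def backup_hours(data_gb: float, link_mbps: float, efficiency: float = 0.5) -> float:
    """Hours to move data_gb across a link running at the given efficiency."""
    data_bits = data_gb * 8 * 1e9            # decimal gigabytes to bits
    effective_bps = link_mbps * 1e6 * efficiency
    return data_bits / effective_bps / 3600  # seconds to hours

print(round(backup_hours(500, 100), 1))      # roughly 22 hours over Fast Ethernet
```

A backup window of nearly a full day for a single pass makes the case for keeping backup traffic inside the SAN, as discussed next.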
Consideration should be given to backup/recovery within the SAN infrastructure, which means the addition of external tape media (tape drives and tape libraries) providing backup/recovery operations integrated into the SAN. Designs will have to be modified, regardless of configuration (core/edge, mesh, cascading), in order to accommodate the backup traffic and backup/recovery software, meaning the addition of a bridge/router component. This setup adds another benefit (which is largely dependent on recovery requirements and storage capacities): the evolving capabilities of the FC fabric to provide a direct copy operation in support of backup software will eliminate a significant amount of I/O overhead. Often referred to as server-free or server-less backup, these configurations will be covered again in Chapters 22 and 23.
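The overhead savings of server-free backup can be seen by comparing the two data paths just described. The hop lists below are a deliberate simplification for illustration; real paths involve more components, but the contrast in store-and-forward stages is the point.

```python
# Simplified model of the two backup data paths described above.
# LAN-based backup shuttles each block through the host and backup
# servers; server-free backup copies disk to tape directly through the
# fabric bridge/router. Hop lists are illustrative assumptions.

lan_path = ["array", "host_server", "lan", "backup_server", "tape"]
server_free_path = ["array", "fc_bridge", "tape"]

# Every element between source and destination is a stage the data
# must cross, each adding I/O overhead.
print(len(lan_path) - 2, "intermediate stages (LAN-based)")
print(len(server_free_path) - 2, "intermediate stage (server-free)")
```

Cutting the intermediate stages is where the "significant amount of I/O overhead" goes away: the attached servers never touch the backup data stream.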
The next challenge is determining which servers to support. Not only is this driven by what applications are moving over to the SAN, but also what strategy justified the SAN in the first place. If consolidation justified the SAN, then someone will be looking to retire or redeploy a number of servers through the consolidation efforts, which brings up an interesting set of activities.
First is the modification of existing capacity plans to calculate the requirements of new or redeployed servers, and to calculate the necessary I/O workloads as they contribute to the design of the SAN (some examples and guidelines for estimating I/O workloads are upcoming in Chapter 17, with specifics for SANs following in Chapter 18). This will very likely force you to think about moving applications that run on both UNIX- and Windows-based servers. And though this is possible, and many vendors provide a mountain of literature on the theory and ability to support this configuration, it significantly raises the cost of the SAN, not necessarily in terms of added hardware or software, but in the complexities encountered implementing and supporting these configurations. Consider the following carefully.
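The capacity-plan modification amounts to a consolidation worksheet: sum the per-server peak demands so the aggregate can feed the SAN design. The server names and figures below are invented for illustration; real estimating guidelines follow in Chapter 17.

```python
# Hypothetical consolidation worksheet: aggregate per-server peak I/O
# demand to size the SAN design. Names and figures are invented.

servers = [
    {"name": "unix-db1",  "peak_iops": 4000, "peak_mbps": 60},
    {"name": "win-file1", "peak_iops": 800,  "peak_mbps": 25},
    {"name": "win-mail1", "peak_iops": 1200, "peak_mbps": 15},
]

total_iops = sum(s["peak_iops"] for s in servers)
total_mbps = sum(s["peak_mbps"] for s in servers)

print(f"aggregate peak: {total_iops} IOPS, {total_mbps} MB/s")
```

Note that the worksheet already mixes UNIX and Windows servers, which leads directly to the heterogeneity issues discussed next.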
UNIX- and Windows-based servers do not access or handle storage in similar fashions. Even though there are multiple solutions based on POSIX standards, the capability to share devices within the SAN is not yet a reality, which means strict segregation has to be enforced with system-level zoning and LUN masking strategies that are both time-consuming and complex.
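The segregation just described implies ongoing bookkeeping: every zone must be audited so UNIX and Windows HBAs never share a zone (and hence never see each other's LUNs). The sketch below models that audit; the zone and host names are hypothetical, not a real switch configuration syntax.

```python
# Sketch of the segregation audit implied above: flag any zone whose
# members span more than one operating system. Names are hypothetical.

host_os = {
    "hba_unix1": "unix",    "hba_unix2": "unix",
    "hba_win1":  "windows", "hba_win2":  "windows",
}

zones = {
    "zone_unix_db":  ["hba_unix1", "hba_unix2"],
    "zone_win_apps": ["hba_win1", "hba_win2"],
}

def mixed_os_zones(zones, host_os):
    """Return names of zones whose members span more than one OS."""
    return [name for name, members in zones.items()
            if len({host_os[m] for m in members}) > 1]

print(mixed_os_zones(zones, host_os))  # [] means segregation holds
```

In practice, this check would run against the switch's zoning database after every change, with a matching LUN-masking check on the storage arrays.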
Applications supported by UNIX- or Windows-based servers have drastically different service levels. Design and configuration of the SAN will either add resources to the Windows area, which doesn't need them, or compromise the UNIX environment by not providing sufficient SAN resources. Plus, recovery requirements are going to differ, which means the two environments will either share the tape media, which is likely to result in problems, or each will need its own tape media for backup.
We should stop here and make sure we're not overlooking one of the big-time workhorses within the data center: the IBM and IBM-compatible mainframes. The need to interconnect a mainframe to open computing has been with us since the first UNIX server deployed its data throughout the enterprise, and the Windows PC began to develop its own legacy data while needing to be integrated with the corporate record on the mainframe. Unfortunately, mainframe solutions continue to elude vendors, IT systems organizations, and research agendas. Not that disparate and site-specific solutions don't exist; they do, if only to support specific business applications. Integration of IBM mainframes into server environments remains elusive, even counting all the money IBM has thrown at making the old, but reliable, MVS operating system both compliant with POSIX and available to networking infrastructures like Ethernet and TCP/IP.
To its benefit, and its everlasting detriment, the IBM mainframe-processing model predates many of the innovations we are now working with, including storage networking, data sharing, device pooling, and scalable configurations, both internally at the high-end SMP level and in the multiprocessing, multitasking transaction performance that no other processing platform can beat. Sadly, and perhaps tellingly, the IBM mainframe-processing model was a proprietary system that resisted evolving into the client/server architecture for too long and was thus eclipsed by its own inability to deploy and support new and distributed applications in a cost-effective manner.
There is, however, hope on the horizon, in the form of a serious revolution in the IBM mainframe world. The traditional operating environment has given way to a new operating system: z/OS. This next step in the evolution of MVS provides a multipartitioned operating environment in which you can run IBM Linux systems on user-defined partitions that leverage the power of the mainframe while ensuring that it remains cost-effective.
So what does this have to do with storage and, specifically, integrating a Storage Area Network into the data center? Quite a bit, if you want the truth. Because here's the first question you're going to hear: "Great! When can we connect to the mainframe?"
The ability to share storage resources across the complete population of servers, keeping in mind that z/OS facilitates a logical set of servers, is a big-time perk for data center managers. Now, here's the flipside. IBM mainframes are moving toward their own orientation to storage area networking and are connected to their data paths by a fiber optic connection, which the IBM folks refer to as channels. This evolution from legacy bus-and-tag cables and Enterprise Systems Connection (ESCON) cabling, not to mention the low-level channel link protocols, to Fibre Connectivity (FICON) supplies the first real direct connection to the IBM mainframe usable by other open storage networks.
This level of service in a SAN requires a FICON connection at the FC switch port level. IBM has been a pioneer in connecting storage through a switched environment, having integrated director-level switches for some time. At the end of the day, providing switch-to-switch communication with IBM storage switches may be an alternative to the evolving mainframe connectivity conundrum. Entry through an IP address will also provide connectivity, but the bandwidth restrictions make this alternative unlikely.
Figure 16-5 illustrates the many alternatives to supporting a heterogeneous OS data center through a SAN infrastructure.
What if a SAN has to support an OLTP application for a highly available business process in three time zones? Not a problem. At least it's not my problem. I'm just a little ol' Storage Area Network.
Though maybe not.
The ability to access the SAN remotely, instead of going into the data center every time you need to access SAN operations and information, can be configured to meet administrators' needs through external network access. The SAN configuration discussed in Chapter 15 required that SAN operations be performed through a dedicated workstation directly attached to one of the SAN switches. Given that this is where software configuration tools can be directly accessed for setup, maintenance, and initialization activities, most switch products now offer IP access points that make some of the SAN operational tools available remotely.
There are several innovations waiting in the wings that will drive advanced connectivity options for the SAN. Key among these is the need to further integrate IP-based storage solutions with FC networks. However, this will require a more sophisticated usage of storage systems, driven by user data that is increasingly distributed throughout a corporation. Analysis of current data usage scenarios shows that iterations of data must make several stops within an enterprise infrastructure in order to satisfy distributed application requirements.