The storage admin team recommended the SAN solution with subsequent approval from IT executives and reluctant concurrence from the application design team. Because this would be the first installation of the storage networking technology in the data center, and because of the visibility of the applications, the design team's reluctance was understandable. However, an integrated plan was developed to provide beta support for the application testing of new data warehouse and data mart prototypes.
The decision was based upon both price and performance. Given the increased throughput of the Fibre Channel-based storage arrays, the SAN solution appeared to be one of the most adaptable solutions for this type of high-end throughput application, that is, an aggregate 20GB per second of data. Specifically, the ability to source the data warehouses from the mainframes into the UNIX servers for subsequent preprocessing, loading, and updating was appealing. Moreover, the shadow of the impending corporate data warehouse project provided the pivotal requirement that pushed the decision toward the SAN. The decision makers realized that if the UNIX/mainframe strategy was chosen, another such configuration would be required to handle the additional load of the corporate data warehouse. Finally, not only would scaling the SAN or adding another SAN be more cost-effective, but it would also be more responsive to the application needs in the short and long term.
The installation of the SAN brought additional internal requirements and surprises, but none that were unmanageable or extremely cost intensive. These centered on the new operational characteristics of the SAN, the additional software tools required, and the responsibility for maintaining the SAN software and hardware (from an IT organizational view). Each of these issues was handled within the beta period of installation and subsequent prototype testing of the applications. These issues are summarized in the following section.
The new SAN required a learning curve and subsequent integration into the existing data center hardware and software processes. The storage admin team and select members of the systems administration team took vendor classes, which provided an overview specific to the vendor product selection and additional guidelines on installation and preinstallation planning.
The installation was somewhat problematic, as is the case for any new infrastructure moving into the data center. The issues centered on three areas: facilities and location; fundamental management processes, such as meeting the existing data center operations rules; and integration into existing server, network, and storage wiring mechanisms.
Facilities and Location This was handled by accommodating more space within the existing server and storage areas. However, given Fibre Channel's less restrictive cable-length limits, the storage was initially planned for installation in the mainframe storage area on a different floor. This was scratched because local proximity (for example, server to switch to storage) was better during the initial production cut-over period. However, the flexibility of subsequently moving the storage to another area, given the increased length capability of Fibre Channel, turned out to be an additional plus for the data center facilities planners.
Management Processes Perhaps the most troublesome issue was the operational integration into the data center. Because each SAN director brings its own fundamental processes, additional data center operational policies and processes had to be developed, followed by training and orientation of key operational staff members. The limited ability to provide configuration information and real-time performance information continues to hamper this aspect of the operation; however, this is becoming more manageable as SAN operations grow familiar to both the operations staff and the systems/storage admin staff.
Integration into Data Center One of the most difficult activities surrounding the installation and subsequent expansion of the SAN is the integration of its wiring complexities into the existing complexities of server and network wiring and switching structures. The ability to implement, manage, and track this process remains quite difficult as more than 200 ports are routed to appropriate points of contact. Although these ports are not unlike their Ethernet counterparts, and the configuration is comparatively small next to the LAN/WAN wiring closets, a lost, unconnected, or inoperative port can affect a highly visible application such as the company's data warehouses. Existing plans call for the SAN infrastructure to be separated from the existing LAN/WAN configurations.
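Tracking 200-plus ports against their points of contact is essentially an inventory problem. A minimal sketch of such a port inventory is shown below; the switch names, endpoint labels, and contact fields are hypothetical illustrations, not the tooling this company actually used.

```python
from dataclasses import dataclass

@dataclass
class PortAssignment:
    """One SAN switch port and the endpoint it is cabled to."""
    switch: str     # hypothetical director/switch name
    port: int       # port number on that switch
    endpoint: str   # server HBA or storage array port
    contact: str    # responsible admin or team

class PortInventory:
    """Minimal inventory for spotting duplicated or unassigned ports."""
    def __init__(self):
        self._ports = {}

    def assign(self, a: PortAssignment):
        key = (a.switch, a.port)
        # Refuse double-cabling: one physical port, one endpoint.
        if key in self._ports:
            raise ValueError(f"{key} already cabled to {self._ports[key].endpoint}")
        self._ports[key] = a

    def unassigned(self, switch: str, total_ports: int):
        """Ports on a switch with no recorded endpoint -- candidates for audit."""
        used = {p for s, p in self._ports if s == switch}
        return sorted(set(range(total_ports)) - used)
```

Even a simple record like this makes a lost or unconnected port a lookup rather than a cable trace.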
Once prototype testing was in place, it became apparent that new tools were necessary for managing the storage across the SAN. The requirements came specifically from the need to manage the storage centrally across the supported servers. This required some level of both centralization and drill-down capability for access to specific arrays and individual devices. It was accomplished through the acquisition of new volume management tools and vendor-specific management tools for the storage arrays, both having specific functionality for Fibre Channel-based SANs.
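The centralization-plus-drill-down requirement can be illustrated with a small sketch: a top-level capacity summary per array, with device-level detail behind it. This is an assumed model for illustration only, not the interface of the actual volume management product.

```python
from collections import defaultdict

class StorageView:
    """Central capacity view: summary per array, drill-down per device."""
    def __init__(self):
        # array name -> {device name: GB used}
        self._arrays = defaultdict(dict)

    def record(self, array: str, device: str, gb_used: int):
        self._arrays[array][device] = gb_used

    def summary(self):
        """Centralized view: total GB used per array."""
        return {a: sum(devs.values()) for a, devs in self._arrays.items()}

    def drill_down(self, array: str):
        """Device-level detail for one array."""
        return dict(self._arrays[array])
```

The point of the two levels is operational: the summary answers "which array is filling up," and the drill-down answers "which device on it."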
The next level of tool that proved to be more problematic was the backup/recovery tool. Due to the nature of the application, data warehouses generally don't require the stringent restore functions that an OLTP application would need. Standard backups would therefore either impact the nightly update function or prove to have little value to an analysis that uses a 24-month rolling summary of data. Consequently, rebuilding or reloading the data warehouse can be done on a less time-sensitive basis: for example, performing a rolling backup once a week, probably on the day when the least processing occurs. Given the large amount of data, which poses a specific challenge to relational databases, a snapshot function is being pursued for those database tables that are the most volatile and time consuming to rebuild and reload. In the case of a complete volume or array outage, the snapshot would allow the DBAs to go back to a specific point in time and reload and update the database within a minimum amount of time.
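The selection logic described above, snapshotting only the tables that are both volatile and expensive to rebuild, and scheduling the weekly rolling backup on the quietest day, can be sketched as follows. The volatility and reload-time metrics and the thresholds are assumptions for illustration, not figures from the case study.

```python
from dataclasses import dataclass

@dataclass
class Table:
    name: str
    volatility: float    # assumed metric: fraction of rows changed per nightly update
    reload_hours: float  # estimated hours to rebuild and reload from source

def snapshot_candidates(tables, min_volatility=0.10, min_reload_hours=4.0):
    """Tables volatile enough, and slow enough to rebuild, that a snapshot pays off."""
    return [t.name for t in tables
            if t.volatility >= min_volatility and t.reload_hours >= min_reload_hours]

def quietest_day(nightly_load_by_day):
    """Pick the weekly rolling-backup day with the least processing."""
    return min(nightly_load_by_day, key=nightly_load_by_day.get)
```

Tables that fail both tests are cheap to reload from source and need no snapshot, which keeps the snapshot window small.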
The responsibility to evaluate and justify the SAN initially was given to the storage administration team. The installation and support of the SAN beta testing was accomplished through a team approach, using the expertise of systems admins, application designers, and DBAs, with leadership from the storage admin team. As the SAN became a production entity, the team dissolved into its daily responsibilities and the nebulous nature of some of the SAN components began to affect the total mean time to recovery (MTTR) for any problem scenario.
Consequently, IT management is faced with an interesting dilemma: provide shared responsibility for the SAN or develop a new infrastructure group focused on storage. There are pros and cons for either direction. This study reflects the somewhat conservative tendency of the industry to continue sharing responsibility across the SAN components. The network, the systems, and the storage staff are all responsible for the SAN or various components of the SAN. This unfortunately increases the MTTR, since this type of arrangement creates the "it's not my problem" scenario. (Refer to Appendix C for details on how this manifests itself in a storage networking management business case.)
The final configuration provides the increased storage capacity and enhanced performance that was expected of the solution. The data warehouse/data mart project was accomplished on time and has proved to be responsive to the I/O workload. Because the estimates were accurate and allowances were built into the configuration for expansion, the storage capacity and access performance are ahead of the I/O workload estimates and business requirements, keeping both within service levels.
The current configurations are moving into the planning for the next highly visible project, the corporate data warehouse. However, this time it has complete concurrence from the application design teams, the DBAs, and the system administrators. Further planning is being considered to consolidate additional storage and servers within the data centers. The two outstanding caveats remain the organizational fluidity of responsibility and the continued organizational challenges to effective wiring management. The first continues to elongate any problem associated with the SAN configuration, and the second continues to facilitate the singular failure of port mismanagement, exacerbating the first issue.
The configuration shown in Figure B-3 illustrates a summarization of the data warehouse/data mart application systems. Portraying a 200-plus port configuration is beyond the scope of this study. However, it's important to point out the results: the SAN solution provided this company with the ability to manage its business more effectively through the use of multiple technologies. The functionality of the relational database in facilitating the analysis of relationships among business functions has provided a 40 percent increase in parts sold and a 25 percent cost containment in the area of technical support. More importantly, the applications have been used to reduce the warranty receivables significantly, which results in an important improvement to the company's bottom line.
The SAN solution provided the infrastructure for these applications. The ability to move and access data at the increased speeds of Fibre Channel SAN configurations has given this company a significant competitive edge. This case study illustrates the success that storage networking can have when supported by an adequate and reasonable set of application and I/O workload planning activities.