AN INFORMATION TECHNOLOGY PLAN TO MEET BUSINESS CONTINUITY REQUIREMENTS

CME's server-based computing environment makes the implementation of these requirements possible. CME West, as the recovery site for Tier one and Tier two, will need to have hot support for 500 users. These users will require the defined Tier-one data, applications, and access through the Internet. We assume either that Tier three will be implemented back at the Chicago facility or that a temporary facility (which could be CME West) will be used. During the two-week window between Tier two and Tier three, CME IT will have to work feverishly to acquire all of the required hardware to replace any hardware lost in the disaster.

Hot Backup Data Center Design

A hot backup data center is a backup data center with real-time servers, ready to be used at a moment's notice. The advantage of a hot backup data center is that it provides a fast resumption plan. The disadvantage is that it requires redundant hardware that generally remains idle except to receive updates and periodic testing.

The most important element of the data center design is geographical location. For the backup data center to truly provide resumption, it must be located a significant distance from the main data center and must not be subject to the same disasters (for example, the two data centers should not be so close together that a single hurricane could render both useless).

The rest of the data center design components should mimic the main data center. In the case of CME, we have defined that the backup data center only needs to support 500 users, so the data center will be much smaller than the corporate data center that supports 3,000 users. Additionally, there is no need to replicate the testing and training environment, or some of the redundancy that exists at the main data center. Thus, the CME West hot backup data center will be about ten percent of the size and cost of the main data center.

Backup Data Center Components

Although the backup data center is much smaller than the main data center, defining the critical components is still an important part of the business continuity plan to ensure that everything will work upon fail-over. Although the list of required hardware and software for most organizations will differ, studying the components required at the CME data center and comparing these to the headquarters' data center will allow you to extrapolate what is needed for your organization.

CME's backup data center will require the following components:

  • Ten Citrix Presentation Servers imaged from the CME headquarters data center to support the 500 possible users required upon fail-over.

  • A DMZ-based Secure Gateway/Web Interface Server and an internal Web Interface.

  • One Secure Ticket Authority server

  • One Oracle Database server

  • One Microsoft SQL server

  • One Microsoft Exchange server

  • The LAN and WAN networking components defined in Chapter 17

  • Internet connectivity utilizing a different ISP from the one used at the Chicago data center

  • A firewall with DMZ and VPN hardware

  • An Internet-based secondary mail server to queue mail in case the Exchange server is offline

  • Internal and Internet DNS servers

  • Storage area network (SAN) solution

  • Appropriate tape backup units to facilitate the recovery of archived data and any information not located on the SAN

  • Backup power for the data center
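As a rough sizing check, the server count in the list above can be related to the user count. The sketch below is illustrative only: the density of 50 users per server is simply what ten servers for 500 users implies, the function name is hypothetical, and real density depends on the application mix and should be measured rather than assumed.

```python
import math


def servers_needed(users: int, users_per_server: int) -> int:
    """Round up so a partial group of users still gets a server."""
    return math.ceil(users / users_per_server)


# Density implied by the plan: ten servers for 500 users.
USERS_PER_SERVER = 500 // 10  # 50 users per server (derived, not measured)

print(servers_needed(500, USERS_PER_SERVER))  # 10
print(servers_needed(501, USERS_PER_SERVER))  # 11: one extra user needs one more server
```

A plan that wants headroom for a failed server would add at least one spare to the result; that choice is left to the reader.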

Hot Site Data and Database Resumption

The most critical part of the business continuity plan is the ability to recover the file and database data (the disaster recovery section of business continuity). Even if the full business continuity plan is not enacted, the recovery of data is critical. For example, if the Oracle data becomes corrupt or the Oracle cluster should completely fail, even though this does not constitute a disaster, it is critical that the data be recovered quickly and easily. Worse yet, if a government seizure should happen, there must be a plan to restore the data to nonseized hardware in a timely manner. To support this, all databases, files, and e-mail data must be copied to the backup data center nightly at a minimum. Although this is easy to accomplish with file data, doing this with database and e-mail data is more difficult. The larger SAN vendors (HP, EMC, and LeftHand Networks) all support a snapshot technology to effectively copy Exchange and database data across a WAN to another similar SAN device. There are also some non-hardware-based technologies, such as NSI Software's Double-Take, that mirror Microsoft Exchange and other database software. Note that in the CME scenario we are only copying the data at night. Thus, if a disaster happens late in the day, requiring fail-over to CME West, all data created in the course of the day will be lost. If your organization requires less data loss risk than this, the solutions from LeftHand, EMC, HP, and NSI can provide up-to-the-second transaction redundancy (typically called double-commit), but the dedicated bandwidth requirements and associated costs increase dramatically. Chapter 17 defined that, for CME, 6 Mbit of their 12-Mbit dedicated pipe will be partitioned at night to support the data mirroring.
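Whether a nightly copy fits the partitioned bandwidth is a matter of simple arithmetic, sketched below. The 6-Mbit figure comes from the plan; the eight-hour window, the efficiency factor, and the function names are assumptions made for illustration, not values from CME's design.

```python
def nightly_capacity_gb(link_mbit: float, window_hours: float,
                        efficiency: float = 1.0) -> float:
    """Decimal gigabytes that can cross the link during the window.

    efficiency is the usable fraction of the link after protocol
    overhead (assumed; 1.0 means a theoretical best case).
    """
    bits = link_mbit * 1_000_000 * window_hours * 3600 * efficiency
    return bits / 8 / 1_000_000_000


def fits_in_window(changed_gb: float, link_mbit: float, window_hours: float,
                   efficiency: float = 1.0) -> bool:
    """True if the day's changed data can be copied before the window closes."""
    return changed_gb <= nightly_capacity_gb(link_mbit, window_hours, efficiency)


# 6 Mbit for an assumed 8-hour night: 21.6 GB at best.
print(round(nightly_capacity_gb(6, 8), 1))  # 21.6
print(fits_in_window(15, 6, 8))             # True
print(fits_in_window(30, 6, 8))             # False
```

If the daily change volume regularly approaches the computed ceiling, the options are a longer window, a larger partition, or a replication product that sends only changed blocks.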

Restoration of the Applications and User Access

For any environment that wishes to have a robust, fast resumption plan, all applications requiring immediate availability and flexible user access following a disaster must be installed in a server-based computing environment at the backup data center. In CME's case, all applications required for Tier-one and Tier-two business continuity are installed on the on-demand access server farm at CME West. Thus, fail-over of the applications simply requires repointing users from the CME Corporation data center to the CME West data center Secure Gateway/Web Interface server. The Web Interface server and Citrix Presentation Server farm will be configured identically to the larger farm at CME corporate. All applications, load balancing services, and user services supported from the corporate Citrix Presentation Server farm will be fully supported from the CME West farm, with no additional configuration or work following the fail-over.

User access to these applications becomes the remaining hurdle. As seen in Figure 19-2, all CME remote offices have an Internet/VPN connection, with the exception of the American sales offices. CME has also defined that all of the Chicago Tier-one and Tier-two users who may have been displaced by the disaster will have access from their home Internet connections (assuming, of course, that Chicago's Telco infrastructure has not been rendered unavailable by the disaster). Thus, with the exception of the American sales offices, all users will have full access to the CME West backup data center through the Internet. The BC plan calls for all Tier-two employees at the American sales offices to utilize Internet connections (home-based, coffee shop-based, and so on) for connectivity until their Frame Relay connections can be repointed to the DS3 ATM in Seattle (about 72 hours typically).

Figure 19-2: CME's network infrastructure

All Tier-one users will be trained to use a backup URL to access Internet-based Citrix Secure Gateway resources at CME West. This provides for immediate access and allows for propagation delays in "repointing" both public DNS resources and BGP routing tables to claim the corporate identity at CME West, as discussed in Chapter 17. Within the 24-hour window, the BGP and DNS changes will have propagated, allowing Tier-two users access through the standard Internet-accessible URL.
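The access order described above (standard URL first, backup URL while DNS and BGP changes propagate) can be expressed as a small fallback routine. This is a minimal sketch, not part of CME's tooling: the function names are hypothetical, the example hostnames are placeholders, and the reachability probe is passed in so the selection logic can be exercised without a network.

```python
import urllib.request
from typing import Callable, Iterable, Optional


def http_probe(url: str, timeout: float = 5.0) -> bool:
    """Return True if the URL answers an HTTP request within the timeout."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except Exception:
        # For this sketch, any failure (DNS, routing, timeout, HTTP error)
        # counts as "not reachable".
        return False


def first_reachable(urls: Iterable[str],
                    is_reachable: Callable[[str], bool]) -> Optional[str]:
    """Return the first URL the probe accepts, or None if none respond."""
    for url in urls:
        if is_reachable(url):
            return url
    return None


# Placeholder hostnames; the real URLs are defined by the BC plan.
CANDIDATES = [
    "https://access.example.com",         # standard URL (hypothetical)
    "https://access-backup.example.com",  # backup URL at the recovery site (hypothetical)
]

# Simulate the period before DNS and BGP changes have propagated:
# only the backup URL responds.
chosen = first_reachable(CANDIDATES, lambda u: "backup" in u)
print(chosen)  # https://access-backup.example.com
```

In live use, `http_probe` would be passed as the second argument in place of the lambda.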

It is important to note that this entire business continuity plan hinges on the Internet connectivity at CME West. Chapter 17 specifies that CME West has a DS3 line with a 1.5 Mbit virtual circuit that can be increased in a 24-hour period to 15 Mbit. This bandwidth increase will be required to support the 500 Tier-two users needing access over the Internet, and eventually, retermination of the VPN-connected branch and regional offices. Manual BGP fail-over will provide a seamless fail-over for all Internet-based connectivity (including VPN connections and Internet e-mail) between CME Corporation and CME West. (The CME West firewall will be reprogrammed to serve as the CME Corporation firewall after BGP convergence to allow IPsec connections without changing the remote sites.) All directly connected networks (Seattle and American sales offices) will use an Internet connection in the case of ATM or frame relay failure. If the ATM at CME Corporation will be down for an extended period of time, the frame relay links in the American sales offices can be repointed to CME West over the private WAN ATM DS3 within 72 hours.
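The need for the bandwidth increase becomes clear when the circuit is divided evenly among the users. The sketch below uses the 1.5-Mbit, 15-Mbit, and 500-user figures from the text; the function name is hypothetical, and the even split is a simplifying assumption that ignores VPN, e-mail, and replication traffic sharing the same line.

```python
def per_user_kbit(link_mbit: float, concurrent_users: int) -> float:
    """Average kbit/s available to each user if the link is shared evenly."""
    return link_mbit * 1000 / concurrent_users


print(per_user_kbit(1.5, 500))  # 3.0
print(per_user_kbit(15, 500))   # 30.0
```

Comparing these averages against the measured per-session bandwidth of the published applications indicates how many concurrent sessions each circuit size can realistically carry.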

In addition to remote user access, some employees will need co-located office space. CME West was designed with sufficient capacity in the form of WLAN hardware and prepositioned access switches (see Chapter 17 for further discussion) to support temporary users from other locations.

Full Restoration Plan

Following a major disaster and the accompanying fail-over to CME West, if the disaster requires a new facility, there is a risk that restoration of the original Chicago location will not happen within two weeks, or ever. Accordingly, the Tier-three plan may require either enhancing the temporary infrastructure at CME West and salvaging the Chicago site (to make CME West the new corporate home) or rebuilding a new CME Corporate data center in Chicago or some other location to house and support all the CME Corporate users long term.

If the CME Corporate facility is rebuilt and the server and network infrastructure restored and tested, a period of downtime (usually 24 hours) must be planned to manually fail back the BGP and DNS to point back to the primary location and to return users to that facility.

Documentation

Now that CME's plan is falling into place, an all-inclusive document needs to be created. This document should, at a minimum, include the following:

  • Emergency phone numbers for all manufacturers and support vendors

  • Names and contact information for the 50 Tier-one people

  • Specifics on how the plan will be implemented, and who will implement it

  • Network diagrams

  • Security policies

  • Emergency IT response information

This document should be reviewed and updated twice per year by the BC committee. Additionally, the 50 Tier-one employees should receive formal training annually to keep them updated with policies and procedures. Tier-two employees should receive a yearly e-mail or other document to keep them updated on the procedures.

Maintenance of the Hot Backup Data Center

Although the hot backup data center will not be used for general day-to-day activity (other than the storage area network that will receive the backups every night), in order to guarantee two-hour fail-over, the backup data center must be maintained. The same maintenance items that are logged to the main data center must also be replicated to the backup site. Items such as service packs, hotfixes, application updates, security updates, and so on must all be kept up-to-date. A simple approach to keeping the Citrix and Windows 2003 servers up-to-date is to use the imaging procedures discussed in Chapter 11 to image the backup site servers monthly. Additionally, the SAN should be checked weekly to ensure that the data being copied over every night is indeed current and usable.
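Both maintenance checks described above lend themselves to simple automation. The following is a minimal sketch under stated assumptions: the 26-hour freshness threshold (one nightly cycle plus slack), the function names, and the update identifiers are all hypothetical, and gathering the real timestamps and update lists from the SAN and the servers is outside its scope.

```python
from datetime import datetime, timedelta
from typing import Set


def replica_is_current(last_copy: datetime, now: datetime,
                       max_age: timedelta = timedelta(hours=26)) -> bool:
    """True if the most recent copy is no older than max_age (assumed threshold)."""
    return now - last_copy <= max_age


def missing_updates(primary: Set[str], backup: Set[str]) -> Set[str]:
    """Updates installed at the main data center but absent at the backup site."""
    return primary - backup


# Hypothetical example values.
now = datetime(2005, 6, 15, 9, 0)
print(replica_is_current(datetime(2005, 6, 15, 2, 0), now))  # True: copied last night
print(replica_is_current(datetime(2005, 6, 13, 2, 0), now))  # False: two nights old

primary_updates = {"SP1", "HOTFIX-001", "HOTFIX-002"}
backup_updates = {"SP1", "HOTFIX-001"}
print(missing_updates(primary_updates, backup_updates))  # {'HOTFIX-002'}
```

A freshness check confirms only that the copy is recent; verifying that the data is usable still requires a periodic test restore.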

Test of the Business Continuity Plan

Twice a year (for example, once during a summer break and once during a winter break), the business continuity plan should be tested. It is imperative that all Tier-one personnel be included in this test. The test should ensure successful connectivity, availability, and data integrity, as well as confirm that everyone knows how and when to set procedures in motion.



Citrix Access Suite 4 for Windows Server 2003: The Official Guide, Third Edition
ISBN: 0072262893
Year: 2004
Pages: 137
