Chapter 14: Disaster Recovery and Continuity Planning


Overview

An important component of the information security policy is the section that describes the steps the business will take if a disaster strikes the company. A disaster recovery plan must answer two essential questions. The first is: what immediate steps must be taken during or right after the disaster to ensure the safety of employees and to resume critical business functions in the shortest time possible? The second relates to the longer term: once the disaster has struck and the company has recovered to a minimally operational state, what steps will bring the business back to the state it was in prior to the disaster? Answering the first question is the disaster recovery process; answering the second is commonly referred to as business continuity planning.

Almost by definition, a disaster is an unplanned event that may render an entire company location unusable for a day or longer. This contrasts with nondisasters, which are simply service interruptions and local device or software failures. Ideally, nondisasters are addressed in the main security policy and were identified during the risk assessment phase of developing that policy.

The companies that recover from a disaster are the ones that plan ahead, creating and implementing a disaster recovery plan before the disaster strikes. This means that the company has taken steps to identify its most important assets, estimate the risks that a given disaster may present, and implement measures to ensure a speedy recovery from any exposure. It is a sobering statistic that more than 40 percent of companies that experience a disaster never recover, and an additional 30 percent of companies experiencing a disaster of some sort close within two years. Altogether, that adds up to about 70 percent of U.S. businesses that experience a disaster closing within two years. Some studies put that number much higher. If your business is important, then disaster recovery is a critical component of the information security policy.

Unfortunately, many companies underestimate the impact a disaster would have on their organization, if they consider it at all. Sadly, many companies never reopen after a disaster that keeps them out of operation for more than five days.

If you read the chapter on creating and implementing a security policy, then much of what is being said here should sound familiar. Identify assets, perform risk analysis, and implement solutions. The primary difference here is that the scope of the risk has changed. Instead of considering a single system, as is often the case when evaluating nondisaster scenarios, the disaster recovery plan attempts to evaluate the business organization as a whole and identify critical interdependencies between systems. Another comparison might be that the security policy is interested in what happens if a server goes down or the effect that denial-of-service attacks might have on the goals of the security policy. The disaster recovery plan attempts to plan for the contingency that the entire building is wiped off the map.

Nevertheless, the process is very similar to the overall process of the security policy. In fact, much of the risk analysis can occur at the same time or at least use the same data. Like the incident response policy, management initiates the disaster recovery plan by creating a disaster recovery team, providing the leadership for setting goals, and making available the required resources. Once this has been accomplished, the process of creating a disaster recovery policy is a series of defined steps like most other elements of good information security.

The first step is to identify any major legal or policy constraints that apply to disaster recovery planning, in the same way that legal or regulatory requirements may influence the creation of the information security policy as a whole. Is there a requirement that patient or customer data remain confidential even in the face of a disaster? Do executives have a legal responsibility to perform due diligence regarding disaster recovery? Does the company's own security policy impose any particular requirements regarding the treatment of data? These high-level considerations will shape the steps implemented in the disaster recovery policy, just as they shape the information security policy as a whole, and they help define the overall goals of the disaster recovery plan.

The next step is to perform a risk analysis. Most of the time, if a company has taken the process of information security to heart, this step has already been taken. While most people remember to consider such natural disasters as fires, floods, earthquakes, tornados, and hurricanes as potential threats, it is important to recall that threats can also be technological or human in nature. A riot or civil unrest can damage your business just as much as a flood. More so than the risk analysis that is performed as part of the information security policy, the emphasis for the disaster recovery risk analysis should be on numbers. It will be important when planning contingencies to know the total dollar value of assets to the organization. What is the potential financial impact of a disaster on an organization? How much money would the organization stand to lose per day that it could not operate? How many days could this go on before the business is unable to open again?
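The questions above lend themselves to simple arithmetic. The following is a minimal sketch, assuming a constant daily loss and a fixed cash reserve; the function name and every dollar figure are hypothetical and used only for illustration.

```python
import math


def days_until_insolvent(cash_reserves: float, loss_per_day: float) -> int:
    """Return how many whole days of outage the reserves can absorb."""
    if loss_per_day <= 0:
        raise ValueError("loss_per_day must be positive")
    return math.floor(cash_reserves / loss_per_day)


# Hypothetical figures, not taken from any real organization.
daily_loss = 25_000.0   # money lost for each day the business cannot operate
reserves = 400_000.0    # cash available to absorb those losses

survivable_days = days_until_insolvent(reserves, daily_loss)
print(f"The business could absorb about {survivable_days} days of outage.")
```

A real analysis would rarely see a constant daily loss, which is why planners turn to a time-based view of losses.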

Many disaster planning models use what is known as a time-loss graph. This tool allows the disaster planning team to describe the dollar effects of an extended outage over a period of time. Simply put, it describes the amount of cash that is lost on a per-hour or per-day basis when information is not available. This may reveal that significant losses are not realized until the third day of nonoperation. This information, of course, is then used to plan the recovery strategy. If your information reveals that the most significant losses to the company occur within the first three days of an outage, then this is the timeframe that must be considered. Knowing your own business cycle will greatly impact the values and timeframes that are determined. Does your business have a critical customer that requires its orders on a certain day of each quarter? What if you suffered a loss two days before that order? What if the loss occurred six weeks before that critical date?
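The data behind a time-loss graph can be sketched as a running total of estimated losses per day of outage. The example below is illustrative only: the daily loss figures and the threshold are invented, and the helper name first_day_exceeding is an assumption rather than part of any standard planning model.

```python
from itertools import accumulate

# Hypothetical estimated loss for each successive day of an outage (dollars).
daily_losses = [5_000, 8_000, 60_000, 90_000, 120_000]

# Running total: cumulative loss at the end of each outage day.
cumulative = list(accumulate(daily_losses))


def first_day_exceeding(cumulative_losses, threshold):
    """Return the 1-based day on which cumulative loss first exceeds
    the threshold, or None if it never does."""
    for day, total in enumerate(cumulative_losses, start=1):
        if total > threshold:
            return day
    return None


for day, total in enumerate(cumulative, start=1):
    print(f"day {day}: ${total:,} lost so far")

# Assumed pain threshold for "significant" losses.
critical_day = first_day_exceeding(cumulative, 50_000)
print(f"Losses become significant on day {critical_day}.")
```

With these made-up numbers the losses stay modest for two days and jump on the third, which is the kind of pattern that would set the recovery timeframe.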

As with the risk analysis that occurs for the information security policy, the intangible monetary effects of a disaster must also be considered. How will this affect your market share or competitive advantage? How long can you be out of production before suppliers start looking elsewhere? Are there any possible legal fees or regulatory requirements that would affect your loss estimates?

With a clear idea of the likely threats and assessment of possible losses, it is time to perform what is called a business impact analysis (BIA). The BIA uses the information from your risk analysis and attempts to identify the interdependencies that exist between departments and systems. The goal of this step can be broadly summarized as "knowing what needs to happen to get the products to the consumers." To obtain this knowledge, the essential business functions and departments that support these functions must be identified. Not only must they be identified, but also the order in which they must be brought back online must be prioritized. Some business functions are going to be critical to the survivability of the business and must be prioritized; other functions can wait a bit longer.
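One way to turn the interdependencies found during a BIA into a restoration order is a topological sort of a dependency map. The sketch below assumes Python 3.9 or later for the standard library's graphlib module; the business functions and their dependencies are hypothetical examples, not a recommended model.

```python
from graphlib import TopologicalSorter

# Hypothetical dependency map: each business function lists the functions
# that must be back online before it can be restored.
depends_on = {
    "network": set(),
    "database": {"network"},
    "order_entry": {"database"},
    "shipping": {"order_entry"},
    "payroll": {"database"},
}

# static_order() yields every function after all of its prerequisites.
restore_order = list(TopologicalSorter(depends_on).static_order())

for step, function in enumerate(restore_order, start=1):
    print(f"{step}. restore {function}")
```

The sort only guarantees that prerequisites come first. Deciding which of two independent functions is more critical to survival remains a business judgment that the disaster recovery team would layer on top.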

With this information, it is time to begin the process of developing the standards and procedures that will be implemented before disaster strikes and after the unfortunate event. While the exact implementation of the disaster recovery plan will vary from company to company, there are some common threads that appear in almost every plan and should be considered.

The first is the chain of command. As with incident response, a crisis calls for a single knowledgeable person with good leadership qualities who is responsible for overseeing all other disaster recovery activities. Make sure that it is very clear who the "go-to" person is when a disaster strikes.

It is this person who will also be responsible for declaring a disaster. From there, the notification chain should be established. The disaster recovery manager calls two managers; they call more managers and so on, until all affected employees are notified.
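A notification chain of this kind can be sketched as a simple call tree in which each person is responsible for calling a fixed number of others. The roster below and the function name build_call_tree are hypothetical; the fan-out of two mirrors the example in the text.

```python
def build_call_tree(people, fan_out=2):
    """Map each caller to the people he or she must notify.

    `people` is ordered by seniority, with the disaster recovery
    manager first. Person i calls the next `fan_out` uncalled people.
    """
    tree = {}
    for i, caller in enumerate(people):
        start = i * fan_out + 1
        callees = people[start:start + fan_out]
        if callees:
            tree[caller] = callees
    return tree


# Hypothetical roster, for illustration only.
staff = ["dr_manager", "mgr_a", "mgr_b", "emp_1", "emp_2", "emp_3", "emp_4"]
call_tree = build_call_tree(staff)

for caller, callees in call_tree.items():
    print(f"{caller} calls {', '.join(callees)}")
```

A real call list would also need alternate contacts for anyone who cannot be reached, which this sketch does not model.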

Some consideration must also be given to the scale of disaster recovery that will be required. Some organizations may have the requirement to be up and running within hours after a disaster. The only way that this will realistically occur is for the company to maintain a parallel site. If the company has multiple offices, this may mean mirroring business-critical operations in near-real-time at multiple sites.

If the company does not have such stringent recovery requirements, then a hot site may be an acceptable option. A hot site is a remote facility that is close enough to the primary site to be easily accessible, but not so close that it would likely be affected by the same disaster that struck the main location. A hot site has equipment on hand and copies of all data ready for production. In practice, this requires that the hot site operating environment closely mirror that of the primary site. Hot sites are often maintained by the organization itself; however, some companies that specialize in disaster recovery solutions will maintain an organization's hot sites for a fee.

Hot sites have the primary disadvantage of being expensive to maintain. More often, companies are interested in warm alternate sites. Warm sites may include the critical data or simply servers and basic networking equipment ready for configuration by the company IT staff during a disaster. Unlike a hot site, which is essentially a remote office, the warm site has only the materials on hand to supply the business functions described as most critical. Warm sites allow a reasonably rapid response to a disaster, but they cost less to maintain than hot sites.

With hot and warm sites, it should be no surprise that the final option for recovery sites is known as the cold site. The cold site is little more than storage and has only the most basic of business needs such as power and cabling. Of all the options, this is the cheapest, but it will also require the most time to bring up. Imagine if the hardware of the main site is destroyed. How long would it take for replacement hardware to be shipped to the cold site, configured from scratch, and brought online to replace critical business functions? Full recovery may take several weeks at best.

One further alternative to maintaining a recovery site is the reciprocal agreement. This is a business agreement with another organization stating that if a disaster strikes one company, the other will provide a temporary alternate location during the recovery process. While this sounds like a good, inexpensive idea, it deserves careful consideration before it is implemented. Most offices do not have enough extra space or cabling to comfortably support more than one organization. Tensions will rise as the hosting organization attempts to fulfill its business objectives while the business in recovery is scrambling to get back on its feet. Remember that time you thought it would be "no problem" to have that old college roommate stay with you for a "few days" until he got some things sorted out? Apply that scenario to two businesses and you can see the potential problems.
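The trade-off among parallel, hot, warm, and cold sites can be sketched as picking the cheapest option that still meets the required recovery time. Every number below is an assumption made for illustration: the text says only that a parallel site allows recovery within hours, that a cold site may take weeks, and that cost falls from hot to warm to cold. Ranking the parallel site as the most expensive is also an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SiteOption:
    name: str
    recovery_hours: float  # assumed time to resume critical functions
    relative_cost: int     # assumed ordinal rank; higher is more expensive


# Illustrative values only; real figures depend on the organization.
OPTIONS = [
    SiteOption("parallel site", 4, 4),
    SiteOption("hot site", 24, 3),
    SiteOption("warm site", 72, 2),
    SiteOption("cold site", 24 * 14, 1),
]


def cheapest_site_within(max_outage_hours):
    """Return the least expensive option that recovers in time, or None."""
    candidates = [o for o in OPTIONS if o.recovery_hours <= max_outage_hours]
    if not candidates:
        return None
    return min(candidates, key=lambda o: o.relative_cost)


choice = cheapest_site_within(48)
print(choice.name if choice else "no option meets the requirement")
```

The maximum tolerable outage fed into such a comparison would come from the loss estimates gathered during the risk analysis.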

As with incident response, training is a critical component of any disaster recovery plan. Some training is self-evident, such as CPR and emergency first aid training. Other training covers how employees can ensure their own personal safety and the safety of others, and then the safety of the organization's information infrastructure. All of this training should be brought together in organization-wide testing of the disaster recovery plan. While having a plan is a good start, it does the organization little good if in the first minutes of a disaster you realize that you have made no provisions for emergency communications when the phone system has been disabled.




Network Perimeter Security: Building Defense In-Depth
ISBN: 0849316286
Year: 2004
Pages: 119
Authors: Cliff Riggs
