How do we increase availability? Is planned downtime considered unavailability? What exactly is high availability? These are key questions that must be answered when discussing any high availability platform. To answer them, however, one must first understand what availability is. The industry offers many different definitions of availability or, more correctly, unavailability. The terms high availability, assured availability, continuous availability, unavailability, five nines, and other expressions are widely used throughout the industry even though most people have little understanding of their meaning.
One of the most widely used terms in the industry today that relates to high availability is nines (which is discussed in detail in Chapter 1). One cannot walk the halls of any IS shop these days without hearing the terms three nines, four nines, and five nines bandied about. "How many nines are we at?", "Does that include planned downtime?", and "What can we do to reach five nines?" are questions often heard but seldom answered correctly because their answers are based on perspective.
For example, consider a mission-critical environment where users work on their systems only Monday through Friday, downtime is scheduled on the weekends for maintenance activities, and the unplanned downtime amounts to less than five minutes per year. Could one say that the availability of this system is five nines? From a user's perspective, this system could be defined as five nines, and probably rightfully so. However, from a platform perspective, the availability numbers reflect a much lower percentage.
Why? From an industry perspective, this system does not truly meet the criteria for five nines because in the world of high availability no differentiation is made between planned and unplanned downtime. The reasons for this are simple. Even though planned downtime is scheduled well in advance of any outage, and provisions are made to account for these outages, the system must be available during this time to perform maintenance. If the system is nonfunctional at the time maintenance is to be performed, then the time must be counted against normal availability requirements.
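To make the gap between the two perspectives concrete, the arithmetic can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the four-hour weekly maintenance window is an assumed figure, since the example above does not say how long the weekend downtime lasts. Only the five minutes of unplanned downtime per year comes from the example.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years


def availability(downtime_minutes: float) -> float:
    """Return availability as a percentage of the full year."""
    return 100.0 * (1 - downtime_minutes / MINUTES_PER_YEAR)


def downtime_budget(nines: int) -> float:
    """Return the yearly downtime, in minutes, that a given number of nines allows."""
    return MINUTES_PER_YEAR * 10 ** (-nines)


# From the example: about five minutes of unplanned downtime per year.
unplanned_minutes = 5.0

# ASSUMPTION (not from the text): one four-hour maintenance window every weekend.
planned_minutes = 52 * 4 * 60

# User's perspective: only unplanned downtime is counted.
user_view = availability(unplanned_minutes)

# Platform perspective: planned and unplanned downtime are counted alike.
platform_view = availability(unplanned_minutes + planned_minutes)

print(f"Five nines allows {downtime_budget(5):.3f} minutes of downtime per year")
print(f"User's perspective:   {user_view:.4f}%")
print(f"Platform perspective: {platform_view:.4f}%")
```

Under these assumptions the five minutes of unplanned downtime fits inside the five nines budget of roughly 5.3 minutes per year, but once the assumed maintenance windows are counted the same system falls below 99 percent.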
Given these requirements, how does one go about developing a highly available platform? What are the key factors that cause systems to be unavailable? How do you mitigate these factors? How do you develop a disaster recovery plan to maximize your system availability? The answers to these and other questions can vary widely, as every system is different. The answers lie in employing the concepts and best practices in Microsoft SQL Server 2000 High Availability.
In this book, Allan Hirt discusses the key drivers that affect availability of systems, particularly as they relate to Microsoft SQL Server. He discusses in depth the driving factors that can cause downtime and the best practices that can be employed to increase the availability of SQL Server environments. For the past several years, he has consulted with numerous corporations designing highly scalable and available SQL Server solutions, presented at conferences, and authored white papers on the subject of SQL Server availability. In the SQL Server community, Allan has gained a reputation as an authority on availability.
My group is responsible for managing the SQL Server operations for Microsoft.com, the third-largest Web site in the world. Our world currently consists of more than 150 SQL Servers servicing almost 1,100 databases. Maintaining the highest availability on the Internet is our charter. The concepts and practices discussed in this book are what we live every day, and, thanks to the author, they can be brought to you.
Group Operations Manager