7.3 Key storage technologies for Exchange Server 2003


In our quest to build mission-critical Exchange servers and deployments, there are many key areas and technologies in which we should invest. Understanding the technology, solid management practices, procedures, personnel and staffing, training, and so forth are all important. We must look to all areas of hardware, software, and peopleware as we select where to focus our attention and investments. In my humble opinion, the single most significant area in which we can invest and educate ourselves is storage technology. I believe that storage technology is a fundamental piece (if not the most fundamental piece) of mission-critical Exchange servers.

When I think about storage technology for an Exchange server, I consider it central to keeping the server healthy and happy from both a client and a systems-management perspective. The key to a highly available system is access to data with the highest level of performance and the utmost data protection and integrity. After all, if the data is not available quickly and with a guarantee of validity, the system has not served its intended purpose. Thus, storage technology is important in two areas—performance and data protection. My theory is that without solid storage technology, no Exchange server will meet the availability requirements of a mission-critical system. To put it another way, no amount of management, training, and investment in other areas can help you if the underlying storage strategy for your Exchange server and across your entire deployment is weak. Therefore, the purpose of this part of the chapter is to discuss some basic fundamentals of storage technology available at the time of this writing and how that technology can be applied to an Exchange 2000/2003 deployment to achieve the highest levels of performance, scalability, reliability, availability, and manageability.

7.3.1 Storage technology basics

Let us start with a discussion of the basics of disk subsystem technology. I will assume that you are somewhat familiar with this subject and will provide mostly an overview. In addition, since this book is focused on mission-critical Exchange servers, I will attempt to keep the discussion around storage aimed at that goal and not digress into detailed or even “religious” arguments concerning the pros and cons of various approaches and technologies. My goal will be mainly to put on the table the various technologies and products available.

Since the early days of the PC (Intel) server space, when Compaq (now HP) first announced and shipped the Compaq Systempro in 1989, storage subsystem technologies for servers have differed significantly from those of desktop and portable (single-user) systems. Compaq did not invent this differentiation between server and desktop system storage; tried-and-proven technology from the mainframe and minicomputer world was applied to PC servers. The differentiation mainly centered on the difference between one or more individual disk drives attached to a system (sometimes called JBOD—Just a Bunch of Disks—in storage systems speak) and an array of disks configured as a single entity providing a higher level of performance and/or data protection. This differentiation essentially created what we now know and use (or should be using) every day in our Exchange deployments. At the time (circa 1989), Compaq and other vendors used Integrated Drive Electronics (IDE)–type drives because of their superior performance, reliability, and price point for the PC server market. When server manufacturers began shipping systems for operating systems such as NetWare, OS/2, and UNIX, they provided intelligent controllers that made drive arrays available for these platforms. A drive array is, very simply, an array of individual physical disk drives grouped together to form logical drives. Using drive array technology, data can be distributed over a series of physical drives (called disk striping), which allows requests to each drive to be processed concurrently and yields a higher I/O rate than configurations that are not arranged in an array fashion. When multiple physical drives are grouped into a logical array, they appear to the host computer as a single disk drive equal in capacity to the sum of all physical drives in the array. The drawback to this approach is that the reliability of a drive array decreases based on simple statistics. Table 7.4 provides some reliability formulas commonly used to determine the reliability of a system or a device such as a disk drive. When an array of physical drives is configured, it becomes a series system (subject to the series formula in Table 7.4).

Table 7.4: Reliability Formulas

Failure rate (λ)
  λ = 1/MTBF
  where MTBF = mean time between failures

Reliability (R)
  R = e^(–λT)
  where e = base of the natural logarithm, λ = failure rate, and T = time (period to be measured)

Reliability of a parallel system (Rp)
  Rp = 1 – [(1 – R1) × (1 – R2) × ... × (1 – Rn)]
  where R1 ... Rn = reliability (R) of each system in parallel

Reliability of a series system (Rs)
  Rs = R1 × R2 × ... × Rn
  where R1 ... Rn = reliability (R) of each system in the series

Source: MIL-STD and IEEE.

In other words, the reliability of the array is a function of the reliability of an individual drive and the number of drives in the array. For example, since drive reliability is typically measured as mean time between failures (MTBF), an array of four physical drives in which each drive has an MTBF of 100,000 hours would have a yearly reliability measurement of 70.44%, using the following calculation:

Reliability of an individual drive: R = e^(–λT) = e^(–(1/100,000) × 8,760) = 0.9161 (91.61%)

Reliability of the series system (drive array): Rs = (0.9161)^4 = 0.7044 (70.44%)

As you can see, as more drives are added to a logical array, although performance may increase, reliability decreases. Since a typical Exchange server may use many drives per array, the ramifications for reliability would not be worth the performance advantages gained by array technology.
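For readers who want to repeat this arithmetic for other drive counts or MTBF ratings, the following short Python sketch (my own illustration; the function names are arbitrary) implements the Table 7.4 failure-rate and series-system formulas and reproduces the numbers in the example above.

```python
import math

HOURS_PER_YEAR = 8760

def drive_reliability(mtbf_hours: float, period_hours: float = HOURS_PER_YEAR) -> float:
    """Reliability R = e^(-lambda * T), where lambda = 1/MTBF (Table 7.4)."""
    failure_rate = 1.0 / mtbf_hours
    return math.exp(-failure_rate * period_hours)

def series_reliability(reliabilities) -> float:
    """Series system (Table 7.4): Rs = R1 * R2 * ... * Rn -- every component must survive."""
    result = 1.0
    for r in reliabilities:
        result *= r
    return result

# Example from the text: four drives, each rated at 100,000 hours MTBF.
r_single = drive_reliability(100_000)          # ~0.9161 (91.61%)
r_array = series_reliability([r_single] * 4)   # ~0.7044 (70.44%)
print(f"Single drive: {r_single:.2%}  4-drive array: {r_array:.2%}")
```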

Applying the preceding formulas further in a specific example, let’s look at the most critical subsystem on an Exchange server: the disk subsystem. Not only is the disk subsystem the most critical in terms of reliability, it also can have the most drastic effect on reliability. Also, because disk failure rates are widely known, this makes an excellent example for applying the preceding reliability formulas. Most disk drives manufactured today have an MTBF as high as 800,000 hours or more. Less than 10 years ago, disk MTBF was at less than 100,000 hours for most IDE and SCSI devices. Over the last few years, the specified MTBF rating of hard disk drives has risen significantly. OEMs and other customers who purchase these drives from manufacturers such as Seagate, Quantum, and IBM have grown to expect much better reliability than they did just 10 years ago. Some manufacturers are even touting MTBF ratings as high as 1 million hours! Be aware, however, that there is a substantial difference between theoretical MTBF ratings and their operational equivalents. When manufacturers set out to design new disk drive products, they create reliability models for these products early in the development process.

These models, however, are based on predictions and are theoretical. It is not until these products are developed, shipped, and put into production that field data on failure rates and root causes can be used to validate the models developed during the design phase. In addition, since a disk drive is actually made up of several components such as the head, platter, and associated circuitry, these models are much more granular than simple MTBF ratings for the entire disk drive. These models also lack the ability to factor in the other significant causes of drive failures such as design faults, firmware bugs, or failures induced in the manufacturing process. Mishandling, shipping damage, or the notorious No Trouble Found (NTF) are some other reasons for drive returns that are not factored into the theoretical MTBF ratings. From personal experience on the hardware side of things during the last 12 years, I have observed the many factors outside the MTBF ratings to be much more likely causes of disk drive failures. Table 7.5 provides an illustration of this from Quantum, comparing operational versus theoretical MTBF ratings.

Table 7.5: Comparing Operational Versus Theoretical MTBF

Failure Root Cause | Operational MTBF | Theoretical MTBF
Handling damage | Not included | Not included
NTF returns | Included | Not included
Infancy failures | Included | Not included
Mfg. process related | Included | Not included
Mfg. process test escapes | Included | Included
Random steady-state failures | Included | Included
Firmware bugs | Included | Not included
Example disk drive product MTBF | 200,000 hours | 700,000 hours

Source: Quantum, Inc., 2000.

Since there really is no industry standard for how MTBF ratings are determined for various components of a server such as a disk drive, many of these ratings will vary from vendor to vendor. Also, it is important to remember that vendors use these ratings for marketing purposes and are interested in advertising the highest ratings possible. Operational MTBF is much more valuable, but it is not available from most vendors (for the preceding reasons). Adding to this is the fact that most server hardware vendors from which you buy your Exchange servers have different standards as well. For example, most hardware vendors have their own qualification processes for the OEM disk drives they buy from vendors like Quantum and Seagate. Quite often, OEMs like HP and IBM require different firmware for these disk drives due to integration needs such as specific reliability-monitoring techniques or support for management applications and utilities. For example, in order to support HP's Insight Manager (HPIM) server-management application and reliability features such as predictive failure alerting and monitoring, HP requires its drive vendors to add special firmware changes that otherwise are not available in generic drives from that vendor. As is quite obvious, there are many factors that determine disk drive reliability.

To digress on this subject further as an illustration of reliability calculation, let's use a disk drive with an MTBF of 500,000 hours. If you used this drive in an Exchange server with a single disk drive, the failure rate (λ) is calculated as 1/500,000, or 0.000002. Thus, the reliability of the disk subsystem for that server (applying the formula from Table 7.4) is 98.2% over a 1-year time frame (8,760 hours). In other words, there is roughly a 1.8% chance that the drive (and with it the disk subsystem) will fail within a year of operation, ignoring all other subsystems and causes of failure. This is not what I would call a mission-critical system.

Solving the disk drive reliability problem with RAID

To counteract the reliability penalty that drive arrays suffer, researchers at the University of California, Berkeley (David Patterson and Garth Gibson) in 1988 conceptualized a technique called redundant array of inexpensive (later changed to independent) disks (RAID). RAID uses redundancy information stored in the array to prevent data loss, thereby offsetting the reliability issues of a simple drive array. In the initial paper introduced by Patterson and Gibson, there were only five RAID levels defined—RAID1 through RAID5. Figure 7.3 illustrates the RAID levels most commonly implemented in today's storage systems.

Figure 7.3: Illustration of RAID levels 0, 1, 0+1, and 5.

In later years, through marketing schemes and legitimate engineering work, additional RAID levels have emerged. The concept of RAID had three main objectives: increased performance, higher reliability, and lower cost, and each RAID level offers a different mix of the three. RAID levels 1 and 5 are the most commonly deployed. Table 7.6 presents the commonly known RAID levels, their functionality, characteristics, and trade-offs. In 1992, the RAID Advisory Board (RAB) was formed by a group of interested users and vendors to define generally accepted RAID levels and to educate the industry. Since that time, the RAB (www.raidadvisory.com) has evolved into a pseudo-standards body for RAID technology, and it provides certification and conformance ratings (for a fee, of course) for vendors of disk technology seeking RAID conformance.

Table 7.6: Comparison of the Five Original RAID Levels

RAID0 (drive striping)
  Implementation: A simple drive array. No data protection is offered by RAID0.
  Data and I/O characteristics: Excellent. No data-protection overhead and high I/O concurrency.
  Application fit: All. No performance penalty for read or write.
  Cost factor: Low. No redundancy requirement.

RAID1* (drive mirroring)
  Implementation: Data is replicated to a redundant device, which can be used for both performance and data-protection gains. Most vendors also implement this with RAID0 as striped mirroring (also known as RAID10).
  Data and I/O characteristics: Excellent. Read operations suffer no penalty, and write operations suffer only a 2:1 penalty.
  Application fit: All. Slightly sensitive to write-intensive applications.
  Cost factor: High. Twice the number of disks required for redundancy.

RAID2
  Implementation: Data is striped across all disks with parity (ECC) information on multiple disks.
  Data and I/O characteristics: Adequate for read operations, but incurs a substantial penalty for write operations. Superseded by RAID3 since it offers no advantage.
  Application fit: Applications that are primarily read oriented and not impacted by write overhead.
  Cost factor: Medium. Multiple parity disks required.

RAID3
  Implementation: Data is spread across all data drives in one I/O operation, and parity is stored on a dedicated drive.
  Data and I/O characteristics: Adequate for read- and write-oriented applications, but inferior to RAID1 and RAID5.
  Application fit: Applications that are primarily read oriented. Great for single-user/single-threaded I/O applications.
  Cost factor: Low. One parity disk required.

RAID4 (dedicated parity)
  Implementation: Data is spread across disks in a striped operation, with parity stored on a dedicated drive. I/O can be overlapped on reads, but writes are serialized by the dedicated parity drive.
  Data and I/O characteristics: Adequate for read-oriented applications but poor for write-sensitive applications, since every write must update the dedicated parity drive.
  Application fit: Applications that are primarily read oriented and not impacted by write overhead.
  Cost factor: Low. One parity disk required.

RAID5* (distributed parity)
  Implementation: Data and parity are striped across all disks in the array, so I/O operations can be overlapped across drives.
  Data and I/O characteristics: Excellent for read, but sensitive in heavy write environments. Each logical write I/O operation incurs four physical I/O operations.
  Application fit: Applications that are predominantly read oriented, with minimal write operations, or that are not sensitive to write-operation latencies.
  Cost factor: Low. All disks can be utilized for parity and data. Same cost factor as RAID3 and RAID4.

*Recommended for Exchange servers.

Many vendors have improved or combined the various RAID levels to achieve even greater performance and reliability. For example, most controllers combine RAID0 (striping) and RAID1 (drive mirroring) to provide striped mirroring (often called RAID0+1 or RAID10). Table 7.7 highlights some of the RAID levels that have emerged since the original RAID concept materialized in 1988.
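The write penalties noted in Table 7.6 translate directly into physical disk I/O when sizing an array for a given load. The short Python sketch below is my own illustration of that arithmetic (not a vendor sizing tool), assuming one physical I/O per logical read, the 2:1 write penalty the text cites for RAID1/RAID0+1, and the four physical I/Os per logical write cited for RAID5.

```python
def physical_iops(logical_iops: float, read_ratio: float, write_penalty: int) -> float:
    """Convert a logical I/O load into the physical I/O an array must service.

    read_ratio    -- fraction of logical I/Os that are reads (0.0 to 1.0)
    write_penalty -- physical I/Os per logical write (1 = RAID0, 2 = RAID1/0+1, 4 = RAID5)
    """
    reads = logical_iops * read_ratio
    writes = logical_iops * (1.0 - read_ratio)
    return reads + writes * write_penalty

# Hypothetical Exchange database volume: 1,000 logical IOPS, 75% reads / 25% writes.
for label, penalty in (("RAID0", 1), ("RAID1/0+1", 2), ("RAID5", 4)):
    print(f"{label:10s} -> {physical_iops(1000, 0.75, penalty):,.0f} physical IOPS")
```

In this hypothetical mix, the same 1,000 logical IOPS becomes 1,250 physical IOPS on RAID1/0+1 but 1,750 on RAID5, which is why the write penalty matters so much when choosing a RAID level for write-heavy volumes.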

Hardware versus software RAID

The argument of hardware-based versus software-based RAID configurations may be moot, since most understand the nonviability of software-based RAID in mission-critical environments. However, it is important to discuss this subject, since most operating systems like Windows Server provide software-based RAID, and many system managers are tempted to cut corners and implement it in place of superior hardware-based RAID. Technically, RAID technology can be software-based, hardware-based, or a combination of both. In general, the only time software-based RAID is combined with a hardware-based solution is for controller duplexing. With controller duplexing, two hardware-based RAID controllers can be mirrored, eliminating the controller as a single point of failure. However, since many vendors now provide hardware-based redundant controller configurations, the riskier approach of controller duplexing via software is less often used.

Table 7.7: Comparison of RAID Levels beyond the Original Five Introduced in 1988

RAID6
  Implementation: Similar to RAID5, but incorporates a secondary parity distribution scheme that offers higher data redundancy.
  Data and I/O characteristics: Excellent for read, but sensitive in heavy write environments. Each logical write I/O operation incurs four physical I/O operations per parity scheme.
  Application fit: Similar to RAID5.
  Cost factor: Medium. Specialized controller and additional overhead for the secondary parity scheme.

RAID7
  Implementation: "Computer in the disk" approach in which a real-time OS and a high-speed proprietary bus give the array many characteristics of a separate stand-alone computer.
  Data and I/O characteristics: Excellent for both read and write.
  Application fit: All. Typically only specialized or proprietary applications.
  Cost factor: High. Real-time OS and high-end hardware.

RAID10* (drive mirroring with striping—RAID0+1)
  Implementation: Data is replicated to a redundant drive set that is a RAID0 stripe set and that can be used for both performance and data-protection gains. Most vendors implement RAID1 with RAID0 (striped mirroring).
  Data and I/O characteristics: Best for read operations, with the overhead of a redundant write operation; however, gains are achieved because of the striped mirror sets.
  Application fit: All. Slight performance penalty for write, but not as substantial as RAID levels 2–5.
  Cost factor: High. Twice the number of disks required for redundancy.

RAID53 (or 35)
  Implementation: An array of stripes in which each stripe is a RAID3 array.
  Data and I/O characteristics: Higher performance than RAID3 or RAID5 by taking the best of both.
  Application fit: Applications that are mostly read oriented and are not heavily impacted by slower write operations.
  Cost factor: Medium. Specialized controller and more disks for the RAID3 stripes.

*Recommended for Exchange servers.

Windows Server supports RAID1 and RAID5 via the operating system. This is implemented with a Windows filter driver that intercepts I/O requests and redirects them according to the RAID configuration. Software RAID, in general, uses more system resources, since the operating system or device driver must handle the processing overhead that a RAID configuration requires. This is most severe in a RAID5 environment, in which parity encoding and decoding must occur on the fly. The main advantage of software RAID is that there is no expensive controller to purchase. Software RAID may have a lower cost than hardware RAID, but the real cost may be in lost processor cycles that could be spent doing application work. In my experience, I have seen software-based RAID perform very well in comparison with hardware-based RAID. This is particularly true with the fast processors available in today's PC servers. The main problem with software-based RAID (besides stealing processor cycles) comes when recovery is required. With software-based RAID, configuration data and intelligence are stored in the operating system device driver; if you cannot get the system to boot, that does you no good. To illustrate my point, take a look sometime at what is required to recover the system disk of a Windows 2000 server using software RAID1. Having to resort to my emergency repair disk and editing BOOT.INI is not my idea of a clean recovery (although this is better in Windows 2003). While software-based RAID may be tempting for your Exchange server, just say no if a mission-critical system is your goal.

Hardware-based RAID offloads the overhead required to encode and decode parity information and other RAID overhead to the hardware controller. Most hardware-based RAID controllers are really just a computer system on a board running a real-time operating system, complete with a high-speed processor, RAM, and ROM. This hardware device dedicated to the task of providing data protection frees the system processor(s) from having to manage the disk subsystem and complex processing tasks like parity generation. To further enhance performance, most controllers provide a cache that can be configured as read-ahead, write-back, or a combination of both. Of course, hardware-based RAID does come at some cost. This is the main reason some system managers may shy away from investing in this technology. However, if your interest is mission-critical Exchange servers, hardware-based RAID must be a part of that equation. I am not aware of any organization that has deployed Exchange Server without RAID disk subsystems (although I am sure many exist in the realm of small and medium-sized business deployments). By applying RAID to our original single-drive (98.2% reliability) example from earlier, we can see how RAID drastically improves Exchange Server reliability by making the most unreliable subsystem more reliable. Taking the formulas from Table 7.4 for reliability of parallel and series systems, we can see the effect that RAID has on the reliability of a four-drive disk subsystem in Table 7.8.

Table 7.8: Reliability Impact of RAID

No RAID—single drive
  Failure rate (λ) = 1/500,000 = 0.000002
  Reliability (R) = e^(–0.000002 × 8,760) = 98.2%

RAID0—series system (four drives)
  Failure rate (λ) for each drive = 1/500,000 = 0.000002
  Reliability (Rs) = 0.982 × 0.982 × 0.982 × 0.982 = 92.9%

RAID0+1 (RAID10)—series + parallel system (four drives)
  Failure rate (λ) for each drive = 1/500,000 = 0.000002
  Reliability (Rs) for each mirror set = 0.982 × 0.982 = 96.4%
  Reliability (Rp) for the subsystem (RAID0+1) = 1 – [(1 – 0.964) × (1 – 0.964)] = 99.87%

As you can see in Table 7.8, the reliability of a disk subsystem is greatly enhanced through the use of RAID technology (with the exception of nonredundant RAID0). With the addition of other features that server and controller vendors have added, such as redundant controllers, hot-plug drives, on-line sparing, and on-line volume growth, RAID technology is further improved. However, keep in mind that the disk subsystem is not the only subsystem in an Exchange server. In addition, hardware failures are only part of the cause of Exchange Server downtime. Within the context of this chapter, however, hardware—and specifically the disk subsystem—is our focus.
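The Table 7.8 figures can be checked with a few lines of Python using the same Table 7.4 formulas. This is a minimal sketch assuming the 500,000-hour MTBF drives from the example; small differences from the table values come from rounding at different steps.

```python
import math

HOURS_PER_YEAR = 8760
MTBF = 500_000  # hours, per the example in the text

# Single drive: R = e^(-lambda * T), lambda = 1/MTBF.
r_drive = math.exp(-(1 / MTBF) * HOURS_PER_YEAR)        # ~98.2%

# RAID0: four drives in series -- every drive must survive.
r_raid0 = r_drive ** 4                                  # ~92.9%

# RAID0+1: two striped sets of two drives (series), mirrored to each other (parallel).
r_stripe_set = r_drive ** 2                             # ~96.4%
r_raid01 = 1 - (1 - r_stripe_set) * (1 - r_stripe_set)  # ~99.87%

print(f"Single drive:       {r_drive:.2%}")
print(f"RAID0 (4 drives):   {r_raid0:.2%}")
print(f"RAID0+1 (4 drives): {r_raid01:.2%}")
```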

RAID controllers

The market for RAID controllers in the PC server space has been going strong for more than 10 years now. First, there were the PC server vendors themselves. Compaq (now HP), IBM, and others entered into this market early to provide high-performing and reliable disk subsystems for their servers. Third parties like Adaptec and DPT soon followed with server-neutral RAID controllers for the masses. When selecting a RAID controller (whether Direct Attached Storage–based or SAN/NAS-based), four key characteristics come to mind—performance, data protection, manageability, and vendor support.

Performance —For applications like Microsoft Exchange Server, the controller must be capable of handling high levels of I/O throughput and must provide maximum bandwidth and capacity. This requires that a controller be architected in such a way that RAID overhead is easily managed by onboard intelligence and that delivery of data to the host is fast and efficient by design. It also requires a streamlined controller architecture that utilizes the latest SCSI specifications to provide multiple channels for SCSI or Fibre Channel (FC) devices to attach. The controller must also include a high-speed processor capable of handling the intensive operations that RAID technology requires. Many controllers utilize high-speed RISC processors for this task.

Another important implementation point is how the controller manages RAID operations. Some vendors have invested huge amounts of R&D into developing silicon-based application-specific integrated circuits (ASICs) specifically for the task of managing RAID operations. Others rely on the controller processor, real-time OS, or in worse cases, device drivers to handle RAID operations. Optimized firmware is another key feature of a hardware-based RAID controller. A controller’s firmware is the source of its intelligence. This intelligence includes complex algorithms that are tuned to allow the controller to provide optimal performance as well as data protection. Features such as write coalescing, tagged command queuing, and others are implemented in the controller firmware. Unfortunately, you can only rely on performance tests or empirical data to determine if your vendor’s controller firmware is optimized.

The cache is not only important to a RAID controller’s overall performance, but a well-designed controller cache will also aid in complex RAID operations such as RAID5 parity generation. The cache also has a significant impact on data protection, which I will discuss more in the next section. As I mentioned earlier, most controllers provide a cache that can be configured as read-ahead, write-back, or both. Most vendors also allow this cache to be configured on a per-volume basis as well. The size of the cache is not as important as you may think. From a marketing standpoint, it seems that “more is better” has won the argument. Most performance benchmarks I have seen indicate that, after a certain point, the size of the cache does not matter that much. As the number of devices and channels on a controller increases, the size of the cache will also need to increase to provide maximum performance benefits. In the early days when Compaq first shipped its Intelligent Drive Array (IDA) and Smart Managed Array Technology (SMART) controllers, the size of the cache was 4 MB.

Benchmark data revealed that a larger cache size did not impact performance significantly enough to justify the additional cost. The final performance factor for a RAID controller is the overall controller data-path design and the host bus attachment. Certainly, PCI has become the host bus attachment of choice in the Intel server space. However, many controller vendors also have complex internal controller bus designs that get the data from the disk drives to the host bus in the most rapid and efficient manner. All of these are important performance features you should look for when selecting a controller that provides optimal performance and scalability for your Exchange server. Whether you are using locally attached, controller-based RAID storage or a SAN-based RAID controller, check out how well your vendor of choice fares on these points.

Data protection —Obviously, it does not matter how fast your controller is if it doesn't protect your data. When looking at data-protection features that are important for a RAID controller, several come to mind. First and foremost, since it is a RAID controller, what RAID levels does the controller offer? Most controllers on the market do not offer all of the RAID levels shown in Tables 7.6 and 7.7; most offer only a subset, such as RAID0, RAID1, RAID4, and RAID5. One of the most popular controllers on the market—the HP SMART family (which includes the SMART-2, SMART 3100, 3200, 4200, and so forth)—offers RAID levels 0, 1, 0+1, 4, and 5. For Exchange 2000/2003, the recommendation is RAID0+1 (striped mirroring—more on this later), so you want to make sure your controller supports it.

Another key data-protection point is the controller cache. When the cache is configured for read-ahead, there is no data-protection issue, since the cache is only being used for caching read requests from the disks. For write-back caching, however, a good cache design is critical to data protection. In fact, a fair amount of misinformation has circulated regarding the use of write-back caching with Exchange Server (see the sidebar for additional information). A cache design that ensures data protection should meet three criteria: memory protection, battery backup, and removability. To provide proper protection, a controller cache should use either error checking and correction (ECC) or a mirrored configuration. Some vendors provide both options. For example, a cache of 16 MB may actually be 20 MB in a RAID5-style memory configuration, or 32 MB in a mirrored configuration; a vendor will most likely select the former due to cost considerations.

Controller cache must also be battery backed. Batteries should be able to maintain cache data in the event of a power failure for several days (if you have not corrected the situation by then, you are not concerned about building mission-critical systems). In addition, batteries should be rechargeable and replaceable. As a side note, an uninterruptible power supply (UPS) does not constitute battery backup for a controller cache. Finally, the cache memory itself should be contained on some sort of daughtercard that can easily be removed in the event that the controller fails. Obviously, the daughtercard should also include the batteries (or what good is a daughtercard?). In the event that the system loses power or the controller fails, data can be maintained in the cache or transported to a new controller. Once a failure condition is resolved, the data maintained in cache will be properly written out to the drive array for which it was originally intended. The RAID level and cache design are the two most important points for ensuring data protection. Make sure you consider these points as you select hardware for your deployment.

start sidebar
Write-back caching and the infamous Exchange Server –1018 error

In my role at HP working on site with the Exchange development team, I have been working with (and suffering from) –1018 errors since 1996 when, before Exchange 4.0 shipped, Microsoft's internal IS organization was having these issues. Since that time, there have never been any documented cases from either Compaq or Microsoft of –1018 errors being caused by controller-based WB caching. Early on, however, a misguided but valiant effort was made by Microsoft PSS to explain –1018 errors as being a result of controller WB caching (actually mistaken for drive WB caching, which is different). Unfortunately this was bogus, as PSS had no data to back up its claims. Compaq responded at the time by asking Microsoft to fix the PSS Knowledge Base Q article, and Microsoft obliged. Unfortunately, the Microsoft field staff received the mistaken and misguided information and began to recommend that WB caching be disabled. Based on case statistics and my experience, turning off WB caching does not reduce the incidence of –1018 errors whatsoever—it only makes your Exchange servers run very slowly. In December 1998, the Exchange Server development team recognized that –1018 errors were as much an issue of how Exchange was responding to hardware errors as an issue of the hardware errors themselves. While –1018s are usually caused by hardware (although, up until Exchange 5.5, Exchange was the leading cause of its own database corruption), the software can be instrumented in such a way as to recover from errors. Exchange 5.5 SP2 was a result of this work and included code to retry on –1018 errors up to 16 times. At the same time, Compaq began working with Microsoft and customers directly to look deeper into –1018 errors and to capture customer cases. This was not because Compaq hardware caused –1018 errors but because Compaq was (and is) the only hardware vendor working in the Exchange development team and was willing to help Microsoft look closer at this issue. Again, this has nothing to do with WB caching. Compaq also recognized that one of the main issues was that some customers were not practicing good configuration management and did not upgrade firmware, drivers, and service packs as often as they should. In most cases of –1018 errors reported, the customer had not upgraded to a version of firmware or software that specifically addresses drive or controller issues leading to data corruption. Vendors are also working harder to make software and firmware upgrades easier to apply (the difficulty of which is the main reason that customers do not upgrade). Since Microsoft's release of Exchange 5.5 SP2 and further improvement in the Exchange ESE, both vendors and Microsoft have seen a reduction in the occurrence of –1018 errors. However, they have not been eliminated. Vendors and Microsoft are continuing to work together to look at this issue more closely.

The bottom line here is that the recommendation to disable WB caching on Exchange or SQL servers is outdated. Provided a controller has a battery-backed, memory-protected, and removable WB cache, there will be no issues. Vendors and Microsoft recommend that customers take full advantage of WB caching for optimal performance. Furthermore, organizations should invest in solid configuration-management practices to ensure that, when there are known software or firmware fixes, they are applied in a timely manner. Hardware devices will always fail, software bugs will continue to plague us, and we may never totally eliminate Exchange –1018s. However, let us at least pursue the right path and not go after red herrings such as WB caching.

end sidebar

Manageability —Being able to manage and configure your RAID controller easily may not be the most important criterion for selecting a RAID controller. However, in my opinion, this point makes mission-critical servers a nearer reality. Manageable RAID controllers have four key characteristics. First is the simple fact that a vendor provides rich configuration options. As mentioned earlier, some vendors allow the cache configuration to be modified and set to read-ahead, write-back, or both. In addition, many vendors let you select which arrays (defined as logical drives) on the controller to enable the cache for. In the case of Exchange 2000/2003, for example, you may choose to enable write-back caching for an array that holds a particular storage group or database, but disable write-back caching for an array holding another database in that storage group. To have this flexibility, the RAID controller needs a high degree of manageability.

Besides having many configuration options available for a controller, you need the ability to control some of these options on-line. Some controllers require the system to be rebooted to a BIOS-level configuration utility in order to modify any controller options or run controller diagnostics. Some vendors provide the ability to do both: system administrators can use the BIOS-level utilities when the server is off-line, but they are also able to manage their drive arrays while the server is up and running, using Win32 utilities on the Windows Server console. One example of this is tuning the cache configuration (read-ahead versus write-back) on the fly based on application requirements. My third criterion for manageable RAID controllers is support for on-line capacity expansion. Since our Exchange deployments are constantly growing (more data and more users), the ability to dynamically grow storage while the server is on-line servicing users becomes paramount. This will only become a full reality when the dream state of storage virtualization and dynamic storage resource management is reached.

In my discussions of SAN technology later in this chapter, we will look at this more closely. However, for locally attached, controller-based storage, many vendors offer on-line capacity growth options as well. In addition, Microsoft has added support for capacity growth in Windows Server.

Previous versions of Windows Server did not support on-line capacity expansion and required the server to be restarted for additional storage to be recognized. The last important criterion for manageable controllers is the ability to monitor and manage these devices remotely. Server vendors like IBM, HP, and Dell all offer advanced monitoring instrumentation of their RAID controllers via standard methods such as SNMP or WBEM. There are many other minor management capabilities for RAID controllers that various vendors support. While a manageable controller may be a luxury, it certainly will make life much easier for Exchange Server administrators as they manage the complex storage designs that Exchange 2000/2003 will create.

Vendor support —Regardless of how well your RAID controller performs or how easily it can be managed, vendor support can have a real impact on the reliability of your system. Vendor support is more than just a 7×24 toll-free number you can call when you have problems. It also includes how often the vendor updates key controller technology like firmware and device drivers. As I stated earlier, many data corruption issues I have seen have been the result of firmware or driver bugs. The degree to which a vendor will take responsibility and provide a timely fix is very important. Of course, once a fix is provided, you still need to apply that fix. Besides bug fixes, thorough on-line and off-line diagnostic utilities and easy updates are also important features that a controller vendor can provide. Sticking with top-tier vendors is usually the best bet here. There are many RAID controller vendors with products available; choosing the vendor that can best support you will ensure maximum uptime for your Exchange system.



