Exchange Server Sizing and Placement


This section describes the different types of Exchange servers and provides a model that will allow you to get a feel for sizing and placement of Exchange servers. For the purpose of this section, we will reference a fictitious company named WeirMoving, which is moving from Exchange 5.5 to Exchange 2003. One of WeirMoving's primary migration goals is the same as that of many companies today: consolidate servers and datacenters to lower costs. WeirMoving's Exchange 5.5 deployment was very distributed, and the company wants to move to either a single datacenter or a few datacenters. Because of the network requirements associated with moving users farther away from the servers, the company needs to plan for both the single-datacenter option and the multidatacenter option. The scenario presents clustering as a solution for high availability. Clustering, of course, is not required, but with more eggs in one basket, some sort of high-availability solution is needed. So, with that as a base, let's address server placement for WeirMoving.

Server placement will be based primarily on network constraints. The rule for capacity planning is that each user accessing an Exchange server via Outlook 2003 will require 4Kbps of available bandwidth. Therefore, a single datacenter will require enough bandwidth from all user locations to support that level of traffic. The same bandwidth requirement holds true for multiple datacenters as well. So, before you can determine the appropriate placement of servers, and therefore the level of server and site consolidation, the current and future available bandwidth from every location must be well understood. Additionally, if political, legal, financial, or technical reasons require a server to be deployed outside of a datacenter environment, server sizing will need to account for such a deployment.
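To make the 4Kbps rule concrete, the following Python sketch computes the WAN bandwidth each remote site would need toward a central datacenter. The site names and user counts are hypothetical examples, not figures from this chapter:

```python
# Minimal sketch of the capacity-planning rule above, assuming
# 4Kbps of available bandwidth per active Outlook 2003 user.
# Site names and user counts are hypothetical.

KBPS_PER_OUTLOOK_2003_USER = 4

sites = {
    "BranchA": 250,
    "BranchB": 1200,
    "BranchC": 60,
}

for site, users in sites.items():
    required_kbps = users * KBPS_PER_OUTLOOK_2003_USER
    print(f"{site}: {users} users -> {required_kbps} Kbps "
          f"({required_kbps / 1000:.2f} Mbps) of available WAN bandwidth")
```

If a site's required bandwidth exceeds what its WAN link can realistically spare, that site becomes a candidate for a regional datacenter or, as a last resort, a local server.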

The key design element for this example Exchange 2003 migration project is centralization.

Storage Groups

One major difference from Exchange 5.5 is that Exchange 2000/2003 introduces storage groups. A storage group is a set of Exchange Server databases that all share the same transaction log files.

Exchange 2003 supports up to four storage groups per server. Each storage group can contain up to five databases. As a result, each server can host a maximum of 20 databases.

Because disk and memory are reasonably cheap, the choice between filling a storage group with the maximum number of databases and spreading the same number of databases across multiple storage groups depends on the restore requirements and the SLA. Because transaction logs are associated with storage groups, the number of transaction logs can be quite large for a single storage group with five databases, thereby increasing restore time and transaction log playback. In such instances, it is advisable to create multiple storage groups to allow for parallel backups and faster recovery times. For our example, the storage group design is as follows:

  • Single datacenter : A single-datacenter deployment at the WeirMoving datacenter will be implemented on an N+1 cluster. Specifically, the cluster will contain four active nodes and one passive node. An active node is one that is actively serving end users. A passive node is one configured to take over if an active node fails, typically because of a hardware failure. Multiple passive nodes can be configured. A passive node should be configured with the same processing power as the active nodes; simply put, all nodes in the cluster should have the same hardware configuration in a strict N+1 configuration. Each node will be configured with four storage groups.

  • Multiple datacenters : As with the single-datacenter model, clustering will be implemented in a multidatacenter deployment. In our scenario, multidatacenter is defined as at least three datacenters, with an optional fourth. This model will consist of smaller cluster deployments in each of the datacenters. Again, the clustering will be N+1, but in this case with two active nodes and one passive node. As before, each node will be configured with four storage groups. (A quick per-node capacity check follows this list.)
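As that capacity check, here is a minimal Python sketch comparing the per-active-node user load of the two models. The cluster user capacities are taken from the server categories defined later in this section (20,000 users for the large central cluster, 6,000 for a regional cluster):

```python
# Sketch comparing the two clustered deployment models described above.
# User capacities come from the server categories defined later in this
# section (large central cluster: 20,000 users; regional cluster: 6,000).

deployments = [
    # (name, active nodes, passive nodes, max users for the cluster)
    ("Single datacenter (4+1 cluster)", 4, 1, 20_000),
    ("Regional datacenter (2+1 cluster)", 2, 1, 6_000),
]

for name, active, passive, max_users in deployments:
    users_per_node = max_users / active
    print(f"{name}: {active} active + {passive} passive nodes, "
          f"{max_users:,} users -> {users_per_node:,.0f} users per active node")
```

The per-node load (5,000 versus 3,000 users) is what drives the hardware differences between the two cluster configurations shown later in Table 12.5.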

Databases

An Exchange 2003 database actually consists of two files: the properties store (*.edb) and the streaming store (*.stm). These files have different access characteristics depending on the type of clients that will be supported. For MAPI clients exchanging e-mail with each other, the .stm file is not the final resting place for messages, even though the messages might have traversed SMTP when traveling between servers. This is because the message is tagged as Transport Neutral Encapsulation Format (TNEF), and the entire message (header information as well as content) is promoted to the .edb file. For Internet protocol clients (such as IMAP, POP3, HTTP, and SMTP), the streaming store is the primary storage location, with certain properties being stored in the properties store.

Although clients have their preferred file type, cross-conversions can occur. For example, if a POP3 client submits a message via SMTP, the message data is physically stored in the .stm file, and the message header data is automatically promoted as properties to the .edb file. Now assume that a MAPI client attempts to read the message. In this scenario, the message data in the .stm file is converted on the fly for reading purposes (the message and attachments are left in the .stm file and converted in memory). An entire message is promoted from the .stm file to the .edb file only if the message is modified. Under no circumstances is data promoted to the .stm file. All client conversions are performed in memory. In the majority of scenarios, there is no real benefit to splitting the .edb and .stm files onto separate spindles.

Database size can be controlled by limiting the number of mailboxes on each database as well as by setting mailbox quotas through mailbox store policies. Users in the example environment have been categorized into three classes (a quick sketch of the quota thresholds follows the list):

  • Regular : Regular users will have a mailbox limit of 100MB (with a warning at 80MB). It is expected that most users, more than 80%, will belong to this class.

  • VIP : VIP or executive users with a limit of 200MB (with a warning at 160MB).

  • Casual : Contractors, store workers, laborers, and so on are examples of casual users, with a limit of 20MB (with a warning at 16MB).
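Note that in every class above the warning threshold is 80% of the mailbox limit, so the thresholds can be derived for planning purposes. A minimal sketch; the class names and limits come directly from the list above:

```python
# Mailbox classes from the list above. In each class the warning
# threshold is 80% of the hard limit, so it can be derived directly.

WARNING_RATIO = 0.8

mailbox_classes = {
    "Regular": 100,  # MB
    "VIP": 200,
    "Casual": 20,
}

for name, limit_mb in mailbox_classes.items():
    warn_mb = limit_mb * WARNING_RATIO
    print(f"{name}: limit {limit_mb}MB, warn at {warn_mb:.0f}MB")
```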

Based on best practices from an operations perspective with respect to management, backup, recovery, and defragmentation, the following database or store design guidelines have been defined for the Exchange environment:

  • As a rule of thumb, 100MB per user (for all types of users) will be used as the limit for capacity planning purposes.

  • Regular, VIP, and Casual classes of users will exist on separate mailbox stores, where possible, so that different store policies can be applied to them.

  • Each storage group can hold up to five databases. In the example environment, each storage group will contain a maximum of four mailbox stores. The fifth one will be left open for expansion.

  • Each storage group and its associated databases will reside in the same logical volume.

  • The transaction log set associated with each storage group will be located on a separate logical volume of its own.

  • No database within a storage group will exceed 30GB. This is the absolute maximum size that a database will be allowed to grow to in the example Exchange environment. The 30GB limit is determined based on the following:

  • What is the current defined SLA? (Remember, we referred to this in the business requirements section.)

  • What is the SLA service restoration time frame?

  • What is the speed of the restoration solution? In essence, how much data can be restored within your SLA window? (Ensure that enough time is allowed for transaction log playback; conservatively, 30 seconds for each log file.)

Based on the answers to these three questions, our environment can alter the maximum database size based on the support that can be provided. For example, if a restore solution restores data at 15GB/hour, it will take two hours to restore a 30GB database. Factor in additional time for transaction log playback and other overhead (getting the right people in place, process, delays, and so on), and it might take four hours to restore a 30GB database.
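The arithmetic above is easy to capture in a small calculator. The following Python sketch estimates end-to-end restore time; the restore rate, log count, and overhead figures are hypothetical example values, and the 30-seconds-per-log playback estimate comes from the SLA questions above:

```python
# Rough restore-time estimator for the SLA discussion above.
# Hypothetical example values: a 30GB database, a 15GB/hour restore
# solution, 200 transaction logs to replay at a conservative
# 30 seconds each, plus fixed operational overhead.

def restore_hours(db_gb, restore_gb_per_hour, log_count,
                  secs_per_log=30, overhead_hours=1.0):
    data_restore = db_gb / restore_gb_per_hour
    log_playback = (log_count * secs_per_log) / 3600
    return data_restore + log_playback + overhead_hours

hours = restore_hours(db_gb=30, restore_gb_per_hour=15, log_count=200)
print(f"Estimated restore time: {hours:.1f} hours")
# 30/15 = 2.0h restore + 200*30s = 1.7h playback + 1.0h overhead = 4.7h
```

Run against your own restore rate and log counts, this yields a defensible maximum database size for your SLA window.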

There must be at least 30GB (or the size of the largest mailbox store, whichever is larger) of free space available to conduct any offline maintenance operations.

For the purpose of this design, four categories of Exchange servers will be defined for database and server sizing:

  • Large, central cluster: Up to 20,000 users.

  • Smaller, regional datacenter clusters: Up to 6,000 users.

  • Large metropolitan server: Up to 600 users.

  • Small site server: Up to 300 users.

Note

These numbers are specific to this fictitious environment and will likely be very different from those in your environment.


The HP and Microsoft recommended method for predicting the maximum size of an Exchange 2003 mailbox store is

 ((Mailbox Quota * Number of Mailboxes) / Sharing Ratio) + % Allocated to Deleted Items 

The following values are used in the calculations:

  • Mailbox capacity planning quota of 100MB per user.

  • Sharing ratio of 1.2; industry calculations suggest most servers have ratios from 1.2 to 2.5.

  • % Allocation to Deleted Items of 30% based on industry calculations (20% for 7 days, 30% for 14 days, and almost 100% for 30 days).

Based on this formula, the total message store size per database will be kept at a level that supports roughly 300 users while keeping the database size under 30GB.
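For concreteness, here is the formula above expressed as a small Python function, evaluated with the planning values just listed (100MB quota, a 1.2 sharing ratio, 30% deleted-item allocation). With exactly 300 users it yields roughly 32.5GB, which suggests the practical per-database user count lands a bit below the 300-user mark once the 30GB ceiling is enforced:

```python
# The HP/Microsoft store-sizing formula from the text:
#   ((Mailbox Quota * Number of Mailboxes) / Sharing Ratio)
#     + % Allocated to Deleted Items
# "+ %" is interpreted here as adding that percentage on top.

def max_store_size_mb(quota_mb, mailboxes, sharing_ratio, deleted_pct):
    base = (quota_mb * mailboxes) / sharing_ratio
    return base * (1 + deleted_pct)

size_mb = max_store_size_mb(quota_mb=100, mailboxes=300,
                            sharing_ratio=1.2, deleted_pct=0.30)
print(f"Predicted store size: {size_mb:,.0f}MB ({size_mb / 1024:.1f}GB)")
# -> 32,500MB (~31.7GB) for 300 users with these planning values
```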

Smaller numbers of users per database will exist on the nonclustered servers, although those numbers can be increased if needed.

Transaction Logs and Circular Logging

Each storage group has a set of transaction log files. The log files contain all the page modifications applied to the databases belonging to the storage group. There is no practical limit to the number of log files per storage group; however, it is critical to monitor their growth: if an Extensible Storage Engine (ESE) instance (that is, a storage group) cannot write transactions to its log files, it stops, thereby blocking all access to the databases belonging to the group. A log file is exactly 5MB in size; any difference in file size indicates corruption of some sort.
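Because every healthy log file is exactly 5MB, a simple disk-side check can flag both runaway log growth and suspect files. A minimal Python sketch; the directory path and the alert threshold are hypothetical, and 5MB is taken as 5,242,880 bytes:

```python
# Sketch: scan a storage group's log directory, report total log
# volume, and flag any log file that is not exactly 5MB, since a
# size mismatch indicates corruption. The path and alert threshold
# are hypothetical example values.

from pathlib import Path

LOG_SIZE_BYTES = 5 * 1024 * 1024          # every ESE log is exactly 5MB
ALERT_AT_GB = 10                          # hypothetical growth threshold

log_dir = Path(r"E:\ExchSrvr\SG1Logs")    # hypothetical log volume
logs = sorted(log_dir.glob("E00*.log"))

total_bytes = 0
for log in logs:
    size = log.stat().st_size
    total_bytes += size
    if size != LOG_SIZE_BYTES:
        print(f"WARNING: {log.name} is {size} bytes -- possible corruption")

print(f"{len(logs)} logs, {total_bytes / 1024**3:.2f}GB total")
if total_bytes > ALERT_AT_GB * 1024**3:
    print("ALERT: log volume growth exceeds threshold; check backups")
```

In practice such a check would run against each storage group's log volume; the only supported way to actually remove the logs, as noted next, is a full online backup.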

There is only one supported means of removing transaction logs: backing up the entire storage group (that is, a full normal online backup with an Exchange-aware agent). Because transactions for the different databases within a storage group are interleaved in a single transaction log file set, all the databases of the storage group must be backed up to truncate the log files.

The STORE process uses a write-ahead logging mechanism to modify the contents of the database. It first writes transactions to the current transaction log file. A separate thread of execution then updates the corresponding database pages, as the checkpoint or memory pressure in the cache dictates.

Exchange 2003 introduces a new checksum process for transaction logs. From the beginning, ESE has used checksums on database pages to detect page corruption. With Exchange 2003, checksums are also applied to the transactions stored in the transaction log files.

It is very important to protect the transaction log files: they represent the most up-to-date state of the database. Because writes to the databases are asynchronous, the current state of an Exchange database is represented by the set of modified pages buffered in memory plus the pages committed to disk. Only the transaction logs keep track of which pages have been modified in the Exchange database. In fact, it is better to lose a database than to lose the set of transaction logs, because it is always possible to recover the database from the previous backup and the current set of transaction logs.

Circular logging is configurable on a per-storage-group basis and is disabled by default. When enabled, Exchange does not maintain all transaction log files; instead, it maintains a window of a few log files, which are overwritten with newer transactions after they are committed to the database and the log becomes full. This helps manage disk space by preventing transaction logs from building up, but it prevents recovery of past transactions that have been overwritten. As a result, we will ensure that circular logging is disabled on all mailbox (data) servers to ensure maximum recoverability.

Connector servers and Outlook Web Access (OWA) servers store only transient data and will not be backed up or restored to/from tape. As a result, we will enable circular logging on connector and OWA servers to prevent the transaction logs from consuming disk space.

Clustering

During a design workshop, the pros and cons of a clustered Exchange environment should be discussed. Clustering is certainly useful when performing maintenance or upgrades on any one node of the cluster. Because our example environment is moving to a single datacenter (or at least to a few datacenters), clustering will be used as a means of high availability. Clustering is more complex than single-server deployments, so it is imperative that your personnel be appropriately trained. Testing all clustering configurations before production deployment is also critical, and processes and procedures specific to the cluster must be developed and tested. Although clustering is a viable option, certain factors must be kept in mind; some items are repeated here to underscore their importance.

Exchange clustering is very particular about hardware and software revision levels. Therefore, clusters must be tested thoroughly and, ideally, you should maintain a similar configuration in the test environment at all times so that any future patches, supporting software, or firmware upgrades can be tested in the lab prior to deploying them in production.

A clustered solution requires expertise and a detailed understanding of the various clustering concepts for troubleshooting and managing the Exchange cluster. Recovery processes for a failed cluster must be well documented and rehearsed in the test lab.

Future upgrades of clusters to new versions of Exchange might be complex. This held true for Exchange 5.5 clusters, which could not be upgraded directly to Exchange 2000 clusters.

Failover duration from one node to another depends on the number of concurrent connections at the time. With Exchange 2003, however, cluster failover is much faster than in previous versions because the STORE process is "killed" after a maximum of three minutes. From there, failover is as fast as the passive server can take over the service.

Server Sizing Considerations

The performance of Exchange 2003 servers is critical to upholding SLAs and providing an acceptable user experience.

The following will be used as guidelines for sizing Exchange 2003 servers in the example environment:

  • Processors : Exchange 2003 scales well up to four processors. As processors are added to the system, the incremental performance boost of each additional processor decreases (up to eight processors); beyond eight, additional processors actually start to degrade performance. Processor speed is important, but a larger L2 cache offers a greater performance boost, typically 5 to 7%.

  • Memory : An Exchange 2003 server scales well up to 4GB of memory. If an Exchange server has 1GB of memory or more, the /3GB switch should be added to the Windows Server 2003 Boot.ini file so that 3GB of virtual address space is made available to user-mode applications (that is, the Exchange store process). The newer /USERVA boot switch should also be used to make more efficient use of memory (Microsoft's published guidance for Exchange 2003 is /USERVA=3030).

  • Disk : Exchange 2003 uses a transaction-logging database engine. As a result, disk I/O is usually the first bottleneck for Exchange.

  • Exchange logs : Logs are written sequentially to record transactions, whereas the database files are accessed with random reads and writes. As a result, logs and databases should be separated onto different physical partitions to avoid I/O saturation.

  • RAID : Use hardware RAID for fault tolerance and performance. Use RAID1 for log drives unless the drive size is not large enough. Use RAID0+1 or RAID5 for database drives. Although RAID5 can be used, performance can degrade beyond seven disks in a RAID set, depending on the hardware manufacturer and implementation. There is a tradeoff among available disk space, performance, and reliability: the number of spindles required for RAID0+1 might be higher than for RAID5, but RAID0+1 offers a higher degree of performance and reliability.

  • Performance : For Exchange performance considerations, assume 0.5 to 1.2 I/Os per active user per second to determine proper spindle counts. For example, at 0.5 I/O/user/sec, if you have 5,000 active users against an Exchange server, your I/O subsystem must be able to handle 2,500 I/Os per second, sustained. (A spindle-count sketch follows this list.)
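To make the spindle arithmetic concrete, here is a minimal Python sketch. The per-disk IOPS figure and the RAID write penalties (2 for RAID0+1, 4 for RAID5) are common rules of thumb rather than values from this chapter, and the read/write mix is a hypothetical example:

```python
# Spindle-count estimator for the 0.5-1.2 IO/user/sec rule above.
# Assumptions NOT from this chapter: ~130 IOPS per 10K RPM spindle,
# RAID write penalty of 2 for RAID0+1 and 4 for RAID5, and a
# hypothetical 75/25 read/write mix.

import math

IOPS_PER_SPINDLE = 130
WRITE_PENALTY = {"RAID0+1": 2, "RAID5": 4}

def spindles_needed(users, io_per_user=0.5, read_frac=0.75, raid="RAID0+1"):
    total_io = users * io_per_user
    # Writes cost extra back-end I/Os depending on the RAID level.
    backend_io = (total_io * read_frac
                  + total_io * (1 - read_frac) * WRITE_PENALTY[raid])
    return math.ceil(backend_io / IOPS_PER_SPINDLE)

for raid in ("RAID0+1", "RAID5"):
    n = spindles_needed(users=5000, raid=raid)
    print(f"5,000 users at 0.5 IO/user/sec on {raid}: ~{n} spindles")
```

The point is not the exact spindle count but that RAID5's write penalty materially raises the spindle requirement, which is the tradeoff noted in the RAID bullet above.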

These guidelines will be used to size the various types of Exchange servers in the environment, such as mailbox servers, connector/Bridgehead Servers (BHSs), and OWA front-end servers. The object of this exercise is to come up with different sizing guidelines so that you can pick and choose the right configuration to host your user populations.

Server Categories and Hardware Sizing

For the purpose of this design, four categories of Exchange servers will be defined for database and server sizing:

  • Large cluster: Up to 20,000 users.

  • Regional datacenter cluster: Up to 6,000 users.

  • Large metropolitan server: Up to 600 users.

  • Small site server: Up to 300 users.

Based on the database size considerations and the server categories for hosting Exchange, the hardware configurations shown in Table 12.5 are being used for this sample environment.

Table 12.5. Exchange Server Types

Small Site Server (up to 300 users)

Comments: Exchange mailbox server. Database size < 22GB. Single storage group with two databases of 15GB each.

Configuration:

  • Dual [1] CPU (512KB L2 cache)

  • Memory: 512MB

  • NIC: 10/100Mbps

  • Storage minimum per server (internal PCI RAID controller with onboard battery-protected cache):

    • RAID1 (2 x 18GB) - partitioned for OS, IIS and event logs, and page file

    • RAID1 (2 x 18GB) - transaction logs

    • RAID1 (2 x 36GB) - databases

  • Local tape drive (DLT)

Metropolitan Server (up to 600 users)

Comments: Exchange mailbox server. Single storage group with three databases of 20GB each.

Configuration:

  • Dual CPU (512KB L2 cache)

  • Memory: 1GB-2GB

  • NIC: 10/100Mbps

  • Storage minimum per server (internal PCI RAID controller with onboard battery-protected cache):

    • RAID1 (2 x 72GB) - partitioned for OS, IIS and event logs, and page file

    • RAID1 (2 x 72GB) - transaction logs

    • RAID0+1 (4 x 72GB) - Exchange databases

    • Hot spare

  • Local tape drive (DLT) or backup to central backup server

Regional Cluster Servers (up to 6,000 users)

Comments: 2 active + 1 passive mailbox servers with SAN storage. Total database size: 960GB. Four storage groups per server, with four databases of 30GB each per storage group.

Configuration:

  • Quad CPU (1MB L2 cache)

  • Memory: 4GB

  • NIC: 2 x 100Mbps (redundant)

  • External or SAN-based storage (minimum) per server:

    • RAID1 (2 x 18GB) - transaction logs for Storage Group 1

    • RAID1 (2 x 18GB) - transaction logs for Storage Group 2

    • RAID1 (2 x 18GB) - transaction logs for Storage Group 3

    • RAID1 (2 x 18GB) - transaction logs for Storage Group 4

    • 4 x RAID5 or RAID0+1 (at 120GB per storage group) - at 600 sustained IO/sec per storage group, or 150 IO/sec per 30GB database, or 1,500 IO/sec per server, sustained

    • Hot spare

  • Tape library (SAN-based SDLT tape library)

Single Datacenter Cluster (up to 18,000 users)

Comments: 4 active + 1 passive mailbox servers with SAN storage. Total database size: less than 1.92TB. Four storage groups per server, with four databases of 30GB each per storage group.

Configuration:

  • Quad CPU (2MB L2 cache)

  • Memory: 4GB

  • NIC: 2 x 100Mbps (redundant)

  • SAN-based storage (minimum) per server:

    • RAID1 (2 x 18GB) - transaction logs for Storage Group 1

    • RAID1 (2 x 18GB) - transaction logs for Storage Group 2

    • RAID1 (2 x 18GB) - transaction logs for Storage Group 3

    • RAID1 (2 x 18GB) - transaction logs for Storage Group 4

    • 4 x RAID5 or RAID0+1 (at 120GB per storage group) - at 600 sustained IO/sec per storage group, or 150 IO/sec per 30GB database, or 1,800 IO/sec per server, sustained

    • Hot spare

  • Tape library (SAN-based SDLT tape library)

Front-End Server (up to 1,000 users)

Comments: OWA front-end server.

Configuration:

  • Dual CPU (512KB L2 cache)

  • Memory: 2GB

  • NIC: 1 x 100Mbps minimum (redundancy recommended)

  • Storage (internal PCI RAID controller with onboard battery-protected cache):

    • RAID1 (2 x 72GB) - partitioned for OS, IIS and event logs, and page file

    • RAID1 (2 x 72GB) - Exchange 2003 logs

    • RAID1 (2 x 72GB) - Exchange 2003 database

BHS

Comments: SMTP connector server.

Configuration:

  • Dual CPU (512KB L2 cache)

  • Memory: 2GB

  • NIC: 2 x 100Mbps

  • Storage (internal PCI RAID controller with onboard battery-protected cache):

    • RAID1 (2 x 18GB) - partitioned for OS, IIS and event logs, and page file

    • RAID1 (2 x 18GB) - transaction logs

    • RAID0+1 (4 x 72GB) - SMTP and X.400 folder queues

[1] Most customers tend to have a policy of purchasing only dual-processor systems. A single CPU would provide enough performance for both the 300-user and 600-user systems.

Some things to keep in mind related to the hardware recommendations being made:

  • On all servers in the previous table, the use of P4 Xeon processors is recommended where budget permits. P4 Xeon processors have been proven to provide increased performance with Exchange 2000/2003.

  • All Exchange 2003 servers will be installed with

    • Windows Server 2003, Enterprise Edition, except for the smaller mailbox servers

    • Exchange 2003 Enterprise Edition

  • All Exchange 2003 servers will be member servers in the child domain.

  • Every effort should be made to avoid deploying the small mailbox servers mentioned previously, as centralization will lower Total Cost of Ownership (TCO).

Server Placement

There are two basic, opposing approaches to server placement:

  • Centralized : Locate servers in centralized locations near administrators so they can be more easily supported, take advantage of server and product scalability, and lower the overall TCO.

  • Distributed : Locate servers near users so they receive optimal response times through their messaging clients.

The goal is to centralize servers while maintaining an adequate user experience. This is heavily dependent upon the available network bandwidth. As part of this design effort, some key server placement guidelines have been developed. We will deploy Exchange servers based on these guidelines only:

  • Exchange servers must be placed in datacenter locations (Tier 1) only. Every effort must be made to avoid placing Exchange servers at nondatacenter (for example, local office) locations.

  • An Exchange server should be placed at a location only if that location is also defined as a Windows 2003 AD site.

  • An average of 4Kbps per active Outlook 2003 user can be used to calculate the bandwidth required for concurrent client connections across the WAN to a Tier 1 location hosting Exchange. (That number rises to 6Kbps with prior versions of Outlook.) This metric can be used to decide which services to provide locally (for example, AD authentication services) and which services to supply remotely (applications such as Exchange).

  • Adequate expertise and support personnel must be available at the Exchange server locations to provide any onsite support for management and recovery procedures. Exchange is a critical business application for nearly any customer.

  • Adequate justification must be provided to the governance committee for deploying an Exchange server at a nondatacenter location or branch office. To better manage Exchange in a centrally consolidated environment, it is recommended to increase the network bandwidth to remote locations so that messaging services can be hosted centrally.

  • Although consolidation is a major consideration in the example environment, there might be instances where it is critical to provide local Exchange services to branch office locations. This situation will be avoided as much as possible, but if such locations do exist, it is important to provide a vision or roadmap for consolidating those Exchange servers to a datacenter location at a later date.

Based on the server placement guidelines in the previous list, in our example we will deploy servers in either of the following scenarios:

  • Central Datacenter : This style of deployment uses a single five-way cluster (N+1-style clustering) with four active nodes and one passive node.

  • Regional Datacenters : Made up of three or four locations.

As part of the design, a list of all Exchange servers by location and scenario should then follow, similar to that shown in Table 12.6.

Table 12.6. Central Datacenter: Exchange Server Placement

Drohan Datacenter

  • <servername> - Exchange Mailbox Server (Active)

  • <servername> - Exchange Mailbox Server (Active)

  • <servername> - Exchange Mailbox Server (Active)

  • <servername> - Exchange Mailbox Server (Active)

  • <servername> - Exchange Mailbox Server (Passive)

  • <servername> - Exchange Bridgehead Server (SMTP)

  • <servername> - Exchange Bridgehead Server (SMTP)

  • <servername> - Exchange OWA Front-End Server

  • <servername> - Exchange OWA Front-End Server


It is important to restate that a central datacenter will require a network infrastructure that can support 4Kbps of available bandwidth from every user's desktop to the datacenter.


