Performance Requirements

                 

 
Special Edition Using Microsoft SharePoint Portal Server
By Robert  Ferguson

Chapter 22. Example Scenario 3: Enterprise-Wide Solution


Selecting an appropriate server platform is an extremely important step in a successful portal deployment. Leveraging your hardware vendor's Microsoft product expertise (especially SharePoint, Exchange, and SQL) can make a huge difference in the success of your Portal project. Of course, the key individual server hardware requirements are related to CPU speed, RAM, and hard disk space, but assistance in designing servers for specific server roles (such as Portal servers dedicated to server crawling, Index servers, corporate versus business unit searching, or other servers dedicated to Exchange, SQL 2000, Active Directory Domain Servers, and so on) will assist you in budgeting, planning for, and installing each of these components.

The Sizing Questionnaire

To intelligently configure a Portal solution, details in areas like the following must be determined or estimated. Before Global selects its hardware partner for the SPS Pilot project, it will obtain a hardware-vendor-provided questionnaire covering these areas. Most questionnaires drive what is commonly called user-based sizing. That is, once Global completes the questionnaire, each vendor will take a shot at putting together a sizing, or proposal, describing what the overall SPS landscape should look like. User-based questionnaires seek to understand the number and nature of end users ultimately using the production (and/or test, sandbox, development, and so on) system during the peak hour of the peak month or season, so as to estimate "operations per second," and include the following:

  • Number of users: The total number of users who will ultimately use the site. In the case of Global Corporation, it is expected that 70,000 of the 90,000 total employees will be end users. However, for the purpose of the pilot, 700 of the 1,000 identified MANX personnel will actively use the SPS pilot.

  • Percent of active users each day: This percentage represents the share of users who are active on any dashboard on a given day, and therefore must account for all dashboards across the enterprise. In many enterprise Portal deployments, it is not unusual to see perhaps 20-30 percent of all users actually "active." Sometimes, this is called the number of concurrent users. However, this number varies by implementation, business group, region, and other factors. Take special care to be conservative without grossly overestimating the number of concurrent users, as it drives sizing calculations in a big way.

  • Number of operations per active user per day: Perhaps the most difficult data to capture, this represents the number of operations that a typical user performs from each dashboard over the course of a typical day. An operation could be many things: searching, retrieving, or editing documents; browsing a Web site or the home portal page; and so on. This number usually ranges from 1 to 10, and should consist of page views as opposed to site hits. Note that one simple method of collecting this kind of data is to collect page views from the Web server log of a pilot portal or other existing portal implementation.

  • Number of hours per day: This is the number of hours during which the system is planned to be available. This number ranges between 8 and 24 hours, depending on the specific customer-driven requirements. In the case of Global Corporation, the system is planned for 24-hour availability, with the exception of a four-hour "maintenance" window every other Saturday morning (which, for those interested, equates to 98.8% "availability").

  • Peak factor: This number represents how far the expected peak dashboard throughput exceeds the average throughput. While sometimes expressed as a percentage, it is more commonly depicted as a multiplier ranging from 1 to 5.

  • Breakdown of types of users: Many hardware vendors or systems integrators try to characterize SPS users into different classes. For example, certain workers might be deemed "power users" and therefore be assumed to consume a certain percentage of server resources. Other users might be termed "casual users," and consist of users performing an occasional bit of document management or searching activities. In the end, characterizing users is an inexact science, subject to interpretation by the various hardware and software vendors in the SPS market.
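
The 98.8% availability figure quoted for Global's maintenance window can be checked with a few lines of arithmetic. The following Python sketch assumes a 14-day cycle (one window every other Saturday); the function name is illustrative and not part of any SPS tooling.

```python
HOURS_PER_DAY = 24


def availability_percent(window_hours, cycle_days):
    """Percentage of time the system is up when it is taken down for
    window_hours once every cycle_days days."""
    total_hours = cycle_days * HOURS_PER_DAY
    return 100.0 * (total_hours - window_hours) / total_hours


# Global's plan: one four-hour window every other Saturday,
# modeled here as a 14-day cycle (an assumption).
print(f"{availability_percent(4, 14):.1f}%")
```

The same function can be used to see how a longer or more frequent window would affect the availability number that is shared with vendors.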

CAUTION

It is imperative that each vendor or systems integrator tasked with sizing the portal be provided the same data. That is, Global must share the same data with all of its partners, to facilitate a true "apples-to-apples" comparison once each partner generates a sizing or proposal. And Global must also ensure that terms like power users and casual users are understood by all parties, so as to be consistent across different vendors.


In the end, capturing the previous data seeks to address the following formula, and provides us with a "peak throughput" number measured in operations per second:

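(Number of users x percent of active users per day x number of operations per active user per day x peak factor) / (number of hours per day x 3,600 seconds per hour)

Here the percent of active users is expressed as a fraction (for example, 0.25 for 25 percent). Dividing the daily total by the number of seconds the system is available each day is what converts operations per day into operations per second.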

It is with this peak "operations/second" that we can go to the various hardware and solutions vendors and request solution designs (sizings) for Microsoft SharePoint Portal Server implementations.
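
As a rough illustration, the peak throughput calculation can be sketched in Python. To arrive at operations per second, the daily product has to be spread over the seconds the system is available each day, so this sketch divides by hours per day x 3,600; that step is an assumption drawn from the units involved. Only the 70,000-user figure comes from Global's scenario; the other inputs are illustrative values picked from the ranges quoted in the questionnaire items, and the function name is hypothetical.

```python
SECONDS_PER_HOUR = 3600


def peak_ops_per_second(users, active_fraction, ops_per_user_per_day,
                        peak_factor, hours_per_day):
    """Estimate peak throughput in operations per second.

    active_fraction is the percent of active users written as a fraction
    (0.25 for 25 percent).
    """
    daily_operations = users * active_fraction * ops_per_user_per_day
    available_seconds = hours_per_day * SECONDS_PER_HOUR
    average_ops_per_second = daily_operations / available_seconds
    return average_ops_per_second * peak_factor


# Illustrative inputs only: 70,000 users comes from the text; the other
# values are assumptions picked from the ranges quoted above.
estimate = peak_ops_per_second(
    users=70_000,
    active_fraction=0.25,
    ops_per_user_per_day=10,
    peak_factor=5,
    hours_per_day=24,
)
print(f"Estimated peak throughput: {estimate:.1f} operations/second")
```

With these assumed inputs the estimate comes to roughly 10 operations per second. Because each input is a multiplier, a modest error in any one of them (especially the percent of active users) moves the result proportionally.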

CPU and RAM Sizing

Sufficient CPU and RAM resources are required to provide an acceptable user response time. Sizing for CPU and RAM becomes critical when a large number of users is expected during a peak usage period. In this case, characterizing the users to determine an average number of "active" or "concurrent" users (users performing work), as well as an average number of operations per second, is important. With this information and business information regarding peak days and hours, it is possible to plan around the "peak" hour for which a productive solution will need to be sized. Note that if insufficient CPU or RAM resources are available, users will experience unacceptable server response times during these peak periods, and the solution will likely be deemed a failure by anyone trying to perform work at this time, regardless of how well the system performs during non-peak periods.

Pagefile Sizing

Though CPU and RAM are critical to response time, hard disks play a huge role in overall response times as well. The first role that disks play is in the form of the Pagefile. Pagefile sizing tends to be an art form for many, but the following are generally agreed-upon rules:

  • The total Pagefile size should be 2-3 times the size of physical RAM in each of the Portal servers.

  • The Pagefile should actually be split across two disk partitions, residing on two different physical drives. Usually, one Pagefile resides on the C: drive (to capture memory dumps, if desired), and another Pagefile resides on another drive not busy servicing other disk requests. For example, in addition to a pair of mirrored drives for the operating system, oftentimes a separate pair of drives may be added to a server specifically for the Pagefile and bulk of the SPS application executables.

  • The Pagefile should never reside on disks dedicated for logs, such as the Property Store log files, or the Web Storage System database log. By definition, logs are very write intensive. So, too, is the Pagefile, as it busily accepts pages in RAM that are deemed no longer valuable.

  • By the same token, the Pagefile should never reside on disks dedicated to the various SharePoint data documents. This is because the Pagefile would detract from the overall read/write performance of the data drives, and the busy data drives would impede write performance of the Pagefile. Thus, any database or storage drives are off-limits to Pagefiles, too.
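
The Pagefile rules above can be turned into a quick planning aid. The Python sketch below is only an illustration: the even split across two drives, the drive letters, and the role labels are all assumptions made for the example, since the rules themselves state only that the Pagefile should be split and kept off log and data disks.

```python
# Disk roles that the rules above declare off-limits for a Pagefile.
FORBIDDEN_ROLES = {"logs", "data"}


def pagefile_plan(ram_mb, low_multiple=2, high_multiple=3):
    """Total Pagefile range (2-3 times RAM), plus the share per drive
    when the Pagefile is split evenly across two physical drives."""
    total_min = ram_mb * low_multiple
    total_max = ram_mb * high_multiple
    return {
        "total_min_mb": total_min,
        "total_max_mb": total_max,
        "per_drive_min_mb": total_min // 2,
        "per_drive_max_mb": total_max // 2,
    }


def pagefile_candidates(drive_roles):
    """Drives that may hold a Pagefile: anything not dedicated to logs or data."""
    return sorted(d for d, role in drive_roles.items()
                  if role not in FORBIDDEN_ROLES)


# Hypothetical server with 1GB of RAM and four volumes.
print(pagefile_plan(1024))
print(pagefile_candidates(
    {"C:": "os", "D:": "executables", "E:": "logs", "F:": "data"}))
```

For the hypothetical 1GB server, the plan works out to a total Pagefile of 2,048MB to 3,072MB, placed on the operating system and executables volumes rather than the log or data volumes.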

In the next section, we will take a look at Global's general disk space requirements, and how they will approach sizing and configuration.

Disk Space General Requirements

Another critical factor concerning hard disk sizing is simple disk space. If there is insufficient hard disk space, end users will simply not be able to save additional documents. In fact, a lack of disk space could also impede document search capabilities.

RAID 1 Versus RAID 5

Finally, with regard to sizing, the performance of the overall data-serving functionality of the SharePoint Portal Server is most impacted by the configuration of the disk drives earmarked for logs, or to store the data.

In all cases, the performance of RAID 1 (mirrored) drives, or RAID 0+1 (a set of drives with data striped across them, mirrored to a similar set of drives) surpasses the performance of a RAID 5 set (striping data across all drives, with parity across all drives as well) in terms of both reads and writes. In many cases, in fact, a RAID 1 configuration will provide a 30-40% performance gain in reads over a similar RAID 5 configuration. And in RAID 0+1 or 1+0 configurations (depending on your hardware partner's implementation of this combination of "striping" and "mirroring"), write performance may be 3x that observed in a similarly configured RAID 5 solution. These are potentially huge numbers, and represent huge deltas in obtainable performance (see Figure 22.5).

Figure 22.5. Depending on the disk drives, controllers, and hardware-vendor specifics, RAID 1 typically exceeds RAID 5 read performance by 1-30%, and write performance by 20-300%, at a cost of approximately 2x in terms of drives, controllers, and storage system requirements.


Blocksize Selection

Another huge area affecting disk performance is the operating system blocksize selected for the data and log files. As you're probably aware, when a new Windows 2000 disk partition is created under NTFS, the default blocksize is 4 kilobytes, or 4KB. The blocksize is actually the increment of data used in file transfers. So, reading a 64KB file involves 16 discrete disk reads, or I/Os (input/output activity). Some of the fastest disk controllers can do about 20,000 I/Os per second, so if you do the math, you see that eventually the disk controller can simply move no more data. At this point, the disk controller becomes a bottleneck, and only the elimination of the bottleneck will speed up the system. That is, a true disk controller bottleneck will not be solved by adding more RAM or processing power.

Fortunately, today's high-speed disk controllers and drives are not only fast, but also optimized for larger block sizes. By leveraging large blocks to move data, the following benefits are realized:

  • Fewer disk I/Os are required to move the same amount of data.

  • The result is an increase in the number of megabytes (MB) per second that the disk controller effectively moves.

Thus, if we look again at our previous example, it becomes quickly apparent that we can move the same 64KB of data in fewer disk I/Os if we increase the blocksize. Reformatting the drive housing our data to an 8KB blocksize reduces the number of I/Os to only 8. And reformatting for 64KB blocksizes lets us move our data in a single I/O! It is no wonder that software and hardware vendors alike, including Microsoft, Compaq, Hewlett-Packard, and more, typically recommend 64KB blocksizes for SharePoint Portal Server data and log files.
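
The arithmetic behind this example can be sketched in a few lines of Python. The throughput figure is a deliberately simplified upper bound: it assumes the controller sustains the same 20,000 I/Os per second at every blocksize, which real hardware will not necessarily do. The function names are illustrative.

```python
import math


def ios_required(transfer_kb, blocksize_kb):
    """Number of discrete disk I/Os needed to move transfer_kb of data."""
    return math.ceil(transfer_kb / blocksize_kb)


def max_throughput_mb_per_s(ios_per_second, blocksize_kb):
    """Upper bound on data moved per second at a fixed I/O rate.
    Assumes the I/O rate does not drop as the blocksize grows."""
    return ios_per_second * blocksize_kb / 1024


for blocksize_kb in (4, 8, 64):
    print(
        f"{blocksize_kb:>2}KB blocks: "
        f"{ios_required(64, blocksize_kb):>2} I/Os for a 64KB file, "
        f"up to {max_throughput_mb_per_s(20_000, blocksize_kb):.1f} MB/s"
    )
```

The I/O counts (16, 8, and 1) match the figures in the text; the MB/s values are best read as a way to compare blocksizes against each other, not as a prediction of what a given controller will deliver.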

CAUTION

Changing blocksizes requires reformatting the drive, which destroys all data on it. To keep your data, back up all files to another disk or to tape before reformatting. Once the drive is reformatted, restore the backup to the newly formatted drive.


Disk Configuration and Layout

While performance and availability are both important, the cost of the solution must also be weighed. Disk configuration is driven by two fundamental needs:

  • The need to provide a robust system that can also be rapidly restored in the event of hardware or software failure.

  • The need to maximize the I/O of the database and operating system by segregating these activities on dedicated disk sets.

One disk subsystem sizing methodology that Global was exposed to used the following table to illustrate the base configuration requirements. The disk sizes and RAID levels indicated (see Table 22.1) are designed to provide for the base requirements of a minimal productive system installation.

Table 22.1. Disk Sizes and RAID Levels

Volume  Minimum Size  Recommended RAID Level    Contents
1       9GB           1                         W2K Operating System, Operating System Pagefile
2       9GB           1                         SPS Executables
3       9GB           1                         SPS Log Files
4       27GB          5 (RAID 0+1 preferred)    SPS Data/Database Tables

The OS, Pagefile, SPS Executables, and Log Files should always reside on a RAID 1 volume, regardless of cost. Note that a RAID 1 solution always costs more than a comparable RAID 5 solution, but that a RAID 1 solution can lose multiple mirrors, or copies, of the data, so long as at least one mirror remains intact.

RAID 5, on the other hand, is typically utilized for the data volume(s) when the cost of additional drives/controllers/storage systems is as much a consideration as raw I/O performance. It represents an excellent balance between cost and overall performance. With RAID 5, the equivalent of one drive (in a set of drives) is lost to maintain parity. Thus, a set of four 9GB drives in a RAID 5 configuration does not equal 36GB. Instead, it is 27GB, as expressed in the following formula:

(# drives * size of each drive) - (1 * size of each drive) = total usable SPS space

A RAID 5 solution may "lose" a single drive, therefore, and still remain operational. That is, the parity information striped across all of the drives allows for determining the data (by block) that would have been on the drive, so for all intents and purposes the system is up and available even when a drive has failed. Once a second drive in the array fails, though, you are in trouble!
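
To compare usable capacity across RAID levels, the RAID 5 formula above and the 50% rule for mirrored sets can be sketched in Python. The function names are illustrative, and the sketch ignores controller-specific overhead such as hot spares or reserved space.

```python
def raid5_usable_gb(drive_count, drive_size_gb):
    """Usable space in a RAID 5 set: one drive's worth is lost to parity."""
    if drive_count < 3:
        raise ValueError("RAID 5 requires at least three drives")
    return (drive_count - 1) * drive_size_gb


def mirrored_usable_gb(drive_count, drive_size_gb):
    """Usable space in a RAID 1 or RAID 0+1 set: half the raw capacity."""
    if drive_count < 2 or drive_count % 2 != 0:
        raise ValueError("Mirrored sets require an even number of drives")
    return (drive_count // 2) * drive_size_gb


print(raid5_usable_gb(4, 9))      # four 9GB drives in RAID 5
print(mirrored_usable_gb(4, 9))   # the same four drives, mirrored
print(mirrored_usable_gb(6, 9))   # drives needed to match RAID 5 capacity
```

Four 9GB drives yield 27GB in RAID 5 but only 18GB when mirrored; reaching the same 27GB in a RAID 0+1 set takes six drives, which is the cost trade-off to weigh against the performance gains described earlier.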

More on the Disk Controller

Not only are the RAID levels selected or drive configurations important for disk subsystem performance; the Disk Controller itself is equally, if not more, critical. For it is the Disk Controller that supports the ability to create high-performance and highly available RAID sets in the first place.

NOTE

Do not be tempted to implement RAID at an OS level for any enterprise-wide Portal implementation, as this robs the OS of CPU resources. Sure, it works, but only at the expense of performance. Operating system-based RAID incurs a layer of overhead that directly impacts the performance of all applications running on top of the OS. Protecting your data is best left to hardware-based controllers dedicated to the task.


Many Disk Controllers also support caching at the controller level. The best controllers provide battery-backed cache and configurable read/write caches, so as to assist you in both safeguarding and tuning your SharePoint Portal Server (see Figure 22.6). Check with your hardware vendor prior to buying, and ensure an apples-to-apples sizing/capabilities comparison when two or more vendors are being considered.

Figure 22.6. This high performance Disk Array Controller is worthy of an SPS production implementation, as it includes battery-backed read and write cache. In this way, no data is ever lost in the event of a power failure.


CAUTION

Be wary of using array controllers with write cache that is not battery-backed. Such controllers run the risk of losing or corrupting your data in the event of a power failure. That is, any writes still sitting in the write cache that have not actually been posted to the physical disk drives represent data that will be lost when the server loses power, if not backed up by a battery on the controller.


Log File Optimal Disk Configuration

For the two primary log types (the log files for the Web Storage System, and the log files associated with the property store), a pair of dedicated mirrored drives represents an optimal configuration. These files are shared across all workspaces, and therefore represent a potential bottleneck if not addressed properly in the first place. In systems requiring the highest levels of performance, dedicated disk controllers for the logs ensure that disk activity elsewhere does not interfere with log writing.

Data Store Optimal Disk Configuration

Sizing a disk subsystem to support the Microsoft Web Storage System Database, Streaming Database, and Property Store is more complicated than the log disk configuration. For many companies, the cost of a RAID 1 or 0+1 configuration, where only 50% of the physical disk space being purchased is actually available, is prohibitive. For others, though, like Global, the inherent performance gains that cannot be realized in a RAID 5 configuration dictate going with the more expensive approach. Work with your hardware vendor to best determine a configuration that meets your performance, availability, and scalability needs. And keep in mind that changing the layout and configuration of a disk subsystem in production is not a trivial task, so get it right the first time!


                 
ISBN: 0789725703
Year: 2002
Pages: 286
