Scaling for the File System


As companies take advantage of new technologies such as Volume Shadow Copy, Redirected Folders, and desktop backups, they need larger and larger file servers. The issue is that as more users access the systems, the servers are unable to keep up with the demand. Increasing the available disk space on servers only encourages users to store more data, which further hurts performance. Because the amount of data stored per user has historically grown without pause, the only option is to increase the scalability of the file servers.

Scalability is the key to reducing operation costs

Properly scaled file servers can be consolidated. Consolidation reduces hardware and maintenance costs and frees up valuable space in the data center. Scalability is the key to reducing operation costs.


Disk I/O Is Critical: SCSI/RAID/IDE

Most modern servers come with a SCSI controller for the disk subsystem with the option of a hardware RAID controller. It is important to distinguish a hardware RAID from a software RAID. The easy way to distinguish them is that a hardware RAID is RAID "all the time." A software RAID is only RAID after the operating system has started. Software RAID requires processor time and is generally less efficient. Hardware RAID traditionally offers more advanced features in the area of distributing memory caches and in dynamic reconfigurations such as adding drives to an existing array.

SCSI comes in many flavors: Wide, Ultra, Ultra Wide, Ultra 160, and Ultra 320. Each flavor refers to a specific type of drive it supports and an overall bandwidth of the bus. Ultra 320, for example, has a total bandwidth of 320MB/sec. The important thing to note is that the bandwidth of the controller doesn't have anything to do with the bandwidth of a drive; an Ultra 320 hard drive doesn't have a throughput of 320MB/sec. The advantage of the controller having that amount of bandwidth is that it can control multiple hard drives before it becomes the bottleneck in the system. This allows a server to scale more efficiently because adding hard drives will increase performance until they saturate the bus. When this occurs you have the option of adding more controllers and reallocating disks so that none of the controllers are oversubscribed.
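As a back-of-the-envelope illustration of that math, the following sketch estimates how many drives a single Ultra 320 bus can feed before the controller becomes the bottleneck. The per-drive sustained transfer rate is an assumed figure chosen purely for illustration; real drives vary widely.

BUS_BANDWIDTH_MB_S = 320       # Ultra 320 shared bus bandwidth
DRIVE_THROUGHPUT_MB_S = 60     # assumed sustained rate per drive (illustrative)

def drives_to_saturate(bus_mb_s: float, drive_mb_s: float) -> int:
    """How many drives it takes before their combined throughput fills the bus."""
    return int(bus_mb_s // drive_mb_s)

if __name__ == "__main__":
    n = drives_to_saturate(BUS_BANDWIDTH_MB_S, DRIVE_THROUGHPUT_MB_S)
    print(f"Roughly {n} drives at {DRIVE_THROUGHPUT_MB_S}MB/sec saturate a "
          f"{BUS_BANDWIDTH_MB_S}MB/sec bus; beyond that, add a controller.")

Past that point the advice above applies: add another controller and rebalance the drives across the buses.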

RAID traditionally refers to a hardware RAID controller with an attached set of disks. RAID takes attached disks to another level: by writing to the disks in a specific pattern, a system can increase read and write performance and continue serving data even after a disk failure has occurred. RAID is offered in several levels, each with different characteristics.

Using RAID over single attached disks allows servers to scale well because data is protected and access to data is improved. RAID technologies allow larger numbers of users to be supported on file servers. Striping disks allows the aggregate space to be treated as a single disk. This enables an administrator to surpass the physical limitations of a single disk.

BEST PRACTICE: Always Be Aware of the Implications

When configuring the cache on a RAID 5 subsystem, always be aware of the implications. Enabling write caching on a RAID 5 volume nearly eliminates the write penalty associated with RAID 5 as long as the cache is not full. Failure to commit the write cache to disk, however, will almost always result in file corruption, so ensure that the cache has a functional battery backup if the plan for the RAID 5 calls for write caching. Some controllers allow the cache to be physically moved to another controller without losing the RAID configuration or any of the data stored in the cache. This allows the drives to be moved to another system and the cache flushed to disk.


BEST PRACTICE: RAID Types

RAID 0 is referred to as a striped disk array. The data is broken down into blocks and each block is written to a separate disk drive: the first block goes to the first drive, the second block to the second drive, and so on across all the drives, after which the next block wraps back around to the first drive. I/O performance is greatly improved by spreading the I/O load across many channels and drives, because with multiple read/write heads the disks can be accessed simultaneously. This scales performance in a nearly 1:1 manner. (The sketch after this sidebar illustrates the block-to-drive mapping.)

RAID 1 is referred to as mirrored disks (or duplexed if there are two controllers). Any data that is written to disk 1 is written to disk 2 as well. There is no performance gain but if disk 1 fails there is an exact mirror of the data on disk 2 that can be utilized by the server.

RAID 5 is independent data disks with distributed parity blocks. As blocks of 0s and 1s are written across n-1 drives, their values are summed and the resulting 0 or 1 (remember, this is binary and we are checking parity, so two 1s become a 0) is written as a parity bit on the remaining drive. In RAID 5 the parity is distributed across all drives. RAID 3 is a similar concept except that the parity is kept on a dedicated drive; that drive sees more accesses than any other and becomes a bottleneck, so RAID 3 is rarely if ever seen in modern networks. RAID 5 has good read performance because the drives are read simultaneously and the additional heads scale performance. Writes, on the other hand, suffer a penalty because the parity must be read, recalculated, and rewritten. Because of the parity, a RAID 5 system can continue running if a drive is lost; when the failed drive is replaced, its contents are rebuilt from the data and parity on the remaining drives. (The sketch after this sidebar walks through the parity calculation and rebuild.)

RAID 6 is similar to RAID 5 but with the addition of a second parity disk. This allows the system to survive the failure of two disks in the array. It suffers from even worse write performance and is rarely seen in use.

RAID 0+1 is a combination of striping and mirroring. The disks are striped for performance and mirrored for redundancy. It is the least efficient use of disks, but it results in the best overall performance for applications that are both read- and write-intensive.

There are other forms of RAID (2, 7, 53, and so on) but they are rarely seen in production either because of a lack of performance advantage or because they are proprietary in design.
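The following minimal Python sketch makes the RAID 0 and RAID 5 mechanics above concrete. The block size, drive count, and sample data are arbitrary assumptions chosen for illustration; real controllers use much larger stripes and do this work in firmware.

from functools import reduce

BLOCK_SIZE = 4     # bytes per block -- tiny on purpose, purely for illustration
NUM_DRIVES = 4     # assumed number of drives in the array

def stripe_raid0(data: bytes, drives: int):
    """RAID 0: block i lands on drive i % drives (round-robin striping)."""
    blocks = [data[i:i + BLOCK_SIZE] for i in range(0, len(data), BLOCK_SIZE)]
    layout = [[] for _ in range(drives)]
    for i, block in enumerate(blocks):
        layout[i % drives].append(block)
    return layout

def xor_parity(blocks):
    """RAID 5-style parity: bytewise XOR of one block from each data drive."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

if __name__ == "__main__":
    # Striping: watch the blocks distribute round-robin across four drives.
    for n, drive in enumerate(stripe_raid0(b"AAAABBBBCCCCDDDDEEEE", NUM_DRIVES)):
        print(f"drive {n}: {drive}")

    # Parity: one stripe of three data blocks plus a parity block.
    data_blocks = [b"AAAA", b"BBBB", b"CCCC"]
    parity = xor_parity(data_blocks)
    # Simulate losing the drive holding b"BBBB": XOR the survivors with the
    # parity block to rebuild the missing data.
    rebuilt = xor_parity([data_blocks[0], data_blocks[2], parity])
    assert rebuilt == data_blocks[1]
    print("lost block rebuilt from parity:", rebuilt)

The same XOR that produces the parity also reconstructs a missing block, which is why a RAID 5 array keeps serving data through a single-drive failure, and why every small write pays the extra parity read and write described above.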


When Does an Environment Justify Using SAN/NAS?

As requirements for data storage and data access become extreme, a server with locally attached SCSI or RAID storage can become unable to keep up with the rate of requests for data. Network operating systems such as Windows, Unix, or even Linux are very good at handling and servicing data requests, but eventually they become overtaxed and another technology must be employed.

NAS stands for Network Attached Storage; SAN stands for Storage Area Network. The two technologies differ in one key area: NAS uses file-level access and SAN uses block-level access. A SAN allows another system to believe that a portion of the SAN is local raw disk. NAS uses an additional abstraction layer to make another system believe that a portion of the NAS is a virtual local disk. SAN generally offers higher performance and is often used for databases for exactly that reason. NAS has only recently entered the database arena as improvements in its technology and associated abstraction layers have produced performance that is sufficient for databases. NAS and SAN have a big advantage over attached storage in that they do not run a full operating system designed with hundreds of tasks in mind; they have very dedicated cores designed purely for high-performance data access.

The other key area in which NAS and SAN differ is their method of attachment. NAS runs over Ethernet (TCP/IP) and can take advantage of an existing LAN environment, although the use of Ethernet somewhat limits the bandwidth available to NAS. SAN, on the other hand, runs over Fibre Channel, which has much greater bandwidth than Ethernet but is also significantly more expensive. Not unlike most things in life, as performance goes up, so does cost.
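To make the file-level versus block-level distinction concrete, here is a brief sketch from a Windows client's point of view. Both paths are hypothetical, raw device access requires administrator rights, and the device name is only an example:

# File-level access (NAS-style): the storage device understands files and
# shares, so the client simply opens a UNC path over the network.
with open(r"\\nasbox\users\jsmith\report.doc", "rb") as f:
    header = f.read(512)

# Block-level access (SAN-style): the storage is presented as raw disk, and
# the client reads by byte offset; a file system layered on top decides what
# those blocks mean.
with open(r"\\.\PhysicalDrive1", "rb") as disk:
    disk.seek(64 * 512)      # jump to an arbitrary 512-byte sector boundary
    sector = disk.read(512)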

NAS and SAN offer great flexibility in their ability to centrally manage storage and dynamically allocate space to servers. Some technologies such as large node clustering and large application farms would be nearly impossible without NAS or SAN.

When an environment gets to the point where the file servers are unable to service user requests for data in a timely manner, or when attached storage capacity is simply exceeded, it is time to strongly consider a NAS or SAN. For applications like Terminal Server farms, where users attach to the system on any of the nodes, it is highly advisable to store the users' files on a SAN or NAS device. This ensures high-performance access to those files from any server in the farm. Without this type of central storage, managing users and their data would be very difficult.

Fibre Channel

Fibre Channel can be run across tremendous distances, and companies often use Fibre Channel networks to maintain mirrored data in other states. The bandwidth combined with the long-haul capability makes Fibre Channel a very valuable technology for data storage.


Remember RAM-disks?

Some situations call for extremely fast access to data but not necessarily large volumes of data. Computational analysis, databases, and system imaging software are just a few examples of applications that could benefit from extremely fast access to read-only data in the under-2GB range. This type of situation can greatly benefit from the use of RAM-disks. By partitioning off a chunk of system memory and treating it as a disk, you get memory-speed performance for applications that traditionally accessed disks. Although this information can be written back to disk for storage, doing so somewhat defeats the purpose of the RAM-disk. By preloading information into the RAM-disk, the system can spool out the data as fast as the network interface can handle. For situations like imaging hundreds and hundreds of desktops from a single server image, the increase in performance can be stunning. Although Windows Server 2003 does not natively offer a RAM-disk, there are several third-party RAM-disk programs available, such as RamDisk Plus from Superspeed or SuperDisk from EEC Systems.
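A quick way to see the effect is to time sequential reads of the same file from a physical disk and from a RAM-disk volume. The sketch below assumes a third-party RAM-disk is already mounted as R: and that the file exists on both volumes; both paths are hypothetical, and file-system caching can mask the difference on repeated runs.

import time

def read_throughput(path: str, chunk: int = 1024 * 1024) -> float:
    """Read a file sequentially and return throughput in MB/sec."""
    total = 0
    start = time.perf_counter()
    with open(path, "rb") as f:
        while True:
            data = f.read(chunk)
            if not data:
                break
            total += len(data)
    return (total / (1024 * 1024)) / (time.perf_counter() - start)

if __name__ == "__main__":
    # Hypothetical copies of the same desktop image on a physical disk (D:)
    # and on a RAM-disk volume (R:).
    for label, path in (("physical disk", r"D:\images\desktop.img"),
                        ("RAM-disk",      r"R:\images\desktop.img")):
        print(f"{label}: {read_throughput(path):.0f} MB/sec")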

RAM-disks Are Best Suited to Read-only Data

RAM-disks lose all data when the power is turned off. Ensure that the data will be committed to disk upon shutdown if the data will be read/write. Be aware that a system crash will result in any new data in the RAM-disk being lost. RAM-disks are best suited to read-only data.


Distributed File System

Another great way to scale file server performance is through the Distributed File System. DFS essentially enables you to hide the file servers behind an abstraction layer. Instead of accessing shares in the traditional \\fileserver\share manner, the user attaches to \\dfsshare\share. The DFS structure is composed of links to other file shares, which hides the location of the data from the user. The advantage is that shares can be moved to larger servers without the user having to remap her resources. Replicas of the data can be created, and Active Directory will allow the user to connect to the closest replica of the data. This allows a DFS structure to scale without consuming all available WAN bandwidth. It also offers a level of redundancy to the environment: if a DFS replica is down, users connect to the next closest source of the data. This gives you tremendous flexibility in scaling the file servers.
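Conceptually, a DFS link is just a logical path that resolves to one of several replica targets. The sketch below models that idea only; the namespace, server names, sites, and costs are invented, and in practice the referral is handled by Windows and Active Directory site costs rather than by client code:

# Conceptual model of a DFS namespace: one logical link, several replica targets.
DFS_LINKS = {
    r"\\company\dfs\engineering": [
        {"target": r"\\fs-chicago\engineering", "site": "chicago", "online": True},
        {"target": r"\\fs-dallas\engineering",  "site": "dallas",  "online": True},
        {"target": r"\\fs-berlin\engineering",  "site": "berlin",  "online": False},
    ],
}

# Rough stand-in for Active Directory site-link costs as seen from Dallas.
SITE_COST_FROM_DALLAS = {"dallas": 0, "chicago": 10, "berlin": 50}

def resolve(logical_path: str, site_cost: dict) -> str:
    """Return the closest online replica target for a DFS link."""
    candidates = [t for t in DFS_LINKS[logical_path] if t["online"]]
    if not candidates:
        raise RuntimeError("no replica of this link is currently available")
    return min(candidates, key=lambda t: site_cost[t["site"]])["target"]

if __name__ == "__main__":
    # A user in Dallas opening \\company\dfs\engineering lands on the Dallas
    # replica; if that server were offline, the Chicago copy would be next.
    print(resolve(r"\\company\dfs\engineering", SITE_COST_FROM_DALLAS))

If the Dallas target were marked offline, the same call would quietly fall through to Chicago, which mirrors the failover behavior described above.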

Excellent Candidate for DFS

Read-only data that is accessed heavily is an excellent candidate for DFS. By placing multiple replicas of the data on the same network the DFS structure will load balance the access to the data, resulting in excellent end-user performance.



