Most production systems use multiple disks to ensure recoverability, increase performance, and add fault tolerance. Multiple disks are also used for maintenance reasons. This lesson covers the various options available to accomplish these goals. These options range from simply placing each of your data files and transaction log files on separate disks to utilizing a combination of RAID and filegroups for very large databases (VLDBs).
A RAID system consists of two or more disk drives that provide greater performance and fault tolerance than a single large drive at a lower cost. RAID support is provided by software on the disk storage system (hardware RAID) or by the Windows operating system (software RAID). Because software RAID requires SQL Server 2000 to share processor resources with RAID, hardware RAID is generally used for maximum performance. However, software RAID will provide better performance than no RAID at all.
RAID levels 0, 1, 5, and 10 (also known as 1+0) are typically used with SQL Server 2000. A RAID system appears as a single drive to SQL Server 2000 on which a single file can be placed. Table 6.2 briefly describes each of these RAID levels and their performance characteristics.
There are three types of filegroups-primary, user-defined, and default. Each database can have a maximum of 256 filegroups. SQL Server 2000 always contains at least one filegroup, the primary file group. A filegroup can consist of multiple data files spread across multiple drives. Transaction log files cannot be part of a filegroup. The primary filegroup contains the primary data file containing the system tables. When you create secondary data files within a database, by default, these additional data files are placed in the primary filegroup. When you create database objects (such as tables and indexes) and add data to these objects, SQL Server 2000 uses the space within each of the data files within the filegroup proportionally, rather than allocating space from and writing data to one data file until it is full before writing to the next data file. This proportional fill method allows a database to be created that spans across multiple disks, with associated performance benefits. For example, if your SQL Server 2000 system has four disks, you could use one disk for the transaction log and the remaining three disks for the data files (one per disk).
Table 6.2 RAID Levels and Their Performance
Consists of two or more disks. Data writes are divided into blocks and spread evenly across all disks. Known as disk striping. This level is the easiest level to implement.
Fastest read and write performance; uses 100% of disk capacity. Not fault tolerant. The failure of one drive will result in all data in an array being lost. It is not advisable to use level 0 for data in mission-critical environments.
Generally consists of two disks (some RAID implementation support more than two disks). Data writes are written completely to both disks. Known as disk mirroring (or disk duplexing if two controllers are used).
Read performance almost twice as fast as a single disk, and fault tolerant. Relatively slow write performance, and uses only 50% of disk capacity.
Consists of three or more disks. Data writes are divided into blocks and written across all disks along with a parity stripe for data recovery.
Fast read performance, efficient use of disk space, and fault tolerant. Slow write performance. Uses 1/ n of disk capacity for parity information where n equals the total number of disks.
Consists of four or more disks. Data writes are striped across two or more disks and then mirrored across the same number of disks. Known as mirroring with striping.
Fastest read and write performance and fault tolerant. Uses only 50% of disk capacity.
Rather than placing all secondary data files in a single filegroup, you can create user-defined filegroups into which to place these secondary data files. On a system with multiple filegroups, you can specify the filegroup into which a newly created database object will be placed. This can have performance benefits, but also requires more administrative overhead and performance tuning expertise. If no filegroup is specified when a database object is created, it is placed in the default filegroup. The default filegroup is the primary filegroup until the default is changed. To change the default filegroup, use the ALTER DATABASE Transact-SQL statement.
Configuring your SQL Server 2000 disk subsystem for performance, fault tolerance, and recoverability means achieving the best read and write performance you can afford for the transaction log, the data files, and the tempdb database, while not sacrificing fault tolerance or recoverability.
You should choose your disk subsystem for your transaction log to reflect the fact that the primary function of the transaction log is to provide for recoverability of your data in case one or more of the disks containing your data files should fail. Next, the disk subsystem should reflect the fact that transaction log records are written serially on a continuous basis (sequentially and synchronously), but are only read for backups, to perform transaction rollbacks, and during database recovery when SQL Server 2000 starts. When the transaction log is read for backups, the load can be heavy.
Based on the primary function of the transaction log, the disks containing the transaction log files should not be shared with the data files for the database. Obviously, if your data files and your transaction log files share a common physical disk, you cannot completely recover your data if that disk fails. If your SQL Server 2000 installation contains multiple databases, at a minimum, place the database files from one database on the same physical disk as the transaction log file from another database. This will ensure recoverability of each database if any single disk fails.
The next level in optimizing your transaction log performance is to use dedicated disks for the transaction log files for each database. If you use a disk dedicated to a single transaction log, you can ensure that the disk head remains in place from one write operation to the next. Also read requests will be faster because the data will be laid down sequentially on the disk. Thus, using separate disks for the data files and the transaction log files has a transaction log performance benefit as well as being essential for recoverability.
The next level in optimizing your transaction log performance is to implement a RAID 1 system. This is more expensive (50 percent of your disk space is redundant), but yields significant performance and recoverability benefits. RAID 1 offers almost double the throughput on disk reads (for better backup performance) and minimizes downtime (if one disk fails, the data access shifts to the other). Data writes are somewhat slower, although faster than RAID 5. You can minimize the cost of this solution by minimizing the size of your transaction log (and thereby the size and cost of the disks) and by performing frequent transaction log backups.
You should choose your disk subsystem for your data files to reflect the fact that data loss and downtime are generally unacceptable. As discussed in the preceding section, the first step in improving the performance and recoverability of your database is to place the data files and the transaction log files on separate disk subsystems, and to use dedicated disks (if possible) for your data files so that data read and write (I/O) requests are not competing with other I/O requests on those disks.
The next level in optimizing your disk system is to improve I/O performance. Although one large disk can store as much data as a number of small disks, splitting your data among a number of disks results in increased I/O performance (particularly when multiple controllers are used). This increased I/O results from the ability of SQL Server 2000 to perform parallel scans for data using separate operating system threads when the data is spread across multiple disks. There are a number of ways to spread data across multiple disks.
Using multiple data files to spread across multiple disks in a single filegroup is one way to accomplish this. Using RAID to spread a single file (and its data) across multiple disks will also accomplish this, and can achieve superior performance. Using RAID can also provide fault tolerance. If you understand the data access patterns on your system, you can also use multiple filegroups rather than RAID (or in addition to RAID) to place heavily accessed tables and indexes on separate disks to improve I/O performance. However, in most cases, RAID provides most of the benefits of filegroups without the administrative overhead of multiple filegroups. When you have VLDBs, using multiple RAID systems and grouping RAID files in filegroups may be necessary for both performance and database maintenance reasons.
Given the administrative complexity of filegroups, database administrators usually choose RAID for their data files rather than filegroups. RAID 5 is frequently the first choice because of its efficient use of disk space and its fault tolerance. The downside of RAID 5 is relatively poor write performance. RAID 5 is a good choice for systems requiring high read performance and moderate write performance. However, if you need high write performance, choose either RAID 0 or RAID 10.
When you are choosing between RAID 0 and RAID 10, the lower reliability and slower recovery of RAID 0 dictate choosing RAID 10. However, short-term cost factors may make RAID 0 a necessary choice. Although RAID 10 is initially an expensive solution, it provides the performance of RAID 0 and the reliability and recoverability of RAID 1 (the cost of downtime when a drive fails may justify RAID 10).
If you need to use multiple RAID systems, you can choose to place each data file (each RAID system is generally a single data file) in the same filegroup, or you can use multiple filegroups. If you really understand the data access patterns in your database and are very good at performance tuning, you may be able to achieve some performance gain by using different filegroups. In a VLDB environment, using separate filegroups for maintenance reasons may determine how you configure your data files. For example, you may need to back up files or filegroups separately on a VLDB to be able to achieve an effective backup and restore strategy. Backup and restore strategies are covered in Chapter 8.
You should choose your disk subsystem for the tempdb database to reflect the fact that this database is used only for temporary storage for work files (such as intermediate result sets used in complex queries and DBCC operations). Optimizing the tempdb database means enabling it to handle a high volume of reads and writes. Recoverability is not an issue because tempdb is rebuilt each time SQL Server 2000 starts.
You should begin by placing the tempdb data file on its own disk so that it is not competing with other database objects for disk access. Next, consider using a dedicated disk controller for this separate disk. Finally, if tempdb is known to be a bottleneck, use RAID 0. The lack of fault tolerance is irrelevant to tempdb. No data is permanently stored in the tempdb database.
You can detach data and transaction log files from an instance of SQL Server 2000 and reattach them to the same instance or a different instance. Detachment is useful for moving a database to another instance or another server. It is also used to move data and transaction log files to different physical drives. You can use SQL Server Enterprise Manager or Transact-SQL to detach and reattach a database and its associated physical files.
To detach a database using SQL Server Enterprise Manager, right-click the database you want to detach, point to All Tasks, and then click Detach Database. See Figure 6.18.
The Detach Database dialog box.
If users are connected to this database, you can click the Clear button to disconnect them and complete the detachment process. You also have the option to update statistics before you detach. Updating statistics before detaching is used when you are moving the database to read-only media (such as a CD-ROM).
After a database has been detached, you can move one or more of the physical files to a new location (such as a dedicated disk or a RAID drive). To reattach a database using SQL Server Enterprise Manager, right-click Databases, point to All Tasks, and then click Attach Database. Figure 6.19 displays the dialog box that appears.
The Attach Database dialog box.
In the Attach Database dialog box, you must enter the complete name and path of the primary data file (the Browse button is available). The primary data file contains the information necessary to find the other files that make up the database. SQL Server 2000 reads this primary file and displays any secondary data files and the transaction log file to attach. If SQL Server 2000 does not find any of these files, it will place a red X in the check box next to that physical file. See Figure 6.20. This will occur whenever you move the secondary or transaction log files to a different location from the primary file. Edit the Current File(s) Location column for each file that has moved. You can also change the database name at this point. Finally, specify the owner of the database being attached and click OK.
Reattaching a database using SQL Server Enterprise Manager.
To detach a database using Transact-SQL, use the sp_detach_db system stored procedure. For example, to detach the TSQLDB database without updating statistics, use the following statement.
Sp_detach_db TSQLDB, TRUE
To reattach a database using Transact-SQL, use the sp_attach_db system stored procedure. When you use this stored procedure, you can specify up to 16 filenames that are included in the database that you want to attach. The filename list must include at least the primary file, because this file contains the system tables that point to the other files contained in the database. The filename list must also include any files that were moved after the database was detached. For example, to attach the TSQLDB database, use the following statement:
sp_attach_db TSQLDB1 , @filename1 = 'C:\Program Files\Microsoft SQL Server\MSSQL\Data\TSQLDB.mdf'
The first step in using multiple disks to optimize your SQL Server 2000 production environment is to use separate disks for your transaction log files and your data files. This separation will ensure recoverability in case of a disk failure. Use dedicated disks for all SQL Server 2000 files where possible. Next, use RAID 1 for your transaction log. Use RAID 10 if possible for your data files. Consider using either RAID 0 or RAID 5 if you cannot justify RAID 10 because of financial constraints.