Database Tuning Areas to Consider

Consider the following areas within a database that can be tuned for improved performance:

  • I/O

  • Queue length and response times

  • Load balancing

  • Global memory

  • Logging device

The following sections give an in-depth look at each of these areas.

I/O Tuning

You must make many decisions when it comes to I/O tuning. Will you use raw devices or file systems? What about direct I/O? What blocksizes should you choose for your database? If you are running a strictly OLTP (online transaction processing, characterized by small, random reads/writes) workload, you want to choose a smaller blocksize, for example, 2K. For DSS workloads with long-running queries, on a database that implements a sophisticated query optimizer and complex memory (sort/hash area) parameter controls, a larger blocksize improves database scans: 8K, or larger if the database supports it. What if your workload entails both OLTP and DSS? Careful database parameter tuning needs to be considered; in some instances, compromise is the way to go, perhaps settling on a 4K blocksize.

Queue Length and Response Times

On Linux, vmstat is a good tool for measuring I/O bandwidth. Two columns under the I/O section are labeled bi and bo. According to the vmstat man pages, these columns report blocks in and blocks out from a block device. However, in various Linux distributions these columns actually report the transfer rate in KBps from character devices (raw) or block devices (file systems) during the measurement interval. For both workloads, an average queue length greater than 1 likely indicates contention. For OLTP, response times over 50ms are a concern.
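One way to apply the thresholds above is to scan iostat -x style data, where avgqu-sz reports the average queue length and await reports response time in milliseconds. The following sketch checks both thresholds; the sample file, device names, and numbers are hypothetical:

```shell
#!/bin/sh
# Hypothetical, trimmed-down iostat -x data: device, avgqu-sz, await (ms).
cat <<'EOF' > /tmp/iostat_sample.txt
sda 2.61 7.05
sdb 0.00 0.00
sdi 0.63 0.91
sdz 1.80 62.40
EOF

# Flag queue lengths over 1 (likely contention) and await over 50ms
# (a concern for OLTP workloads).
awk '$2 > 1  { print $1 ": queue length " $2 " suggests contention" }
     $3 > 50 { print $1 ": await " $3 "ms is high for OLTP" }' /tmp/iostat_sample.txt
```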

Load Balancing

In Linux, several tools can help you determine whether your database system is in need of load balancing. A simple way to do this is to use iostat (for descriptions of iostat, see Chapter 4, "System Performance Monitoring," under the "I/O Utilization" section).

Following is an example of iostat -x output:

 Device:  rrqm/s wrqm/s    r/s    w/s  rsec/s  wsec/s   rkB/s   wkB/s avgrq-sz avgqu-sz  await  svctm  %util
 sda        0.00   0.00 272.86  97.37 2489.63  842.95 1244.81  421.47     9.00     2.61   7.05   2.10  77.59
 sdb        0.00   0.00   0.00   0.00    0.00    0.00    0.00    0.00     0.00     0.00   0.00   0.00   0.00
 sdc        0.00   0.00 272.22 102.00 2495.50  878.43 1247.75  439.21     9.02     2.64   7.05   2.04  76.42
 sdd        0.00   0.00   0.00   0.00    0.00    0.00    0.00    0.00     0.00     0.00   0.00   0.00   0.00
 sde        0.00   0.00 277.96  99.77 2532.04  865.89 1266.02  432.94     9.00     2.59   6.87   2.05  77.29
 sdf        0.00   0.00   0.00   0.00    0.00    0.00    0.00    0.00     0.00     0.00   0.00   0.00   0.00
 sdg        0.00   0.00 272.29 102.00 2483.49  878.69 1241.75  439.35     8.98     2.62   7.01   2.06  77.20
 sdh        0.00   0.00   0.00   0.00    0.00    0.00    0.00    0.00     0.00     0.00   0.00   0.00   0.00
 sdi        0.00   0.00   0.00 693.03    0.00 6326.38    0.00 3163.19     9.13     0.63   0.91   0.91  63.00
 sdj        0.00   0.00   0.00   0.00    0.00    0.00    0.00    0.00     0.00     0.00   0.00   0.00   0.00
 sdk        0.00   0.03 276.03  98.80 2542.71  855.49 1271.36  427.74     9.07     2.60   6.94   2.05  76.67
 sdl        0.00   0.00   0.00   0.00    0.00    0.00    0.00    0.00     0.00     0.00   0.00   0.00   0.00
 sdm        0.00   0.00 272.92  96.73 2480.03  836.28 1240.01  418.14     8.97     2.55   6.89   2.06  76.30
 sdn        0.00   0.00   0.00   0.00    0.00    0.00    0.00    0.00     0.00     0.00   0.00   0.00   0.00
 sdo        0.00   0.00 272.19 100.47 2486.96  866.16 1243.48  433.08     9.00     2.64   7.10   2.07  77.21
 sdp        0.00   0.00   0.00   0.00    0.00    0.00    0.00    0.00     0.00     0.00   0.00   0.00   0.00
 sdq        0.00   0.00 268.09  96.57 2445.08  838.95 1222.54  419.47     9.01     2.53   6.93   2.09  76.18
 sdr        0.00   0.00   0.00   0.00    0.00    0.00    0.00    0.00     0.00     0.00   0.00   0.00   0.00

 avg-cpu:  %user   %nice    %sys %iowait   %idle
           90.63    0.00    8.90    0.47    0.00

Notice that some disks are not being accessed at all. You should make sure that all tables are evenly laid out across all disks if a software striping capability is not used. Note that disk sdi is actually performing writes during this read-only piece of the benchmark. This is because the logs are apparently located there. Logs should be on a separate stripe volume, or a separate disk where possible, so that disk sdi is not slowed down by other aspects of the benchmark.
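A quick way to spot such idle disks is to scan iostat-style data for devices with no read or write activity. This sketch uses a hypothetical sample file with three columns (device, r/s, w/s):

```shell
#!/bin/sh
# Hypothetical per-device read/write rates: device, r/s, w/s.
cat <<'EOF' > /tmp/disk_activity.txt
sda 272.86 97.37
sdb 0.00 0.00
sdi 0.00 693.03
sdj 0.00 0.00
EOF

# List devices doing no I/O at all: candidates for spreading tables onto.
awk '$2 == 0 && $3 == 0 { print $1 " is idle; consider laying out tables across it" }' /tmp/disk_activity.txt
```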

Global Memory

In general, for OLTP workloads, as much of the I/O as possible should be moved to memory using the global cache area of the database. Most databases provide tools that allow you to see whether user transactions are being cached, including statistics on dirty buffers and buffers used. To properly size the memory under Oracle, the db_block_buffers parameter needs to be set. This is done simply by determining how much free memory is available to be dedicated to the database and then dividing that by the db_block_size, as follows:

On a 4GB system, 2.5GB might be dedicated to the database, so 2684354560 / 4096 = 655360 buffers
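The arithmetic above can be sketched in shell. The variable names here are illustrative, not the database's own parameter names:

```shell
#!/bin/sh
# 2.5GB of a 4GB system dedicated to the database cache, in bytes.
db_memory_bytes=$((2560 * 1024 * 1024))
# Database block size of 4K.
db_block_size=4096
# Number of buffers = dedicated memory / block size.
db_block_buffers=$((db_memory_bytes / db_block_size))
echo "$db_block_buffers"    # prints 655360
```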

Following is an example of the memory-related parameters from a database configuration listing, including the buffer pool size:

 Database heap (4KB)                            (DBHEAP) = 6654
 Size of database shared memory (4KB)  (DATABASE_MEMORY) = AUTOMATIC
 Catalog cache size (4KB)             (CATALOGCACHE_SZ) = 386
 Log buffer size (4KB)                        (LOGBUFSZ) = 2048
 Utilities heap size (4KB)                (UTIL_HEAP_SZ) = 10000
 Buffer pool size (pages)                     (BUFFPAGE) = 40000
 Extended storage segments size (4KB)    (ESTORE_SEG_SZ) = 16000
 Number of extended storage segments (NUM_ESTORE_SEGS)  = 0
 Max storage for lock list (4KB)              (LOCKLIST) = 16384

OLTP/Web workloads that tend to rely on primary key index lookups can benefit from large buffer pools to cache results and reduce the bottleneck on the I/O throughput (I/Os per second) of your I/O subsystem. DSS workloads, which tend to have large table scans returning many rows, can benefit from configuring memory for large sorts and joins, to avoid overflows or spills to temporary disk space that can hamper large I/O bandwidth (MBps). This is done by configuring the hash and sort size database parameters. For these workloads, the global cache size does not need to be large; it can be an order of magnitude smaller than the global cache area required for an OLTP workload.

Here is an example of using vmstat to determine free and used memory, followed by a description of various relevant columns. Note that several columns of data that are normally returned by vmstat are not included in this example.

 [root@raleigh root]# vmstat 4
 procs                      memory               swap
  r  b  w   swpd    free    buff   cache    si   so
  1  0  0      0  199328   56792  123040     0    0

  • The free column is the first column to look at. This column is reported in kilobytes. If you have free memory available, it is probably not the limiting resource.

  • The swpd column is looked at next. Reported in kilobytes, this column tells you how much virtual memory is in use, that is, how much of your memory has been pushed out to disk.

  • The si column reports how much memory was swapped in from disk during the reporting interval.

  • The so column reports how much memory was swapped out to disk (virtual memory) during the reporting interval.

If swpd is large and you see heavy swapping activity in the si and so columns, you may want to add more memory, or reduce the amount of memory allocated to your database to leave more for applications. Before enlarging the database allocation, make sure that memory is actually available. Additionally, this assumes care has been taken to lock down the global cache area of the database in Linux.
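The swapping check above can be sketched as a scan over vmstat-style samples. The columns here are swpd, si, and so, and the sample numbers are hypothetical:

```shell
#!/bin/sh
# Hypothetical vmstat samples, one interval per line: swpd, si, so.
cat <<'EOF' > /tmp/vmstat_sample.txt
0 0 0
204800 512 768
204800 480 700
EOF

# Count intervals with any swap-in or swap-out traffic.
awk '$2 > 0 || $3 > 0 { n++ }
     END {
       if (n) print n " interval(s) showed si/so activity: consider adding RAM or shrinking the database cache"
       else   print "no swap traffic observed"
     }' /tmp/vmstat_sample.txt
```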

You can also use the Linux top command to get more detailed information on your large memory consumers. If you press h while top is running, you get a list of options; press m to sort by resident memory usage and determine which processes are the largest consumers. Note that /usr/bin/top is more intrusive than vmstat because it consumes more CPU itself. Begin with vmstat, and continue with top only if you need additional information.
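As a lighter-weight, non-interactive alternative sketch, the procps ps command can produce a similar resident-memory ranking (RSS is reported in kilobytes; the --sort option assumes a procps-style ps):

```shell
#!/bin/sh
# Show the five largest resident-memory consumers, largest first.
ps -eo pid,rss,comm --sort=-rss | head -n 6
```

Because this is a one-shot snapshot rather than a continuously refreshing display, it perturbs the system less than leaving top running.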

It is important to remember that on 32-bit Linux systems, it is possible to have more memory than the database software can address. In this situation, you should look for creative ways to use spare memory if you have an I/O problem. Use memory whenever possible to reduce I/O; in some cases, the database's tempspace areas can be taken advantage of, particularly for individual processes using sort or hash areas (typical in DSS workloads). Be sure that the database parameters controlling these areas are set to the highest possible value (divided by the number of database agents) while still fitting within system memory (including what the kernel requires).

Logging Device

Most often, when all other bottlenecks are resolved, the optimization of the logging device ultimately determines OLTP database performance. It is important to separate the logfile from all other database files as much as possible.

Your next step is deciding whether to place the logfile on raw devices or file system devices. Historically, raw devices have been the preferred logging device for databases that support them. Some databases use direct I/O file systems, which can provide performance within 5% of raw device performance. Other (typically noncommercial) databases choose to utilize file system buffering, taking advantage of the buffering provided by Linux. A direct comparison in your own proposed environment is advised.
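As a rough sketch of such a comparison, dd can issue buffered and direct writes to the same file system. The file names and sizes here are arbitrary, and oflag=direct requires O_DIRECT support from the underlying file system (tmpfs, for example, lacks it), so the sketch falls back gracefully:

```shell
#!/bin/sh
# Buffered write of 10MB in 4K blocks, flushed to disk at the end.
dd if=/dev/zero of=/tmp/buffered.dat bs=4k count=2560 conv=fsync 2>&1 | tail -n 1

# Direct (unbuffered) write of the same size, if the file system supports it.
if dd if=/dev/zero of=/tmp/direct.dat bs=4k count=2560 oflag=direct > /dev/null 2>&1; then
    echo "direct I/O write completed"
else
    echo "direct I/O not supported on this file system"
fi

rm -f /tmp/buffered.dat /tmp/direct.dat
```

Comparing the elapsed times of the two runs, repeated over a file sized like your logfile, gives a first approximation of what a direct I/O logging device would buy you in your environment.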

Typical Database Workloads

It is important to distinguish between OLTP and DSS workloads, because these workloads are diametrically opposed in their resource utilization; your environment may contain some combination of these workloads. Tuning for one may cause performance degradation in the other.

OLTP workloads are database workloads composed of many (hundreds or more) users and heavy transaction processing. These workloads are characterized by small random reads/writes and a high number of IOPS (I/Os per second). A decision support system (DSS) workload, in contrast, involves intensive query processing: typically fewer users, but queries that fully utilize total system resources. Note that the TPC's (Transaction Processing Performance Council) TPC-C is an example of an intensive, steady-state OLTP database benchmark, whereas TPC-H is an example of a DSS workload benchmark that is not steady-state. In other words, barring logging or database checkpoints performed during or as part of the TPC-C/OLTP benchmark, the workload remains at a steady state; that is, all system resources remain at a steady level of utilization. The TPC-H workload, however, does not maintain a steady state, particularly during the power test. Rather, it exhibits wildly varying resource utilization as it moves from queries that require memory, to queries that require intensive I/O scanning, to queries that demand intensive CPU utilization. During the throughput phase of this test, the workload may at times appear closer to a steady state, but only if it is perfectly tuned. Otherwise, it may appear to be "stuck" on a certain query, causing inconsistent, not steady, performance. Particular care and attention to the optimization of any long-running query are critical, and different databases have varying query optimization technology. So it is important to examine databases that emphasize their query optimization techniques if this is a priority in your workload.

Both these TPC benchmarks are mentioned here as examples of particular workloads. But for these benchmarks to run in a way that is meaningful to anyone, they need sophisticated tuning techniques that are beyond the scope of this book. Moreover, all TPC benchmarks require a paid certified auditor, and there are strict rules regarding the use of TPC benchmark metrics. These rules are not intended to prevent practical running of the workloads, but rather to protect existing published benchmark numbers and other proprietary TPC property. In general, these workloads are far too complex for desktop environments to attempt anyway; however, the existing published TPC metrics are very useful to any administrator considering a particular hardware or software platform, regardless of size. The complexity of these benchmarks lends itself well to a large enterprise system selection. Other benchmarks that might be of interest to both a desktop environment and an enterprise system are the SPECjAppServer, Trade2, SAP, and Baan benchmarks, which incorporate simultaneous web/Java, accounting, and database activity. Published numbers are readily available on a variety of hardware and software platforms and can be extremely helpful in database, disk subsystem, or overall system selections for desktop to enterprise environments.

    Performance Tuning for Linux Servers
    ISBN: 0137136285
    Year: 2006
    Pages: 254