Server Processes and Threads | SQL Performance Tuning

"The number of transistors on a microprocessor doubles approximately every 18 months."

Moore's Law

It is facile to think of a server as a machine with a DBMS program running on it. Technically, that statement is true, but we really must stretch the definition of "a machine" and the definition of "a program" before it becomes true.

All large DBMSs are oligolithic (from the Greek for "a few stones"). At any moment there are several processes or threads running simultaneously . (In this chapter, we'll say "process/thread" to mean both processes and threads.) There is a difference between a process and a thread.

Processes have their own memory.
Threads share the same address space.

The result is that a process tends to be large (a few megabytes) while a thread is smaller (a few hundred kilobytes). On the other hand, processes tend to be safer because they are less able to interfere with one another. Different DBMS vendors have different preferences. For example IBM, Informix, and Oracle favor processes, while Microsoft and Sybase favor threads (or even fibers, which are very thin threads on MS Windows NT systems).

The larger DBMSs can run most or all of the following types of processes or threads.

System Processes/Threads

The COMMIT writer watches for database pages marked as new or updated, organizes them, and issues disk-write calls for them.
The log writer ensures that log records are definitely put on disk.
The monitor watches for stuck jobs, new connections, and idleness.
The scheduler decides which job gets to run next . (Sometimes this is a middleware job, but a DBMS like Microsoft comes with its own "thread pooler" option.)
The over-nighter handles background jobs that require passes through many rows, for the sake of backup, replication, recovery, redistribution, index creation, loads, and reorganization.

Per-Client Processes/Threads

The temporary threads come into existence during execution of a single SQL statement. The threads run in parallel. Each thread works on a completely separate area of memory and disk. (If it were otherwise then threads would have to constantly synchronize with each other and there would be no point in assigning parallel threads.) Teradata's list of possible threads, which is larger than most DBMSs mention, is as follows :

Table Scan: Each thread gets a separate table file or chunk .
Index Scan: Each thread gets a separate index section.
Index Read: Each thread gets a separate index entry.
INSERT: Each thread gets a separate inserted row or page.
DELETE: Each thread gets a separate deleted row or page.
UPDATE: Each thread gets a separate updated row or page.
Sort: Each thread gets a separate group of result-set rows (prior to merging).
Join: Each thread joins a different pair of tables.
Catalog Access: Each thread gets a look at the system catalog.
Statistics: Each thread updates statistics while doing data I/O.
Pipeline: One thread maintains a common area so that job A can feed to job B one row at a time. For example, data from a "scan" thread goes directly to a "join" thread.
Parse: Each thread handles a separate statement portion, for example, a subquery.

Multi-thread architecture involves tremendously important considerations.

There can be no parallel threading if there is no separateness. That means it's important to partition your data and follow normalization rules; see Chapter 8, "Tables."
There can be no huge benefit from parallel threading if there is only one CPU. That is why you should be skeptical of benchmarks that compare, say, MySQL with Oracle. The benchmarks probably come from a single-processor computer, which makes MySQL look good because it has fewer threads. Oracle starts to look better when there are several CPUs so that each thread is truly processed in parallel.
There is no way to predict "response time" based on the number of rows or the number of steps in the statement process. For example, you cannot say that transaction A (which INSERTs two rows and then does a ROLLBACK) is much faster than transaction B (which INSERTs four rows and then does a COMMIT). Because transaction B can take advantage of "the Commit Writer" parallel thread as well as the "INSERT" parallel thread, it can be done nearly as quickly, in theory.

Remember that there are two ways to measure performance: response time (the time between arrival and departure of a particular statement) and throughput (the number of statements that the system can handle in a unit of time). Multi-threading is less helpful for throughput than for response time. This becomes obvious if the system is near saturation. For example, suppose one user does an INSERT. Because different parts of the statement are parallelized, the INSERT will be done more quickly, and response time improves . But now suppose that two users do a similar INSERT simultaneously. The statements will block one another with locking, will both try to get time-slices on the same CPU with their multifarious threads, and will both have to queue for the system processes like the Scheduler or the Commit Writer, thus throughput does not improve.

Benchmarks

We should explain that all timing tests for this book were done with a single-CPU computer, so comparisons between DBMSs would be misleading. The results wouldn't give full points to the DBMS with the best threading mechanisms.

Bottom line: Organize so that multi-threading is possible, but don't hope it will make things better on an overloaded system. The most important thing is to minimize synchronization. That entails separating parts of statements or parts of databases, so that different threads don't need to access the same thing or to communicate with one another. And if your DBMS is a process-per-client rather than a thread-per-client system, it behooves you to disconnect when you're done so the server won't waste space on you.

Separateness and Parallelism

We said earlier that there can be no parallel threading if there is no separateness. Separateness usually means partitioning. A partition is a part of a database that consists of its own data and indexes (and often, its own configuration files and transaction logs). Tables can be located in one or more database partitionsthat is, a partitioned table can have some of its rows stored in one partition and other rows stored in other partitions.

Because data is divided across partitions, you can use the power of multiple processors to execute SQL statements. The DBMS will automatically decompose SELECT and data-change statements into subrequests, and execute the subrequests in parallel among the applicable partitions. The fact that a table is split across multiple partitions is transparent to users issuing SQL statements.

Executing the components of a task (e.g., a SELECT statement) in parallel can enhance performance dramatically. The main types of parallelism are I/O and Query.

I/O parallelism refers to the process of writing to, or reading from, two or more I/O devices simultaneously and can result in significant improvement in throughput.
Query parallelism refers to the simultaneous processing of parts of a single SQL statement. In general, the DBMS subdivides what is usually considered a single database operation (e.g., creating an index or resolving a SELECT) into multiple parts, many or all of which can be run in parallel (a) within a single partition or (b) across multiple partitions on either one machine or on multiple machines. The result set is thus returned more quickly than if the query were run in serial fashion.

For an example of how to ensure you're getting the most out of your DBMS's parallelism abilities , consider how it's done with IBM.

One type of parallel environment supported by IBM allows multiple processors to share access to a database so that SQL statements can be divided among the processors. You can specify the degree of parallelism to implement when compiling your application by using IBM's non-standard CURRENT DEGREE niladic function, or the DEGREE bind option. (Degree means the number of concurrently executing parts of an SQL statement you want to allow.) Each degree of parallelism adds to the system memory and CPU overhead.
IBM provides three configuration parameters that you can use to control and manage parallelism. The first is the intra_parallel parameter; it enables or disables instance parallelism support. The second is the max_querydegree parameter; it sets an upper limit for the degree of parallelism that can be used for any query and overrides the CURRENT DEGREE function and/or the DEGREE bind option values. The third configuration parameter is the dft_degree parameter; it sets the default value for CURRENT DEGREE and the DEGREE bind option. If you run a query with DEGREE = ANY , IBM's optimizer chooses the degree of parallelism based on a number of factors, including the number of processors and the characteristics of the query. The actual degree used at runtime may be lower than the number of processors, depending on these factors.

The Bottom Line: Server Processes and Threads

Processes have their own memory. A process tends to be large (a few megabytes).

Threads share the same address space. A thread tends to be small (a few hundred kilobytes).

Processes tend to be safer than threads, because they are less able to interfere with one another.

There can be no parallel threading if there is no separateness. That means it's important to partition your data and follow normalization rules.

There can be no huge benefit from parallel threading if there is only one CPU.

There is no way to predict response time based on the number of rows or the number of steps in the statement process.

There are two ways to measure performance: response time (the time between arrival and departure of a particular statement) and throughput (the number of statements that the system can do in a unit of time).

Multi-threading is less helpful for throughput than for response time. Organize your work so that multi-threading is possible, but don't hope it will make things better on an overloaded system.

The most important thing is to minimize synchronization. Separate parts of statements or parts of databases, so that different threads don't need to access the same thing or communicate with each other.

If your DBMS is a process-per-client rather than a thread-per-client system, disconnect when you're done so the server won't waste space on you.