Parallel Query Processing


The query processor in SQL Server 7.0 and 2000 includes parallel query processinga new execution strategy that can improve the performance of complex queries on computers with more than one processor.

SQL Server inserts exchange operators into each parallel query to build and manage the query execution plan. The exchange operator is responsible for providing process management, data redistribution, and flow control. It is possible that a parallel query execution plan can use more than one thread, whereas a serial execution plan, used by a nonparallel query, uses only a single thread for its execution. Prior to query execution time, SQL Server determines whether the current system state and configuration allow for parallel query execution. If parallel query execution is justified, SQL Server determines the optimal number of threads, called the degree of parallelism, and distributes the query workload execution across those threads. The parallel query uses the same number of threads until the query completes. SQL Server re-examines the optimal degree of parallelism each time a query execution plan is retrieved from the procedure cache. Individual instances of the same query could be assigned a different degree of parallelism.

SQL Server calculates the degree of parallelism for each instance of a parallel query execution using the following criteria:

  • How many processors does the computer running SQL Server have?

    If your computer has two or more processors, it can use parallel queries.

  • What is the number of concurrent active users?

    The degree of parallelism is inversely related to CPU usage. SQL Server assigns a lower degree of parallelism if the CPUs are already busy.

  • Is sufficient memory available for parallel query execution?

    Queries, like any process, require resources to execute, particularly memory. Obviously, a parallel query will demand more memory than a serial query. More importantly, as the degree of parallelism increases , so does the amount of memory required. Realizing this, SQL Server carefully considers this in a query execution plan. SQL Server could either adjust the degree of parallelism or use a serial plan to complete the query.

  • What is the type of query being executed?

    Queries that use several CPU cycles justify using a parallel execution plan. Some examples are joins of large tables, substantial aggregations, and sorting large resultsets. SQL Server determines whether to use a parallel or serial plan by checking the value of the cost threshold for parallelism.

  • Are a sufficient number of rows processed in the given stream?

    If the optimizer determines that the number of rows in a stream is too low, it does not execute a parallel plan. This prevents scenarios where the costs exceed the benefits of executing a parallel plan.

Two server configuration optionsthe maximum degree of parallelism and cost threshold for parallelismaffect the consideration for a parallel query. Although it is not recommended, you can change the default settings for each.

The maximum degree of parallelism option limits the number of threads to use in a parallel plan execution. The range of possible values is 0 to 32. This value is automatically configured to , which uses the actual number of CPUs. If you want to suppress parallel processing, set the value to 1 .

You can affect the query optimizer's choice to use a parallel execution plan by changing the values for the maximum degree of parallelism and the cost threshold for parallelism server configuration options using either the sp_configure system stored procedure or the Enterprise Manager program. It is strongly recommended that you do not change this value on symmetric multiprocessor (SMP) computers. For single processor machines, these values are ignored.

To set the maximum degree of parallelism option, you can use the following:

  • The sp_configure system stored procedure

     USE master  sp_configure 'show advanced options', 1 GO RECONFIGURE GO sp_configure 'max degree of parallelism', 1 GO RECONFIGURE GO 
  • The Enterprise Manager

    1. Right-click a server, and then click Properties.

    2. Click the Processor tab. The Processor settings dialog box appears (see Figure 35.18).

      Figure 35.18. SQL Server Propertiesthe Processor settings tab.

      graphics/35fig18.jpg

    3. Under Parallelism, select the number of processors to execute queries in parallel.

The cost threshold for parallelism option establishes a ceiling value the query optimizer uses to consider parallel query execution plans. If the calculated value to execute a serial plan is greater than the value set for the cost threshold for parallelism, a parallel plan is generated. This value is defined by the estimated time in seconds to execute the serial plan. The range of values for this setting is 0 to 32767. The default value is 5 . If the maximum degree of parallelism is set to 1 , or the computer has a single processor, the cost threshold for parallelism value is ignored.

You can configure this option using either the sp_configure system stored procedure or the Enterprise Manager:

  • The sp_configure system stored procedure

     USE master  sp_configure 'show advanced options', 1 GO RECONFIGURE GO sp_configure 'cost threshold for parallelism', 15 GO RECONFIGURE GO 
  • The Enterprise Manager

    1. Expand a server group .

    2. Right-click a server, and then click Properties.

    3. Click the Processor tab.

    4. In the Parallelism box, enter a value from 0 through 32,627.

You can identify when a parallel execution plan is being chosen using the SQL Server Query Analyzer. Two formats are available for viewing the execution plan: graphical and tabular. The graphical execution plan uses icons to represent the execution of specific statements and queries in SQL Server. The tabular representation is produced by the SET SHOWPLAN_ALL or SET SHOWPLAN_TEXT statements. The showplan output for every parallel query will have at least one of these three logical operators:

  • Distribute StreamsReceives a single input stream of records and distributes multiple output streams. The contents and form of the record are unchanged. All records enter through the same single input stream and appear in one of the output streams, preserving the relative order.

  • Gather StreamsAssembles multiple input streams of records and yields a single output stream. The relative order of the records, contents, and form are maintained .

  • Repartition StreamsAccepts multiple input streams and produces multiple streams of records. The record contents and format are unchanged.

Figure 35.19 provides an example of a query plan using parallel query techniquesboth repartition streams and gather streams.

Figure 35.19. Graphical execution plan of a query using parallel query techniques.

graphics/35fig19.jpg



Microsoft SQL Server 2000 Unleashed
Microsoft SQL Server 2000 Unleashed (2nd Edition)
ISBN: 0672324679
EAN: 2147483647
Year: 2002
Pages: 503

flylib.com © 2008-2017.
If you may any questions please contact us: flylib@qtcs.net