


Figure 56.3.

Oracle Parallel Server.

Parallel cache management plays a key role in providing scalable performance for online transaction processing (OLTP) applications characterized by highly concurrent update activities. In typical data warehouse systems, however, since the workload mostly consists of read-only access and bulk data additions and purges, parallel cache management doesn't play as much of a role. Oracle's unique nonblocking concurrency mechanism provides virtually unlimited concurrent data access to multiple readers and writers, practically eliminating contention and locking issues.

The shared data access scheme in Oracle's server architecture provides a logical single-system view of the data on all nodes, simplifying system administration. It also forms the foundation for superior fault tolerance: on most systems, Oracle's Parallel Server technology enables access to all data even with a single surviving node.

Oracle Parallel Query Technology

The term Parallel Query is somewhat incomplete because this technology encompasses a range of other data management operations such as data loads, index creation, summary creation and recovery, in addition to parallel query execution. Oracle's parallel technology is designed as a core internal capability of the server, providing superior performance and reliability without any compromises in the range of available server functionality. The key elements of this industry-leading technology are described in this chapter.
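For example, index creation is one of the operations that can be parallelized directly in the DDL. The following sketch assumes a hypothetical SALES table and CUST_ID column; the PARALLEL clause syntax follows contemporary Oracle releases and should be checked against your version's SQL reference.

```sql
-- Build an index in parallel across four query servers.
-- SALES and its CUST_ID column are hypothetical.
CREATE INDEX sales_cust_idx ON sales (cust_id)
    PARALLEL (DEGREE 4);
```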

Dynamic Parallel Execution: Key Elements

This chapter also describes the basic building blocks of Oracle's dynamic parallel architecture. The key elements are consistent across all parallel hardware architectures: SMP, clustered, and MPP systems. As you will see later, there are also some unique optimizations that leverage characteristics of loosely coupled hardware systems.


Parallel Operators

The basic unit of parallel execution is a Data Flow Operator (or simply operator). An operator is a higher-level task and often combines multiple relational subtasks into a composite parallel operation. For example, in a query such as SELECT ename, empno FROM emp WHERE salary > 50000, application of the filter salary > 50000 could be combined with the table scan into a single operator. Subtasks that can be executed in parallel include:

  • Table scans
  • Sorts
  • Nested loop, sort-merge, and hash joins
  • Aggregation (SUM, AVERAGE, and so on)
  • Grouping (GROUP BY)
  • Set operations such as UNION, UNION ALL
  • Duplicate elimination (DISTINCT)
  • Computation of user-defined functions
  • Table population (CREATE TABLE...AS SELECT)
  • Data loads
  • Index builds
  • Recovery
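
To make the operator idea concrete, the example query above can be directed to run in parallel with a hint; the scan of EMP and the application of the salary filter are then carried out together by each query server as a single operator. (The hint syntax is from contemporary Oracle releases; the degree of 4 is illustrative.)

```sql
-- Scan EMP with four query servers; each server applies the
-- salary filter to the rows it scans, so filter plus scan
-- form one parallel operator.
SELECT /*+ PARALLEL(emp, 4) */ ename, empno
FROM   emp
WHERE  salary > 50000;
```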

Query Servers

A set of processes known as Query Servers (sometimes called query slaves) execute each operator in parallel. Query servers are drawn from a configurable, system-wide pool of available servers. The user process where the query originates serves as the Query Coordinator and manages the tasks performed by multiple query servers. The coordinator also assembles and returns the result set.
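The size of the system-wide server pool is controlled by initialization parameters. A minimal sketch of the relevant init.ora settings (parameter names are from Oracle releases of this era; verify them against your release's documentation):

```
# init.ora: query servers started at instance startup,
# and the ceiling the pool may grow to under load.
PARALLEL_MIN_SERVERS = 4
PARALLEL_MAX_SERVERS = 16
```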

Intraoperator and Interoperator Parallelism

As illustrated in Figure 56.4, parallel execution occurs at two distinct levels, intraoperator parallelism and interoperator parallelism. Intraoperator parallelism, sometimes called horizontal parallelism, is the execution of a single operation such as a table scan, sort, or join in parallel by multiple servers. Interoperator parallelism refers to the parallel execution of multiple distinct parallel operations concurrently. For example, while a set of query servers are scanning a table in parallel, another set of servers could be sorting the scanned rows in parallel. This is also known as pipelined parallelism or vertical parallelism.
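A simple query can exhibit both levels at once: with a parallel hint, one set of query servers scans EMP (intraoperator parallelism in the scan) while a second set sorts the rows as they arrive (intraoperator parallelism in the sort), and the scan-to-sort pipeline between the two sets is the interoperator parallelism. The hint and degree below are illustrative.

```sql
-- One server set scans while a second set sorts the scanned
-- rows concurrently; rows are pipelined between the two sets.
SELECT /*+ PARALLEL(emp, 4) */ ename, salary
FROM   emp
ORDER BY salary;
```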

Table Queues: Dynamic Data Redistribution

It's useful to visualize the parallel execution in terms of an operator tree with directly connected parent-child nodes having a producer-consumer relationship. An efficient, dynamic interprocess communication mechanism called the Table Queue connects each producer-consumer pair and effectively redistributes the output from a set of producer servers to the next set of consumers. The redistribution happens dynamically and takes into account the optimal data partitioning requirement for a given operation. In the example of a table scan followed by a sort, producers generate output that is randomly partitioned, and the appropriate partitioning for a sort operation is by key value range. The table queue mechanism illustrated in Figure 56.5 takes rows retrieved by the table scan processes and transparently redistributes them, partitioned by key value ranges, to the sort servers.

Figure 56.4.

Parallel execution: building blocks.


Figure 56.5.

Table queues.

Depending on the type of operation, the table queue mechanism dynamically chooses an appropriate redistribution method from the available partitioning schemes: hash, key range, round robin, or broadcast. In performing the data redistribution, the table queue mechanism automatically incorporates several intelligent optimizations to achieve workload balance among the query servers. In the key range partitioning case, for example, the determination of range values is based on a dynamic sampling of the incoming rows to achieve equal-sized partitions.


The medium for interprocess communication among the query servers is shared memory on SMP systems and the high-speed interconnect for clustered and MPP systems.

Dynamic Table Partitioning

Parallel query execution typically begins with a parallel table scan operation as the first task. Oracle uses a dynamic table-partitioning scheme to automatically partition the table among the participating query servers. The intelligent performance and load-balancing optimizations incorporated in dynamic table partitioning underscore the superiority of Oracle's dynamic parallel architecture and its capability to deliver scalable, optimal performance without rigid physical data partitioning. The table being scanned is initially divided into logical partitions, one per available scan process. The initial partitioning takes physical location of data into consideration and minimizes disk contention. Further, in order to achieve optimal load balancing, each logical partition is subdivided into more granular chunks, and the query servers are assigned a chunk at a time to scan. If some query servers fall behind others in completing the scan, processes that have completed their work "steal" chunks of work from the busy servers, achieving improved throughput and optimal load balancing. As explained later, this strategy pays even richer dividends when combined with Oracle's shared-disk access scheme on clusters and MPP systems.

Oracle's dynamic table partitioning, coupled with intelligent redistribution of intermediate results using the table queue mechanism, enables customers to realize the benefits of parallel hardware immediately, without any need to perform expensive data reorganizations. Further, this approach delivers superior performance for a wide range of access patterns, including ad hoc queries by knowledge workers. Although a static data partitioning approach could offer performance gains if the access patterns and data distributions are predictable and fixed, such a requirement is completely unrealistic in the real world. The frequent need to perform physical repartitioning of data with changes in access patterns, additions of hardware, changes in data distributions, or growth in data volumes imposes an undue administrative burden. This chapter presents a more detailed examination of this issue in the section "Internode Parallelism."

Flexible Parallelism

Each database table (or index) is assigned a degree of parallelism that's appropriate for operations on that object. The degree of parallelism is a user-defined attribute of the table, specified as a parameter at table creation time or assigned later using a simple SQL command. An appropriate degree of parallelism takes into account factors such as table size, available processors, and the number of disk devices that store the table data. If a degree of parallelism is not explicitly specified, the system computes an intelligent default value. Because table sizes and available processing resources change over time, Oracle's dynamic architecture also provides easy ways to change the degree of parallelism. Users can alter the parallelism using a direct SQL command or even override the value at run time using a query hint. It's useful to contrast this flexibility against the static partitioning-based approach, where a physical repartitioning of the table is typically required to alter the degree of parallel execution.
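In SQL terms, the degree of parallelism can be set at creation, altered later, or overridden per query with a hint. A sketch follows; the SALES and SALES_LOAD tables and the degrees are hypothetical, and the PARALLEL clause shown follows contemporary Oracle syntax.

```sql
-- Set a default degree when the table is created...
CREATE TABLE sales
    PARALLEL (DEGREE 8)
    AS SELECT * FROM sales_load;

-- ...change it later as data volumes and resources grow...
ALTER TABLE sales PARALLEL (DEGREE 16);

-- ...or override it for a single query with a hint.
SELECT /*+ PARALLEL(sales, 4) */ COUNT(*) FROM sales;
```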



Oracle Unleashed
Oracle Development Unleashed (3rd Edition)
ISBN: 0672315750
Year: 1997
Pages: 391
