Page 369
SQL*Loader can concatenate records either by specifying the number of physical records that form one logical record or by specifying conditional evaluations based on character occurrences within the data. If the number of physical records that comprise a logical record varies, then you must base the concatenation on conditional evaluations. The CONCATENATE and CONTINUEIF clauses in the control file facilitate physical record concatenation.
SQL*Loader provides two paths for loading data:
The conventional path generates a SQL INSERT command with the array processing interface to load the data into the tables. Because of this interface, SQL*Loader competes with all other Oracle processes for buffer cache resources; Oracle looks for and tries to fill partially filled blocks on each insert. (See Figure 16.7.) If you are loading small amounts of data, this method is usually acceptable; with large volumes of data, however, this technique becomes too time-consuming and resource- intensive .
The following conditions exist when you load data with the conventional path:
Page 370
In contrast to the conventional path, the direct path is optimized for bulk data loads. Instead of using the buffer cache to obtain and fill the database blocks, the direct path option uses the Extent Manager to get new extents and adjust the high water mark. Direct path creates formatted database blocks and writes them directly to the database. (See Figure 16.8.)
Page 371
Although the direct path significantly increases performance, it does have some limitations:
Page 372
Direct path loads of single partitions have the following limitations:
NOTE |
The main advantage to the direct path method is performance. No hard and fast rules exist to determine when to use the conventional method or the direct path method because the definition of "large amounts of data" varies from application to application. Other external factors vary also, such as database availability, the cleanliness of the data being loaded, and system resources. I have seen loads go from days to hours using the parallel, direct path method instead of the conventional path method with clean data on a multiprocessor system. I have also seen cases where significant amounts of time were spent cleaning the table data so that an index could rebuild after a direct path load. You need to analyze your situation and, based on all the factors, decide which method is better for your application. |
Oracle version 7.1 introduced the functionality of performing direct path loads in parallel. This feature gives SQL*Loader nearly linear performance gains on multiprocessor systems. By using the parallel option, you can load multiple input files concurrently into tablespace files, with each file belonging to the tablespace in which the table resides. Maximum throughput is achieved by striping the tablespace across different devices and placing the input files on separate devices and preferably separate controllers. (See Figure 16.9.)
When using the parallel option, SQL*Loader creates temporary segments, sized according to the NEXT storage parameter specified for the table, in the tablespace file specified in the OPTIONS clause of the control file.
CAUTION |
The specified file must be in the tablespace in which the table resides, or you get a fatal error. |
Upon completion of the SQL*Loader session, the temporary segments are merged (with the last extent trimmed of any free space) and added to the existing table in the database above the high water mark for the table.