Postload operations

5.6 Postload operations

After the data is loaded, you may need to validate its quality, reenable constraints and triggers, rebuild indexes, update the cost-based optimizer statistics, and refresh the materialized views prior to making the data available to the warehouse users.

Step 1: Gather optimizer statistics for the tables

Run the dbms_stats package to update the optimizer statistics on any tables where you have added a significant amount of data. Statistics can be gathered for an index, index partition, table, partition or subpartition, and a materialized view.

Since we added an entire partition to the purchases fact table, the following example will gather statistics for the purchases_apr2002 partition.

 SQL> EXECUTE DBMS_STATS.GATHER_TABLE_STATS('easydw','purchases', 'purchases_apr2002',DBMS_STATS.AUTO_SAMPLE_SIZE); 

Step 2: Verify the dimensions

If using Summary Management, run the DBMS_OLAP.VALIDATE_DIMENSION procedure for each dimension to verify that the hierarchical, attribute, and join relationships are correct. It can also be used to determine if any level columns in a dimension are NULL. You can either verify the newly added rows or all the rows. Any exceptions are logged in the table MVIEW$_EXCEPTIONS, which is created in the user's schema.

In the following example, the product dimension (parameter 1) in the EASYDW schema (parameter 2) is verified for correctness. All rows will be validated (parameter 3 is set to false), and we'll check for nulls (parameter 4 is set to true).

 SQL> EXECUTE dbms_olap.validate_ dimension('product','easydw',false,true); SQL> select * from mview$_exceptions; no rows selected 

Step 3: Refresh the materialized views

After loading the April data into the purchases table, the PRODUCT_SUM materialized view became stale, as shown in Figure 5.19. It only contains information on the sales for January through March and must be refreshed to add April's data.

click to expand
Figure 5.19: Using Oracle Enterprise Manager to look for stale materialized views.

Figure 5.20 shows various options you can choose when refreshing your materialized views. In this case we've chosen "force" for the type of refresh, which will incrementally refresh the materialized view if possible. If it cannot be incrementally refreshed, a complete refresh will be done. We have chosen to refresh on demand. The refresh could also have been scheduled to run at a specific time. If data is being updated in the warehouse instead of being bulk loaded, it is also possible to refresh the materialized views after committing each transaction, which would cause them to go stale. A materialized view log must be created to be able to refresh upon commit.

click to expand
Figure 5.20: Refresh options.

By pressing the Explain button, you can determine the capabilities of your materialized view, including what type of refresh the system will be able to perform. Figure 5.21 shows that this materialized view will be completely refreshed. In Chapter 4, partition change tracking (PCT) was described. PCT allows materialized views on partitioned tables to be fast refreshed after partition maintenance operations. In order to use PCT, either the partitioning key for the detail tables must be included in the materialized view or you can include a special column known as the partition marker. In this example, it cannot use PCT, because there is no partition key or pmarker in the select list. Another reason it is not fast refreshable is that some columns are missing from the materialized view definition. We can see there is a SUM(expr) without a COUNT(expr). It cannot be fast refreshed after an insert or other DML statement, since it doesn't have a materialized view log. In order to use these optimizations, the materialized view definition needs to be modified. Explain materialized view can be a big help in determining how to best define your materialized views.

click to expand
Figure 5.21: Explain materialized view.

An alternate way to refresh the materialized views is by executing the DBMS_MVIEW.REFRESH procedure, as previously discussed in Chapter 4.

Step 4: Gather optimizer statistics for the materialized views

Next, update the optimizer statistics on any materialized views that may have significantly changed in size. If you used Oracle Enterprise Manager to refresh your materialized views, you have the option of analyzing the materialized view after it was refreshed, as shown in Figure 5.22.

click to expand
Figure 5.22: Analyzing the materialized view after refresh.

To gather optimizer statistics outside of Oracle Enterprise Manager, use the DBMS_STATS package. In this example table statistics will be gathered from the PRODUCT_SUM materialized view in the EASYDW schema.

 SQL> EXECUTE DBMS_STATS.GATHER_TABLE_STATS('easydw','product_sum'); 

Step 5: Back up the database, table, or partition after the load is complete (if you used the UNRECOVERABLE option)

Since media recovery is disabled for the table being loaded, you will not be able to recover in the event of media failure. Refer to Chapter 8 for a discussion on backup techniques.

Step 6: Publish the data

Send e-mail to notify the users of what data has been loaded the previous day and is ready for their use.



Oracle9iR2 Data Warehousing
Oracle9iR2 Data Warehousing
ISBN: 1555582877
EAN: 2147483647
Year: 2005
Pages: 91

flylib.com © 2008-2017.
If you may any questions please contact us: flylib@qtcs.net