Refresh

4.3 Refresh

As new data is loaded into the warehouse, any materialized view based on this data must be updated. This operation is known as refresh. The frequency of refresh determines how recent the summary data is with respect to the detail data. The refresh operation can be very time consuming depending on the amount of data involved. You must carefully determine your refresh policy based on available resources and your application's need for current data. If your application needs the materialized view to be synchronized with the detail data at all times, choose the ON COMMIT option. With this option, Oracle will automatically refresh the materialized view when the changes to detail data get committed. If your warehouse gets new data only once a day during a specified maintenance window, deferring the refresh until then would be the right choice. Deferred or ON DEMAND refresh can be done using the DBMS_MVIEW package.

The refresh policy must be chosen when creating the materialized view. You can alter it at a later time by issuing an ALTER MATERIALIZED VIEW statement.

4.3.1 Using the DBMS_MVIEW refresh procedures

Oracle provides several procedures for ON DEMAND refresh in the DBMS_MVIEW supplied PL/SQL package, as follows:

  • DBMS_MVIEW.REFRESH

  • DBMS_MVIEW.REFRESH_DEPENDENT

  • DBMS_MVIEW.REFRESH_ALL_MVIEWS

If you want to refresh all your materialized views, which is often done after bulk loading new detail data, you can use the DBMS_MVIEW.REFRESH_ALL_MVIEWS procedure, as shown in the following example. In this example, the :failures bind variable returns the number of failed refreshes.

 EXECUTE DBMS_MVIEW.REFRESH_ALL_MVIEWS(:failures); 

If you have materialized views that are refreshed at different times—for example, some weekly and others monthly—you can specify a list of materialized views to refresh using the DBMS_MVIEW.REFRESH procedure, as follows:

 EXECUTE DBMS_MVIEW.REFRESH('monthly_sales_mv, sales_by_city'); 

Recall that at the time of creating the materialized view, you specify a default refresh method of COMPLETE, FAST, or FORCE. You may override this refresh method when issuing the refresh command as C (complete), F (fast), or ? (force). If you don't specify it, the refresh method given at the time of creating the materialized view will be used. In the following example, the materialized view MONTHLY_SALES_MV defined earlier as REFRESH FORCE is being refreshed using COMPLETE refresh.

 EXECUTE DBMS_MVIEW.REFRESH('monthly_sales_mv', 'C'); 

If you want to refresh all materialized views that are based on a particular detail table that has changed, use the DBMS_MVIEW.REFRESH_DEPENDENT procedure, specifying the detail table. For example, to refresh FORCE all materialized views that have changed when the customer table is updated, you would issue the following statement.

 EXECUTE DBMS_MVIEW.REFRESH_DEPENDENT(:failures, 'customer', '?') 

Refresh of several materialized views can be done simultaneously using job queues. To do this, the initialization parameter JOB_QUEUE_PROCESSES must be set to a nonzero value and the atomic_refresh parameter in the refresh procedure must be set to FALSE, as follows:

 ALTER SYSTEM SET job_queue_processes = 2; EXECUTE DBMS_MVIEW.REFRESH('sales_by_month, sales_by_city', 'C',                            atomic_refresh=>FALSE); 

Alternatively, you can use Oracle Enterprise Manager to refresh your summaries. To look at the properties of each materialized view in the system, expand the Databases node in the Navigator pane, followed by the Warehouse node, then Summary Management, and finally Materialized Views (as shown in Figure 4.2).

click to expand
Figure 4.2: Refreshing a materialized view in Oracle Enterprise Manager.

You can check if the materialized view is STALE by looking at the Refresh State in the Status section. If the materialized view is STALE, you can refresh it by pressing the Refresh button.

Hint: 

To get an accurate picture of the refresh state, the compile state field must say VALID. If it says NEEDS_COMPILE, press the Compile button to validate the materialized view first.

Pressing the Refresh button will bring up a refresh options screen (as shown in Figure 4.3) from which you can choose the refresh type, issue the refresh and optionally update the statistics on the materialized view.

click to expand
Figure 4.3: Refreshing a materialized view in Oracle Enterprise Manager—Options.

Once the refresh has completed, you can once again check the status of the materialized view in the Status section.

4.3.2 Fast refresh

As the volume of data in a warehouse increases, rebuilding the materialized view after each new data load can get prohibitive. Oracle provides the capability to refresh materialized views without a complete rebuild. This is known as fast refresh. The fast refresh algorithms used by Oracle either incrementally apply changes from the detail data or selectively recompute only affected portions of the materialized view. Oracle 8i supported fast refresh for materialized aggregate views after bulk loads to the data. Refresh after DML was only supported for a restricted class of materialized views. In Oracle 9i, fast refresh can handle regular DML (insert, update, and delete statements) and also partition maintenance operations.

For a materialized view to be fast refreshable after DML to the detail data, a materialized view log must be created on each of its tables. The materialized view log captures the changes done to the detail data tables by insert, update, or delete statements. Note that changes done by a bulk load such as INSERT /*+ APPEND */ or using SQL*Loader are not recorded in the materialized view log but are kept in an internal log.

In order for a materialized view to be fast refreshable, it must adhere to certain rules. The materialized view's SELECT query must include some supporting aggregates, depending on the aggregate. For each aggregate such as SUM(x) or AVG(x), you must have COUNT(x). The materialized view must also include the COUNT(*) aggregate. If you have STDDEV(x), you must have SUM(x), COUNT(x), and SUM(x * x).

The following example shows a fast refreshable summary along with the materialized view logs on its detail tables. Note that various options such as ROWID, SEQUENCE, and INCLUDING NEW VALUES must be specified for the materialized view log. The ROWID option logs the rowids of rows changed by DML statements. The INCLUDING NEW VALUES clause logs both the old and new version of a row changed by an update statement. The SEQUENCE option is new in Oracle 9i and is required to support fast refresh. If you omit this option, you will get fast refresh only under specific conditions (i.e., either when only one detail table has been changed or when only inserts have been performed). All detail table columns referenced in the materialized view's query must be included in the materialized view log.

Hint: 

You must create the materialized view logs before creating the materialized views; otherwise, you will have to completely refresh the materialized views once before fast refresh can be performed.

 CREATE MATERIALIZED VIEW LOG on time WITH ROWID, SEQUENCE (time_key, month, year) INCLUDING NEW VALUES; CREATE MATERIALIZED VIEW LOG on purchases    WITH ROWID, SEQUENCE (purchase_date, product_id, purchase_price)    INCLUDING NEW VALUES; CREATE MATERIALIZED VIEW LOG on product    WITH ROWID, SEQUENCE (product_id) INCLUDING NEW VALUES; CREATE MATERIALIZED VIEW monthly_sales_mv  PCTFREE 0 TABLESPACE summary  STORAGE (initial 64k next 64k pctincrease 0)  BUILD IMMEDIATE  REFRESH FAST  ON DEMAND  ENABLE QUERY REWRITE  AS  SELECT t.month, t.year, p.product_id,         SUM (ps.purchase_price) as sum_of_sales,         COUNT (ps.purchase_price) as total_sales, COUNT(*)  FROM time t, product p, purchases ps  WHERE t.time_key = ps.purchase_date AND        ps.product_id = p.product_id  GROUP BY t.month, t.year, p.product_id; 

You can use the EXPLAIN_MVIEW procedure (section 4.4) to determine if a given select statement can be used to create a fast refreshable materialized view.

The following example code shows a materialized join view that is fast refreshable with the ON COMMIT option. To make this type of a materialized view fast refreshable, you must include a ROWID column for each table in the materialized view. The materialized view log only needs to have the ROWID option. (In this example, we are assuming that the materialized view logs for product and purchases created earlier for the MONTHLY_SALES_MV are around.)

 CREATE MATERIALIZED VIEW LOG on customer WITH ROWID; CREATE MATERIALIZED VIEW product_customer BUILD IMMEDIATE REFRESH FAST ON COMMIT AS SELECT c.rowid r1, c.sex, p.rowid r2, p.product_id,        f.rowid r3, f.purchase_price FROM purchases f, product p, customer c WHERE f.customer_id = c.customer_id       AND f.product_id  = p.product_id; 

Hint: 

The same materialized view log on a given table is used for all materialized views that reference that table; hence, be sure to specify options to satisfy all the materialized views.

During fast refresh, Oracle will identify the changes that have occurred since the last refresh using the materialized view log and apply them to the materialized view. Once all materialized views have been refreshed to include the changes, the relevant rows will be purged from the materialized view log. You must never modify the materialized view log manually, as you will be forced to completely rebuild the materialized view.

In order to get good refresh performance, it is important to have accurate statistics on the data. The package DBMS_STATS should be used to gather optimizer statistics on the detail tables after each data load and prior to creating or refreshing the materialized views. It is also important to gather statistics on the materialized views upon creation and after refreshing them with large amounts of data. This will ensure that the optimizer has accurate statistics during refresh and when optimizing queries that are rewritten to use materialized views.

4.3.3 Fast refresh after partition maintenance operations

Partitioned tables are a common feature of a warehouse. Partition maintenance operations are used to reorganize the partitioned data. These include MERGE or SPLIT of existing partitions, TRUNCATE or DROP of an old partition, or EXCHANGE of a partition with a table. The use of partition maintenance operations in a warehouse is discussed in Chapter 7.

Oracle 9i introduced a new feature known as Partition Change Tracking (PCT). This allows materialized views on partitioned tables to be fast refreshed after partition maintenance operations.

When a partition maintenance operation is performed, Oracle keeps track of the updated partitions. It can then identify the data in the materialized view that corresponds to the updated partitions and recompute only that portion of the materialized view. This is referred to as PCT refresh.

Figure 4.4 shows a conceptual picture of how data changes are tracked in partition change tracking. In this example, the purchases table is partitioned by month into Jan 2002, Feb 2002 and Mar 2002 partitions. The materialized view MONTHLY_SALES_MV includes the sales data for each month. The correspondence between the detail data and data in the materialized view is shown by the dotted lines. If the Feb 2002 partition is updated, the corresponding data in the materialized view is stale and no longer available for query rewrite. Similarly, when the Apr 2002 partition is added to the purchases table, the materialized view does not contain this partition and hence cannot be used for queries for the Apr 2002 data.

click to expand
Figure 4.4: Partition change tracking.

To enable a materialized view for partition change tracking, you must include the partitioning key of the partitioned detail tables in the materialized view. If this is not possible, you must include a special column known as the partition marker using the DBMS_MVIEW.PMARKER() function for each partitioned table. This allows Oracle to map detail table partitions to rows in the materialized view. To take advantage of PCT, the materialized view itself does not have be partitioned at all!

The following example illustrates the use of a partition key in a materialized view. In the EASYDW schema, the purchases table is partitioned by time_key. The materialized view, MONTHLY_SALES_PARTKEY_MV, has the time_key column in the SELECT list and hence is enabled for partition change tracking if the purchases table gets updated.

 CREATE MATERIALIZED VIEW monthly_sales_partkey_mv REFRESH FAST AS SELECT t.month, ps.time_key,        SUM (ps.purchase_price) as sum_of_sales,        COUNT (ps.purchase_price) as total_sales, COUNT(*) FROM time t, purchases ps WHERE t.time_key = ps.time_key GROUP BY t.month, ps.time_key; SELECT * FROM monthly_sales_partkey_mv;      MONTH TIME_KEY  SUM_OF_SALES TOTAL_SALES   COUNT(*) ---------- --------- ------------ ----------- ----------          1 01-JAN-02   8989040.32       31709      31709          1 02-JAN-02        56.02           2          2          2 01-FEB-02   3401896.62        8902       8902          2 02-FEB-02     13811629       31203      31203          ...          7 02-JUL-02        56.02           2          2          7 05-JUL-02      2240.95          45         45          7 07-JUL-02        28.01           1          1          7 08-JUL-02        28.01           1          1 19 rows selected. 

In this materialized view, there is a row for each value of time_key and hence it is effectively stored at a daily level and could get quite large. To mitigate this problem, you could use a partition marker instead, as shown in the next example.

 CREATE MATERIALIZED VIEW monthly_sales_marker_mv REFRESH FAST AS SELECT t.month, DBMS_MVIEW.PMARKER(ps.rowid) as pmark,        SUM (ps.purchase_price) as sum_of_sales,        COUNT (ps.purchase_price) as total_sales, COUNT(*) FROM time t, purchases ps WHERE t.time_key = ps.time_key GROUP BY t.month, DBMS_MVIEW.PMARKER(ps.rowid); SELECT * FROM monthly_sales_marker_mv;      MONTH      PMARK SUM_OF_SALES TOTAL_SALES   COUNT(*) ---------- ---------- ------------ ----------- ----------          1      29566   8989096.34       31711      31711          2      29567   17213525.6       40105      40105          3      29568    429483.97         952        952          4      29707   4524335.65       10126      10126          5      29708   1268317.48        3158       3158          6      29709   7092585.46        7995       7995          7      29843      2352.99          49         49 7 rows selected. 

We see that the partition marker produces one value for all rows in the same partition. This makes the materialized view more manageable in size.

Hint: 

Include the partition key if there are few distinct values in each partition. If there are many distinct values for the partition key, such as time_key, using a partition marker will keep the materialized view small.

Partition change tracking is currently only supported for detail tables partitioned by range or range-hash composite partitioning. While PCT refresh is primarily useful after partition operations, this type of refresh may also be performed after regular DML and may be quite beneficial if a lot of changes have been done to detail data. Oracle will automatically try to use PCT refresh if it cannot perform fast refresh using materialized view logs.

Hint: 

To find out if PCT refresh could be used on your materialized view, use the EXPLAIN_MVIEW procedure discussed in the next section.

Partition change tracking also improves the availability of a stale materialized view to query rewrite—more on this in section 4.6.



Oracle9iR2 Data Warehousing
Oracle9iR2 Data Warehousing
ISBN: 1555582877
EAN: 2147483647
Year: 2005
Pages: 91

flylib.com © 2008-2017.
If you may any questions please contact us: flylib@qtcs.net