The greatest benefit of using Summary Management in Oracle is automatic query rewrite. Traditionally, when summary tables were used, the queries on the base tables had to be modified to use the summary table instead. However, with query rewrite, the Oracle optimizer automatically determines if the result of a query can be obtained from a materialized view and, if so, transparently rewrites the query in terms of the materialized view. If the optimizer determines that the summary is insufficient to answer the query, it uses the detail data. This means that no application changes are needed in order to create and use summaries. Further, end users do not have to be aware of the existence of summaries and hence summaries can be created and modified without impacting users.
Oracle supports several types of query rewrite transformations, allowing a single summary to be used to answer several queries. We will illustrate several of these in the following sections, including:
Text match
Aggregate rollup
Join-back
Filtered data
Rollup using a dimension object
To enable queries to be rewritten, the materialized view must be created with the ENABLE QUERY REWRITE clause. Also, the QUERY_REWRITE_ENABLED initialization parameter must be set to TRUE and the cost-based optimizer must be used.
-- enable query rewrite ALTER SESSION SET QUERY_REWRITE_ENABLED=TRUE; -- enable cost-based optimizer ALTER SESSION SET OPTIMIZER_MODE=" all_rows";
Our first few examples will use the following materialized view, which computes the sum of sales for products by month.
CREATE MATERIALIZED VIEW monthly_sales_mv ENABLE QUERY REWRITE AS SELECT t.month, t.year, p.product_id, SUM (ps.purchase_price) as sum_of_sales, COUNT (ps.purchase_price) as total_sales FROM time t, product p, purchases ps WHERE t.time_key = ps.time_key AND ps.product_id = p.product_id GROUP BY t.month, t.year, p.product_id;
To determine if the query has been rewritten, use EXPLAIN PLAN (described in Chapter 3) to look at the query execution plan. The output of EXPLAIN PLAN shows the name of the materialized view used to rewrite the query. We will use this in the examples in this section. Alternatively, in SQLPLUS, you can use the SET AUTOTRACE option to trace the optimizer plan. If you find that query rewrite is not occurring as expected, you could use the DBMS_MVIEW.EXPLAIN_REWRITE utility discussed later in this chapter to diagnose the problem.
Hint: | Name your materialized views with a _mv suffix. This will help quickly identify them in the execution plan output. |
The simplest type of query rewrite is when the SQL text of the materialized view's defining query exactly matches that of the incoming query. The text match is not case sensitive and ignores any whitespace differences.
The execution plan in the following code segment shows that the optimizer chose to access the materialized view MONTHLY_SALES_MV via a full table scan.
-- exact text match SELECT t.year, t.month, p.product_id, sum (ps.purchase_price) as sum_of_sales, count (ps.purchase_price) as total_sales FROM time t, product p, purchases ps WHERE t.time_key = f.time_key AND f.product_id = p.product_id GROUP BY t.year, t.month, p.product_id; MONTH YEAR PRODUCT_ SUM_OF_SALES TOTAL_SALES ---------- ---------- -------- ------------ ----------- 1 2002 SP1000 54966.53 332 1 2002 SP1001 48861.17 312 1 2002 SP1010 45432.43 268 ... ----------------------------------------------------------------- |Id| Operation | Name | Rows | Bytes |Cost | ----------------------------------------------------------------- | 0| SELECT STATEMENT | | 327 | 18966 | 2 | | 1| TABLE ACCESS FULL | MONTHLY_SALES_MV| 327 | 18966 | 2 | -----------------------------------------------------------------
Oracle will also try a text match starting from the FROM keyword of the query. This allows for differences in column ordering in the SELECT list and computation of expressions. In the following example, SUM(purchase_price) and COUNT(purchase_price) have been used to compute the average, and the order of t.month and t.year in the SELECT list is reversed. You can see that the materialized view has been used to rewrite this query.
SELECT t.month, t.year, p.product_id, AVG(ps.purchase_price) avg_of_sales FROM time t, product p, purchases ps WHERE t.time_key = ps.time_key AND ps.product_id = p.product_id GROUP BY t.year, t.month, p.product_id; MONTH YEAR PRODUCT_ID AVG_OF_SALES ---------- ---------- ---------- ------------ 1 2002 SP1000 165.561837 1 2002 SP1001 156.606314 1 2002 SP1010 169.523993 ... ----------------------------------------------------------------- |Id| Operation | Name | Rows | Bytes |Cost | ----------------------------------------------------------------- | 0| SELECT STATEMENT | | 327 | 18966 | 2 | | 1| TABLE ACCESS FULL | MONTHLY_SALES_MV| 327 | 18966 | 2 | -----------------------------------------------------------------
If the text of the query and materialized view does not match, Oracle will then compare the join conditions, GROUP BY clauses, and aggregates in the query and materialized view to determine if the query can be rewritten using the materialized view. We will illustrate these rules in the following sections.
An aggregate rollup (often simply called rollup) occurs when the aggregates in the materialized view can be further aggregated to determine the aggregates requested by the query. A simple example of this is when the query contains only some of the grouping columns from the materialized view. For instance, if the materialized view groups by year and month, and the query asks for year, then the aggregates in the materialized view for each month for a given year are aggregated to produce one row for the year.
-- rollup over month column SELECT t.year, p.product_id, SUM (ps.purchase_price) as sum_of_sales, COUNT (ps.purchase_price) as total_sales FROM time t, product p, purchases ps WHERE t.time_key = ps.time_key AND ps.product_id = p.product_id GROUP BY t.year, p.product_id; YEAR PRODUCT_ID SUM_OF_SALES TOTAL_SALES ---------- ---------- ------------ ----------- 2002 SP1000 142173.51 659 2002 SP1001 97910.04 475 2002 SP1010 122422.72 514 ... ------------------------------------------------------------------ |Id| Operation | Name | Rows | Bytes | Cost | ------------------------------------------------------------------ | 0| SELECT STATEMENT | | 327 | 14715 | 5 | | 1| SORT GROUP BY | | 327 | 14715 | 5 | | 2| TABLE ACCESS FULL| MONTHLY_SALES_MV| 327 | 14715 | 2 | ------------------------------------------------------------------
When a rollup occurs, the rewritten query will contain a GROUP BY clause, as highlighted in the previous execution plan.
A more interesting case of rollup is when your data has a hierarchy described by a dimension object. In this case, query rewrite can rollup data from a lower level to a higher level in the hierarchy. We will explain this later in this chapter.
For query rewrite to occur, all the columns in the query must either appear in the materialized view or must be derivable from some column in the materialized view. In the latter case, the materialized view must be joined to the base table to obtain the required column. This is called a join-back. In the simple case, for a join-back to occur, the materialized view must contain either the primary key or the rowid of the detail table. For instance, product_id is the primary key of the product table. So the following query, asking for product_name, can be answered with the MONTHLY_SALES_MV materialized view, using a join-back. The optimizer's plan shows that the query has been rewritten to use the materialized view MONTHLY_SALES_MV with a join to the product table.
-- join-back to product table using primary key constraint SELECT t.year, t.month, p.product_name, SUM (ps.purchase_price) as sum_of_sales FROM time t, product p, purchases ps WHERE t.time_key = ps.time_key AND ps.product_id = p.product_id GROUP BY t.year, t.month, p.product_name; YEAR MONTH PRODUCT_NAME SUM_OF_SALES ---------- ---------- ------------- ------------ 2002 1 CD LX1 40208.33 2002 1 CD LX2 54031.52 2002 1 CD LX3 9145.54 ... ------------------------------------------------------------------ |Id| Operation | Name |Rows | Bytes | Cost | ------------------------------------------------------------------ | 0| SELECT STATEMENT | | 157 | 9734 | 9 | | 1| SORT GROUP BY | | 157 | 9734 | 9 | | 2| HASH JOIN | | 327 | 20274 | 5 | | 3| TABLE ACCESS FULL| PRODUCT | 162 | 2754 | 2 | | 4| TABLE ACCESS FULL| MONTHLY_SALES_MV| 327 | 14715 | 2 | ------------------------------------------------------------------
The materialized view MONTHLY_SALES_MV defined previously contained data for all products for each month and year. Sometimes you may only want to summarize data for a certain product, region, or year. In this case, the materialized view will only contain a subset of data indicated by a selection condition in the WHERE clause of its query. In Oracle 9i, sophisticated query rewrites are possible with such a materialized view. Oracle will determine if the query can be answered by the materialized view by analyzing and comparing the WHERE clauses of the materialized view and the query.
The following materialized view contains sum of sales and number of sales for the electronics category for the years from 1997 through 2002.
CREATE MATERIALIZED VIEW elec_sales_mv ENABLE QUERY REWRITE AS SELECT t.month, t.year, p.product_id, SUM(ps.purchase_price) as sum_of_sales, COUNT (ps.purchase_price) as total_sales FROM time t, product p, purchases ps WHERE t.time_key = ps.purchase_date AND ps.product_id = p.product_id AND p.category = 'ELEC' AND t.year BETWEEN 1997 and 2002 GROUP BY t.month, t.year, p.product_id;
This materialized view can be used to answer the following query, which requests the sum of sales and number of sales for the electronics category for 1999. Notice that following the execution plan is a predicate information section. This is part of the EXPLAIN PLAN output and shows the predicates applied during each step of the execution plan. We show it here to illustrate that upon rewrite, the MV data is filtered to only select the row for 1999.
SELECT t.month, p.product_id, SUM(ps.purchase_price) as sum_of_sales, COUNT (ps.purchase_price) as total_sales FROM time t, product p, purchases ps WHERE t.time_key = ps.purchase_date AND ps.product_id = p.product_id AND p.category = 'ELEC' AND t.year = 1999 GROUP BY t.month, t.year, p.product_id; MONTH PRODUCT_ SUM_OF_SALES TOTAL_SALES ---------- -------- ------------ ----------- 1 SP1000 54966.53 332 1 SP1001 48861.17 312 1 SP1010 45432.43 268 ... ---------------------------------------------------------------- |Id| Operation | Name | Rows | Bytes | Cost | ---------------------------------------------------------------- | 0| SELECT STATEMENT | | 2 | 116 | 2 | |*1| TABLE ACCESS FULL | ELEC_SALES_MV | 2 | 116 | 2 | ---------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 1 - filter("ELEC_SALES_MV"."YEAR"=1999)
On the other hand, the answer to a query that requests the sum of sales for the year 1999 for all categories must be obtained from the detail tables, because the materialized view does not have sufficient data to compute sales for product categories other than electronics.
A query can have additional conditions not mentioned in the materialized view. For instance in the following query, we are looking for monthly sales of digital camera products in the electronics category for 2002. The materialized view has all products within this category, and we can determine the product_name from product_id using a join-back. Hence, the query will be rewritten to use the materialized view with a join to the product table.
SELECT t.month, SUM(ps.purchase_price) as sum_of_sales, COUNT (ps.purchase_price) as total_sales FROM time t, product p, purchases ps WHERE t.time_key = ps.purchase_date AND ps.product_id = p.product_id AND p.category = 'ELEC' AND t.year = 2002 AND product_name = 'Digital Camera' GROUP BY t.month; MONTH SUM_OF_SALES TOTAL_SALES ---------- ------------ ----------- 1 54966.53 332 2 32861.22 204 4 24575.66 82 6 29770.1 41 --------------------------------------------------------------- |Id| Operation | Name | Rows |Bytes| Cost| --------------------------------------------------------------- | 0| SELECT STATEMENT | | 1 | 75 | 4 | | 1| TABLE ACCESS BY | PRODUCT | 1 | 17 | 1 | | | INDEX ROWID | | | | | | 2| NESTED LOOPS | | 1 | 75 | 4 | | 3| TABLE ACCESS FULL| ELEC_SALES_MV | 2 | 116 | 2 | | 4| INDEX RANGE SCAN | PRODUCT_PK_INDEX| 162 | | | ---------------------------------------------------------------
This example illustrates how two rewrite mechanisms can be applied together-join-back and filtered data.
Aggregates in the query can be computed from different aggregates in the materialized view. We already saw a simple example of this in the SQL text match section. However, the power of query rewrite comes from the fact that many different transformations can be combined together. For instance, in the following query, we want to know the average purchase price of each item by year. The materialized view has the sum and count of the purchase_price at the month, year level. However, the average can be computed by first doing a rollup of months to years and then dividing the sum by the count of the purchase_price. The query is therefore rewritten to use the MONTHLY_SALES_MV materialized view
-- aggregate computability SELECT t.year, p.product_id, AVG(ps.purchase_price) as ave_sales FROM time t, product p, purchases ps WHERE t.time_key = ps.time_key AND ps.product_id = p.product_id GROUP BY t.year, p.product_id; YEAR PRODUCT_ AVE_SALES ---------- -------- ---------- 2002 SP1000 215.74129 2002 SP1001 206.1264 2002 SP1010 238.176498 ... ------------------------------------------------------------------ |Id| Operation | Name | Rows | Bytes | Cost | ------------------------------------------------------------------ | 0| SELECT STATEMENT | | 327 | 14715 | 5 | | 1| SORT GROUP BY | | 327 | 14715 | 5 | | 2| TABLE ACCESS FULL| MONTHLY_SALES_MV | 327 | 14715 | 2 | ------------------------------------------------------------------
To compute aggregates in the query with a rollup, the materialized view may need to have auxiliary aggregates. For instance, to roll up AVG, the materialized view must have SUM and COUNT or AVG and COUNT.
All the query rewrites we have seen so far have not required any additional information from the user. However, to get the most out of query rewrite, you must inform query rewrite about relationships between data columns using constraints and dimensions.
One of the powerful features of query rewrite is the ability for a single materialized view to be used to satisfy a wide range of queries. The dimension object discussed in section 4.5, is extremely useful in this respect. By allowing you to declare relationships within columns of dimension tables, it provides query rewrite with information to roll up from a lower to a higher level of a hierarchy. For example, suppose your users may want to know the sum of sales by day, month, or year. You could create three materialized views to answer these queries, or you could create one at the level of day and then define a dimension object to show the relationship between day, month, and year. Now, when the query asks for data at the monthly level, the materialized view at the daily level will be used to roll up the data to the monthly level.
Recall that the relationships declared by dimensions are not validated automatically by Oracle. Hence, you must set the QUERY_REWRITE_INTEGRITY parameter to TRUSTED or STALE_TOLERATED, indicating that the relationships stated by the dimension object can be trusted.
Consider the following definition for the time dimension. The HIERARCHY clause tells query rewrite that the time_key rolls up into week_number, which, in turn, rolls up into quarter. This means that if we knew the time_key value for some row in the TIME table, we know which week it belonged to. Similarly, if we knew the month (say Jan 2002), we know the year it belonged to: 2002.
CREATE DIMENSION time LEVEL time_key is time.time_key LEVEL month is time.month LEVEL quarter is time.quarter LEVEL year is time.year LEVEL week_number is time.week_number HIERARCHY fiscal_rollup ( time_key CHILD OF week_number CHILD OF quarter) HIERARCHY calendar_rollup( time_key CHILD OF month CHILD OF year);
Suppose our materialized view MONTHLY_SALES_MV was defined to report the total sales by product and month, as follows:
CREATE MATERIALIZED VIEW monthly_sales_mv ENABLE QUERY REWRITE AS SELECT t.month, p.product_id, SUM(ps.purchase_price) as sum_of_sales, COUNT (ps.purchase_price) as total_sales FROM time t, product p, purchases ps WHERE t.time_key = ps.time_key AND ps.product_id = p.product_id GROUP BY t.month, p.product_id;
In the following query, we want to know the total sales by product by year. Since we have a materialized view with the total sales by product by month, and months can be rolled up into years, as specified in the calendar_rollup hierarchy in the time dimension, the optimizer will rewrite the query to use the materialized view, MONTHLY_SALES_MV. Note that in order to determine the year value for the month, Oracle will do a join-back from the materialized view to the time table.
-- rollup to higher LEVEL in the HIERARCHY SELECT t.year, p.product_id, COUNT (ps.purchase_price) as total_sales FROM time t, product p, purchases ps WHERE t.time_key = ps.time_key AND ps.product_id = p.product_id GROUP BY t.year, p.product_id; ------------------------------------------------------------------ |Id| Operation | Name |Rows |Bytes | Cost | ------------------------------------------------------------------ | 0| SELECT STATEMENT | | 1 | 39 | 9 | | 1| SORT GROUP BY | | 1 | 39 | 9 | | 2| HASH JOIN | | 176 | 6864 | 7 | | 3| VIEW | | 5 | 35 | 4 | | 4| SORT UNIQUE | | 5 | 35 | 4 | | 5| TABLE ACCESS FULL|TIME | 15 | 105 | 2 | | 6| TABLE ACCESS FULL |MONTHLY_SALES_MV| 246 | 7872 | 2 | ------------------------------------------------------------------
When a data warehouse is being defined, the dimension object is often overlooked, because its value to query rewrite is not fully appreciated. However, a dimension object gives you tremendous rewrite power at no extra cost. We have already seen how query rewrite can take advantage of the HIERARCHY clause to rewrite several queries with one materialized view. Query rewrite can also make use of the ATTRIBUTE clause of dimension. In the following example, we want to know the sum of sales by customers based on their gender and occupation.
SELECT c.sex, c.occupation, SUM(ps.purchase_price) as sum_of_sales FROM purchases ps, customer c WHERE c.customer_id = ps.customer_id GROUP BY c.sex, c.occupation;
We could have put the columns sex and occupation into a materialized view. But we know that given the customer_id, we can find information such as the customer's name, sex, and occupation. Such relationships within a table that are not hierarchical in nature are defined by the ATTRIBUTE clause in a dimension. The ATTRIBUTE clause tells us that each customer_id has a one-to-one relationship to the customer's name, sex, and occupation.
The following dimension defines the relationships within the customer table.
CREATE DIMENSION customer_dim LEVEL customer IS customer.customer_id LEVEL town IS customer.town LEVEL region IS customer.county HIERARCHY customer_zone ( customer CHILD OF town CHILD OF region ) ATTRIBUTE customer DETERMINES (sex, occupation);
Now that we have this dimension object, we only need to include the customer_id in the materialized view.
CREATE MATERIALIZED VIEW cust_sales_mv ENABLE QUERY REWRITE AS SELECT c.customer_id, SUM(ps.purchase_price) as sum_of_sales, COUNT (ps.purchase_price) as total_sales FROM customer c, purchases ps WHERE c.customer_id = ps.customer_id GROUP BY c.customer_id;
The execution plan of the query shows that the query was rewritten to use the materialized view. Note that a join-back was done to the customer table to retrieve the values of the occupation and sex columns.
----------------------------------------------------------------- |Id | Operation | Name | Rows | Bytes | Cost | ----------------------------------------------------------------- | 0 | SELECT STATEMENT | | 8 | 320 | 7 | | 1 | SORT GROUP BY | | 8 | 320 | 7 | | 2 | HASH JOIN | | 82 | 3280 | 5 | | 3 | TABLE ACCESS FULL | CUSTOMER | 40 | 800 | 2 | | 4 | TABLE ACCESS FULL | CUST_SALES | 82 | 1640 | 2 | ----------------------------------------------------------------|
When designing your data warehouse, you should try to identify relationships between your dimension tables and define dimension objects, wherever possible. This will lead to significant space savings and increase query rewrite opportunities, thereby improving your query performance.
In OLTP systems, constraints are used to enforce data integrity. In a data warehouse, constraints may be used to define the relationships between the fact and dimension tables. Typically, a primary-key constraint is defined on the unique key column on each dimension table. A foreign-key constraint and a NOT NULL constraint are defined on each corresponding key in the fact table.
As discussed in Chapter 2, the EASYDW schema has primary-key constraints on each of the dimension tables-customer, product, and time. Also, there are foreign key and NOT NULL constraints on columns in the purchases table that join to these dimension tables.
The relationship defined by these constraints indicates to query rewrite that a join between the purchases table and, for example, the TIME table, will produce a row for every row in the purchases table. Such a join is known as a lossless join, because no rows in the purchases table will be lost (or duplicated) by the join process. You can use a materialized view to rewrite a query that has fewer joins if the extra joins in the materialized view are lossless. For instance, we can rewrite the following query with the MONTHLY_SALES_MV materialized view even though the time table is omitted completely! This is because the extra join between the PURCHASES and TIME tables in the materialized view is a lossless join.
SELECT p.product_id, SUM(ps.purchase_price) as sum_of_sales FROM product p, purchases ps WHERE ps.product_id = p.product_id GROUP BY p.product_id; PRODUCT_ SUM_OF_SALES -------- ------------ SP1000 142324.24 SP1001 104369.24 SP1010 122422.72 ... ------------------------------------------------------------------ |Id| Operation | Name | Rows | Bytes | Cost | ------------------------------------------------------------------ | 0| SELECT STATEMENT | | 327 | 10464 | 4 | | 1| SORT GROUP BY | | 327 | 10464 | 4 | | 2| TABLE ACCESS FULL| MONTHLY_SALES_MV| 327 | 10464 | 2 | ------------------------------------------------------------------
As described in Chapter 2, when a constraint is enabled, you can choose to have Oracle validate the integrity of the data. Typically, in a data ware-house, you may have already validated the data on the operational system or during the ETL process. In this case, you could use the NOVALIDATE clause to tell Oracle that the data has already been validated.
ALTER TABLE purchases ENABLE NOVALIDATE CONSTRAINT fk_customer_id;
Declaring the constraint informs query rewrite about the relationship between the fact and dimension tables. An additional RELY clause should be used to tell Oracle that it can rely upon the constraints being correct and can use it in query rewrite even when the constraint has not been validated. It allows the database administrator (DBA) to say: Trust me. I've already checked the data validity. Query rewrite can rely on the relationship being correct.
ALTER TABLE purchases MODIFY CONSTRAINT fk_customer_id RELY;
Hint: | Use constraints to define the relationship between your fact and dimension tables. Use the dimension object to declare the relationships within your dimension tables, such as a time or a region hierarchy. |
We have mentioned the QUERY_REWRITE_INTEGRITY parameter several times in our examples. This parameter indicates to Oracle the extent to which the rewritten queries must reflect the data in the detail tables. This parameter can take three values:
ENFORCED
STALE_TOLERATED
TRUSTED
If the QUERY_REWRITE_INTEGRITY mode is set to ENFORCED (which is the default), Oracle will guarantee that the rewritten query will return the same results as the query without query rewrite. If it is set to STALE_TOLERATED, Oracle will use stale materialized views that do not contain all the detail data. If it is set to TRUSTED, Oracle will use data and relationships that have been "blessed" by the DBA, namely dimension objects, RELY constraints, and materialized views created from PREBUILT tables. Oracle does not validate that the relationships declared by the dimension are valid or that a prebuilt table is the same as the materialized view's query.
You must decide what query rewrite integrity level is appropriate for your application. Most of the time, you would like to get the result the fastest way possible, rewriting your queries to use materialized views. However, if your materialized views have become stale and no longer represent the summarization of all your detail data, you may prefer to get the results from the detail tables until you can perform your next refresh.
Hint: | When first testing a materialized view to see if query rewrite will occur, set QUERY_REWRITE_INTEGRITY to STALE_TOLERATED, because if the query does not rewrite in this mode, it will not rewrite in any other mode. Once you know it works, you can try setting the parameter mode to your desired level. |
If the results obtained from a materialized view are close enough for your application, you may want to use the materialized view even if it is stale. To determine the month-over-month growth rate of online sales, you do not need every single sales transaction in the materialized view. As long as the data is reasonably recent, you could still get an answer that was close enough. Or, if the application knew that the missing data was beyond the scope of the query, it may still want to use the materialized view. For instance, if the missing data is for the last month but your query does not need this data, you can use the materialized view. Or, it may be appropriate to use the materialized view when the fact table is stale but not when a dimension is updated.
To tell the optimizer that you'd like to use the materialized view even if it is stale, set the QUERY_REWRITE_INTEGRITY parameter to STALE_TOLERATED. You can determine if a materialized view is FRESH or STALE using the STALENESS column of the catalog view USER_MVIEWS.
The following example shows the difference between the two integrity modes STALE_TOLERATED and ENFORCED with regard to STALE data. We insert four new rows into the purchases fact table.
INSERT INTO purchases VALUES ( 'SP1061','1-FEB-2002', 'AB123456','1-FEB-2002', 0024,28.01, 4.50, 'Y'); INSERT INTO purchases VALUES ('SP1062','1-FEB-2002', 'AB123457','1-FEB-2002', 1024,28.01, 4.50, 'Y'); INSERT INTO purchases VALUES ('SP1063','2-FEB-2002', 'AB123457','2-FEB-2002', 0024,28.01, 4.50, 'N'); INSERT INTO purchases VALUES ('SP1064','2-FEB-2002', 'AB123457','2-FEB-2002', 1024,28.01, 4.50, 'N'); COMMIT;
The MONTHLY_SALES_MV materialized view is now stale. All the data in the detail table is not reflected in the materialized view.
SELECT staleness FROM user_mviews WHERE mview_name = 'MONTHLY_SALES_MV'; STALENESS ------------------- STALE
If you set QUERY_REWRITE_INTEGRITY to ENFORCED (or TRUSTED), Oracle will use the detail tables rather than the materialized view and the sales numbers are updated to include the new rows.
ALTER SESSION SET QUERY_REWRITE_INTEGRITY=ENFORCED; SELECT t.month, p.product_id, SUM(ps.purchase_price) as sum_of_sales, COUNT(ps.purchase_price) as total_sales FROM time t, product p, purchases ps WHERE t.time_key = ps.time_key AND ps.product_id = p.product_id GROUP BY t.month, p.product_id; MONTH PRODUCT_ SUM_OF_SALES TOTAL_SALES ---------- -------- ------------ ----------- 1 SP1000 19683.57 209 1 SP1001 17377.68 189 1 SP1010 12500.83 145 1 SP1011 25094.03 249 ... 2 SP1061 362092.71 248 <- new value 2 SP1062 54775.9 292 2 SP1063 45862.38 286 2 SP1064 53637.11 248 ... ------------------------------------------------------------------ |Id| Operation | Name | Rows |Bytes| Cost| ------------------------------------------------------------------ |0 | SELECT STATEMENT | | 567 |23814| 415| |1 | SORT GROUP BY | | 567 |23814| 415| |2 | NESTED LOOPS | | 70585 |2895K| 43| |3 | HASH JOIN | | 70585 |2412K| 43| |4 | TABLE ACCESS FULL |TIME | 15 | 225| 2| |5 | PARTITION RANGE ALL| | | | | |6 | TABLE ACCESS FULL |PURCHASES | 94113 |1838K| 40| |7 | INDEX RANGE SCAN |PRODUCT_PK_INDEX| 1 | 7| | ------------------------------------------------------------------
Note that the TRUSTED mode will behave the same as ENFORCED with regard to STALE data-it will not use a STALE materialized view. However, if you can tolerate stale data and set QUERY_REWRITE_INTEGRITY to STALE_TOLERATED, then the materialized view is now used. However, the new rows that were inserted are not reflected in the total!
ALTER SESSION SET QUERY_REWRITE_INTEGRITY=STALE_TOLERATED; SELECT t.month, p.product_id, SUM(ps.purchase_price) as sum_of_sales, COUNT(ps.purchase_price) as total_sales FROM time t, product p, purchases ps WHERE t.time_key = ps.time_key AND ps.product_id = p.product_id GROUP BY t.month, p.product_id; MONTH PRODUCT_ SUM_OF_SALES TOTAL_SALES ---------- -------- ------------ ----------- 1 SP1000 19683.57 209 1 SP1001 17377.68 189 1 SP1010 12500.83 145 1 SP1011 25094.03 249 ... 2 SP1061 362064.7 247 <- old value 2 SP1062 54747.89 291 2 SP1063 45834.37 285 2 SP1064 53609.1 247 ... ------------------------------------------------------------------ |Id| Operation | Name | Rows | Bytes | Cost| ------------------------------------------------------------------ | 0| SELECT STATEMENT | | 327 | 18966 | 2| | 1| TABLE ACCESS FULL | MONTHLY_SALES_MV | 327 | 18966 | 2| ------------------------------------------------------------------
You are probably thinking that setting the parameter to ENFORCED is the best mode to use, since it guarantees that you will see the same results from using the materialized view or querying the detail tables. The problem is that in ENFORCED mode, all constraints must be validated and dimension objects are not used. Unless you have validated all the defined relationships, such as constraints, query rewrite may never occur. Therefore, in a warehouse you will most likely use the TRUSTED mode. In this mode, materialized views are only used if they are fresh; constraints are used provided they have the RELY clause, and dimensions are also considered, even though they are not validated.
Caution | Query rewrite depends on the integrity of your dimension and constraint definitions. Does each product in the product table roll up to one, and only one, category, as specified in your dimension definition? Does each product in the purchases table have a corresponding product_id in the products table, as specified by your referential integrity constraints? If your data do not reflect the relationships defined by the constraint or dimension, you may get unexpected results. The same holds for materialized views on prebuilt tables. If the prebuilt table does not reflect the materialized view's query accurately, then results can be unexpected. To summarize, TRUSTED is the recommended mode to use in a warehouse, but you must ensure that all data is correct. |
Query rewrite is only available when using the cost-based optimizer. While it is normally faster to obtain the results from a materialized view, there may be cases where it is faster to retrieve the results from the detail table. The query optimizer compares the cost of the execution plan with and without query rewrite and uses the one with the lower cost. To ensure that the optimizer makes the correct choices, you need to collect statistics on the detail tables involved in the query and on the materialized views using the DBMS_STATS package.
However, if your application requires that the query must use the materialized view and must not use the base tables, you can set the QUERY_REWRITE_ENABLED to FORCE. In this mode, if there is a materialized view that satisfies the query, the optimizer will use it. It will not compare the cost of plan with and without rewrite. This parameter value is new in Oracle 9i Release 2.
If several materialized views are eligible to rewrite the query, query rewrite will choose the one that would need to read less data. You can, however, request rewrite to use a specific materialized view by using the /*+ REWRITE(mv) */ hint. You can disable query rewrite for a statement by using the /*+ NOREWRITE */ hint. For instance, suppose we had two eligible materialized views: MONTHLY_SALES_MV, which computes sum of sales by month, and YEARLY_SALES_MV, which computes the sum of sales by year. For the following query, you would expect query rewrite to pick the latter, since it would be smaller. You could, however, force query rewrite to use MONTHLY_SALES_MV with a hint.
SELECT t.year, p.product_id, SUM(ps.purchase_price) sum_of_sales FROM time t, product p, purchases ps WHERE t.time_key = ps.time_key AND ps.product_id = p.product_id GROUP BY t.year, p.product_id; ------------------------------------------------------------------ |Id| Operation | Name | Rows | Bytes | Cost | ------------------------------------------------------------------ | 0| SELECT STATEMENT | | 82 | 2624 | 2 | | 1| TABLE ACCESS FULL| YEARLY_SALES_MV | 82 | 2624 | 2 | ------------------------------------------------------------------ SELECT /*+ REWRITE(monthly_sales_mv) */ t.year, p.product_id, SUM(ps.purchase_price) as sum_of_sales FROM time t, product p, purchases ps WHERE t.time_key = ps.time_key AND ps.product_id = p.product_id GROUP BY t.year, p.product_id; ------------------------------------------------------------------ |Id| Operation | Name | Rows | Bytes | Cost| ------------------------------------------------------------------ | 0| SELECT STATEMENT | | 115 | 2415 | 10 | | 1| SORT GROUP BY | | 115 | 2415 | 10 | | 2| HASH JOIN | | 589 | 12369 | 7 | | 3| VIEW | | 5 | 35 | 4 | | 4| SORT UNIQUE | | 5 | 35 | 4 | | 5| TABLE ACCESS FULL|TIME | 15 | 105 | 2 | | 6| TABLE ACCESS FULL |MONTHLY_SALES_MV| 825 | 11550 | 2 | ------------------------------------------------------------------
Previously, we discussed the partition change tracking feature in Oracle 9i that allows materialized views to be fast refreshed after partition maintenance operations. Partition change tracking also increases the query rewrite capabilities of the materialized view. Normally, when a detail table is updated, the materialized view becomes stale and cannot be used by query rewrite in ENFORCED or TRUSTED integrity levels. However, if the detail table is partitioned and the materialized view supports partition change tracking on that table (by including the partition key or partition marker for that table), Oracle can determine if a portion of the materialized view is fresh. If a query can be answered only using the fresh portion of the materialized view, query rewrite will use the materialized view. For example, in Figure 4.4 (see section 4.3.3) the fresh portion of the materialized view corresponds to the Jan 2002 and Mar 2002 partitions. The materialized view cannot be used for Feb 2002 (updated partition) or Apr 2002 (new partition).
Consider the following materialized view containing sales data for products. The purchases table is partitioned by time_key, which is included in the materialized view.
CREATE MATERIALIZED VIEW product_category_sales_mv ENABLE QUERY REWRITE AS SELECT ps.time_key, p.category, SUM(ps.purchase_price) as sum_of_sales FROM product p, purchases ps WHERE ps.product_id = p.product_id GROUP BY ps.time_key, p.category;
If we query the view user_mviews, we will see that the materialized view is FRESH.
SELECT staleness FROM user_mviews WHERE mview_name = 'PRODUCT_CATEGORY_SALES_MV'; STALENESS ------------------- FRESH
The materialized view only contains data through Mar 2002. Now, suppose we added a new partition to the purchases table and loaded data for Apr 2002.
ALTER TABLE purchases ADD PARTITION purchases_apr2002 values less than (TO_DATE('01-05-2002', 'DD-MM-YYYY')); INSERT INTO purchases VALUES ( 'SP1063','2-APR-2002', 'AB123457','7-APR-2002', 0024,28.01, 4.50, 'N'); INSERT INTO purchases VALUES ( 'SP1064','2-APR-2002', 'AB123457','8-APR-2002', 1024,28.01, 4.50, 'N'); ... COMMIT;
If we query the view user_mviews now, we will see that the materialized view is STALE.
SELECT staleness FROM user_mviews WHERE mview_name = 'PRODUCT_CATEGORY_SALES_MV'; STALENESS ------------------- STALE
Now, consider the following query, which asks for the sum of sales for January through March 2002. The Oracle optimizer will determine that the query only needs to access partitions for Jan, Feb, and Mar 2002 of the purchases table. Since the materialized view is enabled for partition change tracking for this table, Oracle will determine that the materialized view is fresh with respect to these partitions. Hence, it can rewrite with the materialized view, as shown in the following execution plan.
SELECT ps.time_key, p.category, SUM(ps.purchase_price) as sum_of_sales FROM product p, purchases ps WHERE ps.product_id = p.product_id and ps.time_key BETWEEN TO_DATE('01-01-2002', 'DD-MM-YYYY') AND TO_DATE('31-03-2002', 'DD-MM-YYYY') GROUP BY ps.time_key, p.category; --------------------------------------------------------------- |Id| Operation |Name | Rows| Bytes | Cost | --------------------------------------------------------------- |0 | SELECT STATEMENT | | 1 | 26 | 2 | |1 | TABLE ACCESS FULL|PRODUCT_CATEGORY_| 1 | 26 | 2 | | | | SALES_MV | | | | ---------------------------------------------------------------
On the other hand, a query that requests the sum of sales for Apr 2002 is answered using the detail data, as shown in the following example.
SELECT ps.time_key, p.category, SUM(ps.purchase_price) as sum_of_sales, FROM product p, purchases ps WHERE ps.product_id = p.product_id and ps.time_key BETWEEN TO_DATE('01-01-2002', 'DD-MM-YYYY') AND TO_DATE('30-04-2002', 'DD-MM-YYYY') GROUP BY ps.time_key, p.category; --------------------------------------------------------------- |Id | Operation | Name | Rows | Bytes | Cost | --------------------------------------------------------------- | 0| SELECT STATEMENT | | 43 | 1376 | 442 | | 1| SORT GROUP BY | | 43 | 1376 | 442 | | 2| HASH JOIN | | 94113 | 2941K| 43 | | 3| TABLE ACCESS FULL | PRODUCT | 162 | 1944 | 2 | | 4| PARTITION RANGE ALL| | | | | | 5| TABLE ACCESS FULL | PURCHASES| 94113 | 1838K| 40 | ---------------------------------------------------------------
The SQL aggregation operators, CUBE, ROLLUP, and GROUPING SETS, provide a mechanism to compute multiple groupings in a single query. You can create a materialized view using a query with these operators to store multiple levels of aggregation, instead of separate materialized views for each level. Query rewrite can be used to rewrite a query that asks for any of these levels of aggregation.
The following example shows a materialized view, with grouping sets that computes the sum of sales for the three groupings: (category, time_key), (category, time_key, town), and (time_key, county). Note that the materialized view must have a GROUPING_ID or GROUPING function on the group by columns to distinguish rows that correspond to different groupings.
CREATE MATERIALIZED VIEW sales_mv ENABLE QUERY REWRITE AS SELECT p.category, t.time_key, c.county, c.town, SUM(f.purchase_price) sales, GROUPING_ID(p.category, t.time_key, c.town, c.county) gid FROM product p, purchases f, time t, customer c WHERE p.product_id = f.product_id AND t.time_key = f.time_key AND c.customer_id = f.customer_id GROUP BY GROUPING SETS ((p.category, t.time_key), (p.category, t.time_key, c.town), (t.time_key, c.county));
We will now show how this materialized view can be used to rewrite different types of queries.
This materialized view can be used to rewrite a query that asks for any grouping that is present in the materialized view or one that can be derived using a rollup. For example, the following query, which asks for total sales by category and time_key, can be rewritten to use the SALES_MV materialized view.
SELECT p.category, t.time_key, SUM(f.purchase_price) sales FROM product p, purchases f, time t, customer c WHERE p.product_id = f.product_id AND t.time_key = f.time_key AND c.customer_id = f.customer_id GROUP BY p.category, t.time_key;
From the predicate information in the EXPLAIN PLAN output, we see that rewrite was done by selecting rows from SALES_MV with gid = 3, which corresponds to the grouping (p.category, t.time_key).
--------------------------------------------------------------- | Id| Operation | Name | Rows | Bytes | Cost | --------------------------------------------------------------- | 0| SELECT STATEMENT | | 1 | 39 | 2 | |* 1| TABLE ACCESS FULL | SALES_MV | 1 | 39 | 2 | --------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 1 - filter("SALES_MV"."GID"=3)
If the query has multiple groupings, Oracle will try to find a materialized view that satisfies all the groupings. For instance, the following query can be rewritten to use the SALES_MV materialized view defined previously. The output of EXPLAIN PLAN indicates that rewrite used two groupings from the materialized views: gid = 3, corresponding to (p.category, t.time_key), and gid = 10, corresponding to (t.time_key, c.county).
SELECT p.category, t.time_key, c.county, SUM(f.purchase_price) sales FROM product p, purchases f, time t, customer c WHERE p.product_id = f.product_id AND t.time_key = f.time_key AND c.customer_id = f.customer_id GROUP BY GROUPING SETS ((p.category, t.time_key), (t.time_key, c.county)); ----------------------------------------------------------------- | Id| Operation | Name | Rows | Bytes | Cost | ----------------------------------------------------------------- | 0| SELECT STATEMENT | | 1 | 46 | 2 | |* 1| TABLE ACCESS FULL | SALES_MV | 1 | 46 | 2 | ----------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 1 - filter("SALES_MV"."GID"=3 OR "SALES_MV"."GID"=10)
If Oracle cannot find a single materialized view to answer the query, it will try to rewrite each grouping separately. (Note that a query with grouping sets can be expressed as a UNION ALL of queries with the individual groupings.) As a result, several materialized views may be used to rewrite the query, and some groupings may remain unrewritten and use the detail data. This rewrite transformation was introduced in Oracle 9i release2.
The next query cannot be rewritten using SALES_MV alone. The grouping (p.category, t.time_key) is present in the SALES_MV materialized view, and the grouping (t.time_key, c.town) can be derived using a rollup of (p.category, t.time_key, c.town). However, the grouping (p.category, c.county, t.year) is not present in this materialized view.
SELECT p.category, t.time_key, c.county, t.year, c.town, SUM(f.purchase_price) sales FROM product p, purchases f, time t, customer c WHERE p.product_id = f.product_id AND t.time_key = f.time_key AND c.customer_id = f.customer_id GROUP BY GROUPING SETS ((p.category, t.time_key), (t.time_key, c.town), (p.category, c.county, t.year)); ------------------------------------------------------------------ |Id | Operation | Name | Rows | Bytes | Cost | ------------------------------------------------------------------ | 0| SELECT STATEMENT | | 5 | 290 | 120 | | 1| VIEW | | 5 | 290 | 120 | | 2| UNION-ALL | | | | | |* 3| TABLE ACCESS FULL | SALES_MV | 1 | 39 | 2 | | 4| SORT GROUP BY | | 1 | 47 | 4 | |* 5| TABLE ACCESS FULL | SALES_MV | 1 | 47 | 2 | | 6| SORT GROUP BY | | 3 | 186 | 114 | |* 7| HASH JOIN | | 13374 | 809K| 16 | | 8| TABLE ACCESS FULL | PRODUCT | 162 | 1944 | 2 | |* 9| HASH JOIN | | 13374 | 653K| 13 | | 10| TABLE ACCESS FULL | CUSTOMER | 82 | 820 | 2 | |*11| HASH JOIN | | 13374 | 522K| 10 | | 12| TABLE ACCESS FULL| TIME | 7 | 84 | 2 | | 13| TABLE ACCESS FULL| PURCHASES | 13379 | 365K| 7 | ------------------------------------------------------------------ Predicate Information (identified by operation id): --------------------------------------------------- 3 - filter("SALES_MV"."GID"=3) 5 - filter("SALES_MV"."GID"=1) 7 - access("P"."PRODUCT_ID"="F"."PRODUCT_ID") 9 - access("C"."CUSTOMER_ID"="F"."CUSTOMER_ID") 11 - access("T"."TIME_KEY"="F"."TIME_KEY")
The EXPLAIN PLAN output shows that rewrite was done using SALES_MV for two groupings (gid = 3 and gid = 1) and using the detail tables for grouping (p.category, c,county, t.year).
Now, suppose we had a simple materialized view, SALES_MV2, that had the grouping (p.category, c.county, t.year).
CREATE MATERIALIZED VIEW sales_mv2 ENABLE QUERY REWRITE AS SELECT p.category, c.county, t.year, SUM(f.purchase_price) sales FROM product p, purchases f, time t, customer c WHERE p.product_id = f.product_id AND t.time_key = f.time_key AND c.customer_id = f.customer_id GROUP BY p.category, c.county, t.year;
Oracle can now use the combination of SALES_MV and SALES_MV2 to rewrite the query, as shown in the following execution plan.
------------------------------------------------------------------ | Id | Operation | Name | Rows | Bytes | Cost | ------------------------------------------------------------------ | 0 | SELECT STATEMENT | | 84 | 4872 | 8 | | 1 | VIEW | | 84 | 4872 | 8 | | 2 | UNION-ALL | | | | | |* 3 | TABLE ACCESS FULL | SALES_MV | 1 | 39 | 2 | | 4 | SORT GROUP BY | | 1 | 47 | 4 | |* 5 | TABLE ACCESS FULL| SALES_MV | 1 | 47 | 2 | | 6 | TABLE ACCESS FULL | SALES_MV2 | 82 | 3034 | 2 | ------------------------------------------------------------------ Predicate Information (identified by operation id): --------------------------------------------------- 3 - filter("SALES_MV"."GID"=3) 5 - filter("SALES_MV"."GID"=1)
In the previous examples, we have used EXPLAIN PLAN to see if a query had been rewritten to use a materialized view. However, sometimes you may find that the query did not rewrite with the materialized view as you had expected. The rules governing query rewrite can be extremely complex, and the reasons for not using a materialized view may not be obvious. In some cases, the reason is extremely trivial, such as the parameter QUERY_REWRITE_ENABLED not being set to TRUE. In other cases, the reason could be more subtle such as a constraint not being present or validated, or some column required by the query not being present in the materialized view. Oracle 9i introduced a utility to assist with diagnosing the reasons for such missed rewrites. This utility is the PL/SQL API DBMS_MVIEW.EXPLAIN_REWRITE.
To use EXPLAIN_REWRITE, you provide the query and, optionally, the materialized view it should use to rewrite. The procedure will tell you if the query will use that materialized view and, if not, the reason for not doing the rewrite. Prior to using the procedure, you must create a table named REWRITE_TABLE in your schema, using the script utlxrw.sql in the rdbms/admin directory. The results of EXPLAIN_REWRITE are placed in this table. There is also a varray interface, which allows you to access the results through a PL/SQL program.
We will now illustrate how to use this utility. In the first example, the user forgot to set the QUERY_REWRITE_ENABLED parameter to TRUE. To diagnose the problem you issue EXPLAIN_REWRITE and select the results from the REWRITE_TABLE.
BEGIN dbms_mview.explain_rewrite(' SELECT t.month, t.year, p.product_id, SUM (ps.purchase_price) as sum_of_sales, COUNT (ps.purchase_price) as total_sales FROM time t, product p, purchases ps WHERE t.time_key = ps.time_key AND ps.product_id = p.product_id GROUP BY t.month, t.year, p.product_id', 'MONTHLY_SALES_MV'); END; / SELECT mv_name, message FROM rewrite_table; MV_NAME MESSAGE ------------------------- -------------------------------------- MONTHLY_SALES_MV QSM-01001: query rewrite not enabled
EXPLAIN_REWRITE can be used to check why a specific materialized view was not used to rewrite the query. The query in the following example was used earlier to illustrate the use of constraints with query rewrite. The materialized view has an extra join between purchases and time tables. In this example, however, the constraint fk_time is not marked RELY, and so the join between purchases and time cannot be determined to be a lossless join. Hence, query rewrite fails. This can be very difficult to diagnose without this utility.
BEGIN dbms_mview.explain_rewrite(' SELECT p.product_id, SUM (ps.purchase_price) as sum_of_sales, COUNT(ps.purchase_price) as total_sales FROM product p, purchases ps WHERE p.product_id = ps.product_id GROUP BY p.product_id', 'MONTHLY_SALES_MV'); END; / SELECT mv_name, message FROM rewrite_table; MV_NAME MESSAGE ---------------------- --------------------------------------- MONTHLY_SALES_MV QSM-01110: a lossy join in MV, MONTHLY_SALES_MV, between tables, PURCHASES and TIME, not found in query MONTHLY_SALES_MV QSM-01053: NORELY referential integrity constraint on table, PURCHASES, in TRUSTED/STALE TOLERATED integrity mode
Now, let us fix this problem and run EXPLAIN_REWRITE again. We will see that the query now rewrites.
ALTER TABLE purchases MODIFY constraint fk_time rely; BEGIN dbms_mview.explain_rewrite(' SELECT p.product_id, SUM (ps.purchase_price) as sum_of_sales, COUNT(ps.purchase_price) as total_sales FROM product p, purchases ps WHERE p.product_id = ps.product_id GROUP BY p.product_id', 'MONTHLY_SALES_MV'); END; / SELECT mv_name, message FROM rewrite_table; MV_NAME MESSAGE -------------------------- ----------------------------------- MONTHLY_SALES_MV QSM-01033: query rewritten with materialized view, MONTHLY_SALES_MV
Sometimes query rewrite may be possible with the requested materialized view; however, there may be a better materialized view that can be used. Suppose we create a materialized view, PRODUCT_SALES_EXACT_MATCH, for the query in the previous example, matching its text exactly. Query rewrite now uses this materialized view instead, since it is more optimal. EXPLAIN_REWRITE will tell you that this is the case.
begin dbms_mview.explain_rewrite(' SELECT p.product_id, SUM(ps.purchase_price) as sum_of_sales, COUNT(ps.purchase_price) as total_sales FROM product p, purchases ps WHERE p.product_id = ps.product_id GROUP BY p.product_id', 'MONTHLY_SALES_MV'); end; / SELECT mv_name, message FROM rewrite_table; MV_NAME MESSAGE -------------------------- ----------------------------------- MONTHLY_SALES_MV QSM-01009: materialized view, PRODUCT_SALES_EXACT_MATCH, matched query text