7.8 Features for data warehouse design

 < Day Day Up > 



When discussing data warehouse design, immediately the words ''star or snowflake schema'' should pop up in our minds. That is correct; data warehouses are popular because of this star/snowflake type of schema. The start schema provides a great mechanism to abstract data, based on business functions. While most of the databases today are moving towards a hybrid type of solution, there are others that continue to differentiate between pure OLTP and data warehouse type solutions.

We have discussed some of the database features suitable for an OLTP database or a hybrid database (OLTP or DSS). We also mentioned certain features that would be ideal if implemented in a data warehouse environment; for example, the bitmap index or a bitmap join index. Another excellent feature added by Oracle in Version 8i that benefits a data warehouse implementation is the materialized view option.

7.8.1 Materialized view

Materialized views are different from standard views. A standard view is a database implementation of a query stored permanently as a database object. At runtime, when data is selected from this view, the query is executed and data is returned to the process that executed the query. Data returned by these views is volatile in nature, because once the required row is queried the data in the view is lost and another process that requires the same set of data will requery the view and the data; the underlying query is executed and data is returned again.

In Oracle 8i, a feature was introduced where the data in the view, once queried, could be saved on disk like any other database table. Subsequently, when data changes in the underlying tables, the query is refreshed, which automatically refreshes the data. This is called a materialized view. From a storage perspective the materialized view behaves exactly like a table. However, data is not directly inserted into it; data is collected and stored from other tables. Oracle treats the materialized view just as it would an Oracle snapshot. The periodic updates to this snapshot are made based on a predefined schedule. Updates are accomplished by way of a refresh interval, which can range from instantaneous rebuilding of the materialized view to a hot refresh that occurs at a predefined interval.

The major advantage of this feature is that all the data is not queried every single time the view is invoked, like in the case of a standard view. In a materialized view the data that is queried during the materialized view creation is saved to disk in a permanent form and any changes to the data set is done through a refresh process, where only the changes are made and no rebuild or requery of the entire collection is required. Like most features, materialized views are not without any restrictions. The major restriction that currently affects many database designers building complex views is the restriction on the usage of subqueries.

However, to overcome this limitation, the workaround would be to use the multitier materialized view option. Under this feature a materialized view is based on other materialized views for their content. That is, a single materialized view will provide the contents of a subquery operation and many such materialized views will be consolidated into one big master materialized view.

Like any other database object, DDL operations are permitted against a materialized view. That is, a materialized view can be created, dropped, altered, etc.

For example, the following statement creates and populates a materialized view MY_SALES_MV. The materialized view will be populated with data as soon as the statement executes successfully and subsequent refreshes will be accomplished by executing the materialized view's query again.

CREATE MATERALIZED VIEW my_sales_mv   TABLESPACE sales_mv_data_p001    PARALLEL (10)     ENABLE QUERY REWRITE      BUILD IMMEDIATE    REFRESH COMPLETE AS    SELECT       TIM.MONTH,       GEO.STATE,       SUM(SALES) AS SUM_SALES    FROM FACT FCT,       TIME TIM,       GEOG GEO WHERE FCT.CUR_DATE=TIM.CUR_DATE AND FCT.CITY_ID=GEO.CITY_ID GROUP BY MONTH,       STATE 

Cost-based optimizer can use materialized views to improve performance by automatically recognizing when a materialized view can and should be used to satisfy a request. The optimizer transparently rewrites the original query at runtime, to use the materialized view.

Refresh process

A materialized view is a transactionally consistent reflection of its master, as the data exist at a specific point in time. To keep materialized views' data relatively current with the data of its master, the materialized view must be refreshed periodically. They can be refreshed automatically whenever the data is changed in the underlying tables. The refresh method can be incremental (fast refresh) or complete. If data does not change very frequently or if the updates to the underlying tables happen in a controlled environment, the refresh operation could be done manually.

There are many types of refresh options, including:

  • Complete refresh: Under this option, the materialized view definitions query is executed, which essentially recreates the materialized view. To refresh the materialized view, the result set of the query replaces the existing materialized view data. A complete refresh can take substantially longer to perform than a fast refresh. Once a complete refresh of a materialized view is performed, then the subsequent refresh should also be a complete refresh; if another type of refresh option is used, then the following error message is reported:

    ORA-12034 mview log is younger than last refresh.
  • Fast refresh: Under this refresh option, the changes that occurred in the master since the most recent refresh of the materialized view are identified and then changes are applied to the materialized view.

  • Force refresh: To perform a force refresh of a materialized view, the server that manages the materialized view attempts to perform a fast refresh. If a fast refresh is not possible, then Oracle performs a complete refresh.

The refresh options can be invoked either by scheduling a process to happen periodically or can be on-demand where a manual intervention will cause refresh of materialized view groups to immediately propagate the new rows of the master table to associated materialized views.

The on-demand refresh can be done using the following command:

EXECUTE DBMS_REFRESH.REFRESH('mview name');

Query rewrite

Oracle optimizer automatically recognizes when an existing materialized view can be used to satisfy a request. Once determined, the optimizer transparently rewrites the request to use the materialized view. Queries are then directed to the materialized view and not to the underlying detail tables, resulting in a significant performance gain.

To enable query rewrite the following should be set in the init.ora file or spfile:

QUERY_REWRITE_ENABLE =TRUE

Since most of the refresh activity in a materialized view happens transparently to the DBA or the users, it uses the job queues to accomplish its activity. Hence it is required to ensure that sufficient job queues are available. Job queues are defined by setting the following parameter in the init.ora or the spfile:

JOB_QUEUE_PROCESSES =3

Materialized views in RAC environment

Materialized views provide a good amount of benefit in a RAC implementation. When multiple instances have to perform complex queries against multiple tables to retrieve data, there is considerable GCS resource movement between instances. This activity is considerably reduced when materialized views are used in a RAC implementation because only a single object that has preselected rows is queried instead of joining quite a few underlying tables at runtime.



 < Day Day Up > 



Oracle Real Application Clusters
Oracle Real Application Clusters
ISBN: 1555582885
EAN: 2147483647
Year: 2004
Pages: 174

flylib.com © 2008-2017.
If you may any questions please contact us: flylib@qtcs.net