Traditionally, OLAP refers to fast access to aggregated or summarized data, with the source data retrieved from a relational data warehouse and stored in a format called MOLAP (Multidimensional OLAP). Relational databases, by contrast, remain well suited to detail-level data and reporting. You learned that MOLAP storage is optimal and provides the best query performance, while ROLAP storage provides instant access to the latest data but cannot match MOLAP performance. Proactive caching is an important feature addition in Analysis Services 2005 because it enables the UDM to deliver the best of the relational and OLAP worlds: most importantly, real-time data access with near-MOLAP performance. Proactive caching helps the UDM achieve real-time data by providing controls over how changes propagate from the source data to the UDM, where they become available to user queries. When the appropriate proactive caching properties are set, Analysis Services starts building a new MOLAP cache whenever data in the underlying data source changes. While the new cache is being built, Analysis Services continues to serve users from the existing MOLAP cache. As soon as the new MOLAP cache is ready, all users are served from it and start seeing the new data. You see this concept in action throughout this chapter.
As mentioned earlier, the UDM merges the relational and OLAP worlds. We consider proactive caching a management feature that lets the administrator or database designer specify settings that achieve real-time data access based on customer needs. Proactive caching can be applied to both partitions and dimensions. Figures 18-1, 18-2, and 18-3 show how this feature works. Figure 18-1 shows a UDM with proactive caching enabled. Analysis Services 2005 creates a MOLAP cache of the UDM, from which users query for information. When updates to the relational database affect the UDM's data, Analysis Services is notified of the change, as shown in Figure 18-2. If proactive caching has been enabled on the UDM, a background thread within Analysis Services 2005, which we refer to as the proactive caching management thread, controls operations within Analysis Services based on certain parameters to ensure that customers get the real-time data access they requested, as shown in Figure 18-3.
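In the Analysis Services Scripting Language (ASSL), proactive caching appears as a ProactiveCaching element inside the partition (or dimension) definition. The following fragment is a sketch only: the partition name is hypothetical, the interval values are examples rather than recommendations, and you should check the element names against the ASSL schema for your build before using them.

```xml
<!-- Illustrative ASSL fragment: a MOLAP partition with proactive caching
     enabled. Intervals use xsd:duration format (PT10S = 10 seconds).
     Partition ID/Name are hypothetical. -->
<Partition>
  <ID>Internet_Sales_2004</ID>
  <Name>Internet_Sales_2004</Name>
  <StorageMode>Molap</StorageMode>
  <ProactiveCaching>
    <!-- Wait for a quiet period after a change before rebuilding -->
    <SilenceInterval>PT10S</SilenceInterval>
    <!-- If changes keep arriving, rebuild anyway after this long -->
    <SilenceOverrideInterval>PT10M</SilenceOverrideInterval>
    <!-- How long the old cache may still serve queries after a change -->
    <Latency>PT30M</Latency>
    <!-- A negative duration disables periodic forced rebuilds -->
    <ForceRebuildInterval>-PT1S</ForceRebuildInterval>
    <!-- Inherit change-detection bindings from the partition's source -->
    <Source xsi:type="ProactiveCachingInheritedBinding" />
  </ProactiveCaching>
</Partition>
```

The same ProactiveCaching block can be set through the Storage Settings dialogs in BI Development Studio or SQL Server Management Studio, which generate ASSL of this shape for you.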
The configuration we expect most users to adopt is MOLAP storage for the UDM with proactive caching enabled, which is shown in Figure 18-3. While users are sending queries to your cube, there is an existing MOLAP cache of the UDM on your server, and query results are retrieved from this cache. The proactive caching management thread watches for changes in the relational data warehouse using mechanisms you see later in this chapter. As soon as a data change is made known to the thread, it initiates the rebuild of a new MOLAP cache and clears the existing one. Note that you can set properties that control when the current MOLAP cache is cleared and when the new MOLAP cache is rebuilt.
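One of those change-detection mechanisms is client initiated: an application that knows it has just modified the source tables can tell Analysis Services explicitly by sending the XMLA NotifyTableChange command. The fragment below is a sketch of such a notification; the database, data source, and table names are hypothetical placeholders.

```xml
<!-- Illustrative client-initiated notification: informs Analysis Services
     that rows in dbo.FactInternetSales have changed, so proactive caching
     can react. DatabaseID, DataSourceID, and table names are hypothetical. -->
<NotifyTableChange
    xmlns="http://schemas.microsoft.com/analysisservices/2003/engine">
  <Object>
    <DatabaseID>Adventure Works DW</DatabaseID>
    <DataSourceID>Adventure Works DW</DataSourceID>
  </Object>
  <TableNotifications>
    <TableNotification>
      <DbTableName>FactInternetSales</DbTableName>
      <DbSchemaName>dbo</DbSchemaName>
    </TableNotification>
  </TableNotifications>
</NotifyTableChange>
```

Other mechanisms, such as SQL Server trace events and scheduled polling queries, do not require the client application's cooperation; they are covered later in this chapter.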
Any query that reaches Analysis Services is first checked to see whether it can be served from the existing MOLAP cache. If the current cache is still valid based on the proactive caching settings, results are retrieved from it. If the current cache is not valid, a new cache is being rebuilt; because Analysis Services does not know how long the rebuild will take, it goes directly to the relational data warehouse to retrieve the data. Analysis Services generates SQL queries to retrieve the correct data, and these queries are optimized to retrieve it from the relational data warehouse efficiently. Any calculation that cannot be performed in the relational data warehouse is computed within Analysis Services after the data is retrieved, and the results are returned to the user.
Keep in mind that there might be a slight performance degradation while data is being retrieved from the relational data source, due to the cost of query translation and the additional network activity; nonetheless, users get real-time data. If users do not mind being served from the existing cache and simply want refreshed data with good performance, you can set a proactive caching property called latency: the length of time the current MOLAP cache remains valid even after notification of a data change in the relational data warehouse. Setting the latency close to the cache rebuild time helps you get MOLAP-level performance, at the cost of a slight delay before users see the real-time data.
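In ASSL the latency setting is the Latency element of the ProactiveCaching definition, expressed as an xsd:duration. The two fragments below sketch the opposite ends of the trade-off just described; the 30-minute value is illustrative, not a recommendation.

```xml
<!-- Real-time bias: the old MOLAP cache is dropped as soon as a change is
     notified; queries fall back to SQL against the source (ROLAP behavior)
     until the rebuild completes. -->
<Latency>PT0S</Latency>

<!-- Performance bias: keep serving the old MOLAP cache for up to 30 minutes
     after a change notification; users may see slightly stale data but get
     MOLAP-level performance throughout. -->
<Latency>PT30M</Latency>
```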
Fortunately for users, the MOLAP cache rebuild runs as a background thread with low priority. This means that queries submitted to Analysis Services databases are given higher priority than the background proactive caching thread. If at any time during the rebuild a user initiates an operation that changes the data in the cube (for example, by reprocessing the cube or doing a writeback to it), the background rebuild of the MOLAP cache will be cancelled. Similarly, if Analysis Services receives another notification of a data change, the MOLAP cache rebuild will be cancelled unless you have explicitly specified otherwise through a proactive caching property. It is important for you as an administrator or database designer to be aware of this behavior so that you can set the proactive caching properties to values that match your business requirements.