No Latency Scenario


In this short latency scenario you are in charge of an eCommerce site that provides customers' links to the most up-to-date products on the Web, which when sold, provide you a commission. Your Internet affiliates are adding additional products to your catalog electronically and at this point you are at 2.3 million product SKUs and the number is rising. Meanwhile, your partition data is changing frequently, and you have large numbers of members in the product dimensions. What does a BI apps developer do?

Real-Time ROLAP Storage Option

The recommended solution here would be to set up the measure group and dimension data (which are frequently changing) to be in ROLAP mode so that data is automatically retrieved from the relational data store as needed by the user. Working in this way does not come without a price; indeed, although it is definitely a useful storage mode, the performance of ROLAP mode is much slower than the MOLAP equivalent. In general we would always recommend a MOLAP solution for large dimensions, but if your dimension members are constantly changing and these changes need to be reflected immediately to end users, the ROLAP would be a better option. This is because the data is being retrieved directly from your relational data source, which often requires over-the-net communication. Yes, you could go with HOLAP, but performance depends largely on your aggregated data and how frequently it is impacted due to changes in the data.

Just setting the storage mode to ROLAP is not sufficient. Analysis Services caches data and if there is a change in your relational data warehouse, this might need to be immediately reflected in your users' queries. If you definitely need real-time, as in no latency updates, you need to specify REAL-TIME ROLAP, which amounts to setting up proactive caching on ROLAP partitions or dimensions. Under this configuration, on a change in the source data, Analysis Services immediately drops the cache and gets the data from the relational data warehouse. Figure 18-16 shows the selection for REAL TIME ROLAP in the proactive caching dialog. If you click on the options button you can see the proactive caching properties setup so that latency is 0 and you bring the new data online immediately as shown in Figure 18-17.

image from book
Figure 18-16

image from book
Figure 18-17

Billions and Billions of Records

The American astronomer Carl Sagan seemed fond of pondering the number of stars in a galaxy or cluster of galaxies; well, these days it is not a stretch to have similarly mind-boggling numbers of records in your transactional repositories. For reference, the Milky Way contains about 200 billion stars. Anyway, you don't have to ponder how to deal with them because we have a recommended solution for you right here. If you have a large number of fact data (on the order of billions of rows), ask yourself if the dimension data does not change much as compared to the fact data that changes regularly. If this is the case, building the cache might take a disproportionate amount of time since the dimension and fact data needs to be updated. Typically you will have the fact data split across hundreds of partitions. However if the fact data is changing frequently, and if you do need real-time access of the data, then the cache needs to updated frequently and needs to be merged with existing partitions for the new fact data and Analysis Services needs to aggregate the data from multiple partitions to the end users.

The way to approach this type of case is to store historical data (data that does not changes) using the MOLAP storage method and store current data (which can be defined as hours, days, or weeks) in ROLAP mode. Then set proactive caching to operate on the ROLAP partition only. In this way you will get fast access to fact data from the MOLAP store even when there are changes being processed on the ROLAP partition. We recommend this solution whenever you have very large amount of fact data with new fact data arriving periodically and which you need to see in real-time.

Important

Even if you have a MOLAP cube specified, if you still want real-time data you should use the connection string property called Real Time OLAP and set it to True while connecting to Analysis Services. This will help provide you the most up-to-date data from the relational data source.



Professional SQL Server Analysis Services 2005 with MDX
Professional SQL Server Analysis Services 2005 with MDX (Programmer to Programmer)
ISBN: 0764579185
EAN: 2147483647
Year: 2004
Pages: 176

flylib.com © 2008-2017.
If you may any questions please contact us: flylib@qtcs.net