Scalability Optimizations


Scalability with respect to Analysis Services indicates how well Analysis Services handles parameter value increases for dimension sizes, number of dimensions, size of the cube, number of databases, and number of concurrent users — all of which affect server behavior. Analysis Services provides several scalability parameters, such as handling a large number of partitions or a dimension with a large number of members (>5 million). In this section you will learn about optimizations relevant specifically to scalability of Analysis Services.

For Analysis Services 2005 you can control the number of queries executed concurrently with the advanced configuration property ThreadPool\Query\MaxThreads. The default value 10 works well for most scenarios, but on a machine with many processors it should be increased, or for scenarios combining short-running queries with long-running queries there may be benefits from decreasing the value.

Configuring Server Configuration Properties

There are several configuration properties provided by Analysis Services with which to fine tune the server. The default values for these properties have been set for the most common scenarios. However for specific requirements you might have change these settings to achieve the best performance from Analysis Services. Microsoft recommends you change configuration settings only when working with Microsoft product support. Documentation of all the server configuration properties is expected to be released by Microsoft in 2006. Here we provide information on some of the configuration properties.

  1. Memory\Total Memory Limit: The value for this property is between 0 and 100 since it is a percentage of the available memory. On a 4GB memory machine if you have not enabled the /3GB switch the msmdsrv.exe process will get 2 GB. Be aware that the total memory limit on the server should be 80% of the available memory since Analysis Services is expected to use the remaining 20%.

  2. Memory\Low Memory Limit: Analysis Services contains a cleaner thread which reclaims memory from jobs that are idle or have lower priority. The cleaner thread constantly wakes up and reclaims memory based on the Low Memory Limit. The default value is 75%. As soon as the low memory limit crosses the 75% threshold then the cleaner thread requests existing jobs to shrink memory and starts to reclaim the memory. In certain instances while processing multiple partitions in parallel the cleaner thread might not be fast enough to wake up and reclaim memory. Due to this existing processing, jobs could fail. By lowering the Low Memory Limit you can make sure cleaner thread starts to reclaim memory earlier and hence all the processing jobs will complete.

  3. CoordinatorExecutionMode: This property is used for parallelism of jobs and takes a negative or positive value. The value indicates the number of coordinator jobs that can be in parallel at a given point in time. If the value is negative then the number is multiplied by the number of processors on the machines. Having a high value can deteriorate performance since multiple threads are competing for resources. The default value is -4.

  4. ThreadPool\Query: This node contains several properties such as minimum and maximum number of threads to be allocated for a query and the priority for these threads. The MinThreads property indicates the number of threads created for each query and the MaxThreads property indicates the number of maximum threads that will be allocated for a query. A query can be split into multiple jobs and these jobs can be executed in parallel. Based on the MaxThreads, Analysis Services will have all the jobs execute in parallel (assume there are no dependencies).

  5. ThreadPool\Processing: Similar to ThreadPool\Query this node contains the same subproperties. However, the properties are applicable for each process statement rather than the query.

  6. OLAP\ProcessPlan\MemoryLimit: This value is represented as a percentage of available system memory. This value indicates the maximum amount of memory used for a specific processing job. As mentioned earlier in this chapter we recommend you increase this value up to the Total Memory Limit if you encounter out of memory errors while processing multiple partitions in parallel.

  7. OLAP\ProcessPlan\ForceMultiPass: This holds a Boolean value (0 or 1). When Analysis Services is processing dimensions it processes all the attributes in parallel. While processing the key attribute it retrieves all the related properties. If the number of members in the dimension is large then the entire row might not fit in memory. Hence Analysis Services provides you a way to process this in multiple passes of having chunks of related attributes processed based on memory availability. By default this value is set to 0 and Analysis Services gets into ForceMultiPass code path whenever Analysis Services is under memory pressure. However if you do need to force Analysis Services to use the multi-pass algorithm then you need to set this property to 1.

  8. Query\DefaultDrillthroughMaxRows: While performing drill-through on a specific cell the number of resulting rows can be quite large. This impacts the performance of drill-through. Most often users might look for a top 100 rows. While defining drill-through you can specify the number of rows to be returned. However if that option is not specified in the drill-through statement then the value specified for this property is used to restrict the number of rows returned.

Scale Out

If your cube contains a large number of partitions (thousands of partitions) then queries retrieving data from all the partitions might be slow. This is because Analysis Services needs to read the data from all the partitions and then needs to aggregate the data. For efficient data reads you need to have high speed disks optimized for reads. Keep in mind that the aggregation of the data from various partitions is a CPU intensive operation; you can certainly increase the number of processors on the machine to reduce latency. Another alternative provided by Analysis Services is to make the partitions remote so that multiple Analysis Services are involved in reading data from various partitions. This scale out solution helps in distributing the read and data aggregation on multiple Analysis Services and thereby reduces the overall query time. You do have the master Analysis Services which still needs to gather the data from various slave machines and then aggregate the final data to the end user. We recommend you perform cost benefit analysis where you calculate the total costs of all the Analysis Services machines and the benefit you would be getting from it — all before implementing a solution using remote partitions.

Scale up

For large databases where the queries are data intensive or the load on the system is quite heavy with several users you need to consider adding additional CPUs. Typically commodity machines for servers are 4 processors. Based on your system load consider increasing the number of processors. Your system might be bottlenecked by memory as well. Assuming you have a system with 4GB (maximum on 32-bit machine), by default each process can access up to 2GB of memory. We recommend you change the boot.ini file to enable the /3GB option so that Analysis Services can utilize maximum memory on the system. If your database is large (dimensions having greater than 10 million members) and the user load on your system high, consider moving to a 64-bit system and adding additional processors and memory.

Handling Large dimensions

Certain Analysis Services databases can contain very large dimensions. Analysis Services 2000 had limitations in handling very large dimensions. Analysis Services 2005 overcomes most of these limitations by loading parts of dimensions that are requested. For MOLAP dimensions you might reach the theoretical maximum of 32-bit systems. Based on our experience of working with certain customers with very large dimensions we have identified that you might reach the 32-bit system limit if you have dimensions containing memory in the range of around 10 to 15 million members along with several hundred attributes in the dimensions. Typically when you have customer or product dimension along with various properties of the customer you can encounter processing issues due to unavailability of system resources. You can certainly tweak certain server configuration properties to get the maximum from Analysis Services. However we recommend you to move to 64-bit systems if you have dimensions having more than ten million members. Another suggestion is to have very large dimensions as ROLAP dimensions. Consider the alternatives mentioned in this section while handling large dimensions.



Professional SQL Server Analysis Services 2005 with MDX
Professional SQL Server Analysis Services 2005 with MDX (Programmer to Programmer)
ISBN: 0764579185
EAN: 2147483647
Year: 2004
Pages: 176

Similar book on Amazon

flylib.com © 2008-2017.
If you may any questions please contact us: flylib@qtcs.net