6.4 Data Mining and SQL Server Analysis Services

When building a data-mining model, you first need to collect a set of training data based on accurate records of previous activities. For example, if you sell SQL Server training courses, you would collect the historical demographic data of course attendees and run it through your model to check that the results are as expected.
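As a rough illustration of checking a model against historical data, the Python sketch below holds back part of the history and measures how often a toy model reproduces the known outcomes. Everything in it is an assumption made for illustration: the records, the field names, the helper functions, and the deliberately simple "majority outcome per job role" model. It is not how Analysis Services trains or validates a model.

```python
import random

# Hypothetical historical records: (job_role, years_experience, attended_course)
HISTORY = [
    ("dba", 1, True), ("dba", 5, True), ("dba", 8, True), ("dba", 3, True),
    ("developer", 2, True), ("developer", 6, False), ("developer", 4, False),
    ("developer", 7, False), ("manager", 10, False), ("manager", 12, False),
    ("manager", 9, False), ("dba", 2, True),
]

def split_history(records, test_fraction=0.25, seed=0):
    """Shuffle the records and hold back a fraction for checking the model."""
    shuffled = records[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * (1 - test_fraction))
    return shuffled[:cut], shuffled[cut:]

def train_majority_by_role(training):
    """Toy 'model': for each job role, remember the most common outcome."""
    counts = {}
    for role, _years, attended in training:
        yes, no = counts.get(role, (0, 0))
        counts[role] = (yes + 1, no) if attended else (yes, no + 1)
    return {role: yes >= no for role, (yes, no) in counts.items()}

def accuracy(model, held_back, default=False):
    """Fraction of held-back records whose known outcome the model reproduces."""
    hits = sum(model.get(role, default) == attended
               for role, _years, attended in held_back)
    return hits / len(held_back)

training, held_back = split_history(HISTORY)
model = train_majority_by_role(training)
print(f"accuracy on held-back history: {accuracy(model, held_back):.2f}")
```

A low score on the held-back records suggests that the training data or the chosen attributes need another look before the model is trusted with new data.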

Once the training data has been assembled, the appropriate algorithm needs to be chosen. Analysis Services uses two data-mining algorithms, both of which are based on statistical theories that have been around for a number of years.

  1. Decision trees represent the data classification questions as nodes on a tree or a branch-like structure. The algorithm's predictions are derived from the training data set, which determines where each node is placed and how deep it sits in the structure.

  2. Clustering, or the expectation method, groups data into clusters or neighborhoods of similar, predictable characteristics. In many instances the clusters are counterintuitive and obscure, but that is the whole point of data mining!
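The two approaches in the list above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the Analysis Services implementation: the sample rows, attribute names, and function names are all hypothetical. The tree part uses information gain to pick the first question a tree would ask. The clustering part uses plain k-means on a single numeric attribute as a simple stand-in for the clustering method described above.

```python
from collections import Counter
from math import log2

# Hypothetical training rows: attributes plus the outcome we want to predict.
ROWS = [
    {"role": "dba", "certified": "yes", "attended": "yes"},
    {"role": "dba", "certified": "no", "attended": "yes"},
    {"role": "developer", "certified": "yes", "attended": "no"},
    {"role": "developer", "certified": "no", "attended": "no"},
    {"role": "manager", "certified": "yes", "attended": "no"},
    {"role": "dba", "certified": "yes", "attended": "yes"},
]

def entropy(rows, target):
    """Shannon entropy (in bits) of the target column."""
    counts = Counter(row[target] for row in rows)
    total = len(rows)
    return -sum((n / total) * log2(n / total) for n in counts.values())

def information_gain(rows, attribute, target):
    """Entropy removed by splitting the rows on the given attribute."""
    groups = {}
    for row in rows:
        groups.setdefault(row[attribute], []).append(row)
    remainder = sum(len(g) / len(rows) * entropy(g, target)
                    for g in groups.values())
    return entropy(rows, target) - remainder

def best_question(rows, attributes, target):
    """The attribute a tree would ask about first at this node."""
    return max(attributes, key=lambda a: information_gain(rows, a, target))

def k_means_1d(values, centers, rounds=10):
    """Plain k-means on one attribute: assign to nearest center, then re-average."""
    clusters = [[] for _ in centers]
    for _ in range(rounds):
        clusters = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            clusters[nearest].append(v)
        # Keep the old center if a cluster ends up empty.
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return centers, clusters

# With these rows, "role" separates attendees perfectly, so it is chosen first.
print(best_question(ROWS, ["role", "certified"], "attended"))

# Hypothetical attendee ages grouped around two starting guesses.
centers, clusters = k_means_1d([22, 24, 25, 41, 43, 45], [20.0, 50.0])
print(centers, clusters)
```

A full decision tree would repeat the `best_question` step on each resulting group until the groups are pure or too small to split, which is what gives each node its position and depth.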

The stored data-mining model is known as a model node, which contains detailed information, such as probabilities, attributes, and data descriptions. Analysis Services includes a model browsing tool, which presents the model content visually in a form most people can understand.

Some of the real power of data warehousing is now being uncovered, as more and more organizations build Web sites that track a user's site journey and purchasing habits. When installed, Commerce Server builds a comprehensive data warehouse on the underlying SQL Server, which in turn can be used to analyze a range of user activities. If you haven't been asked to build a data warehouse yet, chances are you will be as the demand for business intelligence increases.



Microsoft .NET: Jumpstart for Systems Administrators and Developers (Communications (Digital Press))
ISBN: 1555582850
Year: 2003
Pages: 136
Authors: Nigel Stanley
