Chapter 10: Incorporating Data Mining

We dig up diamonds by the score
A thousand rubies, sometimes more

From Snow White by Walt Disney Company, music by Frank Churchill, words by Larry Morey, 1938

Overview

Data mining is not a single topic; it's a loosely related collection of tools, algorithms, techniques, and processes. This makes it a difficult subject area to tackle, especially in a single chapter. However, we must tackle it for two main reasons: first, data mining offers the potential of huge business impact; and second, SQL Server 2005 includes a suite of data mining tools as part of the product. In short, high value, low cost: the motivation is obvious.

The first part of this chapter sets the context for data mining. We begin with a brief definition of data mining and an overview of the business motivation for using it. We then look at the Microsoft data mining architecture and environment provided as part of SQL Server 2005, including a brief description of the data mining service, the algorithms provided, and the kinds of problems for which they might be appropriate. We next present a high-level data mining process, which breaks into three phases: business, mining, and operations. The business phase involves identifying business opportunities and understanding the data resources. The mining phase is a highly iterative and exploratory process whose goal is to identify the best model possible, given the time and resource constraints. In the operations phase, you implement that model in a production form where, you hope, it will provide the intended business value.

The second part of the chapter puts these concepts and processes into practice by demonstrating the application of SQL Server 2005 data mining in two examples. The first example creates clusters of cities based on economic data, and the second creates a model to recommend products for the Adventure Works Cycles web site.

By the end of this chapter, you should have a good sense of what data mining is about, what the SQL Server 2005 data mining toolset includes, and what the high-level data mining process looks like. Although this is not a tutorial, you should also end up with a basic idea of how to use the SQL Server 2005 data mining toolset.



The Microsoft Data Warehouse Toolkit: With SQL Server 2005 and the Microsoft Business Intelligence Toolset
ISBN: B000YIVXC2
Year: 2006
Pages: 125
