Data mining techniques are specific implementations of algorithms used in data mining operations. The five most common data mining techniques are described briefly below.

Associations Discovery

This data mining technique is used to identify the behavior of specific events or processes. Associations discovery links occurrences within a single event. An example might be the discovery that men who purchase premium brands of coffee are three times more likely to buy imported cigars than men who buy standard brands of coffee. Associations discovery is based on rules that follow this general form: "If item A is part of an event, then X percent of the time (confidence factor), item B is part of the same event." For example:
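A rule of this form can be checked against a set of recorded events. The following is a minimal sketch in Python rather than a definitive method: the transaction data and the function name `confidence` are invented for illustration, and the confidence factor is taken to be the share of events containing item A that also contain item B.

```python
# Hypothetical market-basket data: each set is one event (a single purchase).
transactions = [
    {"premium coffee", "imported cigars"},
    {"premium coffee", "imported cigars", "milk"},
    {"premium coffee", "bread"},
    {"standard coffee", "bread"},
    {"standard coffee", "imported cigars"},
    {"milk", "bread"},
]


def confidence(transactions, item_a, item_b):
    """Share of events containing item_a that also contain item_b."""
    with_a = [t for t in transactions if item_a in t]
    if not with_a:
        # No event contains item_a, so the rule has no support at all.
        return 0.0
    with_both = [t for t in with_a if item_b in t]
    return len(with_both) / len(with_a)


rule_confidence = confidence(transactions, "premium coffee", "imported cigars")
print(
    f"If premium coffee is part of an event, then {rule_confidence:.0%} "
    "of the time imported cigars are part of the same event."
)
```

In this made-up data, three events contain premium coffee and two of those also contain imported cigars, so the confidence factor is two thirds.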
With the help of scanners, retailers use this data mining technique to find buying patterns in grocery stores. Because of this grocery store context, associations discovery is sometimes called market basket analysis.

Sequential Pattern Discovery

This data mining technique is similar to associations discovery, except that sequential pattern discovery links events over time and determines how items relate to each other over time. For example, sequential pattern discovery might predict that a person who buys a washing machine will also buy a clothes dryer within six months with a probability of 0.7. To raise the chances above the predicted 70 percent, the store may offer each buyer a 10 percent discount on a clothes dryer bought within four months of purchasing a washing machine.

Classification

The classification technique is the most common use of data mining. Classification looks at the behavior and attributes of predetermined groups. The groups might include frequent flyers, high spenders, loyal customers, people who respond to direct mail campaigns, or people with frequent back problems (e.g., people who drive long distances every day). The data mining tool can assign classifications to new data by examining existing data that has already been classified and using those results to infer a set of rules. The set of rules is then applied to any new data to be classified. This technique often uses supervised induction, which employs a small training set of already classified records to determine additional classifications. An example of this use is discovering the characteristics of customers who are (or are not) likely to buy a certain type of product. This knowledge can reduce the cost of promotions and direct mailings.

Clustering

The clustering technique is used to discover different groupings within the data.
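To make the idea of discovering groupings concrete, the sketch below applies k-means to made-up spending figures. K-means is one simple statistical clustering method, chosen here only for illustration; the text does not prescribe it. The function name `kmeans_1d`, the starting centers, and the data are all assumptions.

```python
def kmeans_1d(values, centers, iterations=10):
    """Group numbers around the nearest of the given starting centers."""
    centers = list(centers)
    clusters = []
    for _ in range(iterations):
        # Assignment step: put each value with its closest center.
        clusters = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            clusters[nearest].append(v)
        # Update step: move each center to the mean of its members.
        # An empty cluster keeps its previous center.
        new_centers = [
            sum(c) / len(c) if c else centers[i]
            for i, c in enumerate(clusters)
        ]
        if new_centers == centers:
            break  # Assignments have stabilized.
        centers = new_centers
    return clusters, centers


# Hypothetical monthly credit card spending for eight customers.
spending = [110, 95, 120, 105, 890, 910, 870, 930]
clusters, centers = kmeans_1d(spending, centers=[0, 1000])
print(clusters)
print(centers)
```

No group labels are supplied in advance: the two groups emerge only from how close the values are to one another.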
Clustering is similar to classification, except that no groups have been defined before the data mining tool is run. The clustering technique often uses neural networks or statistical methods. Clustering divides items into groups based on the similarities the data mining tool finds. Members within a cluster are very similar, but the clusters themselves are very dissimilar to one another. Clustering is used for problems such as detecting manufacturing defects or finding affinity groups for credit cards.

Forecasting

The forecasting data mining technique comes in two flavors: regression analysis and time sequence discovery.