Data Mining Uses and Suggested Guidelines

The ceiling frontier model seeks to explain best performance as a function of one or more independent variables. Such models may be especially attractive for many data mining (DM) applications, where interest often centers on the best instances, such as the customers most responsive to mailings or the safest drivers. For mailings of a given type, it would be desirable to predict a ranking of the most responsive customers so that efforts can be directed where they pay off most. If, say, only 1,000 pieces are to be mailed, then the customers predicted to rank in the top 1,000 would be the natural candidates. Similarly, an insurer may be interested in characterizing its best and worst customers according to a model.
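The idea of a ceiling fit and a top-k ranking can be sketched in a few lines. The text does not give an estimation method here, so the sketch below assumes one common formulation: among all straight lines that lie on or above every observation, pick the one with the smallest total vertical gap to the data. For a single predictor, the best such line passes through two data points, so a small sample can be solved by checking every pair. The function names, the variable names, and all data values are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def fit_ceiling_line(xs, ys, tol=1e-9):
    """Return (intercept, slope) of the line that lies on or above every
    point and has the smallest total vertical gap to the data.

    Assumed formulation, not taken from the source text.
    """
    best = None
    for i, j in combinations(range(len(xs)), 2):
        if xs[i] == xs[j]:
            continue  # two points with the same x define no usable line
        slope = (ys[j] - ys[i]) / (xs[j] - xs[i])
        intercept = ys[i] - slope * xs[i]
        gaps = [intercept + slope * x - y for x, y in zip(xs, ys)]
        if min(gaps) < -tol:
            continue  # some observation lies above this candidate line
        total = sum(gaps)
        if best is None or total < best[0]:
            best = (total, intercept, slope)
    if best is None:
        raise ValueError("need at least two distinct x values")
    return best[1], best[2]


def top_k_by_ceiling(candidates, intercept, slope, k):
    """Rank candidate names by predicted ceiling value, best first."""
    ranked = sorted(
        candidates,
        key=lambda name: intercept + slope * candidates[name],
        reverse=True,
    )
    return ranked[:k]


# Hypothetical data: prior orders (x) and response rate in percent (y).
prior_orders = [1, 2, 3, 4, 5, 6]
response_rates = [2, 5, 4, 8, 6, 11]

intercept, slope = fit_ceiling_line(prior_orders, response_rates)
print("ceiling line:", intercept, slope)

# Hypothetical prospects, keyed by name, with their prior order counts.
prospects = {"A": 2, "B": 7, "C": 4, "D": 5}
top_two = top_k_by_ceiling(prospects, intercept, slope, k=2)
print("mail first to:", top_two)
```

With these made-up numbers the fitted ceiling passes through the observations (2, 5), (4, 8), and (6, 11), and every other observation lies below it. Pair enumeration is only practical for small samples with one predictor; with several predictors the same formulation would normally be handed to a linear programming solver.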

Here we briefly propose several potential DM and related applications:

  • Supplier Ranking: In supply chain management, firms must often evaluate and choose among potential suppliers. While cost is an important variable, many others may need to be considered, including lead time performance, quality measures, and capacity and flexibility measures. Generally, it will be desirable to identify and characterize the best or highest-performing suppliers among the candidates.

  • Technology Choice: When choosing industrial robots, instrumentation, and similar technology, it is often possible to test several candidates and collect data on them. The firm will be interested in the attributes of the best performers when making the selection decision.

  • Total Quality Management: Every year, in the area of total quality management, the Malcolm Baldrige National Quality Award is given to a small group of firms. The guidelines for winning this prestigious award may not be clearly spelled out. One would be interested in the dimensions along which the award winners differ from the "average" firm.

  • Marketing: In marketing, the 20/80 rule, sometimes called Pareto's law or the Pareto principle, is appropriate for modeling the usage rate of heavy users. Typically, the top 20% of consumers in the marketplace (the heavy users) account for roughly 80% of total revenue. Thus, it can be quite misleading for a firm to base its marketing strategy on the average purchase behavior of its consumers. In most cases, a firm will instead want to develop a ceiling model for its "heavy" user group. Likewise, a "floor" model would be appropriate for nonusers: consumers the company has no hope of winning, or perhaps those it does not wish to have, such as high-risk drivers.

  • Airline Productivity: Efficiency frontier models have been used to examine productivity in the airline industry. A Cobb-Douglas total cost function is used to estimate the efficiency frontier. The dependent variable is an airline's total cost. The independent variables are: (a) passenger output (number of passengers times distance traveled); (b) labor cost; (c) fuel cost; and (d) capital cost. Airlines that lie on the efficient frontier are presumably the most efficient producers. An inefficiency measure for any airline can be constructed as the distance between that airline's observed cost and the frontier.

  • Comparison of Stocks and Mutual Funds: Investors are interested in more than the average performance of the stock market. A ceiling model corresponds to the top-performing stocks, while a floor model provides insight into the "poor" performers, including firms that may potentially declare bankruptcy. Similarly, investors would like to choose between mutual funds using deeper information than a simple comparison of recent performance.

  • Employee Loyalty: Employee turnover means the loss of trained employees, and it is especially costly when the losses come early. Data mining could be applied to human resources data marts with the help of floor frontier regression models. Which available attributes best explain the earliest termination cases? Such information could be used to score potential new hires on their likelihood of terminating early. Conversely, the most loyal employees might be modeled with a ceiling-type model.
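The airline item above measures inefficiency as distance from a cost frontier, and a rough sketch may make that concrete. The version below is deliberately simplified and rests on stated assumptions: it keeps only passenger output as a regressor (the model described in the text also includes labor, fuel, and capital costs), and it uses corrected ordinary least squares (COLS) as a stand-in estimator, since the text does not say how the frontier was fitted. COLS fits a log-linear Cobb-Douglas form by least squares and then shifts the intercept down until no airline lies below the line. The function name and all figures are hypothetical.

```python
import math


def cols_cost_frontier(outputs, costs):
    """Corrected OLS in logs for a one-regressor Cobb-Douglas cost function.

    Fits ln(cost) = a + b * ln(output) by least squares, then lowers the
    intercept so that the line becomes a floor under every observation.
    Returns (frontier_intercept, slope, cost_ratios).
    """
    lx = [math.log(q) for q in outputs]
    ly = [math.log(c) for c in costs]
    n = len(lx)
    mean_x = sum(lx) / n
    mean_y = sum(ly) / n
    sxx = sum((x - mean_x) ** 2 for x in lx)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(lx, ly))
    b = sxy / sxx
    a = mean_y - b * mean_x
    residuals = [y - (a + b * x) for x, y in zip(lx, ly)]
    # Shift the fitted line down by the most negative residual.
    a_frontier = a + min(residuals)
    # Observed cost divided by frontier cost; 1.0 means "on the frontier".
    ratios = [math.exp(y - (a_frontier + b * x)) for x, y in zip(lx, ly)]
    return a_frontier, b, ratios


# Hypothetical airlines: passenger output and total cost, arbitrary units.
airline_output = [120.0, 80.0, 200.0, 150.0, 60.0]
airline_cost = [900.0, 700.0, 1300.0, 1250.0, 520.0]

a_frontier, b, ratios = cols_cost_frontier(airline_output, airline_cost)
for q, c, r in zip(airline_output, airline_cost, ratios):
    print(f"output={q:6.1f} cost={c:7.1f} cost/frontier={r:.3f}")
```

A ratio of 1.0 marks an airline on the estimated frontier, and larger ratios indicate more cost relative to the frontier at the same output. Because this is a cost function, the frontier here is a floor; the same shift applied upward would give a ceiling.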

By considering the general features of the above examples, we propose the following guidelines for when to consider frontier models instead of, or in conjunction with, regression in data mining applications:

  1. There is interest in characterizing and modeling the best and/or worst cases in the data.

  2. The behaviors of both customers and the businesses that serve them are of the managed kind. In general, such "managed data," meaning data arising from purposeful or goal-directed behavior, will be amenable to frontier modeling.

  3. Some loss of inferential capability can be tolerated (see limitations below).

  4. High-lier data (for a ceiling model) and low-lier data (for a floor model) can easily be identified and/or adjusted.

Managing Data Mining Technologies in Organizations: Techniques and Applications
ISBN: 1591400570
Year: 2003
Pages: 174