1.3 Data Mining


Data mining is the fusion of statistical modeling, database storage, and AI technologies. Statisticians have been using computers for decades as a means to prove or disprove hypotheses on collected data. In fact, one of the largest software companies in the world, SAS, "rents" its statistical programs to nearly every government agency and major corporation in the United States. Linear regression and other types of modeling analysis are common and have been used in everything from the drug approval process of the Food and Drug Administration to the credit rating of individuals by financial service providers.
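To make the idea of a linear regression concrete, the following is a minimal sketch of an ordinary least squares fit of a straight line, written in plain Python. The function name fit_line and the data (years of credit history against a made-up credit score) are hypothetical and chosen only for illustration; they do not come from any real scoring system or from any statistical package mentioned above.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")
    x_bar, y_bar = mean(xs), mean(ys)
    # Sum of squared deviations of x, and of cross-deviations of x and y.
    sxx = sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


# Hypothetical data: years of credit history vs. an invented credit score.
years = [1, 2, 3, 4, 5]
scores = [580, 610, 640, 670, 700]

slope, intercept = fit_line(years, scores)
predicted_for_six_years = intercept + slope * 6
```

With a fitted slope and intercept, a score for an unseen value is estimated by evaluating the line, as in the last statement. Real credit models use many more attributes and considerably more careful validation than this sketch suggests.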

Another element in the development of data mining is the increasing capacity for data storage. In the 1970s, most data storage depended upon COBOL programs and storage systems not conducive to easy data extraction for inductive data analysis. Today, however, organizations can store and query terabytes of information in sophisticated data warehouse systems. In addition, the development of multidimensional data models, such as those used in a relational database, has allowed users to move from a transactional view of customers to a more dynamic and analytical way of marketing to and retaining their most profitable clients.

The final element in data mining's evolution is AI. During the 1980s, machine-learning algorithms were designed to enable software to learn; genetic algorithms were designed to evolve and improve autonomously; and, of course, during that decade, neural networks came into acceptance as powerful programs for classification, prediction, and profiling. During the past decade, intelligent agents were developed that were able to incorporate all of these AI functions autonomously and use them to go out over networks and the Internet to scour the planet for the information their masters programmed them to retrieve. When combined, these AI technologies enable the creation of applications designed to listen, learn, act, evolve, and identify anything from a potentially fraudulent credit card transaction to tanks in satellite imagery, and, of course, now more than ever, to prevent potential criminal activity.

As a result of these developments, data mining flowered during the late 1990s, with many commercial, medical, marketing, and manufacturing applications. Retail companies eagerly applied complex analytical capabilities to their data to increase their customer base. The financial community looked for trends and patterns to predict fluctuations in stock prices and economic demand. Credit card companies used data mining to target their offerings, microsegmenting their customers and prospects and adjusting interest rates to maximize their profits. Telecommunication carriers used the technology to develop "churn" models to predict which customers were about to jump ship and sign with one of their wireless competitors.
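As an illustration of what a "churn" model can look like, here is a minimal sketch that trains a logistic regression by batch gradient descent in plain Python. Everything in it is an assumption made for the example: the two attributes (support calls last month, and tenure in years), the six invented customers, the learning rate, and the function names train_logistic and churn_probability. It is one simple way to build such a model, not the method any particular carrier used.

```python
import math


def sigmoid(z):
    """Map a real-valued score to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def churn_probability(weights, bias, x):
    """Estimated probability that a customer with attributes x churns."""
    return sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))


def train_logistic(rows, labels, lr=0.1, epochs=2000):
    """Fit logistic regression weights with batch gradient descent."""
    n = len(rows)
    weights = [0.0] * len(rows[0])
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for x, y in zip(rows, labels):
            # Prediction error for this customer drives the gradient.
            err = churn_probability(weights, bias, x) - y
            for j, xi in enumerate(x):
                grad_w[j] += err * xi
            grad_b += err
        # Step against the average gradient of the log loss.
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


# Hypothetical customers: [support calls last month, tenure in years].
customers = [[5, 0.5], [4, 1.0], [6, 0.3], [0, 3.0], [1, 4.0], [0, 2.5]]
churned = [1, 1, 1, 0, 0, 0]  # 1 = left for a competitor, 0 = stayed

weights, bias = train_logistic(customers, churned)
at_risk = churn_probability(weights, bias, [6, 0.2])  # many calls, new customer
loyal = churn_probability(weights, bias, [0, 5.0])    # no calls, long tenure
```

A carrier would score its whole customer base this way and direct retention offers at the customers with the highest estimated probabilities. A production model would draw on far more attributes and would be evaluated on customers held out from training.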

The ultimate goal of data mining is the prediction of human behavior, which is by far its most common business application; however, this goal can readily be adapted to the objective of detecting and deterring criminals. These and many other applications have demonstrated that, rather than requiring a human to attempt to deal with hundreds of descriptive attributes, data mining allows the automatic analysis of databases and the recognition of important trends and behavioral patterns.

Increasingly, crime and terror in our world will be digital in nature. In fact, one of the largest criminal monitoring and detection enterprises in the world is at this very moment using a neural network to look for fraud: the HNC Falcon system uses, in part, a neural network to look for patterns of potential fraud in about 80% of all credit card transactions, around the clock. Likewise, analysts and investigators will come to rely on machines and AI to detect and deter crime and terrorism. Breakthrough applications are already emerging in which neural networks are used for forensic analysis of chemical compounds to detect arson and illegal drug manufacturing. Coupled with agent technology, sensors can be deployed to detect bioterrorism attacks. The Defense Advanced Research Projects Agency (DARPA) has already solicited a prototype for such a system.