1.16 The Future
The above is one of many case studies provided in subsequent chapters. Each shows how innovative investigators, criminalists, and analysts are using data mining technologies to detect and deter crime and terrorism. These case studies demonstrate firsthand how link analysis, software agents, text mining, neural networks, and machine learning are being used for everything from signature detection of illegal drugs to alerts of bioterrorist attacks. As we said at the beginning of the chapter, the world has changed and so have the weapons, expanding the application of AI technologies for detecting and deterring criminals.
In the aftermath of 9/11, the director of the FBI, Robert S. Mueller, acknowledged that the bureau might have prevented the attacks. "Putting all the pieces together, who is to say?" Mueller said, noting that warning signs amounted to "snippets in a veritable river of information." As part of a major reorganization, the director announced, "The Bureau needs to do a better job of analyzing data and put prevention ahead of all else." With that, the FBI adopted a new strategic focus and committed to a key near-term action: to "substantially enhance analytical capabilities with personnel and technology and expand the use of data mining, financial record analysis, and communications analysis to combat terrorism." The future, it appears, has arrived.
Chapter 2: Investigative Data Warehousing
2.1 Relevant Data
One of the most difficult and frustrating phases of data mining is getting access to the right data. In government, there are always interagency issues and agreements to be sorted out, not to mention formats that need to be reconciled, all of which require several meetings before arrangements can be made. In private industry, there are the issues of privacy and cost. These are some of the minor, but very real, obstacles that accompany most data mining projects. Of greater significance is the question of what data is required for the desired objective. In the aftermath of 9/11, however, a new sense of urgency has emerged: these obstacles pale in comparison to the consequences of failing to resolve data integration issues.
The value of any data mining model depends heavily on the quality of the data used to construct it. For this reason, it is critical at the start of the project to hold some creative discussions and to consider what data is available. Aside from the data that is internally available, thought should be given to what external data sources could provide valuable insight for the data mining analysis. In this chapter we will discuss the closed and open sources of data available both online and offline, and how to integrate and prepare the data prior to its analysis.
Data mining is about predicting behavior or profiling individuals; as such, it depends on access to timely and relevant information. Without it, the whole process is doomed to failure. For example, to construct an accurate link analysis chart of phone calls made by targeted suspects, it is critical to have access to the most current wireless toll records. Similarly, to construct predictive models for profiling fraudulent transactions or other criminal or terrorist activities, it is equally important to be able to build a centralized database, or to query multiple networks, with relevant and current data. A good fraud model, for example, requires an adequate sampling of all the types of illegal transactions that have been uncovered by, say, an insurance provider, an e-commerce site, or a wireless carrier.
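To make the point about current toll records concrete, the following is a minimal Python sketch of the first step behind a link analysis chart: counting calls between pairs of numbers and discarding records older than a cutoff date. The record layout, the phone numbers, the dates, and the function names `build_links` and `count_contacts` are all assumptions made for illustration; they do not come from any particular tool or carrier format.

```python
from collections import Counter
from datetime import date

# Hypothetical toll records: (caller, callee, call date). All values are invented.
TOLL_RECORDS = [
    ("555-0101", "555-0102", date(2002, 3, 1)),
    ("555-0101", "555-0102", date(2002, 3, 4)),
    ("555-0102", "555-0103", date(2002, 3, 5)),
    ("555-0101", "555-0104", date(2001, 1, 15)),  # stale record, before the cutoff
    ("555-0103", "555-0101", date(2002, 3, 6)),
]


def build_links(records, since):
    """Count calls between each unordered pair of numbers on or after `since`."""
    links = Counter()
    for caller, callee, when in records:
        if when >= since:
            # frozenset ignores call direction, so A->B and B->A share one link.
            links[frozenset((caller, callee))] += 1
    return links


def count_contacts(links):
    """Count the distinct numbers each phone number is linked to."""
    contacts = Counter()
    for pair in links:
        for number in pair:
            contacts[number] += 1
    return contacts


links = build_links(TOLL_RECORDS, since=date(2002, 1, 1))
contacts = count_contacts(links)

for pair, calls in links.items():
    print(sorted(pair), calls)
```

In this sketch the link weights (call counts) and the contact counts are the raw material a charting tool would draw as edges and nodes. Moving the `since` cutoff changes which links appear at all, which is why stale toll records can produce a misleading chart.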