1.4 Investigative Data Warehousing

Data warehousing is the practice of compiling transactional data with lifestyle demographics to construct composites of customers, and then decomposing those composites via segmentation reports and data mining techniques to extract profiles or "views" of who the customers are and what they value. Data warehouse techniques have been practiced for a decade in private industry. So far, these techniques have not been applied to criminal detection and security deterrence, but they well could be.

The same approach could draw on behavioral data from diverse sources: the Internet (clickstream data captured by mechanisms such as cookies, invisible graphics, and registration forms); demographics from data providers such as ChoicePoint, CACI, Experian, Acxiom, and DataQuick; and utility and telecom usage data. Coupled with criminal data, these sources could be used to construct composites representing views of perpetrators. Analyzing the similarities and traits in those composites through data mining could yield predictive models for investigators and analysts. As in private industry, better views of perpetrators could be developed, enabling the detection and prevention of criminal and terrorist activity.
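The idea of building composites and then decomposing them into segments can be illustrated with a minimal sketch. The record layouts, field names (`person_id`, `amount`, `age_band`, `region`), and values below are hypothetical and invented purely for illustration; a real warehouse would involve far more sources and careful identity matching.

```python
from collections import defaultdict

# Hypothetical sample data; every field name and value here is an assumption.
transactions = [
    {"person_id": "P1", "amount": 9500.0},
    {"person_id": "P1", "amount": 9800.0},
    {"person_id": "P2", "amount": 120.0},
]
demographics = {
    "P1": {"age_band": "30-39", "region": "West"},
    "P2": {"age_band": "50-59", "region": "East"},
}


def build_composites(transactions, demographics):
    """Merge per-person transaction totals with demographic attributes."""
    totals = defaultdict(lambda: {"txn_count": 0, "txn_total": 0.0})
    for t in transactions:
        entry = totals[t["person_id"]]
        entry["txn_count"] += 1
        entry["txn_total"] += t["amount"]

    composites = {}
    for pid in sorted(set(totals) | set(demographics)):
        # Start from zeroed transaction fields so every composite has them.
        composite = {"txn_count": 0, "txn_total": 0.0}
        composite.update(totals.get(pid, {}))
        composite.update(demographics.get(pid, {}))
        composites[pid] = composite
    return composites


def segment(composites, key):
    """Group person ids by one attribute: a very simple segmentation report."""
    groups = defaultdict(list)
    for pid, composite in composites.items():
        groups[composite.get(key, "unknown")].append(pid)
    return dict(groups)


composites = build_composites(transactions, demographics)
by_region = segment(composites, "region")
```

Here `composites` plays the role of the combined view of each individual, and `segment` decomposes those views by a chosen attribute. A predictive model would be trained on such composites in a later step, which this sketch does not attempt.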

1.5 Link Analysis

Effectively combining multiple sources of data can lead law enforcement investigators to discover patterns that help them be proactive in their investigations. Link analysis is a good start in mapping terrorist activity and criminal intelligence by visualizing associations between entities and events. Link analysis often involves viewing, on a chart or a map, the associations between suspects and locations, whether by physical contacts or communications in a network, through phone calls or financial transactions, or via the Internet and e-mail. Criminal investigators often use link analysis to begin to answer such questions as "who knew whom, and when and where have they been in contact?"

Intelligence analysts and criminal investigators must often correlate enormous amounts of data about individuals in fraudulent, political, terrorist, narcotics, and other criminal organizations. A critical first step in the mining of this data is viewing it in terms of relationships between people and organizations under investigation. One of the first tasks in data mining and criminal detection involves the visualization of these associations, which commonly involves the use of link-analysis charts (Figure 1.1).

Figure 1.1: A link analysis can organize views of criminal associations.

The U.S. Department of the Treasury's Financial Crimes Enforcement Network (FinCEN) has used link-analysis technology to identify and track money-laundering transactions. Link analysis often explores associations among large numbers of objects of different types. For example, an antiterrorist application might examine relationships among suspects, including their home addresses, hotels they stayed in, wire transfers they received and sent, truck or flight schools attended, and the telephone numbers that they called during a specified period. The ability of link analysis to represent relationships and associations among objects of different types has proven crucial in helping human investigators comprehend complex webs of evidence and draw conclusions that are not apparent from any single piece of information.
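A link-analysis chart is, underneath, a graph whose nodes are entities of different types and whose edges are observed associations. The following is a minimal sketch of that structure, assuming each entity is represented as a `(type, name)` pair. The suspects, phone number, hotel, and account shown are invented for illustration, and the helper names are hypothetical rather than taken from any particular link-analysis product.

```python
from collections import defaultdict, deque

# Hypothetical associations between entities of different types.
links = [
    (("suspect", "A"), ("phone", "555-0101")),
    (("suspect", "B"), ("phone", "555-0101")),
    (("suspect", "B"), ("hotel", "Harbor Inn")),
    (("suspect", "C"), ("hotel", "Harbor Inn")),
    (("suspect", "C"), ("account", "ACCT-9")),
]


def build_graph(links):
    """Build an undirected adjacency map from pairs of linked entities."""
    graph = defaultdict(set)
    for a, b in links:
        graph[a].add(b)
        graph[b].add(a)
    return graph


def shared_entities(graph, x, y):
    """Return the entities directly linked to both x and y."""
    return graph.get(x, set()) & graph.get(y, set())


def shortest_path(graph, start, goal):
    """Breadth-first search for the shortest chain of associations."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for neighbor in sorted(graph.get(node, ())):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(path + [neighbor])
    return None  # no chain of associations connects the two entities


graph = build_graph(links)
common = shared_entities(graph, ("suspect", "A"), ("suspect", "B"))
chain = shortest_path(graph, ("suspect", "A"), ("suspect", "C"))
```

In this sample, suspects A and C have no direct contact, yet the search finds a chain running through a shared telephone number and a shared hotel. That is the kind of connection that is not apparent from any single record.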

1.6 Software Agents

Another AI technology that can be deployed to combat crime and terrorism is the intelligent agent, used for such tasks as information retrieval, monitoring, and reporting. An agent is a software program that performs user-delegated tasks autonomously; for example, an agent can be set up to retrieve information on individuals or companies via the Web or proprietary secured networks. An agent can be assigned tasks such as compiling a dossier, interpreting its findings, and, following instruction, acting on those findings by issuing predetermined alerts. For example, agent technology is increasingly being used in the area of intrusion detection, for monitoring systems and networks and deterring hacker attacks. An agent has three basic abilities:

  1. Performing tasks: information retrieval, filtering, monitoring, and reporting.

  2. Knowledge: the use of programmed rules, or the ability to learn new rules and evolve.

  3. Communication skills: the ability to report to humans and interact with other agents.
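These three abilities can be sketched as a small class. This is only an illustration under stated assumptions: the `Agent` class, its method names, the rule labels, and the sample records are all hypothetical, and "learning" is reduced here to adding a rule at run time rather than any real machine-learning method.

```python
class Agent:
    """A toy agent with tasks, knowledge (rules), and communication."""

    def __init__(self, name, rules, report):
        self.name = name
        # Knowledge: a list of (label, predicate) pairs.
        self.rules = list(rules)
        # Communication: any callable that accepts an alert, such as a
        # function that notifies a human or hands off to another agent.
        self.report = report

    def learn(self, label, predicate):
        """Knowledge: add a new rule to the agent's rule set."""
        self.rules.append((label, predicate))

    def run(self, records):
        """Task: filter records against the rules and report each match."""
        alerts = []
        for record in records:
            for label, predicate in self.rules:
                if predicate(record):
                    alert = {"agent": self.name, "rule": label, "record": record}
                    self.report(alert)
                    alerts.append(alert)
        return alerts


# Hypothetical usage: alerts are delivered to a plain list standing in
# for a human analyst's inbox.
inbox = []
agent = Agent(
    "watcher",
    [("large_transfer", lambda r: r["amount"] > 10000)],
    inbox.append,
)
agent.learn("flagged_country", lambda r: r["country"] == "XX")
alerts = agent.run([
    {"amount": 15000, "country": "US"},
    {"amount": 200, "country": "XX"},
    {"amount": 50, "country": "US"},
])
```

Passing the reporting channel in as a callable keeps the agent independent of how alerts are delivered, so the same agent could report to a person, a log, or another agent.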

Over the past few years, agents have emerged as a new paradigm: they are in part distributed systems, autonomous programs, and artificial life. The concept of agents is an outgrowth of years of research in the fields of AI and robotics. They represent the concepts of reasoning, knowledge representation, and autonomous learning. Agents are automated programs and provide tools for integration across multiple applications and databases running across open and closed networks. They are a means of managing the retrieval, dissemination, and filtering of information, especially from the Internet.

Agents represent a new type of computing system and are one of the more recent developments in the field of AI. They can monitor an environment and issue alerts or go into action, all based on how they are programmed. For the investigative data miner, they can serve the function of software detectives, monitoring, shadowing, recognizing, and retrieving information on suspects for analysis and case development (Figure 1.2).

Figure 1.2: Software agents can autonomously monitor events.

Intelligent agents can be used in conjunction with other data mining technologies. For example, an agent could monitor and look for hidden relationships between different events and their associated actions and, at a predefined time, send data to an inference system, such as a neural network or machine-learning algorithm, for analysis and action. Some agents use sensors that can read identity badges and detect the arrival and departure of users on a network, based on observed user actions and on the duration and frequency of use of certain applications or files. Another agent component, called an actor, can create a profile and can also query a remote database to confirm access clearance. These sensor and actor mechanisms can be used over the Internet or other networks to monitor individuals and report on their activities to other data mining models, which can issue alerts to security, law enforcement, and other regulatory personnel.
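The division of labor between sensors and actors might be sketched as follows. This is a simplified illustration, not a description of any particular agent product: the event fields (`user`, `app`, `seconds`) are assumptions, the profile is assumed to summarize frequency and duration of use, and an in-memory dictionary stands in for the remote clearance database.

```python
from collections import defaultdict


def build_profile(events):
    """Summarize frequency and total duration of use per (user, application)."""
    profile = defaultdict(lambda: {"count": 0, "seconds": 0})
    for event in events:
        key = (event["user"], event["app"])
        profile[key]["count"] += 1
        profile[key]["seconds"] += event["seconds"]
    return dict(profile)


def check_clearance(profile, clearance_db):
    """Return the (user, application) pairs that lack recorded clearance.

    clearance_db is a plain dict standing in for a remote database query.
    """
    return sorted(
        (user, app)
        for (user, app) in profile
        if app not in clearance_db.get(user, set())
    )


# Hypothetical events as a sensor might observe them.
events = [
    {"user": "u1", "app": "payroll", "seconds": 300},
    {"user": "u1", "app": "payroll", "seconds": 200},
    {"user": "u2", "app": "payroll", "seconds": 50},
]
clearance_db = {"u1": {"payroll"}, "u2": {"email"}}

profile = build_profile(events)
violations = check_clearance(profile, clearance_db)
```

In a fuller system, the profile and the list of violations would be passed on to an inference model or alerting component. That hand-off is left out here to keep the sketch short.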