11.8 Needles in Moving Haystacks


Whereas conventional data mining is sometimes described as finding a needle in a haystack, the EVS architecture would be designed to help analysts, investigators, inspectors, and other personnel to reassemble needles from pieces that have been deliberately hidden in many haystacks. Hence, the challenges for such an EVS would be as follows:

  1. It must be able to deal with patterns involving relations among multiple objects, such as people, places, and events, which are dynamic.

  2. It must be able to make inferences concerning organizations and activities and the association of individuals with them.

  3. The amount of criminal data is small; terrorist-related activity is even scarcer, making it difficult to induce predictive patterns of such activity and to profile these perpetrators. An ensemble mechanism, such as the one proposed in section 9.10, must be used.

  4. Effective learning and inference requires large amounts of general and domain-specific knowledge; because of this, experienced investigators and analysts need to guide the EVS.

  5. Much of the available knowledge concerns patterns of events in time and space, often with large gaps introduced intentionally to obscure clandestine activities and perpetrator identities. Again, the EVS must be able to deal with these challenges.

  6. Relevant data are drawn from many different sources, such as databases of sightings, financial transactions, phone calls, travel records, and news stories. Those sources contain many different types of items, such as numbers, text, photos, audio, and video feeds and have varying degrees of reliability, overlap, and correlation. A major task of the EVS is unifying this mixed media into a cohesive view of entity profiles.

  7. The available data represent only a fraction of what could be known; part of effective learning is reasoning about what additional data to request. This will require the direction of experienced analysts in guiding and training the EVS.

  8. The organizations and individuals that wish to avoid detection deliberately obscure patterns of behavior; this fact must be taken into consideration in the general scheme and design of the sources referenced by the EVS.

  9. A single missed terrorist event could have catastrophic consequences, as we now know after 9/11. The EVS must therefore have a low false-negative rate, that is, it must rarely miss a real event.

  10. Changes in individuals, organizations, technologies, and political events all produce changes in underlying behavior. The EVS must be adaptive to its environment.
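Item 9 turns on the difference between a missed event (a false negative) and a false alarm (a false positive). The following is a minimal sketch of how the two rates could be computed from labeled outcomes. The function name `error_rates` and the sample data are hypothetical and are not part of any EVS described here.

```python
def error_rates(actual, predicted):
    """Return (false_negative_rate, false_positive_rate).

    `actual` and `predicted` are equal-length sequences of booleans,
    where True means "this case is a real event" or "this case was flagged".
    """
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")

    # A false negative is a real event that was not flagged (a miss).
    false_negatives = sum(1 for a, p in zip(actual, predicted) if a and not p)
    # A false positive is a flagged case that was not a real event.
    false_positives = sum(1 for a, p in zip(actual, predicted) if p and not a)

    positives = sum(1 for a in actual if a)
    negatives = len(actual) - positives

    fnr = false_negatives / positives if positives else 0.0
    fpr = false_positives / negatives if negatives else 0.0
    return fnr, fpr


# Illustrative data only: 2 real events among 10 cases.
actual = [True, True, False, False, False, False, False, False, False, False]
predicted = [True, False, True, False, False, False, False, False, False, False]
print(error_rates(actual, predicted))
```

Because real events are so rare, a system can show a very low false-positive rate while still missing half of them, which is why the two rates need to be tracked separately.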

An EVS with these capabilities could greatly improve the detection of perpetrators involved in criminal and terrorist activities, such as money laundering, cybercrime, fraud, identity theft, and the acquisition of weapons of mass destruction. Conventional data mining techniques and applications, as we have discussed throughout the book, generally fall into two classes: anomaly detection, and pattern recognition or signature discovery. Put another way, one investigative data mining process clusters the data and searches for anomalies or outliers, while the other segments the data and classifies criminal signatures, such as fraud or misuse intrusions.

The first type of data mining, anomaly detection, identifies unusual events in a consistent stream of structured data. Such an event might be a sudden shift in telephone calling behavior or increased purchases with a credit card, either of which may indicate fraud or identity theft, or altered usage of a computer system or application, which may indicate an intrusion. In this context, a self-organizing map (SOM) neural network might be used to search for a hidden cluster in a very large data set.
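As a rough illustration of flagging a sudden shift in calling behavior, the sketch below compares each day's call count with a trailing baseline. It deliberately swaps the SOM for a much simpler technique, a rolling z-score, and the function name, window size, threshold, and call counts are all assumptions chosen for the example.

```python
from statistics import mean, stdev


def flag_anomalies(values, window=7, threshold=3.0):
    """Return indices whose value deviates from the mean of the preceding
    `window` values by more than `threshold` standard deviations."""
    flagged = []
    for i in range(window, len(values)):
        baseline = values[i - window:i]
        mu = mean(baseline)
        sigma = stdev(baseline)
        if sigma == 0:
            # Flat baseline: any change at all counts as a deviation.
            if values[i] != mu:
                flagged.append(i)
            continue
        if abs(values[i] - mu) / sigma > threshold:
            flagged.append(i)
    return flagged


# Hypothetical daily call counts for one subscriber; day 9 is a sudden spike.
daily_calls = [12, 15, 11, 14, 13, 12, 16, 14, 13, 95, 15, 12]
print(flag_anomalies(daily_calls))  # prints [9]
```

A trailing window is used so that each day is judged only against earlier behavior. One side effect is that a spike inflates the baseline for the following days, which a production detector would need to handle.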

The second class of data mining techniques is pattern discovery, which constructs models from structured data that can be used to infer unknown variables. For example, a discovered pattern might take the form of a set of IF/THEN rules that flag a questionable identification based on known patterns of prior deception. In this case, inferences are based on the modus operandi (MO) observed in individual instances of captured or killed perpetrators; for example:

      IF    DOB                 =   07/06/66
      AND   SSN issue date      =   02/01/01
      AND   # of Credit Cards   <   1
      THEN  EVS Score           =   87

In this situation, a high EVS score of 87 means that the identity of this individual is highly questionable in light of the missing information: no credit history and a Social Security number issued only recently to someone who was already well into adulthood.
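One way such a rule might be encoded and applied is sketched below. This is an illustration under stated assumptions, not the EVS implementation: the record layout, the `RULES` table, and the `evs_score` function are hypothetical, the dates are read as MM/DD/YY, and the credit-card condition is encoded as "fewer than one card", since a count cannot be negative.

```python
import operator
from datetime import date

# Comparison operators allowed in rule conditions.
OPS = {"==": operator.eq, "<": operator.lt, ">": operator.gt}

# Each rule is (conditions, score); a condition is (field, operator, value).
# The single rule here mirrors the example in the text.
RULES = [
    (
        [
            ("dob", "==", date(1966, 7, 6)),
            ("ssn_issue_date", "==", date(2001, 2, 1)),
            ("credit_cards", "<", 1),
        ],
        87,
    ),
]


def evs_score(record, rules=RULES, default=0):
    """Return the score of the first rule whose conditions all hold,
    or `default` when no rule matches."""
    for conditions, score in rules:
        if all(OPS[op](record[field], value) for field, op, value in conditions):
            return score
    return default


# Hypothetical record matching the rule above.
suspect = {
    "dob": date(1966, 7, 6),
    "ssn_issue_date": date(2001, 2, 1),
    "credit_cards": 0,
}
print(evs_score(suspect))  # prints 87
```

Keeping the rules as data rather than hard-coded `if` statements means that newly discovered patterns can be appended to the table without changing the scoring function.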




Investigative Data Mining for Security and Criminal Detection
ISBN: 0750676132
Year: 2005
Pages: 232
Authors: Jesus Mena
