9.10 Ensemble Mechanisms for Crime Detection



The optimal approach to crime detection avoids a rigid system built on fixed rules or thresholds that knowledgeable perpetrators can easily evade. Nor should the detection system rest on a single algorithm or technique; it should instead follow the paradigm of a committee of models. The methodology should employ an ensemble of techniques and models, each providing a vote of confidence regarding the legitimacy of a transaction, whether it is a credit-card purchase, an insurance claim, a phone call, or any other type of event an investigator or analyst is examining. The following are some essential steps to consider in this process.

Random Sampling

Random samples of criminal evidence should be collected and used to construct multiple models. That is, fraudulent transactions must be included in the analysis, with an adequate sample for each type of crime to be detected. For example, to construct a model to detect credit-card theft on an e-commerce site, criminal transactions should be sampled at random from different product lines and, if possible, different times of day. The models also require samples of legal transactions so that they can discriminate them from the criminal ones. This is especially important when a neural network is one of the components in the ensemble.
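The sampling step can be sketched in a few lines of Python. This is only an illustration: the record fields (`id`, `product_line`, `hour`), the group size of five, and the helper name `stratified_sample` are assumptions, not part of the original text.

```python
import random
from collections import defaultdict


def stratified_sample(transactions, key, per_group, seed=0):
    """Draw up to `per_group` random records from each group defined by `key`."""
    rng = random.Random(seed)  # fixed seed so the draw is repeatable
    groups = defaultdict(list)
    for tx in transactions:
        groups[key(tx)].append(tx)
    sample = []
    for name in sorted(groups):
        members = groups[name]
        # Never ask for more records than the group actually holds.
        sample.extend(rng.sample(members, min(per_group, len(members))))
    return sample


# Hypothetical fraud records: 30 for each of three product lines.
fraud = [
    {"id": i, "product_line": line, "hour": i % 24}
    for i, line in enumerate(["books", "electronics", "apparel"] * 30)
]
picked = stratified_sample(fraud, key=lambda tx: tx["product_line"], per_group=5)
```

Passing a different `key`, for example one that buckets `hour` into parts of the day, would spread the sample across times of day instead.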

Balance the Data

Another important step is to have an adequate sample of every type of criminal transaction, along with a matching number of legal ones. Again, this is especially important when working with neural networks, as they are essentially software memories that need an adequate number of observations to recognize the phenomenon (crime) the models are attempting to detect.
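One simple way to obtain matching class sizes is to undersample the larger class. The sketch below assumes that approach; the function name and the toy record lists are hypothetical, and other balancing strategies are equally possible.

```python
import random


def balance_by_undersampling(fraud, legal, seed=0):
    """Return shuffled (record, label) pairs with equal fraud and legal counts."""
    rng = random.Random(seed)
    n = min(len(fraud), len(legal))  # size of the smaller class
    balanced = [(tx, 1) for tx in rng.sample(fraud, n)]   # label 1 = fraud
    balanced += [(tx, 0) for tx in rng.sample(legal, n)]  # label 0 = legal
    rng.shuffle(balanced)  # avoid presenting one class first during training
    return balanced


# Hypothetical data: 40 fraud records against 10,000 legal ones.
fraud_ids = list(range(40))
legal_ids = list(range(1000, 11000))
balanced = balance_by_undersampling(fraud_ids, legal_ids)
```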

Split the Data

It is standard practice in data mining to split the data into at least two parts: a training data set for constructing predictive models and a testing data set for evaluating the prediction and error rates of the models. However, because samples in criminal detection, such as those of fraudulent transactions, tend to be very small, an extra step should be taken and the data split into three segments: training, testing, and validating data sets.
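A minimal sketch of the three-way split follows. The 60/20/20 proportions and the function name are assumptions chosen for illustration; the text does not prescribe segment sizes.

```python
import random


def three_way_split(records, fractions=(0.6, 0.2, 0.2), seed=0):
    """Shuffle a copy of `records` and cut it into training, testing, validating."""
    rng = random.Random(seed)
    shuffled = records[:]
    rng.shuffle(shuffled)
    n_train = int(len(shuffled) * fractions[0])
    n_test = int(len(shuffled) * fractions[1])
    train = shuffled[:n_train]
    test = shuffled[n_train:n_train + n_test]
    validate = shuffled[n_train + n_test:]  # whatever remains
    return train, test, validate


records = list(range(100))  # stand-in for 100 transaction records
train, test, validate = three_way_split(records)
```

With very few fraud cases, it may be worth splitting the fraud and legal records separately so that every segment receives some of each.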

Rotate the Data

As an extra precaution, aside from splitting the data into three segments for training, testing and validating, they should also be rotated; again, this is because of the relatively small samples of criminal transactions. The rotation can be as follows:

Training     Data set 1   Data set 2   Data set 3   Results A
Testing      Data set 2   Data set 3   Data set 1   Results B
Validating   Data set 3   Data set 1   Data set 2   Results C
Results A    Results B    Results C    Solution Z

In this rotation scheme, the objective is to arrive at an average optimum of Solution Z by averaging the results of A, B, and C.
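The rotation can be sketched as follows, assuming each rotation assigns the three data sets to the training, testing, and validating roles in turn and that the per-rotation results are simple accuracy figures. The toy threshold model, the `(amount, label)` records, and all function names are hypothetical.

```python
def rotate_roles(segments):
    """Yield (training, testing, validating) for each rotation of the segments."""
    n = len(segments)
    for shift in range(n):
        yield tuple(segments[(shift + i) % n] for i in range(n))


def accuracy(threshold, records):
    """Share of records where 'amount >= threshold' agrees with the fraud label."""
    hits = sum((amount >= threshold) == bool(label) for amount, label in records)
    return hits / len(records)


def fit_threshold(train, test):
    """Toy model: take candidate cut-offs from training, keep the best on testing."""
    candidates = sorted(amount for amount, _ in train)
    return max(candidates, key=lambda c: accuracy(c, test))


# Three small hypothetical data sets of (amount, label) pairs; label 1 = fraud.
segments = [
    [(900, 1), (120, 0), (80, 0), (700, 1)],
    [(850, 1), (60, 0), (200, 0), (950, 1)],
    [(100, 0), (640, 1), (90, 0), (880, 1)],
]

results = []  # Results A, B, and C
for train, test, validate in rotate_roles(segments):
    model = fit_threshold(train, test)
    results.append(accuracy(model, validate))

solution_z = sum(results) / len(results)  # average of A, B, and C
```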

Evaluate Multiple Models

The best way to ensure the strongest possible detection system is to build and compare multiple models created with decision trees, rule induction generators, and neural networks. If time permits, multiple neural network architectures should be tried and tested to optimize their performance. In addition, the neural networks should, if possible, be optimized with a genetic algorithm component, which some data mining tools provide.
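Comparing candidate models on the same held-out records might look like the sketch below. The two rule-style models stand in for real decision trees, rule generators, or neural networks, and the ranking score (detection rate minus false-alarm rate) is an assumption chosen for illustration.

```python
def detection_rates(model, records):
    """Return (share of crimes caught, share of legal records wrongly flagged)."""
    crimes = [r for r in records if r[2] == 1]
    legal = [r for r in records if r[2] == 0]
    caught = sum(1 for r in crimes if model(r))
    false_alarms = sum(1 for r in legal if model(r))
    return caught / len(crimes), false_alarms / len(legal)


# Hypothetical testing records: (amount, hour of day, label), label 1 = crime.
testing = [
    (900, 3, 1), (750, 14, 1), (820, 2, 1),
    (40, 4, 0), (60, 13, 0), (600, 15, 0), (30, 10, 0), (90, 20, 0),
]

# Stand-ins for trained models; each returns True when it would raise an alert.
candidates = {
    "amount_rule": lambda r: r[0] > 500,
    "night_rule": lambda r: r[1] < 6,
}

rates = {name: detection_rates(model, testing) for name, model in candidates.items()}
best = max(rates, key=lambda name: rates[name][0] - rates[name][1])
```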

Combine Models

After evaluating and comparing individual models, create an ensemble:

Decision tree    Data set 1   Data set 2   Data set 3   Results A
Rule generator   Data set 2   Data set 3   Data set 1   Results B
Neural network   Data set 3   Data set 1   Data set 2   Results C
Results A        Results B    Results C    Solution Z

The model ensemble combines the results from the best-performing decision tree, rule generator, and neural network to optimize detection capability. Combining the prediction results from multiple models, or from the same type of model trained on different samples of the data, is closely related to bagging (bootstrap aggregating), in which the component models are trained on random resamples of the data and their predictions are averaged or put to a vote. It addresses the inherent instability of results when complex models are applied to relatively small data sets. Another technique, called boosting, builds models in sequence and gives greater weight to the observations that earlier models found difficult to classify (often the fraud transactions) and lower weight to those that are easy to classify (often the legal transactions); the models' predictions are then combined using weights derived from their accuracy.
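A committee vote over three models can be sketched as below. The three scoring functions are hypothetical stand-ins for a trained decision tree, rule generator, and neural network, and the 0.5 voting threshold and simple-majority rule are assumptions.

```python
def ensemble_vote(models, transaction, threshold=0.5):
    """Average the models' fraud confidences and count their votes for fraud."""
    scores = [model(transaction) for model in models]
    votes = sum(score >= threshold for score in scores)
    return {
        "confidence": sum(scores) / len(scores),
        "votes_for_fraud": votes,
        "alert": votes > len(scores) / 2,  # simple majority
    }


# Hypothetical component models, each returning a fraud confidence in [0, 1].
def tree_like(tx):
    return 0.9 if tx["amount"] > 500 else 0.1


def rule_like(tx):
    return 0.8 if tx["country"] != tx["card_country"] else 0.2


def net_like(tx):
    return min(1.0, tx["amount"] / 1000)


tx = {"amount": 750, "country": "US", "card_country": "US"}
decision = ensemble_vote([tree_like, rule_like, net_like], tx)
```

Returning the averaged confidence alongside the vote count lets an investigator see how strongly the committee agrees, not just whether an alert was raised.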

Measure False Positives

Consider the overall performance of the model ensemble, especially its misclassification costs, that is, the wrong alerts (false positives) and the undiscovered cases (false negatives):

            Crime            No crime
Alert       Correct          False positive
No alert    False negative   Correct

Cost considerations differ for these two misclassification errors. For false positives, for example, there is the manpower cost of checking a potential criminal case that turns out to be legal. For false negatives, the cost is the potential loss of revenue from failing to identify a criminal incident. This may turn out to be the more expensive of the two errors.
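The two error types can be counted and priced as in this sketch. The unit costs (25 per false positive, 400 per false negative) and the sample alert and crime flags are invented purely for illustration.

```python
def confusion_counts(alerts, crimes):
    """Tally the four outcomes of alert (1/0) against actual crime (1/0)."""
    counts = {"correct_alert": 0, "false_positive": 0,
              "false_negative": 0, "correct_no_alert": 0}
    for alert, crime in zip(alerts, crimes):
        if alert and crime:
            counts["correct_alert"] += 1
        elif alert:
            counts["false_positive"] += 1
        elif crime:
            counts["false_negative"] += 1
        else:
            counts["correct_no_alert"] += 1
    return counts


def misclassification_cost(counts, cost_per_fp, cost_per_fn):
    """Total cost of wasted investigations plus undetected crimes."""
    return (counts["false_positive"] * cost_per_fp
            + counts["false_negative"] * cost_per_fn)


alerts = [1, 1, 0, 0, 1, 0]
crimes = [1, 0, 1, 0, 0, 0]
counts = confusion_counts(alerts, crimes)
total_cost = misclassification_cost(counts, cost_per_fp=25, cost_per_fn=400)
```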

The biggest obstacle in any criminal detection system is the false positive: a legitimate transaction, conducted by a legitimate credit-card or cell-phone customer or insurance policy holder, that is classified as potentially fraudulent. False positives cause two problems. First, if they happen too often, the (valuable) legitimate customer is likely to be upset by the intrusion and the delay to his or her business; worse yet is the implication that he or she is a criminal. Second, if the number of false positives is large, the business or governmental process may be unable to handle the required investigative procedures, which may be manual and expensive, such as assigning suspected cases of fraud to a call center operator or detaining individuals at a border crossing or an airport.
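One way to keep alerts within what investigators can handle is to rank cases by ensemble confidence and pass on only as many as capacity allows. This prioritization rule is an assumption, not something the text prescribes, and the transaction IDs and scores are hypothetical.

```python
def split_by_capacity(scored, capacity):
    """Rank (id, confidence) pairs and keep the top `capacity` for investigation."""
    ranked = sorted(scored, key=lambda item: item[1], reverse=True)
    to_investigate = [tx_id for tx_id, _ in ranked[:capacity]]
    deferred = [tx_id for tx_id, _ in ranked[capacity:]]
    return to_investigate, deferred


# Hypothetical ensemble confidences for five flagged transactions.
scored = [("t1", 0.91), ("t2", 0.35), ("t3", 0.78), ("t4", 0.66), ("t5", 0.12)]
to_investigate, deferred = split_by_capacity(scored, capacity=2)
```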

Deploy and Monitor

Once the model ensemble has been built and tested, it must be deployed in a real-time business environment. This can be accomplished by exporting the rules, weights (formulas), and assorted code from the models, which most data mining tools support. The continuously changing environment and skewed distributions, coupled with the cost-sensitive requirements, complicate the evaluation of a criminal-detection system's performance. Nevertheless, the system must be continuously evaluated and improved, because the methods of criminal behavior change and attempts to go undetected will no doubt continue. Crime does not stop, nor should the mining of its patterns of behavior.
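Continuous evaluation could be as simple as tracking what share of recent alerts investigators confirm as crimes. The sketch below assumes that measure; the class name, the window size, and the minimum acceptable share are all hypothetical.

```python
from collections import deque


class AlertMonitor:
    """Track the share of recent alerts that were confirmed as crimes."""

    def __init__(self, window=100, min_precision=0.2):
        self.outcomes = deque(maxlen=window)  # oldest outcomes drop off
        self.min_precision = min_precision

    def record(self, confirmed_crime):
        self.outcomes.append(bool(confirmed_crime))

    def precision(self):
        if not self.outcomes:
            return None  # nothing investigated yet
        return sum(self.outcomes) / len(self.outcomes)

    def needs_review(self):
        current = self.precision()
        return current is not None and current < self.min_precision


monitor = AlertMonitor(window=4, min_precision=0.5)
for outcome in (True, True, False, False):
    monitor.record(outcome)
before = monitor.needs_review()
monitor.record(False)  # pushes the oldest confirmed alert out of the window
after = monitor.needs_review()
```

A falling share of confirmed alerts is one signal that behavior has shifted and the models need retraining.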

Compounding the challenge of data mining for criminal detection is that systems often rely on a history of fraud examples, which means they can usually detect only fraud that matches or closely resembles those training samples, leaving many other fraudulent activities undetected. No single technique or model will solve this. However, the ensemble-of-models technique provides an optimum method for making the best possible decision on the predicted legitimacy of a transaction, along with a measure of confidence to support that decision and course of action, through continuous monitoring, evaluation, and improvement. Finally, to identify new, undetected criminal transactions, a clustering analysis should be performed using a self-organizing map (SOM) neural network. The objective is to look for outliers: transactions that fall outside the pattern of normal transactions and do not adhere to the way the majority of business activities are carried out.
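The search for outliers can be illustrated without a full SOM. The sketch below is a deliberately simpler stand-in, not a SOM: it standardizes each column and flags rows that lie far from the overall mean. The cut-off of 3.0 and the two-column toy data are assumptions.

```python
import math
import statistics


def flag_outliers(rows, z_cutoff=3.0):
    """Return indexes of rows far from the column means, in standardized units."""
    columns = list(zip(*rows))
    means = [statistics.mean(col) for col in columns]
    # Guard against a zero standard deviation in a constant column.
    stdevs = [statistics.pstdev(col) or 1.0 for col in columns]
    flagged = []
    for index, row in enumerate(rows):
        distance = math.sqrt(sum(
            ((value - mean) / stdev) ** 2
            for value, mean, stdev in zip(row, means, stdevs)
        ))
        if distance > z_cutoff:
            flagged.append(index)
    return flagged


# Hypothetical (amount, items) rows: 20 ordinary transactions and one oddity.
rows = [(50 + i % 5, 2 + i % 3) for i in range(20)] + [(5000, 40)]
suspects = flag_outliers(rows)
```

A SOM would instead map transactions onto a grid of learned prototypes and treat those that sit far from every prototype as candidates for investigation.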





Investigative Data Mining for Security and Criminal Detection
ISBN: 0750676132
Year: 2005
Pages: 232
Authors: Jesus Mena
