7.8 Machine-Learning Criminal Patterns

Digital crimes can take place in a variety of situations and through various methods of operation:

Intrusion into computers or networks
Insurance and health care crimes
Money laundering
Credit-card fraud
Telecom fraud
Identity theft
NetFraud

Detecting these types of criminal activities follows a basic methodology. Although the steps below are not all-inclusive, an investigation will generally proceed as follows:

Evidence is gathered where criminal transactions have been discovered.
These transactions or observations are examined using a visualization tool.
Relevant demographics or other appropriate third-party data are cross-referenced and associated with these cases.
A link analysis or geo-mapping tool may be used to look for potential associations and trends in temporal and spatial dimensions.
A text mining tool may be used for the discovery of hidden concepts if large amounts of documents, HTML, e-mails, etc., are involved.
An intelligent agent may be used to retrieve and collate additional information to assemble with the criminal case from other networks or the Web.
A self-organizing map (SOM) network may be used to discover hidden clusters in the data, or a back-propagation network may be used to build a model of the crimes.
The final phase of the process involves the use of machine-learning-based tools and techniques, as discussed in this chapter, for extracting rules and decision trees from the data for predicting crimes and profiling perpetrators.
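The rule-extraction step that closes the list above can be illustrated with a minimal sketch. It is not the method of any particular tool discussed in this chapter: it simply searches for the single best "feature <= threshold" split by Gini impurity, which is the building block of a decision tree. The field names (amount, hour, fraud) and the six toy records are hypothetical and invented for illustration only.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels (0.0 means the list is pure)."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = Counter(labels)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())

def best_rule(records, label_key):
    """Return (feature, threshold, weighted_impurity) for the best single split.

    Every candidate rule has the form `feature <= threshold`.
    """
    features = [k for k in records[0] if k != label_key]
    best = None
    for feature in features:
        values = sorted({r[feature] for r in records})
        # Candidate thresholds are midpoints between consecutive distinct values.
        for lo, hi in zip(values, values[1:]):
            threshold = (lo + hi) / 2
            left = [r[label_key] for r in records if r[feature] <= threshold]
            right = [r[label_key] for r in records if r[feature] > threshold]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(records)
            if best is None or score < best[2]:
                best = (feature, threshold, score)
    return best

# Hypothetical toy data, invented for illustration only.
transactions = [
    {"amount": 20, "hour": 14, "fraud": 0},
    {"amount": 35, "hour": 10, "fraud": 0},
    {"amount": 50, "hour": 3, "fraud": 0},
    {"amount": 900, "hour": 2, "fraud": 1},
    {"amount": 1200, "hour": 15, "fraud": 1},
    {"amount": 40, "hour": 20, "fraud": 0},
]

feature, threshold, impurity = best_rule(transactions, "fraud")
print(f"Best split: {feature} at {threshold} (weighted impurity {impurity:.2f})")
```

On this toy data the search settles on the amount field, because one threshold on it separates the two classes completely. A full decision-tree learner would apply the same search recursively to each side of the split, and real case data would rarely separate this cleanly.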

One thing is certain: the process is rooted in pattern-recognition techniques and software tools with origins in AI. Not only are these tools ideal for combating these types of crimes, they can also be deployed to detect and deter terrorist attacks, such as those involving weapons of mass destruction and biological agents.

Detecting and deterring crime through data mining is particularly challenging for several reasons. First, although a vast amount of data is available, usually only a small number of observations represent criminal behavior, such as on-line fraud. In statistical terms, the class distribution is highly skewed. This matters because empirical studies of classification systems indicate that symbolic classifiers, such as those discussed in this chapter, are the most effective weapons for classifying highly skewed data sets.
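A small worked example may help show why skew makes detection hard to evaluate. The sketch below assumes a hypothetical set of 1,000 transactions of which 10 are fraudulent, and compares a trivial "always legitimate" classifier with a hypothetical detector that catches 8 of the 10 frauds at the price of 30 false alarms. All counts are invented for illustration.

```python
def confusion_counts(actual, predicted):
    """Return (tp, fp, fn, tn) for binary labels where 1 means fraud."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == 1 and p == 1)
    fp = sum(1 for a, p in pairs if a == 0 and p == 1)
    fn = sum(1 for a, p in pairs if a == 1 and p == 0)
    tn = sum(1 for a, p in pairs if a == 0 and p == 0)
    return tp, fp, fn, tn

def accuracy(actual, predicted):
    """Fraction of all cases classified correctly."""
    tp, fp, fn, tn = confusion_counts(actual, predicted)
    return (tp + tn) / len(actual)

def recall(actual, predicted):
    """Fraction of the actual fraud cases that were caught."""
    tp, fp, fn, tn = confusion_counts(actual, predicted)
    return tp / (tp + fn) if (tp + fn) else 0.0

# Hypothetical data: 10 frauds followed by 990 legitimate transactions.
actual = [1] * 10 + [0] * 990

# Trivial classifier: never raises an alert.
always_legit = [0] * 1000

# Hypothetical detector: catches 8 of 10 frauds and raises 30 false alarms.
detector = [1] * 8 + [0] * 2 + [1] * 30 + [0] * 960

for name, predicted in [("always legitimate", always_legit), ("detector", detector)]:
    print(f"{name}: accuracy={accuracy(actual, predicted):.3f}, "
          f"recall={recall(actual, predicted):.2f}")
```

The trivial classifier scores higher on accuracy while catching no fraud at all, which is why overall accuracy alone is a poor yardstick on highly skewed data.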

Data mining is a tool for the human investigator. It does not replace the analyst responsible for the security of a system or for the detection of criminal acts; rather, it helps analysts sort through hundreds of thousands of records, enabling them to "connect the dots." For each type of crime, a data mining model has to be created so that the analyst is able to detect its unique signature. Compounding this challenge is the fact that criminals are not stupid. They will intentionally modify their methods of operation, so a data mining detection system has to be adapted continuously to recognize these new patterns. Investigative data mining is a process, not a single-product solution.

Yet another challenge is that a data mining detection system needs a very fast response time in order to minimize monetary losses and damage to systems and networks, individuals, and companies. For example, detecting credit-card and Internet fraud or system intrusion requires real-time processing. This problem is solvable, however. Web mining for e-commerce is commonly done today to make real-time offers to consumers. The same Internet and wireless techniques and technologies can be used in conjunction with data mining models to detect and deter these types of crimes in real time.

Lastly, there are two types of errors in the detection and classification of these types of crimes: false alarms (false positives) and undiscovered cases (false negatives). Often an alert of a suspected crime needs verification by human personnel and may require special processing, such as putting a transaction in a special queue or status. A false positive demands special attention and time, while a false negative may cause further losses. In other words, the two errors carry different costs. In both instances, however, it must be kept in mind that doing nothing is the worst option facing a business, government agency, or law enforcement unit. The cost of doing nothing may, in time, be the most expensive of all, especially in situations involving the destruction of trust, data, systems, property, and human life.
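The point that the two errors carry different costs can be made concrete with a minimal sketch. It assumes a model that assigns each transaction a suspicion score between 0 and 1, and it chooses the alert threshold that minimizes total cost on labeled history. The scores, labels, and cost figures below are hypothetical and invented for illustration.

```python
def total_cost(scores, labels, threshold, cost_fp, cost_fn):
    """Total cost of alerting on every score >= threshold (label 1 means fraud)."""
    cost = 0.0
    for score, label in zip(scores, labels):
        flagged = score >= threshold
        if flagged and label == 0:
            cost += cost_fp   # false alarm: an analyst reviews a legitimate case
        elif not flagged and label == 1:
            cost += cost_fn   # missed case: the loss goes undetected
    return cost

def cheapest_threshold(scores, labels, cost_fp, cost_fn):
    """Return the candidate threshold with the lowest total cost."""
    # float("inf") represents "never alert", i.e. doing nothing.
    candidates = sorted(set(scores)) + [float("inf")]
    return min(candidates,
               key=lambda t: total_cost(scores, labels, t, cost_fp, cost_fn))

# Hypothetical suspicion scores and true outcomes, invented for illustration.
scores = [0.05, 0.10, 0.20, 0.30, 0.40, 0.55, 0.60, 0.80, 0.90, 0.95]
labels = [0,    0,    0,    0,    1,    0,    0,    1,    1,    1]

equal = cheapest_threshold(scores, labels, cost_fp=1, cost_fn=1)
unequal = cheapest_threshold(scores, labels, cost_fp=5, cost_fn=500)
do_nothing = total_cost(scores, labels, float("inf"), cost_fp=5, cost_fn=500)

print(f"Equal costs: alert at score >= {equal}")
print(f"Missed fraud 100x costlier: alert at score >= {unequal}")
print(f"Cost of never alerting: {do_nothing}")
```

When a missed case is assumed to cost far more than a false alarm, the cheapest threshold drops, so the system tolerates more alerts to avoid undetected losses. Under these assumed figures, never alerting is by far the most expensive policy.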