1.12 Criminal Analysis and Data Mining

Data mining is a process that uses various statistical and pattern-recognition techniques to discover patterns and relationships in data. It does not include business intelligence tools, such as query and reporting tools, online analytical processing (OLAP), or decision support systems. Those tools report on data and answer predefined questions, whereas data mining tools focus on finding previously unknown patterns and relationships among variables, in this case for detecting and preventing criminal activity. While some will argue that forensics applies only to sciences used in court to secure convictions, the objective of recognizing threats and crime is also extremely important.

Unlike criminology, which re-enacts a crime in order to solve it, criminal analysis uses historical observations to come up with solutions. In criminal analysis, statistical examinations are performed on the frequency of specific crimes in order to evaluate the security of property and persons. Criminal analysis involves very careful evaluation of the location, time, and type of crime that has been committed at a building, neighborhood, beat, city, county, etc. Crime statistics, risks, and probabilities are very much what criminal analysis is all about. Data mining shares the overall goal of criminal analysis: the detection and prevention of crime. The following scenario provides a good example of how criminal analysis works: A security professional in a large office building maintains information about all the criminal activity that has taken place on the property over three years, including the following incidents:

     Auto Thefts                179
     Office Thefts              142
     Auto Break-in Thefts       211
     Robberies                   17
     Burglaries of Offices       46
     Aggravated Assaults         21
     Rapes                        2
     Murders                      0

One of the most important tasks of criminal analysis is to break down the pattern of crimes to evaluate when, where, and why they are occurring. In the case of this particular building, for example, the objective is to reduce crime by improving security. This type of analysis, however, is not so much offender-specific as target-specific; in other words, it raises the question "why is the garage a target for such a high rate of thefts?" By focusing on when, where, and why auto break-in thefts are taking place, preventive security measures can be taken to deter future criminal acts. By researching and documenting crimes and categorizing them by type of offense, location, and time, the analyst will see patterns and trends gradually emerge, and these will lead to preventive solutions. This type of criminal analysis can be automated through the use of data mining, which can uncover subtle patterns in large data sets.
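The breakdown by offense, location, and time described above can be sketched in a few lines of Python. This is only a minimal illustration: the incident records, the location names, and the three time-of-day bands in `shift_of` are hypothetical and are not taken from the building's actual log.

```python
from collections import Counter

# Hypothetical incident log: (offense, location, hour of day).
# These records are invented for illustration only.
incidents = [
    ("auto break-in theft", "parking garage", 22),
    ("auto break-in theft", "parking garage", 23),
    ("auto theft", "parking garage", 2),
    ("office theft", "offices", 13),
    ("auto break-in theft", "parking garage", 21),
    ("robbery", "walkway from garage", 19),
    ("office theft", "offices", 12),
]


def shift_of(hour):
    """Bucket an hour (0-23) into a coarse, assumed time-of-day band."""
    if 6 <= hour < 14:
        return "day"
    if 14 <= hour < 22:
        return "evening"
    return "night"


# Count incidents for each (offense, location, shift) combination.
pattern_counts = Counter(
    (offense, location, shift_of(hour)) for offense, location, hour in incidents
)

# List the most frequent combinations first.
for (offense, location, shift), count in pattern_counts.most_common(3):
    print(f"{count} x {offense} at {location} ({shift})")
```

With a real log, the combinations at the top of this list are the ones that point to a target, such as the garage at night, where preventive measures are most likely to pay off.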

Understanding the environment in which crime takes place is very important in criminal analysis. In this example, examining where crimes are taking place is critical; locations must be broken down into main areas, such as the main entrance, side entrances, offices, common areas, walkways to the building from the garage, walkways from the streets, and the parking garage. In addition, the surrounding areas must be considered, such as adjoining buildings, strip malls, parks, residential neighborhoods, etc.

To gauge the level of crime at this particular building, the analyst can compare crime statistics; for example, how does the rate of auto thefts for the property compare with the rate for the same crime at the local law enforcement agency levels (beat, district, precinct) and at the city, county, metropolitan statistical area (MSA), state, and national levels? Using the FBI's Uniform Crime Report (UCR) codification system, rate comparisons can be made by the following categories:

  1. Murder

  2. Rape

  3. Robbery

  4. Aggravated assault

  5. Burglary

  6. Theft

  7. Motor vehicle theft

  8. Arson

  9. Other assaults

  10. Forgery and counterfeiting

  11. Fraud

  12. Embezzlement

  13. Stolen property (buying, receiving, possessing)

  14. Vandalism

  15. Weapons (carrying, possessing, etc.)

  16. Prostitution and commercialized vice

  17. Sex offenses

  18. Drug abuse violations

  19. Gambling

  20. Offenses against the family and children

  21. Driving under the influence

  22. Liquor laws

  23. Drunkenness

  24. Disorderly conduct

  25. Vagrancy

  26. All other offenses

  27. Suspicion

  28. Curfew and loitering laws (persons under 18)

  29. Runaways (persons under 18)
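Before the building's figures can be compared with agency-level statistics, its local incident labels have to be expressed in the UCR categories listed above. The following Python sketch shows one way to do that. The mapping in `LOCAL_TO_UCR` is an assumption made for illustration (in particular, counting auto break-in thefts under "Theft"); an analyst would confirm each assignment against the UCR definitions.

```python
from collections import Counter

# Three-year incident counts from the office-building scenario.
building_incidents = {
    "Auto Thefts": 179,
    "Office Thefts": 142,
    "Auto Break-in Thefts": 211,
    "Robberies": 17,
    "Burglaries of Offices": 46,
    "Aggravated Assaults": 21,
    "Rapes": 2,
    "Murders": 0,
}

# Assumed mapping from the building's labels to UCR categories.
# Placing break-ins under "Theft" is an illustrative assumption.
LOCAL_TO_UCR = {
    "Auto Thefts": "Motor vehicle theft",
    "Office Thefts": "Theft",
    "Auto Break-in Thefts": "Theft",
    "Robberies": "Robbery",
    "Burglaries of Offices": "Burglary",
    "Aggravated Assaults": "Aggravated assault",
    "Rapes": "Rape",
    "Murders": "Murder",
}


def to_ucr_counts(local_counts):
    """Sum local incident counts under their assumed UCR category."""
    totals = Counter()
    for label, count in local_counts.items():
        totals[LOCAL_TO_UCR[label]] += count
    return dict(totals)


ucr_counts = to_ucr_counts(building_incidents)
for category, count in ucr_counts.items():
    print(f"{category}: {count}")
```

Once the counts are keyed by UCR category, the same category can be looked up in beat, city, or national figures and compared like for like.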

To compute the comparison crime rates, the following formulas can be used:

     Violent crime rate (VCR) formula for a building:

           VCR = (total violent crime / average daily traffic) x 1,000

     Violent crime rate (VCR) formula for beat, city, county, state, and nation:

           VCR = (total violent crime / population) x 1,000

     Property crime rate (PCR) formula for a building:

           PCR = (total property crime / number of targets) x 1,000
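As a worked illustration of these formulas, the sketch below applies them to the incident counts from the office-building scenario. The counts come from the table above; the average daily traffic, the number of targets, and the beat figures are hypothetical values chosen only to make the arithmetic concrete. Because the counts cover three years, the resulting rates describe the whole three-year period.

```python
def rate_per_thousand(incidents, base):
    """Return incidents per 1,000 units of the base (people or targets)."""
    if base <= 0:
        raise ValueError("base must be positive")
    return incidents * 1000 / base


# Violent crimes: robberies + aggravated assaults + rapes + murders.
violent_crimes = 17 + 21 + 2 + 0
# Property crimes: auto thefts + office thefts + break-in thefts + burglaries.
property_crimes = 179 + 142 + 211 + 46

# Hypothetical denominators, not given in the scenario.
AVERAGE_DAILY_TRAFFIC = 5000   # people entering the building per day
NUMBER_OF_TARGETS = 2000       # e.g., parked vehicles plus offices
BEAT_VIOLENT_CRIMES = 120      # violent crimes recorded on the beat
BEAT_POPULATION = 25000        # residents of the beat

building_vcr = rate_per_thousand(violent_crimes, AVERAGE_DAILY_TRAFFIC)
building_pcr = rate_per_thousand(property_crimes, NUMBER_OF_TARGETS)
beat_vcr = rate_per_thousand(BEAT_VIOLENT_CRIMES, BEAT_POPULATION)

print(f"Building VCR: {building_vcr:.1f} per 1,000")
print(f"Building PCR: {building_pcr:.1f} per 1,000")
print(f"Beat VCR:     {beat_vcr:.1f} per 1,000")
```

Under these assumed denominators the building's violent crime rate would be higher than the beat's, which is the kind of comparison the analyst is looking for; with real traffic and population figures the conclusion could differ.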

Because property crime is target-specific, its rate must be computed differently, as these crimes are not committed against individuals. It is worth noting that criminal analysis is very much interested in statistics, rates of occurrence, risk, probabilities, trends, and patterns, and the analysis of all of these can be improved through the use of data mining for detection and deterrence. A similar understanding of the environment and the targets of crime can be applied to other situations, so that rather than a building, we might perform a criminal analysis inventory of an e-commerce Web site for illegal hacking intrusions into a server.

The next phase of this type of criminal analysis is to use data mining, because a security expert or law enforcement investigator examining digital crimes must deal with hundreds of thousands of transactions, e-mails, system calls, wire transfers, and the like. This calls for an automated methodology for behavioral profiling via pattern-recognition techniques. Data mining can provide a new dimension to criminal analysis, especially in digital crimes such as identity theft; credit card, insurance, Internet, and wireless fraud; and money laundering, where investigators and analysts must deal with large volumes of transactions in large databases. Data mining has traditionally been used to predict consumer preferences and to profile prospects for products and services; however, in the current environment, there is a compelling need to use this same technology to discover, detect, and deter criminal activity to improve the security of property, people, and countries.
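To make the idea of automated behavioral profiling concrete, the sketch below flags transactions that deviate sharply from an account's own history. It uses a simple z-score threshold as a stand-in for the richer pattern-recognition techniques that data mining tools provide. The account identifiers, amounts, and the threshold of 3.0 are all hypothetical.

```python
from statistics import mean, stdev

# Hypothetical per-account history of past transaction amounts.
history = {
    "acct-1": [42.0, 38.5, 51.0, 45.25, 40.0, 47.5],
    "acct-2": [900.0, 1100.0, 950.0, 1020.0, 980.0],
}


def is_unusual(account, amount, threshold=3.0):
    """Flag an amount more than `threshold` standard deviations
    away from the account's historical mean (a simple z-score test)."""
    past = history[account]
    mu = mean(past)
    sigma = stdev(past)
    if sigma == 0:
        # No variation in the history: any different amount is unusual.
        return amount != mu
    return abs(amount - mu) / sigma > threshold


# Hypothetical new transactions to screen against each profile.
new_transactions = [("acct-1", 44.0), ("acct-1", 950.0), ("acct-2", 1010.0)]

flagged = [(acct, amt) for acct, amt in new_transactions if is_unusual(acct, amt)]
print(flagged)
```

Note that 950.0 is flagged for the first account but a similar amount would be routine for the second: the profile is built per account, which is what distinguishes behavioral profiling from a single fixed limit.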