7.4 Rules Predicting Crime


Data sets such as this border alert sample can be partitioned automatically by the software or manually by the user. For example, an analyst may want to split the data by a specific driver demographic or by insurance coverage rather than by vehicle type. By interrogating the data in this way, an investigator can create homogeneous groupings of potential smugglers and learn to predict with greater certainty which conditions and individual attributes increase the probability of a smuggling situation. More importantly, because these algorithms present their output in graphical form or as rules, analysts gain greater insight into smuggling conditions and smuggler attributes.
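A user-directed split of the kind described above can be sketched in a few lines of Python. This is only an illustration: the records, the field names (MAKE, INSURER, ALERT), and the helper `alert_rates` are hypothetical and are not taken from the border alert sample or from any particular data mining tool.

```python
from collections import defaultdict

# Hypothetical records for illustration only; not the border alert data.
SAMPLE = [
    {"MAKE": "CHEVROLET", "INSURER": "A", "ALERT": "High"},
    {"MAKE": "CHEVROLET", "INSURER": "B", "ALERT": "Low"},
    {"MAKE": "FORD", "INSURER": "A", "ALERT": "High"},
    {"MAKE": "FORD", "INSURER": "B", "ALERT": "Low"},
    {"MAKE": "TOYOTA", "INSURER": "A", "ALERT": "Low"},
]


def alert_rates(records, attribute):
    """Partition records by one attribute and return, for each value,
    the fraction of records in that group whose ALERT is High."""
    groups = defaultdict(list)
    for record in records:
        groups[record[attribute]].append(record)
    return {
        value: sum(1 for r in group if r["ALERT"] == "High") / len(group)
        for value, group in groups.items()
    }


# The analyst chooses the split: by insurer instead of by vehicle make.
by_insurer = alert_rates(SAMPLE, "INSURER")
by_make = alert_rates(SAMPLE, "MAKE")
```

Comparing the two dictionaries shows which attribute yields the more homogeneous groupings. In this made-up sample, splitting by insurer separates the high alerts more cleanly than splitting by make.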

Machine learning is rooted in artificial intelligence (AI) and deals with the design, architecture, and application of learning algorithms. For the analyst, this translates to the use of proprietary and commercial data mining tools whose engines are based on machine learning for the segmentation and identification of crimes, such as fraud, as well as for the construction of criminal profiles. Essentially, machine learning can be used to estimate the probability of a crime, such as computer intrusion, money laundering, or smuggling, based on existing conditions. For automobiles likely to be used to smuggle drugs across a border point of entry, we demonstrated how decision trees can detect these conditions. Machine-learning algorithms can also generate IF/THEN rules.
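To make the idea of a rule's probability concrete, the sketch below scores a single IF/THEN rule against a handful of records. It rests on an assumption: the probability is read as the fraction of records satisfying the IF part that also satisfy the THEN part, and the count is the number of records satisfying the IF part. Commercial tools may compute and report these figures differently. The records, the insurer name "X", and the function `rule_stats` are hypothetical.

```python
# Hypothetical records for illustration only.
RECORDS = [
    {"INSURER": "X", "MAKE": "CHEVROLET", "ALERT": "High"},
    {"INSURER": "X", "MAKE": "CHEVROLET", "ALERT": "High"},
    {"INSURER": "X", "MAKE": "CHEVROLET", "ALERT": "Low"},
    {"INSURER": "X", "MAKE": "FORD", "ALERT": "Low"},
    {"INSURER": "Y", "MAKE": "CHEVROLET", "ALERT": "Low"},
]


def rule_stats(records, conditions, outcome=("ALERT", "High")):
    """Return (probability, matching_count) for the rule
    IF all conditions hold THEN outcome.

    Assumption: probability = records matching IF and THEN
                              / records matching IF.
    """
    field, value = outcome
    matching = [
        r for r in records
        if all(r.get(k) == v for k, v in conditions.items())
    ]
    if not matching:
        return 0.0, 0
    hits = sum(1 for r in matching if r.get(field) == value)
    return hits / len(matching), len(matching)


probability, count = rule_stats(
    RECORDS, {"INSURER": "X", "MAKE": "CHEVROLET"}
)
```

Here three records satisfy the IF part and two of them carry a high alert, so the rule's probability under this reading is 2/3.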

Envision this scenario: As an automobile approaches an inspection point, customs or immigration personnel key in the plate number, which is routed to a center holding a set of IF/THEN rules created by machine-learning algorithms. The rules are derived from an assortment of information, including vehicle registration records, insurance records, and even neighborhood demographics, about the individuals and automobiles apprehended in the past while attempting to smuggle various types of drug contraband.

    Prediction Rule:
        HOUSEHOLD is LiveWithParents
        INSURER is 21stCentury
        YEAR is 1994
        OWNERSHIP is Owned
        MAKE is CHEVROLET

    Prediction # 1: ALERT is High

    Relevant rules:

    1)  If   HOUSEHOLD is LiveWithParents
        and  INSURER is 21stCentury
        Then ALERT is High
        Rule's probability: 0.619
        The rule exists in 13 records.
        Significance Level: Error probability < 0.01

    2)  If   INSURER is 21stCentury
        and  MAKE is CHEVROLET
        Then ALERT is High
        Rule's probability: 0.944
        The rule exists in 68 records.
        Significance Level: Error probability < 0.0001

    3)  If   INSURER is 21stCentury
        and  OWNERSHIP is Owned
        Then ALERT is High
        Rule's probability: 0.755
        The rule exists in 253 records.
        Significance Level: Error probability < 0.001

    4)  If   INSURER is 21stCentury
        and  YEAR is 1994
        Then ALERT is High
        Rule's probability: 0.625
        The rule exists in 20 records.
        Significance Level: Error probability < 0.1
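The lookup in this scenario can be sketched as follows: the four rules from the listing are stored as data, and a vehicle record is checked against each of them. The rule conditions, probabilities, and record counts are copied from the listing. The `Rule` class, the function `relevant_rules`, and the choice to surface the highest-probability rule are assumptions made for illustration, not the method used by the tool that produced the output.

```python
from dataclasses import dataclass


@dataclass
class Rule:
    conditions: dict      # IF part: field -> required value
    outcome: tuple        # THEN part: (field, value)
    probability: float    # "Rule's probability" from the listing
    records: int          # "The rule exists in N records"


HIGH = ("ALERT", "High")

# The four relevant rules shown in the listing.
RULES = [
    Rule({"HOUSEHOLD": "LiveWithParents", "INSURER": "21stCentury"},
         HIGH, 0.619, 13),
    Rule({"INSURER": "21stCentury", "MAKE": "CHEVROLET"}, HIGH, 0.944, 68),
    Rule({"INSURER": "21stCentury", "OWNERSHIP": "Owned"}, HIGH, 0.755, 253),
    Rule({"INSURER": "21stCentury", "YEAR": 1994}, HIGH, 0.625, 20),
]

# The vehicle described under "Prediction Rule" in the listing.
VEHICLE = {
    "HOUSEHOLD": "LiveWithParents",
    "INSURER": "21stCentury",
    "YEAR": 1994,
    "OWNERSHIP": "Owned",
    "MAKE": "CHEVROLET",
}


def relevant_rules(record, rules):
    """Return every rule whose IF part is fully satisfied by the record."""
    return [
        rule for rule in rules
        if all(record.get(k) == v for k, v in rule.conditions.items())
    ]


matches = relevant_rules(VEHICLE, RULES)
# Illustrative choice only: report the matching rule with the
# highest probability. The original tool may combine rules differently.
strongest = max(matches, key=lambda rule: rule.probability)
```

All four rules fire for this vehicle, which is consistent with the listing reporting them as the relevant rules behind the high alert prediction.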




Investigative Data Mining for Security and Criminal Detection
ISBN: 0750676132
EAN: 2147483647
Year: 2005
Pages: 232
Authors: Jesus Mena
