The partitioning of data sets, such as this border alert sample, can be done autonomously by the software or directed by the user. For example, analysts may want to split the data on the basis of a specific driver demographic or insurance coverage, rather than vehicle type. Through this interrogation of the data, an investigator can create homogeneous groupings of potential smugglers and can learn to predict with greater certainty which conditions and individual attributes increase the probability of a smuggling situation. More importantly, because the outputs of these types of algorithms take the form of graphs or rules, analysts can gain greater insight into smuggling conditions and smuggler attributes.
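A user-directed split of this kind can be sketched in a few lines of Python. This is only an illustration, not how any particular data mining tool works: the function name `alert_rate_by`, the `SAMPLE` records, and the insurer labels "A" and "B" are all invented for the example and are not drawn from the border alert sample itself.

```python
from collections import defaultdict

# Hypothetical toy records, invented purely for illustration.
SAMPLE = [
    {"INSURER": "A", "MAKE": "CHEVROLET", "ALERT": "High"},
    {"INSURER": "A", "MAKE": "FORD", "ALERT": "High"},
    {"INSURER": "B", "MAKE": "CHEVROLET", "ALERT": "Low"},
    {"INSURER": "B", "MAKE": "FORD", "ALERT": "High"},
]


def alert_rate_by(records, attribute):
    """Partition records on one attribute and return the share of
    High alerts within each resulting group."""
    counts = defaultdict(lambda: [0, 0])  # value -> [high alerts, total]
    for rec in records:
        bucket = counts[rec[attribute]]
        bucket[0] += int(rec["ALERT"] == "High")
        bucket[1] += 1
    return {value: high / total for value, (high, total) in counts.items()}


# The analyst chooses the splitting attribute instead of the software.
by_insurer = alert_rate_by(SAMPLE, "INSURER")
by_make = alert_rate_by(SAMPLE, "MAKE")
```

Comparing the per-group rates for different attributes gives a rough sense of which split produces the most homogeneous groupings, which is the judgment a decision tree algorithm makes automatically when it selects a split.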
Machine learning is by definition rooted in AI and deals with the design, architecture, and application of learning algorithms. For the analyst, this translates to the use of proprietary and commercial data mining tools whose engines are based on machine learning for the segmentation and identification of crimes, such as fraud, as well as the construction of criminal profiles. Essentially, machine learning can be used to estimate the probability of a crime, such as computer intrusion, money laundering, or smuggling, based on existing conditions. In the case of automobiles likely to be used to smuggle drugs across a border point of entry, we demonstrated how decision trees can be used to detect these conditions. Machine-learning algorithms can also generate IF/THEN rules.
Envision this scenario: As an automobile approaches an inspection point, customs or immigration personnel key in the plate number, which is routed to a center that holds a set of IF/THEN rules created using machine-learning algorithms. The rules have been derived from an assortment of information, gleaned from department of vehicle registration records, insurance records, and even neighborhood demographics, about individuals and automobiles that have been apprehended in the past attempting to smuggle various types of drug contraband.
Prediction Rule:
    HOUSEHOLD is LiveWithParents
    INSURER is 21stCentury
    YEAR is 1994
    OWNERSHIP is Owned
    MAKE is CHEVROLET

Prediction # 1: ALERT is High

Relevant rules:

1) If HOUSEHOLD is LiveWithParents and INSURER is 21stCentury
   Then ALERT is High
   Rule's probability: 0.619
   The rule exists in 13 records.
   Significance Level: Error probability < 0.01

2) If INSURER is 21stCentury and MAKE is CHEVROLET
   Then ALERT is High
   Rule's probability: 0.944
   The rule exists in 68 records.
   Significance Level: Error probability < 0.0001

3) If INSURER is 21stCentury and OWNERSHIP is Owned
   Then ALERT is High
   Rule's probability: 0.755
   The rule exists in 253 records.
   Significance Level: Error probability < 0.001

4) If INSURER is 21stCentury and YEAR is 1994
   Then ALERT is High
   Rule's probability: 0.625
   The rule exists in 20 records.
   Significance Level: Error probability < 0.1
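To make the lookup step concrete, the following Python sketch shows one way a center might match an incoming vehicle record against stored IF/THEN rules. The rule conditions, probabilities, and record counts are copied from the output above; everything else (the `Rule` class, the `relevant_rules` function, and the choice to report the highest-probability rule) is an assumption made for illustration and does not describe how the tool that produced this output combines its rules.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    conditions: dict    # attribute -> required value (the IF part)
    outcome: str        # predicted ALERT level (the THEN part)
    probability: float  # "Rule's probability" from the output
    support: int        # number of records in which the rule exists


# The four relevant rules listed in the output above.
RULES = [
    Rule({"HOUSEHOLD": "LiveWithParents", "INSURER": "21stCentury"}, "High", 0.619, 13),
    Rule({"INSURER": "21stCentury", "MAKE": "CHEVROLET"}, "High", 0.944, 68),
    Rule({"INSURER": "21stCentury", "OWNERSHIP": "Owned"}, "High", 0.755, 253),
    Rule({"INSURER": "21stCentury", "YEAR": "1994"}, "High", 0.625, 20),
]


def relevant_rules(record, rules=RULES):
    """Return every rule whose IF conditions are all satisfied by the record."""
    return [
        rule for rule in rules
        if all(record.get(attr) == value for attr, value in rule.conditions.items())
    ]


# The vehicle record from the prediction above.
record = {
    "HOUSEHOLD": "LiveWithParents",
    "INSURER": "21stCentury",
    "YEAR": "1994",
    "OWNERSHIP": "Owned",
    "MAKE": "CHEVROLET",
}

matched = relevant_rules(record)

# Hypothetical policy: surface the matching rule with the highest probability.
strongest = max(matched, key=lambda rule: rule.probability)
```

Here all four rules fire for the record, and under the assumed policy the inspector would be shown the INSURER and MAKE rule. A production system would likely also weigh each rule's record count and significance level before raising an alert.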