List of Figures | Investigative Data Mining for Security and Criminal Detection

Chapter 1: Precrime Data Mining

Figure 1.1: A link analysis can organize views of criminal associations.
Figure 1.2: Software agents can autonomously monitor events.
Figure 1.3: Text mining can extract the core content from millions of records.
Figure 1.4: A neural net can be trained to detect criminal behavior.
Figure 1.5: CATCH: Computer Aided Tracking and Characterization of Homicides.
Figure 1.6: September 11, Boston to New York, 8:30 AM.
Figure 1.7: Illustrative example of the encoding of height as zero or one.
Figure 1.8: Derived cluster sizes.
Figure 1.9: Symbolic descriptions of clusters.
Figure 1.10: Dendrogram for hierarchical agglomerative clustering of SOM cluster centres.
Figure 1.11: SOM map following merging of spatially near neighbors.

Chapter 2: Investigative Data Warehousing

Figure 2.1: Sample record extract (criminal record detail).
Figure 2.2: The iManageData interface.

Chapter 3: Link Analysis: Visualizing Associations

Figure 3.1: A financial link analysis network.
Figure 3.2: A NETMAP link chart displaying financial relationships.
Figure 3.3: The Link Notebook supports zoom-in features.
Figure 3.4: A timeline displaying time-related events.
Figure 3.5: Confirmed links are shown as solid lines.
Figure 3.6: Unconfirmed associations are dashed lines.
Figure 3.7: Members of an organization are grouped inside a box.
Figure 3.8: An organization can be aggregated as an entity.
Figure 3.9: The central contact is unknown.
Figure 3.10: Here Entity 1 is ID.
Figure 3.11: The links are the intelligence.
Figure 3.12: A sample of a chart with a legend.
Figure 3.13: A telephone toll analysis chart.
Figure 3.14: Voluminous amounts of data can lead to vague charts.
Figure 3.15: An analyst can move events and change the chart as needed.
Figure 3.16: Events are placed on the theme they relate to.
Figure 3.17: Several events can also be combined.
Figure 3.18: Multiple events and transactions can be mapped.
Figure 3.19: The association matrix in Crime Link.
Figure 3.20: From the matrix Crime Link generates its diagrams.
Figure 3.21: A Daisy chart showing a date and time analysis.
Figure 3.22: The formats supported by NETMAP.
Figure 3.23: This chart shows the link between the nodes at both ends.
Figure 3.24: An ORIONLink sample diagram.

Chapter 4: Intelligent Agents: Software Detectives

Figure 4.1: Bio-terrorism system using agents with sensors.
Figure 4.2: Agent system would serve to provide early detection.
Figure 4.3: Agentland.com provides agent software for downloading.
Figure 4.4: A menu of development agent software available.
Figure 4.5: The completed agent form.
Figure 4.6: A list is generated with a relevance score associated with each item.

Chapter 5: Text Mining: Clustering Concepts

Figure 5.1: Topics derived from clustering 60,000 news reports.
Figure 5.2: An 86-word summary of the news stories.
Figure 5.3: WordStat univariate word-frequency analysis.
Figure 5.4: ClearForest taxonomy graphical view of an individual.
Figure 5.5: Dynamic view of relationships.
Figure 5.6: TextRoller summary results.
Figure 5.7: A Leximancer concept map of 155 Internet news groups.
Figure 5.8: TripleHop's three-layer architecture.
Figure 5.9: The VisualText GUI interface.

Chapter 6: Neural Networks: Classifying Patterns

Figure 6.1: This is the suspect the police are searching for.
Figure 6.2: Attrasoft ImageFinder during training.
Figure 6.3: System recognized the suspect wearing a hat.
Figure 6.4: System recognized suspect with a beard.
Figure 6.5: This is how the data looks in our Border Profile database.
Figure 6.6: The different colors represent different stages of alerts.
Figure 6.7: The cluster of arrests can be marked and exported to a file.
Figure 6.8: Example given to the neural network. The C-12 denotes the position of dodecane.
Figure 6.9: One of the two matches found by the neural network. The C-12 denotes the position of dodecane.
Figure 6.10: A second match found by the neural network. The C-12 denotes the position of dodecane.
Figure 6.11: The closest non-match found by the neural network. The C-12 denotes the position of dodecane.
Figure 6.12: The CRISP-DM methodology.
Figure 6.13: Primary network of offenders.
Figure 6.14: Distance chart.
Figure 6.15: Crimes by time of day.
Figure 6.16: Crimes by day of week.
Figure 6.17: Spatial analysis.
Figure 6.18: Schematic data flow.
Figure 6.19: Panes allow the user to visualize the network training results.
Figure 6.20: Training to recognize the number 5.

Chapter 7: Machine Learning: Developing Profiles

Figure 7.1: Decision tree used to predict probability of smuggling by make of auto.
Figure 7.2: The Anti-Drug Network (ADNET).
Figure 7.3: The ADNET control center.
Figure 7.4: Eleven sets of training, testing, validation data (33 sets in all).
Figure 7.5: The data was rotated in the training, testing, and validation phases.
Figure 7.6: Five algorithms on six data sets yielded different results.
Figure 7.7: Model ensembles make decisions by committee of algorithms.
Figure 7.8: Data is prepared for mining.
Figure 7.9: Model creation stream in Clementine.
Figure 7.10: Results of final models.
Figure 7.11: Overall model score on validation data.
Figure 7.12: Alice decision tree interface.
Figure 7.13: Alice d'Isoft 6.0 decision tree output.
Figure 7.14: Business Miner decision tree interface.
Figure 7.15: This is the CART interface for model setup.
Figure 7.16: The CART binary trees.
Figure 7.17: Lift charts for each class from the decision trees can be viewed.
Figure 7.18: This instrument displays the variables of most importance.
Figure 7.19: The rates of prediction for training and testing classes can be viewed.
Figure 7.20: Sample of CART rules.
Figure 7.21: SuperQuery IF/THEN dialog box.
Figure 7.22: Alert is the field from which rules will be generated.
Figure 7.23: This dialog box in WizWhy allows for the setting of rule parameters.
Figure 7.24: This is rule 6, from a total of 214 rules. Note the conditions for a high alert.
Figure 7.25: Decision trees can be split on any desired variable in the database.
Figure 7.26: Decision tree split on the basis of vehicle make.
Figure 7.27: Multiple analyses can be performed by inserting them via a drop window.
Figure 7.28: Note the improved performance at 70% of population.
Figure 7.29: Rules can be produced in various formats.
Figure 7.30: Partial view of rules generated in Java from this tool.
Figure 7.31: The Neural Net Wizard interface.
Figure 7.32: This is PolyAnalyst's main window.
Figure 7.33: This is the data import wizard interface.
Figure 7.34: The Visual Rule Assistant simplifies rule generation.
Figure 7.35: Decision tree interface with summary statistic window.
Figure 7.36: A schematic decision tree.
Figure 7.37: Decisionhouse graphical displays.
Figure 7.38: Enterprise Miner's SEMMA process.
Figure 7.39: Clementine uses icons to perform data mining analyses.
Figure 7.40: NCR's Data Mining Method and Teradata Warehouse Miner Technology.

Chapter 8: NetFraud: A Case Study

Figure 8.1: Associations between products and fraud. Note the bold line between hardware/software and fraud.
Figure 8.2: A clustering map where light shades are legal and dark areas are fraudulent transactions.
Figure 8.3: We mark the section of fraudulent transactions.
Figure 8.4: Camcorders with an average price of $1,052 are a major target for fraud.
Figure 8.5: The error rate is only about 8% for this neural-network model.
Figure 8.6: This sensitivity instrument prioritizes the inputs for a fraud model.
Figure 8.7: A view of the training of the perceptron neural network.
Figure 8.8: Decision trees can uncover hidden ranges where fraud is higher than average.
Figure 8.9: As fraud statistics show, computer equipment is high on criminals' lists.
Figure 8.10: Fraud is highest in households where the median rent is $425-$548.

Chapter 9: Criminal Patterns: Detection Techniques

Figure 9.1: The CRISP-DM process.

Chapter 10: Intrusion Detection: Techniques and Systems

Figure 10.1: Thirty-day summary of File Transfer Protocol connections.
Figure 10.2: An IDS is only part of the entire deterrence process.

Chapter 11: The Entity Validation System (EVS): A Conceptual Architecture

Figure 11.1: Incremental profiles are distributed.

Chapter 12: Mapping Crime: Clustering Case Work

Figure 12.1: MAPS links to crime maps and statistics for various cities.
Figure 12.2: A view of crimes by types in the central part of the city.
Figure 12.3: San Diego interactive crime map.
Figure 12.4: Approach and verbal-themes behavior.
Figure 12.5: Approach and precautions behavior.
Figure 12.6: The SOM represents about 5,000 murders in the HITS database.
Figure 12.7: Crimes are mapped by modus operandi descriptions.
Figure 12.8: Order and description of crimes such as rape, serial and rituals can be queried.
Figure 12.9: The figure shows all the crime data vectors as points in a three-dimensional eigenspace.
Figure 12.10: Crimes can be mapped along highways.
Figure 12.11: Similarity of crimes can be viewed and measured via a grid.
Figure 12.12: Comparison of crime types can be measured.
Figure 12.13: Probability and distance of crimes by the same perpetrator can be graphed.
Figure 12.14: The solid line in the graph shows the probability of finding two sexual assaults by one serial rapist n number of cells apart.