7.10 The Rule-Extracting Tools



Several tools for crime detection and profiling classify data with a rule-based approach; they are listed below. Keep in mind that some decision tree products, such as CART, can also generate rules.

AIRA

http://www.godigital.com.br

AIRA from godigital is a rule discovery and visualization tool; it works as an add-on to Excel.

DataMite

http://www.lpa.co.uk/ind_top.htm

DataMite from Logic Programming Associates enables rules to be discovered in ODBC-compliant relational databases. DataMite generates rules through a clustering process, combining elements with the standard AND, OR, and NOT operators.

SuperQuery

http://www.azmy.com/

SuperQuery from AZMY Thinkware works with multiple data formats. This rule engine generator displays patterns in a database by reporting them as IF/THEN statements in a Fact Engine window (see Figure 7.21).

Figure 7.21: SuperQuery IF/THEN dialog box.

WizWhy

http://www.wizsoft.com/

WizWhy from WizSoft automatically finds all the IF/THEN rules in a database and uses them to summarize the data, identify exceptions, and generate predictions for new cases. This software can generate thousands of rules from a data set, so care must be taken when setting the error rates. The higher the setting, the fewer rules the tool will generate. WizWhy uses a proprietary algorithm to generate its rules. Additional features include the following:

  • Performs Boolean as well as multivalue analysis

  • Analyzes the data by discovering all the IF/THEN rules

  • Reveals necessary and sufficient conditions (IF-and-only-IF rules)

  • Calculates the error probability of each rule

  • Calculates the best segmentations of continuous value fields

  • Calculates the prediction power of each field

  • Summarizes the data graphically by presenting the main rules and trends

  • Reveals the interesting phenomena in the data by uncovering unexpected rules

  • Predicts new cases on the basis of the discovered rules

  • Explains predictions by listing relevant rules

  • Calculates the prediction's conclusive probability

  • Calculates the prediction's error probability
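The general idea behind this kind of rule discovery can be sketched in a few lines of Python. WizWhy's algorithm is proprietary, so the code below is not its method; it is a brute-force illustration, and the records, the field names VISA and CARGO, and the parameter names `min_probability` and `min_cases` are all assumptions made for the example. It enumerates candidate IF conditions, measures how often each one leads to a given ALERT value, and keeps only the rules that pass both thresholds.

```python
from itertools import combinations

# Hypothetical records; the field names and values are invented for illustration.
RECORDS = [
    {"VISA": "expired", "CARGO": "none",  "ALERT": "high"},
    {"VISA": "expired", "CARGO": "none",  "ALERT": "high"},
    {"VISA": "expired", "CARGO": "truck", "ALERT": "high"},
    {"VISA": "valid",   "CARGO": "none",  "ALERT": "low"},
    {"VISA": "valid",   "CARGO": "truck", "ALERT": "low"},
    {"VISA": "valid",   "CARGO": "truck", "ALERT": "high"},
]


def find_rules(records, target, min_probability, min_cases, max_conditions=2):
    """Enumerate IF/THEN rules with up to max_conditions field=value tests."""
    fields = [f for f in records[0] if f != target]
    outcomes = sorted({r[target] for r in records})
    rules = []
    for size in range(1, max_conditions + 1):
        for combo in combinations(fields, size):
            # Each value combination seen in the data is a candidate IF part.
            seen = sorted({tuple(r[f] for f in combo) for r in records})
            for values in seen:
                matched = [r for r in records
                           if all(r[f] == v for f, v in zip(combo, values))]
                if len(matched) < min_cases:
                    continue  # too few supporting records
                for outcome in outcomes:
                    hits = sum(1 for r in matched if r[target] == outcome)
                    probability = hits / len(matched)
                    if probability >= min_probability:
                        rules.append((dict(zip(combo, values)), outcome,
                                      probability, len(matched)))
    return rules


loose = find_rules(RECORDS, "ALERT", min_probability=0.6, min_cases=1)
strict = find_rules(RECORDS, "ALERT", min_probability=1.0, min_cases=2)
print(len(loose), "rules with loose thresholds,", len(strict), "with strict ones")
for conditions, outcome, probability, cases in strict:
    print(f"IF {conditions} THEN ALERT = {outcome} "
          f"(probability {probability:.2f}, {cases} cases)")
```

Even on six records the loose thresholds return several times as many rules as the strict ones, which is why the threshold settings matter so much on a real data set.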

A session with WizWhy starts by importing a data set and completing a dialog box that selects the fields to be used in generating the rules. In this case the BorderDemo.dbf data set is used, and the IF/THEN rules will predict the field ALERT. The fields that will be used to generate the rules are listed in the dialog box field grid (see Figure 7.22). Note that the user can exclude any field and can also analyze any field even if it is empty, an important feature in fraud detection analysis.

Figure 7.22: ALERT is the field for which rules will be generated.

Next, the parameters for the rules to be generated must be set. Note, however, that thousands of rules may be generated. With some trial and error, the user will learn to limit the output to the rules that apply to the analysis (see Figure 7.23). Too few rules may yield little insight, while thousands of rules will defeat the purpose of the data mining analysis.

Figure 7.23: This dialog box in WizWhy allows for the setting of rule parameters.

Once the tool runs, the number of rules is shown and a report appears, from which the user can open various views. In addition, the rules can be exported as SQL statements or for use with the Predictor component that ships with WizWhy. This enables the user to analyze data, extract predictive rules, and then deploy them in an interactive application (see Figure 7.24).
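To make the idea of exporting a rule as SQL concrete, the following Python sketch renders one IF/THEN rule as a SELECT statement and runs it against an in-memory SQLite table. It does not reproduce WizWhy's actual export format, which is not shown here; the table name BORDER, the fields VISA and CARGO, and the helper names `quote` and `rule_to_sql` are hypothetical.

```python
import sqlite3


def quote(value):
    """Quote a value as an SQL string literal, doubling embedded single quotes."""
    return "'" + str(value).replace("'", "''") + "'"


def rule_to_sql(table, conditions, target, outcome):
    """Render one IF/THEN rule as a SELECT over the records it applies to.

    Table and field names are inserted as-is, so they must come from a
    trusted source; only the values are quoted.
    """
    where = " AND ".join(f"{field} = {quote(value)}"
                         for field, value in conditions.items())
    return (f"SELECT *, {quote(outcome)} AS PREDICTED_{target} "
            f"FROM {table} WHERE {where};")


# Hypothetical rule: IF VISA = expired AND CARGO = none THEN ALERT = high.
sql = rule_to_sql("BORDER", {"VISA": "expired", "CARGO": "none"}, "ALERT", "high")
print(sql)

# Run the generated statement against a small in-memory table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE BORDER (VISA TEXT, CARGO TEXT)")
conn.executemany("INSERT INTO BORDER VALUES (?, ?)",
                 [("expired", "none"), ("valid", "none"), ("expired", "truck")])
rows = conn.execute(sql).fetchall()
print(rows)
```

Expressing a rule as plain SQL means it can be applied inside the database that already holds the records, without the mining tool being present.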

Figure 7.24: This is rule 6, from a total of 214 rules. Note the conditions for a high alert.

WizWhy provides the conditions for predicting a variable, along with the rule's probability score, a count of the records that meet these conditions, an error probability score, and the actual records in the training database to which the rule applies. WizWhy was originally created by a group of mathematicians from Israel; however, the product has been available in the United States for several years. It is a very powerful and accurate rule-extracting data mining tool.
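The quantities reported for a rule can be illustrated with a short Python sketch. How WizWhy computes its error probability is not documented here, so the code substitutes a one-sided binomial tail: the chance of seeing at least as many matching outcomes if the IF conditions had no effect and only the overall rate of that outcome applied. The records, field names, and the function names `binomial_tail` and `rule_report` are assumptions made for the example.

```python
from math import comb

# Hypothetical training records; field names and values are invented.
RECORDS = [
    {"VISA": "expired", "CARGO": "none",  "ALERT": "high"},
    {"VISA": "expired", "CARGO": "none",  "ALERT": "high"},
    {"VISA": "expired", "CARGO": "truck", "ALERT": "high"},
    {"VISA": "valid",   "CARGO": "none",  "ALERT": "low"},
    {"VISA": "valid",   "CARGO": "truck", "ALERT": "low"},
    {"VISA": "valid",   "CARGO": "truck", "ALERT": "high"},
]


def binomial_tail(n, k, p):
    """Probability of at least k successes in n independent trials at rate p."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))


def rule_report(records, conditions, target, outcome):
    """Summarize one IF/THEN rule against the training records."""
    matched = [r for r in records
               if all(r[f] == v for f, v in conditions.items())]
    if not matched:
        raise ValueError("no record satisfies the rule's conditions")
    hits = sum(1 for r in matched if r[target] == outcome)
    # Share of all records with this outcome, regardless of the conditions.
    base_rate = sum(1 for r in records if r[target] == outcome) / len(records)
    return {
        "probability": hits / len(matched),   # share of matches with the outcome
        "cases": len(matched),                # records meeting the conditions
        "chance_probability": binomial_tail(len(matched), hits, base_rate),
        "records": matched,                   # the supporting records themselves
    }


report = rule_report(RECORDS, {"VISA": "expired"}, "ALERT", "high")
print(f"probability {report['probability']:.2f}, {report['cases']} cases, "
      f"chance probability {report['chance_probability']:.3f}")
```

In this toy data the rule holds for every matching record, yet the chance probability is still fairly large because only three records support it; a rule needs both a high probability and enough cases to be convincing.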




Investigative Data Mining for Security and Criminal Detection
ISBN: 0750676132
Year: 2005
Pages: 232
Authors: Jesus Mena
