Tools for crime detection and profiling also include the following software, which performs classification using a rule-based approach. Keep in mind that some decision tree products, such as CART, can also generate rules.
http://www.godigital.com.br
AIRA from godigital is a rule discovery and visualization tool; it works as an add-on to Excel.
http://www.lpa.co.uk/ind_top.htm
DataMite from Logic Programming Associates enables rules to be discovered in ODBC-compliant relational databases. DataMite generates rules through a clustering process, combining elements with the standard AND, OR, and NOT operators.
http://www.azmy.com/
SuperQuery from AZMY Thinkware works with multiple data formats. This rule engine generator displays patterns in a database by reporting them as IF/THEN statements in a Fact Engine window (see Figure 7.21).
Figure 7.21: SuperQuery IF/THEN dialog box.
http://www.wizsoft.com/
WizWhy from WizSoft automatically finds all the IF/THEN rules in a database and uses them to summarize the data, identify exceptions, and generate predictions for new cases. This software can generate thousands of rules from a data set, so the error-rate settings must be chosen with care. The higher the setting, the fewer rules the tool will generate. WizWhy uses a proprietary algorithm to generate its rules. Additional features include the following:
Performs Boolean as well as multivalue analysis
Analyzes the data by discovering all the IF/THEN rules
Reveals necessary and sufficient conditions (IF-and-only-IF rules)
Calculates the error probability of each rule
Calculates the best segmentations of continuous value fields
Calculates the prediction power of each field
Summarizes the data graphically by presenting the main rules and trends
Reveals the interesting phenomena in the data by uncovering unexpected rules
Predicts new cases on the basis of the discovered rules
Explains predictions by listing relevant rules
Calculates the prediction's conclusive probability
Calculates the prediction's error probability
A session with WizWhy starts by importing a data set and completing a dialog box that selects the fields to be used in generating the rules. In this example, the BorderDemo.dbf data set has been prepared so that IF/THEN rules can be generated to predict the field ALERT. The fields that will be used to generate the rules are listed in the dialog box field grid (see Figure 7.22). Note that the user can exclude any field and can also analyze any field even if it is empty, an important feature in fraud detection analysis.
Figure 7.22: Alert is the field from which rules will be generated.
Next, the parameters that govern rule generation must be set (see Figure 7.23). Because thousands of rules may be generated, some trial and error is usually needed before the user narrows the output to the rules that matter for the analysis. Too few rules may yield little insight, while thousands of rules will defeat the purpose of the data mining analysis.
Figure 7.23: This dialog box in WizWhy allows for the setting of rule parameters.
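The effect of such settings can be illustrated with a minimal sketch. WizWhy's algorithm is proprietary, so the code below is not its method; it simply enumerates candidate conditions by brute force and keeps those that meet a minimum probability and a minimum number of cases. The records, the field names other than ALERT, and the function discover_rules are hypothetical.

```python
from itertools import combinations

# Hypothetical toy records; only the ALERT field name comes from the text.
RECORDS = [
    {"vehicle": "van", "plate": "out_of_state", "time": "night", "ALERT": "high"},
    {"vehicle": "van", "plate": "out_of_state", "time": "night", "ALERT": "high"},
    {"vehicle": "van", "plate": "local", "time": "day", "ALERT": "low"},
    {"vehicle": "car", "plate": "local", "time": "day", "ALERT": "low"},
    {"vehicle": "car", "plate": "local", "time": "night", "ALERT": "low"},
    {"vehicle": "car", "plate": "out_of_state", "time": "night", "ALERT": "high"},
    {"vehicle": "truck", "plate": "local", "time": "day", "ALERT": "low"},
    {"vehicle": "van", "plate": "out_of_state", "time": "day", "ALERT": "high"},
]


def discover_rules(records, target, min_probability, min_cases, max_conditions=2):
    """Return (conditions, outcome, probability, cases) for rules meeting both thresholds."""
    fields = [f for f in records[0] if f != target]
    rules = []
    for size in range(1, max_conditions + 1):
        for combo in combinations(fields, size):
            # Group the target values by the values taken on this field combination.
            groups = {}
            for rec in records:
                key = tuple(rec[f] for f in combo)
                groups.setdefault(key, []).append(rec[target])
            for key, outcomes in groups.items():
                cases = len(outcomes)
                if cases < min_cases:
                    continue  # too few records support this IF part
                for outcome in sorted(set(outcomes)):
                    probability = outcomes.count(outcome) / cases
                    if probability >= min_probability:
                        rules.append((dict(zip(combo, key)), outcome, probability, cases))
    return rules


if __name__ == "__main__":
    strict = discover_rules(RECORDS, "ALERT", min_probability=0.9, min_cases=3)
    loose = discover_rules(RECORDS, "ALERT", min_probability=0.5, min_cases=1)
    print(len(strict), "rules with strict settings;", len(loose), "with loose settings")
    for conditions, outcome, probability, cases in strict:
        print("IF", conditions, "THEN ALERT =", outcome, f"(p={probability:.2f}, n={cases})")
```

Even on eight records, loosening the thresholds multiplies the number of rules returned, which is the trade-off the trial-and-error step is meant to resolve.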
Once the tool has run, it shows the number of rules found and presents a report that the user can examine in various views. In addition, the rules can be exported as SQL statements or passed to a Predictor component that ships with WizWhy. This enables the user to analyze data, extract predictive rules, and then build them into an interactive application (see Figure 7.24).
Figure 7.24: This is rule 6, from a total of 214 rules. Note the conditions for a high alert.
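As an illustration of what expressing a rule in SQL might look like, the sketch below renders a rule's conditions as a SELECT statement that retrieves the matching records. The format WizWhy actually emits is not described here; the function rule_to_sql, the field names PLATE and AXLES, and the use of BorderDemo as a table name are assumptions.

```python
def rule_to_sql(table, conditions):
    """Render the IF part of a rule as a SELECT statement over the given table."""

    def quote(value):
        # Strings become SQL literals with embedded single quotes doubled;
        # other values (such as numbers) are written as-is.
        if isinstance(value, str):
            return "'" + value.replace("'", "''") + "'"
        return str(value)

    where = " AND ".join(
        f"{field} = {quote(value)}" for field, value in conditions.items()
    )
    return f"SELECT * FROM {table} WHERE {where};"


if __name__ == "__main__":
    # Hypothetical rule conditions for a high alert.
    print(rule_to_sql("BorderDemo", {"PLATE": "out_of_state", "AXLES": 2}))
```

Table and field names are interpolated directly into the statement, so a sketch like this is suitable only for rule definitions that come from a trusted source.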
For each rule, WizWhy provides the conditions for predicting a variable, the rule's probability score, a count of the records in the database that meet those conditions, an error probability score, and the actual records in the training database to which the rule applies. WizWhy was originally created by a group of mathematicians from Israel, and the product has been available in the United States for several years. It is a very powerful and accurate rule-extracting data mining tool.
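The per-rule figures described above can be approximated by hand, as in the following sketch. The probability and count follow directly from the records. The error probability is taken here to be a one-sided binomial tail, that is, the chance of seeing at least as many matching outcomes if the conditions had no effect; this is an assumption standing in for WizWhy's proprietary calculation. The records, the plate field, and the function rule_report are hypothetical.

```python
from math import comb

# Hypothetical toy records; only the ALERT field name comes from the text.
RECORDS = [
    {"plate": "out_of_state", "ALERT": "high"},
    {"plate": "out_of_state", "ALERT": "high"},
    {"plate": "out_of_state", "ALERT": "high"},
    {"plate": "out_of_state", "ALERT": "low"},
    {"plate": "local", "ALERT": "low"},
    {"plate": "local", "ALERT": "low"},
    {"plate": "local", "ALERT": "low"},
    {"plate": "local", "ALERT": "high"},
]


def rule_report(records, conditions, target, outcome):
    """Summarize one rule: probability, record count, error probability, matching records."""
    matches = [r for r in records if all(r[f] == v for f, v in conditions.items())]
    n = len(matches)
    hits = sum(1 for r in matches if r[target] == outcome)
    # Share of all records with this outcome, used as the "no effect" baseline.
    base_rate = sum(1 for r in records if r[target] == outcome) / len(records)
    # Assumed measure: probability of at least `hits` successes in n trials at the base rate.
    error_probability = sum(
        comb(n, k) * base_rate**k * (1 - base_rate) ** (n - k)
        for k in range(hits, n + 1)
    )
    return {
        "probability": hits / n if n else 0.0,
        "count": n,
        "error_probability": error_probability,
        "records": matches,
    }


if __name__ == "__main__":
    report = rule_report(RECORDS, {"plate": "out_of_state"}, "ALERT", "high")
    print(report["probability"], report["count"], report["error_probability"])
```

With so few records the tail probability stays large, which shows why a rule that looks strong on a handful of cases may still deserve little confidence.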