6.14 Neural Network Tools



Current neural network tools are highly developed, unlike the software products of prior years, which were crude and required extensive labor to get data into and out of them in a usable format. Today's commercial neural network products range from small, inexpensive programs to very sophisticated software suites costing thousands of dollars. Some of the current products have very intuitive interfaces, and most can generate code, such as C or Java, that can be incorporated into other applications. For more information on these tools, the reader should go to Knowledge Discovery Nuggets (http://www.kdnuggets.com), a leading data mining portal. Today's neural network tools automate what was once a manual trial-and-error process of adjusting settings and preparing the input and output data. A listing of some of these software products follows.

Attrasoft

http://attrasoft.com

Attrasoft is one of the most innovative neural network companies in the marketplace, providing a host of image recognition and pattern-recognition technologies. Its facial recognition product is highly accurate, versatile, and capable of searching millions of images, easily handling over a terabyte of data. Attrasoft has specialized in image recognition and pattern recognition since 1995. Their basic products are software components sold for licensing fees or complete software products sold as a package. Their core neural network software package is DecisionMaker.

DecisionMaker uses two files: a training data set, which it calls the problem database file, and a testing data set. The training file teaches DecisionMaker about the classification problem; the test file is then used to evaluate the resulting model. Attrasoft also uses its proprietary algorithms to provide image, sound, and facial recognition products and services, all of which are built on its neural network technology.

BioComp

http://www.bio-comp.com

BioComp's iModel is a desktop tool for creating predictive models with automated optimization capabilities, using what it calls "mesh" technology. iModel uses a genetic algorithm to optimize the model construction process by searching through alternative model types, structures, and combinations of input variables. iModel works with delimited text files, Microsoft Excel workbooks, and Microsoft Access databases and has wizards for preparing and loading data. The iModel Professional and Enterprise versions support access to Oracle, SQL Server, and other relational databases. BioComp also sells modeling servers.

COGNOS 4Thought

http://www.cognos.com

4Thought is the neural network tool from Cognos, a company that primarily sells on-line analytic processing (OLAP) business reporting software. The tool supports every step of the analysis process, including data collection, transformation, exploration, and model creation. 4Thought can use data from the other Cognos data components, including Impromptu, PowerPlay, and Scenario, as well as external data sources like Microsoft Excel. It can automatically identify and omit anomalies and can also augment the data by creating new fields like ratios and percentages.

BrainMaker

http://www.calsci.com

BrainMaker is a fairly solid product that has been around since 1985 and has sold over 25,000 copies; it is a very reliable back-propagation neural network tool. At its Web site, literally hundreds of applications are documented. The accompanying NetMaker utility facilitates the building and training of neural networks by importing data from multiple formats, including ASCII, binary, and other text or numeric data. The spreadsheet-like interface allows the user to organize and preprocess raw data with column shifts, arithmetic operations, moving averages, moving medians, and more. The professional version of this tool has all the features of the standard BrainMaker, plus larger data limits, more extensive automated training and tuning capabilities, a runtime license, and more extensive graphics.
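The preprocessing operations named above (column shifts, moving averages, and moving medians) are simple to illustrate. The following is a minimal Python sketch, not the vendor's implementation; the function names `moving_window` and `shift_column` and the sample `prices` column are hypothetical.

```python
from statistics import mean, median

def moving_window(values, size, reducer):
    """Apply `reducer` (e.g. mean or median) to each full trailing window."""
    if size < 1 or size > len(values):
        raise ValueError("window size must be between 1 and len(values)")
    return [reducer(values[i - size + 1:i + 1])
            for i in range(size - 1, len(values))]

def shift_column(values, lag, fill=None):
    """Shift a column down by `lag` rows so each row sees an earlier value."""
    if lag < 0 or lag > len(values):
        raise ValueError("lag must be between 0 and len(values)")
    return [fill] * lag + values[:len(values) - lag]

# Hypothetical raw column with one outlier (50).
prices = [10, 12, 11, 50, 13, 14]

moving_avg = moving_window(prices, 3, mean)    # smooths, but the outlier leaks in
moving_med = moving_window(prices, 3, median)  # robust to the single outlier
lagged = shift_column(prices, 1)               # previous row's value as a new input
```

The moving median is often preferred over the moving average when the raw data contain isolated spikes, since one extreme value does not distort the window's result.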

An optional component, the Genetic Training Option (GTO), applies a genetic algorithm to optimize the neural networks built with BrainMaker. Following Darwin's theories of genetic mutation and natural selection, GTO automatically creates a large number of subtly different networks to do the same job. GTO then trains, tests, and ranks them to find the network that performs best overall. Once a good network is found, its "genes" are mutated to create another "parent" network. These two networks are then used as parents to create yet another "child" through genetic crossover techniques. When children perform better than their parents, they are saved as evolutionarily superior "beings" and are used for producing even better generations of networks. This represents one of the most advanced features of today's neural network tools.
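The mutate, cross over, and keep-the-better-child loop described above can be sketched in a few lines of Python. This is an illustration of the general idea only, not GTO's actual algorithm. Each "network" is reduced to a hypothetical pair (hidden units, learning rate), and `score` is a made-up stand-in for the train, test, and rank step; a real tool would train each network and measure its performance on test data.

```python
import random

def score(net):
    """Hypothetical stand-in for 'train, test, and rank'; higher is better."""
    hidden, rate = net
    return -((hidden - 12) ** 2) - 100.0 * (rate - 0.3) ** 2

def mutate(net, rng):
    """Create a subtly different network by nudging each 'gene'."""
    hidden, rate = net
    hidden = max(1, hidden + rng.choice([-2, -1, 1, 2]))
    rate = min(1.0, max(0.01, rate + rng.uniform(-0.05, 0.05)))
    return (hidden, rate)

def crossover(a, b, rng):
    """Build a child by taking each gene from one of the two parents."""
    return (rng.choice([a[0], b[0]]), rng.choice([a[1], b[1]]))

def evolve(generations=200, population=20, seed=0):
    rng = random.Random(seed)
    # Start from many different candidate networks and rank them.
    nets = [(rng.randint(1, 30), rng.uniform(0.01, 1.0))
            for _ in range(population)]
    best = max(nets, key=score)
    history = [score(best)]
    for _ in range(generations):
        other = mutate(best, rng)            # mutated copy becomes a second parent
        child = crossover(best, other, rng)  # child mixes genes of both parents
        if score(child) > max(score(best), score(other)):
            best = child                     # child beats both parents: keep it
        elif score(other) > score(best):
            best = other                     # assumption: also keep a better mutant
        history.append(score(best))
    return best, history

best_net, history = evolve()
```

Promoting a mutated parent when it outperforms the current best is a design choice made here so the search keeps moving; the description above only mentions saving superior children.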

MATLAB Neural Net Toolbox

http://www.mathworks.com/products/neuralnet/

This toolbox provides a complete set of functions and a GUI for the design, implementation, visualization, and simulation of neural networks. It allows for the construction of different neural network architectures, including multilayer perceptrons, radial basis function networks, and self-organizing maps (SOM), via simple tabs in a graphical interface. It supports a comprehensive set of training and learning functions. The toolbox is purchased as an add-on to the main MATLAB software package.

The tool contains a comprehensive selection of neural network methods with automatic wizards; a C-code generator add-on is also available for exporting the results into production applications. This tool, as with other more advanced neural network products, is entirely icon-based and has an extremely easy-to-use user interface (see Figure 6.19). MathWorks, like other vendors such as NeuralWare and SPSS, offers on-site training services.

Figure 6.19: Panes allow the user to visualize the network training results.

NeuroSolutions

http://www.nd.com

NeuroSolutions is NeuroDimension's base product. This neural network software combines a modular, icon-based network design interface with an implementation of advanced learning procedures, such as recurrent back-propagation and back-propagation through time. NeuroSolutions is a complete neural network package, currently in Version 4.2, that the company claims any novice can use for creating clustering or classification models.

The "feel" of NeuroSolutions is unique. Its interface is modeled on electronic circuit design, in which components such as resistors, capacitors, and transistors are laid out on a breadboard and wired together to form a circuit. In the same way, neural components, such as axons, synapses, and gradient search engines, are laid out to form a neural network; input components are used to inject signals, and probe components are used to visualize the network's response. NeuroSolutions also provides a wizard to build a neural network "circuit" automatically. The tool provides a wide range of flexibility in designing models from scratch.

A demo shipped with the tool shows how a character-recognition problem can be developed in which the input to the network is a set of 24 x 18 images of handwritten digits. Each image has a corresponding desired output, the box on the right, which is an encoding of the digit that the image represents. The unsupervised portion of the network uses a principal component analysis (PCA) on the images. The features extracted from this preprocessing stage are then fed into a multilayer perceptron (MLP), which uses back-propagation to perform the image classification. This network has been trained to a relatively low error rate, such that the network closely matches the desired output. Note that the network classified the "5" correctly in the output box, but it also found that the image had some characteristics of the numbers "6" and "3" due to their similarity in shape (see Figure 6.20).

Figure 6.20: Training to recognize the number 5.

The wizard that comes with the software makes the creation of predictive models a process involving half a dozen clicks. Many features of this tool automate the process of data preparation and the selection of the appropriate neural network architecture. This is done through a NExpert component that automates the model creation process through a sequence of simple dialogs, which for the first-time data miner makes this tool very appealing.

NeuralWare

http://www.neuralware.com

NeuralWare is one of the oldest neural network companies in the world. It was founded in the mid-1980s, and together with HNC and Nestor it was one of the pioneers of the modeling software industry. Its Professional II/PLUS is in Version 5.5, making it an established and comprehensive neural network development system. Professional II/PLUS is available for UNIX, Linux, and Windows operating systems on a variety of hardware platforms, with data and network files being fully interchangeable. The Professional II/PLUS package contains comprehensive documentation that addresses the entire neural network development and deployment process, including a tutorial, a guide to neural computing, standard and advanced reference manuals, and platform-specific installation and user guides. NeuralWare also provides considerable training for individuals interested in using its products and offers consulting services, as do most other neural network vendors.

NeuralWare's proprietary InstaNet facility allows quick generation of a neural network based on any one of 28 standard neural network architectures described in neural network literature. After a network is created, all parameters associated with it can be directly modified and customized to reflect more closely the classification or clustering problem of the user. Professional II/PLUS includes advanced features, such as performance-measure-based methods to inhibit over-fitting, automatic optimization of hidden-layer size, and the ability to prune hidden units.

As with the more advanced neural network packages in today's marketplace, Professional II/PLUS also contains an Explain facility that indicates which network inputs most influence a network output. These sensitivity reporting features and instruments allow today's software vendors to point out that neural networks are no longer the black boxes that they were in the past. In addition to a wide variety of built-in diagnostic monitoring tools, Professional II/PLUS provides an interface through which user-written programs can supply input data and process neural network outputs. The Designer Pack is an extension component to NeuralWorks Professional II/PLUS that can be used to generate C source code.
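One generic way to report which inputs most influence an output is to nudge each input slightly and measure how much the output moves. The Python sketch below illustrates that idea under stated assumptions; it is not NeuralWare's Explain algorithm. The `model` function, with its fixed weights, is a hypothetical stand-in for a trained network, and `case` is a made-up input record.

```python
import math

def model(inputs):
    """Hypothetical trained network: fixed weights and one sigmoid output."""
    weights = [2.0, -0.5, 0.0]
    z = sum(w * x for w, x in zip(weights, inputs))
    return 1.0 / (1.0 + math.exp(-z))

def sensitivities(predict, inputs, delta=1e-3):
    """Finite-difference estimate of output change per unit change in each input."""
    base = predict(inputs)
    result = []
    for i in range(len(inputs)):
        bumped = list(inputs)
        bumped[i] += delta  # perturb one input, hold the others fixed
        result.append((predict(bumped) - base) / delta)
    return result

case = [0.2, 0.4, 0.9]
sens = sensitivities(model, case)

# Rank input indices from most to least influential for this case.
ranking = sorted(range(len(case)), key=lambda i: abs(sens[i]), reverse=True)
```

Sensitivities computed this way are local: they describe the network's behavior around one input record, so a fuller report would average them over many records.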

For beginners, NeuralWare also offers its Predict product, which can be run as an add-on to Microsoft Excel. Predict automatically analyzes input data to identify the appropriate data transform. It partitions the input data into training and test sets, selects relevant input variables, and then constructs, trains, and optimizes a neural network for a variety of classification problems. Predict allows for rapid creation and deployment of prediction and classification applications by combining neural network technology with genetic algorithms, statistics, and fuzzy logic, which makes it suitable for small-scale or experimental investigative and security applications.

ProForma

http://www.proformacorp.com

Founded in 1994 by mathematicians and programmers from Stirling University in Scotland and originally known as Neural Innovation, this firm's core neural network product is ProForma. To use ProForma, you follow this simple "by the numbers" process, according to the company:

  1. Select a source of data, which will form the base of your solution.

  2. Select the factors you wish to predict and those you wish to predict them with.

  3. ProForma will now check the data, warn you of and help you to solve any problems, and build a solution that is capable of making the predictions you require.

  4. You can then use this solution in a number of ways:

    • To make predictions from new events as they happen, or recognize new events as belonging to the same class as certain previous events

    • To perform dry runs of scenarios to see what would happen before you implement new plans

    • To calculate the set of actions required to optimize a given system

    • To analyze the relationships between the different variables and increase your understanding of the data

    • To embed the solution in another piece of software via an application programming interface (API) call to a dynamic link library (DLL) or Java program

In a sense, ProForma mimics the role of human experts, who learn that certain events lead to certain other events, make predictions based on what they have learned, and modify their behavior to try to improve the predicted outcomes. ProForma does this too, although the tool has so far been used exclusively in business intelligence and marketing applications. With some minor modifications, it can also be used for investigative analyses of criminal activity. For example, an insurance company client doubled its rate of detection of fraudulent claims using this tool, according to the company. The system also consistently detected 74% of fraudulent cases, with a similar system identifying the riskiest 10% of policyholders.

SPSS Neural Connection

http://www.spss.com/spssbi/neuralconnection

Neural Connection is a stand-alone neural network package from SPSS. It enables a novice user to build predictive models without training in statistics or AI. SPSS also offers neural network components in its flagship data mining product, Clementine, which will be covered in another chapter. Neural Connection includes the following architectures: multilayer perceptron (MLP), radial basis function, Bayesian neural network, and the Kohonen (SOM) network. Neural Connection also provides the user with some of the standard statistical processes, such as multiple regression, closest class mean classifier, and principal component analysis techniques.

The tool provides data-management features, ranging from viewing descriptive statistics of a database to performing transformations (e.g., the creation of ratios or splitting a database into training and testing partitions). This interface, as with most of the advanced generation of neural network tools, is graphically presented, requiring no programming or knowledge of statistics. A unique What If? utility enables the user to explore the results of the models interactively and graphically. It can also reposition two variables in the analysis to see how the change affects the model's outcome. The user can create specialized models for specific needs by combining the modeling and forecasting tools, such as the creation of committees of models for a combined set of predictions.
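A committee of models, as mentioned above, combines the outputs of several models into one prediction. The Python sketch below shows the simplest version, an unweighted average; it is an illustration only and does not reflect how Neural Connection combines its models. The three member models and the function name `committee_predict` are hypothetical.

```python
from statistics import mean

def committee_predict(models, inputs):
    """Average the predictions of all committee members for one input record."""
    return mean(m(inputs) for m in models)

# Hypothetical members: in practice these would be separately trained models
# (e.g. an MLP, a radial basis function network, and a regression).
members = [
    lambda x: 2.0 * x[0] + 1.0,
    lambda x: 1.5 * x[0] + 2.0,
    lambda x: 2.5 * x[0],
]

combined = committee_predict(members, [4.0])  # members predict 9.0, 8.0, 10.0
```

Averaging tends to help when the members make different errors, because individual over- and under-predictions partly cancel.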

STATISTICA Neural Networks

http://www.statsoftinc.com/stat_nn.html

This is a comprehensive, flexible, and powerful neural network package. It features pre- and postprocessing of data, including data selection, nominal-value encoding, scaling, normalization, and missing value substitution, with interpretation for classification, regression, and time series problems. The tool also comes with a wizard interface it calls the Intelligent Problem Solver, which walks the user step by step through creating a variety of different networks and choosing the network with the best performance. This is a state-of-the-art data mining product from a statistical software company with extensive experience in data modeling.

The tool also has an Input Feature Selection component for automating the selection of the right input variables for exploratory data analysis. The tool supports a wide variety of neural network architectures and combinations of network architectures of practically unlimited size. It has comprehensive graphical and statistical feedback that facilitates interactive exploratory analyses. Because this tool is produced by a statistical software company, integration with the STATISTICA main system is supported, including direct transfer of data and graphs. As with most high-end neural network tools, code can be generated for integration with embedded solutions using other programming languages.

A Neuro-Genetic Input Selection component uses a genetic algorithm to automatically search for optimal combinations of input variables, even where there are correlations and nonlinear interdependencies. STATISTICA Neural Networks includes a Principal Components Analysis component to extract smaller numbers of dimensions from raw data inputs. The tool also includes automatic data scaling and recoding for both inputs and outputs. For classification problems, you can set confidence limits, which the tool uses to assign cases to classes.
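The text does not say how the confidence limits are applied. One common reading, assumed here, is a pair of accept and reject thresholds on a network output between 0 and 1, with cases that fall between the two left unclassified. The Python sketch below illustrates that reading; the function name `assign_class`, the class labels, and the default thresholds are hypothetical and are not taken from the product.

```python
def assign_class(activation, accept=0.95, reject=0.05):
    """Map a network output in [0, 1] to a class using two confidence limits."""
    if not 0.0 <= reject <= accept <= 1.0:
        raise ValueError("limits must satisfy 0 <= reject <= accept <= 1")
    if activation >= accept:
        return "positive"   # confident enough to assign the class
    if activation <= reject:
        return "negative"   # confident enough to rule the class out
    return "unknown"        # between the limits: leave for manual review

# Hypothetical network outputs for five cases.
outputs = [0.99, 0.50, 0.02, 0.95, 0.06]
labels = [assign_class(o) for o in outputs]
```

Widening the gap between the two limits trades coverage for reliability: fewer cases are classified automatically, but those that are carry more confidence.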

Ward Systems

http://www.wardsystems.com

The NeuroShell tool from Ward Systems is another package that has been around since the mid-1980s. This tool has a very simple and clean interface, making the process of constructing a predictive model using neural networks and genetic algorithms a very easy task. Ward Systems provides a variety of classification tools based on its neural network technology, including the following packages:

  • NeuroShell Predictor. This is Ward Systems' core product for forecasting and estimation problems.

  • NeuroShell Classifier. This professional system learns historical patterns to categorize or classify data.

  • GeneHunter. This optimizer component uses genetic algorithms to find optimal solutions for many modeling problems.

  • NeuroShell Engine. This is the package for creating an API from the Predictor and Classifier tools.

As seen in these product descriptions, some of the more advanced neural network products incorporate yet another AI technique to optimize their design and performance: genetic algorithms (GAs). GAs are a programming mechanism based on an evolutionary method of computation pioneered by Dr. John Holland of the University of Michigan. GAs are an ingenious method of arriving at solutions and optimizing the performance of neural networks, and they are analogous to the way nature evolves through mutation and selection in its "survival of the fittest" architectural design. GAs optimize their performance through a trial-and-error process, evaluating the correctness of their solutions to gradually improve their outputs.




Investigative Data Mining for Security and Criminal Detection
ISBN: 0750676132
Year: 2005
Pages: 232
Authors: Jesus Mena
