| C++ Neural Networks and Fuzzy Logic |
by Valluru B. Rao
M&T Books, IDG Books Worldwide, Inc.
ISBN: 1558515526 Pub Date: 06/01/95
|Previous||Table of Contents||Next|
Neural networks are dynamic systems in the learning and training phase of their operation, and convergence is an essential feature, so it was necessary for the researchers developing the models and their learning algorithms to find a provable criterion for convergence in a dynamic system. The Lyapunov function, mentioned previously, turned out to be the most convenient and appropriate function. It is also referred to as the energy function. The function decreases as the system states change. Such a function needs to be found and watched as the network operation continues from cycle to cycle. Usually it involves a quadratic form. The least mean squared error is an example of such a function. Lyapunov function usage assures a system stability that cannot occur without convergence. It is convenient to have one value, that of the Lyapunov function specifying the system behavior. For example, in the Hopfield network, the energy function is a constant times the sum of products of outputs of different neurons and the connection weight between them. Since pairs of neuron outputs are multiplied in each term, the entire expression is a quadratic form.
Besides the applications for which a neural network is intended, and depending on these applications, you need to know certain aspects of the model. The length of encoding time and the length of learning time are among the important considerations. These times could be long but should not be prohibitive. It is important to understand how the network behaves with new inputs; some networks may need to be trained all over again, but some tolerance for distortion in input patterns is desirable, where relevant. Restrictions on the format of inputs should be known.
An advantage of neural networks is that they can deal with nonlinear functions better than traditional algorithms can. The ability to store a number of patterns, or needing more and more neurons in the output field with an increasing number of input patterns are the kind of aspects addressing the capabilities of a network and also its limitations.
Sometimes neural networks are used as adaptive filters, the motivation for such an architecture being selectivity. You want the neural network to classify each input pattern into its appropriate category. Adaptive models involve changing of connection weights during all their operations, while nonadaptive ones do not alter the weights after the phase of learning with exemplars. The Hopfield network is often used in modeling a neural network for optimization problems, and the Backpropagation model is a popular choice in most other applications. Neural network models are distinguishable sometimes by their architecture, sometimes by their adaptive methods, and sometimes both. Methods for adaptation, where adaptation is incorporated, assume great significance in the description and utility of a neural network model.
For adaptation, you can modify parameters in an architecture during training, such as the learning rate in the backpropagation training method for example. A more radical approach is to modify the architecture itself during training. New neural network paradigms change the number or layers and the number of neurons in a layer during training. These node adding or pruning algorithms are termed constructive algorithms. (See Gallant for more details.)
The analogy for a neural network presented at the beginning of the chapter was that of a multidimensional mapping surface that maps inputs to outputs. For each unseen input with respect to a training set, the generalization ability of a network determines how well the mapping surface renders the new input in the output space. A stock market forecaster must generalize well, otherwise you lose money in unseen market conditions. The opposite of generalization is memorization. A pattern recognition system for images of handwriting, should be able to generalize a letter A that is handwritten in several different ways by different people. If the system memorizes, then you will not recognize the letter A in all cases, but instead will categorize each letter A variation separately. The trick to achieve generalization is in network architecture, design, and training methodology. You do not want to overtrain your neural network on expected outcomes, but rather should accept a slightly worse than minimum error on your training set data. You will learn more about generalization in Chapter 14.
Learning and training are important issues in applying neural networks. Two broad categories of network learning are supervised and unsupervised learning. Supervised learning provides example outputs to compare to while unsupervised learning does not. During supervised training, external prototypes are used as target outputs and the network is given a learning algorithm to follow and calculate new connection weights that bring the output closer to the target output. You can refer to networks using unsupervised learning as self-organizing networks, since no external information or guidance is used in learning. Several neural network paradigms were presented in this chapter along with their learning and training characteristics.
|Previous||Table of Contents||Next|
Copyright © IDG Books Worldwide, Inc.