C++ Neural Networks and Fuzzy Logic
by Valluru B. Rao
M&T Books, IDG Books Worldwide, Inc.
ISBN: 1558515526 Pub Date: 06/01/95
Table 5.9 shows that the network weight vector changed from an initial vector (0, 0) to the final weight vector (1, 0) in eight iterations. This example is not a network for pattern matching. If you think about it, you will realize that the network is designed to fire if the first digit in the pattern is a 1, and not otherwise. An analogy for this kind of problem is determining whether a given image contains a specific object in a specific part of the image, such as the dot that should appear in the letter i.
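The behavior of the final weight vector can be checked with a short sketch. The function name `fires` and the threshold value of 0.5 are assumptions made for illustration; the text here gives only the weight vector (1, 0) and the intended behavior.

```cpp
#include <cstddef>
#include <vector>

// Returns true when the weighted sum of the inputs exceeds the threshold.
// The threshold is a hypothetical parameter; the text does not state its value.
bool fires(const std::vector<double>& weights,
           const std::vector<int>& pattern,
           double threshold) {
    double activation = 0.0;
    for (std::size_t i = 0; i < weights.size() && i < pattern.size(); ++i) {
        activation += weights[i] * pattern[i];
    }
    return activation > threshold;
}
```

With weights (1, 0) and any threshold between 0 and 1, the second digit contributes nothing to the activation, so the neuron fires exactly when the first digit is 1.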
If the initial weights are chosen prudently, so that they bear some relevance to the problem, the network can converge in fewer iterations than it otherwise would. Thus, encoding algorithms are important. We now present some of the encoding algorithms.
Consider a network that is to associate each input pattern with itself and that gets binary patterns as inputs. Make a bipolar mapping on the input pattern. That is, replace each 0 by -1. Call the mapped pattern the vector x, when written as a column vector. The transpose, the same vector written as a row vector, is x^T. When you form the product xx^T, you get a square matrix whose order is the size of x. Obtain similar matrices for the other patterns you want the network to store. Add these matrices to give you the matrix of weights to be used initially, as we did in Chapter 4. This process can be described with the following equation:
$$ W = \sum_i x_i x_i^T $$
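The summation of outer products can be sketched in C++ as follows. The names `toBipolar` and `autoassociativeWeights` and the type aliases are hypothetical, chosen for this illustration. The sketch assumes all patterns have the same length and follows the equation as written, so it leaves the diagonal entries in place.

```cpp
#include <cstddef>
#include <vector>

using Vector = std::vector<int>;
using Matrix = std::vector<std::vector<int>>;

// Map a binary pattern (0/1) to bipolar form (-1/+1).
Vector toBipolar(const Vector& binary) {
    Vector bipolar;
    bipolar.reserve(binary.size());
    for (int bit : binary) {
        bipolar.push_back(bit == 0 ? -1 : 1);
    }
    return bipolar;
}

// Build W as the sum of the outer products x * x^T over all patterns.
// Assumes every pattern has the same length as the first one.
Matrix autoassociativeWeights(const std::vector<Vector>& binaryPatterns) {
    const std::size_t n =
        binaryPatterns.empty() ? 0 : binaryPatterns.front().size();
    Matrix w(n, std::vector<int>(n, 0));
    for (const Vector& pattern : binaryPatterns) {
        const Vector x = toBipolar(pattern);
        for (std::size_t i = 0; i < n; ++i) {
            for (std::size_t j = 0; j < n; ++j) {
                w[i][j] += x[i] * x[j];
            }
        }
    }
    return w;
}
```

For the two binary patterns (1, 0, 1, 0) and (0, 1, 0, 1), the bipolar vectors are negatives of each other, so both contribute the same outer product and every entry of W is either 2 or -2.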
Consider a network that is to associate one input pattern with another pattern and that gets binary patterns as inputs. Make a bipolar mapping on the input pattern. That is, replace each 0 by -1. Call the mapped pattern the vector x when written as a column vector. Get a similar bipolar mapping for the corresponding associated pattern. Call it y. When you form the product xy^T, you get a matrix with as many rows as x has components and as many columns as y has components. Obtain similar matrices for the other pattern pairs you want the network to store. Add these matrices to give you the matrix of weights to be used initially. The following equation restates this process:
$$ W = \sum_i x_i y_i^T $$
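A minimal sketch of this heteroassociative encoding follows. The names `bipolarOf` and `heteroassociativeWeights` are hypothetical. The sketch assumes the inputs and their associated patterns are supplied as two parallel lists, that all input patterns share one length, and that all associated patterns share one length.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

using Pattern = std::vector<int>;
using WeightMatrix = std::vector<std::vector<int>>;

// Map a binary pattern (0/1) to bipolar form (-1/+1).
Pattern bipolarOf(const Pattern& binary) {
    Pattern bipolar;
    bipolar.reserve(binary.size());
    for (int bit : binary) {
        bipolar.push_back(bit == 0 ? -1 : 1);
    }
    return bipolar;
}

// Build W as the sum of the outer products x * y^T, where inputs[k] is
// associated with outputs[k]. Extra unpaired entries are ignored.
WeightMatrix heteroassociativeWeights(const std::vector<Pattern>& inputs,
                                      const std::vector<Pattern>& outputs) {
    const std::size_t count = std::min(inputs.size(), outputs.size());
    if (count == 0) {
        return {};
    }
    const std::size_t rows = inputs.front().size();
    const std::size_t cols = outputs.front().size();
    WeightMatrix w(rows, std::vector<int>(cols, 0));
    for (std::size_t k = 0; k < count; ++k) {
        const Pattern x = bipolarOf(inputs[k]);
        const Pattern y = bipolarOf(outputs[k]);
        for (std::size_t i = 0; i < rows; ++i) {
            for (std::size_t j = 0; j < cols; ++j) {
                w[i][j] += x[i] * y[j];
            }
        }
    }
    return w;
}
```

Associating (1, 0, 1) with (1, 0) and (0, 1, 1) with (0, 1), for example, gives a 3 by 2 matrix whose last row is zero, because the two outer products cancel there.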
One of the many interesting paradigms you encounter in neural network models and theory is the strategy called winner takes all. If one winner is to emerge from a crowd of neurons in a particular layer, there needs to be competition. Since in such a competition every neuron is for itself, lateral connections are needed to express this circumstance. The lateral connections from any neuron to the others should have a negative weight. Alternatively, the neuron with the highest activation is considered the winner and only its weights are modified in the training process, leaving the weights of the others unchanged. Winner takes all means that only one neuron in that layer fires and the others do not. This can happen in a hidden layer or in the output layer.
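The second variant, in which the neuron with the highest activation wins and only its weights change, can be sketched as follows. The function names are hypothetical, and the update rule in `updateWinnerOnly` (moving the winner's weights a fraction of the way toward the input) is an assumption made for illustration; the text does not specify how the winner's weights are modified.

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <vector>

// Index of the neuron with the highest activation (first one wins ties).
// Assumes activations is not empty.
std::size_t winnerIndex(const std::vector<double>& activations) {
    return static_cast<std::size_t>(std::distance(
        activations.begin(),
        std::max_element(activations.begin(), activations.end())));
}

// Layer output under winner takes all: the winner fires, the rest do not.
std::vector<int> winnerTakesAll(const std::vector<double>& activations) {
    std::vector<int> outputs(activations.size(), 0);
    if (!activations.empty()) {
        outputs[winnerIndex(activations)] = 1;
    }
    return outputs;
}

// Modify only the winner's weight vector; all other rows stay the same.
// Assumed rule: w += rate * (input - w).
void updateWinnerOnly(std::vector<std::vector<double>>& weights,
                      const std::vector<double>& input,
                      std::size_t winner,
                      double rate) {
    std::vector<double>& w = weights[winner];
    for (std::size_t i = 0; i < w.size() && i < input.size(); ++i) {
        w[i] += rate * (input[i] - w[i]);
    }
}
```

Picking the maximum directly stands in for the competition itself; a network with negative lateral weights would reach the same outcome by letting the neurons suppress one another over several iterations.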
In another situation, when a particular category of input is to be identified from among several groups of inputs, a subset of the neurons has to be dedicated to making that happen. In this case, as far as such a subset of neurons is concerned, inhibition increases for distant neurons, whereas excitation increases for the neighboring ones. The phrase "on center, off surround" describes this phenomenon of distant inhibition and near excitation.
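One simple way to picture near excitation and distant inhibition is as a matrix of lateral weights. The sketch below is only an illustration under stated assumptions: the neurons are arranged in a single row, distance is the difference between their indices, and `radius`, `excite`, and `inhibit` are hypothetical parameters that do not come from the text.

```cpp
#include <cstddef>
#include <vector>

// Lateral weights for a row of neurons: neighbors within `radius` get a
// positive (excitatory) weight, more distant neurons get a negative
// (inhibitory) weight, and a neuron has no lateral connection to itself.
std::vector<std::vector<double>> lateralWeights(std::size_t neurons,
                                                std::size_t radius,
                                                double excite,
                                                double inhibit) {
    std::vector<std::vector<double>> w(neurons,
                                       std::vector<double>(neurons, 0.0));
    for (std::size_t i = 0; i < neurons; ++i) {
        for (std::size_t j = 0; j < neurons; ++j) {
            if (i == j) {
                continue;
            }
            const std::size_t distance = (i > j) ? (i - j) : (j - i);
            w[i][j] = (distance <= radius) ? excite : -inhibit;
        }
    }
    return w;
}
```

A step function of distance is the crudest possible choice; a graded profile, in which the weight changes gradually with distance, would follow the same pattern.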
Weights are also the prime components in a neural network: on the one hand they reflect the memory stored by the network, and on the other hand they are the basis for learning and training.
You have seen that mutually orthogonal or almost orthogonal patterns are required as stable stored patterns for the Hopfield network, which we discussed before for pattern matching. Similar restrictions are found with other neural networks as well. Sometimes it is not a restriction at all; rather, the purpose of the model makes a certain type of input natural. Certainly, in the context of pattern classification, binary input patterns make the problem setup simpler. Inputs come in three varieties: binary, bipolar, and analog. Networks that accept analog signals as inputs are for continuous models, and those that require binary or bipolar inputs are for discrete models. Binary inputs can be fed to networks for continuous models, but analog signals cannot be input to networks for discrete models (unless they are fuzzified). With inputs being either discrete or analog, and models being either discrete or continuous, there are potentially four combinations, but one of them, analog inputs to a discrete model, is untenable.
An example of a continuous model is a network that adjusts the angle by which the steering wheel of a truck is to be turned in order to back the truck into a parking space. If a network is supposed to recognize characters of the alphabet, a means of discretizing a character allows the use of a discrete model.
What are the types of inputs for problems like image processing or handwriting analysis? Remembering that artificial neurons, as processing elements, aggregate their inputs by using connection weights, and that the output neuron uses a threshold function, you know that the inputs have to be numerical. A handwritten character can be superimposed on a grid, and the input can consist of the cells in each row of the grid where a part of the character is present. In other words, the input corresponding to one character will be a set of binary or gray-scale sequences containing one sequence for each row of the grid. A 1 in a particular position in the sequence for a row shows that the corresponding pixel is present (black) in that part of the grid, while a 0 shows that it is not. The grid has to be big enough to accommodate the largest character under study, as well as the most complex features.
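The binary case of this grid encoding can be sketched as follows. The function name `gridToBinaryRows` and the convention of marking black cells with `#` are assumptions made for illustration; gray-scale input would need a numeric value per cell instead of a 0 or 1.

```cpp
#include <string>
#include <vector>

// Convert a character drawn on a grid into one binary sequence per row:
// 1 where the cell is marked (black), 0 where it is not.
std::vector<std::vector<int>> gridToBinaryRows(
        const std::vector<std::string>& grid, char mark = '#') {
    std::vector<std::vector<int>> rows;
    rows.reserve(grid.size());
    for (const std::string& line : grid) {
        std::vector<int> row;
        row.reserve(line.size());
        for (char cell : line) {
            row.push_back(cell == mark ? 1 : 0);
        }
        rows.push_back(row);
    }
    return rows;
}
```

For a small letter i drawn on a grid of four rows and three columns as `.#.`, `...`, `.#.`, `.#.`, the first row becomes the sequence 0 1 0 (the dot), and the second row becomes 0 0 0.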
Copyright © IDG Books Worldwide, Inc.