
One way to avoid over-training is to consider the error derived from a test data set; that is, the user constructs two data sets. One is used for training the neural network in the normal way, and the other is a test data set used for monitoring the ‘test error’. If the magnitude of the test error begins to increase, then network training should stop. Another approach to overcoming the over-training effect is to limit the ability of the network to take advantage of noise (as illustrated in Figure 3.4) by using one of the techniques of network pruning to reduce the number of connections between nodes. A common method of network pruning is to begin with a large network and then simplify its structure by cutting some of the links between neurones. Network pruning offers several advantages, specifically: (1) it avoids the problem of over-training; (2) it makes the network smaller and thus more capable of generalising; and (3) it reduces the computational burden because fewer weights and neurones are used.
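The early-stopping procedure described above can be sketched in a few lines of Python. This is a minimal illustration only: the train_one_epoch and mean_error methods are hypothetical stand-ins for the training step and error measure of whatever network is actually in use, and the patience parameter (how many non-improving epochs to tolerate) is an assumption, not part of the text.

    import numpy as np

    def train_with_early_stopping(network, train_set, test_set,
                                  max_epochs=1000, patience=5):
        # `network.train_one_epoch` and `network.mean_error` are
        # hypothetical: substitute the training step and error
        # function of the network actually being used.
        best_error = np.inf
        bad_epochs = 0
        for epoch in range(max_epochs):
            network.train_one_epoch(train_set)         # normal training
            test_error = network.mean_error(test_set)  # monitor 'test error'
            if test_error < best_error:
                best_error = test_error
                bad_epochs = 0
            else:
                # the test error is no longer decreasing, so
                # over-training may have begun; stop after
                # `patience` consecutive non-improving epochs
                bad_epochs += 1
                if bad_epochs >= patience:
                    break
        return network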

There are two major groups of pruning algorithms. The first kind of method eliminates weights by adding extra terms to the error function (refer to Equation (3.4)). This helps the network to detect redundant weights. In other words, the objective function forces the network to use only some of the weights to achieve a solution, and the redundant weights, whose magnitudes will be near zero, can be eliminated (Ji et al., 1990). The second group of network pruning methods is based on analysing the sensitivity of the error function to the removal of a particular link and its associated weight. A weight can be removed if its removal results in only a slight increase in the error. This weight removal process can be iterated until some predefined error threshold is exceeded (Karnin, 1990). Reed (1993) provides a comprehensive survey of network pruning techniques, while Kavzoglu and Mather (1999) give an example of the use of network pruning methods in remote sensing. Other references are Hassibi and Stork (1993), who describe the algorithm known as the optimal brain surgeon, and Le Cun et al. (1990), whose technique is known as optimal brain damage.
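Both groups of pruning methods can be sketched briefly. The sketch below, in Python with NumPy, assumes a simple squared-weight penalty as the extra term added to the error function, and flattens the network weights into a single one-dimensional array for simplicity; the error_fn argument is a hypothetical stand-in for evaluating the network error under a given weight configuration. It illustrates the general idea, not the specific algorithms of Ji et al., Karnin, or the brain-surgeon/brain-damage methods cited above.

    import numpy as np

    def penalised_error(base_error, weights, decay=1e-3):
        # First group: an extra penalty term added to the error
        # function (cf. Equation (3.4)) drives redundant weights
        # towards zero, where they can then be eliminated.
        return base_error + decay * np.sum(weights ** 2)

    def sensitivity_pruning(weights, error_fn, max_error):
        # Second group: repeatedly remove the single weight whose
        # removal increases the error least, and stop once the
        # predefined error threshold would be exceeded.
        weights = weights.copy()
        while True:
            best_idx, best_error = None, np.inf
            for idx in np.flatnonzero(weights):
                trial = weights.copy()
                trial[idx] = 0.0          # cut this link and its weight
                e = error_fn(trial)
                if e < best_error:
                    best_idx, best_error = idx, e
            if best_idx is None or best_error > max_error:
                return weights            # threshold exceeded: stop
            weights[best_idx] = 0.0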

3.2 Kohonen’s self-organising feature map

The idea of the self-organising feature map, or SOM, was developed by Kohonen (1982, 1988a, 1989). Unlike the multilayer perceptron, this kind of network contains no hidden layers; it is made up of one input layer and one output layer. The SOM network has an interesting property: that of automatically detecting (self-organising) the relationships within the set of input patterns. This property can be applied to the problem of mapping images from a higher-dimensional space onto a two-dimensional feature space and, via the use of the learning vector quantisation training algorithm (Kangas et al., 1990), to performing image classification.
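A minimal sketch of SOM training is given below, again in Python with NumPy. The grid size, decaying learning rate and Gaussian neighbourhood are illustrative assumptions rather than the specific formulation discussed in this chapter; each output unit carries a weight vector of the same dimension as the input, and training pulls the winning unit and its grid neighbours towards each input pattern, which is what produces the self-organising behaviour.

    import numpy as np

    def train_som(data, grid_rows=10, grid_cols=10, epochs=100,
                  lr0=0.5, sigma0=3.0, rng=None):
        # Map n-dimensional input patterns onto a two-dimensional
        # grid of output units (illustrative parameter choices).
        rng = np.random.default_rng() if rng is None else rng
        n_features = data.shape[1]
        # one weight vector per output unit, initialised randomly
        weights = rng.random((grid_rows, grid_cols, n_features))
        # grid coordinates of every unit, used for the neighbourhood
        coords = np.stack(np.meshgrid(np.arange(grid_rows),
                                      np.arange(grid_cols),
                                      indexing='ij'), axis=-1)
        for t in range(epochs):
            lr = lr0 * np.exp(-t / epochs)        # decaying learning rate
            sigma = sigma0 * np.exp(-t / epochs)  # shrinking neighbourhood
            for x in rng.permutation(data):
                # best-matching unit: the closest weight vector to the input
                dist = np.linalg.norm(weights - x, axis=-1)
                bmu = np.unravel_index(np.argmin(dist), dist.shape)
                # Gaussian neighbourhood centred on the winning unit
                d2 = np.sum((coords - np.array(bmu)) ** 2, axis=-1)
                h = np.exp(-d2 / (2 * sigma ** 2))
                # move all weights towards the input, most strongly
                # near the winner: this is the self-organisation step
                weights += lr * h[..., None] * (x - weights)
        return weights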
