● there should be some minimum number of samples that are assigned to every class, to ensure that each class is properly represented.

Test data are used in the assessment of classification accuracy. These data are put to one side until the thematic map has been produced. The test data are then categorised, using the same procedure as that which generated the thematic map. Comparing the label allocated to each sample element of the test data by the classifier with the label provided by the human observer allows an assessment of the accuracy achieved by the classifier. This topic is discussed below. Where test data cannot be acquired, cross-validation is a possible solution (Section 2.4).
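
The label comparison described above can be sketched in a few lines of Python. This is a minimal illustration, not the procedure of any particular package: the function names `overall_accuracy` and `confusion_counts`, and the class labels in the example, are hypothetical and chosen only for this sketch.

```python
from collections import Counter


def overall_accuracy(reference_labels, predicted_labels):
    """Fraction of test samples whose classifier label matches the observer's label."""
    if len(reference_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")
    if not reference_labels:
        raise ValueError("at least one test sample is required")
    agreements = sum(r == p for r, p in zip(reference_labels, predicted_labels))
    return agreements / len(reference_labels)


def confusion_counts(reference_labels, predicted_labels):
    """Count each (observer label, classifier label) pair in the test data."""
    return Counter(zip(reference_labels, predicted_labels))


# Hypothetical test data: labels from the human observer and from the classifier.
reference = ["water", "forest", "forest", "urban", "water"]
predicted = ["water", "forest", "urban", "urban", "water"]

accuracy = overall_accuracy(reference, predicted)
counts = confusion_counts(reference, predicted)
```

The pair counts show which classes are confused with which others, whereas the single accuracy figure only summarises the overall agreement.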

Some authors describe their test data as ‘ground truth’. This is a misleading and inaccurate description. It is human to err, and one presumes that this aspect of human behaviour extends to the collection of test and training data in remote sensing. Some of the errors in a thematic map may not be due to the fallibility of the classifier but may be the result of mislocation of some elements of the test data, or of faulty identification of the attributes of the test data.

2.6.1 Sampling scheme

A number of sampling schemes are described in the literature (see, for example, Berry and Baker, 1968; Borak and Strahler, 1999; Ginevan, 1979; Fitzpatrick-Lins, 1981; Stehman, 1992; Congalton and Green, 1999). Congalton (1988) suggests that both random sampling without replacement and stratified unaligned random sampling generally provide satisfactory results. Stehman (1992) suggests, however, that the usual formulae for deriving the value of the kappa coefficient, which is used in accuracy assessment (Section 2.7), may perform poorly when stratified sampling methods are used. Congalton and Plourde (2000) have a more relaxed view, but make the point that the placement of samples is as important as the sampling scheme. Further discussion is contained in Brogaard and Ólafsdóttir (1997).
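
As a rough illustration of two of the schemes named above, the following Python sketch draws pixel locations by simple random sampling without replacement and by a simple block-stratified random scheme (one random pixel per grid block). The function names, the image dimensions and the block size are assumptions made for this example, and the block-stratified version is only one plausible reading of stratified random sampling, not necessarily the exact scheme evaluated by the cited authors.

```python
import random


def simple_random_sample(n_rows, n_cols, n_samples, seed=0):
    """Draw n_samples distinct (row, col) pixel locations without replacement."""
    rng = random.Random(seed)
    flat_indices = rng.sample(range(n_rows * n_cols), n_samples)
    return [divmod(index, n_cols) for index in flat_indices]


def block_stratified_sample(n_rows, n_cols, block_size, seed=0):
    """Draw one random (row, col) location inside each block of a regular grid."""
    rng = random.Random(seed)
    points = []
    for row_start in range(0, n_rows, block_size):
        for col_start in range(0, n_cols, block_size):
            # Clip the block at the image edge so every point stays inside the image.
            row = rng.randrange(row_start, min(row_start + block_size, n_rows))
            col = rng.randrange(col_start, min(col_start + block_size, n_cols))
            points.append((row, col))
    return points


# Hypothetical 100 x 120 pixel image.
random_points = simple_random_sample(100, 120, n_samples=50)
stratified_points = block_stratified_sample(100, 120, block_size=20)
```

The stratified version guarantees that every part of the image contributes a sample, while the simple random version may leave some regions unsampled by chance.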

Atkinson (1991, 1996) notes that the standard statistical rules of sampling, as outlined in conventional statistical texts such as Cochran (1977), do not hold for spatial data because locations in space are in fixed positions and their attributes are therefore autocorrelated. He proposes the use of geostatistical methods, which take into account the spatial relationships between the pixels. Van der Meer et al. (1998) present the results of an investigation of mapping accuracy using geostatistical methods. They show that the optimum sampling distance varies with the nature of the target, being of the order of 100 m for vegetated areas. For soil, however, the optimum sampling distance is the highest achievable spatial resolution of the instrument. Guidance on the use of geostatistical methods in remote sensing is provided by Curran (1988) and Woodcock et al. (1988a, b).
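
The text does not say which geostatistical tool the cited studies used. One standard tool for describing spatial autocorrelation is the empirical semivariogram, which for a lag h is half the mean squared difference between values h steps apart. The sketch below computes it along a single transect of pixel values; the function name and the transect values are hypothetical, and the choice of the semivariogram itself is an assumption made for illustration.

```python
def empirical_semivariogram(values, max_lag):
    """Return {lag: semivariance} for a 1-D transect of equally spaced values.

    The semivariance at a lag is sum((z[i + lag] - z[i]) ** 2) / (2 * N),
    where N is the number of value pairs separated by that lag.
    """
    semivariances = {}
    for lag in range(1, max_lag + 1):
        squared_diffs = [
            (values[i + lag] - values[i]) ** 2
            for i in range(len(values) - lag)
        ]
        if squared_diffs:  # skip lags longer than the transect
            semivariances[lag] = sum(squared_diffs) / (2 * len(squared_diffs))
    return semivariances


# Hypothetical transect of pixel values with a steady trend.
transect = [1, 2, 3, 4, 5]
gamma = empirical_semivariogram(transect, max_lag=2)
```

In practice the lag at which the semivariance stops increasing is often read as the distance beyond which samples are effectively uncorrelated, which is one way a sampling distance might be chosen.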

Classification Methods for Remotely Sensed Data, Second Edition
ISBN: 1420090720
Year: 2001
Pages: 354