6.3. Examining Data Content
Examining data content is an important part of any project. Understanding information specific to your dataset will help you use it more effectively. Each piece of spatial data has a geographic component (coordinates describing the location of real features), but it also has what are called attributes. These are non-geographic data about the geographic feature, such as the name or elevation of an airport.
6.3.1. Viewing Summary Information About Airports
The MapServer demo data includes a variety of vector spatial files; therefore, you will use the ogrinfo tool to gather information about the files. At the command prompt, change into the workshop folder, and run the ogrinfo command to have it list the datasets that are in the data folder. The output from the command will look like Example 6-1.
Example 6-1. Showing a list of the layer
Only two features are shown in this example, the first starting with OGRFeature(airports):0. The full example goes all the way to OGRFeature(airports):11, including all 12 airports. The rest of the points aren't shown in this example, just to keep it simple.
ogrinfo is a great tool for digging even deeper into your data. There are more options that can be used, including a database query-like ability to select features and the ability to list only features that fall within a certain area. Running man ogrinfo (if your operating system supports manpages) shows the full usage for each parameter. Otherwise, the details are available on the OGR web site at http://www.gdal.org/ogr/ogr_utilities.html. You can also run the ogrinfo command with the --help parameter (ogrinfo --help) to get a summary of options. Example 6-4 shows some examples of how they can be used with your airport data.
> ogrinfo data airports -where "name='Bolduc Seaplane Base'"
INFO: Open of 'data' using driver 'ESRI Shapefile' successful.
Layer name: airports
Geometry: Point
Feature Count: 1
Extent: (434634.000000, 5228719.000000) - (496393.000000, 5291930.000000)
Layer SRS WKT:
(unknown)
NAME: String (64.0)
LAT: Real (12.4)
LON: Real (12.4)
ELEVATION: Real (12.4)
QUADNAME: String (32.0)
OGRFeature(airports):1
NAME (String) = Bolduc Seaplane Base
LAT (Real) = 47.5975
LON (Real) = -93.4106
ELEVATION (Real) = 1325.0000
QUADNAME (String) = Balsam Lake
POINT (469137 5271647)
This example lists only those airports that have the name Bolduc Seaplane Base. As you can imagine, there is only one. Therefore, the summary information about this layer and one set of attribute values are listed for the single airport that meets this criterion. As Example 6-5 shows, the -sql option can also specify which attributes to list in the ogrinfo output.
> ogrinfo data airports -sql "select name from airports where quadname='Side Lake'"
INFO: Open of 'data' using driver 'ESRI Shapefile' successful.
layer names ignored in combination with -sql.
Layer name: airports
Geometry: Point
Feature Count: 2
Extent: (434634.000000, 5228719.000000) - (496393.000000, 5291930.000000)
Layer SRS WKT:
(unknown)
name: String (64.0)
OGRFeature(airports):4
name (String) = Christenson Point Seaplane Base
POINT (495913 5279532)
OGRFeature(airports):10
name (String) = Sixberrys Landing Seaplane Base
POINT (496393 5280458)
The SQL parameter is set to show only one attribute, NAME, rather than all five attributes for each feature. It still shows the coordinates by default, but none of the other information is displayed. This is combined with a query to show only those features that meet a certain QUADNAME requirement.
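The two steps in that query (filter on QUADNAME, then keep only NAME) can be mimicked in plain Python to make the logic explicit. This is only an illustrative sketch: the records are copied from the ogrinfo listings in this section, the names airports and select_names are made up for the example, and nothing here calls OGR itself.

```python
# Illustrative records copied from the ogrinfo listings in this section.
# Only the two attributes needed for the query are kept.
airports = [
    {"name": "Bolduc Seaplane Base", "quadname": "Balsam Lake"},
    {"name": "Christenson Point Seaplane Base", "quadname": "Side Lake"},
    {"name": "Grand Rapids-Itasca County/Gordon Newstrom Field",
     "quadname": "Grand Rapids"},
    {"name": "Richter Ranch Airport", "quadname": "Cohasset East"},
    {"name": "Sixberrys Landing Seaplane Base", "quadname": "Side Lake"},
]


def select_names(records, quadname):
    """Hypothetical helper, similar in spirit to:
    select name from airports where quadname='<quadname>'
    """
    return [r["name"] for r in records if r["quadname"] == quadname]


side_lake_names = select_names(airports, "Side Lake")
for name in side_lake_names:
    print(name)
```

Only the two Side Lake airports are printed, which corresponds to the two features in the ogrinfo output.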
Example 6-6 shows how ogrinfo can use some spatial logic to find features that are within a certain area.
> ogrinfo data airports -spat 451869 5225734 465726 5242150
Layer name: airports
Geometry: Point
Feature Count: 2
Extent: (434634.000000, 5228719.000000) - (496393.000000, 5291930.000000)
Layer SRS WKT:
(unknown)
NAME: String (64.0)
LAT: Real (12.4)
LON: Real (12.4)
ELEVATION: Real (12.4)
QUADNAME: String (32.0)
OGRFeature(airports):7
NAME (String) = Grand Rapids-Itasca County/Gordon Newstrom Field
LAT (Real) = 47.2108
LON (Real) = -93.5097
ELEVATION (Real) = 1355.0000
QUADNAME (String) = Grand Rapids
POINT (461401 5228719)
OGRFeature(airports):8
NAME (String) = Richter Ranch Airport
LAT (Real) = 47.3161
LON (Real) = -93.5914
ELEVATION (Real) = 1340.0000
QUADNAME (String) = Cohasset East
POINT (455305 5240463)
The ability to show only features based on where they are located is quite powerful. You do so using the -spat parameter followed by two pairs of coordinates. The first pair of coordinates, 451869 5225734, represents the southwest corner of the area you are interested in querying. The second pair, 465726 5242150, represents the northeast corner. Together they define a rectangular area.
ogrinfo then shows only those features that are located within the area you define. In this case, because the data is projected into the UTM coordinate system, the coordinates must be specified in UTM format in the -spat parameter. Because the data is stored in UTM coordinates, you can't specify the coordinates using decimal degrees (°) for instance. The coordinates must always be specified using the same units and projection as the source data, or you will get inaccurate results.
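For point data, the rectangle test behind -spat can be pictured as a simple comparison of coordinates. The following is a minimal sketch in Python, assuming a plain inclusive bounding-box check; the helper in_bbox is hypothetical and is not how OGR is implemented. The point coordinates are taken from the ogrinfo listings in this section.

```python
def in_bbox(x, y, xmin, ymin, xmax, ymax):
    """Hypothetical helper: True when (x, y) lies inside the rectangle.

    The rectangle is given in the same order as the -spat values:
    southwest corner (xmin, ymin), then northeast corner (xmax, ymax).
    """
    return xmin <= x <= xmax and ymin <= y <= ymax


# Rectangle from the -spat example, in UTM units like the source data.
bbox = (451869, 5225734, 465726, 5242150)

# POINT coordinates copied from the ogrinfo listings in this section.
points = {
    "Bolduc Seaplane Base": (469137, 5271647),
    "Grand Rapids-Itasca County/Gordon Newstrom Field": (461401, 5228719),
    "Richter Ranch Airport": (455305, 5240463),
}

inside = [name for name, (x, y) in points.items() if in_bbox(x, y, *bbox)]
print(inside)
```

Bolduc Seaplane Base falls east and north of the rectangle, so only the other two airports are kept, which agrees with the two features reported by ogrinfo.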
Example 6-7 is similar to a previous example showing complex query syntax using the -sql parameter, but it adds the -summary option.
> ogrinfo data airports -sql "select * from airports where elevation > 1350 and quadname like '%Lake'" -summary
INFO: Open of 'data' using driver 'ESRI Shapefile' successful.
layer names ignored in combination with -sql.
Layer name: airports
Geometry: Point
Feature Count: 5
Extent: (434634.000000, 5228719.000000) - (496393.000000, 5291930.000000)
If you add the -summary option, it doesn't list all the attributes of the features, but shows only a summary of the information. In this case, it summarizes only information that met the criteria of the -sql parameter. This is very handy if you just want to know how many features meet certain criteria or fall within a certain area but don't care to see all the details.
You can download a sample satellite image from http://geogratis.cgdi.gc.ca/download/RADARSAT/mosaic/canada_mosaic_lcc_1000m.zip. Unzipping the archive creates a file called canada_mosaic_lcc_1000m.tif. This file contains an image from the RADARSAT satellite. For more information about RADARSAT, see http://www.ccrs.nrcan.gc.ca/ccrs/data/satsens/radarsat/rsatndx_e.html.
To better understand what kind of data this is, use the gdalinfo command. Like the ogrinfo command, this tool lists certain pieces of information about a file, but the GDAL tools can interact with raster/image data. The output from gdalinfo is also very similar to ogrinfo as you can see in Example 6-8. You should change to the same folder as the image before running the gdalinfo command.
> gdalinfo canada_mosaic_lcc_1000m.tif
Driver: GTiff/GeoTIFF
Size is 5700, 4800
Coordinate System is:
PROJCS["LCC E008",
GEOGCS["NAD83",
DATUM["North_American_Datum_1983",
SPHEROID["GRS 1980",6378137,298.2572221010042,
AUTHORITY["EPSG","7019"]],
AUTHORITY["EPSG","6269"]],
PRIMEM["Greenwich",0],
UNIT["degree",0.0174532925199433],
AUTHORITY["EPSG","4269"]],
PROJECTION["Lambert_Conformal_Conic_2SP"],
PARAMETER["standard_parallel_1",49],
PARAMETER["standard_parallel_2",77],
PARAMETER["latitude_of_origin",0],
PARAMETER["central_meridian",-95],
PARAMETER["false_easting",0],
PARAMETER["false_northing",0],
UNIT["metre",1,
AUTHORITY["EPSG","9001"]]]
Origin = (-2600000.000000,10500000.000000)
Pixel Size = (1000.00000000,-1000.00000000)
Corner Coordinates:
Upper Left (-2600000.000,10500000.000) (177d17'32.31"W, 66d54'22.82"N)
Lower Left (-2600000.000, 5700000.000) (122d54'49.00"W, 36d12'53.87"N)
Upper Right ( 3100000.000,10500000.000) ( 9d58'39.57"W, 62d25'50.45"N)
Lower Right ( 3100000.000, 5700000.000) ( 62d32'49.65"W, 34d18'5.61"N)
Center ( 250000.000, 8100000.000) ( 89d56'43.00"W, 62d46'47.18"N)
Band 1 Block=5700x1 Type=Byte, ColorInterp=Gray
There are five main sections in this report. Unlike ogrinfo, gdalinfo doesn't have a lot of different options, and the attributes it reports are very simple. The first line tells you what image format the file is.
Driver: GTiff/GeoTIFF
In this case, it tells you the file is a GeoTIFF image. TIFF images are used in general computerized photographic applications such as digital photography and printing. However, GeoTIFF implies that the image has some geographic information encoded into it.
gdalinfo can be run with a --formats option, which lists all the raster formats it can read and possibly write. The version of GDAL included with FWTools has support for more than three
The next line shows the size of the image:
Size is 5700, 4800
An image size is characterized by the number of data columns and rows, in this case 5,700 columns by 4,800 rows. An image is a type of raster data. A raster is made up of cells arranged in rows and columns.
Images can be projected into various coordinate reference systems (see Appendix A for more about map projections):
Coordinate System is:
PROJCS["LCC E008",
GEOGCS["NAD83",
DATUM["North_American_Datum_1983",
SPHEROID["GRS 1980",6378137,298.2572221010042,
AUTHORITY["EPSG","7019"]],
AUTHORITY["EPSG","6269"]],
PRIMEM["Greenwich",0],
UNIT["degree",0.0174532925199433],
AUTHORITY["EPSG","4269"]],
PROJECTION["Lambert_Conformal_Conic_2SP"],
PARAMETER["standard_parallel_1",49],
PARAMETER["standard_parallel_2",77],
PARAMETER["latitude_of_origin",0],
PARAMETER["central_meridian",-95],
PARAMETER["false_easting",0],
PARAMETER["false_northing",0],
UNIT["metre",1,
AUTHORITY["EPSG","9001"]]]
These assign a cell to a global geographic coordinate. Often these coordinates need to be adjusted to improve the appearance of particular applications or to line up with other pieces of data. This image is in a projection called Lambert Conformal Conic (LCC). You will need to know what projection your data is in if you want to use it with other data. If the projections of two datasets don't match, you may need to reproject them into a common projection.
The latitude of origin and central meridian are set by these parameters:
PARAMETER["latitude_of_origin",0],
PARAMETER["central_meridian",-95],
Note that in the projection shown earlier, the unit setting is metre. When you look at Pixel Size in a moment, you will see a number but no unit. It is in this unit (meters) that the pixel sizes are measured.
Cells are given row and column numbers, but are also given geographic coordinate values. The origin setting tells what the geographic coordinate is of the cell at row 0, column 0. Here, the value of origin is in the same projection and units as the projection for the whole image. The east/west coordinate -2,600,000 is 2,600,000 meters west of the central meridian. The north/south coordinate is 10,500,000 meters north of the equator.
Origin = (-2600000.000000,10500000.000000)
Pixel Size = (1000.00000000,-1000.00000000)
Cells are also called pixels, and each of them has a defined size. In this example the pixels have a size of 1000 x 1000. The negative sign on the second value is a convention indicating that the north/south coordinate decreases as row numbers increase down the image; it can be ignored for now. In most cases, your pixels will be square, though it is possible to have rasters with nonsquare pixels. The unit of these pixel sizes is meters, as defined earlier in the projection for the image. That means each pixel is 1,000 meters wide and 1,000 meters high.
Much like the origin setting, the corner coordinates tell you the geographic coordinates of the corners and center of the image:
Corner Coordinates:
Upper Left (-2600000.000,10500000.000) (177d17'32.31"W, 66d54'22.82"N)
Lower Left (-2600000.000, 5700000.000) (122d54'49.00"W, 36d12'53.87"N)
Upper Right ( 3100000.000,10500000.000) ( 9d58'39.57"W, 62d25'50.45"N)
Lower Right ( 3100000.000, 5700000.000) ( 62d32'49.65"W, 34d18'5.61"N)
Center ( 250000.000, 8100000.000) ( 89d56'43.00"W, 62d46'47.18"N)
Notice that the coordinates are first given in their projected values, but also given in their unprojected geographic coordinates, longitude and latitude. Knowing this will help you determine where on the earth your image is located.
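The projected corner values follow directly from the size, origin, and pixel size reported earlier. The sketch below reproduces them in Python, assuming a north-up image with no rotation; the function pixel_to_coord is a hypothetical helper written for this illustration, not part of GDAL.

```python
# Values copied from the gdalinfo report.
width, height = 5700, 4800                   # Size is 5700, 4800 (columns, rows)
origin_x, origin_y = -2600000.0, 10500000.0  # Origin
pixel_w, pixel_h = 1000.0, -1000.0           # Pixel Size


def pixel_to_coord(col, row):
    """Hypothetical helper: projected coordinate of the upper-left
    corner of the cell at (row, col), assuming no rotation."""
    return (origin_x + col * pixel_w, origin_y + row * pixel_h)


corners = {
    "Upper Left": pixel_to_coord(0, 0),
    "Lower Left": pixel_to_coord(0, height),
    "Upper Right": pixel_to_coord(width, 0),
    "Lower Right": pixel_to_coord(width, height),
    "Center": pixel_to_coord(width / 2, height / 2),
}

for label, (x, y) in corners.items():
    print(f"{label:12s} ({x:.3f}, {y:.3f})")
```

Because the second pixel size is negative, moving down 4,800 rows subtracts 4,800,000 meters from the northing, which is how the lower corners end up at 5,700,000.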
Images are made up of different bands of data. In some cases, you can have a dozen different bands, where each band stores values about a specific wavelength of light that a sensor photographed. In this case, there is only one band, Band 1. The ColorInterp=Gray setting tells you that it is a grayscale image, and Type=Byte tells you that it is an 8-bit (8 bits=1 byte) image. Because 8 bits of data can hold 256 different values, this image could have 256 different shades of gray.
Band 1 Block=5700x1 Type=Byte, ColorInterp=Gray
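The 256 figure is simply two raised to the number of bits. As a quick check in Python (the variable names are chosen only for this illustration):

```python
bits = 8                  # Type=Byte means 8 bits per pixel
n_values = 2 ** bits      # number of distinct values a pixel can hold
lowest, highest = 0, n_values - 1

print(n_values, lowest, highest)
```

The smallest and largest values an 8-bit pixel can hold are therefore 0 and 255.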
If you add the -mm parameter to the gdalinfo command, as shown in Example 6-9, you get a summary of the minimum and maximum color values for the bands in the image.
> gdalinfo canada_mosaic_lcc_1000m.tif -mm
...
Band 1 Block=5700x1 Type=Byte, ColorInterp=Gray
Computed Min/Max=0.000,255.000
This shows that the values in this image span the full 8-bit range, from 0 (the minimum) to 255 (the maximum), which covers 256 possible values.