Concepts


About the Input Data Set

The input data set must contain at least three numeric variables:

  • two horizontal variables, (x, y)

  • one or more vertical variables, z through z-n, each of which is interpolated or smoothed as if it were a function of the two horizontal variables.

The procedure can process multiple vertical variables for each pair of horizontal variables that you specify. If you specify more than one vertical variable, the G3GRID procedure performs a separate analysis and produces interpolated or smoothed values for each vertical variable. If more than one observation in the input data set has the same values for both horizontal variables, x and y, a warning message is printed, and only the first such point is used in the interpolation.
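The duplicate-point rule can be illustrated outside SAS. The following Python sketch is not G3GRID code; the function name keep_first_per_xy and the tuple layout (x, y, z) are assumptions made for illustration. It keeps the first observation for each (x, y) pair and counts the later ones that would be ignored.

```python
def keep_first_per_xy(rows):
    """Keep only the first (x, y, z) row for each (x, y) pair.

    Hypothetical helper: mirrors the documented rule that only the first
    point with a given (x, y) is used. Returns the kept rows and the
    number of later duplicates that were skipped.
    """
    seen = set()
    kept = []
    dropped = 0
    for x, y, z in rows:
        key = (x, y)
        if key in seen:
            dropped += 1  # a later observation at the same (x, y) is ignored
            continue
        seen.add(key)
        kept.append((x, y, z))
    return kept, dropped


rows = [(0.0, 0.0, 1.0), (1.0, 0.0, 2.0), (0.0, 0.0, 9.0)]
kept, dropped = keep_first_per_xy(rows)
```

Running a check like this before calling the procedure shows in advance which observations would trigger the warning.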

By default, the interpolation is performed after both variables are similarly scaled because the interpolation methods assume that the scales of x and y are comparable.
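The text does not say which scaling the procedure applies. As an illustration only, the Python sketch below assumes a simple min-max rescaling of each horizontal variable to the interval [0, 1]; the function name rescale is hypothetical. It shows why comparable scales matter: after rescaling, distances are no longer dominated by the variable with the larger range.

```python
def rescale(values):
    """Min-max rescale a list of numbers to [0, 1].

    Assumption: this is one common way to make two variables comparable;
    it is not claimed to be the scaling that G3GRID uses internally.
    """
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0:
        return [0.0 for _ in values]  # a constant variable carries no spread
    return [(v - lo) / span for v in values]


x = [0.0, 500.0, 1000.0]   # x measured in large units
y = [2.0, 3.0, 4.0]        # y measured in small units
xs, ys = rescale(x), rescale(y)
```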

Multiple Vertical Variables

In the GRID statement, you can name multiple vertical variables (z through z-n) and produce a data set that contains two horizontal variables and multiple vertical variables. You can use the resulting data set to produce plots of the relationships of the two horizontal variables to different vertical variables.

Horizontal Variables Along a Nonlinear Curve

If the points that are generated by the horizontal variables tend to lie along a curve, a poor interpolation or spline may result. In such cases, the vertical variable(s) and one of the horizontal variables should be modeled as a function of the remaining horizontal variable. You can use a scatter plot of the two horizontal variables to help determine the appropriate function.

If the horizontal variable points are collinear, the procedure interpolates the function as constant along lines perpendicular to the line in the plane that is generated by the input data points.
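Before running the procedure, it can help to check whether the horizontal points are collinear. The Python sketch below is one hedged way to do that with cross products; the function name are_collinear and the absolute tolerance tol are assumptions, and the tolerance may need adjusting to the scale of the data.

```python
def are_collinear(points, tol=1e-12):
    """Return True if all (x, y) points lie on one straight line.

    Hypothetical helper: uses the cross product of each point's offset
    with a reference direction; a zero cross product means the point is
    on the line through the first two distinct points.
    """
    if len(points) < 3:
        return True
    x0, y0 = points[0]
    direction = None
    for x, y in points[1:]:
        if (x, y) != (x0, y0):
            direction = (x - x0, y - y0)  # first point distinct from the origin point
            break
    if direction is None:
        return True  # every point coincides with the first one
    dx, dy = direction
    return all(abs(dx * (y - y0) - dy * (x - x0)) <= tol for x, y in points)


on_a_line = are_collinear([(0, 0), (1, 1), (2, 2)])
off_the_line = are_collinear([(0, 0), (1, 1), (2, 3)])
```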

About the Output Data Set

The output data set contains the two horizontal variables, the interpolated or smoothed vertical variables, and the BY variables, if any. If the GRID statement's SMOOTH= option is used, the output data set also contains a variable named _SMTH_, with a value equal to that of the smoothing parameter.

You can control both the number of x and y values in the output data set and the values themselves. In addition, you can specify an interpolation method.
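The shape of such an output grid can be sketched in Python. This is an illustration, not the procedure's own logic: the helper axis_values and the choice of evenly spaced values between two endpoints are assumptions.

```python
def axis_values(lo, hi, n):
    """Return n evenly spaced values from lo to hi, inclusive.

    Hypothetical helper used only to illustrate an output grid.
    """
    if n < 2:
        raise ValueError("need at least two values per axis")
    step = (hi - lo) / (n - 1)
    return [lo + i * step for i in range(n)]


xs = axis_values(0.0, 10.0, 5)
ys = axis_values(-1.0, 1.0, 3)
# One output observation per (x, y) combination.
grid = [(x, y) for y in ys for x in xs]
```

Here the output would hold 5 × 3 = 15 grid points, each of which receives an interpolated or smoothed value for every vertical variable.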

Interpolation Methods

The G3GRID procedure can use one of three interpolation methods: bivariate interpolation (the default), spline interpolation, and smoothed spline interpolation.

Default Bivariate Interpolation

Unless you specify the SPLINE option, the G3GRID procedure is an interpolation procedure. That is, it calculates z values for (x, y) points that are missing from the input grid. The surface that is formed by the interpolated data passes precisely through the data points in the input data set.

This default method of interpolation works best for fairly smooth functions with values given at uniformly distributed points in the plane. If the data points in the input data set are erratic, the default interpolated surface can be erratic.

This default method is a modification of that described by Akima (1978). This method consists of

  1. dividing the plane into nonoverlapping triangles whose vertices are the positions of the available points

  2. fitting a bivariate fifth-degree polynomial within each triangle

  3. calculating the interpolated values by evaluating the polynomial at each grid point that falls in the triangle.
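Step 3 depends on knowing which triangle a grid point falls in. The source does not say how the procedure performs this test; the Python sketch below shows one standard approach, barycentric coordinates, with hypothetical function names barycentric and in_triangle.

```python
def barycentric(p, a, b, c):
    """Return the barycentric weights of point p in triangle (a, b, c).

    The weights sum to 1 and satisfy p = w_a*a + w_b*b + w_c*c.
    """
    (px, py), (ax, ay), (bx, by), (cx, cy) = p, a, b, c
    det = (by - cy) * (ax - cx) + (cx - bx) * (ay - cy)
    if det == 0:
        raise ValueError("degenerate triangle (vertices are collinear)")
    w_a = ((by - cy) * (px - cx) + (cx - bx) * (py - cy)) / det
    w_b = ((cy - ay) * (px - cx) + (ax - cx) * (py - cy)) / det
    return w_a, w_b, 1.0 - w_a - w_b


def in_triangle(p, a, b, c, eps=1e-12):
    """True if p lies inside the triangle or on its boundary."""
    return all(w >= -eps for w in barycentric(p, a, b, c))


tri = ((0.0, 0.0), (1.0, 0.0), (0.0, 1.0))
weights = barycentric((0.25, 0.25), *tri)
inside = in_triangle((0.25, 0.25), *tri)
outside = in_triangle((1.0, 1.0), *tri)
```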

The coefficients for the polynomial are computed based on

  • the values of the function at the vertices of the triangle

  • the estimated values for the first and second derivatives of the function at the vertices.

The estimates of the first and second derivatives are computed using the n nearest neighbors of the point, where n is the number specified in the GRID statement's NEAR= option. A Delaunay triangulation (Ripley 1981, p. 38) is used for the default method. The coordinates of the triangles are available in an output data set if requested by the OUTTRI= option in the PROC G3GRID statement.
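The neighbor selection can be illustrated in Python. This sketch assumes Euclidean distance in the (x, y) plane and excludes the point itself from its own neighbor list; both choices, and the function name nearest_neighbors, are assumptions rather than documented behavior of the NEAR= option.

```python
import math


def nearest_neighbors(points, index, n):
    """Return the indices of the n points closest to points[index].

    Hypothetical helper: distances are Euclidean, the point itself is
    excluded, and ties are broken by the lower index.
    """
    px, py = points[index]
    others = [
        (math.hypot(x - px, y - py), i)
        for i, (x, y) in enumerate(points)
        if i != index
    ]
    others.sort()
    return [i for _, i in others[:n]]


points = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (5.0, 5.0)]
neighbors = nearest_neighbors(points, 0, 2)
```

A larger neighbor count draws the derivative estimates from a wider region around each point.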

Spline Interpolation

If you specify the SPLINE option, a method is used that produces an interpolation or smoothing that is optimally smooth in a certain sense (Harder and Desmarais 1972, Meinguet 1979). The surface that is generated can be thought of as one that would be formed if a stiff, thin metal plate were forced through or near the given data points. For large data sets, this method is substantially more expensive than the default method.

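The function u, formed when you specify the SPLINE option, is determined by letting

$$t_j = (x_j, y_j), \qquad t = (x, y)$$

and

$$|t - t_j| = \left( (x - x_j)^2 + (y - y_j)^2 \right)^{1/2}$$

The function is then

$$u(x, y) = \sum_{j=1}^{n} c_j \, E(t, t_j) + d_1 + d_2 x + d_3 y$$

where

$$E(s, t) = |s - t|^2 \log(|s - t|)$$

The coefficients $c_1, c_2, \ldots, c_n$ and $d_1, d_2, d_3$ of this function are determined by these equations:

$$(E + n \lambda I)\, c + T d = z$$

and

$$T' c = 0$$

where

  • $E$ is the $n \times n$ matrix $E(t_i, t_j)$

  • $I$ is the $n \times n$ identity matrix

  • $\lambda$ is the smoothing parameter that is specified in the SMOOTH= option

  • $c$ is $(c_1, \ldots, c_n)$

  • $z$ is $(z_1, \ldots, z_n)$

  • $d$ is $(d_1, d_2, d_3)$

  • $T$ is the $n \times 3$ matrix whose $i$th row is $(1, x_i, y_i)$.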

See Wahba (1979) for more detail.
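For readers who want to see the linear algebra at work, the Python sketch below fits a spline of this kind with the standard library only. It assumes the usual thin-plate spline system (E + nλI)c + Td = z with T'c = 0 and the kernel E(s, t) = |s - t|² log|s - t|; the function names kernel, solve, fit_spline, and evaluate are hypothetical, and the code illustrates the mathematics rather than the procedure's actual implementation.

```python
import math


def kernel(s, t):
    """E(s, t) = |s - t|^2 log|s - t|, taken as 0 when the points coincide."""
    r = math.hypot(s[0] - t[0], s[1] - t[1])
    return 0.0 if r == 0.0 else r * r * math.log(r)


def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system (are the points collinear?)")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= f * m[col][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(m[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def fit_spline(pts, z, lam=0.0):
    """Return (c, d) solving (E + n*lam*I)c + Td = z together with T'c = 0."""
    n = len(pts)
    size = n + 3
    a = [[0.0] * size for _ in range(size)]
    for i in range(n):
        for j in range(n):
            a[i][j] = kernel(pts[i], pts[j]) + (n * lam if i == j else 0.0)
        t_row = (1.0, pts[i][0], pts[i][1])
        for k in range(3):
            a[i][n + k] = t_row[k]  # T block
            a[n + k][i] = t_row[k]  # T' block
    sol = solve(a, list(z) + [0.0, 0.0, 0.0])
    return sol[:n], sol[n:]


def evaluate(pts, c, d, x, y):
    """u(x, y) = sum_j c_j E(t, t_j) + d_1 + d_2 x + d_3 y."""
    bumps = sum(cj * kernel((x, y), tj) for cj, tj in zip(c, pts))
    return bumps + d[0] + d[1] * x + d[2] * y


pts = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (0.5, 0.5)]
z = [0.0, 1.0, 1.0, 2.0, 3.0]
c, d = fit_spline(pts, z)               # lam = 0: pure interpolation
c_s, d_s = fit_spline(pts, z, lam=0.1)  # lam > 0: smoothed surface
```

With lam=0 the fitted surface reproduces the input z values at the data points. With lam greater than 0 it generally no longer passes through them, which is the trade-off that the smoothing parameter controls.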

Spline Smoothing

To produce a smoothed spline, you can use the GRID statement's SMOOTH= option with the SPLINE option. The value or values specified in the SMOOTH= option are substituted for the smoothing parameter λ in the equations that are described in Spline Interpolation. A smoothed spline trades closeness to the original data points for smoothness. To find a value that produces the best balance between smoothness and fit to the original data, you can try several values for the SMOOTH= option.




SAS/GRAPH 9.1 Reference, Volumes I and II
Year: 2004
Pages: 342
