Spatial simulation, just like spatial prediction, requires a model of spatial dependence, usually in terms of the covariance $C_z(h)$. For a given set of spatial data $Z(s_i),\ i = 1, \ldots, n$, the covariance structure (both the form and parameter values) can be found by the VARIOGRAM procedure. This example uses the coal seam thickness data that is also used in the 'Getting Started' section of Chapter 80, 'The VARIOGRAM Procedure.'
In this example, the data consist of coal seam thickness measurements (in feet) taken over an approximately square area. The coordinates are offsets from a point in the southwest corner of the measurement area, with the north and east distances in units of thousands of feet.
It is instructive to see the locations of the measured points in the area where you want to perform spatial simulations. It is generally desirable to have these locations scattered evenly around the simulation area.
First, the data are input and the sample locations plotted.
   data thick;
      input east north thick @@;
      datalines;
    0.7  59.6  34.1   2.1  82.7  42.2   4.7  75.1  39.5   4.8  52.8  34.3   5.9  67.1  37.0
    6.0  35.7  35.9   6.4  33.7  36.4   7.0  46.7  34.6   8.2  40.1  35.4  13.3   0.6  44.7
   13.3  68.2  37.8  13.4  31.3  37.8  17.8   6.9  43.9  20.1  66.3  37.7  22.7  87.6  42.8
   23.0  93.9  43.6  24.3  73.0  39.3  24.8  15.1  42.3  24.8  26.3  39.7  26.4  58.0  36.9
   26.9  65.0  37.8  27.7  83.3  41.8  27.9  90.8  43.3  29.1  47.9  36.7  29.5  89.4  43.0
   30.1   6.1  43.6  30.8  12.1  42.8  32.7  40.2  37.5  34.8   8.1  43.3  35.3  32.0  38.8
   37.0  70.3  39.2  38.2  77.9  40.7  38.9  23.3  40.5  39.4  82.5  41.4  43.0   4.7  43.3
   43.7   7.6  43.1  46.4  84.1  41.5  46.7  10.6  42.6  49.9  22.1  40.7  51.0  88.8  42.0
   52.8  68.9  39.3  52.9  32.7  39.2  55.5  92.9  42.2  56.0   1.6  42.7  60.6  75.2  40.1
   62.1  26.6  40.1  63.0  12.7  41.8  69.0  75.6  40.1  70.5  83.7  40.9  70.9  11.0  41.7
   71.5  29.5  39.8  78.1  45.5  38.7  78.2   9.1  41.7  78.4  20.0  40.8  80.5  55.9  38.7
   81.1  51.0  38.6  83.8   7.9  41.6  84.5  11.0  41.5  85.2  67.3  39.4  85.5  73.0  39.8
   86.7  70.4  39.6  87.2  55.7  38.8  88.1   0.0  41.6  88.4  12.1  41.3  88.4  99.6  41.2
   88.8  82.9  40.5  88.9   6.2  41.5  90.6   7.0  41.5  90.7  49.6  38.9  91.5  55.4  39.0
   92.9  46.8  39.1  93.4  70.9  39.7  94.8  71.5  39.7  96.2  84.3  40.3  98.2  58.2  39.5
   ;

   proc gplot data=thick;
      title 'Locations of Measured Samples';
      plot north*east / frame cframe=ligr haxis=axis1 vaxis=axis2;
      symbol1 v=dot color=blue;
      axis1 minor=none;
      axis2 minor=none label=(angle=90 rotate=0);
      label east  = 'East'
            north = 'North';
   run;
   proc g3d data=thick;
      title 'Surface Plot of Coal Seam Thickness';
      scatter east*north=thick / xticknum=5 yticknum=5
                                 grid zmin=20 zmax=65;
      label east  = 'East'
            north = 'North'
            thick = 'Thickness';
   run;
Figure 65.2 shows the small-scale variation typical of spatial data, but there does not appear to be any surface trend. Hence, you can work with the original thickness data rather than residuals from a trend surface fit. In fact, a reasonable approximation of the spatial process generating the coal seam data is given by

$$Z(s) = \mu + \varepsilon(s)$$

where $\varepsilon(s)$ is a Gaussian SRF with Gaussian covariance structure

$$C_z(h) = c_0 \exp\left( -\frac{|h|^2}{a_0^2} \right)$$

Note that the term 'Gaussian' is used in two ways in this description. For a set of locations $s_1, s_2, \ldots, s_n$, the random vector

$$Z(s) = \left[ Z(s_1), Z(s_2), \ldots, Z(s_n) \right]^T$$

has a multivariate Gaussian or normal distribution $N_n(\mu, \Sigma)$. The $(i, j)$th element of $\Sigma$ is computed by $C_z(s_i - s_j)$, which happens to be a Gaussian functional form. Any functional form for $C_z(h)$ yielding a valid covariance matrix can be used. Both the functional form of $C_z(h)$ and the parameter values

$$\mu = 40.14, \qquad c_0 = 7.5, \qquad a_0 = 30.0$$

are visually estimated using PROC VARIOGRAM, a DATA step, and the GPLOT procedure. Refer to the 'Getting Started' section beginning on page 4852 in the chapter on the VARIOGRAM procedure for details on how these parameter values are obtained.
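To get a feel for what these parameter values imply, the covariance can be tabulated at a few separation distances. The following DATA step is a minimal sketch, not part of the original example. It assumes the Gaussian form $c_0 \exp(-h^2/a_0^2)$ with scale 7.5 and range 30, and the data set name covcheck and the variables c0, a0, h, cov, and corr are hypothetical names chosen for illustration.

```sas
/* Tabulate the assumed Gaussian covariance c0*exp(-(h/a0)**2)   */
/* at several lag distances h (same units as the coordinates).   */
data covcheck;
   c0 = 7.5;     /* scale (sill), assumed from the text  */
   a0 = 30.0;    /* range parameter, assumed from the text */
   do h = 0 to 90 by 15;
      cov  = c0 * exp(-(h/a0)**2);   /* covariance at lag h  */
      corr = cov / c0;               /* correlation at lag h */
      output;
   end;
run;

proc print data=covcheck noobs;
   var h cov corr;
   title 'Assumed Gaussian Covariance at Selected Lags';
run;
```

Under these assumptions, the covariance equals the scale 7.5 at lag 0 and falls to $7.5\,e^{-1} \approx 2.76$ at a lag equal to the range parameter, so points separated by much more than 30 units are only weakly correlated.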
The choice of a Gaussian functional form for $C_z(h)$ is simply based on the data, and it is not at all crucial to the simulation. However, it is crucial to the simulation method used in PROC SIM2D that $Z(s)$ be a Gaussian SRF. For details, see the section 'Computational and Theoretical Details of Spatial Simulation' beginning on page 4106.
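One way to see why the Gaussian assumption matters is to recall a standard textbook construction of a multivariate normal vector with a prescribed covariance matrix. The sketch below is illustrative only and is not claimed to be the exact algorithm in PROC SIM2D; the symbols $L$ and $W$ are introduced here and do not appear in the original text.

```latex
% Factor the covariance matrix and transform independent standard normals.
\Sigma = L L^{T}, \qquad W \sim N_n(0, I_n), \qquad Z = \mu + L W

% The transformed vector has the required first two moments.
E[Z] = \mu, \qquad
\operatorname{Var}(Z) = L \operatorname{Var}(W) L^{T} = L L^{T} = \Sigma
```

Because a linear transformation of a normal vector is again normal, $Z \sim N_n(\mu, \Sigma)$ exactly. For a non-Gaussian random field, matching the mean and covariance in this way would not reproduce the full distribution, which is why the functional form of the covariance is flexible but the Gaussian SRF assumption is not.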
The variability of $Z(s)$, modeled by

$$Z(s) = \mu + \varepsilon(s)$$

with the Gaussian covariance structure $C_z(h)$ found previously, is not obvious from the covariance model form and parameters. The variation around the mean of the surface is relatively small, making it difficult to pick up differences visually in surface plots of simulated realizations. Instead, you investigate variations at selected grid points.
To do this investigation, this example uses PROC SIM2D and specifies the Gaussian model with the parameters found previously. Five thousand simulations (iterations) are performed at two points: the extreme southwest point of the region and a point toward the northeast corner of the region. Because these points do not form a regular grid, a GDATA= data set is produced with the coordinates of the selected points.
Summary statistics are computed for each of these grid points by using a BY statement in PROC UNIVARIATE.
   data grid;
      input xc yc;
      datalines;
    0  0
   75 75
   ;
   run;

   proc sim2d data=thick outsim=sim1;
      simulate var=thick numreal=5000 seed=79931
               scale=7.5 range=30.0 form=gauss;
      mean 40.14;
      coordinates xc=east yc=north;
      grid gdata=grid xc=xc yc=yc;
   run;

   proc sort data=sim1;
      by gxc gyc;
   run;

   proc univariate data=sim1;
      var svalue;
      by gxc gyc;
      title 'Simulation Statistics at Selected Grid Points';
   run;
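When only a few moments are needed, a shorter listing can be easier to compare across grid points. The following is a minimal sketch that swaps in PROC MEANS for PROC UNIVARIATE; it is not part of the original example, and it assumes that the sim1 data set has already been sorted by gxc and gyc as in the preceding step.

```sas
/* Compact per-point summary of the simulated values.           */
/* Assumes sim1 is already sorted by gxc gyc (PROC SORT above).  */
proc means data=sim1 n mean std min max maxdec=4;
   var svalue;      /* simulated value at the grid point */
   by gxc gyc;      /* one summary row per grid point    */
   title 'Compact Simulation Summary at Selected Grid Points';
run;
```

This prints one short table for each grid point, which makes it simple to set the mean and standard deviation at the two locations side by side. The full PROC UNIVARIATE output remains the better choice when quantiles and extreme observations are of interest.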
                 Simulation Statistics at Selected Grid Points

------ X-coordinate of the grid point=0 Y-coordinate of the grid point=0 -------

                          The UNIVARIATE Procedure
             Variable:  SVALUE  (Simulated Value at Grid Point)

                                  Moments

      N                        5000    Sum Weights               5000
      Mean               40.1387121    Sum Observations    200693.561
      Std Deviation      0.54603592    Variance            0.29815523
      Skewness            0.0217334    Kurtosis             0.0519914
      Uncorrected SS     8057071.54    Corrected SS          1490.478
      Coeff Variation    1.36037231    Std Error Mean      0.00772211

                         Basic Statistical Measures

               Location                    Variability

           Mean     40.13871     Std Deviation            0.54604
           Median   40.14620     Variance                 0.29816
           Mode      .           Range                    3.81973
                                 Interquartile Range      0.76236
                 Simulation Statistics at Selected Grid Points

------ X-coordinate of the grid point=0 Y-coordinate of the grid point=0 -------

                          The UNIVARIATE Procedure
             Variable:  SVALUE  (Simulated Value at Grid Point)

                         Tests for Location: Mu0=0

              Test           -Statistic-    -----p Value------

              Student's t    t  5197.892    Pr > |t|    <.0001
              Sign           M      2500    Pr >= |M|   <.0001
              Signed Rank    S   6251250    Pr >= |S|   <.0001

                          Quantiles (Definition 5)

                           Quantile      Estimate

                           100% Max       41.9369
                           99%            41.4002
                           95%            41.0273
                           90%            40.8334
                           75% Q3         40.5168
                           50% Median     40.1462
                           25% Q1         39.7544
                           10%            39.4509
                           5%             39.2384
                           1%             38.8656
                           0% Min         38.1172

                            Extreme Observations

                   ------Lowest-----        -----Highest-----

                     Value      Obs           Value      Obs

                   38.1172     2691         41.8085     1149
                   38.2959     1817         41.8251     3612
                   38.3370     3026         41.8446     3757
                   38.3834     2275         41.9338      135
                   38.4198     3100         41.9369     4536
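Several of the reported statistics for the grid point (0, 0) are tied together by simple relationships, and checking them by hand is a quick way to read the listing. The worked arithmetic below uses only the values printed above; the symbols $\bar{z}$, $s$, and $N$ (sample mean, sample standard deviation, and number of realizations) are introduced here for illustration.

```latex
% Standard error of the mean from the sample standard deviation
\frac{s}{\sqrt{N}} = \frac{0.54603592}{\sqrt{5000}} \approx 0.00772

% Student's t statistic for the test of Mu0 = 0
t = \frac{\bar{z}}{s/\sqrt{N}} = \frac{40.1387121}{0.00772211} \approx 5197.9

% Coefficient of variation, in percent
100 \cdot \frac{s}{\bar{z}} = 100 \cdot \frac{0.54603592}{40.1387121} \approx 1.360

% Interquartile range and central 90% interval from the quantiles
Q_3 - Q_1 = 40.5168 - 39.7544 = 0.7624, \qquad [\,39.2384,\ 41.0273\,]
```

The very large t statistic only confirms that the mean thickness is not zero, so it carries little information here. The spread is more informative: about 90% of the simulated values at this point lie between 39.24 and 41.03 feet. The sample variance of about 0.298 is also far below the scale value of 7.5 used in the covariance model, which is consistent with the simulated values at this grid point being conditioned on the nearby measured data.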