JB Industries
March 21st, 2014
Geostatistics Major Assignment
Geostatistical Analysis of Student Collected Spatial Data
John Bull
Geostatistics Major Assignment March 19, 2014
JB Industries | 3 Jasmin Crescent, St Catharines, ON, L2T 2B9
Executive Summary
The Geostatistical Analysis of Student Collected Spatial Data project was a major project that
included data collection, preprocessing, and surface creation. The project study area was
defined as the city of St. Catharines. Data used for the project were obtained online, primarily
from government ministries, and included water well data with elevation values used as the
z-value. Using these elevation values, two interpolated surfaces were created: an Inverse
Distance Weighted (IDW) surface and an Empirical Bayesian Kriging (EBK) surface.
The IDW interpolator was used to create a smooth, gradual surface from the elevation points.
The search neighbourhood was adjusted to account for the high variation along the escarpment,
and smoothing was used to reduce the influence of outliers.
The IDW result came out as desired, with a slow, gradual elevation change across the study area,
apart from a few outliers that created small depressions in high-elevation areas.
The EBK interpolator was used to create a more exact surface that accounts for more of the
elevation changes in the study area. The EBK used spatial autocorrelation to weight the
measured points around each prediction location, which in turn produced a slightly more
accurate surface. The EBK result captured more of the elevation changes in the area, as
exemplified by the 12 Mile Creek area.
Data coverage within the study area posed slight problems, with two major data gaps and several
small clusters of data points. These areas slightly affected the accuracy of both interpolation
techniques, and both surfaces would be improved if the gaps were filled in with
supplemental elevation measurements.
In all, both interpolation techniques performed the job they were set out to do. The
added geostatistical power of a kriging technique allows a stricter surface to be derived
than with IDW, owing to the interactive modelling and additional parameters. Depending on the
final use of the surface, however, one interpolation technique may not necessarily be better
than the other.
Table of Contents
1. Introduction
1.1. Project Goals
1.2. Study Area
2. Data
2.1. Statistics
2.2. Trends
2.3. Preprocessing
3. Methodology
3.1. Data Transformations and Trend Removal
3.2. Inverse Distance Weighted
3.3. Kriging
4. Analysis
4.1. Interpolation Techniques
4.2. Data Coverage
4.3. Accuracy of Results
5. Future Recommendations
6. Conclusion
Bibliography
Figures and Tables
Figure 1: Outline of study area
Figure 2: Locations of water wells within study area
Figure 3: Summary statistics and distribution for water well data
Figure 4: Trend analysis of water well elevation measurements
Figure 5: Example of declustering
Figure 6: Distribution of water wells data before transformation
Figure 7: Water well data distribution after transformation
Figure 8: Cross-section of study area
Figure 9: Comparison of IDW model to digital elevation model
Figure 10: Method report for IDW interpolation
Figure 11: Final IDW map output
Figure 12: Method report for EBK interpolator
Figure 13: Empirical Bayesian Kriging output
Figure 14: Problems with IDW result
Figure 15: Issues with EBK output
Figure 16: Differences between EBK and IDW
Figure 17: Study area data gaps
Figure 18: IDW overlaid onto satellite imagery
Figure 19: EBK overlaid on satellite imagery
Table 1: Descriptive statistics for water well dataset
1. Introduction
The following report details the work that was undertaken to complete the project
Geostatistical Analysis of Student Collected Spatial Data. The report begins by detailing the
goals of the project, and then explains the methodology used for each
analysis technique. Lastly, analytic commentary describes the differences between
the analysis techniques, the overall quality of the data coverage, and the accuracy of
the results when compared to real life.
1.1. Project Goals
The main purpose and goals of the ‘Geostatistical Analysis’ project are listed below, as per the
Terms of Reference (Smith, 2014):
• Derive a working ability to report upon the collection of geospatial data and to describe
  the data both geostatistically and practically.
• Predict geospatial coverage in areas not directly measured or observed, through
  interpolation.
The following report outlines the data, methodology, and analysis techniques used to achieve
these goals, and the report concludes with an assessment of the two interpolation techniques as
well as the data collection and coverage within the study area.
1.2. Study Area
The area of interest for the Geostatistical Analysis project is the city of St. Catharines, in the
Niagara Region. The study area limits were delineated using ESRI’s municipal polygons layer and
can be seen in Figure 1.
Figure 1: Outline of study area
2. Data
Data for the project were gathered entirely from online sources between January 17th and
January 19th of 2014.
The water well dataset is the primary dataset used in this project, and was
obtained from Ontario’s Ministry of the Environment. This dataset includes easting, northing,
and elevation measurements, which are used to interpolate the two surfaces. The
municipal boundaries, used to derive the study area, were obtained online from Ontario Basic
Mapping; however, the data are the property of the Government of Ontario.
The obtained data points, clipped to the study area, can be seen in Figure 2.
Figure 2: Locations of water wells within study area
2.1. Statistics
The water well dataset contains 573 data points within the area of interest, each with its own
easting, northing, and elevation value. Descriptive statistics for these data points can be seen in
Table 1.
Descriptive Statistics for Well Data

                 Minimum        Maximum        Mean           Median         Standard Deviation
Easting          637,216.00     647,235.00     642,008.00     642,146.00     2,912.00
Northing         4,774,403.00   4,788,203.00   4,780,876.00   4,780,903.00   2,561.00
Elevation (m)    75.08          177.63         100.81         99.84          13.92

Table 1: Descriptive statistics for water well dataset
The range of elevation values (102.55 m) immediately shows that the study area contains some fairly
large elevation changes. The mean and median are close together, at around 100 m, which
suggests that the elevation change is fairly steep (otherwise the mean would fall closer to the middle
of the range) and that the data are fairly close to normally distributed with a positive skew (because
the mean lies closer to the minimum than to the maximum).
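The arithmetic behind this reading of Table 1 can be checked with a short sketch. The values are copied from the elevation row of Table 1; the variable names are illustrative only.

```python
# Elevation statistics copied from Table 1 (metres).
minimum, maximum = 75.08, 177.63
mean, median = 100.81, 99.84

# Spread of the data and the centre of that spread.
elevation_range = round(maximum - minimum, 2)
midrange = round((maximum + minimum) / 2, 3)

# How far the mean sits below the middle of the range.
mean_offset = round(midrange - mean, 3)

print(f"range    = {elevation_range} m")
print(f"midrange = {midrange} m")
print(f"mean lies {mean_offset} m below the midrange")
```

A mean well below the midrange is consistent with a long tail of high values, which is what the positive skew describes.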
Further investigation shows that the data are farther from a normal distribution than
first assumed. Skewness and kurtosis are two statistical measures that can be used to describe
the distribution of a dataset; a skewness value of zero and a kurtosis value of 3 are indicative of
a normal distribution (Babish, 2006). Figure 3 shows that the water well elevations have a
skewness value of 2.31 and a kurtosis value of 9.75.
Figure 3: Summary statistics and distribution for water well data
The skewness value of 2.31 indicates that the dataset is positively skewed, as previously
assumed, with a large cluster of values around the mean and a long tail into the higher values
(Babish, 2006). The kurtosis value of 9.75 indicates that the distribution is leptokurtic, and
therefore has a high peak with thicker-than-normal tails (Babish, 2006). The lack of normality in the
dataset can potentially be addressed before analysis by performing data transformations.
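As an illustration of how these two measures are calculated, the following minimal sketch computes skewness and kurtosis from population moments. The sample values are hypothetical, not the project data, and statistical packages often apply bias corrections, so their reported values can differ slightly from this plain form.

```python
def skewness_kurtosis(values):
    """Return (skewness, kurtosis) from population moments.

    Kurtosis is the plain fourth standardized moment, so a normal
    distribution scores about 3 (the "excess" form would subtract 3).
    """
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # variance
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    m4 = sum((v - mean) ** 4 for v in values) / n  # fourth central moment
    return m3 / m2 ** 1.5, m4 / m2 ** 2


# Hypothetical elevations (m): most values near 100 m, a few high ones
# standing in for wells on top of the escarpment.
sample = [92, 95, 97, 98, 99, 100, 100, 101, 103, 105, 150, 175]
skew, kurt = skewness_kurtosis(sample)
print(f"skewness = {skew:.2f}, kurtosis = {kurt:.2f}")
```

The few high values pull the skewness above zero, in the same way that the escarpment wells do in the real dataset.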
[Figure 3 content: Summary for Elevation (histogram with 95% confidence intervals)]
Anderson-Darling Normality Test: A-Squared 19.05, P-Value < 0.005
Mean 100.81, StDev 13.92, Variance 193.84, Skewness 2.30888, Kurtosis 9.74837, N 573
Minimum 75.08, 1st Quartile 93.70, Median 99.84, 3rd Quartile 105.30, Maximum 177.63
95% Confidence Interval for Mean: 99.66 to 101.95
95% Confidence Interval for Median: 98.42 to 100.15
95% Confidence Interval for StDev: 13.16 to 14.78
2.2. Trends
The unique geomorphology of the Niagara Region, and in particular the study area, imposes
obvious trends on an elevation dataset. The presence of the Niagara Escarpment creates an
area where large elevation changes occur over short distances. Aside from this, the area
has a gradual increasing trend in the north-south direction; elevation values gradually
increase as one moves south from Lake Ontario until the escarpment is reached, where elevation
values rise sharply (Figure 4).
Figure 4: Trend analysis of water well elevation measurements
The green line in Figure 4 represents the projected trends in the data for the YZ plane, and the
blue line represents the projected trends for the XZ plane. The strong trend in the YZ plane can
be visualized within this figure, showing that the elevation is increasing as the northing values
decrease. The blue line also shows a slight trend in the XZ plane, with a slight elevation increase
in the centre of the study area. Trends in the data are sometimes removed
before performing Kriging or Cokriging, allowing the analysis to be performed more accurately
(ESRI, 2012).
2.3. Preprocessing
Preprocessing was performed on the water wells dataset to ensure accuracy in the
measurements as well as in the interpolated surface. Water well measurements were obtained
with a positional accuracy of ±500 m; elevation values were therefore checked against Google
Earth, allowing a small degree of leeway in position.
Water well measurements were also combed through to eliminate tight clusters of data points,
which has two benefits. Firstly, kriging is a processing-intensive
procedure, so reducing the overall number of data points to below 500 benefits both
processing time and the accuracy of the results. Secondly, clustered data introduce
redundancy and can affect the outcome of any interpolation.
An example of these clusters and points that have been removed can be seen in Figure 5.
Figure 5: Example of declustering
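The declustering in this project was carried out by inspecting the data. As a rough sketch of how the same idea could be automated, the following greedy thinning keeps a point only if it is far enough from every point already kept. The function name, the 50 m spacing, and the well coordinates are all assumptions made for illustration.

```python
from math import hypot


def thin_points(points, min_spacing):
    """Greedy thinning of (easting, northing, elevation) tuples.

    A point is kept only if it lies at least `min_spacing` away from
    every point that has already been kept.
    """
    kept = []
    for x, y, z in points:
        if all(hypot(x - kx, y - ky) >= min_spacing for kx, ky, _ in kept):
            kept.append((x, y, z))
    return kept


# Hypothetical wells: three tightly clustered, one isolated.
wells = [
    (642000, 4780000, 99.0),
    (642010, 4780005, 99.4),   # about 11 m from the first well
    (642020, 4779990, 98.7),   # about 22 m from the first well
    (643500, 4781200, 104.2),
]
declustered = thin_points(wells, min_spacing=50)
print(len(wells), "->", len(declustered), "points")
```

The result depends on the order of the input, so a sketch like this is a starting point for review rather than a replacement for checking which points are dropped.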
3. Methodology
The following section details the methodologies applied when creating the interpolated
surfaces, including all parameters used within the GIS analysis tools. Both interpolated surfaces
were created using ESRI’s Geostatistical Wizard.
3.1. Data Transformations and Trend Removal
The lack of normality within the water well dataset indicates that a transformation should be
attempted in order to obtain a more accurate result, particularly when using the kriging model
(Babish, 2006). Using the histogram tool in ESRI’s Geostatistical Analyst extension, two
transformations can be applied to the dataset; the displayed distribution updates
automatically, immediately showing the effect of the transformation. The original
distribution, before any transformation, can be seen in Figure 6.
Figure 6: Distribution of water wells data before transformation
The Box-Cox transformation is a useful method for alleviating heteroscedasticity, which is
essentially where some sub-populations in the data have different variabilities from others
(Babish, 2006; Wikipedia, 2014). In the water wells dataset, sub-populations in the lower
region of the study area, closest to Lake Ontario, should have similar variabilities; however,
sub-populations occurring on or around the Niagara Escarpment will have increased variability
due to the rapid elevation changes in the area. Babish (2006) also explains that
heteroscedasticity can be caused by non-normality of one of the variables, or by an indirect
relationship between variables; the water well dataset contains both.
The Box-Cox transformation, in turn, may help to make the variances more constant throughout
the study area and it will often make the data appear more normally distributed (ESRI, 2012).
Using the Box-Cox transformation, with a parameter of -2, the dataset appeared to become
more normally distributed, and thus more suitable for interpolation techniques (Figure 7).
Figure 7: Water well data distribution after transformation
It is evident in Figure 7 that the distribution appears far more normal than before
the transformation. The skewness and kurtosis statistics, mentioned earlier, also changed to
reflect a more normal distribution. A skewness value of 0.088 is far closer to the ideal
value of zero than the original 2.31, and a kurtosis value of 3.87 is also closer to the value of 3
that is indicative of a normal distribution.
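For reference, the standard one-parameter Box-Cox transform is (x^λ - 1) / λ for λ ≠ 0 and ln(x) for λ = 0. The sketch below applies it with the parameter of -2 used here; the function names are illustrative, and the exact form applied inside the Geostatistical Wizard is assumed rather than confirmed to match.

```python
from math import exp, log


def box_cox(value, lam):
    """Standard one-parameter Box-Cox transform of a positive value."""
    if value <= 0:
        raise ValueError("Box-Cox requires positive values")
    if lam == 0:
        return log(value)
    return (value ** lam - 1) / lam


def inverse_box_cox(transformed, lam):
    """Back-transform a Box-Cox value to the original units."""
    if lam == 0:
        return exp(transformed)
    return (lam * transformed + 1) ** (1 / lam)


# Minimum, median, and maximum elevations from Table 1 (metres).
for elevation in (75.08, 99.84, 177.63):
    t = box_cox(elevation, lam=-2)
    print(f"{elevation:7.2f} m -> {t:.6f} -> {inverse_box_cox(t, -2):.2f} m")
```

With a negative parameter the transform compresses large values much more than small ones, which is how it pulls in the long upper tail of the elevation distribution.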
A global trend in the dataset is an overriding process that affects all measurements in a
deterministic manner, meaning that all data points within the study area are affected by the
trend (ESRI, 2012). The trend examined in Section 2.2 displayed an increasing curve
in the YZ plane, indicative of a global trend within the dataset. This trend can be represented by
a mathematical formula, removed from the dataset prior to the kriging analysis,
and then added back before predictions are made (ESRI, 2012).
The most common way of modelling a trend is by using polynomial functions, with the degree
depending on the trends in the dataset. For the case of the water well dataset, the quick
increase in elevation values due to the escarpment creates a situation in which a second-degree
polynomial would fit the trend better than a first-degree polynomial. The first-degree
polynomial is simply a linear polynomial that would account for the gradual elevation changes
seen from Lake Ontario to the Niagara Escarpment; since elevation values were included on top
of the escarpment, the linear polynomial no longer fits as well. Figure 8 shows a cross-section of
St. Catharines in terms of elevation values.
Figure 8: Cross-section of study area
It can be seen that values at the right of the cross-section increase dramatically due to the
presence of the escarpment, and therefore a linear or planar trend model would not successfully
account for these data points. A second-degree polynomial, which graphs as a parabola,
allows for a curve in the modelling of the trend; this curved surface fits the
water well dataset far better.
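The comparison between a first-degree and a second-degree trend can be illustrated with a small least-squares sketch along a single cross-section. The distances and elevations below are hypothetical values shaped like Figure 8, and the helper names are illustrative; the wizard fits polynomial surfaces in two dimensions, so this one-dimensional version is a simplification.

```python
def fit_polynomial(xs, ys, degree):
    """Least-squares polynomial fit; returns coefficients, lowest power first."""
    n = degree + 1
    # Normal equations A c = b with A[i][j] = sum(x^(i+j)), b[i] = sum(y x^i).
    a = [[sum(x ** (i + j) for x in xs) for j in range(n)] for i in range(n)]
    b = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for row in range(col + 1, n):
            factor = a[row][col] / a[col][col]
            for k in range(col, n):
                a[row][k] -= factor * a[col][k]
            b[row] -= factor * b[col]
    # Back substitution.
    coeffs = [0.0] * n
    for i in reversed(range(n)):
        known = sum(a[i][k] * coeffs[k] for k in range(i + 1, n))
        coeffs[i] = (b[i] - known) / a[i][i]
    return coeffs


def sum_squared_residuals(xs, ys, coeffs):
    """Total squared misfit between the data and the fitted polynomial."""
    predict = lambda x: sum(c * x ** p for p, c in enumerate(coeffs))
    return sum((y - predict(x)) ** 2 for x, y in zip(xs, ys))


# Hypothetical cross-section: distance south of the lake (km), elevation (m).
distance_km = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
elevation_m = [76, 79, 83, 86, 90, 95, 101, 112, 135, 170]

linear = fit_polynomial(distance_km, elevation_m, 1)
quadratic = fit_polynomial(distance_km, elevation_m, 2)
ssr_linear = sum_squared_residuals(distance_km, elevation_m, linear)
ssr_quadratic = sum_squared_residuals(distance_km, elevation_m, quadratic)
print(f"linear misfit = {ssr_linear:.1f}, quadratic misfit = {ssr_quadratic:.1f}")
```

A lower misfit for the quadratic is expected whenever the profile bends upward, as it does at the escarpment.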
3.2. Inverse Distance Weighted
The Inverse Distance Weighted (IDW) interpolation model operates under the assumption that
things that are close to one another are more alike than things that are farther apart (ESRI,
2012). Using this assumption, unmeasured locations are predicted using a weighting system:
measured points closer to the prediction location are given greater weight than those farther
away (ESRI, 2012).
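The weighting idea can be sketched in a few lines: each measured point receives a weight of 1 / distance^power, and the prediction is the weighted average of the measured values. This is a minimal illustration with hypothetical points and an illustrative function name; it uses every sample and ignores the search neighbourhood and sector options that the Geostatistical Wizard applies.

```python
from math import hypot


def idw_predict(x, y, samples, power=2.0):
    """Inverse distance weighted prediction at (x, y).

    `samples` is a list of (x, y, z) tuples. A prediction exactly on a
    sample location returns that sample's value.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for sx, sy, sz in samples:
        distance = hypot(x - sx, y - sy)
        if distance == 0:
            return sz
        weight = 1.0 / distance ** power
        weighted_sum += weight * sz
        weight_total += weight
    return weighted_sum / weight_total


# Two hypothetical wells 10 units apart, at 100 m and 120 m elevation.
samples = [(0, 0, 100.0), (10, 0, 120.0)]
midpoint = idw_predict(5, 0, samples)       # equal distances, equal weights
near_first = idw_predict(2, 0, samples)     # dominated by the 100 m well
print(midpoint, round(near_first, 2))
```

A higher power makes the weights fall off faster with distance, so the surface follows nearby points more closely; a lower power gives a smoother result.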
[Figure 8 chart: "Study Area Cross-Section", St Catharines; horizontal axis from 0 to 9,000]
The IDW model generally works reasonably well for elevation values, because elevations at
nearby locations are typically more similar than elevations at distant ones. The IDW model was
run within the Geostatistical Wizard to create the interpolated surface.
The original idea going into this project was to create an IDW surface that was as exact as
possible, to better compare with the fluctuations expected in the kriged surface; however,
after working through the wizard with the dataset, it was decided that the IDW would be a
smoother surface. This decision was made in the hope of increasing the overall
accuracy of the surface and reducing error, and also to avoid the bulls-eye effect near data
points whose z-values differ from those of the surrounding area.
The first step in creating the IDW model was choosing the major and minor semiaxes for the
search neighbourhood. With the Niagara Escarpment running east to west
through the study area, it was assumed that values in the east-west direction would be more
similar to a predicted location than values in the north-south direction, particularly near the
escarpment. For this reason, an elliptical search neighbourhood was chosen and aligned parallel to
the escarpment by choosing a major semiaxis of 4,000, a minor semiaxis of 1,500, and an angle
of 60˚ from north. Figure 9 compares the IDW model and search neighbourhood
against a digital elevation model, showing the elevation change that the
neighbourhood was placed parallel to.
Figure 9: Comparison of IDW model to digital elevation model
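To show what an elliptical search neighbourhood means in practice, the sketch below tests whether a measured point falls inside the ellipse around a prediction location. It uses the semiaxes and angle given above, and assumes that the angle is measured clockwise from north and that the semiaxes share the units of the coordinates; the function name is illustrative and the wizard's own conventions may differ.

```python
from math import cos, radians, sin


def in_search_ellipse(dx, dy, major=4000.0, minor=1500.0, angle_deg=60.0):
    """Return True if an offset (dx east, dy north) lies inside the ellipse.

    The major axis points `angle_deg` clockwise from north (an assumption).
    """
    theta = radians(angle_deg)
    # Components of the offset along and across the major axis.
    along = dx * sin(theta) + dy * cos(theta)
    across = dx * cos(theta) - dy * sin(theta)
    return (along / major) ** 2 + (across / minor) ** 2 <= 1.0


# A point 3,000 units away along the major axis is inside the neighbourhood.
along_axis = in_search_ellipse(3000 * sin(radians(60)), 3000 * cos(radians(60)))
# A point 3,000 units away across the major axis is outside it.
across_axis = in_search_ellipse(3000 * cos(radians(60)), -3000 * sin(radians(60)))
print(along_axis, across_axis)
```

Points at the same distance are therefore treated differently depending on direction, which is the intended effect of aligning the neighbourhood with the escarpment.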
Aligning the search neighbourhood this way helps predictions near the escarpment, where
elevation values to the north and south would be quite different whereas values to the east
and west should be more similar. Eight sectors were chosen so that the surface would be
smoother; using eight sectors allows the maximum neighbour (20) and minimum neighbour (5)
values to be applied to each sector rather than overall (ESRI, 2012). This spreads smaller
weights across more of the surrounding data points, as opposed to using a maximum of only 20
points and placing very high weights on nearby points. With the search neighbourhood
parameters finalized, the power parameter was then optimized using the Geostatistical Wizard.
The power parameter controls how quickly the weighting of data points decreases with distance;
it is optimized by minimizing the root mean square prediction error and, in the
case of the water well dataset, became 1.22844 (ESRI, 2012). With the parameters set,
the wizard can be finished and the interpolated surface produced. On completion, the
wizard provides a method report detailing the parameters used in the interpolation;
the method report for the water wells can be seen in Figure 10.
Figure 10: Method report for IDW interpolation
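The idea of choosing the power by minimizing the root mean square prediction error can be sketched with leave-one-out cross-validation: each well is predicted from all the others, and the power with the lowest error is kept. The wells, the candidate powers, and the function names below are hypothetical, and a simple grid search stands in for whatever optimizer the wizard uses internally.

```python
from math import hypot, sqrt


def idw(x, y, samples, power):
    """Inverse distance weighted prediction from (x, y, z) samples."""
    weighted_sum = weight_total = 0.0
    for sx, sy, sz in samples:
        distance = hypot(x - sx, y - sy)
        if distance == 0:
            return sz
        weight = 1.0 / distance ** power
        weighted_sum += weight * sz
        weight_total += weight
    return weighted_sum / weight_total


def loo_rmse(samples, power):
    """Root mean square error when each sample is predicted from the rest."""
    errors = []
    for i, (x, y, z) in enumerate(samples):
        others = samples[:i] + samples[i + 1:]
        errors.append(idw(x, y, others, power) - z)
    return sqrt(sum(e * e for e in errors) / len(errors))


def best_power(samples, candidates):
    """Candidate power with the lowest leave-one-out error."""
    return min(candidates, key=lambda p: loo_rmse(samples, p))


# Hypothetical wells: gentle rise to the south, then a sharp step.
wells = [
    (0, 9, 76.0), (3, 8, 80.0), (6, 7, 83.0), (1, 6, 88.0),
    (5, 5, 92.0), (2, 4, 97.0), (6, 3, 104.0), (3, 2, 118.0),
    (0, 1, 150.0), (5, 0, 172.0),
]
candidates = [1.0 + 0.25 * i for i in range(9)]  # 1.0, 1.25, ..., 3.0
power = best_power(wells, candidates)
print(f"power = {power}, RMSE = {loo_rmse(wells, power):.2f} m")
```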
The final IDW output can be seen in Figure 11.
Figure 11: Final IDW map output
3.3. Kriging
The Kriging interpolation technique is similar to the IDW interpolator, but takes a far more
geostatistically intensive approach. Like IDW, Kriging assigns weights to the surrounding
measured values to derive its predictions; unlike IDW, however, the Kriging weights depend
not only on distance to the predicted location but also on the spatial
arrangement and autocorrelation of those points (ESRI, 2012). Kriging therefore essentially
quantifies the basic rule that things closer together are more similar than those far apart, and
uses it as part of the weighting method within the formula.
The first step in the Kriging process is determining the Kriging method to use; for
this project, the Empirical Bayesian Kriging (EBK) method was chosen. This method was chosen
primarily because, in large datasets, EBK divides the input data into overlapping
subsets for which multiple semivariograms are calculated and analyzed (ESRI, 2012). The
prediction for each location is then generated using unique semivariogram distributions,
weighting subsets closer to the location higher than those far away (ESRI, 2012). Due to the
extreme variation near the escarpment, the subsetting of data and subsequent analysis of
multiple semivariograms was expected to provide a more reliable and accurate Kriging approach
than the other methods.
The EBK method essentially accounts for the error introduced when estimating the
semivariogram, whereas other Kriging methods assume that the estimated semivariogram is the
true semivariogram for the entire interpolation region (ESRI, 2012). Automating the
semivariogram estimation also keeps interactive modelling to a minimum, which in turn can
reduce the amount of human error introduced into the model.
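For readers unfamiliar with the semivariogram, the sketch below computes a classical empirical semivariogram: point pairs are grouped by separation distance, and the semivariance of each group is half the mean squared difference in value. This is only the basic building block, with illustrative names and hypothetical data; EBK goes further by simulating many semivariograms per subset, which is not reproduced here.

```python
from math import hypot


def empirical_semivariogram(samples, lag_width, n_lags):
    """Return (mean distance, semivariance, pair count) for each lag bin.

    `samples` is a list of (x, y, z) tuples; empty bins are skipped.
    """
    # Per bin: summed distance, summed squared difference, pair count.
    bins = [[0.0, 0.0, 0] for _ in range(n_lags)]
    for i, (xi, yi, zi) in enumerate(samples):
        for xj, yj, zj in samples[i + 1:]:
            distance = hypot(xi - xj, yi - yj)
            k = int(distance // lag_width)
            if k < n_lags:
                bins[k][0] += distance
                bins[k][1] += (zi - zj) ** 2
                bins[k][2] += 1
    return [(d / c, sq / (2 * c), c) for d, sq, c in bins if c > 0]


# Hypothetical transect: values diverge as separation grows.
transect = [(0, 0, 1.0), (1, 0, 2.0), (2, 0, 4.0)]
variogram = empirical_semivariogram(transect, lag_width=1, n_lags=3)
for distance, semivariance, pairs in variogram:
    print(f"lag {distance:.1f}: semivariance {semivariance:.2f} ({pairs} pairs)")
```

Semivariance that rises with distance is the quantitative form of the rule that nearby points are more alike than distant ones.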
With the kriging method chosen, the parameters must once again be filled out in the
Geostatistical Wizard. The parameters were chosen to highlight the variation in
elevation seen within the study area. The predictive surface is an attempt to build a strict
surface within the study area, picking up on smaller elevation changes than the IDW.
The subset size, which defaults to 100 data points, was reduced to 50 data points in order to
create more, smaller subsets. This was done with the reasoning that smaller subsets will in
turn give even lesser weights to values that are farther away and less spatially
autocorrelated. The overlap factor for the subsets was 1.2, indicating that about 20% of data
points will be used in two subsets, while the remaining 80% will be included in only one subset.
The number of simulations was left at 100, as changing this value did very little to the actual
result. The output surface type was also left at its default value of a prediction surface, as
opposed to a surface showing probability or prediction standard error.
The search neighbourhood parameters were set up in an attempt to capture all of the variation
in the study area. The radius of the search neighbourhood was increased slightly from 750 to
800, but the minimum neighbours value was reduced significantly to only 3. This was done with
the reasoning that predictions in areas with very few surrounding data points should not have
to reach a great distance to obtain their minimum number of neighbours. The maximum number of
neighbours was increased to 20 to account for areas with large numbers of data points.
Lastly, the search neighbourhood was left as a standard circle with only one sector, once
again to try to account for the variation in the surface. Where the IDW was an attempt at
creating a smooth surface and used eight sectors, the kriged surface is far stricter and thus
only one sector is preferred.
With the parameters for the kriging completed, the Geostatistical Wizard can be finished and
the method report produced. The method report for the EBK performed can be seen in Figure
12.
Figure 12: Method report for EBK interpolator
The final EBK output can be seen in Figure 13.
Figure 13: Empirical Bayesian Kriging output
4. Analysis
The following section details an analysis of the two interpolation techniques, the data coverage
within the study area, and the accuracy of the results.
4.1. Interpolation Techniques
The IDW and EBK interpolation techniques have separate advantages and disadvantages, some of which
can be seen in the comparison between the two outputs that have been created for the water well
dataset. Multiple differences occur between the two outputs due to the differences in user-defined
inputs, as well as the major statistical differences between the two interpolation methods.
The IDW technique is extremely sensitive to clusters and outliers, since it is
based directly on a distance weighting system (ESRI, 2012). In clustered data, predictions near
clusters may be very accurate, but predictions made in areas with few data points probably will
not be, unless very little variation occurs in the study area. Outliers can also greatly affect the
result; extreme values that are highly weighted will degrade the accuracy of the predictions.
Figure 14 shows examples, outlined in blue circles, of areas where abnormal elevation
values created small depressions.
Figure 14: Problems with IDW result
These areas may not be statistical outliers, and instead may just be quite a bit different than the
surrounding data points. In this case it is known that many of these low spots are wells dug
along the 12 Mile Creek waterway, which reduces their elevations compared to surrounding
points. Aside from these isolated points, the IDW produced a smooth surface that gradually
increases the farther south it moves.
The EBK technique is affected by clusters and outliers, but not to the extent of the IDW
technique. The use of subsets and multiple semivariograms can aid in reducing the weight
assigned to points that are less spatially autocorrelated with the prediction location. Areas
within the study area with large gaps will once again be poorly represented by the model, but
with a more advanced technique like EBK, the error may not be as drastic as with IDW. An example
of this can be seen in Figure 15.
Figure 15: Issues with EBK output
The area outlined in blue shows the linear depression pattern within the EBK output, which
closely follows the true location of 12 Mile Creek. The use of spatial autocorrelation, instead of
simply distance, allows the EBK interpolator to make more educated decisions on what the
predicted value could be at any location. For this example, the trouble area (outlined in purple),
is an area with a real lack of data points, which the EBK modelled far better than the IDW.
These outputs are difficult to compare in terms of which technique performed the better
interpolation, because the IDW was designed to be a smoother surface and the EBK a
stricter one. The expected differences can be seen when the two surfaces are
compared (Figure 16).
Figure 16: Differences between EBK and IDW
Figure 16 shows the differences in elevation between the two techniques. Areas in blue are
locations where the EBK was considerably higher than the IDW, whereas areas in orange and red
are locations where the EBK was considerably lower than the IDW. A large difference can be seen
in the aforementioned ‘trouble area’; this could be due to a lack of data points, or simply that
the IDW followed a smooth gradient and did not pick up the elevation decrease near 12 Mile Creek.
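A difference map of this kind amounts to a cell-by-cell subtraction of two surfaces on the same grid. The sketch below shows the idea on a tiny hypothetical grid; the function names, the 1 m tolerance, and the elevation values are assumptions for illustration, not values taken from Figure 16.

```python
def difference_surface(ebk, idw):
    """Cell-by-cell EBK minus IDW for two grids of equal shape."""
    return [[e - i for e, i in zip(ebk_row, idw_row)]
            for ebk_row, idw_row in zip(ebk, idw)]


def classify(diff, tolerance=1.0):
    """Label a difference as higher, lower, or similar within a tolerance (m)."""
    if diff > tolerance:
        return "EBK higher"
    if diff < -tolerance:
        return "EBK lower"
    return "similar"


# Hypothetical 2 x 2 grids of predicted elevation (m).
ebk_grid = [[100.0, 95.0], [110.0, 120.0]]
idw_grid = [[100.0, 99.0], [108.0, 121.0]]

diff_grid = difference_surface(ebk_grid, idw_grid)
labels = [[classify(d) for d in row] for row in diff_grid]
print(diff_grid)
print(labels)
```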
4.2. Data Coverage
The data coverage within the study area greatly affects the output of an interpolated surface.
Areas with few or no data points are extremely difficult to predict without data
points close enough to interpolate from. The water well elevation measurements show
two major gaps within the study area (Figure 17).
Figure 17: Study area data gaps
These two areas are rather large and yet have very few data points. The lower yellow polygon
was previously described as a trouble area, as elevation values are high apart from a select few
within the 12 Mile Creek basin. The lack of data in this area makes it extremely
difficult for the interpolator to pick up on a feature like 12 Mile Creek. The upper yellow
polygon is located in the urban area of St. Catharines, which could explain why so few wells have
been drilled there. This area did not cause as much trouble because it is not an area of high
variability. The upper polygon is located north of the escarpment, on the Lake Iroquois bench,
where the elevation changes are very gradual and follow a similar trend to the closest data
points, even points that may be far away.
Aside from these two trouble areas, data coverage across the study area is good and
allows for fairly accurate interpolation.
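Data gaps like the two in Figure 17 were identified visually here. One hedged way to flag them numerically is to measure how far each grid cell lies from its nearest data point. The sketch below is an illustration only: the well coordinates, the grid, the 2.0-unit threshold, and the function name `distance_to_nearest_point` are all assumptions.

```python
import numpy as np


def distance_to_nearest_point(points, grid_x, grid_y):
    """Distance from every grid cell centre to its nearest data point."""
    gx, gy = np.meshgrid(grid_x, grid_y)                # each of shape (ny, nx)
    cells = np.column_stack([gx.ravel(), gy.ravel()])   # (ny * nx, 2)
    # Pairwise distances between cells and points: (n_cells, n_points)
    d = np.linalg.norm(cells[:, None, :] - points[None, :, :], axis=2)
    return d.min(axis=1).reshape(gx.shape)


# Hypothetical well locations in arbitrary map units (not project data).
wells = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [5.0, 5.0]])
grid_x = np.arange(0.0, 6.0, 1.0)
grid_y = np.arange(0.0, 6.0, 1.0)

nearest = distance_to_nearest_point(wells, grid_x, grid_y)

GAP_THRESHOLD = 2.0  # assumed cut-off; a real value depends on the terrain
gap_mask = nearest > GAP_THRESHOLD
print(int(gap_mask.sum()), "of", gap_mask.size, "cells lie in a data gap")
```

A distance threshold alone does not capture the point made above that a gap matters most where elevation is highly variable, so a mask like this would still need to be read alongside the terrain.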
4.3. Accuracy of Results
The accuracy of the results is difficult to quantify, so the accuracy of each interpolation
technique is assessed mainly by visual comparison with satellite imagery.
For the most part, the IDW gave an accurate result, but with less precision than the EBK. The
IDW result shows a constant, smooth gradient, with elevation increasing towards the south.
Small elevation changes, such as those near water bodies, are mostly not picked up by the IDW.
Figure 18 shows the IDW result overlaid on satellite imagery to assess accuracy.
Figure 18: IDW overlaid onto satellite imagery
Neither 12 Mile Creek nor the Welland Canal shows any elevation change in the IDW surface,
indicating that the IDW interpolator smoothed over those areas. The green circular areas in
the bottom-left of the figure may appear to be errors, but the elevation measurements are
accurate because the wells lie within the creek bed. The only noticeable error in the study
area is the location outlined by the yellow ellipse. This small depression is attributed to a
measurement error, as elevation values in Google Earth contradict the value recorded for the
water well.
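The way a single bad reading produces a local depression follows from how IDW weights nearby points. The following is a minimal sketch of a basic IDW estimate, assuming a power of 2 and using every point rather than a search neighbourhood; it is not the ArcGIS implementation, and the coordinates and elevations are hypothetical.

```python
import numpy as np


def idw_predict(xy, z, target, power=2.0):
    """Inverse-distance-weighted estimate at one target location."""
    d = np.linalg.norm(xy - target, axis=1)
    if np.any(d == 0):
        # The target coincides with a data point: return that value exactly.
        return float(z[np.argmin(d)])
    w = 1.0 / d ** power
    return float(np.sum(w * z) / np.sum(w))


# Four hypothetical wells on the corners of a square, plus one in the centre.
xy = np.array([[0.0, 0.0], [2.0, 0.0], [0.0, 2.0], [2.0, 2.0], [1.0, 1.0]])
z_good = np.array([100.0, 100.0, 100.0, 100.0, 100.0])
z_bad = z_good.copy()
z_bad[4] = 80.0  # one assumed erroneous low reading at the centre well

target = np.array([1.0, 0.5])
print("all readings consistent:", idw_predict(xy, z_good, target))
print("one erroneous reading:  ", idw_predict(xy, z_bad, target))
```

Because the nearest point carries the largest weight, the estimate close to the erroneous well is pulled well below its neighbours, which would appear on the map as a small circular depression.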
The EBK result was slightly more precise than the IDW, picking up smaller elevation changes and
better representing the actual topography in the area. It does so even where there are few
data points near the predicted location. This can be seen in Figure 19, where the low-elevation
colours follow 12 Mile Creek far better than in the IDW result.
Figure 19: EBK overlaid on satellite imagery
The area outlined by the yellow circle is the depression caused by the previously mentioned
measurement error. The only other slight error is indicated by the yellow ellipse, which shows a
small linear depression that is slightly offset from where 12 Mile Creek actually runs.
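The measurement error was identified by comparing the well's elevation with Google Earth. A check of that kind can be sketched as a simple tolerance test against a reference elevation sampled at each well location. The values, the 10 m tolerance, and the function name `flag_suspect_elevations` below are assumptions for illustration only.

```python
import numpy as np


def flag_suspect_elevations(measured, reference, tolerance):
    """Mask of points whose measured value differs from the reference
    by more than the given tolerance."""
    measured = np.asarray(measured, dtype=float)
    reference = np.asarray(reference, dtype=float)
    return np.abs(measured - reference) > tolerance


# Hypothetical well elevations and reference elevations, in metres.
well_elev = [101.0, 98.5, 72.0, 110.0]
ref_elev = [100.0, 99.0, 97.0, 109.0]

suspect = flag_suspect_elevations(well_elev, ref_elev, tolerance=10.0)
print(suspect)
```

The reference must be sampled at the well itself for this to work: a well in a creek bed has a legitimately low reading, and its reference value is low as well, so it is not flagged.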
Overall, the accuracy of both interpolation techniques was acceptable, given what each
technique was intended to achieve.
5. Future Recommendations
The major issue encountered when completing the project was the lack of adequate data
coverage in certain areas. Major data gaps create areas where it is very difficult for software to
interpolate a surface because there are too few reference points. Even smaller data gaps can pose
large problems if they occur in areas of high variability. For example, if more data points were
available along the escarpment, as well as along 12 Mile Creek, both of these features could be
better identified using interpolation.
In the future, additional datasets could be used to supplement the water well data with more
elevation measurements. Additional data points in the previously mentioned areas would
provide a far more accurate result for the entire study area.
6. Conclusion
The Geostatistical Analysis of Student Collected Spatial Data project was successfully completed
with the creation of two interpolated surfaces. Water well data with elevation
measurements were obtained for the city of St. Catharines and used as the basis for
interpolating each surface.
Two surface interpolation techniques were used: Inverse Distance Weighted (IDW) and
Empirical Bayesian Kriging (EBK). The IDW technique was used to create a much smoother surface,
which gave a good overall estimate of elevations in the area, whereas the EBK technique was
used to create a much stricter surface.
Both interpolation techniques produced surfaces that met the original goals of the client and
the project. Few errors occurred during the project; however, data coverage in the area could
be improved, which in turn would improve the interpolated surfaces.
Bibliography
Babish, G. (2006). Geostatistics Without Tears: A practical guide to surface interpolation, geostatistics,
variograms and kriging. Regina: Environment Canada.
ESRI. (2012). ArcGIS 10.1 Help. Redlands, CA: ESRI.
Smith, I. D. (2014, January 24). Geostatistical Analysis of Student Collected Spatial Data. Retrieved from
Terms of Reference:
https://niagara.blackboard.com/webapps/portal/frameset.jsp?tab_tab_group_id=_2_1&url=%2Fwebapps%2Fblackboard%2Fexecute%2Flauncher%3Ftype%3DCourse%26id%3D_113105_1%26url%3D
Wikipedia. (2014, March 18). Heteroscedasticity. Retrieved from Wikipedia:
http://en.wikipedia.org/wiki/Heteroscedasticity