Geography, Spatial Data Analysis, and

Geostatistics: An Overview

Robert P. Haining,1 Ruth Kerry,2 Margaret A. Oliver3

1Department of Geography, University of Cambridge, Cambridge, UK, 2Department of Geography, Brigham

Young University, Provo, UT, and CRSSA, Rutgers University, New Brunswick, NJ, 3Department of Soil

Science, University of Reading, Reading, UK

Geostatistics is a distinctive methodology within the field of spatial statistics. In the

past, it has been linked to particular problems (e.g., spatial interpolation by kriging)

and types of spatial data (attributes defined on continuous space). It has been used

more by physical than human geographers because of the nature of their types of data.

The approach taken by geostatisticians has several features that distinguish it from the

methods typically used by human geographers for analyzing spatial variation associ-

ated with regional data, and we discuss these. Geostatisticians attach much impor-

tance to estimating and modeling the variogram to explore and analyze spatial

variation because of the insight it provides. This article identifies the benefits of geo-

statistics, reviews its uses, and examines some of the recent developments that make it

valuable for the analysis of data on areal supports across a wide range of problems.

Introduction

As an introduction to this special issue, the purpose of this article is to provide an

overview of the core concepts and techniques of geostatistics, together with a short

literature review of its application in the environmental sciences and in geography.

Geostatistics has a long history of application in the environmental sciences where

data are on a point or small regular area support, but it is now being applied to re-

gional data where data are on an areal support that might be large and regular or

irregular. We describe the new tools associated with the latter type of data and contrast

them with techniques of spatial data analysis with which geographers, especially hu-

man geographers, are familiar. These techniques have descended more or less directly

from work that began in the 1960s by Dacey (1968) and Cliff and Ord (1969), also the

subject of a recent special issue of Geographical Analysis (2009, issue 4). Geostatistics,

by contrast, has a different lineage and uses a different set of tools and techniques.

Correspondence: Ruth Kerry, Department of Geography, 690 SWKT, Brigham Young University, Provo, UT 84602; e-mail: [email protected]

Submitted: August 22, 2008. Revised version accepted: September 9, 2009.

Geographical Analysis 42 (2010) 7–31 © 2010 The Ohio State University

Geographical Analysis ISSN 0016-7363

The use of the term spatial analysis in geography can be traced back to the

1950s (see, e.g., Berry and Marble 1968). It includes several distinctive elements

(Haining 2003, pp. 4–5), but the statistical analysis of spatial data is the focus here,

referred to by statisticians as spatial statistics (Ripley 1981) or statistics for spatial

data (Cressie 1993). Geographers often refer to these as methods for spatial data

analysis (Haining 1993), and many of these models and techniques figure prom-

inently in geographic information science (Goodchild and Haining 2004) and spa-

tial econometrics (Anselin 1988).

The roots of spatial statistics can be traced back to the early part of the twen-

tieth century to analyses of agricultural field trial data by statisticians. Geostatistics

is a component of spatial statistics, although its evolution has been led principally

by applied scientists and mathematicians rather than by classically trained statis-

ticians. This historical context may explain why little cross-fertilization occurred

with other branches of spatial statistics until quite recently (Cressie 1993; Diggle

and Ribeiro 2007) and why geostatistics is distinctive.

Any methodology for analyzing spatial data needs to recognize that such data

have the fundamental property of spatial dependence or spatial autocorrelation. For

many attributes, values recorded at locations close together in space are correlated

(autocorrelated); as the separating distance increases, autocorrelation weakens.¹

The autocorrelation structure in a region may be complex, with several scales of

variation nested within, or superimposed on, one another, varying with direction

(anisotropic) and between subareas (spatially heterogeneous). Quantifying spatial

dependence matters, whether the purpose of an analysis is to interpolate, to fit a

regression model, or to test a hypothesis (Haining 2003, pp.33–36, 40–41). Differ-

ent branches of spatial statistics model spatial dependence in different ways.

Geographical data acquire other properties as a consequence of the chosen

representation of geographic space. The areal units into which a study region may

be partitioned for reporting attribute values often vary in size and shape (e.g., cen-

sus output areas). If the population denominator for rates (e.g., mortality rates) var-

ies, the standard errors of such statistics are not constant across a map. Therefore,

values obtained from irregularly sized areas may not be directly comparable, mak-

ing map interpretation potentially problematic. Data for areas with small popula-

tions suffer from the ‘‘small number problem’’ (Haining 2003, pp.196–99). Methods

must be able to deal not only with spatial dependence but also with the properties

acquired as a consequence of a chosen representation.
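The effect of a varying population denominator can be illustrated with a minimal sketch. It assumes that the count in each area is binomial, so that the standard error of a rate p observed in a population of size n is sqrt(p(1 − p)/n). The function name, the rate, and the population sizes are hypothetical and are not taken from the article.

```python
import math

def rate_standard_error(rate, population):
    """Binomial standard error of an observed rate: sqrt(p(1 - p) / n)."""
    if population <= 0:
        raise ValueError("population must be positive")
    return math.sqrt(rate * (1.0 - rate) / population)

# Hypothetical areas that share the same underlying rate (5 per 1,000)
# but differ in population size.
underlying_rate = 0.005
populations = [500, 5_000, 50_000]

for n in populations:
    se = rate_standard_error(underlying_rate, n)
    print(f"population {n:>6}: standard error = {se:.5f}")
```

Under this assumption a tenfold increase in population reduces the standard error by a factor of about 3.2 (the square root of 10), which is why rates mapped for small areas are the least reliable.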

This article elucidates the distinctiveness of geostatistics and how it differs from

other branches of spatial statistics that are concerned with the same general prob-

lem of analyzing spatial variation and illustrates the relevance of geostatistics to

geographers. Physical geographers have been somewhat receptive to geostatistics,

partly because their problems and data are similar to those of other earth scientists,

who were among the early practitioners of geostatistics. Nevertheless, geostatistical

methods generally have been used only for basic interpolation. Human geogra-

phers, by contrast, generally have taken little interest in geostatistics for spatial data

analysis because often attribute values are not defined everywhere in a region, and

data values are defined for irregular spatial units (e.g., census areas).

To verify some of these assertions, we searched the bibliographic databases

Geobase and Institute for Scientific Information Web of Knowledge (ISI).² These

searches produced 4377 and 1596 hits, respectively. Table 1 shows the journals

that publish most articles on geostatistics (also see Zhou et al. 2007). Physical

geographers publish in these journals. However, of the institutions with authors who

have published more than three of the geostatistics articles identified by the

Geobase search, only 9 out of 61 were geography departments, and only 50 of over

4000 articles found with this search included authors from geography departments.

Table 1 also shows specifically which geography journals² have published the most

articles on geostatistics; none is devoted solely to human geography. Figures 1a and

1b show that the number of articles has increased over time, both in general and in

geography; but for the latter, the increase seems to have leveled out since 2000.

Table 1 Number of geostatistics articles identified by Geobase and ISI searches in the top 10 journals that publish most geostatistics articles in general and in geography

In general

Journal name                                           Geobase   ISI
Mathematical Geology                                       310   144
Geoderma                                                   161    51
Water Resources Research                                   143    46
Computers and Geosciences                                  138    47
Journal of Hydrology                                       105    30
Soil Science Society of America Journal                     78    16
Environmetrics                                              66    22
International Journal of Remote Sensing                     65    22
Stochastic Environmental Research and Risk Assessment       47    17
The Environmental Monitoring and Assessment                 42    13

In geography

Journal name                                           Geobase   ISI
International Journal of GIS                                23    10
Acta Geoglogica Sinica                                      13     0
Geographical Analysis                                       12     9
Geographical and Environmental Modelling                     9     0
Cartography and GIS                                          7     0
Journal of Geographical Science                              7     1
Journal of Biogeography                                      6     1
The Professional Geographer                                  6     2
Progress in Physical Geography                               6     3
Physical Geography                                           4     2

Between 1990 and 2008, only 3.4% of the Geobase results and 2.1% of the ISI

results identified geostatistics articles that were published in geography journals.

Geostatistics: Its core concepts, techniques, and relationship with other

approaches to spatial data analysis

An historical perspective on geostatistics

The use of the term geostatistics stems from Matheron’s development of a compre-

hensive theory for the prediction of properties in geographical space. Matheron’s

(1963) theoretical framework for geostatistics was developed from D. G. Krige’s

empirical ideas for improving predictions of the amount of gold in rock (Krige 1951)

by using neighboring samples. Matheron (1963) uses the term kriging for the

method of optimal prediction or estimation in geographical space—a spatial best

linear unbiased predictor (BLUP). Matheron’s fundamental contribution is to define

the covariance or variogram of a random field based on a probabilistic or stochastic

approach to the analysis of spatial variation, which recognizes its complexity and

treats as random that aspect of the variation that appears to be random. Matheron’s

approach led to the formulation of models of spatial variation that provide weights

for the BLUP (see Bilodeau, Meyer, and Schmitt 2005 for more detail on Matheron’s

contribution to geostatistics).

Matheron’s ideas had been anticipated in earlier work. The article by Mercer

and Hall (1911), as well as the appendix to it by Student (1911), anticipates some of

the fundamental features of modern geostatistics: support, spatial dependence,

correlation range, and the nugget effect. Kolmogorov (1941) devises both the

‘‘structure function’’ to represent spatial correlation, which we now recognize as the

variogram, and a method of interpolation now known as kriging (Ripley 1981,

pp. 44–50). Matern (1960) theoretically derives some of the familiar ‘‘permissible’’

functions to describe spatial covariance from random point processes. These are

equivalent to Jowett’s (1955) ‘‘serial variation function,’’ which now is known as

the variogram.

Figure 1. Number of geostatistics articles identified by a Geobase search published (a) in general and (b) in geography journals as a function of time. Both panels plot article counts against year (1990–2010) with fitted linear trends: (a) y = 23.649x − 47011, R² = 0.657; (b) y = 0.7812x − 1553, R² = 0.5735.

The geostatistical approach to describing spatial dependence: the covariance

and variogram

Most spatial properties vary in such a complex way that variation cannot be defined

deterministically, and thus the basis of geostatistics is to treat the variable of interest

as a random variable. Traditionally, a variable is defined on a continuous surface

such that at each point, s, in space, a range of values exists for an attribute, Z(s), and

the one observed, z(s), is drawn at random from a probability distribution. The set of

random variables, $\{Z(s): s \in D\}$, where D is a subset of two-dimensional space ($\mathbb{R}^2$),

is a random field, and the actual values of Z observed are for just one of a poten-

tially infinite number of realizations of it (the ‘‘superpopulation’’ view). Geostatis-

tics is based on regionalized variable theory (RVT), which provides a sound model

of how properties vary in space. It recognizes gradual change across $\mathbb{R}^2$, locally

erratic and structured components of variation, and uncertainty.

To describe the variation of an underlying random field and to estimate the

mean and variance of an attribute, the spatial autocovariance (or covariance) is

estimated to describe quantitatively the relation between pairs of points a given

distance apart. This covariance is given by

$$C(s_i, s_j) = E[\{Z(s_i) - \mu(s_i)\}\{Z(s_j) - \mu(s_j)\}] \tag{1}$$

where $\mu(s_i)$ and $\mu(s_j)$ are the means of Z at $s_i$ and $s_j$, and E[·] is the expected value.

Because only one realization of Z exists at each point, these means are unknown.

To proceed, geostatistics invokes assumptions of stationarity, which means that

certain properties of a random field are assumed to be the same everywhere. We

assume that the mean, $\mu = E[Z(s)]$, is constant for all s, and hence $\mu(s_i)$ and $\mu(s_j)$ can

be replaced by $\mu$, which can be estimated by repetitive sampling. When $s_i$ and $s_j$

coincide, equation (1) defines the variance or the a priori variance of a field,

$\sigma^2 = E[\{Z(s) - \mu\}^2]$, which is assumed to be finite and, as for the mean, the same

everywhere. When $s_i$ and $s_j$ do not coincide, their covariance depends on their

separation and not on their absolute positions, a property that applies to any pair of

points separated by lag h (a vector in both distance and direction). Therefore, given

two points $s_i$ and $s_j$ separated by lag h,

$$C(s_i, s_j) = E[\{Z(s_i) - \mu\}\{Z(s_j) - \mu\}] = E[\{Z(s)\}\{Z(s + h)\} - \mu^2] = C(h) \tag{2}$$

which is also constant for any given h. This constancy of the first and second mo-

ments of a random field constitutes second-order or weak stationarity. Equation (2)

indicates that the covariance is a function of the spatial lag and describes quan-

titatively the dependence between values of Z with changing separation or lag

distance. Often the autocovariance is converted to the dimensionless autocorrelation by

$$\rho(h) = C(h)/C(0)$$

where $C(0) = \sigma^2$ is the covariance at lag 0.

The mean often appears to change across a region, and the variance will then

appear to increase indefinitely as the extent of an area increases. The covariance

cannot be defined because no value for $\mu$ exists to insert into equation (2). This

situation is a departure from weak stationarity. Matheron’s (1965) solution is the

weaker intrinsic hypothesis of geostatistics. Although the general mean might not

be constant, it would be constant for small distances, and so the expected differ-

ences would be zero:

$$E[Z(s) - Z(s + h)] = 0$$

and the expected squared differences for those lags define their variances:

$$E[\{Z(s) - Z(s + h)\}^2] = \mathrm{var}[Z(s) - Z(s + h)] = 2\gamma(h) \tag{3}$$

The quantity $\gamma(h)$ is known as the semivariance at lag h, or the variance per

point when points are considered in pairs. As for the covariance, the semivariance

depends only on the lag and not on the absolute positions of the data points. As a

function of h, $\gamma(h)$ is the semivariogram or, more usually, the variogram. The

variogram can be applied when the assumptions of second-order stationarity do not

hold or when uncertainty exists about whether they do or do not. This result makes

the variogram a valuable tool, and accordingly it has become the cornerstone of

geostatistics. If the field $\{Z(s): s \in D\}$ is second-order stationary, the semivariance

and covariance are equivalent.
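The equivalence noted above can be made explicit. Under second-order stationarity, expanding the squared difference in equation (3) about the constant mean gives the following standard relation, sketched here with $\mu$ for the mean, $\sigma^2 = C(0)$ for the variance, and $\gamma(h)$ for the semivariance:

```latex
\begin{aligned}
2\gamma(h) &= E[\{Z(s) - Z(s + h)\}^2] \\
           &= E[\{(Z(s) - \mu) - (Z(s + h) - \mu)\}^2] \\
           &= E[\{Z(s) - \mu\}^2] + E[\{Z(s + h) - \mu\}^2]
              - 2E[\{Z(s) - \mu\}\{Z(s + h) - \mu\}] \\
           &= 2C(0) - 2C(h),
\end{aligned}
\qquad\text{so}\qquad
\gamma(h) = C(0) - C(h) = \sigma^2\{1 - \rho(h)\}.
```

It follows that a bounded variogram levels off at C(0) once the lag is large enough for the covariance to fall to zero.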

The usual method of computing the empirical semivariances from data,

$\{z(s_1), z(s_2), \ldots, z(s_n)\}$, at sample points $s_1, s_2, \ldots, s_n$, is Matheron’s (1965) method

of moments (MoM) estimator:

$$\hat{\gamma}(h) = \frac{1}{2m(h)} \sum_{i=1}^{m(h)} \{z(s_i) - z(s_i + h)\}^2 \tag{4}$$

where $z(s_i)$ and $z(s_i + h)$ are the actual values of Z at locations $s_i$ and $s_i + h$, which

are separated by the lag h. The sum is over m(h), which is the number of paired

comparisons separated by h. By changing h, an ordered set of semivariances is

obtained; these semivariances constitute the experimental or sample variogram.
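Equation (4) can be sketched in a few lines of code. The version below is a minimal illustration rather than the authors' implementation: it assumes isotropy, so the lag is treated as a distance only, and it groups pairs into distance bins of width `lag_width` because irregularly located data rarely have pairs separated by exactly the same lag. The function name, the parameters, and the transect data are hypothetical.

```python
import math
from collections import defaultdict

def empirical_variogram(coords, values, lag_width, max_lag):
    """Method-of-moments semivariances for isotropic distance bins.

    coords: list of (x, y) tuples; values: list of numbers.
    Returns one (mean pair distance, semivariance, pair count) per bin.
    """
    sq_diff_sums = defaultdict(float)  # sum of squared differences per bin
    dist_sums = defaultdict(float)     # sum of pair distances per bin
    pair_counts = defaultdict(int)     # m(h), the number of pairs per bin
    n = len(values)
    for i in range(n - 1):
        for j in range(i + 1, n):
            d = math.dist(coords[i], coords[j])
            if d == 0.0 or d > max_lag:
                continue  # skip coincident points and pairs beyond max_lag
            b = int(d // lag_width)
            sq_diff_sums[b] += (values[i] - values[j]) ** 2
            dist_sums[b] += d
            pair_counts[b] += 1
    return [
        (dist_sums[b] / pair_counts[b],
         sq_diff_sums[b] / (2 * pair_counts[b]),
         pair_counts[b])
        for b in sorted(pair_counts)
    ]

# Hypothetical transect of five points at unit spacing
transect = [(0, 0), (1, 0), (2, 0), (3, 0), (4, 0)]
observed = [1, 3, 2, 5, 4]
sample_variogram = empirical_variogram(transect, observed, lag_width=1.0, max_lag=2.0)
print(sample_variogram)  # [(1.0, 1.875, 4), (2.0, 1.5, 3)]
```

Returning the pair count alongside each semivariance is a deliberate choice: bins with few pairs give unreliable estimates, and the counts can later serve as weights when a model is fitted.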

Other estimators of the variogram exist, most notably the residual maximum

likelihood (REML) estimator (Pardo-Iguzquiza 1998a, b). Diggle and Ribeiro

(2007) also suggest a model-based approach to geostatistics where expert knowl-

edge, in addition to properties of the data, plays a role in determining an appro-

priate variogram model.

Assuming sufficient data from which to compute a reliable empirical variogram

(see Webster and Oliver 1992, pp. 178, 190–91), a best-fitting model is selected

from what are known as authorized (or permissible) functions (see Webster and

Oliver 2007, pp. 82–95). Several fitting procedures exist from which to choose

(Cressie 1993, pp. 90–104).

The variogram can take on a variety of forms (Fig. 2), and the reader is referred

to Webster and Oliver (2007, pp. 56–60) for a discussion of the most common

forms and their interpretations and for explanations of commonly used terms that

describe important features of the variogram, such as its ‘‘sill,’’ ‘‘range,’’ and ‘‘nug-

get’’ variance. Introductions to basic geostatistical methodology and terms are also

given by Armstrong (1998), Christensen (2001), Goovaerts (1997), and Isaaks and

Srivastava (1989). The variogram is a valuable exploratory data tool, regardless of

whether an analyst wishes to use other geostatistical tools. For example, if data

result in a variogram that appears as a horizontal set of points (i.e., pure nugget),

interpolation would merely give the local observed mean. In such cases, all that is

essentially being done is the application of a local smoothing function. The

variogram shape may also indicate whether the variation has a spatially random

component (nugget, $c_0$; Fig. 2a–d) and whether more than one scale of variation is

present, which requires a nested model that can be used for factorial kriging (Fig.

2d). If a variogram has no upper bound or sill (Fig. 2c), disjunctive kriging and

empirical-BLUP are precluded. Anisotropy in variation can be explored by

computing a variogram in different directions. The variogram map (semivariances

plotted against separation distance in the x and y directions [Webster and Oliver 2007,

p. 75]) can indicate the directions of greatest and least variation.

Figure 2. Examples of (a) a bounded variogram fitted with a spherical model, (b) a bounded variogram fitted with an exponential model (nugget highlighted), (c) an unbounded variogram fitted with a power model, and (d) a bounded variogram with nested variation fitted with a double spherical model. Each panel plots variance against lag distance (h); annotations include the nugget ($c_0$), the sill variance ($c_0 + c_1$), the range (a), and the nested components $c_1$, $c_2$ with ranges $a_1$, $a_2$.
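The model forms named in Figure 2 can be written as short functions, which makes the roles of the nugget, sill, and range concrete. The sketch below uses the standard isotropic forms of the spherical, exponential, and power models and builds the double spherical model by nesting. The parameter names are assumptions chosen to mirror the figure annotations, and the numerical values are hypothetical.

```python
import math

def spherical(h, nugget, partial_sill, a):
    """Bounded spherical model: reaches the sill (nugget + partial_sill) at range a."""
    if h == 0:
        return 0.0
    if h >= a:
        return nugget + partial_sill
    r = h / a
    return nugget + partial_sill * (1.5 * r - 0.5 * r ** 3)

def exponential(h, nugget, partial_sill, r):
    """Exponential model: approaches its sill asymptotically.

    r is a distance parameter; the effective range is about 3r.
    """
    if h == 0:
        return 0.0
    return nugget + partial_sill * (1.0 - math.exp(-h / r))

def power(h, nugget, w, alpha):
    """Unbounded power model, valid for 0 < alpha < 2."""
    if h == 0:
        return 0.0
    return nugget + w * h ** alpha

def double_spherical(h, nugget, c1, a1, c2, a2):
    """Nested model: one nugget plus short-range and long-range spherical structures."""
    if h == 0:
        return 0.0
    return nugget + spherical(h, 0.0, c1, a1) + spherical(h, 0.0, c2, a2)

# Hypothetical parameters: nugget 0.5, components 1.0 and 1.5, ranges 20 and 80
for lag in (0, 10, 20, 80, 160):
    print(lag, round(double_spherical(lag, 0.5, 1.0, 20.0, 1.5, 80.0), 3))
```

Each function returns zero at lag 0 and jumps to the nugget immediately beyond it, which is how a discontinuity at the origin is usually represented.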

Kriging

Kriging is a generic term for a range of BLUP least-squares methods of spatial in-

terpolation in geostatistics. It can be linear or nonlinear, although the former is

more common. Kriging provides not only predictions but also the kriging errors or

kriging variances at each prediction location. The original formulation of linear

kriging, now known as ordinary kriging (Journel and Huijbregts 1978), is the most

robust and most used method. Ordinary kriging assumes that the mean is unknown

but constant and that the random field is locally stationary. In linear kriging, the

estimate at any location $s_0$, $\hat{Z}(s_0)$, is a weighted linear combination of the data:

$$\hat{Z}(s_0) = \sum_{i=1}^{n} \lambda_i z(s_i) \tag{5}$$

The weights, $\lambda_i$, are chosen to minimize $E[\{\hat{Z}(s_0) - Z(s_0)\}^2]$, or the kriging

variance, and to ensure that the estimates are unbiased, the weights are constrained

to sum to 1 (Webster and Oliver 2007, pp. 155–59). Kriging uses the spatial infor-

mation described by a variogram function together with the data to predict opti-

mally. The weights depend on the variogram and the spatial positions of the data,

$\{z(s_i)\}$, in relation to one another and to the target point or block ($s_0$). Observations

that are nearest to a prediction point, s0, have the largest weights, but clusters of

adjacent observations that are highly correlated are individually down-weighted.

Kriging is essentially a local predictor that, depending upon the aims of the pre-

diction, can be applied to point (punctual kriging) or block supports of various sizes

(block kriging) and shapes (area-to-area [ATA] kriging) even if the sample informa-

tion is for points. This theory also provides the basis for designing optimal sampling

schemes for kriging using a variogram (Webster and Oliver 2007, pp. 186–88).
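The steps described above for ordinary punctual kriging can be sketched as follows. This is a minimal illustration under stated assumptions, not a production implementation: it uses the variogram form of the kriging system, in which the condition that the weights sum to 1 is imposed with a Lagrange multiplier, it uses all of the data rather than a local neighborhood, and it solves the system with a small Gaussian elimination routine so that only the standard library is needed. The function names and the example variogram are hypothetical.

```python
import math

def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def ordinary_kriging(coords, values, target, variogram):
    """Return (prediction, kriging variance, weights) at a target point.

    variogram: function of distance that returns 0 at distance 0.
    """
    n = len(values)
    # Semivariances between data points, bordered by the unbiasedness constraint
    a = [[variogram(math.dist(coords[i], coords[j])) for j in range(n)] + [1.0]
         for i in range(n)]
    a.append([1.0] * n + [0.0])
    # Semivariances between each data point and the target
    b = [variogram(math.dist(p, target)) for p in coords] + [1.0]
    solution = solve_linear(a, b)
    weights, lagrange = solution[:n], solution[n]
    prediction = sum(w * z for w, z in zip(weights, values))
    variance = sum(w * g for w, g in zip(weights, b[:n])) + lagrange
    return prediction, variance, weights

def example_variogram(h):
    # Hypothetical exponential variogram with no nugget and unit sill
    return 1.0 - math.exp(-h / 2.0)

pred, krig_var, lam = ordinary_kriging(
    [(0.0, 0.0), (2.0, 0.0)], [1.0, 3.0], (1.0, 0.0), example_variogram)
print(pred, krig_var, lam)
```

With two observations placed symmetrically about the target, the weights are equal and the prediction is the mean of the two values. Moving the target onto a data point returns that observation with zero kriging variance, because the example variogram has no nugget.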

Since its original formulation, kriging has been elaborated to tackle increas-

ingly complex problems. Disjunctive (Matheron 1973) and indicator (Journel 1982)

kriging are nonlinear forms that give probabilities that attribute values are above or

below a given threshold. Both techniques have been widely used as part of the risk

assessment of contaminated sites based on thresholds that define serious contam-

ination (Brus et al. 2002; Gaus et al. 2003). Oliver, Webster, and McGrath (1996)

also discuss the merits of disjunctive kriging for environmental management. Kerry

and Oliver (2007) use indicator kriging to map soil structure from categorical field

observations, and Lloyd and Atkinson (2001) use it to assess the uncertainty of

digital elevation model (DEM) estimates. Lark and Ferguson (2004) use both indi-

cator and disjunctive kriging in an agricultural context to map the probability of

serious nutrient deficiency. Matheron (1969) originally introduced universal kriging

to deal with data with a strong deterministic component (i.e., trend), but the state-

of-the-art is empirical BLUP (Stein 1999). The latter is based on a variogram esti-

mated by REML (Lark, Cullis, and Welham 2006). For situations in which some

prior knowledge about a drift or trend exists, Omre (1987) introduces Bayesian

kriging.

When two or more variables are spatially correlated or co-regionalized, and

one is more expensive to obtain than the other, predictions of the less densely

sampled variable can be improved by ordinary cokriging (CK) (Matheron 1965).

When the cross-variogram has been computed and modeled, the strength of

co-regionalization and the potential benefit of CK for the dependent variable can

be assessed. Several other methods in geostatistics incorporate secondary informa-

tion to improve the accuracy of predictions of a primary variable. Examples include

regression kriging, simple kriging with locally varying means (SKlm), kriging with

an external drift (KED), and multivariate factorial kriging. These methods often have

been compared with ordinary kriging (Odeh, McBratney, and Chittleborough 1994;

Odeh and McBratney 2000; Rawlins et al. 2009). We refer interested readers to

Chiles and Delfiner (1999), Goovaerts (1997), McBratney et al. (2000), Wacker-

nagel (1995), and articles in this special issue for more detail on these methods,

which take advantage explicitly of the co-regionalization between variables. Cross-

variogram analysis also can describe how the relationship between properties

varies with scale in the form of structural correlations (Webster and Oliver 2007,

p. 240) or of regionalized correlation coefficients (Wackernagel 1995). These

multivariate geostatistical methods have interpretive as well as predictive power,

especially if the primary variable is regarded as analogous to the dependent vari-

able and if the secondary variables are treated as the independent variables.

Factorial kriging (Matheron 1982) was developed for nested variation. Long-

and short-range components of variation are estimated in a single analysis and can

be filtered out. Oliver, Webster, and Slocum (2000) use factorial kriging to filter

different scales of variation from remotely sensed imagery and to determine the

sources of variation at each scale. Goovaerts and Webster (1994) use this approach

to isolate different scales of variation in topsoil copper and cobalt concentrations

and to examine the correlation between them at various scales.

The kriging equations can be used with an existing variogram to design a new

optimal sampling scheme for kriging (McBratney, Webster, and Burgess 1981). No

data are required for this task, and it is advantageous when variables are expensive

to obtain and where previous sampling has been either too intensive or insufficient.

Atkinson (1991) uses geostatistics to optimize ground sampling strategies for re-

motely sensed investigations. The structure of spatial autocorrelation characterized

by the variogram also can be used to improve the spatial continuity of classifications

(Oliver and Webster 1989). Atkinson and Lewis (2000) use this approach to classify

remotely sensed imagery, and Frogbrook and Oliver (2007) use it to identify spatially

coherent management zones for precision agriculture. Geostatistical methodology

also has been implemented for compressing image data (Oliver, Shine, and Slocum

2005) and for identifying redundant bands in hyperspectral imagery.

Kriging tends to smooth variation in data (see Webster and Oliver 2007, pp.

267–68, for further explanation). Often this outcome is required, but sometimes

the uncertainty or likely variability of observed patterns needs to be established.

Determining uncertainty involves geostatistical simulation by methods such as

turning bands, sequential Gaussian simulation, and LU decomposition. Given a

specific variogram function and histogram, data with the desired characteristics can

be simulated to produce multiple equi-probable realizations with the same vario-

gram and histogram. Simulation also can be conditioned on existing data where the

characteristics of the conditioning data are taken into account. Goovaerts (2001)

provides an overview of the relative merits of kriging and simulation. Often sim-

ulation is used for risk assessment; for example, Bierkens (2006) uses conditional

simulation to determine uncertainties associated with groundwater pollution and

the associated costs of installing a monitoring network. Webster and Oliver (1992)

use turning bands simulation to create fields with a known variogram to determine

how well the generating variogram is reproduced by samples of different sizes.

They also determine confidence limits for experimental variograms computed from

different sample sizes.
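Of the simulation methods mentioned above, LU decomposition is the simplest to sketch. The example below is an unconditional simulation only, and it rests on assumptions that go beyond the text: the field is taken to be Gaussian and second-order stationary, the covariance matrix is factorized with a Cholesky decomposition (the symmetric form of LU), and an exponential covariance function with hypothetical parameters is used. All names are illustrative.

```python
import math
import random

def cholesky(c):
    """Lower-triangular L with L L^T = c, for symmetric positive-definite c."""
    n = len(c)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(c[i][i] - s)
            else:
                low[i][j] = (c[i][j] - s) / low[j][j]
    return low

def simulate_field(coords, covariance, mean=0.0, rng=None):
    """One unconditional Gaussian realization at the given locations."""
    rng = rng or random.Random()
    n = len(coords)
    c = [[covariance(math.dist(coords[i], coords[j])) for j in range(n)]
         for i in range(n)]
    low = cholesky(c)
    w = [rng.gauss(0.0, 1.0) for _ in range(n)]  # independent standard normals
    # z = mean + L w has covariance L L^T = c
    return [mean + sum(low[i][k] * w[k] for k in range(i + 1)) for i in range(n)]

def exp_cov(h, sill=1.0, r=2.0):
    # Hypothetical exponential covariance function
    return sill * math.exp(-h / r)

grid = [(float(x), float(y)) for x in range(4) for y in range(4)]
realization = simulate_field(grid, exp_cov, mean=10.0, rng=random.Random(42))
print([round(v, 2) for v in realization])
```

Repeating the call with different seeds gives multiple realizations of a field with the same covariance function. The cost of the factorization grows rapidly with the number of locations, so the approach suits small grids.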

Finally, geostatistical methods have been used to investigate temporal as well

as spatial autocorrelation. Applications in this context include studies by Kyriakidis

and Journel (2001a, b) of sulfate deposition over Europe and by Heuvelink, Musters,

and Pebesma (1996) of soil water content. Kyriakidis and Journel (1999) give an

overview of these methods.

Geostatistics for regional data

More recently, the basic theory of geostatistics has been adapted to predict and map

regional data and to model spatial variation in an attribute (a dependent variable) in

terms of other variables (independent variables), as in linear regression. This rep-

resents a particularly interesting and important development in geostatistics for

human geographers.

Oliver et al. (1998) develop binomial CK to analyze the risk of childhood can-

cer in the West Midlands of England. To estimate the variogram of risk for binomial

CK, population size is taken into account to give more weight to pairs of areas with

larger populations and hence more reliable rates (Oliver et al. 1998, pp. 286–87).

In addition to the kriging weights, $\lambda_i$, summing to 1, the weighted sum of the risk

covariances between the target point $s_0$ and the centroids of the data supports $s_i$ is

constrained to equal the variance of the underlying risk (Oliver et al. 1998, pp.

284–85).

When the population at risk is large and the probability is small, the binomial

distribution approaches the Poisson. Monestiez et al. (2005, 2006) introduce Pois-

son kriging for rates where the small number problem is an issue. This approxi-

mation has since been applied to health data (Goovaerts 2005; Ali et al. 2006).

Again, areas with larger populations are given more weight when estimating the

variogram and when solving the kriging equations. An ‘‘error variance’’ term, de-

rived from the Poisson distribution (Goovaerts 2005, 2006b) and associated with

the reliability of the rates based on population size, is introduced.

Articles by Gotway and Young (2002, 2005), Kyriakidis (2004), and Goovaerts (2006b, 2008) address problems associated with change of support in the case of irregularly sized and shaped areas. This situation involves deconvolution of the variogram obtained from areal data to estimate a point-support variogram. Area-to-point (ATP) kriging is then applied, where the measurement support is an area but the prediction support is a point. Goovaerts (2008) describes an approach to ATP kriging that involves discretizing each area, where the number of discretizing points for any area depends on its size. Each discretizing point has an associated population count obtained from small area census data; therefore, population heterogeneity is taken into account in estimating the deconvoluted variogram (Goovaerts 2008, p. 109). The distance between any two areas (required for calculating the variogram) is measured as a population-weighted average of the straight-line distances between all the points that discretize the two areas (Goovaerts 2008, p. 106). Setting aside in this review the “supplementary and unverifiable hypotheses” (Journel and Huijbregts 1978, p. 231) that ATP kriging (downscaling) involves, the creation of such maps of disease rates helps to reduce the visual bias of choropleth maps, where physically large areas dominate.
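The population-weighted distance between two areas can be sketched in Python as follows. This is an illustration under stated assumptions, not the implementation used by Goovaerts (2008): each area is assumed to be represented by an array of discretizing point coordinates with one population count per point, and the function and argument names are hypothetical.

```python
import numpy as np

def population_weighted_distance(points_a, pop_a, points_b, pop_b):
    """Population-weighted mean of straight-line distances between two areas.

    points_a, points_b: (k, 2) arrays of discretizing point coordinates.
    pop_a, pop_b: population counts attached to those points.
    """
    points_a = np.asarray(points_a, dtype=float)
    points_b = np.asarray(points_b, dtype=float)
    pop_a = np.asarray(pop_a, dtype=float)
    pop_b = np.asarray(pop_b, dtype=float)

    # Distances between every point in area A and every point in area B.
    diff = points_a[:, None, :] - points_b[None, :, :]
    dist = np.linalg.norm(diff, axis=2)

    # Each pair of points is weighted by the product of its two populations.
    weights = np.outer(pop_a, pop_b)
    return np.sum(weights * dist) / np.sum(weights)
```

With uniform populations this reduces to the plain average of the point-to-point distances; with uneven populations the distance is pulled toward the separation between the populated parts of the two areas.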

Goovaerts (2006b, 2008) combines Poisson kriging with ATP kriging (Kyriakidis 2004) for the analysis of cancer rate data. Area-to-area (ATA) Poisson kriging is applied when both the measurement and prediction supports are areas (blocks). Taking population into account filters out the influence of the small number problem.

Comparing geostatistics with other approaches to spatial data analysis

Regional data are encountered widely in human geography, and methods have

been developed that, in many respects, contrast with those of geostatistics.

The initial interest in spatial dependence among human geographers in the

1960s took the form of adapting existing tests for spatial autocorrelation developed

by Moran (1950), Geary (1954), and Krishna Iyer (1949) on regular lattices to the

irregular shapes of geographical units (Cliff and Ord 1973). Moran’s I and Geary’s c

statistics are

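$$
I = \frac{n \left[ \sum_{i,j=1}^{n} \delta(i,j)\,\bigl(z(i) - \bar{z}\bigr)\bigl(z(j) - \bar{z}\bigr) \right]}{\left( \sum_{i,j=1}^{n} \delta(i,j) \right) \left[ \sum_{i=1}^{n} \bigl(z(i) - \bar{z}\bigr)^{2} \right]} \tag{6}
$$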

and

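$$
c = \frac{(n - 1) \left[ \sum_{i,j=1}^{n} \delta(i,j)\,\bigl(z(i) - z(j)\bigr)^{2} \right]}{2 \left( \sum_{i,j=1}^{n} \delta(i,j) \right) \left[ \sum_{i=1}^{n} \bigl(z(i) - \bar{z}\bigr)^{2} \right]} \tag{7}
$$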

where $z(i)$ is the observation for area $i$ ($i = 1, \ldots, n$), $\bar{z}$ is the mean of the $n$ observations, and $\delta(i,j)$ is 1 if $i \neq j$ and areas $i$ and $j$ are contiguous, and 0 otherwise. The numerator of equation (6) shares similarities with an estimator for autocovariance (compare it with equation (1)), whereas the numerator of equation (7) is similar to the MoM estimator for computing the semivariances (compare it with equation (4)). Cliff and Ord’s (1973) adaptation of these statistics is achieved by replacing $\{\delta(i,j)\}$ by the more general $\{w(i,j)\}$, which specifies a connectivity or weights matrix. This specification can be based on topological and/or geometric properties of the shapes or on inter-area flow (or similar) data as a measure of the “strength” of the connections between areas (Haining 2003, pp. 74–87). The resulting n-by-n matrix also could be used to identify the different orders of contiguity (similar to the vector h in geostatistics; see equation (4)), but in practice many analyses consider only the first-order or closest nearest neighbors. Testing takes the form of a null hypothesis of no spatial autocorrelation against a nonspecific alternative hypothesis that spatial autocorrelation is present. This type of testing is of particular interest when examining the residuals from a least-squares regression (for which some further modifications to equations (6) and (7) and their distribution theory are needed), because model errors are required to be independent. Investigations of the power of Moran’s I and Geary’s c by simulation, together with evidence that the asymptotic relative efficiency of c to I is less than 1, have resulted in Cliff and Ord’s (1973) modified I statistic becoming the spatial autocorrelation test statistic of choice among geographers.
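As a worked illustration of equations (6) and (7), the following Python sketch computes both statistics for a general weights matrix. It uses the standard forms of Moran’s I and Geary’s c; the function names and the small four-area example are assumptions introduced here, and no significance test is included.

```python
import numpy as np

def morans_i(z, w):
    """Moran's I for observations z and an n-by-n weights matrix w (zero diagonal)."""
    z = np.asarray(z, dtype=float)
    w = np.asarray(w, dtype=float)
    dev = z - z.mean()
    cross_products = np.sum(w * np.outer(dev, dev))
    return z.size * cross_products / (w.sum() * np.sum(dev ** 2))

def gearys_c(z, w):
    """Geary's c for observations z and an n-by-n weights matrix w (zero diagonal)."""
    z = np.asarray(z, dtype=float)
    w = np.asarray(w, dtype=float)
    dev = z - z.mean()
    squared_diffs = (z[:, None] - z[None, :]) ** 2
    return (z.size - 1) * np.sum(w * squared_diffs) / (2.0 * w.sum() * np.sum(dev ** 2))

# Hypothetical example: four areas in a row, each contiguous with its neighbors.
w_example = np.array([[0, 1, 0, 0],
                      [1, 0, 1, 0],
                      [0, 1, 0, 1],
                      [0, 0, 1, 0]], dtype=float)
z_example = np.array([1.0, 2.0, 3.0, 4.0])
```

For this steadily increasing sequence, I is positive and c is below 1, both of which indicate positive spatial autocorrelation. The cross-product form of I parallels an autocovariance, and the squared-difference form of c parallels a semivariance.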

If spatial autocorrelation is detected in least-squares regression residuals, a common response is to respecify the regression model using a simultaneous spatial autoregressive model for the errors, where the value of the error term at location i is modeled as a function of the values of the errors at adjacent locations plus an independent (white noise) term. Geographers analyzing regional data tend to model spatial dependence using the simultaneous spatial autoregressive model (specified either with respect to the dependent variable or the errors) rather than calculating the empirical covariances and selecting a function that best fits these autocovariances (for an overview, see Haining 2003, pp. 297–304, 350–58). This approach taken by geographers is similar to that of econometricians when modeling time-series data, which has led to the adoption, by some, of the term spatial econometrics (Anselin 1988).
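For reference, the simultaneous autoregressive error specification described above is commonly written as follows. The notation is an assumption introduced here for illustration: $y$ is the vector of $n$ observations, $X$ the matrix of independent variables, $W$ an n-by-n weights matrix built from $\{w(i,j)\}$, $\rho$ the spatial autoregressive parameter, and $\varepsilon$ a vector of independent errors with mean zero and variance $\sigma^2$.

$$
y = X\beta + u, \qquad u = \rho W u + \varepsilon
$$

Provided $I - \rho W$ is invertible, the errors can be written as $u = (I - \rho W)^{-1}\varepsilon$, so that

$$
\operatorname{Cov}(u) = \sigma^{2} \left[ (I - \rho W)^{\mathsf{T}} (I - \rho W) \right]^{-1}
$$

The covariance structure is therefore implied by $W$ and $\rho$ rather than fitted to empirical covariances. The variant specified with respect to the dependent variable is $y = \rho W y + X\beta + \varepsilon$.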

Regression analysis is used to identify those independent variables that best account for the variation in a dependent variable. The linear function element of a regression model, which includes the independent variables, specifies the mean of a dependent variable. It is variation in the set of significant independent variables that models (or “statistically explains”) variation in the mean of a dependent variable, and it is usually this variation that is of most interest to geographers. Spatial dependence in the second-order sense, which is important in geostatistics, refers only to that part of the variation in a dependent variable that remains “unexplained” by regression. This remark applies not only to fixed-effects modeling but also to random-effects modeling of spatial data because the random effects term is often specified using some form of (conditional) spatial autoregressive model (Besag, York, and Mollie 1991).

Two further comparative remarks merit attention here. First, the effects of population size are frequently handled in regression through weighted least squares, which down-weights most heavily those data values with the largest error variances in model fitting (e.g., those based on the smallest population denominators). In geostatistics, this down-weighting is introduced both into modeling the variogram and into the kriging weights. Second, geographers have paid particular attention to a form of spatial heterogeneity where properties vary, or appear to vary, with location, especially when observed over large regions. For example, the mean may vary in different map segments (though not in the form of a trend). In addition, or alternatively, the variance and covariance, or even the relationship between a dependent and a set of independent variables, might vary in different parts of a map. To quantify this heterogeneity, statistical tests for global- or map-wide spatial dependence, such as Moran’s I, have been complemented with local statistics, such as local indicators of spatial association that analyze spatially defined subsets of data using moving windows (Anselin 1995). Sometimes geographically weighted regression is used with local subsets of data (see Fotheringham, Brunsdon, and Charlton 2000). Some of the geostatistical methods previously discussed can be seen as equivalent developments to deal with spatial heterogeneity. First, SKlm uses known strata from a geology map, for example, to inform about a locally varying mean, and the class residuals are kriged and added to the strata means (Goovaerts 1997). Similarly, SKlm can use regression between primary and secondary variables to inform about a local mean. McBratney, Hart, and McGarry (1991) also acknowledge that the structure of the variogram can change spatially and thus have partitioned regions based on ancillary data and computed separate variograms for subregions within a study area. Walter et al. (2001) and Haas (1990) use local or moving window variograms and find improvements in estimates over area-wide variograms in situations where data display heterogeneity. Finally, when KED is employed, allowance is made for spatial variation in the linear relationships between variables or the regression coefficients (Goovaerts 1997).
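The weighted least squares down-weighting mentioned at the start of this discussion can be sketched in Python as follows. The sketch assumes that the weights are supplied by the analyst, for example proportional to the population denominators so that rates based on small populations count for less; that choice of weights and the function name are illustrative assumptions.

```python
import numpy as np

def weighted_least_squares(x, y, weights):
    """Fit y = x @ beta by weighted least squares.

    x: (n, p) design matrix (include a column of ones for an intercept).
    weights: n nonnegative weights; larger values mean more reliable observations.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    sqrt_w = np.sqrt(np.asarray(weights, dtype=float))

    # Scaling rows by sqrt(weight) turns the weighted problem into ordinary
    # least squares, which lstsq solves in a numerically stable way.
    beta, *_ = np.linalg.lstsq(x * sqrt_w[:, None], y * sqrt_w, rcond=None)
    return beta
```

For example, fitting only an intercept to the values 0 and 10 with weights 1 and 3 returns 7.5, the weighted mean, rather than the unweighted mean of 5.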

The equivalent problem to spatial interpolation for regional data occurs when there are no data in some areas. Although interest may exist in trying to estimate these missing values, for either prediction or mapping purposes (see, e.g., Haining 2003, pp. 154–74 for an overview of methods, including those based on the expectation-maximization algorithm, which has similarities to kriging), the focus usually is again on fitting a regression model. The concern is less with estimating the missing values per se (although estimates are generated) and more with estimating the parameters of a regression model when there are missing values in a database.

An overview of applications of geostatistics in geography

Table 2 classifies the results from the previously described Geobase search, providing one relevant reference per category (see note 3). The table shows that 40% of geostatistical references found in “mainstream” geography journals refer to the use of kriging in relation to temperature, rainfall erosivity, various atmospheric and groundwater pollutants, soil properties, vegetation and species distribution patterns, and DEM creation. By contrast, kriging has been little used in human geography, an exception being De Cola’s (2002) mapping of Lyme disease.

Few articles in geography journals consider the importance of the variogram as an interpretive tool, for example, for quantifying “spheres of influence” or the proportion of variation that is spatially uncorrelated and correlated at the scale of sampling, or for identifying variation at different scales and directions as a first step in understanding processes. This oversight might indicate that few researchers in geography are as familiar as they could be with computing and modeling the variogram and with choosing an appropriate method of kriging. A possible cause of this problem is that certain software packages, such as Surfer, offer kriging as a method of interpolation with default settings and do not provide users with access to experimental variogram and modeling options. In ArcGIS’s Geostatistical Analyst, a variogram model is fitted to the variogram cloud, but determining visually whether a model is a sensible fit is difficult because of the inherent skewness in the distribution of squared differences (Cressie 1993, p. 41).
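The difference between the variogram cloud and the experimental variogram can be made concrete with a short Python sketch. It assumes the usual MoM estimator, the average of half the squared differences over all pairs in a distance class, and ignores direction; the function names and the binning by user-supplied distance edges are illustrative assumptions.

```python
import numpy as np

def variogram_cloud(coords, values):
    """Return (distance, half squared difference) for every pair of observations."""
    coords = np.asarray(coords, dtype=float)
    values = np.asarray(values, dtype=float)
    i, j = np.triu_indices(len(values), k=1)  # each pair once
    dist = np.linalg.norm(coords[i] - coords[j], axis=1)
    semi = 0.5 * (values[i] - values[j]) ** 2
    return dist, semi

def experimental_variogram(coords, values, bin_edges):
    """Average the cloud within distance classes (MoM estimator, omnidirectional)."""
    dist, semi = variogram_cloud(coords, values)
    bin_edges = np.asarray(bin_edges, dtype=float)
    bin_index = np.digitize(dist, bin_edges) - 1
    lags, gammas, counts = [], [], []
    for k in range(len(bin_edges) - 1):
        in_bin = bin_index == k
        if in_bin.any():
            lags.append(dist[in_bin].mean())    # mean separation in the class
            gammas.append(semi[in_bin].mean())  # semivariance for the class
            counts.append(int(in_bin.sum()))    # number of pairs N(h)
    return np.array(lags), np.array(gammas), np.array(counts)
```

Each point in the cloud is a single half squared difference, so the cloud is strongly skewed; averaging within distance classes gives a small set of semivariances, each with a known number of pairs, against which a fitted model is easier to judge.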

Table 2 shows that multivariate geostatistical methods, which incorporate secondary information into the interpolation process (e.g., SKlm, KED, and CK), have been used for estimating air temperature and bird diversity using the normalized difference vegetation index. Factorial kriging has been used to filter different scales of variation in remotely sensed imagery and plant and soil data and to explore scale-dependent correlations in health geography. Indicator kriging has rarely been used in geography, but examples include incorporating it into the classification of hyperspectral imagery and estimating pre-settlement vegetation patterns. Although indicator and disjunctive kriging are important in risk assessment, a topic of particular concern to geographers studying natural hazards, no such applications of these two techniques were found. Furthermore, disjunctive kriging appears not to have been referred to at all in the geographical literature. Also, although adaptations of kriging, such as binomial CK and Poisson kriging, are ideal for examining animal count data, which are often highly skewed and involve small numbers, no examples were found in the biogeography literature.

Geostatistical change of support has received some attention in the geographical literature; it links with scale issues and the modifiable areal unit problem. In remote sensing and biogeography, geostatistics has been used to investigate the effects of change of support in sampling, but most investigations are in human geography and are relatively recent, applying techniques such as ATP kriging.

Table 2 Summary of a Geobase search for geostatistics¹ articles appearing in journals with Geogra* in the title

Columns, left to right: Geostatistical area of investigation; Physical geography (subfields): Biogeography, Climatology, Geomorphology, Hydrology, Remote sensing; Human geography; Total refs. Cells are listed left to right within each row; empty cells are omitted, so column positions are not shown.

Kriging for interpolation/(comparative interpolation studies)
  9 refs. (1 ref.): Strand (1998); Hessl et al. (2007)
  7 refs. (5 refs.): Wright et al. (2002); Tatalovich, Wilson, and Cockburn (2006)
  5 refs. (13 refs.): Hock and Jensen (1999); Lay and Wang (1996)
  3 refs.: Li, Song, and Xiao (2005)
  4 refs.: De Cola (2002)
  Total: 28 refs. (19 refs.)

Interpretive value of the variogram
  8 refs.: Kent et al. (2006)
  4 refs.: Gomersall and Hinkel (2001)
  1 ref.: Bian and Xie (2004)
  3 refs.: Ma, Ma, and Xu (2004)
  Total: 16 refs.

Multivariate geostatistics (KED, CK, SKlm)
  7 refs.: Lin et al. (2008)
  2 refs.: Hernandez Roberto (2001)
  Total: 9 refs.

Factorial kriging
  2 refs.: Rodgers and Oliver (2007)
  1 ref.: Warr, Oliver, and White (2002)
  1 ref.: Goovaerts, Jacquez, and Greiling (2005)
  Total: 4 refs.

Indicator and disjunctive kriging
  2 refs.: Wang (2007)
  1 ref.: Goovaerts (2002a)
  Total: 3 refs.

Binomial/Poisson kriging
  7 refs.: Goovaerts (2005)
  Total: 7 refs.

Change of support issues/AtoP, AtoA kriging
  1 ref.: Bellehumeur and Legendre (1997)
  1 ref.: Mason, O’Conaill, and McKendrick (1994)
  3 refs.: Goovaerts (2006b)
  Total: 5 refs.

Space-time geostatistics
  1 ref.: Janis and Robeson (2004)
  1 ref.: Su, Song, and Zhang (2003)
  Total: 2 refs.

Sampling scheme design
  1 ref.: Lin et al. (2008)
  1 ref.: Finley et al. (2007)
  Total: 2 refs.

Data compression
  1 ref.: Atkinson, Curran, and Webster (1990)
  Total: 1 ref.

Spatial weighting of classification
  1 ref.: Goovaerts (2002a)
  Total: 1 ref.

Geostatistical simulation
  4 refs.: Bishop, Minasny, and McBratney (2006)
  1 ref.: Goovaerts (2002b)
  2 refs.: Goovaerts (2006a)
  Total: 7 refs.

Geostatistical theory/review articles
  1 ref.: Lark (2000)
  1 ref.: Diggle and Ribeiro (2002)
  8 refs.: Cornford, Csato, and Opper (2005)
  3 refs.: Griffith (2002)
  Total: 13 refs.

Geographers study both spatial and temporal variation, and spatio-temporal

variograms and space-time kriging can be used for predicting phenomena in these

domains. However, only two applications of space-time geostatistics were found:

one for determining the representativeness of air temperature records and one for

assessing the spatio-temporal variation of groundwater salt content.

Only two studies were identified where geostatistics has been applied to sampling issues: one used geostatistics to determine the location of pollutants following a bioterrorist attack, and the other to establish sampling schemes for mapping bird diversity. Although the storage and classification of remotely sensed image data are important in geography, only one article was found that uses geostatistics for data compression and only one that uses spatial weighting by the variogram to improve classification contiguity.

Most applications of geostatistical simulation are in geology or geomorphology for risk assessment and uncertainty, but they are also used in remote sensing and health geography. For the latter, geostatistical simulation is used to assess the uncertainty associated with rare disease clusters.

Several theoretical articles were found that introduce new geostatistical concepts or techniques or that provide reviews. These are the second most abundant type of geostatistics article; however, the authors are not geographers but rather prolific publishers of geostatistical applications in the applied sciences.

Concluding remarks

Geostatistical approaches to the analysis of spatial data have been underexploited in geographical research, particularly in human geography. Two main reasons appear to account for this. First, traditional geostatistical techniques are applicable to attributes with a continuous spatial index (Ripley 1981; Cressie 1993, pp. 8–9), making these techniques less relevant to many problems in human geography, where data frequently relate to areas. Second, geostatistics is often perceived as being of use only for spatial interpolation or kriging, including mapping and sample design when obtaining primary data. Other reasons sometimes cited for the underuse of geostatistical approaches include the lack of time allotted to teaching geostatistics in spatial data analysis courses in geography departments, compared with other forms of spatial analysis, and the lack of instruction about the appropriate use of available software.

Geostatistics embraces a broad range of tools and modeling techniques that can be applied to many spatial problems, including prediction, determination of the scale of spatial variation, design of sampling for primary data collection, smoothing of noisy maps, region identification, multivariate analysis, probability mapping, and change of support. It has a research literature that spans many disciplines. Most important, however, geostatistics provides the spatial analyst with a statistically rigorous model of how properties vary in space (RVT) that recognizes the different components of variation (from locally erratic to spatially structured components) and furnishes permissible models for those components of variation, as well as procedures for fitting them to real data.

Today, geostatistical methods can be applied to regional data, although their

use with this type of data does present challenges. Proximity measures are limited

to the distance between centroids, which cannot be defined uniquely when spatial

units have nonzero areas. Data based on rates and proportions are aggregated, and

the denominators often vary among spatial units reflecting differences in population

size. Some variation in data can be a consequence of these underlying differences

in sample support (Krivoruchko, Gotway, and Zhigimont 2003), and these issues

need to be taken into account when dealing with regional data.

More user-friendly software is now available. A list of free and commercial software for geostatistical analysis is given at http://www.ai-geostats.org/index.php?id=107. The list is by no means exhaustive: it omits packages such as GenStat (developed at Rothamsted Research), S-Plus, and Terraseer’s Space Time Information System, which currently has a beta package that can accommodate different types of geographic data and has capabilities for ATA, ATP, and Poisson kriging. In addition, many computer programs are available from articles published in Computers & Geosciences, and some authors make their code freely available on their personal Web sites. Deutsch and Journel (1998) provide a comprehensive set of Fortran programs for many geostatistical techniques in Geostatistical Software Library and User’s Guide, and most of these programs have been adapted for a Windows environment in the freely available Stanford Geostatistical Modeling Software (Remy, Boucher, and Wu 2009). However, the former has no way of modeling a variogram once it has been computed without using other software, and the latter employs visual fitting. Gstat is a freely available command line software package that is favored by many and includes the capability for variogram model fitting both numerically and visually (Pebesma and Wesseling 1998). With new developments in methodology and software, exciting times lie ahead for geographers exploring the benefits of geostatistics for solving geographic problems.

Notes

1 This maps into what Tobler (1970) describes as “The First Law of Geography”: “everything is related to everything else, but near things are more related than distant things.” Banerjee, Carlin, and Gelfand (2004, p. 39) refer to “The First Law of Geostatistics,” which describes spatial structure in similar but more formal terms. This observation has statistical roots that go back to the early twentieth century (see Student 1911).

2 The strategy used to search for geostatistics articles was as follows: Year range: 1990–June 2008; Document type: all; (Geobase) Subject/Title/Abstract includes: krig* OR variogram OR geostat*; (ISI) Title includes: krig* OR variogram OR geostat*. Geostatistics articles within geography were identified as those that contained one or more of the three preceding keywords in the subject/title/abstract and also were found in a journal with geogra* in the title. We recognize that the search terms are not exhaustive and will not identify all geostatistics articles in the academic literature in general or within geography, but some basic patterns may be observed that we think are informative. The terms used limit the number of articles identified, but few, if any, nongeostatistics ones would be identified with these terms. Simulation was not included as a search term because it could cause confusion. For more information arising from the bibliometric search and for more references obtained from the literature review in this article, interested readers should visit http://www.geog.cam.ac.uk/people/haining/ or contact the corresponding author.

3 See note 2. We do not claim to have identified all areas of application. However, for more

details about references, go to http://www.geog.cam.ac.uk/people/haining/ or contact the

corresponding author.

References

Ali, M., P. Goovaerts, N. Nazia, M. Z. Haq, M. Yunus, and M. Emch. (2006). ‘‘Application of

Poisson Kriging to the Mapping of Cholera and Dysentery Incidence in an Endemic Area

of Bangladesh.’’ International Journal of Health Geographics 5, 45.

Anselin, L. (1988). Spatial Econometrics: Methods and Models. Dordrecht, The Netherlands:

Kluwer.

Anselin, L. (1995). ‘‘Local Indicators of Spatial Association—LISA.’’ Geographical Analysis

27, 93–115.

Armstrong, M. (1998). Basic Linear Geostatistics. Berlin, Germany: Springer.

Atkinson, P. M. (1991). ‘‘Optimal Ground-Based Sampling for Remote Sensing

Investigations. Estimating the Regional Mean.’’ International Journal of Remote Sensing

12, 559–67.

Atkinson, P. M., P. J. Curran, and R. Webster. (1990). ‘‘Sampling Remotely Sensed Imagery

for Storage, Retrieval, and Reconstruction.’’ Professional Geographer 42, 345–53.

Atkinson, P. M., and P. Lewis. (2000). ‘‘Geostatistical Classification for Remote Sensing: An

Introduction.’’ Computers & Geosciences 26, 361–71.

Banerjee, S., B. P. Carlin, and A. E. Gelfand. (2004). Hierarchical Modeling and Analysis for

Spatial Data. Boca Raton, FL: Chapman & Hall/CRC.

Bellehumeur, C., and P. Legendre. (1997). ‘‘Aggregation of Sampling Units: An Analytical

Solution to Predict Variance.’’ Geographical Analysis 29, 258–66.

Berry, B. J. L., and D. Marble. (1968). Spatial Analysis: A Reader. Englewood Cliffs, NJ:

Prentice Hall.

Besag, J. E., J. York, and A. Mollie. (1991). ‘‘Bayesian Image Restoration with Two

Applications in Spatial Statistics.’’ Annals of the Institute of Statistical Mathematics 43,

1–21.

Bian, L., and Z. Xie. (2004). ‘‘A Spatial Dependence Approach to Retrieving Industrial

Complexes from Digital Images.’’ Professional Geographer 56, 381–93.

Bierkens, M. F. P. (2006). ‘‘Designing a Monitoring Network for Detecting Groundwater

Pollution with Stochastic Simulation and a Cost Model.’’ Stochastic Environmental

Research and Risk Assessment 20, 335–51.

Bilodeau, M., F. Meyer, and M. Schmitt, eds. (2005). Space, Structure and Randomness:

Contributions in Honor of Georges Matheron in the Fields of Geostatistics, Random Sets

and Mathematical Morphology. New York: Springer.

Bishop, T. F. A., B. Minasny, and A. B. McBratney. (2006). ‘‘Uncertainty Analysis for Soil-

Terrain Models.’’ International Journal of Geographical Information Science 20, 117–34.

Brus, D. J., J. J. De Gruijter, D. J. J. Walvoort, F. De Vries, J. J. B. Bronswijk, P. F. A. M.

Romkens, and W. De Vries. (2002). ‘‘Heavy Metals in the Environment: Mapping the

Probability of Exceeding Critical Thresholds for Cadmium Concentrations in Soils in the

Netherlands.’’ Journal of Environmental Quality 31, 1875–84.

Chiles, J.-P., and P. Delfiner. (1999). Geostatistics. Modeling Spatial Uncertainty. New York:

Wiley.

Christensen, R. (2001). Linear Models for Multivariate, Time Series, and Spatial Data, 2nd ed.

New York: Springer.

Cliff, A. D., and J. K. Ord. (1969). ‘‘The Problem of Spatial Autocorrelation.’’ In London

Papers in Regional Science, Volume 1: 25–55, edited by A. J. Scott. London: Pion.

Cliff, A. D., and J. K. Ord. (1973). Spatial Autocorrelation. London: Pion.

Cornford, D., L. Csato, and M. Opper. (2005). ‘‘Sequential, Bayesian Geostatistics:

A Principled Method for Large Data Sets.’’ Geographical Analysis 37, 183–99.

Cressie, N. A. C. (1993). Statistics for Spatial Data, 2nd ed. New York: Wiley.

Dacey, M. (1968). ‘‘A Review of Measures of Contiguity for Two and K-Color Maps.’’ In

Spatial Analysis: a Reader in Statistical Geography, 479–65, edited by B. J. L. Berry and

D. Marble. Englewood Cliffs, NJ: Prentice Hall.

De Cola, L. (2002). ‘‘Spatial Forecasting of Disease Risk and Uncertainty.’’ Cartography and

Geographic Information Science 29, 363–80.

Deutsch, C. V., and A. G. Journel. (1998). GSLIB: Geostatistical Software Library, 2nd ed.

New York: Oxford University Press.

Diggle, P. J., and P. J. Ribeiro Jr. (2002). ‘‘Bayesian Inference in Gaussian Model-based

Geostatistics.’’ Geographical and Environmental Modelling 6, 129–46.

Diggle, P. J., and P. J. Ribeiro Jr. (2007). Model-Based Geostatistics. New York: Springer.

Finley, P., J. L. Ramsey, B. Melton, and S. A. McKenna. (2007). ‘‘Using GIS Technology to

Manage Information Following a Bioterrorism Attack.’’ Journal of Map and Geography

Libraries 4, 207–20.

Fotheringham, A. S., C. Brunsdon, and M. Charlton. (2000). Quantitative Geography:

Perspectives on Spatial Data Analysis. London: Sage.

Frogbrook, Z. L., and M. A. Oliver. (2007). ‘‘Identifying Management Zones in Agricultural

Fields Using Spatially Constrained Classification of Soil and Ancillary Data.’’ Soil Use

and Management 23, 40–51.

Gaus, I., D. G. Kinniburgh, J. C. Talbot, and R. Webster. (2003). ‘‘Geostatistical Analysis of

Arsenic Concentration in Groundwater in Bangladesh Using Disjunctive Kriging.’’

Environmental Geology 44, 939–48.

Geary, R. C. (1954). ‘‘The Contiguity Ratio and Statistical Mapping.’’ The Incorporated

Statistician 5, 115–45.

Gomersall, C. E., and K. M. Hinkel. (2001). ‘‘Estimating the Variability of Active-Layer Thaw

Depth in Two Physiographic Regions of Northern Alaska.’’ Geographical Analysis 33,

141–55.

Goodchild, M., and R. P. Haining. (2004). ‘‘GIS and Spatial Data Analysis: Converging

Perspectives.’’ Papers in Regional Science 83, 363–85.

Goovaerts, P. (1997). Geostatistics for Natural Resources Evaluation. New York: Oxford

University Press.

Goovaerts, P. (2001). ‘‘Geostatistical Modelling of Uncertainty in Soil Science.’’ Geoderma

103, 3–26.

Goovaerts, P. (2002a). ‘‘Geostatistical Incorporation of Spatial Coordinates into

Supervised Classification of Hyperspectral Data.’’ Journal of Geographical Systems 4,

99–111.

Goovaerts, P. (2002b). ‘‘Geostatistical Modelling of Spatial Uncertainty Using p-Field

Simulation with Conditional Probability Fields.’’ International Journal of Geographical

Information Science 16, 167–78.

Goovaerts, P. (2005). ‘‘Geostatistical Analysis of Disease Data: Estimation of Cancer

Mortality Risk from Empirical Frequencies Using Poisson Kriging.’’ International Journal

of Health Geographics 4, 31.

Goovaerts, P. (2006a). ‘‘Geostatistical Analysis of Disease Data: Visualization and

Propagation of Spatial Uncertainty in Cancer Mortality Risk Using Poisson Kriging and

p-Field Simulation.’’ International Journal of Health Geographics 5, 7.

Goovaerts, P. (2006b). ‘‘Geostatistical Analysis of Disease Data: Accounting for Spatial

Support and Population Density in the Isopleth Mapping of Cancer Mortality Risk Using

Area to Point Poisson Kriging.’’ International Journal of Health Geographics 5, 52.

Goovaerts, P. (2008). ‘‘Kriging and Semivariogram Deconvolution in the Presence of

Irregular Geographical Units.’’ Mathematical Geosciences 40, 101–28.

Goovaerts, P., G. M. Jacquez, and D. Greiling. (2005). ‘‘Exploring Scale-Dependent

Correlations Between Cancer Mortality Rates Using Factorial Kriging and Population-

Weighted Semivariograms.’’ Geographical Analysis 37, 152–82.

Goovaerts, P., and R. Webster. (1994). ‘‘Scale-dependent Correlation Between Topsoil Copper

and Cobalt Concentrations in Scotland.’’ European Journal of Soil Science 45, 79–95.

Gotway, C. A., and L. J. Young. (2002). ‘‘Combining Incompatible Spatial Data.’’ Journal of

the American Statistical Association 97(459), 632–48.

Gotway, C. A., and L. J. Young. (2005). ‘‘Change of Support: An Inter-disciplinary

Challenge.’’ In geoENV V—Geostatistics for Environmental Applications, 1–13, edited

by P. Renard, H. Demougeot-Renard, and R. Froidevaux. Berlin, Germany: Springer.

Griffith, D. A. (2002). ‘‘Modeling Spatial Dependence in High Spatial Resolution

Hyperspectral Data Sets.’’ Journal of Geographical Systems 4, 43–51.

Haas, T. C. (1990). ‘‘Kriging and Automated Variogram Modeling Within a Moving

Window.’’ Atmospheric Environment, Part A 24, 1759–69.

Haining, R. P. (1993). Spatial Data Analysis in the Social and Environmental Sciences.

Cambridge, UK: Cambridge University Press.

Haining, R. P. (2003). Spatial Data Analysis: Theory and Practice. Cambridge, UK:

Cambridge University Press.

Hernandez Roberto, R. (2001). ‘‘Comparing Multivariate Kriging Methods for Spatial

Estimation of Air Temperature.’’ Estudios Geograficos 243, 285–308.

Hessl, A., J. Miller, J. Kernan, D. Keenum, and D. McKenzie. (2007). ‘‘Mapping Paleo-Fire

Boundaries from Binary Point Data: Comparing Interpolation Methods.’’ Professional

Geographer 59, 87–104.

Heuvelink, G. B. M., P. Musters, and E. J. Pebesma. (1996). ‘‘Spatio-Temporal Kriging of Soil

Water Content.’’ In Geostatistics Wollongong 96—Proceedings of the Fifth International

Geostatistics Congress, Wollongong, Australia, 1020–30, edited by E. Y. Baafi.

Dordrecht, The Netherlands: Kluwer.

Hock, R., and H. Jensen. (1999). ‘‘Application of Kriging Interpolation for Glacier Mass

Balance Computations.’’ Geografiska Annaler, Series A: Physical Geography 81,

611–19.

Isaaks, E. H., and R. M. Srivastava. (1989). Applied Geostatistics. Oxford, UK: Oxford

University Press.

Janis, M. J., and S. M. Robeson. (2004). ‘‘Determining the Spatial Representativeness of Air-

Temperature Records Using Variogram-Nugget Time Series.’’ Physical Geography 25,

513–30.

Journel, A. G. (1982). ‘‘Indicator Approach to Estimation of Spatial Distributions.’’ In 17th

Application of Computers and Operations Research in the Mineral Industry, Golden,

CO, Soc of Min Eng of AIME. New York.

Journel, A. G., and C. J. Huijbregts. (1978). Mining Geostatistics. London: Academic Press.

Jowett, G. H. (1955). ‘‘Sampling Properties of Local Statistics in Stationary Stochastic Series.’’

Biometrika 42, 160–69.

Kent, M., R. A. Moyeed, C. L. Reid, R. Pakeman, and R. Weaver. (2006). ‘‘Geostatistics,

Spatial Rate of Change Analysis and Boundary Detection in Plant Ecology and

Biogeography.’’ Progress in Physical Geography 30, 201–31.

Kerry, R., and M. A. Oliver. (2007). ‘‘The Analysis of Ranked Observations of Soil Structure

Using Indicator Geostatistics.’’ Geoderma 140, 397–416.

Kolmogorov, A. N. (1941). ‘‘The Local Structure of Turbulence in an Incompressible Fluid at

Very Large Reynolds Numbers.’’ Doklady Akademii Nauk SSSR 30, 301–05.

Krige, D. G. (1951). ‘‘A Statistical Approach to Some Basic Mine Valuation Problems on the

Witwatersrand.’’ Journal of the Chemical, Metallurgical & Mining Society of South

Africa 52, 119–39.

Krishna Iyer, P. V. (1949). ‘‘The First and Second Moments of Some Probability Distributions

Arising from Points on a Lattice and Their Applications.’’ Biometrika 36, 135–41.

Krivoruchko, K., C. A. Gotway, and A. Zhigimont. (2003). ‘‘Statistical Tools for Regional

Data Analysis Using GIS.’’ In GIS:2003 Proceedings of the 11th ACM International

Symposium on Advances in Geographic Information Systems, New Orleans, Louisiana,

41–48, edited by E. Hoel and P. Rigaux. New York: ACM Press.

Kyriakidis, P. C. (2004). ‘‘A Geostatistical Framework for Area-to-Point Spatial

Interpolation.’’ Geographical Analysis 36, 259–89.

Kyriakidis, P. C., and A. G. Journel. (1999). ‘‘Geostatistical Space-Time Models.’’

Mathematical Geology 31, 651–84.

Kyriakidis, P. C., and A. G. Journel. (2001a). ‘‘Stochastic Modeling of Atmospheric Pollution:

A Spatial Time Series Framework. Part I: Methodology.’’ Atmospheric Environment 35,

2331–37.

Kyriakidis, P. C., and A. G. Journel. (2001b). ‘‘Stochastic Modeling of Atmospheric Pollution:

A Spatial Time Series Framework. Part II: Application to Monitoring Monthly Sulphate

Deposition over Europe.’’ Atmospheric Environment 35, 2339–48.

Lark, R. M. (2000). ‘‘Regression Analysis with Spatially Autocorrelated Error: Simulation

Studies and Application to Mapping of Soil Organic Matter.’’ International Journal of

Geographical Information Science 14, 247–64.

Lark, R. M., B. R. Cullis, and S. J. Welham. (2006). ‘‘On Optimal Prediction of Soil Properties

in the Presence of Spatial Trend: The Empirical Best Linear Unbiased Predictor (E-BLUP)

with REML.’’ European Journal of Soil Science 57, 787–99.

Lark, R. M., and R. B. Ferguson. (2004). ‘‘Mapping Risk of Soil Nutrient Deficiency or Excess

by Disjunctive and Indicator Kriging.’’ Geoderma 118, 39–53.

Lay, J. G., and H. S. Wang. (1996). ‘‘A Comparative Study on the Interpolation of Digital

Contours.’’ Journal of Geographical Science 21, 83–94.

Li, X., D. Song, and D. Xiao. (2005). ‘‘The Variability of Groundwater Mineralization in

Minqin Oasis.’’ Acta Geographica Sinica 60, 319–27.

Lin, Y.-P., M.-S. Yeh, D.-P. Deng, and Y.-C. Wang. (2008). ‘‘Geostatistical Approaches and

Optimal Additional Sampling Schemes for Spatial Patterns and Future Sampling of Bird

Diversity.’’ Global Ecology and Biogeography 17, 175–88.

Lloyd, C. D., and P. M. Atkinson. (2001). ‘‘Assessing Uncertainty in Estimates with Ordinary

and Indicator Kriging.’’ Computers & Geosciences 27, 929–37.

Ma, X., R. Ma, and J. Xu. (2004). ‘‘Spatial Structure of Cities and Towns with ESDA-GIS

Framework.’’ Acta Geographica Sinica 59, 1048–57.

Mason, D. C., M. O’Conaill, and I. McKendrick. (1994). ‘‘Variable Resolution Block Kriging

Using a Hierarchical Spatial Data Structure.’’ International Journal of Geographical

Information Systems 8, 429–49.

Matern, B. (1960). Spatial Variation. Meddelanden fran Statens Skogsforskningsinstitut, 49,

No. 5. Lecture Notes in Statistics, No. 36, 2nd ed, 1986. New York: Springer.

Matheron, G. (1963). ‘‘Principles of Geostatistics.’’ Economic Geology 58, 1246–66.

Matheron, G. (1965). Les variables regionalisees et leur estimation: Une application de la

theorie des fonctions aleatoires aux sciences de la nature. Paris: Masson et Cie.

Matheron, G. (1969). Theorie des ensembles aleatoires. Les Cahiers du Centre de

Morphologie Mathematique, Fascicule 4. Fontainebleau: Paris School of Mines.

Matheron, G. (1973). ‘‘The Intrinsic Random Functions and Their Applications.’’ Advances in

Applied Probability 5, 439–68.

Matheron, G. (1982). Pour une analyse krigeante de donnees regionalisees. Note N-732 du

Centre de Geostatistique. Fontainebleau: Ecole des Mines de Paris.

McBratney, A. B., G. A. Hart, and D. McGarry. (1991). ‘‘The Use of Region Partitioning to

Improve the Representation of Geostatistically Mapped Soil Attributes.’’ Journal of Soil

Science 42, 513–32.

McBratney, A. B., I. O. A. Odeh, T. F. A. Bishop, M. S. Dunbar, and T. M. Shatar. (2000).

‘‘An Overview of Pedometric Techniques for Use in Soil Survey.’’ Geoderma 97,

293–327.

McBratney, A. B., R. Webster, and T. M. Burgess. (1981). ‘‘The Design of Optimal Sampling

Schemes for Local Estimation and Mapping of Regionalized Variables—II.’’ Computers

& Geosciences 7, 331–34.

Mercer, W. B., and A. D. Hall. (1911). ‘‘The Experimental Error of Field Trials.’’ Journal of

Agricultural Science 4, 107–32.

Monestiez, P., L. Dubroca, E. Bonnin, J. P. Durbec, and C. Guinet. (2005). ‘‘Comparison of

Model Based Geostatistical Methods in Ecology: Application to Fin Whale Spatial

Distribution in Northwestern Mediterranean Sea.’’ In Geostatistics Banff 2004, Volume

2: 777–86, edited by O. Leuangthong and C. V. Deutsch. Dordrecht, The Netherlands:

Kluwer.

Monestiez, P., L. Dubroca, E. Bonnin, J. P. Durbec, and C. Guinet. (2006). ‘‘Geostatistical

Modelling of Spatial Distribution of Balaenoptera Physalus in the Northwestern

Mediterranean Sea from Sparse Count Data and Heterogeneous Observation Efforts.’’

Ecological Modelling 193, 615–28.

Moran, P. A. P. (1950). ‘‘Notes on Continuous Stochastic Phenomena.’’ Biometrika 37,

17–23.

Odeh, I. O. A., and A. B. McBratney. (2000). ‘‘Using AVHRR Images for Spatial Prediction of

Clay Content in the Lower Namoi Valley of Eastern Australia.’’ Geoderma 97, 237–54.

Odeh, I. O. A., A. B. McBratney, and D. J. Chittleborough. (1994). ‘‘Spatial Prediction of Soil

Properties from Landform Attributes Derived from a Digital Elevation Model.’’

Geoderma 63, 197–214.

Oliver, M. A., J. A. Shine, and K. A. Slocum. (2005). ‘‘Using the Variogram to Explore

Imagery of Two Different Spatial Resolutions.’’ International Journal of Remote Sensing

26(15), 3225–40.

Oliver, M. A., and R. Webster. (1989). ‘‘A Geostatistical Basis for Spatial Weighting in

Multivariate Classification.’’ Mathematical Geology 21, 15–25.

Oliver, M. A., R. Webster, C. Lajaunie, K. R. Muir, S. E. Parkes, A. H. Cameron, M. C. G.

Stevens, and J. R. Mann. (1998). ‘‘Binomial Cokriging for Estimating and Mapping the

Risk of Childhood Cancer.’’ Mathematical Medicine and Biology 15(3), 279–97.

Oliver, M. A., R. Webster, and S. P. McGrath. (1996). ‘‘Disjunctive Kriging for Environmental

Management.’’ Environmetrics 7, 333–58.

Oliver, M. A., R. Webster, and K. Slocum. (2000). ‘‘Filtering SPOT Imagery by Kriging

Analysis.’’ International Journal of Remote Sensing 21, 735–52.

Omre, H. (1987). ‘‘Bayesian Kriging—Merging Observations and Qualified Guesses in

Kriging.’’ Mathematical Geology 19, 25–39.

Pardo-Iguzquiza, E. (1998a). ‘‘Maximum Likelihood Estimation of Spatial Covariance

Parameters.’’ Mathematical Geology 30, 95–107.

Pardo-Iguzquiza, E. (1998b). ‘‘MLREML4: A Program for the Inference of the Power

Variogram Model by Maximum Likelihood and Restricted Maximum Likelihood.’’

Computers & Geosciences 24, 537–43.

Pebesma, E. J., and C. G. Wesseling. (1998). ‘‘Gstat, a Program for Geostatistical Modelling,

Prediction and Simulation.’’ Computers & Geosciences 24, 17–31.

Rawlins, B. G., B. P. Marchant, D. Smyth, C. Scheib, R. M. Lark, and C. Jordan. (2009).

‘‘Airborne Radiometrics Survey Data and a DTM as Covariates for Regional Scale

Mapping of Soil Organic Carbon Across Northern Ireland.’’ European Journal of Soil

Science 60, 44–54.

Remy, N., A. Boucher, and J. Wu. (2009). Applied Geostatistics with SGeMS: A User’s

Guide. Cambridge, UK: Cambridge University Press.

Ripley, B. D. (1981). Spatial Statistics. New York: Wiley.

Rodgers, S. E., and M. A. Oliver. (2007). ‘‘A Geostatistical Analysis of Soil, Vegetation,

and Image Data Characterizing Land Surface Variation.’’ Geographical Analysis 39,

195–216.

Stein, M. L. (1999). Interpolation of Spatial Data: Some Theory for Kriging. New York:

Springer.

Strand, G.-H. (1998). ‘‘Kriging the Potential Tree Level in Norway.’’ Norsk Geografisk

Tidsskrift 52, 17–25.

Student. (1911). ‘‘Appendix to The Experimental Error of Field Trials. Mercer, W. B., and

Hall, A. D.’’ Journal of Agricultural Science 4, 107–32.

Su, L., Y. Song, and Z. Zhang. (2003). ‘‘Study on the Spatio-Temporal Variation of

Groundwater Salt Content in Xinjiang Weigan Catchment.’’ Acta Geographica Sinica

58, 854–60.

Tatalovich, Z., J. P. Wilson, and M. Cockburn. (2006). ‘‘A Comparison of Thiessen Polygon,

Kriging and Spline Models of Potential UV Exposure.’’ Cartography and Geographic

Information Science 33, 217–31.

Tobler, W. (1970). ‘‘A Computer Movie Simulating Urban Growth in the Detroit Region.’’

Economic Geography 46, 234–40.

Wackernagel, H. (1995). Multivariate Geostatistics: An Introduction with Applications. Berlin,

Germany: Springer.

Walter, C., A. B. McBratney, A. Douaoui, and B. Minasny. (2001). ‘‘Spatial Prediction of

Topsoil Salinity in the Chelif Valley, Algeria, Using Local Ordinary Kriging with Local

Variograms Versus Whole-Area Variogram.’’ Australian Journal of Soil Research 39,

259–72.

Wang, Y.-C. (2007). ‘‘Spatial Patterns and Vegetation-Site Relationships of the Presettlement

Forests in Western New York, USA.’’ Journal of Biogeography 34, 500–13.

Warr, B., M. A. Oliver, and K. White. (2002). ‘‘The Application of Factorial Kriging and

Fourier Analysis for Remotely Sensed Data Simplification and Feature Accentuation.’’

Geographical and Environmental Modelling 6, 171–87.

Webster, R., and M. A. Oliver. (1992). ‘‘Sample Adequately to Estimate Variograms of Soil

Properties.’’ Journal of Soil Science 43, 177–92.

Webster, R., and M. A. Oliver. (2007). Geostatistics for Environmental Scientists, 2nd ed.

Chichester, UK: Wiley.

Wright, S. M., D. C. Howard, J. Barry, and J. T. Smith. (2002). ‘‘Spatial Variation of

Radiocaesium Deposition in Cumbria.’’ Geographical and Environmental Modelling 6,

203–16.

Zhou, F., H. C. Guo, Y. S. Ho, and C. Z. Wu. (2007). ‘‘Scientometric Analysis of Geostatistics

Using Multivariate Methods.’’ Scientometrics 73, 265–79.
