Statistical Inference and Visualization in Scale-Space
for Spatially Dependent Images
Amy Vaughan
College of Business and Public Administration, Drake University, Des Moines, IA 50311, USA
Mikyoung Jun
Department of Statistics, Texas A&M University, College Station, TX 77843, USA
Cheolwoo Park
Department of Statistics, University of Georgia, Athens, GA 30602, USA
July 15, 2011
Abstract
SiZer (SIgnificant ZERo crossing of the derivatives) is a graphical scale-space visualization
tool that allows for statistical inferences. In this paper we develop a spatial SiZer for finding
significant features and conducting goodness-of-fit tests for spatially dependent images. The
spatial SiZer utilizes a family of kernel estimates of the image and provides not only exploratory
data analysis but also statistical inference with spatial correlation taken into account. It is also
capable of comparing the observed image with a specific null model being tested by adjusting the
statistical inference using an assumed covariance structure. Pixel locations having statistically
significant differences between the image and a given null model are highlighted by arrows. The
spatial SiZer is compared with the existing independent SiZer via the analysis of simulated data
with and without signal on both planar and spherical domains. We apply the spatial SiZer
method to the decadal temperature change over some regions of the Earth.
Key words: Goodness-of-fit test, Image data analysis, Kernel smoothing, Scale-space, Spatial
correlation, Statistical significance.
1 Introduction
SiZer (SIgnificant ZERo crossing of the derivatives) was developed by Chaudhuri and Marron (1999)
as an exploratory data analysis tool combined with statistical inferences. It provides a way to look
at data so that analysts are able to uncover underlying structure in the data, test the data against
underlying assumptions or potential models, and detect any possible anomalies. SiZer is based
on a scale-space idea from computer vision, see Lindeberg (1994), where it refers to a family of
smooths of a digital image. No particular level of smoothing is regarded as correct and each smooth
is thought to provide information about the underlying image structure at a particular scale. In
SiZer, scale-space is a family of kernel smooths indexed by the bandwidth. SiZer can therefore be
viewed as an extension of a basic statistical graphic, such as a plot or chart, that examines the
data at many scales simultaneously.
SiZer is particularly appealing to statisticians for two reasons. First, it considers a wide range
of bandwidths to avoid the classical problem of bandwidth selection, which has been a hurdle to the
application of smoothers. This allows us to do statistical inference and detect all the information
that is available at each individual level of resolution. Second, the target of the SiZer analysis
is shifted from the true underlying curve to smoothed versions of the underlying curve, with the
idea that truth exists only at each scale. This allows us to avoid a bias problem which occurs in
estimating a true underlying function.
Several other versions of SiZer have been developed in recent years. Park et al. (2004) proposed
a dependent SiZer that compares the observed time series with an assumed model. Rondonotti
et al. (2007) created SiZer for time series, which was later improved by Park et al. (2009a), so that
it is able to estimate an autocovariance function in order to detect significant features in a time
series. Hannig and Lee (2006) developed a robust version of SiZer which can be used for identifying
outliers. Park and Kang (2008) proposed a version of SiZer which can compare multiple curves with
independent data based on their differences of smooths and Park et al. (2009b) introduced a SiZer
that puts forth a method for comparing two (or more) time series. Park et al. (2010) studied a
SiZer which targets the quantile composition of the data instead of the mean structure. In addition,
various Bayesian versions of SiZer have also been proposed. These include Erasto and Holmstrom
(2005), Godtliebsen and Oigard (2005), Oigard et al. (2006) and Erasto and Holmstrom (2007).
Note that all of these tools are restricted to one dimensional data, and thus they are not readily
applicable to image data in a higher dimension.
In two dimensions, Ganguli and Wand (2007) studied additive modeling for generalized linear
models, and Godtliebsen et al. (2002) and Duong et al. (2008) studied multivariate kernel estimation
in scale-space. Godtliebsen et al. (2004) proposed what they refer to as S3, Significance
in Scale-Space, in a regression setting that analyzes image data. Since they assume the errors
are independent, we call S3 the independent SiZer in this paper. They state two main challenges
of analyzing two dimensional data in scale-space: statistical inference and visualization. In one
dimension, statistical inference is based on the derivatives of a curve that have statistically significant
increases and decreases, and the scale-space is viewed as an overlay of curves. Instead, for
two dimensions Godtliebsen et al. (2002) and Godtliebsen et al. (2004) use partial derivatives, or
gradients for statistical inference and introduce a movie version that shows the progression through
various bandwidths. This tool is useful for finding meaningful structure in a given image, but it
has its limitations due to the independence assumption.
Image data, or spatial data, are commonly observed with spatial dependence structures and
we need to take the spatial dependence into account when we do inference on the mean structure.
Figure 1 shows the decadal temperature change for 1980–1999 over the area around America and
the Himalayan region (see Section 4.1 for more details on this data). The interest here is to detect
subregions with statistically significant temperature changes. Like most environmental
variables, temperature usually exhibits strong spatial dependence, and sometimes its
dependence structure is quite complex (for example, nonstationary). For this data set, since we consider
the decadal temperature change, the covariance structure may not be as complicated as that of the
temperature variable itself, but there should still be strong spatial dependence in the data. In fact,
using the coded scale along the right vertical axis of each map, Figure 1 shows that neighboring
pixels are quite correlated. Shen et al. (2002) analyzed a similar data set to detect statistically
significant changes in the decadal temperature averages over some parts of America and they used
a method called enhanced false discovery rate. Their method does consider the spatial dependence
structure of the field.
In this paper, we propose a two dimensional SiZer that improves the independent SiZer by
taking the spatial correlation structure into account in image analysis. The proposed method takes
the noisy image data and finds significant features compared with a given spatial model. It is
also useful to assess models being considered as possible candidates to explain the image well.
Although there exists some work for goodness-of-fit tests for spatial point processes (Diggle, 1979)
and spectral densities (Crujeiras et al., 2010), there has not been much work on goodness-of-fit
test tools for spatial models. Therefore, the spatial SiZer can provide informative analysis with
visualization for a vast range of statistical problems.
Section 2 reviews spatial correlation models and provides details on the proposed spatial version
of SiZer. Section 3 tests the spatial SiZer and compares it with the independent SiZer using
simulated examples. This study is conducted in both planar and spherical domains, and the simulated
fields are generated both with and without a signal. In Section 4 we illustrate the usefulness of the proposed
method in the problem of detecting the decadal temperature change over some subregions of the
Earth. Finally, we discuss extensions of the proposed method in Section 5.
2 Methodology
We first review some parametric covariance models for spatial random fields in Section 2.1 and then
introduce the proposed SiZer method in Section 2.2.
2.1 Models for spatial dependence
Consider data sets that are observed in a spatial domain, $D \subset \mathbb{R}^d$. Let us denote a random field by
Z(s) indexed by spatial location s ∈ D. The process Z(s) may exhibit spatial dependence, similar
to time series where the correlation between different time points must be taken into account. In
observing {Z(s1), ..., Z(sn)} at spatial locations {s1, . . . , sn}, one of the key issues is how to model
the covariance structure of the process Z across space.
In modeling the covariance structure of a spatial process, we often make some simplifying
assumptions. Consider a random field {Z(s) : s ∈ D}. Suppose that the mean E{Z(s)} and
the covariance C(s1, s2) = Cov{Z(s1), Z(s2)} exist and are finite for all s, s1, s2 ∈ D. In general,
without any simplifying assumptions, the covariance structure is nonstationary in the sense that
Cov{Z(s1), Z(s2)} should depend on both s1 and s2. We say the process is (weakly) stationary if
the covariance only depends on the difference between the two locations, that is, there exists an
appropriate function C1 such that C1(s1 − s2) = Cov{Z(s1), Z(s2)} for all s1, s2. Furthermore, we
say the process is (weakly) isotropic, if for a suitable function C2, C2(||s1−s2||) = Cov{Z(s1), Z(s2)},
that is, the covariance only depends on the distance between the two locations. Here || · || denotes
the Euclidean distance, but sometimes we need to use a different metric for some spatial domains
(see the last paragraph of this section). Examples of commonly used isotropic covariance functions
are the Gaussian function, $C_2(x) = a \exp(-x^2/b^2)$, the exponential function, $C_2(x) = a \exp(-x/b)$,
and the Matern function given by
\[ C_2(x) = a (x/b)^{\nu} K_{\nu}(x/b). \tag{2.1} \]
The function C2 is defined on the domain [0,∞) and it needs to be positive definite on the domain to
be a suitable covariance function for any random process. The parameter a > 0 is called the sill and
it determines the overall level of covariance. For the Gaussian and exponential functions, a gives the
variance of the process at any location and for the Matern function, $a \cdot 2^{\nu-1}\Gamma(\nu)$ gives the variance
of the process under the parameterization in (2.1). The parameter b > 0 is called the spatial range
and it determines how far the spatial correlation lasts. For a Matern covariance function, we have
an additional parameter, smoothness, ν > 0, which determines the smoothness of the process. The
function $K_{\nu}$ is the modified Bessel function of the second kind of order ν. The smoothness parameter, ν, gives greater flexibility
to the covariance structure compared to Gaussian or exponential functions. In fact, Gaussian and
exponential models are special cases of the Matern model: in the limit ν → ∞ (with a and b suitably
rescaled) the Matern model converges to the Gaussian model, and when ν = 0.5 it reduces to the exponential model.
Note that the stationarity or isotropy assumption requires that the mean should be constant over
the entire domain. That is, we require E{Z(s)} = µ for all s ∈ D. However, for the SiZer problem,
our goal is to detect the mean structure and thus we cannot assume any structure on the mean
of the random field. This difficulty poses a fundamental identifiability problem of the mean and
(co)variance and we will discuss this issue further in Section 4.2.
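The covariance functions above can be sketched in a few lines of Python. This is only an illustrative sketch and not the implementation used for the results in this paper. It assumes the parameterization in (2.1) and, to avoid a Bessel function routine, handles only ν = 0.5 and ν = 1.5 (the two values used in Section 3) through the standard closed forms of $K_{\nu}$ for half-integer orders. The function names are hypothetical.

```python
import math


def gaussian_cov(x, a, b):
    """Gaussian covariance C2(x) = a * exp(-x^2 / b^2)."""
    return a * math.exp(-(x * x) / (b * b))


def exponential_cov(x, a, b):
    """Exponential covariance C2(x) = a * exp(-x / b)."""
    return a * math.exp(-x / b)


def matern_cov(x, a, b, nu):
    """Matern covariance a * (x/b)^nu * K_nu(x/b) for nu = 0.5 or 1.5.

    Uses the closed forms
      z^0.5 * K_0.5(z) = sqrt(pi/2) * exp(-z)
      z^1.5 * K_1.5(z) = sqrt(pi/2) * (1 + z) * exp(-z)
    so that the value at x = 0 is well defined.
    """
    z = x / b
    c = a * math.sqrt(math.pi / 2.0)
    if nu == 0.5:
        return c * math.exp(-z)
    if nu == 1.5:
        return c * (1.0 + z) * math.exp(-z)
    raise ValueError("this sketch only supports nu = 0.5 or nu = 1.5")


def matern_variance(a, nu):
    """Variance of the process under (2.1): a * 2^(nu - 1) * Gamma(nu)."""
    return a * 2.0 ** (nu - 1.0) * math.gamma(nu)
```

With ν = 0.5 the function `matern_cov` is an exponential covariance whose sill equals `matern_variance(a, 0.5)`, which illustrates the special case mentioned above. General ν would require a numerical routine for $K_{\nu}$.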
If the process is defined on a planar domain, that is, if D ⊂ R2, then we simply use the
Euclidean distance in evaluating the covariance using (2.1). However, if the process is defined on
a spherical domain, which is the case for many environmental data sets, the distance in (2.1)
needs to be the chordal distance to have the Matern model positive definite on the surface of the
sphere for all smoothness parameter values (see Jun and Stein (2007) for further discussion). The
chordal distance between the two locations on the surface of a sphere, (L1, l1) and (L2, l2) for L1, L2
latitudes and l1, l2 longitudes, is evaluated by
\[ \mathrm{ch}(L_1, L_2, l_1 - l_2) = 2R \left\{ \sin^2\left( \frac{L_1 - L_2}{2} \right) + \cos L_1 \cos L_2 \sin^2\left( \frac{l_1 - l_2}{2} \right) \right\}^{1/2}, \tag{2.2} \]
where R is the radius of the sphere. The distances for the data in Sections 3.2 and 4 are calculated
using (2.2), while the distance in Section 3.1 is the usual Euclidean distance in R2.
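As an illustration, the chordal distance in (2.2) can be computed as follows. This is a minimal sketch that assumes latitudes and longitudes are given in degrees. The radius value of 3959 miles is an assumed approximate mean radius of the Earth and is not taken from this paper.

```python
import math

EARTH_RADIUS_MILES = 3959.0  # assumed approximate mean radius, in miles


def chordal_distance(lat1, lat2, lon1, lon2, radius=EARTH_RADIUS_MILES):
    """Chordal distance (2.2) between (lat1, lon1) and (lat2, lon2) in degrees."""
    L1 = math.radians(lat1)
    L2 = math.radians(lat2)
    dlon = math.radians(lon1 - lon2)
    inner = (math.sin((L1 - L2) / 2.0) ** 2
             + math.cos(L1) * math.cos(L2) * math.sin(dlon / 2.0) ** 2)
    return 2.0 * radius * math.sqrt(inner)
```

The chordal distance is the straight-line distance through the sphere, so it never exceeds the diameter 2R; two antipodal points are exactly 2R apart.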
2.2 Spatial SiZer
We build the spatial SiZer by extending the independent SiZer from the i.i.d. setting to the spatially
correlated case. We illustrate the proposed method under the setting and notations of Godtliebsen
et al. (2004). The statistical model that underlies the spatial SiZer is
\[ Y_{i,j} = s(i, j) + \epsilon_{i,j}, \]
where i = 1, . . . , n and j = 1, . . . , m index pixel locations, s represents the underlying deterministic
signal, and the $\epsilon_{i,j}$'s have mean zero and a spatially dependent covariance structure. We assume the covariance
structure is isotropic, that is, $\mathrm{Cov}(\epsilon_{i,j}, \epsilon_{i',j'}) = C_2(d_{(i,j),(i',j')})$, where $d_{(i,j),(i',j')}$ is the distance
between the two points, (i, j) and (i′, j′), and $C_2$ is an isotropic covariance function. Note that the
independent SiZer (Godtliebsen et al., 2004) assumes the $\epsilon_{i,j}$'s to be independent errors.
The estimates of the signal s in scale-space are simply Gaussian smooths indexed by the bandwidth
(see Jones and Wand (1995) for example); that is, discrete two-dimensional convolutions of
a spherically symmetric Gaussian density with the data. One can use two different bandwidths for
the two directions (i, j), but we use the same bandwidth h for simplicity. The estimates are
denoted as
\[ s_h(i, j) = \sum_{i'=1}^{n} \sum_{j'=1}^{m} Y_{i',j'} K_h(i - i', j - j'), \]
or in matrix notation
\[ s_h = K_h * Y, \]
where $*$ denotes bivariate discrete convolution, $Y = (Y_{i,j})$ for i = 1, . . . , n and j = 1, . . . , m, and
$K_h = (K_h(i, j))$. Higdon (2002) provided some examples of bivariate kernels including Gaussian
and spherical kernels. In this paper we use bivariate Gaussian kernels. In presenting the SiZer
results graphically we set the off-diagonal elements of the 2 × 2 covariance matrix in the bivariate
Gaussian kernel to zero and the diagonal elements equal to each other. The kernel then reduces to the product of two
univariate kernels,
\[ K_h(i, j) = K_h(i) K_h(j), \]
for i = (1 − n), . . . , (n − 1) and j = (1 − m), . . . , (m − 1) with
\[ K_h(i) = \frac{\exp(-(i/h)^2/2)}{\sum_{i'=1-n}^{n-1} \exp(-(i'/h)^2/2)}. \]
Godtliebsen et al. (2004) suggested subtracting the mean of the $Y_{i,j}$ before smoothing to overcome
the boundary effects that result from averaging in zeros from outside the image. Hence,
\[ s_h = A(Y) + K_h * (Y - A(Y)), \]
where A is the matrix operator which returns the constant matrix whose common entries are the
average of the entries of its matrix argument.
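The mean-corrected smoother can be sketched directly from the formulas above. The following is an illustrative sketch rather than the code used for this paper. It assumes that the image is stored as a list of rows, that pixels outside the image contribute zeros, and that the horizontal kernel is normalized over 1 − m, . . . , m − 1 in the same way as the vertical kernel is normalized over 1 − n, . . . , n − 1. The function names are hypothetical.

```python
import math


def gaussian_weights(size, h):
    """Normalized univariate weights K_h(i) for i = 1 - size, ..., size - 1."""
    offsets = range(1 - size, size)
    raw = [math.exp(-((i / h) ** 2) / 2.0) for i in offsets]
    total = sum(raw)
    return {i: w / total for i, w in zip(offsets, raw)}


def smooth(Y, h):
    """Return A(Y) + K_h * (Y - A(Y)) for an n-by-m image Y (list of rows)."""
    n, m = len(Y), len(Y[0])
    mean = sum(sum(row) for row in Y) / (n * m)
    kv = gaussian_weights(n, h)  # vertical direction, index i
    kh = gaussian_weights(m, h)  # horizontal direction, index j
    centered = [[y - mean for y in row] for row in Y]
    # The product kernel is separable: smooth within each row first ...
    rows = [[sum(centered[i][jp] * kh[j - jp] for jp in range(m))
             for j in range(m)] for i in range(n)]
    # ... then across rows, and add the overall mean back.
    return [[mean + sum(rows[ip][j] * kv[i - ip] for ip in range(n))
             for j in range(m)] for i in range(n)]
```

Because the mean is removed before the convolution, a constant image is returned unchanged, including at the boundary, which is the purpose of the correction.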
Another consideration is the number of points that should be inside each kernel window, where
the effective sample size here is given as
\[ \mathrm{ESS} = (K_h * \mathbf{1})/K_h(0, 0). \]
Here $\mathbf{1}$ is the n by m matrix having a one in each entry. Using the Binomial rule, ESS(i, j) < 5
indicates the region where the data are sparse for inference. To make S3 inferences simultaneously
across location, Godtliebsen et al. (2004) used an average effective sample size of
\[ \mathrm{ESS2} = \left( \sum_{i=1}^{n} \sum_{j=1}^{m} \mathrm{ESS}(i, j) \right) \Big/ (nm). \]
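A short sketch of the effective sample size computation is given below. It is an illustration under the same assumptions as the kernel definition above (product Gaussian kernel, zeros outside the image) and uses hypothetical function names. Since the kernel is a product of univariate kernels, the convolution with the matrix of ones factorizes into a product of two univariate sums.

```python
import math


def gaussian_weights(size, h):
    """Normalized univariate weights K_h(i) for i = 1 - size, ..., size - 1."""
    offsets = range(1 - size, size)
    raw = [math.exp(-((i / h) ** 2) / 2.0) for i in offsets]
    total = sum(raw)
    return {i: w / total for i, w in zip(offsets, raw)}


def effective_sample_size(n, m, h):
    """Return the ESS matrix and its average (ESS2) for an n-by-m image."""
    kv = gaussian_weights(n, h)
    kh = gaussian_weights(m, h)
    # (K_h * 1)(i, j) = (sum over i' of K_h(i - i')) * (sum over j' of K_h(j - j'))
    col = [sum(kv[i - ip] for ip in range(n)) for i in range(n)]
    row = [sum(kh[j - jp] for jp in range(m)) for j in range(m)]
    k00 = kv[0] * kh[0]  # K_h(0, 0)
    ess = [[col[i] * row[j] / k00 for j in range(m)] for i in range(n)]
    ess2 = sum(sum(r) for r in ess) / (n * m)
    return ess, ess2
```

For a 30 × 30 image and h = 2, the value at an interior pixel is close to 2πh² ≈ 25.1, while pixels near the corners have a much smaller effective sample size because part of the kernel window falls outside the image.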
To adjust ESS2 for spatially correlated images, we borrow an idea from Rondonotti et al.
(2007); the amount of information about the signal available in correlated data is not the same
as the amount of information in independent data. The patterns in positively correlated errors
behave closely to high frequency signal components, which appear in a family of smooths for a
wide range of bandwidths. On the contrary, a family of smooths changes less as a function of
the bandwidth in negatively correlated errors due to the tendency of alternating up and down
of such errors. Therefore, positively (negatively) correlated data contain less (more, respectively)
information about the signal than independent data. Using statistical information ideas, a simple
measure of information in the data is provided by the ratio
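\[ I = \sqrt{ \frac{\sum_{i=1}^{n} \sum_{j=1}^{m} \sigma_{i,j}^2}{\mathrm{Var}\left( \sum_{i=1}^{n} \sum_{j=1}^{m} Y_{i,j} \right)} }, \]
where $\mathrm{Var}(\epsilon_{i,j}) = \sigma_{i,j}^2$ and
\[ \mathrm{Var}\left( \sum_{i=1}^{n} \sum_{j=1}^{m} Y_{i,j} \right) = \sum_{i=1}^{n} \sum_{j=1}^{m} \sum_{i'=1}^{n} \sum_{j'=1}^{m} C_2(d_{(i,j),(i',j')}). \]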
Using the information I, which reflects the type and the magnitude of the correlation structure,
the modified effective sample size (MESS) is given as
MESS = I × ESS2.
Note that for independent data, I = 1, which induces MESS = ESS2.
MESS is used to calculate the approximate number of independent averages
\[ \ell = \frac{nm}{\mathrm{MESS}}. \tag{2.3} \]
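The information ratio, MESS and ℓ can be sketched as follows. This is an illustration only: it assumes a planar grid with unit pixel spacing and Euclidean distance, takes the isotropic covariance as a user-supplied function of distance, and places the square root over the whole ratio so that I = 1 for independent data. The function names are hypothetical.

```python
import math


def information_ratio(n, m, cov):
    """I = sqrt(sum of variances / Var(sum of Y)) for an isotropic cov(distance)."""
    points = [(i, j) for i in range(n) for j in range(m)]
    sum_var = len(points) * cov(0.0)
    # Var of the sum is the sum of all pairwise covariances.
    var_sum = sum(cov(math.hypot(p[0] - q[0], p[1] - q[1]))
                  for p in points for q in points)
    return math.sqrt(sum_var / var_sum)


def independent_averages(n, m, ess2, info):
    """Number of independent averages in (2.3), with MESS = I * ESS2."""
    mess = info * ess2
    return n * m / mess
```

With a positively correlated covariance such as `lambda d: math.exp(-d / 2.0)`, the ratio is smaller than one, so MESS is smaller than ESS2 and ℓ in (2.3) is larger than in the independent case.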
For statistical inference of SiZer, let us consider that the norm of the gradient of the underlying
signal s at any given (i, j) location is
\[ G(s) = [(s_1)^2 + (s_2)^2]^{1/2}, \]
where $s_1$ (and $s_2$) is the partial derivative in the vertical (and horizontal) direction. Then, the
corresponding estimate of the gradient in scale-space is
\[ G_h(s) = [(s_{h,1})^2 + (s_{h,2})^2]^{1/2}, \]
where the partial derivatives are estimated by
\[ s_{h,1} = K_{h,1} * Y, \quad s_{h,2} = K_{h,2} * Y, \]
where $K_{h,1}(i, j) = K'_h(i) K_h(j)$, $K_{h,2}(i, j) = K_h(i) K'_h(j)$, and $K'_h(i) = (-i/h) K_h(i)$. The gradient
version of the SiZer flags pixels with arrows as significant when $G_h(s)$ is higher than the noise level,
rejecting a null hypothesis of the form
\[ H_0 : G_h(s) = 0. \]
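For the independent SiZer, the null distribution of this test is based on the bivariate Gaussian
distribution
\[ \begin{pmatrix} s_{h,1} \\ s_{h,2} \end{pmatrix} \sim N\left( \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \begin{pmatrix} \tau_1^2 & \tau_{12}^2 \\ \tau_{12}^2 & \tau_2^2 \end{pmatrix} \right), \]
which is exact if the noise terms $\epsilon_{i,j}$ have a Gaussian distribution and otherwise holds approximately by the Central Limit
Theorem (Godtliebsen et al., 2004). By the assumption of independence, $\tau_{12}^2 \approx 0$. The usefulness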
of the independent SiZer, however, is diminished if the image is spatially dependent. Due to the
assumption of independent errors, significant features found by the SiZer may be simply artifacts of
the natural dependence inherent to the spatially dependent data (see Figure 5(c) as an example).
This motivates the need to properly incorporate dependence into the SiZer analysis.
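For the spatial SiZer, $\tau_{12}^2$ is different from zero and the correlation should be accounted for in
estimating $\tau_1^2$ and $\tau_2^2$. Therefore, the variance for the partial derivative in the vertical direction,
indexed by 1, is
\[ \begin{aligned}
\tau_1^2 &= \mathrm{Var}(s_{h,1}(i, j)) \\
&= \mathrm{Cov}\left( \sum_{i'=1}^{n} \sum_{j'=1}^{m} Y_{i',j'} K_{h,1}(i - i', j - j'), \sum_{i''=1}^{n} \sum_{j''=1}^{m} Y_{i'',j''} K_{h,1}(i - i'', j - j'') \right) \\
&= \sum_{i'=1}^{n} \sum_{j'=1}^{m} \sum_{i''=1}^{n} \sum_{j''=1}^{m} K_{h,1}(i - i', j - j') K_{h,1}(i - i'', j - j'') \mathrm{Cov}(Y_{i',j'}, Y_{i'',j''}) \\
&= \sum_{i'=1}^{n} \sum_{j'=1}^{m} \sum_{i''=1}^{n} \sum_{j''=1}^{m} K_{h,1}(i - i', j - j') K_{h,1}(i - i'', j - j'') C_2(d_{(i',j'),(i'',j'')}).
\end{aligned} \]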
The variance in the horizontal direction is given similarly. The covariance between the partial
derivatives in the horizontal and vertical directions is given by
\[ \tau_{12}^2 = \mathrm{Cov}(s_{h,1}, s_{h,2}) = \sum_{i'=1}^{n} \sum_{j'=1}^{m} \sum_{i''=1}^{n} \sum_{j''=1}^{m} K_{h,1}(i - i', j - j') K_{h,2}(i - i'', j - j'') C_2(d_{(i',j'),(i'',j'')}). \]
Here, C2 can be modeled through a parametric model as explained in Section 2.1. Once these
pieces are computed, they are put together to form the covariance matrix
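\[ \Sigma = \begin{pmatrix} \tau_1^2 & \tau_{12}^2 \\ \tau_{12}^2 & \tau_2^2 \end{pmatrix} \]
and it leads to the resulting distribution that can be seen as
\[ \begin{pmatrix} \tau_1^2 & \tau_{12}^2 \\ \tau_{12}^2 & \tau_2^2 \end{pmatrix}^{-1/2} \begin{pmatrix} s_{h,1} \\ s_{h,2} \end{pmatrix} = \begin{pmatrix} t_{h,1} \\ t_{h,2} \end{pmatrix} \sim N\left( \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \right). \]
The sum of the squares of these two test statistics results in
\[ t_{h,1}^2 + t_{h,2}^2 \sim \chi_2^2(\alpha'), \]
and thus the null hypothesis $H_0$ is rejected for those pixels which have values
\[ t_{h,1}^2 + t_{h,2}^2 > q_{\chi_2^2}(\alpha'). \tag{2.4} \]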
Here α′ is the adjusted significance level that achieves the nominal level of α. The nominal level of
α used in all of the numerical examples in this paper is α = 0.05. Using an approximation based on
the number of independent blocks, ℓ in (2.3), we define
\[ q_{\chi_2^2}(\alpha') = -2 \log(1 - (1 - \alpha)^{1/\ell}) \tag{2.5} \]
as the quantile for determining significance. Note that the independent SiZer utilizes a different
quantile because ℓ is defined using ESS2, which does not take the correlation into account.
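The quantile in (2.5) follows from the closed form of the chi-square distribution function with two degrees of freedom, F(x) = 1 − exp(−x/2): requiring that the maximum of ℓ independent statistics stays below q with probability 1 − α gives F(q)^ℓ = 1 − α, which is solved by (2.5). A minimal Python sketch, with hypothetical function names, is:

```python
import math


def chi2_2_cdf(x):
    """Distribution function of the chi-square distribution with 2 degrees of freedom."""
    return 1.0 - math.exp(-x / 2.0)


def sizer_quantile(alpha, ell):
    """Quantile (2.5): -2 * log(1 - (1 - alpha)^(1 / ell))."""
    return -2.0 * math.log(1.0 - (1.0 - alpha) ** (1.0 / ell))
```

For ℓ = 1 and α = 0.05 this reduces to the usual 95% quantile of the chi-square distribution with two degrees of freedom (approximately 5.99), and the quantile grows as ℓ increases, so more independent averages lead to a stricter threshold per pixel.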
For pixels whose null hypothesis is rejected by (2.4), we determine that pixel location to have
a gradient significantly higher than those of its surrounding area. To denote this significance,
we follow the visualization suggested by Godtliebsen et al. (2004), which draws an arrow in that
gradient direction using the corresponding vertical and horizontal direction vector $(s_{h,1}, s_{h,2})$. A
statistically significant extremum is therefore surrounded by a ring of significant gradient arrows
pointing towards the peak or valley (see Figure 3(a) as an example).
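The test at a single pixel can be sketched by evaluating the sums for $\tau_1^2$, $\tau_2^2$ and $\tau_{12}^2$ directly. This brute-force sketch is for illustration on small images only and is not the implementation used in this paper. It assumes unit pixel spacing, Euclidean distance, a user-supplied isotropic covariance function, and a quantile computed separately from (2.5). It uses the identity that $t_{h,1}^2 + t_{h,2}^2$ equals the quadratic form of $(s_{h,1}, s_{h,2})$ in $\Sigma^{-1}$. The function names are hypothetical.

```python
import math


def kernel_1d(size, h):
    """Return K_h and K'_h(i) = (-i/h) K_h(i) on offsets 1 - size, ..., size - 1."""
    offsets = range(1 - size, size)
    raw = [math.exp(-((i / h) ** 2) / 2.0) for i in offsets]
    total = sum(raw)
    k = {i: w / total for i, w in zip(offsets, raw)}
    dk = {i: (-i / h) * k[i] for i in offsets}
    return k, dk


def gradient_test(Y, i, j, h, cov, quantile):
    """Return (statistic, rejected, arrow) for pixel (i, j) of image Y."""
    n, m = len(Y), len(Y[0])
    kv, dkv = kernel_1d(n, h)
    kh, dkh = kernel_1d(m, h)
    pts = [(r, c) for r in range(n) for c in range(m)]
    w1 = [dkv[i - r] * kh[j - c] for r, c in pts]  # K_{h,1}(i - r, j - c)
    w2 = [kv[i - r] * dkh[j - c] for r, c in pts]  # K_{h,2}(i - r, j - c)
    s1 = sum(w * Y[r][c] for w, (r, c) in zip(w1, pts))
    s2 = sum(w * Y[r][c] for w, (r, c) in zip(w2, pts))
    t11 = t22 = t12 = 0.0
    for p, (r, c) in enumerate(pts):
        for q, (u, v) in enumerate(pts):
            cpq = cov(math.hypot(r - u, c - v))
            t11 += w1[p] * w1[q] * cpq  # tau_1^2
            t22 += w2[p] * w2[q] * cpq  # tau_2^2
            t12 += w1[p] * w2[q] * cpq  # tau_12^2
    det = t11 * t22 - t12 * t12
    # Quadratic form (s1, s2) Sigma^{-1} (s1, s2)^T for the 2-by-2 matrix Sigma.
    stat = (t22 * s1 * s1 - 2.0 * t12 * s1 * s2 + t11 * s2 * s2) / det
    return stat, stat > quantile, (s1, s2)
```

The returned pair `(s1, s2)` gives the direction of the arrow drawn at a significant pixel. The cost of the double loop grows with the square of the number of pixels, so a practical implementation would need to exploit the structure of the kernel and of the covariance.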
As noted in (2.4), statistical inference depends on the smoothing level h. A fundamental idea
of SiZer is to use all levels of resolution of the image instead of selecting the best one, and thus
a series of images (or a movie version) indexed by the bandwidth is presented in SiZer analysis.
For small bandwidths the images still contain a substantial noise component, which can lead to
poorly defined outlines and few arrows indicating statistically
significant features. As the bandwidth gets large, the outlines of the images become blurred
and many structures are marked as significant, which is not particularly useful, since features
of interest are typically not visible at this scale. An intermediate bandwidth therefore typically
reveals the important features of the image, and to save space we choose two bandwidths (h = 2, 4) to visualize the SiZer
analysis in the next two sections.
3 Simulation results
We compare the spatial SiZer proposed in this paper with the original independent SiZer in a planar
domain (R2) as well as a spherical domain (an approximation of the surface of the Earth). For
each domain, we generate the error process through Gaussian random fields with the covariance
structure from the Matern function given in (2.1). As mentioned at the end of Section 2.1, for the
spherical domain, we need to use the chordal distance of the two points on the surface of the sphere
given in (2.2) to get a valid covariance model. We repeat the SiZer analysis many times and show
representative images in this section as we obtain similar results.
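One common way to generate a Gaussian random field with a given covariance on a small grid is through the Cholesky factor of the covariance matrix. The sketch below illustrates this approach; the use of the Cholesky factorization, the unit pixel spacing, the restriction to ν = 0.5 and all function names are assumptions made for the illustration and do not describe the simulation code behind the figures in this section.

```python
import math
import random


def matern_half_cov(d, a, b):
    """Matern covariance (2.1) with nu = 0.5, using its closed form."""
    return a * math.sqrt(math.pi / 2.0) * math.exp(-d / b)


def cholesky(A):
    """Lower-triangular L with L L^T = A for a symmetric positive definite A."""
    size = len(A)
    L = [[0.0] * size for _ in range(size)]
    for r in range(size):
        for c in range(r + 1):
            acc = A[r][c] - sum(L[r][k] * L[c][k] for k in range(c))
            L[r][c] = math.sqrt(acc) if r == c else acc / L[c][c]
    return L


def simulate_field(n, m, a, b, seed=0):
    """Simulate a mean-zero Gaussian field on an n-by-m planar grid."""
    rng = random.Random(seed)
    pts = [(i, j) for i in range(n) for j in range(m)]
    cov = [[matern_half_cov(math.hypot(p[0] - q[0], p[1] - q[1]), a, b)
            for q in pts] for p in pts]
    L = cholesky(cov)
    z = [rng.gauss(0.0, 1.0) for _ in pts]
    # e = L z has covariance L L^T, which equals the target covariance matrix.
    e = [sum(L[r][k] * z[k] for k in range(r + 1)) for r in range(len(pts))]
    return [e[i * m:(i + 1) * m] for i in range(n)]
```

Adding a signal matrix to the simulated error field gives data with signal, and using the error field alone gives the signal-free case. For the spherical domain the Euclidean distance would be replaced by the chordal distance in (2.2).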
For the spherical domain case, we also present the proposed spatial SiZer analysis result when
the covariance structure is misspecified. We present the case for both the data with and without
the signal.
3.1 Planar domain
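In the simulation, we generate the signal image from the following equation for i = 1, ..., n and
j = 1, ..., m:
\[ s(i, j) = 10 \left[ \cos\left( \frac{180 \times 10}{\pi} \left( i - \frac{n}{2} \right) \right) \cdot \cos\left( \frac{180 \times 10}{\pi} \left( j - \frac{m}{2} \right) \right) \right]_{+}. \tag{3.1} \]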
Here, n = 30 is the length in the vertical direction, m = 30 is the length in the horizontal direction,
and (·)+ indicates the positive part of the function (the negative pieces are set to zero). For
the covariance structure, we try eight different combinations of parameter values in (2.1), that is,
a = 5, 10, b = 2, 5, and ν = 0.5, 1.5. Figure 2 (a) displays the original signal in (3.1). To save space,
we report only two cases below, (i) a = 5, b = 2, ν = 0.5 and (ii) a = 10, b = 5, ν = 1.5, but other
results are similar.
In Figure 3, it can be seen that the spatial SiZer highlights the most prominent feature of
interest cleanly for both h = 2 and 4 as the arrows of the spatial SiZer point inward toward the
center of the mode. Also, it captures all four small features at the corners. The independent SiZer,
although its arrows are angled for convergence at the center of the mode, marking its peak as the
significant focal point, has such a vast number of spurious pixels that they cover almost the entire
map. Thus, this independent SiZer provides an inaccurate representation of the amount of actual
trend that lies within the image. Figure 4 shows a similar result except that the small features at
the corners are not well captured by the spatial SiZer for h = 4. However, this does demonstrate
the fact that when a large value of a smoothing parameter is used, often only large scale features
are revealed with this macroscopic vision. Similarly, when a small value of a smoothing parameter
is used, more small scale features are identified with this microscopic vision. In summary, Figures
3 and 4 show that the spatial SiZer outperforms the independent SiZer since it correctly identifies
important features and also reduces the number of spuriously highlighted pixels.
In the next two figures, we use the same error structure as above, but remove the signal.
Therefore, a properly working SiZer tool should find no features. In Figures 5 and 6, the spatial
SiZer’s performance is exemplary, with not a single pixel erroneously highlighted. The bright and
dark pixels seen in the image are known to be sampling artifacts and are correctly not highlighted
as significant in the spatial SiZer. This can also be viewed as a goodness-of-fit test; that is, the image
can be modeled by the Matern function with the given parameters. However, in the independent
SiZer there are numerous arrows converging on some of both the bright and dark areas
of the map. These abundant arrows often seem to point at nothing or fail to completely enclose
their convergence point. This shows that the independent SiZer declares
a significant trend at all of these random locations when in fact there is none. These
cases demonstrate that the spatial SiZer can correctly differentiate between the dependent error
structure and true significant trend, while the independent SiZer cannot.
3.2 Spherical domain
We consider the domain on the surface of the Earth and create a 30× 30 regularly spaced domain
over latitude range 11◦ S to 47◦ S and longitude range 148.75◦ E to 221.25◦ E. This particular
subregion is chosen arbitrarily and thus in the figures with results, we do not overlay the world
map. The mean and covariance structures are similar to those in Section 3.1 except that for the
data with signal, we use
\[ s(L, l) = 4 \exp\left[ -\sqrt{(l - 160)^2 + (L - 30)^2} / 30 \right] + 4 \exp\left[ -\sqrt{(l/2 - 100)^2 + L^2} / 10 \right], \tag{3.2} \]
for L latitude and l longitude (in degrees). For the covariance parameters, we use a = 0.03, 0.08,
b = 500, 1000, and ν = 0.5, 1.5. Note here that the unit for the spatial distance is miles, that is,
the radius of the Earth, R, in (2.2) is in miles. Figure 2 (b) displays the original signal in (3.2).
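As an illustration, the signal in (3.2) can be evaluated on the 30 × 30 grid as follows. This sketch assumes that latitudes are entered as positive magnitudes in degrees (so that 11◦ S to 47◦ S corresponds to 11 to 47) and that the grid is equally spaced between the stated end points; both conventions, and the function names, are assumptions made for the example.

```python
import math


def spherical_signal(lat, lon):
    """Signal (3.2) for latitude lat and longitude lon, both in degrees."""
    first = 4.0 * math.exp(
        -math.sqrt((lon - 160.0) ** 2 + (lat - 30.0) ** 2) / 30.0)
    second = 4.0 * math.exp(
        -math.sqrt((lon / 2.0 - 100.0) ** 2 + lat ** 2) / 10.0)
    return first + second


def signal_grid(size=30, lat_range=(11.0, 47.0), lon_range=(148.75, 221.25)):
    """Evaluate the signal on a regularly spaced size-by-size grid."""
    lats = [lat_range[0] + k * (lat_range[1] - lat_range[0]) / (size - 1)
            for k in range(size)]
    lons = [lon_range[0] + k * (lon_range[1] - lon_range[0]) / (size - 1)
            for k in range(size)]
    return [[spherical_signal(lat, lon) for lon in lons] for lat in lats]
```

The first term is centered at latitude 30 and longitude 160, and the second at latitude 0 and longitude 200, which gives the two modes referred to in the discussion of the results.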
To save space we only report two cases below, (i) a = 0.03, b = 1000, ν = 0.5 and (ii) a =
0.08, b = 500, ν = 1.5, but other results are similar. In Figures 7 and 8, when h = 2, the independent
SiZer again has numerous spurious arrows around the true features and in the upper right hand
corner of the image that appear to point at nothing and never converge. For h = 4, the image
in both SiZers becomes oversmoothed and the arrows can barely be separated enough to mark the
two modes. With the independent SiZer in Figure 7 (d) the map even has an arrow at every pixel
of the image, thus providing no way of differentiating signal from noise. This is also nearly the
case in Figure 8 (d), where no useful inference can be drawn because almost every pixel
in the map has been deemed significant. Figures 7 and 8 again show the superior performance of
the spatial SiZer to the independent SiZer because it finds the true features (see Figure 8 (a)) and
greatly reduces the number of spuriously highlighted pixels. This example, however, shows that
the spatial SiZer also has room for improvement because it flags some spurious pixels as significant
(see Figure 7 (a)). As noted in Section 2.2, using a new quantile which takes the spatial correlation
into account is one way of improving the current tool.
We also conduct goodness-of-fit tests for both SiZers by removing the signal and generating
only noise. In Figures 9 and 10 the spatial SiZer has not a single highlighted pixel, thus it does
not show any significant features. This clearly confirms that the data are consistent with the error
generated from the Matern function with the given parameters. In contrast, the
independent SiZer flags various bright and dark pixels as significant all over the image. In Figures
9 (d) and 10 (d), when h = 4, almost all of the image is highlighted again, and the arrows blur
together; they declare a significant trend where there are merely sampling artifacts. This
clearly implies that the image cannot be modeled by an independent Gaussian model.
Lastly, we test the sensitivity of the spatial SiZer method to the covariance structure, since
in many real applications the true covariance structure is unknown. Figure 11 shows the SiZer
analysis when the true covariance structure is given by the Matern function with a = 0.08, b = 500, ν = 1.5
but we assume a Matern covariance structure with a = 0.08, b = 500, ν = 0.5. In other words, the
true covariance structure is much smoother than the assumed covariance structure (note that the
assumed covariance structure is the same as the exponential covariance function). In both cases,
with and without the signal, the result shows that the proposed SiZer correctly identifies the
location of the signal. We also test the SiZer method when the data are generated using a spherical
covariance function,
\[ C_2(x) = 0.03 \left\{ 1 - \frac{3}{2} \left( \frac{x}{500} \right) + \frac{1}{2} \left( \frac{x}{500} \right)^3 \right\} \mathbf{1}(x < 500), \]
with the same signal as above.
Figure 12 presents the result when we assume the Matern covariance function with a = 0.026, b =
155.08, ν = 0.87, instead of the true spherical covariance structure. The parameter values used are
the maximum likelihood estimates under the Matern covariance function. Figure 12(a) shows that
the proposed SiZer using the incorrect covariance structure gives a reasonable result although it
tends to flag more spurious pixels. This may imply that the Matern covariance function is flexible
enough that as long as we use the covariance parameter estimates from the data, the spatial SiZer
produces reasonably accurate analyses.
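For reference, the spherical covariance function used to generate the data in this experiment can be written as a short Python function. This is a minimal sketch; the default arguments simply restate the sill 0.03 and range 500 given above, and the function name is hypothetical.

```python
def spherical_cov(x, sill=0.03, cov_range=500.0):
    """Spherical covariance: sill * (1 - 1.5 r + 0.5 r^3) for r = x / range < 1."""
    if x >= cov_range:
        return 0.0  # the indicator 1(x < range) makes the covariance vanish
    r = x / cov_range
    return sill * (1.0 - 1.5 * r + 0.5 * r ** 3)
```

Unlike the Matern model, this covariance is exactly zero beyond the range, so observations more than 500 miles apart are uncorrelated under the generating model.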
The simulations in Sections 3.1 and 3.2 demonstrate that the spatial SiZer outperforms
the independent SiZer as it correctly identifies significant features, limits the number of spuriously
highlighted pixels, and successfully conducts goodness-of-fit tests for various types of errors from
the Matern covariance function.
4 Application
In this section, we apply the methodology developed in Section 2 to the problem of detecting the
decadal temperature change over some regions of the Earth. One of the regions that we analyze is
similarly studied by Shen et al. (2002) and we compare our result with theirs in Section 4.3.
4.1 Data
To study the climate change specifically induced by anthropogenic emissions, various organizations
throughout the world are developing climate models under the coordination of the Intergovernmental
Panel on Climate Change (IPCC). There are 20+ state-of-the-art climate models
being developed worldwide and Jun et al. (2008) analyzed some of these models, some outputs of
which are used in this paper. Shen et al. (2002) dealt with the output from an older version of the
climate model called CSM, developed by the National Center for Atmospheric Research (NCAR).
In this paper, we use the outputs of two climate models, one, CCSM3 (this is the model’s IPCC
I.D.), developed by NCAR, and the other, GFDL-CM2.0, developed by NOAA/Geophysical Fluid
Dynamics Lab. The CCSM3 model is a newer and more advanced version of CSM, to the extent
that they are almost two completely different models. Similar to Shen et al. (2002), we will also
take a look at the time period of 1980–1999. The climate model outputs are given in monthly
averages. The spatial grid resolution of the CCSM3 is 256 × 128 and that of the GFDL-CM2.0 is
144× 90.
Our interest is in the average temperature change between the periods 1980–1989 and 1990–
1999, in a spatial domain containing some parts of North and South America as well as a domain
around the Himalayan area. The American region gives a 46 × 57 pixel image with
the CCSM3 model and a 33 × 33 pixel image with GFDL-CM2.0. The Himalayan area is
a 41 × 52 pixel image with CCSM3 and a 30 × 30 pixel image with the GFDL-CM2.0 model. We aim to detect
whether the surface temperature has changed over these two decades and we also want to identify
the regions of change using the spatial SiZer, if any such change exists. Also, we will compare the
14
results on these climate changes between the two climate models, CCSM3 and GFDL-CM2.0, as
well as comparing our result from using the CCSM3 model with the CSM results presented in Shen
et al. (2002).
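As an illustration of the quantity analyzed here, the decadal difference at each pixel can be computed from the monthly averages roughly as follows. This is a minimal sketch and not the authors' code: the array layout (240 monthly fields, January 1980 through December 1999), the function name decadal_difference, the sign convention (later decade minus earlier decade), and the equal weighting of months are all assumptions.

```python
import numpy as np

def decadal_difference(monthly):
    """Per-pixel mean of 1990-1999 minus per-pixel mean of 1980-1989.

    monthly: array of shape (240, n_lat, n_lon) holding monthly averages
    for January 1980 through December 1999 (hypothetical layout).
    Months are weighted equally, which is an assumption.
    """
    monthly = np.asarray(monthly, dtype=float)
    if monthly.ndim != 3 or monthly.shape[0] != 240:
        raise ValueError("expected 240 monthly fields of shape (n_lat, n_lon)")
    first_decade = monthly[:120].mean(axis=0)    # 1980-1989
    second_decade = monthly[120:].mean(axis=0)   # 1990-1999
    return second_decade - first_decade
```

The result is a single image with one value per pixel, which is the kind of input the SiZer analysis in this section operates on.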
Figure 1 shows the decadal difference of surface temperature (unit: K) over the two regions
for the two climate models. To gauge the level of surface temperature change, we compare
the shades of adjacent pixels: pixels whose shades are closer in brightness have more similar
values. By this guide, the patterns and values of the temperature changes for the American
region are quite similar between the two models. Note that the pattern of temperature change is
smoother over the ocean, as expected. However, the patterns of temperature change over the
Himalayan area from the two models are quite different: the two maps show different colors over
the same regions. For example, GFDL-CM2.0 implies a noticeable temperature decrease over East
Africa, but CCSM3 does not suggest a clear temperature change over that region. On the other
hand, CCSM3 suggests an apparent temperature decrease over the region with latitude range
30°N to 55°N and longitude range 50°E to 80°E, which is not the case with the GFDL-CM2.0
output. Our goal is to test the statistical significance of all of these patterns.
4.2 Covariance estimation
It is common to model a spatial random field with a fixed mean structure through certain covariates
and a covariance structure through a parametric covariance function. The parameters in the
mean as well as in the covariance function can be estimated by maximum likelihood
or by weighted least squares. To improve the covariance parameter estimation
in the maximum likelihood method, one can use the restricted maximum likelihood (REML) method.
For detailed discussion on advantages and disadvantages of these methods, see Cressie (1993). For
our analysis, we need to estimate covariance parameters to implement the spatial SiZer developed
in Section 2.2.
Since we cannot assume any structure on the mean part of the process under our setting, the
estimation of the covariance parameters can be problematic. A similar (but simpler) problem arises
for a time series with temporal dependence, and Park et al. (2009a) used time differencing as an
attempt to remove the mean part and then applied the weighted least squares method to estimate
an autocorrelation function. They also used a regularization technique to improve the estimation.
If the mean structure for the spatial process is close to a (nonzero) constant or varies slowly and
smoothly, then we may also consider spatial differencing: for each pixel, we subtract the
average of neighboring pixels to remove the mean structure. Let Z = {Z(s_1), ..., Z(s_n)} be the
vector of the data (suppose we have n spatial locations in total) and K be the appropriate differencing
matrix. Then, similarly to the REML method, we can maximize the likelihood of KZ to get the
estimates of the covariance parameters.
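The differencing step can be sketched in code as follows. This is only an illustration under stated assumptions, not the authors' implementation: the 4-nearest-neighbour average, the restriction to interior pixels (which keeps the covariance of KZ nonsingular), the exponential covariance used as a stand-in model, and all function names are hypothetical choices.

```python
import numpy as np

def differencing_matrix(n_rows, n_cols):
    """Rows of K: an interior pixel minus the mean of its 4 nearest neighbours.

    Pixels are indexed row-major, so pixel (i, j) is entry i * n_cols + j.
    Only interior pixels get a row (an assumption of this sketch).
    """
    n = n_rows * n_cols
    rows = []
    for i in range(1, n_rows - 1):
        for j in range(1, n_cols - 1):
            row = np.zeros(n)
            row[i * n_cols + j] = 1.0
            for di, dj in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                row[(i + di) * n_cols + (j + dj)] = -0.25
            rows.append(row)
    return np.array(rows)

def exponential_cov(coords, sill, range_):
    """Stand-in covariance model: sill * exp(-distance / range_)."""
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    return sill * np.exp(-d / range_)

def differenced_loglik(z, K, sigma):
    """Gaussian log-likelihood of KZ, where KZ ~ N(0, K sigma K^T)."""
    y = K @ z
    c = K @ sigma @ K.T
    chol = np.linalg.cholesky(c)           # c = chol @ chol.T
    alpha = np.linalg.solve(chol, y)       # alpha @ alpha = y^T c^{-1} y
    logdet = 2.0 * np.sum(np.log(np.diag(chol)))
    return -0.5 * (y.size * np.log(2.0 * np.pi) + logdet + alpha @ alpha)
```

Because every row of K sums to zero, adding any constant to Z leaves KZ, and therefore this likelihood, unchanged. Covariance parameters such as sill and range_ could then be chosen by maximizing differenced_loglik with a numerical optimizer.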
In our analysis, we estimate the covariance parameters by maximum likelihood
with and without spatial differencing. Interestingly, the covariance parameter estimates from
these two approaches are fairly close for both data sets. In the next subsection, we present results
using the covariance parameter estimates without spatial differencing (but we subtract the spatial
average from the original data to make the overall mean zero).
4.3 Result
The maximum likelihood estimates of the covariance parameters of the Matern
function for the CCSM3 model outputs are a = 0.024, b = 261.04, ν = 1.33 for the region over
America and a = 0.027, b = 213.74, ν = 1.27 for the region around the Himalayan area. For
the GFDL-CM2.0 model outputs, the maximum likelihood estimates of the covariance parameters
of the Matern function are a = 0.052, b = 425.82, ν = 1.12 for the region over America and a = 0.055,
b = 532.32, ν = 0.83 for the region around the Himalayan area. The parameter estimates with
spatial differencing are quite similar in both cases under the same model. This similarity between
the fits with and without spatial differencing gives us confidence that the true mean structure is not
too far from being a constant, and thus that the mean structure may not affect the covariance parameter
estimates significantly. Note that the parameter estimates for the two regions are quite similar for
each of the two climate models.
We now show the results of both SiZer methods in Figures 13–16. For both the
American data in Figures 13 and 15 and the Himalayan data in Figures 14 and 16, large differences
between the two climate models can be seen in the spatial SiZer plots. The
spatial SiZer plots for the GFDL-CM2.0 model output in Figures 13 and 14 contain no
arrows, which implies that no significant temperature change is present. In contrast,
the spatial SiZer plots at both bandwidths for the CCSM3 model output in Figures 15 and
16 show several regions with significant temperature changes, denoted by
the arrows. Note that different h values for the spatial SiZer give quite different results for CCSM3
in Figure 15. Shen et al. (2002) also reported quite significant temperature changes in several
regions over America, which is somewhat consistent with our CCSM3 results. The fact that the
independent SiZer finds significant changes at many places on the maps from both
climate model outputs over both regions indicates that the independent SiZer cannot accurately
account for the spatial dependence that is present in these temperature change data.
5 Conclusion and future work
In this paper, we develop a spatial SiZer, which takes into account the spatial dependence structure
in an image. Through simulation results, we demonstrate that the spatial SiZer works well while
the independent SiZer flags spurious signals. Our application shows that the CCSM3 climate
model suggests significant decadal temperature changes over the American and Himalayan
regions, while the GFDL-CM2.0 climate model suggests no significant temperature change over the same
regions.
We have two suggestions for future work. First, one can develop a SiZer which is capable of
comparing several images. This is particularly useful when different algorithms are applied to an
image and one wants to compare the output images. Another example is when one is interested in
finding statistically significant resolution-dependent differences among several noisy images.
Comparison between two images is usually made numerically with, for example, the Mean Squared Error,
which simply averages the squared differences of corresponding pixels over the two images, regardless
of position or intensity. However, it does not always give convincing results, and no statistical inference is
involved. Moreover, statistical comparison of several images has not been widely studied. Recently,
Holmstrom and Pasanen (2009) used a Bayesian scale-space approach to capture the scale-dependent
differences in two noisy images of the same scene taken at two different instants of time.
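To make the limitation of the Mean Squared Error concrete, the following minimal sketch (the function name and the toy images are hypothetical) shows that it cannot distinguish a spatially clustered difference from a scattered one of the same total size.

```python
import numpy as np

def mean_squared_error(image_a, image_b):
    """Average of squared pixel-wise differences between two images."""
    a = np.asarray(image_a, dtype=float)
    b = np.asarray(image_b, dtype=float)
    if a.shape != b.shape:
        raise ValueError("images must have the same shape")
    return float(np.mean((a - b) ** 2))

# Toy example: a flat reference image and two perturbed copies.
reference = np.zeros((4, 4))

clustered = reference.copy()
clustered[:2, :2] = 1.0            # four changed pixels in one corner block

scattered = reference.copy()
for i, j in ((0, 0), (1, 2), (2, 1), (3, 3)):
    scattered[i, j] = 1.0          # four changed pixels spread over the image

mse_clustered = mean_squared_error(reference, clustered)
mse_scattered = mean_squared_error(reference, scattered)
```

Both perturbed images give the same Mean Squared Error, although only the first contains a spatially coherent feature. This is the kind of positional information that a scale-space comparison is intended to retain.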
Second, a further step would be to construct a SiZer for data with spatial-temporal dependence.
For example, in functional Magnetic Resonance Imaging (fMRI) studies, data are collected in a
sequence of three-dimensional images over time. One scanned image is composed of 64 × 64 pixels
and this scan process continues for a certain time period, which typically results in a couple of
hundred images. Thus, each pixel is a time series with a couple of hundred time points. The
simultaneous exploration of spatial and temporal correlations in fMRI data can be achieved by
combining SiZer for time series and spatial SiZer, and constructing a three-dimensional SiZer for
spatial-temporal data. There are various parametric spatio-temporal covariance models available
to model the fMRI data. Furthermore, visualization of high-dimensional statistical inference will
be yet another nontrivial component of the proposed future research.
Acknowledgments
This work is part of the first author’s dissertation. Mikyoung Jun acknowledges support from
NSF grants ATM-0620624 and DMS-0906532. Mikyoung Jun’s research is partially supported
by Award No. KUS-C1-016-04, made by King Abdullah University of Science and Technology
(KAUST). The authors acknowledge the modeling groups for making their simulations available
for analysis, the Program for Climate Model Diagnosis and Intercomparison (PCMDI) for collecting
and archiving the CMIP3 model output, and the World Climate Research Programme (WCRP)’s
Working Group on Coupled Modelling (WGCM) for organizing the model data analysis activity.
The WCRP CMIP3 multi-model dataset is supported by the Office of Science, U.S. Department of
Energy.
References
Chaudhuri, P. and Marron, J. S. (1999). SiZer for exploration of structures in curves. Journal of
the American Statistical Association, 94:807–823.
Cressie, N. (1993). Statistics for Spatial Data. Wiley, NY.
Crujeiras, R. M., Fernandez-Casal, R., and Gonzalez-Manteiga, W. (2010). Goodness-of-fit tests
for the spatial spectral density. Stochastic Environmental Research and Risk Assessment, 24:67–79.
Diggle, P. J. (1979). On parameter estimation and goodness-of-fit testing for spatial point patterns.
Biometrics, 35:87–101.
Duong, T., Cowling, A., Koch, I., and Wand, M. P. (2008). Feature Significance for Multivariate
Kernel Density Estimation. Computational Statistics and Data Analysis, 52:4225–4242.
Erasto, P. and Holmstrom, L. (2005). Bayesian multiscale smoothing for making inferences about
features in scatter plots. Journal of Computational and Graphical Statistics, 14:569–589.
Erasto, P. and Holmstrom, L. (2007). Bayesian analysis of features in a scatter plot with dependent
observations and errors in predictors. Journal of Statistical Computation and Simulation, 77:421–434.
Ganguli, B. and Wand, M. P. (2007). Feature significance in generalized additive models. Statistics
and Computing, 17:179–192.
Godtliebsen, F., Marron, J. S., and Chaudhuri, P. (2002). Significance in scale space for bivariate
density estimation. Journal of Computational and Graphical Statistics, 11:1–21.
Godtliebsen, F., Marron, J. S., and Chaudhuri, P. (2004). Statistical Significance of Features in
Digital Images. Image and Vision Computing, 22:1093–1104.
Godtliebsen, F. and Oigard, T. A. (2005). A visual display device for significant features in
complicated signals. Computational Statistics and Data Analysis, 48:317–343.
Hannig, J. and Lee, T. (2006). Robust SiZer for exploration of regression structures and outlier
detection. Journal of Computational and Graphical Statistics, 15:101–117.
Higdon, D. (2002). Space and space-time modeling using process convolutions. In Quantitative
Methods for Current Environmental Issues (C. Anderson, et al. eds), pages 37–54. Springer,
London.
Holmstrom, L. and Pasanen, L. (2009). Bayesian scale space analysis of differences in images.
Submitted.
Jones, M. and Wand, M. (1995). Kernel Smoothing. Chapman & Hall, London.
Jun, M., Knutti, R., and Nychka, D. W. (2008). Spatial analysis to quantify numerical model
bias and dependence: How many climate models are there? Journal of the American Statistical
Association, 103:934–947.
Jun, M. and Stein, M. L. (2007). An approach to producing space-time covariance functions on
spheres. Technometrics, 49:468–479.
Lindeberg, T. (1994). Scale-Space Theory in Computer Vision. Kluwer, Boston.
Oigard, T. A., Rue, H., and Godtliebsen, F. (2006). Bayesian multiscale analysis for time series
data. Computational Statistics and Data Analysis, 51:1719–1730.
Park, C., Hannig, J., and Kang, K. (2009a). Improved SiZer for time series. Statistica Sinica,
19:1511–1530.
Park, C. and Kang, K. (2008). SiZer analysis for the comparison of regression curves. Computational
Statistics and Data Analysis, 52:3954–3970.
Park, C., Lee, T., and Hannig, J. (2010). Multiscale exploratory analysis of regression quantiles
using quantile SiZer. Journal of Computational and Graphical Statistics, 19:497–513.
Park, C., Marron, J. S., and Rondonotti, V. (2004). Dependent SiZer: goodness of fit tests for time
series models. Journal of Applied Statistics, 31:999–1017.
Park, C., Vaughan, A., Hannig, J., and Kang, K. (2009b). SiZer analysis for the comparison of time
series. Journal of Statistical Planning and Inference, 139:3974–3988.
Rondonotti, V., Marron, J. S., and Park, C. (2007). SiZer for time series: a new approach to the
analysis of trends. Electronic Journal of Statistics, 1:268–289.
Shen, X., Huang, H.-C., and Cressie, N. (2002). Nonparametric hypothesis testing for a spatial
signal. Journal of the American Statistical Association, 97:1122–1140.
[Figure panels: (a) North and South America (CCSM3); (b) Himalayan area (CCSM3); (c) North and South America (GFDL-CM2.0); (d) Himalayan area (GFDL-CM2.0). Axes: longitude and latitude.]
Figure 1: Decadal temperature change for the time period of 1980–1999 over (a) and (c) parts of
North and South America, and (b) and (d) region around Himalayan area. Two climate models
are used: CCSM3 for (a) and (b), and GFDL-CM2.0 for (c) and (d). The unit is K.
[Figure panels: (a) Signal for a planar domain; (b) Signal for a spherical domain]
Figure 2: Original signals for planar and spherical domains
[Figure panels: (a) Spatial (h = 2); (b) Spatial (h = 4); (c) Independent (h = 2); (d) Independent (h = 4)]
Figure 3: Signal plus Matern covariance with a = 5, b = 2, ν = 0.5 on a planar domain
[Figure panels: (a) Spatial (h = 2); (b) Spatial (h = 4); (c) Independent (h = 2); (d) Independent (h = 4)]
Figure 4: Signal plus Matern covariance with a = 10, b = 5, ν = 1.5 on a planar domain
[Figure panels: (a) Spatial (h = 2); (b) Spatial (h = 4); (c) Independent (h = 2); (d) Independent (h = 4)]
Figure 5: Matern covariance with a = 5, b = 2, ν = 0.5 without a signal on a planar domain
[Figure panels: (a) Spatial (h = 2); (b) Spatial (h = 4); (c) Independent (h = 2); (d) Independent (h = 4)]
Figure 6: Matern covariance with a = 10, b = 5, ν = 1.5 without a signal on a planar domain
[Figure panels: (a) Spatial (h = 2); (b) Spatial (h = 4); (c) Independent (h = 2); (d) Independent (h = 4)]
Figure 7: Signal plus Matern covariance with a = 0.03, b = 1000, ν = 0.5 in spherical domain
[Figure panels: (a) Spatial (h = 2); (b) Spatial (h = 4); (c) Independent (h = 2); (d) Independent (h = 4)]
Figure 8: Signal plus Matern covariance with a = 0.08, b = 500, ν = 1.5 in spherical domain
[Figure panels: (a) Spatial (h = 2); (b) Spatial (h = 4); (c) Independent (h = 2); (d) Independent (h = 4)]
Figure 9: Matern covariance with a = 0.03, b = 1000, ν = 0.5 without a signal in spherical domain
[Figure panels: (a) Spatial (h = 2); (b) Spatial (h = 4); (c) Independent (h = 2); (d) Independent (h = 4)]
Figure 10: Matern covariance with a = 0.08, b = 500, ν = 1.5 without a signal in spherical domain
[Figure panels: (a) Signal (h = 2); (b) Signal (h = 4); (c) No signal (h = 2); (d) No signal (h = 4)]
Figure 11: The simulated data are generated from Matern covariance with a = 0.08, b = 500, ν = 1.5
with and without a signal in spherical domain. In SiZer analysis, ν = 0.5 is used instead of the
true ν = 1.5.
[Figure panels: (a) Spatial (h = 2); (b) Spatial (h = 4)]
Figure 12: The simulated data are generated from a spherical covariance function with a signal.
The spatial SiZer uses the Matern covariance with a = 0.026, b = 155.08, ν = 0.87.
[Figure panels: (a) Spatial (h = 2); (b) Spatial (h = 4); (c) Independent (h = 2); (d) Independent (h = 4). Axes: longitude and latitude.]
Figure 13: SiZer plots for the North and South America area with the GFDL-CM2.0 model
[Figure panels: (a) Spatial (h = 2); (b) Spatial (h = 4); (c) Independent (h = 2); (d) Independent (h = 4). Axes: longitude and latitude.]
Figure 14: SiZer plots for the Himalayan area with the GFDL-CM2.0 model
[Figure panels: (a) Spatial (h = 2); (b) Spatial (h = 4); (c) Independent (h = 2); (d) Independent (h = 4). Axes: longitude and latitude.]
Figure 15: SiZer plots for the North and South America area with the CCSM3 model
[Figure panels: (a) Spatial (h = 2); (b) Spatial (h = 4); (c) Independent (h = 2); (d) Independent (h = 4). Axes: longitude and latitude.]
Figure 16: SiZer plots for the Himalayan area with the CCSM3 model