Date post: 14-Jun-2020
Page 1: Statistics for Spatio-Temporal Data (Tutorial), Christopher K. Wikle, Department of Statistics, University of Missouri

Statistics for Spatio-Temporal Data (Tutorial)

Christopher K. Wikle

Department of Statistics, University of Missouri

([email protected])

Many of these slides were excerpted from a copyrighted short course developed by Chris Wikle and Noel Cressie (University of Wollongong) based on their book Statistics for Spatio-Temporal Data


Spatio-Temporal Processes and Data

Data from spatio-temporal processes are common in the real world, representing a variety of interactions across processes and scales of variability.


Spatio-Temporal Processes and Data (cont.)

Although it may be informative to see snapshots of spatial events in time (see the Missouri River scene below), to understand the process, we must know something about the behavior from one time-period to the next.


Spatio-Temporal Processes and Data (cont.)

Similarly, high-frequency temporal information from the gage level at Hermann, MO (on the Missouri River) does not give a sense of the spatial extent of the flood event.

[Figure: gage level and flood stage at Hermann, MO, 1988 to 1998. Horizontal axis: Year; vertical axis: Height (ft), 0 to 40.]


Outline of this Tutorial

• Overview of Spatio-Temporal Modeling

• Descriptive vs Dynamical Approach

• Hierarchical Spatio-Temporal Models

• Parameterization of Linear Dynamical Spatio-Temporal Models

• Nonlinear Spatio-Temporal Dynamical Models

• Invasive Species Example

• Ocean Biogeochemical Example

• Conclusion

Most references given in this tutorial can be found in Cressie and Wikle (2011) [henceforth, C&W (2011)].


Spatio-Temporal Processes and Data

There is no history without geography (and vice versa)! We consider space and time together.

The dynamical evolution (time dimension) of spatial processes means that we are able to reach more forcefully for the “Why” question. (The problems are clearest when there is no aggregation; henceforth, consider processes at point-level support for this tutorial, unless stated otherwise.)

Notation: Let

$$\{Y(\mathbf{s}; t) : \mathbf{s} \in D_s,\ t \in D_t\}$$

denote a spatio-temporal random process. We sometimes write this process as $Y(\mathbf{s}; t)$ or, more correctly, as $Y(\cdot\,; \cdot)$. For discrete time, we write $Y_t(\mathbf{s})$.


Spatio-Temporal Statistical Modeling

Spatio-temporal models exist in many scientific and mathematical disciplines. From a statistician’s perspective, what makes a model “statistical”?

• Uncertainty in data, model, and the associated parameters

• Estimation of parameters and prediction of processes

We also often make a distinction between “stochastic” and “statistical”:

• The former concerns random structures in models

• The latter concerns estimation and prediction given data


Spatio-Temporal Processes and Data (cont.)

Why spatio-temporal modeling? Characterize processes in the presence of uncertain and (often) incomplete observations and system knowledge, for the purposes of:

• Prediction in space (smoothing, interpolation)

• Prediction in time (forecasting)

• Assimilation of observations with deterministic models

• Inference on parameters that explain the etiology of the spatio-temporal process

Traditionally, there are two approaches to modeling such processes: descriptive and dynamical.


Spatio-Temporal Modeling

Descriptive (marginal) approach: Characterize the second-moment (covariance) behavior of the process

• Several different physical processes could imply the same marginal structure

• Most useful when knowledge of the etiology of the process is limited


Spatio-Temporal Modeling

Dynamical (conditional) approach: Current values of the process at a location evolve from past values of the process at various locations

• Conditional models are closer to the etiology of the phenomenon under study

• Most useful if there is some a priori knowledge available concerning the process’ behavior

Note that the descriptive approach and the dynamical approach can be related through their respective covariance functions.


A Simple Example

Consider the deterministic 1-D space × time, reaction-diffusion equation:

$$\frac{\partial Y(s; t)}{\partial t} = \beta\, \frac{\partial^2 Y(s; t)}{\partial s^2} - \alpha\, Y(s; t),$$

for $\{s \in \mathbb{R},\ t \ge 0\}$, where $\beta$ is the diffusion coefficient and $\alpha$ is the “reaction” coefficient.

Meaning of the Equation: The rate of change in Y is equal to the “spread” of Y in space (i.e., diffusion) offset by the “loss” of a certain multiple of Y (i.e., reaction).

Behavior of the Equation: From a given initial condition $Y(s; 0)$, the process $Y(s; t)$ dampens as time t increases.
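The damping behavior can be checked numerically. The following is a minimal sketch, assuming an explicit (forward-Euler) finite-difference scheme, a grid on [0, 40] with spacing 1, a time step of 0.005, and boundary values held at zero; none of these choices come from the slides, and the function name is hypothetical.

```python
def simulate_reaction_diffusion(alpha, beta, ds=1.0, dt=0.005, length=40, n_steps=400):
    """Explicit finite-difference solution of dY/dt = beta * d2Y/ds2 - alpha * Y.

    Boundary values are held at zero (an assumption made for this sketch).
    """
    n = int(length / ds) + 1
    # Initial condition Y(s; 0) = I(15 <= s <= 24)
    y = [1.0 if 15 <= i * ds <= 24 else 0.0 for i in range(n)]
    history = [y]
    for _ in range(n_steps):
        new = [0.0] * n
        for i in range(1, n - 1):
            laplacian = (y[i + 1] - 2.0 * y[i] + y[i - 1]) / ds**2
            new[i] = y[i] + dt * (beta * laplacian - alpha * y[i])
        y = new
        history.append(y)
    return history


history = simulate_reaction_diffusion(alpha=1.0, beta=20.0)
peaks = [max(row) for row in history]   # largest value of Y at each time step
print(peaks[0], peaks[100], peaks[400])
```

With these settings the peak of the profile should shrink at every step, which is the dampening described above. A larger time step can make an explicit scheme unstable, so the step sizes here are deliberately conservative.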


A Simple Example (cont.)

Y (s; 0) = I (15 ≤ s ≤ 24)

(a) α = 1, β = 20; (b) α = 0.05, β = 0.05; (c) α = 1, β = 50


A Simple Example: Stochastic Version

Consider the stochastic version of this PDE:

$$\frac{\partial Y}{\partial t} - \beta\, \frac{\partial^2 Y}{\partial s^2} + \alpha Y = \eta,$$

where $\{\eta(s; t) : s \in \mathbb{R},\ t \ge 0\}$ is a mean-zero, white-noise process:

$$E(\eta(s; t)) \equiv 0$$

$$\mathrm{cov}(\eta(s; t), \eta(u; r)) = \sigma^2 I(s = u,\ t = r)$$

In this case, a statistical balance is reached between the “disturbance” caused by $\eta(\cdot\,; \cdot)$ and the smoothing effect of the diffusion and loss components. That is, from a given initial condition, the stochastic PDE results in a process that eventually achieves both spatial and temporal stationarity. (The more general case of stochastic PDEs in $\mathbb{R}^d$ is given, e.g., by Brown et al., 2000.)


Stochastic Reaction-Diffusion Simulation Plots

Y (s; 0) = I (15 ≤ s ≤ 24)

α = 1 , β = 20

(a) σ = 0.01; (b) σ = 0.1; (c) σ = 1


Spatio-Temporal Covariance Function

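The stochastic reaction-diffusion equation implies a covariance function (stationary in space and time; definition to follow):

$$C_Y(h; \tau) \equiv \mathrm{cov}(Y(s; t), Y(s + h; t + \tau))$$

and correlation function:

$$\rho_Y(h; \tau) \equiv C_Y(h; \tau)/C_Y(0; 0)$$

Heine (1955; Biometrika) gives a closed-form solution for $\rho_Y(\cdot\,; \cdot)$ for spatial lag $h \in \mathbb{R}$ and temporal lag $\tau \in \mathbb{R}$:

$$\rho_Y(h; \tau) = \frac{1}{2}\left\{ e^{-h(\alpha/\beta)^{1/2}}\, \mathrm{Erfc}\!\left(\frac{2\tau(\alpha/\beta)^{1/2} - h/\beta}{2(\tau/\beta)^{1/2}}\right) + e^{h(\alpha/\beta)^{1/2}}\, \mathrm{Erfc}\!\left(\frac{2\tau(\alpha/\beta)^{1/2} + h/\beta}{2(\tau/\beta)^{1/2}}\right) \right\},$$

where Erfc(z) is the “complementary error function”:

$$\mathrm{Erfc}(z) \equiv (2/\pi^{1/2}) \int_z^{\infty} e^{-v^2}\, dv, \quad z \ge 0;$$

and $\mathrm{Erfc}(z) = 2 - \mathrm{Erfc}(-z)$, $z < 0$.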
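The closed-form correlation can be evaluated directly. The sketch below is one possible transcription of the formula above for h ≥ 0 and τ ≥ 0, using Python's standard `math.erfc`; the function name `heine_correlation` and the explicit handling of τ = 0 by its limit are choices made here, not part of the slides.

```python
import math


def heine_correlation(h, tau, alpha, beta):
    """rho_Y(h; tau) for h >= 0 and tau >= 0, transcribed from the formula above."""
    a = math.sqrt(alpha / beta)
    if tau == 0.0:
        # Limit tau -> 0+: the first Erfc tends to 2 and the second to 0 (for h > 0),
        # which leaves exp(-h * sqrt(alpha / beta)).
        return math.exp(-h * a)
    denom = 2.0 * math.sqrt(tau / beta)
    term1 = math.exp(-h * a) * math.erfc((2.0 * tau * a - h / beta) / denom)
    term2 = math.exp(h * a) * math.erfc((2.0 * tau * a + h / beta) / denom)
    return 0.5 * (term1 + term2)


# Example with alpha = 1 and beta = 20
print(heine_correlation(0.0, 0.0, 1.0, 20.0))
print(heine_correlation(2.0, 0.5, 1.0, 20.0))
```

Setting h = 0 reduces the expression to Erfc((ατ)^{1/2}), and letting τ shrink to zero gives exp{−h(α/β)^{1/2}}, so the function can be sanity-checked against those two marginal forms.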


Contour Plot of Spatio-Temporal Correlation Function

The plot shows $\rho_Y(h; \tau)$ for the stochastic reaction-diffusion equation when α = 1 and β = 20


Plots of Marginal Spatial and Temporal Correlation Functions

Special cases include the marginal spatial correlation function at a given time: (a) $\rho_Y(h; 0) = \exp\{-h(\alpha/\beta)^{1/2}\}$, $h > 0$; and the temporal correlation function at a given spatial location: (b) $\rho_Y(0; \tau) = \mathrm{Erfc}(\tau^{1/2}\alpha^{1/2})$, $\tau > 0$.


Spatio-Temporal Stationarity

Definition:

We say that $f$ is a stationary spatio-temporal covariance function on $\mathbb{R}^d \times \mathbb{R}$, if it is nonnegative-definite and can be written as:

$$f((\mathbf{s}; t), (\mathbf{x}; r)) = C(\mathbf{s} - \mathbf{x};\ t - r), \quad \mathbf{s}, \mathbf{x} \in \mathbb{R}^d,\ t, r \in \mathbb{R}.$$

If a random process $Y(\cdot\,; \cdot)$ has a constant expectation and a stationary covariance function $C_Y(\mathbf{h}; \tau)$, then it is said to be second-order (or weakly) stationary. (Strong stationarity implies the equivalence of the two probability measures defining the random processes $Y(\cdot\,; \cdot)$ and $Y(\cdot + \mathbf{h};\ \cdot + \tau)$, respectively, for all $\mathbf{h} \in \mathbb{R}^d$ and all $\tau \in \mathbb{R}$.)


Separability of Spatio-Temporal Covariance Functions

Stochastic PDEs are built from dynamical physical considerations, and they imply covariance functions. Covariance functions have to be positive-definite (p-d). So, specifying classes of spatio-temporal covariance functions to describe the dependence in spatio-temporal data is not all that easy.

Suppose the spatial $C^{(1)}(\mathbf{h})$ is p-d and the temporal $C^{(2)}(\tau)$ is p-d. Then the separable class:

$$C(\mathbf{h}; \tau) \equiv C^{(1)}(\mathbf{h}) \cdot C^{(2)}(\tau)$$

is guaranteed to be p-d.

Separability is unusual in dynamical models; it says that temporal evolution proceeds independently at each spatial location. That is, separability comes from a lack of spatio-temporal interaction in $Y(\cdot\,; \cdot)$.
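On a space-time grid, a separable covariance function yields a covariance matrix that is a Kronecker product of a temporal and a spatial matrix. The sketch below illustrates the positive-definiteness guarantee numerically with NumPy, assuming exponential covariances with range parameters 3 and 2; those covariance choices and grid sizes are illustrative assumptions only.

```python
import numpy as np

# Hypothetical exponential covariances; the range parameters 3.0 and 2.0 are assumptions.
s = np.arange(6, dtype=float)                         # spatial locations
t = np.arange(4, dtype=float)                         # time points
C1 = np.exp(-np.abs(s[:, None] - s[None, :]) / 3.0)   # spatial C^(1)(h)
C2 = np.exp(-np.abs(t[:, None] - t[None, :]) / 2.0)   # temporal C^(2)(tau)

# Covariance of the process stacked time by time: the entry for
# ((t_k, s_i), (t_l, s_j)) equals C2[k, l] * C1[i, j].
C = np.kron(C2, C1)

eig_min = np.linalg.eigvalsh(C).min()
print(C.shape, eig_min > 0)
```

The eigenvalues of a Kronecker product are the pairwise products of the factors' eigenvalues, which is why two positive-definite factors give a positive-definite product. The same structure is what makes separable models computationally convenient.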


Stochastic Reaction-Diffusion and Separability

If $C(h; \tau) = C^{(1)}(h) \cdot C^{(2)}(\tau)$, then

$$C(h; 0) = C^{(1)}(h)\, C^{(2)}(0)$$

$$C(0; \tau) = C^{(1)}(0)\, C^{(2)}(\tau),$$

and hence

$$\rho(h; \tau) = \frac{C^{(1)}(h) \cdot C^{(2)}(\tau)}{C(0; 0)} = \frac{C(h; 0) \cdot C(0; \tau)}{C(0; 0) \cdot C(0; 0)} = \rho(h; 0) \cdot \rho(0; \tau)$$

What about the stochastic reaction-diffusion equation for $Y(\cdot\,; \cdot)$? Plot:

$\rho_Y(h; 0) \cdot \rho_Y(0; \tau)$ versus $(h, \tau)$

$\rho_Y(h; \tau)$ versus $(h, \tau)$


Contour Plots of Spatio-Temporal Correlation Functions

(a) $\rho_Y(h; 0) \cdot \rho_Y(0; \tau)$; (b) $\rho_Y(h; \tau)$

The difference in correlation functions is striking. Hence $\rho_Y(\cdot\,; \cdot)$, for the stochastic reaction-diffusion equation, is non-separable. Note, however, that it is often difficult to see the difference between separability and non-separability in realizations from a process.
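The gap between the two surfaces can also be quantified without plotting. This is a small sketch, assuming Heine's closed form for h ≥ 0 and τ ≥ 0 with α = 1 and β = 20, evaluated on an arbitrary grid of lags; the helper name `rho` and the grid are hypothetical choices.

```python
import math


def rho(h, tau, alpha=1.0, beta=20.0):
    """Heine's correlation for h >= 0, tau >= 0 (tau = 0 handled by its limit)."""
    a = math.sqrt(alpha / beta)
    if tau == 0.0:
        return math.exp(-h * a)
    denom = 2.0 * math.sqrt(tau / beta)
    term1 = math.exp(-h * a) * math.erfc((2.0 * tau * a - h / beta) / denom)
    term2 = math.exp(h * a) * math.erfc((2.0 * tau * a + h / beta) / denom)
    return 0.5 * (term1 + term2)


grid_h = [0.5 * i for i in range(1, 11)]       # spatial lags 0.5, ..., 5.0
grid_tau = [0.05 * j for j in range(1, 11)]    # temporal lags 0.05, ..., 0.5

# Largest absolute difference between the true correlation and the
# separable product rho(h; 0) * rho(0; tau) over the grid.
max_gap = max(
    abs(rho(h, tau) - rho(h, 0.0) * rho(0.0, tau))
    for h in grid_h
    for tau in grid_tau
)
print(max_gap)
```

If the correlation function were separable, `max_gap` would be zero up to rounding; a clearly positive value is a numerical counterpart of the contrast between panels (a) and (b).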


Inference on a Hidden Spatio-Temporal Process

We could ignore the dynamics and treat time as another “spatial” dimension (i.e., descriptive approach). Write the data as:

$$\mathbf{Z} = (Z(\mathbf{s}_1; t_1), \ldots, Z(\mathbf{s}_m; t_m))',$$

which are observations taken at known space-time “locations.”

Note that the data are usually noisy and not observed at all locations of interest.

Assume a hidden (“true”) process, $\{Y(\mathbf{s}; t) : \mathbf{s} \in D_s \subset \mathbb{R}^d,\ t \ge 0\}$, which is not observable due to measurement error and “missingness.” Write

$$\mathbf{Z} = \mathbf{Y} + \boldsymbol{\varepsilon},$$

where $E(\boldsymbol{\varepsilon}) = \mathbf{0}$, $\mathrm{cov}(\boldsymbol{\varepsilon}) = \sigma_\varepsilon^2 \mathbf{I}$.

We wish to predict $Y(\mathbf{s}_0; t_0)$ from data $\mathbf{Z}$.


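Spatio-Temporal (Simple) Kriging

Predict $Y(\mathbf{s}_0; t_0)$ with the linear predictor, $\boldsymbol{\lambda}'\mathbf{Z} + k$:

For simplicity, assume $E(Y(\mathbf{s}; t)) \equiv 0$. Then $k = 0$, and we minimize, w.r.t. $\boldsymbol{\lambda}$, the mean squared prediction error,

$$E(Y(\mathbf{s}_0; t_0) - \boldsymbol{\lambda}'\mathbf{Z})^2.$$

This results in the simple kriging predictor:

$$\hat{Y}(\mathbf{s}_0; t_0) = \mathbf{c}(\mathbf{s}_0; t_0)'\, \boldsymbol{\Sigma}_Z^{-1}\, \mathbf{Z},$$

where $\boldsymbol{\Sigma}_Z \equiv \mathrm{cov}(\mathbf{Z})$, and $\mathbf{c}(\mathbf{s}_0; t_0)' = \mathrm{cov}(Y(\mathbf{s}_0; t_0), \mathbf{Z}) = \mathrm{cov}(Y(\mathbf{s}_0; t_0), \mathbf{Y})$.

The simple kriging standard error (s.e.) is:

$$\sigma_k(\mathbf{s}_0; t_0) = \{\mathrm{var}(Y(\mathbf{s}_0; t_0)) - \mathbf{c}(\mathbf{s}_0; t_0)'\, \boldsymbol{\Sigma}_Z^{-1}\, \mathbf{c}(\mathbf{s}_0; t_0)\}^{1/2}$$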
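The predictor and its standard error translate into a few lines of linear algebra. The sketch below uses NumPy and assumes a zero-mean process with a separable exponential covariance chosen purely for illustration; `cov_fn`, `simple_kriging`, and the three toy observations are hypothetical and do not come from the slides.

```python
import numpy as np


def cov_fn(h, tau):
    # Hypothetical separable exponential covariance, used only for illustration.
    return np.exp(-np.abs(h) / 3.0 - np.abs(tau) / 2.0)


def simple_kriging(s_obs, t_obs, z, s0, t0, sigma_eps=0.0):
    """Simple kriging predictor and standard error for a zero-mean process."""
    C_Y = cov_fn(s_obs[:, None] - s_obs[None, :], t_obs[:, None] - t_obs[None, :])
    Sigma_Z = C_Y + sigma_eps**2 * np.eye(len(z))   # cov(Z) = cov(Y) + sigma_eps^2 I
    c0 = cov_fn(s_obs - s0, t_obs - t0)             # cov(Y(s0; t0), Y)
    weights = np.linalg.solve(Sigma_Z, c0)          # Sigma_Z^{-1} c0
    pred = weights @ z
    var = cov_fn(0.0, 0.0) - c0 @ weights
    return pred, np.sqrt(max(var, 0.0))             # guard against tiny negative rounding


s_obs = np.array([1.0, 4.0, 7.0])
t_obs = np.array([0.0, 1.0, 2.0])
z = np.array([0.5, -0.2, 1.0])

pred_at_obs, se_at_obs = simple_kriging(s_obs, t_obs, z, s0=1.0, t0=0.0)
pred_far, se_far = simple_kriging(s_obs, t_obs, z, s0=100.0, t0=0.0)
print(pred_at_obs, se_at_obs, pred_far, se_far)
```

With noiseless data the predictor should reproduce an observation at its own space-time location with essentially zero standard error, while far from all data it should fall back toward the mean (zero) with standard error near the process standard deviation. Solving the linear system rather than forming the inverse explicitly is a numerical-stability choice.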


Kriging for Stochastic Reaction-Diffusion Equation

(a) For simplicity, assume no noise in the data $\mathbf{Z}$ (i.e., $\boldsymbol{\varepsilon} = \mathbf{0}$)

(b) Crosses show $\{(s_i; t_i) : i = 1, \ldots, 48\}$ (“data” locations) superimposed on the kriging predictor map, $\{\hat{Y}(s_0; t_0)\}$

(c) Kriging s.e. map, $\{\sigma_k(s_0; t_0)\}$


Kriging for Stochastic Reaction-Diffusion Equation (cont.)

(a) Same noiseless dataset (i.e., $\boldsymbol{\varepsilon} = \mathbf{0}$)

(b) Crosses show different $\{(s_i; t_i) : i = 1, \ldots, 48\}$ superimposed on the kriging predictor map, $\{\hat{Y}(s_0; t_0)\}$

(c) Kriging s.e. map, $\{\sigma_k(s_0; t_0)\}$


Spatio-Temporal Covariance Functions

In practice, one does not typically know the underlying stochastic PDE that governs the system of interest. Even with such knowledge, it may not be easy to find the analytical covariance function.

We saw that the assumption of separability is not very realistic and that covariance functions must satisfy the positive-definiteness property. This suggests the need for realistic classes of spatio-temporal covariance functions.

In recent years, there has been good progress in developing new classes of spatio-temporal covariance functions through the use of the spectral-domain representation and Bochner’s Theorem (e.g., see C&W 2011, Sec. 6.1.6; examples include the work of Cressie and Huang, 1999; Gneiting, 2002; Stein, 2005; and many others).


Spatio-Temporal Covariance Functions (cont.)

To date, available classes of (descriptive) S-T covariance functions are not realistic for many complicated phenomena, and there can be serious computational issues with their implementation in traditional kriging formulas due to the dimensionality of the prediction problems of interest.

As an alternative, we can make use of dynamical (conditional) formulations. These simplify the joint-dependence structure. In addition, because conditional models are closer to the process’ etiology, it may be easier to incorporate process knowledge directly (e.g., using dynamical models).

Consider again the stochastic reaction-diffusion equation, now from the dynamical perspective.


Emphasize the Dynamics

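Approximate the differentials in the reaction-diffusion equation:

$$\frac{\partial Y}{\partial t} = \beta\, \frac{\partial^2 Y}{\partial s^2} - \alpha Y$$

with differences over the grid from 0 to $L$ at intervals $\Delta_s$:

$$\frac{Y(s; t + \Delta_t) - Y(s; t)}{\Delta_t} = \beta \left\{ \frac{Y(s + \Delta_s; t) - 2Y(s; t) + Y(s - \Delta_s; t)}{\Delta_s^2} \right\} - \alpha\, Y(s; t)$$

Define $\mathbf{Y}_t \equiv (Y(\Delta_s; t), \ldots, Y(L - \Delta_s; t))'$; $\mathbf{Y}_t^B \equiv (Y(0; t), Y(L; t))'$.

Then the stochastic version of the difference equation above is:

$$\mathbf{Y}_{t+\Delta_t} = \mathbf{M}\mathbf{Y}_t + \mathbf{M}^B \mathbf{Y}_t^B + \boldsymbol{\eta}_{t+\Delta_t},$$

where $\mathbf{M}^B \mathbf{Y}_t^B$ represents given boundary effects. The difference equation is a good approximation to the differential equation, provided $\alpha\Delta_t < 1$ and $2\beta\Delta_t/\Delta_s^2 < 1$.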


Emphasize the Dynamics (cont.)

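Importantly, the matrix $\mathbf{M}$ is given by

$$\mathbf{M} = \begin{pmatrix}
\theta_1 & \theta_2 & 0 & \cdots & 0 \\
\theta_2 & \theta_1 & \theta_2 & \ddots & \vdots \\
0 & \theta_2 & \theta_1 & \ddots & 0 \\
\vdots & \ddots & \ddots & \ddots & \theta_2 \\
0 & \cdots & 0 & \theta_2 & \theta_1
\end{pmatrix},$$

where $\theta_1 = (1 - \alpha\Delta_t - 2\beta\Delta_t/\Delta_s^2)$, $\theta_2 = \beta\Delta_t/\Delta_s^2$.

This can be viewed as the propagator (transition) matrix of a VAR(1) process. The matrix is defined by the dynamics. In other words, in a dynamic model of spatio-temporal dependence, $\mathbf{M}$ has structure (which is typically sparse).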
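The tridiagonal structure is easy to build and inspect. Here is a minimal NumPy sketch, assuming α = 1, β = 20, a spatial step of 1, a time step of 0.01, and 39 interior grid points (a grid from 0 to 40); the function name `propagator` and the grid size are assumptions made for illustration.

```python
import numpy as np


def propagator(alpha, beta, ds, dt, n):
    """Tridiagonal propagator matrix M for n interior grid points."""
    theta1 = 1.0 - alpha * dt - 2.0 * beta * dt / ds**2   # diagonal weight
    theta2 = beta * dt / ds**2                             # nearest-neighbor weight
    return theta1 * np.eye(n) + theta2 * (np.eye(n, k=1) + np.eye(n, k=-1))


M = propagator(alpha=1.0, beta=20.0, ds=1.0, dt=0.01, n=39)

# M is symmetric here, so eigvalsh applies; a spectral radius below 1
# means the VAR(1) recursion does not blow up.
spectral_radius = np.abs(np.linalg.eigvalsh(M)).max()
print(M[0, :3], spectral_radius)
```

Only the diagonal and the two adjacent bands are non-zero, so each location is updated from itself and its two nearest neighbors. For large grids a sparse representation would be the natural choice; a dense array is used here only to keep the sketch short.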


Emphasize the Dynamics (cont.)

Conditional on the boundary effects, we see that the lagged (in time) spatial covariances are given by

$$\mathbf{C}_Y^{(m)} = \mathbf{M}^m \mathbf{C}_Y^{(0)},$$

where $\mathbf{C}_Y^{(m)} \equiv \mathrm{cov}(\mathbf{Y}_t, \mathbf{Y}_{t+m\Delta_t})$; $m = 0, 1, 2, \ldots$, and it can be shown that the lag-0 marginal spatial covariance for $\mathbf{Y}$ can be written in terms of the propagator matrix $\mathbf{M}$ and the spatial covariance matrix for the $\boldsymbol{\eta}$-process, $\mathbf{C}_\eta^{(0)}$:

$$\mathrm{vec}(\mathbf{C}_Y^{(0)}) = (\mathbf{I} - \mathbf{M} \otimes \mathbf{M})^{-1}\, \mathrm{vec}(\mathbf{C}_\eta^{(0)}).$$

This suggests that we can compare the spatio-temporal covariance structure for this reaction-diffusion difference equation with the PDE’s theoretical form derived by Heine (1955).
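The vec/Kronecker identity can be applied directly for a small grid. The sketch below uses NumPy with an 8-point grid, the θ values implied by α = 1, β = 20, a spatial step of 1 and a time step of 0.01, and an assumed noise covariance of 0.01 times the identity; the grid size and noise level are illustrative assumptions.

```python
import numpy as np

n = 8
theta1, theta2 = 0.59, 0.2      # alpha = 1, beta = 20, Delta_s = 1, Delta_t = 0.01
M = theta1 * np.eye(n) + theta2 * (np.eye(n, k=1) + np.eye(n, k=-1))
C_eta = 0.1**2 * np.eye(n)      # assumed noise covariance (sigma = 0.1)

# vec() stacks columns, hence order="F" when flattening and reshaping.
vec_C0 = np.linalg.solve(np.eye(n * n) - np.kron(M, M), C_eta.flatten(order="F"))
C0 = vec_C0.reshape((n, n), order="F")

# C0 should satisfy the stationarity equation C0 = M C0 M' + C_eta.
residual = np.abs(C0 - (M @ C0 @ M.T + C_eta)).max()

# Lagged covariances C^(m) = M^m C^(0)
C1 = M @ C0
C2 = np.linalg.matrix_power(M, 2) @ C0
print(residual, C0[0, 0], C1[0, 0], C2[0, 0])
```

The Kronecker system has n² unknowns, so this direct solve is only practical for small n; it is meant as a check on the formula rather than a scalable method. Dividing the entries of `C0`, `C1`, `C2` by the appropriate variances would give correlations of the kind compared against the PDE's theoretical form.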


Comparison of Differential and Difference Equations

Spatio-temporal correlations; $\alpha = 1$, $\beta = 20$, $\Delta_s = 1$, and $\Delta_t = 0.01$

Solid blue line: from differential equation

Red dots: from difference equation


The Dynamics in the Difference Equation

Think of a spatial process at time t rather than a spatio-temporal process. Call it the vector $\mathbf{Y}_t$. Then describe its dynamics by a discrete-time Markov process; e.g., VAR(1):

$$\mathbf{Y}_t = \mathbf{M}\mathbf{Y}_{t-1} + \boldsymbol{\eta}_t$$

As implied above, the choice of $\mathbf{M}$ is crucial. In particular, we note that the entries of $\mathbf{M} \equiv (m_{ij})$ represent “spatial weights” of the process values from the past, e.g.,

$$Y_t(\mathbf{s}_i) = \sum_{j=1}^{n} m_{ij}\, Y_{t-1}(\mathbf{s}_j) + \eta_t(\mathbf{s}_i).$$

Usually, many of these coefficients have small or zero weight. Typically, the $m_{ij}$ corresponding to nearby locations $\mathbf{s}_i$ and $\mathbf{s}_j$ are non-zero, and they are zero when locations are far apart.


Structure of M

These directed graphs show the case of one-dimensional space:

[Figure: two directed graphs linking locations i − 2, i − 1, i, i + 1, i + 2 at time t − 1 to locations at time t. Panel titles: General M; M defined “spatially”.]


Structure of M (cont.)

The importance of the structure of M suggests ways in which this matrix can be parameterized.

What is it about the structure of this matrix and the values of these “nearest neighbor” parameters that affects the dynamics? Can we use this sort of scientific process knowledge (in various forms) to help with this parameterization?

In fact, this type of information can help us, but we need an efficient framework in which to build it into the model.

The hierarchical modeling framework is quite helpful in this regard.


Towards Hierarchical Spatio-Temporal Statistical Models

• We can motivate dynamical models through mechanistic relationships.

• These models can still be over-parameterized, or too simple for real-world processes.

• We must account for this complexity and our uncertainty in the process and parameters.

• There is also uncertainty in data, and the size of the dataset can be a problem.

• Hierarchical statistical models (specifically, Bayesian Hierarchical Models, BHMs) can provide a framework to account for these issues.

Before getting back to the dynamical specifications, consider the following motivating example to illustrate the BHM approach for spatio-temporal modeling.


Motivating Problem: Spread of Invasive Species

Eurasian Collared-Doves (ECDs)

• The Eurasian Collared-Dove (Streptopelia decaocto) originated in Asia and, starting in the 1930s, expanded its range into Europe (Hudson, 1965).

• They were first observed in the United States in the mid 1980s after being introduced into the Bahamas in 1974 from a population that escaped captivity (Smith, 1987).

• Since the species’ introduction in Florida, its range has been expanding dramatically across North America.


Breeding Bird Survey (BBS) Counts of ECD, 1986-2003


BBS ECD Counts: 2003 and Yearly Totals


Invasion Impacts

• ECD biological threats (Romagosa and Labisky, 2000): competition for resources with native avifauna; transmission of disease

• “ECD will probably colonize all of North America within a few decades” (Romagosa and Labisky, 2000)

Just how probable is this colonization? The example presented later will answer this question.

The following provides the spatio-temporal hierarchical motivation for such a model.


Typical Invasions

Invasive species phases:

• Introduction

• Establishment

• Range Expansion

• Saturation

Ecological models for invasions involve dispersal and growth


Uncertainty in Spread of Invasives

• Uncertainty in data (e.g., BBS counts)

  ◦ Differences in experience and expertise of the BBS volunteer observers lead to differences in probability of detection.

  ◦ The Eurasian Collared-Dove is similar in appearance to the Ringed Turtle-Dove. Although there are fundamental differences, observers routinely mistake these species, especially early in invasion.

• Uncertainty and complexity in the underlying spatio-temporal process dynamics

  ◦ “diffusion” (spread) and growth

  ◦ species interactions

  ◦ important exogenous variables

• Uncertainty in parameters

  ◦ diffusion, growth, and carrying capacity vary spatially


Bayesian Hierarchical Spatio-Temporal Models

Basic rule of probability: [Z ,Y , θ] = [Z |Y , θ][Y |θ][θ]

Rather than seek to model the complicated joint distribution, we factor this joint distribution as a product of a sequence of conditional distributions, to which we might be able to apply scientific insight.

Thus, for complicated spatio-temporal processes, we consider the following three-stage factorization of [data, process, parameters] (Berliner, 1996; Wikle et al., 1998):

Stage 1. Data Model: [data|process, data parameters]

Stage 2. Process Model: [process|process parameters]

Stage 3. Parameter Model: [data params and process params].


Data Models

Let $Z_a$ be data observed for some process $Y$, and let $\theta_a$ be parameters.

The data model is written:

$$[Z_a \mid Y, \theta_a]$$

This distribution is much simpler than the unconditional distribution of $[Z_a]$, because most of the complicated structure (spatial and temporal) comes from the process $Y$.


Data Models (cont.)

Combining data sets: given observations $Z_a, Z_b$ for the same process, $Y$, often we can write:

$$[Z_a, Z_b \mid Y, \theta_a, \theta_b] = [Z_a \mid Y, \theta_a]\,[Z_b \mid Y, \theta_b].$$

That is, conditional on the true process, the data are often assumed independent. (Note that they are almost certainly not unconditionally independent!) This hierarchical framework presents a natural way to accommodate data at differing spatial and temporal resolutions and alignments (e.g., Wikle and Berliner, 2005).

Similarly, for a multivariate process $(Y_a, Y_b)$, often we can write:

$$[Z_a, Z_b \mid Y_a, Y_b, \theta_a, \theta_b] = [Z_a \mid Y_a, \theta_a]\,[Z_b \mid Y_b, \theta_b].$$

Again, conditional on the true processes, the data are often assumed independent.
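The distinction between conditional and unconditional independence can be seen in a small simulation. This sketch assumes a standard-normal "true process" and two data sets formed by adding independent Gaussian errors with standard deviation 0.5; the sample size, seed, and helper `corr` are arbitrary choices made for illustration.

```python
import random


def corr(x, y):
    """Sample correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


rng = random.Random(0)
n = 20000
y = [rng.gauss(0.0, 1.0) for _ in range(n)]      # true process Y
za = [v + rng.gauss(0.0, 0.5) for v in y]        # data set a: Y plus error
zb = [v + rng.gauss(0.0, 0.5) for v in y]        # data set b: Y plus independent error

# Unconditionally the two data sets share Y, so they are strongly correlated
# (theoretical correlation 1 / (1 + 0.25) = 0.8).
marginal = corr(za, zb)

# Given Y, only the independent errors remain.
conditional = corr([a - v for a, v in zip(za, y)],
                   [b - v for b, v in zip(zb, y)])
print(round(marginal, 2), round(conditional, 2))
```

The marginal correlation should sit near its theoretical value while the conditional one should be close to zero, which is the sense in which the hierarchical factorization moves the dependence into the process model.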


Process Models

Process models are also often factored into a series of conditional models:

$$[Y_a, Y_b \mid \theta_Y] = [Y_a \mid Y_b, \theta_Y]\,[Y_b \mid \theta_Y]$$

We make such an assumption when using the Markov model for dynamical processes. (For example, in the first-order case, the “a” and “b” subscripts refer to time t and t − 1, respectively.)

Such factorizations are also important for simplifying multivariate processes; Royle and Berliner (1999) consider such a conditional framework for modeling multivariate spatial processes. For example, consider ozone concentration conditioned on temperature; or consider CO2 conditioned on potential temperature.


Parameter Models

Parameter models can also be factored into subcomponents. For example, we might assume,

$$[\theta_a, \theta_b, \theta_Y] = [\theta_a]\,[\theta_b]\,[\theta_Y].$$

That is, we often assume that parameter distributions are independent, although subject-matter knowledge may lead to more complex parameter models.

Scientific insight and previous studies can facilitate the specification of these models. For example, measurement-error parameters can often be obtained from previous studies that focused on such issues (this is typically the case for environmental variables and some ecological data such as from the BBS).


Parameter Models (cont.)

Process parameters often carry scientific insight (e.g., spatially dependent diffusion parameters, Wikle, 2003; turbulence parameters, Wikle et al., 2001).

In some cases, we do not know much about the parameters and use vague or non-informative distributions for parameters. Alternatively, we might use data-based estimates for such parameters.

Specification of parameter distributions is often criticized for its “subjectiveness.” Such criticism is misguided! This is what brings the power to hierarchical models.


Bayesian Hierarchical Model (BHM): Schematic Example

• [data | process, parameters]: uncertainty in observations. For example,

[bird-count observations | true bird counts, data parameters]

• [process | parameters]: science (diffusion PDEs); partitioned into subcomponents (e.g., Markov process); uncertainty (additive noise, random effects). For example,

[true bird counts | diffusion and growth processes, process params]

• [parameters]: prior scientific understanding. For example,

[diffusion parameters | habitat covariates]


Empirical Hierarchical Model (EHM)

• [data | process, parameters]: For example,

[bird-count observations | true bird counts, data parameters]

• [process | parameters]: For example,

[true bird counts | diffusion and growth processes, process params]

• data parameters and process parameters are assumed fixed but unknown. They are typically estimated based on the marginal distribution,

[data | parameters]

This framework is common in traditional state-space models where one might use an E-M algorithm for parameter estimation.


Inference for Hierarchical Statistical Models

BHM: Use Bayes’ Theorem to derive the posterior distribution,

[process, parameters | data] ∝ Data Model × Process Model × Parameter Model

The normalizing constant is [data].

EHM: Use Bayes’ Theorem to derive the predictive distribution,

[process | data, parameters] ∝ Data Model × Process Model

The normalizing constant is [data | parameters]. The unknown parameters are replaced with estimates.


General Dynamic Spatio-Temporal Model (DSTM)


General DSTM (Data Models)

The general DSTM data model typically makes the same assumption as in generalized linear mixed models (GLMMs): conditioned on the mean response, the observations are independent. This is a dramatic simplification in the case of non-Gaussian likelihoods.

In the context of DSTMs, conditioned on the spatio-temporal process, the observations are assumed to be independent. The focus is then on modeling this latent spatio-temporal process.


General DSTM (Data Models)

In most cases, a transformation of the underlying latent spatio-temporal process is assumed to be conditionally Gaussian; this is then where we put our modeling effort.

E.g., one could imagine this corresponding to the underlying intensity of a spatio-temporal (log-Gaussian Cox) point process or the logit of the probability of presence in an occupancy model.

We consider this conditional Gaussian latent process approach in this tutorial.

NOTE: although this GLMM perspective is quite general and effective, there are some alternative approaches (e.g., spatio-temporal auto-logistic models (Zheng and Zhu, 2008); spatio-temporal stochastic agent-based models (Hooten and Wikle, 2010), etc.).


Statistical DSTMs: Process Modeling

Spatio-temporal dynamics are due to the interaction of the process across space and time and/or across scales of variability

• Some types of interaction make sense for some processes, and some don’t (e.g., process knowledge should not be ignored if available)

• Statisticians have often ignored such knowledge!

Dimensionality can prevent the (efficient) estimation of model parameters, e.g., M(·) or M

• Requires sensible science-based parameterizations and/or dimension reduction; sparse structures

• Hierarchical representations can help here as well


First-Order Linear DSTM Process Revisited


First-Order Linear DSTM Process

Linear spatio-temporal processes often exhibit advective and diffusive behavior:

• The “width” (decay rate) of the transition operator neighborhood controls the rate of spread (diffusion).

• The degree of “asymmetry” in the transition operator controls the speed and direction of propagation (advection).

• “Long range dependence” can be accommodated by “multimodal” operators and/or heavy tails.

This suggests ways that we might parameterize the transition operator and/or induce sparse structure.
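
To make the width/asymmetry idea concrete, here is a minimal sketch on a 1-D grid. The Gaussian kernel shape, the row normalization, and the names `transition_matrix`, `width`, and `shift` are all illustrative assumptions, not the parameterization used in any of the cited papers.

```python
import numpy as np

def transition_matrix(locs, width, shift):
    """Hypothetical kernel-based transition operator on a 1-D grid.

    Row i holds Gaussian weights over source locations, centered at
    locs[i] + shift. `width` sets the spread (diffusion); a nonzero
    `shift` makes the kernel asymmetric about locs[i] (advection).
    """
    centers = locs[:, None] + shift            # kernel center for each target
    d = locs[None, :] - centers                # offset of each source location
    M = np.exp(-0.5 * (d / width) ** 2)
    return M / M.sum(axis=1, keepdims=True)    # assumption: rows sum to one

locs = np.arange(50.0)
y0 = np.zeros(50)
y0[25] = 1.0                                   # unit pulse at location 25

y_diff = transition_matrix(locs, width=2.0, shift=0.0) @ y0   # spreads in place
y_wide = transition_matrix(locs, width=5.0, shift=0.0) @ y0   # spreads faster
y_adv = transition_matrix(locs, width=2.0, shift=-3.0) @ y0   # spreads and moves

print(np.argmax(y_diff), np.argmax(y_adv))
```

With a symmetric kernel the pulse stays centered and only spreads; shifting the kernel center by -3 moves the peak three grid cells toward larger locations in one step. Only two numbers (`width`, `shift`) control an otherwise n × n operator, which is the point of such parameterizations.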


Basic Hierarchical Linear DSTM

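Data:

$$\mathbf{Z}_t = \mathbf{H}_t(\boldsymbol{\theta}_{h,1})\,\mathbf{Y}_t + \boldsymbol{\varepsilon}_t, \quad \boldsymbol{\varepsilon}_t \sim \mathrm{Gau}(\mathbf{0}, \mathbf{R}(\boldsymbol{\theta}_{h,2}))$$

Process:

$$\mathbf{Y}_t = \mathbf{M}(\boldsymbol{\theta}_{m,1})\,\mathbf{Y}_{t-1} + \boldsymbol{\eta}_t, \quad \boldsymbol{\eta}_t \sim \mathrm{Gau}(\mathbf{0}, \mathbf{Q}(\boldsymbol{\theta}_{m,2}))$$

Parameters:

$$\boldsymbol{\theta}_{h,1},\ \boldsymbol{\theta}_{h,2},\ \boldsymbol{\theta}_{m,1},\ \boldsymbol{\theta}_{m,2}$$

These parameters may be estimated empirically, or they can be given prior distributions, such as Gaussian random process priors (that may depend on other variables), and they can easily be allowed to vary with time and/or space so as to borrow strength.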
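
As a rough illustration of the data and process levels, the sketch below simulates a small linear DSTM with fixed (known) parameters and runs a standard Kalman filter, which gives the mean of [process | data, parameters] in the linear Gaussian case. The particular M, H, Q, R values, the dimensions, and the filter initialization are made-up assumptions chosen only so the example runs.

```python
import numpy as np

rng = np.random.default_rng(0)
n, T = 20, 200

# Assumed, fixed parameters (a hypothetical nearest-neighbor propagator)
M = 0.6 * np.eye(n) + 0.15 * (np.eye(n, k=1) + np.eye(n, k=-1))
H = np.eye(n)                 # observe every location directly
Q = 0.1 * np.eye(n)           # process-noise covariance
R = 1.0 * np.eye(n)           # measurement-error covariance

# Simulate process Y_t = M Y_{t-1} + eta_t and data Z_t = H Y_t + eps_t
Y = np.zeros((T, n))
Z = np.zeros((T, n))
y = np.zeros(n)
for t in range(T):
    y = M @ y + rng.multivariate_normal(np.zeros(n), Q)
    Y[t] = y
    Z[t] = H @ y + rng.multivariate_normal(np.zeros(n), R)

# Kalman filter: sequentially compute E[Y_t | Z_1, ..., Z_t]
m, P = np.zeros(n), np.eye(n)         # assumed initial mean and covariance
Y_filt = np.zeros((T, n))
for t in range(T):
    m_pred = M @ m                    # forecast mean
    P_pred = M @ P @ M.T + Q          # forecast covariance
    S = H @ P_pred @ H.T + R          # innovation covariance
    K = P_pred @ H.T @ np.linalg.inv(S)
    m = m_pred + K @ (Z[t] - H @ m_pred)
    P = (np.eye(n) - K @ H) @ P_pred
    Y_filt[t] = m

rmse_obs = np.sqrt(np.mean((Z - Y) ** 2))
rmse_filt = np.sqrt(np.mean((Y_filt - Y) ** 2))
print(rmse_obs, rmse_filt)
```

The filtered estimate should track the latent process more closely than the raw observations do, because it borrows strength across time and space through M. In a BHM the parameters would instead receive prior distributions, and the filter would sit inside an MCMC or similar algorithm.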


Radar Nowcasting Motivation! (September 25, 2010)

Can we predict in near real-time when this storm will arrive?


Example: Radar Nowcasting (Sydney, pre 2000 Olympics)

Statistical model motivated by an IDE (linear advection-diffusion) process with spatially varying parameters (Xu, Wikle, and Fox, 2005).

[Figure panels: Data; Posterior Mean: Kernels; Implied Propagation by Post. Params.; Mean; Post. Real.; Std. Dev.; Forec.]


Low Rank (Spectral) Reduction on Dynamics

It can be useful to parameterize the science-based dynamical spatio-temporal process in terms of a reduced-rank basis function (spectral) expansion:

$$\mathbf{Y}_t = \boldsymbol{\mu} + \boldsymbol{\Phi}\boldsymbol{\alpha}_t + \boldsymbol{\Psi}\boldsymbol{\beta}_t$$

$$\boldsymbol{\alpha}_t = \mathbf{M}_\alpha \boldsymbol{\alpha}_{t-1} + \boldsymbol{\eta}_{\alpha,t},$$

where $\boldsymbol{\Phi}$ is a basis function matrix and $\boldsymbol{\alpha}_t$ are associated expansion coefficients.

In this case, either the dimension of the dynamical process $\boldsymbol{\alpha}_t$ is much lower than $n$, reducing the number of parameters in $\mathbf{M}_\alpha$ and $\mathbf{Q}_\alpha \equiv \mathrm{cov}(\boldsymbol{\eta}_{\alpha,t})$, and/or the reduction acts as a decorrelator, which reduces the complexity of these matrices.

One can still get science-based parameterizations when working in “spectral” space. In particular, many PDEs and IDEs are amenable to spectral and Galerkin-based representations (e.g., see C&W 2011, pp. 396-402).
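
One common way to obtain a reduced-rank basis is through EOFs, i.e., the leading right singular vectors of the centered space-time data matrix. The sketch below uses synthetic data; the patterns, amplitudes, noise level, and the choice p = 2 are assumptions for illustration, and the small-scale term in the expansion is simply left out.

```python
import numpy as np

rng = np.random.default_rng(1)
T, n, p = 100, 60, 2
s = np.linspace(0.0, 1.0, n)

# Synthetic data: two smooth spatial patterns with random amplitudes plus noise
patterns = np.stack([np.sin(2 * np.pi * s), np.cos(2 * np.pi * s)])   # 2 x n
amps = rng.normal(size=(T, 2)) * np.array([3.0, 1.5])
Y = amps @ patterns + 0.1 * rng.normal(size=(T, n))                   # T x n

mu = Y.mean(axis=0)                        # temporal mean at each location
U, sv, Vt = np.linalg.svd(Y - mu, full_matrices=False)

Phi = Vt[:p].T                             # n x p basis matrix (first p EOFs)
alpha = (Y - mu) @ Phi                     # T x p expansion coefficients
Y_hat = mu + alpha @ Phi.T                 # rank-p reconstruction

frac = (sv[:p] ** 2).sum() / (sv ** 2).sum()
print(f"variance captured by {p} EOFs: {frac:.3f}")
```

A dynamical model on `alpha` then needs a p × p propagator rather than an n × n one (here 4 entries instead of 3,600).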


A Word About Spatial Basis Functions

In general: $w_t(\mathbf{s}) \approx \sum_{k=1}^{p} \phi_k(\mathbf{s})\,\alpha_t(k)$, where $\phi_k(\mathbf{s})$ and $\alpha_t(k)$ are the spatial basis functions and associated expansion coefficients, respectively.

Currently, it is very fashionable to consider such expansions in spatial statistics for “big data”

Many choices: e.g., orthogonal polynomials, wavelets, splines, Wendland, Galerkin, empirical orthogonal functions (EOFs), discrete kernel convolutions, “factor” loadings, “predictive processes”, Moran’s I bases, etc.

Basis Function Decisions:

• Fixed or “estimated” (parameterized);
• reduced rank (p << n), complete (p = n), or overcomplete (p > n);
• expansion coefficients in physical space or “spectral” space;
• discrete or continuous space


A Word About Basis Functions for Spatial Processes (cont.)

There is very little guidance on which bases to select!

• people have their favorites
• in most spatial cases, it probably doesn’t matter much!

For linear dynamical processes, it can matter: Why?

• if the bases are estimated, there is potential confounding between the dynamics on the coefficients and the bases

• dimensionality of the coefficients may impact the ability to estimate parameters in the dynamic model without additional information

For nonlinear spatio-temporal dynamics, the choice of basis function is even more critical.

• One must account for scale interaction


Nonlinear Spatio-Temporal Processes

Few environmental/ecological processes are linear (e.g., density-dependent growth, nonlinear advection, repulsion, shock waves, infection, predation, etc.)

Nonlinear dynamical behavior arises from the complicated interactions across spatio-temporal scales of variability and interactions across multiple processes!

Examples abound in mechanistic and process models across many disciplines


Nonlinear Spatio-Temporal Processes: Examples


Nonlinear Spatio-Temporal Processes: Examples (cont.)


Nonlinear Spatio-Temporal Processes: Examples (cont.)


Commonality?

What do all of these processes have in common?

Quadratic nonlinearity!

This suggests a class of useful statistical models for nonlinear DSTM processes:

General Quadratic Nonlinearity (GQN)


General Quadratic Nonlinearity (GQN) (Wikle and Hooten, 2010)
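
For reference, a sketch of the GQN form as it is commonly written. The symbols $a_{ij}$, $b_{i,kl}$, and $g(\cdot\,; \boldsymbol{\theta}_g)$ are assumed notation here and may differ from the original slide:

$$Y_t(\mathbf{s}_i) = \sum_{j=1}^{n} a_{ij}\, Y_{t-1}(\mathbf{s}_j) + \sum_{k=1}^{n} \sum_{l=1}^{n} b_{i,kl}\, Y_{t-1}(\mathbf{s}_k)\, g\big(Y_{t-1}(\mathbf{s}_l); \boldsymbol{\theta}_g\big) + \eta_t(\mathbf{s}_i), \quad i = 1, \ldots, n.$$

The first sum is the linear part; the double sum collects pairwise (quadratic) interactions, with $g(\cdot)$ a possibly nonlinear transformation of the process. Written this way there are $O(n^2)$ linear coefficients and $O(n^3)$ interaction coefficients, which is why further structure is needed.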


GQN (cont.)

Even with Gaussian conditional noise, $\eta_t(\mathbf{s})$, the joint distribution of $\{Y_t(\mathbf{s}_i) : i = 1, \ldots, n\}$ is not, in general, Gaussian.

Major Problem: There are too many parameters to estimate in typical spatio-temporal applications without extra information!

As with the linear DSTM, we can consider:

• mechanistically-motivated parameterizations,
• reduced-rank spectral representations,
• shrinkage priors

Consider the following illustrative example.


Mechanistically-Motivated Example: ECD Invasive Species Revisited

Invasive Species: phases of a successful invasion

1. Introduction
2. Establishment
3. Range Expansion
4. Saturation

Population Growth Model

Spatial Movement Model

Example: Eurasian Collared Dove (Streptopelia decaocto)

• Invaded Europe in the 1930s
• Introduced to S. Florida in mid-1980s
• Data collected through N. American Breeding Bird Survey (BBS)
• We considered gridded average counts


North American Breeding Bird Survey (BBS): ECD


Data Model

$$Z_t(\mathbf{s}_i) \mid Y_t(\mathbf{s}_i), \theta \ \sim\ \text{ind. } \mathrm{Bin}(Y_t(\mathbf{s}_i), \theta), \quad i = 1, \ldots, m, \quad t = 1, \ldots, T.$$

• $Z_t(\mathbf{s}_i)$ is the observed ECD count for route $i$ and year $t$; $Y_t(\mathbf{s}_i)$ is the true but unknown abundance of ECDs at location $\mathbf{s}_i$ in year $t$.

• Since we only observed a subset of the number of doves present at a given site/time, $\theta$ represents the probability of detecting an ECD when it is there.

• In general, $Y_t(\mathbf{s}_i)$ and $\theta$ are not both identifiable without additional information (e.g., multiple surveys, capture-recapture, etc.).

• We were able to use strong prior information about $\theta$ from a detailed study on a related species (Mourning Dove) that was thought to share similar detectability characteristics.
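
A quick simulation shows why abundance and detection are confounded. It assumes Poisson abundance with a single intensity (as in the process model of this example) and uses made-up values; `simulate_counts` is a hypothetical helper.

```python
import numpy as np

rng = np.random.default_rng(2)

def simulate_counts(lam, theta, size):
    """Draw true abundance Y ~ Poi(lam), then observed count Z | Y ~ Bin(Y, theta)."""
    Y = rng.poisson(lam, size=size)
    return rng.binomial(Y, theta)

# Two different (intensity, detection) pairs with the same product lam * theta
z_a = simulate_counts(lam=40.0, theta=0.50, size=200_000)
z_b = simulate_counts(lam=80.0, theta=0.25, size=200_000)

print(z_a.mean(), z_a.var())
print(z_b.mean(), z_b.var())
```

Under these assumptions the observed counts are marginally Poisson with mean `lam * theta`, so the two settings produce data with the same distribution. Only outside information about θ, such as the informative prior mentioned above, separates them.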


Process Model

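True ECD population (i.e., abundance): $Y_t(\mathbf{s}_i)$

Defining $\mathbf{Y}_t \equiv (Y_t(\mathbf{s}_1), \ldots, Y_t(\mathbf{s}_m))'$ (at $m$ observation locations), we assume that

$$\mathbf{Y}_t \mid \boldsymbol{\lambda}_t \ \sim\ \text{ind. } \mathrm{Poi}(\mathbf{H}\boldsymbol{\lambda}_t), \quad t = 1, 2, \ldots,$$

where the $n$-dimensional vector,

$$\boldsymbol{\lambda}_t \equiv (\lambda_t(\mathbf{s}_1), \ldots, \lambda_t(\mathbf{s}_m), \lambda_t(\mathbf{s}_{m+1}), \ldots, \lambda_t(\mathbf{s}_n))'$$

corresponds to the “true intensity” at observation locations and the additional prediction locations $\{\mathbf{s}_{m+1}, \ldots, \mathbf{s}_n\}$.

The matrix $\mathbf{H}$ is an incidence matrix that relates the true process $Y_t(\cdot)$ at observation locations to the intensity process $\boldsymbol{\lambda}_t$ at all locations of interest.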


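Process Model (cont.)

This is motivated by a “matrix model” framework from population dynamics with random parameters:

$$\boldsymbol{\lambda}_t = \mathbf{M}\boldsymbol{\lambda}_{t-1} = \mathbf{B}(\boldsymbol{\tau})\,\mathbf{G}(\boldsymbol{\lambda}_{t-1}; \boldsymbol{\theta}_G)\,\boldsymbol{\lambda}_{t-1}, \quad t = 2, 3, \ldots,$$

where the propagator matrix $\mathbf{M}$ is the product of two distinct $n \times n$ matrices.

• $\mathbf{G}(\boldsymbol{\lambda}_{t-1}; \boldsymbol{\theta}_G)$ is a diagonal matrix that accommodates growth over time and is dependent on the previous state $\boldsymbol{\lambda}_{t-1}$ and (Ricker) growth parameters $\boldsymbol{\theta}_G$:

$$G_{ii}(\lambda_{t-1}(\mathbf{s}_i); \theta_{G,1}, \theta_{G,2}) \equiv \exp\left\{ \theta_{G,1} \left( 1 - \frac{\lambda_{t-1}(\mathbf{s}_i)}{\theta_{G,2}} \right) \right\}, \quad i = 1, \ldots, n,$$

where $\theta_{G,1}$ and $\theta_{G,2}$ are the growth and carrying-capacity parameters, respectively.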
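
The Ricker growth factor is easy to evaluate directly. In this small sketch the parameter values and the function name `ricker_growth` are illustrative assumptions only.

```python
import numpy as np

def ricker_growth(lam_prev, theta_g1, theta_g2):
    """Diagonal of G: exp{theta_g1 * (1 - lam_prev / theta_g2)} for each location."""
    return np.exp(theta_g1 * (1.0 - lam_prev / theta_g2))

lam_prev = np.array([1.0, 50.0, 100.0, 150.0])     # intensities at four locations
g = ricker_growth(lam_prev, theta_g1=0.5, theta_g2=100.0)
lam_grown = g * lam_prev                           # same as np.diag(g) @ lam_prev

print(g)
```

The factor exceeds one below the carrying capacity (growth), equals one at the carrying capacity, and falls below one above it (decline). Because the factor depends on the previous state, the propagator as a whole is nonlinear in the intensity.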


Process Model (cont.)

Recall,

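$$\boldsymbol{\lambda}_t = \mathbf{B}(\boldsymbol{\tau})\,\mathbf{G}(\boldsymbol{\lambda}_{t-1}; \boldsymbol{\theta}_G)\,\boldsymbol{\lambda}_{t-1}, \quad t = 2, 3, \ldots.$$

• $\mathbf{B}(\boldsymbol{\tau})$ accommodates dispersal of the population and is dependent on dispersal parameters $\boldsymbol{\tau}$ in a Gaussian dispersal kernel. The $(i, j)$-th element of this matrix is given by:

$$B_{ij}(\boldsymbol{\tau}) \propto \exp\left\{ -\frac{d_{ij}^2}{\tau(\mathbf{s}_i)} \right\},$$

where $d_{ij}$ is the distance between locations $\mathbf{s}_i$ and $\mathbf{s}_j$, and $\boldsymbol{\tau} \equiv (\tau(\mathbf{s}_1), \ldots, \tau(\mathbf{s}_n))'$ are spatially varying dispersal coefficients.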
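
Putting growth and dispersal together, one step of the propagator can be sketched as follows. The slides give the kernel only up to proportionality, so the row normalization here is an assumption, as are the grid, the parameter values, and the helper names `dispersal_matrix` and `propagate`.

```python
import numpy as np

def dispersal_matrix(coords, tau):
    """B_ij proportional to exp(-d_ij^2 / tau(s_i)); rows normalized (an assumption)."""
    diff = coords[:, None, :] - coords[None, :, :]
    d2 = (diff ** 2).sum(axis=-1)                  # squared distances d_ij^2
    B = np.exp(-d2 / tau[:, None])                 # row i uses tau(s_i)
    return B / B.sum(axis=1, keepdims=True)

def propagate(lam_prev, B, theta_g1, theta_g2):
    """One step: lambda_t = B G(lambda_{t-1}) lambda_{t-1} with Ricker growth."""
    g = np.exp(theta_g1 * (1.0 - lam_prev / theta_g2))
    return B @ (g * lam_prev)

# Hypothetical 6 x 6 grid with a constant dispersal coefficient
xx, yy = np.meshgrid(np.arange(6.0), np.arange(6.0))
coords = np.column_stack([xx.ravel(), yy.ravel()])
tau = np.full(len(coords), 1.0)
B = dispersal_matrix(coords, tau)

lam = np.zeros(len(coords))
lam[0] = 10.0                                      # introduction at one corner
lam_next = propagate(lam, B, theta_g1=0.5, theta_g2=100.0)

print(lam_next.reshape(6, 6).round(3))
```

After one step the intensity has spread from the introduction cell to its neighbors. Letting `tau` vary by location changes the local rate of spread, which is what the spatial random field prior on log(τ) allows.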


Process Model (cont.)

Why is this GQN?

For i = 1, . . . , n:

$$\lambda_t(\mathbf{s}_i) = \sum_{j=1}^{n} B_{ij}(\boldsymbol{\tau})\, \lambda_{t-1}(\mathbf{s}_j) \exp\left\{ \theta_{G,1} \left( 1 - \frac{\lambda_{t-1}(\mathbf{s}_j)}{\theta_{G,2}} \right) \right\}.$$

But, notice that most of the interaction parameters are pre-specified to be zero (thus, there are only $O(n^2)$ parameters) and these non-zero parameters are highly parameterized in terms of the $\boldsymbol{\tau}$ spatial process and the growth parameters. That is, there are just a few controlling parameters, so the effective number of parameters is much less than $O(n^2)$.


Parameter Models

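$$\theta \sim \mathrm{Beta}(a_\theta, b_\theta)$$

$$\log(\boldsymbol{\tau}) \sim \mathrm{Gau}(\mathbf{0}, \boldsymbol{\Sigma}_\tau) \quad [\text{spatial random field}]$$

$$\log(\boldsymbol{\lambda}_1) \sim \mathrm{Gau}(\mathbf{0}, \boldsymbol{\Sigma}_\lambda) \quad [\text{spatial random field}]$$

$$\theta_{G,1} \sim \mathrm{Gau}(\mu_1, \sigma_1^2)$$

$$\theta_{G,2} \sim \mathrm{IG}(a_2, b_2)$$

As detailed in Hooten et al. (2007), considerable care went into the choice of the hyperparameters for these distributions, including information from previous studies and expert knowledge.

Note that in this model, the randomness in the dynamics comes from the initial condition ($\boldsymbol{\lambda}_1$) and the parameters in the evolution matrices, $\boldsymbol{\tau}$ and $\boldsymbol{\theta}_G$.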


Implementation

• Estimation and forecasting were implemented on a grid of points across the eastern two-thirds of the USA

• The Markov chain Monte Carlo (MCMC) algorithm was a combination of Gibbs and Metropolis-Hastings steps

• MCMC: 200,000 samples with 20,000 burn-in

• Data were available from 1986-2003 and were used for estimation.

• “Out-of-sample” forecasts were made for 2004-2020, based on the 1986-2003 data.


Posterior Means: In-Sample


Posterior Means: Out-of-Sample


Posterior Credible Intervals (95%)

Two Locations: S. Florida, N. Utah


Posterior Inference on Dispersion Parameters

Dispersion parameters (τ): Posterior mean (top), Posterior standard deviation (bottom)


Epilogue to ECD Analysis

• This analysis was conducted in the mid-2000s, based on data from 1986 through 2003.

• As of 2013, surveys show that ECD sightings are now relatively frequent throughout the continental US, with the exception of the northeast.

• The forecast made in our analysis was reasonable, if somewhat conservative in the speed of the invasion.

• Presumably, the forecast could have been improved if ecologically relevant covariates were used in the model for dispersal parameters.


Reduced Rank QN, Shrinkage, and Informative Priors

The important dynamics of many ecological and environmental systems exist on a lower-dimensional manifold. This suggests that reduced rank dynamics may be appropriate (i.e., the state dimension is p << n). Typically, there are still too many transition parameters to estimate without further restriction.

• Mechanistically-motivated choices can be made to reduce the parameter space (e.g., certain nonlinear scale interactions are less likely for some processes; e.g., Gladish and Wikle, 2014)

• Use shrinkage priors on transition parameters (e.g., stochastic search variable selection; e.g., Wikle and Holan, 2011)

• Use priors derived from mechanistic model output (e.g., Leeds et al. 2013)

Consider the following example illustrating the latter.


Ocean Biogeophysical Coupling

NASA AMSR: Sea Surface Temperature (scientific visualization studio)

NASA SEAWIFS: Ocean Color (scientific visualization studio). Proxy for ocean primary production (i.e., chlorophyll/phytoplankton)

Complicated multivariate process with variability across many spatio-temporal scales. Interactions:
• Between processes
• Within processes


Ocean Color Observations

Coastal Gulf of Alaska SeaWiFS Ocean Color Satellite Observations (8 day averages)

“Gappy” and substantial measurement uncertainty! We seek to predict at missing locations and filter obs error.

(ocean color: surrogate for phytoplankton)


Complicated Model Components

• Lower Trophic Ecosystem: essentially a complicated multicomponent predator-prey system influenced by the environment (highly nonlinear)

• Physical Ocean: Navier-Stokes fluid dynamic process across multiple state variables (highly nonlinear)

Coupled! (nonlinear)

The process of interest is multivariate, nonlinear and spatio-temporal.

[Figure: Sea Surface Height and Currents]


Physical-Biological Interface

Sample output from a coupled ocean-ecology model (a deterministic model) in the coastal Gulf of Alaska for May 1, 2001 (Fiechter et al. 2008)

[Figure panels: Sea surface height (SSH) and currents; Chlorophyll concentration and bathymetry]

Strong association!

Note: we can learn about the biology by knowing something about the physics!


EXAMPLE: Spatio-temporal prediction of primary production (chlorophyll) in the Coastal Gulf of Alaska (GOGA)

• Data Assimilation: Combine primary production data and mechanistic computer model for a coupled ocean and ecosystem model (ROMS-NPZDFe; Fiechter et al. 2009)

• Surrogate: quadratic nonlinear emulator for coupled model: Phytoplankton, SSH (sea surface height), and SST (sea surface temperature) model output

• Predict/Assimilate: Primary production given high-dimensional ocean color (SeaWiFS) satellite data and ocean model physical output

• Train based on 4 years (1998-2001), 8 day averages

• Predict/Assimilate for 2002

[Figure: Gulf of Alaska]


Coupled Dynamics: Example from Coupled Ocean Model

[Figure panels: Phytoplankton; SSH; SST. Example training data. Time: consecutive 8-day periods]


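Hierarchical Reduced-Rank GQN Emulator-Assisted DSTM

$$\mathbf{Z}_t = \mathbf{H}_t\boldsymbol{\Phi}\boldsymbol{\alpha}_t + \mathbf{H}_t\boldsymbol{\Theta}\boldsymbol{\beta}_t + \boldsymbol{\epsilon}_t, \quad \boldsymbol{\epsilon}_t \sim \mathrm{Gau}(\mathbf{0}, \mathbf{R}_t)$$

$$\mathbf{Y}_t = \boldsymbol{\Phi}\boldsymbol{\alpha}_t + \boldsymbol{\Theta}\boldsymbol{\beta}_t$$

$$\boldsymbol{\alpha}_t = m(\boldsymbol{\alpha}_{t-1}; \boldsymbol{\theta}) + \boldsymbol{\eta}_t, \quad \boldsymbol{\eta}_t \sim \mathrm{Gau}(\mathbf{0}, \mathbf{Q})$$

$$\boldsymbol{\beta}_t \sim \mathrm{Gau}(\mathbf{0}, \mathrm{diag}(\boldsymbol{\tau}))$$

$$\boldsymbol{\theta} \sim (\text{emulator-based mean},\ \boldsymbol{\Sigma}_\theta)$$

$$[\mathbf{R}_t, \mathbf{Q}, \boldsymbol{\tau}]$$

Note: $m(\cdot)$ is a quadratic nonlinear model (Leeds et al. 2013)

KEY POINT: means are from a GQN parametric statistical emulator; estimated “off-line”; or, alternatively, these can inform priors on a SSVS prior probability of inclusion.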


Ocean Ecosystem Example

• In this case (work in log space for Chlorophyll):

Data vector:

$$\mathbf{Z}_t = \begin{pmatrix} \mathbf{Z}_{1,t} \\ \mathbf{Z}_{2,t} \\ \mathbf{Z}_{3,t} \end{pmatrix}, \quad m_{i,t}\ (i = 1, 2, 3)\text{-dimensional data vectors for Chlorophyll, SSH, SST}$$

Process vector:

$$\mathbf{Y}_t = \begin{pmatrix} \mathbf{Y}_{1,t} \\ \mathbf{Y}_{2,t} \\ \mathbf{Y}_{3,t} \end{pmatrix}, \quad p_i\ (i = 1, 2, 3)\text{-dimensional reduced rank process vectors for Chlorophyll, SSH, SST}$$

• State Rank Reduction: $O(10^5)$ to $O(10)$ (EOFs from the coupled ROMS-NPZDFe output for 1998-2001; 7 EOFs; 97.5% of the variation)

• Nonlinear surrogate: quadratic nonlinear model; the non-dynamic small-scale components were based on the next 10 singular vectors (over 99% of variation in model output)


Results: log(CHL)

[Figure panels: Data (SeaWiFS); Posterior Mean; Posterior STD]


Results (cont.)


Conclusion

There is much work to be done in the development of spatio-temporal statistical models, from a theory, computation, and application perspective. Some important things I didn’t talk about here (or only mentioned briefly); most of these are very active areas of research:

• Sampling models
• Computation
• Multivariate models
• Areal data
• Spatio-temporal point processes
• Change-of-support
• Model evaluation and “selection”
• Agent (Individual)-based models
• Gravity models, flow models, functional models
• Sampling network design
• Linkage across models
• Spatio-temporal confounding


THANK YOU!

The material in this tutorial was based loosely on the 2011 John Wiley & Sons book by Noel Cressie and Chris Wikle. Most references given on the slides can be found in the book’s bibliography; or, send me an email at ([email protected]) and I will gladly send you the reference.
