by Daniel A. Griffith Ashbel Smith Professor of Geospatial Information Sciences

Date post: 30-Dec-2015
Sugar Cane Production in Puerto Rico, 1958/59-1973/74: A Comparison of Four Model Specifications for Describing Small Heterogeneous Space-Time Datasets
Transcript
Page 1: by Daniel A. Griffith Ashbel Smith Professor of Geospatial Information Sciences

Sugar Cane Production in Puerto Rico, 1958/59-1973/74: A Comparison of Four Model Specifications for Describing Small Heterogeneous Space-Time Datasets

by Daniel A. Griffith

Ashbel Smith Professor of Geospatial Information Sciences

Page 2: by Daniel A. Griffith Ashbel Smith Professor of Geospatial Information Sciences

ABSTRACT

Researchers increasingly are accounting for heterogeneity in their empirical analyses. When data form a short time series (too short to utilize an ARIMA model), a random effect term can be employed to account for serial correlation. When data also are georeferenced, forming a space-time dataset, a random effect term can be included that is spatially structured in order to account for spatial autocorrelation, too. But space-time heterogeneity can be accounted for in various ways, including specifications involving recently developed spatial filtering methodology. This paper summarizes comparisons of four model specifications (simple pooled space-time; sequential, comparative statics; temporally varying coefficients with a spatially unstructured random effect; and temporally varying coefficients with a spatially structured random effect), illustrating implementations with annual sugar cane production data for the 73 municipalities of Puerto Rico during 1958/59-1973/74. Covariates whose importance is assessed include elevation and distance from the primate city.

Page 3: by Daniel A. Griffith Ashbel Smith Professor of Geospatial Information Sciences

Panel data versus space-time data

Panel data are a form of longitudinal data, and can be a cross-section (i.e., the spatial dimension) of individuals (e.g., farms) that are surveyed periodically over a given time horizon.

With repeated observations of the same individuals, panel data permit a researcher to study the dynamics of change with short time series.

A main advantage of panel data: controlling for unobserved heterogeneity (the fundamental complication of non-experimental data collection)

BUT longitudinal data need not involve the same individuals: if a sample is not the same, observed changes also may result from sampling error

Page 4: by Daniel A. Griffith Ashbel Smith Professor of Geospatial Information Sciences

Spatial filtering

A given random variable can be decomposed into a spatial component and an aspatial component: impulse-response function approach (based upon the autoregressive model), Getis approach (based on the K function), eigenfunction spatial filtering approach.

The spatial component relates to spatial autocorrelation

Page 5: by Daniel A. Griffith Ashbel Smith Professor of Geospatial Information Sciences

High Peak district biomass index: ratio of remotely sensed data spectral bands B3 and B4

[Figure: two map panels, "Spatially autocorrelated" and "Geographically random"]

Page 6: by Daniel A. Griffith Ashbel Smith Professor of Geospatial Information Sciences

Defining spatial autocorrelation

Auto: self

Correlation: degree of relative correspondence

Positive: similar values cluster together on a map

Negative: dissimilar values cluster together on a map

Page 7: by Daniel A. Griffith Ashbel Smith Professor of Geospatial Information Sciences

Spatial auto-correlation

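From r to MC:

$$r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})/n}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2/n}\;\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2/n}}$$

$$MC = \frac{\sum_{i=1}^{n}\sum_{j=1}^{n} c_{ij}(y_i - \bar{y})(y_j - \bar{y}) \Big/ \sum_{i=1}^{n}\sum_{j=1}^{n} c_{ij}}{\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2/n}\;\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2/n}}$$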

Page 8: by Daniel A. Griffith Ashbel Smith Professor of Geospatial Information Sciences

Constructing eigenfunctions for filtering spatial autocorrelation out of georeferenced variables:

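$$MC = \frac{n}{\mathbf{1}^T C \mathbf{1}} \times \frac{Y^T (I - \mathbf{1}\mathbf{1}^T/n)\, C\, (I - \mathbf{1}\mathbf{1}^T/n)\, Y}{Y^T (I - \mathbf{1}\mathbf{1}^T/n)\, Y}$$

The eigenfunctions come from

$$(I - \mathbf{1}\mathbf{1}^T/n)\, C\, (I - \mathbf{1}\mathbf{1}^T/n)$$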

Page 9: by Daniel A. Griffith Ashbel Smith Professor of Geospatial Information Sciences

Eigenvectors for spatial filter construction

The first eigenvector, say E1, is the set of real numbers that has the largest MC achievable by any set for the spatial arrangement defined by the geographic connectivity matrix C. The second eigenvector is the set of values that has the largest achievable MC by any set that is uncorrelated with E1. The third eigenvector is the third such set of values, and so on. This sequential construction of eigenvectors continues through En, the set of values that has the largest negative MC achievable by any set that is uncorrelated with the preceding (n-1) eigenvectors.
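The eigenvector construction can be illustrated with a minimal Python sketch. It assumes a hypothetical 4-by-5 regular grid with rook adjacency in place of the Puerto Rico municipality map, and the function names (`rook_adjacency`, `moran_coefficient`, `filter_eigenvectors`) are illustrative only, not taken from the presentation.

```python
import numpy as np

def rook_adjacency(rows, cols):
    """Binary connectivity matrix C for a regular grid (assumed toy surface)."""
    n = rows * cols
    C = np.zeros((n, n))
    for r in range(rows):
        for c in range(cols):
            i = r * cols + c
            if c + 1 < cols:                      # neighbor to the right
                C[i, i + 1] = C[i + 1, i] = 1.0
            if r + 1 < rows:                      # neighbor below
                C[i, i + cols] = C[i + cols, i] = 1.0
    return C

def moran_coefficient(y, C):
    """MC = (n / 1'C1) * z'Cz / z'z, with z the mean-centered values."""
    y = np.asarray(y, dtype=float)
    z = y - y.mean()
    return (len(y) / C.sum()) * (z @ C @ z) / (z @ z)

def filter_eigenvectors(C):
    """Eigen-decompose (I - 11'/n) C (I - 11'/n), largest eigenvalue first."""
    n = C.shape[0]
    M = np.eye(n) - np.ones((n, n)) / n
    vals, vecs = np.linalg.eigh(M @ C @ M)
    order = np.argsort(vals)[::-1]
    return vals[order], vecs[:, order]

C = rook_adjacency(4, 5)
n = C.shape[0]
vals, vecs = filter_eigenvectors(C)

mc_first = moran_coefficient(vecs[:, 0], C)    # E1: largest achievable MC
mc_last = moran_coefficient(vecs[:, -1], C)    # En: largest negative MC

# Any other set of values should have an MC no larger than that of E1.
rng = np.random.default_rng(0)
y_random = rng.normal(size=n)
mc_random = moran_coefficient(y_random, C)
```

In this sketch the MC of each eigenvector with a nonzero eigenvalue equals n/(1'C1) times that eigenvalue, which is why sorting the eigenvalues also sorts the achievable MC values.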

Page 10: by Daniel A. Griffith Ashbel Smith Professor of Geospatial Information Sciences

Useful citation

Page 11: by Daniel A. Griffith Ashbel Smith Professor of Geospatial Information Sciences

Random effects model

$$Y = f(X\beta,\ \xi + \varepsilon)$$

$\xi$ is a random observation effect (differences among individual observational units)

$\varepsilon$ is a time-varying residual error (links to change over time)

The composite error term is the sum of the two.

Page 12: by Daniel A. Griffith Ashbel Smith Professor of Geospatial Information Sciences

Random effects model: normally distributed intercept term

• $\xi \sim N(0, \sigma_\xi^2)$ and uncorrelated with covariates

• supports inference beyond the nonrandom sample analyzed

• simplest is where intercept is allowed to vary across areal units (repeated observations are individual time series)

• The random effect variable is integrated out (with numerical methods) of the likelihood function

• accounts for missing variables & within unit correlation (commonality across time periods)
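Integrating the random effect out of the likelihood can be sketched numerically. The Python example below assumes a binomial model with a logit link and a normally distributed random intercept, and it uses Gauss-Hermite quadrature as one possible numerical method; the presentation does not state which method was used, and the function names and data values are hypothetical.

```python
import math
import numpy as np

def binom_pmf(y, n, p):
    """Binomial probability of y successes in n trials."""
    return math.comb(n, y) * p**y * (1.0 - p)**(n - y)

def marginal_likelihood(y, n, eta, sigma, nodes=30):
    """Likelihood of one areal unit's short series with xi ~ N(0, sigma^2) integrated out.

    y, n : counts and trial totals, one entry per time period
    eta  : fixed-effects linear predictor for each period (logit scale)
    """
    x, w = np.polynomial.hermite.hermgauss(nodes)   # nodes/weights for exp(-x^2)
    total = 0.0
    for xk, wk in zip(x, w):
        xi = math.sqrt(2.0) * sigma * xk            # change of variable to N(0, sigma^2)
        cond = 1.0
        for yt, nt, et in zip(y, n, eta):
            p = 1.0 / (1.0 + math.exp(-(et + xi)))  # the shared xi links the periods
            cond *= binom_pmf(yt, nt, p)
        total += wk * cond
    return total / math.sqrt(math.pi)

# Hypothetical three-period series for a single areal unit.
y_obs, n_obs, eta_obs = [3, 4, 2], [10, 10, 10], [-0.5, -0.4, -0.6]
lik = marginal_likelihood(y_obs, n_obs, eta_obs, sigma=1.2)
```

Because the same draw of the random intercept enters every period of a unit's series, the marginal likelihood no longer factors across time periods, which is how the term captures within-unit correlation.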

Page 13: by Daniel A. Griffith Ashbel Smith Professor of Geospatial Information Sciences

Sugar cane production in Puerto Rico

• Began in the 1530s

• Experienced a sharp decline during 1580-1650

• Introduction of slave labor resulted in considerable expansion during 1765-1823

• By 1828, sugar exports were sizeable

• Spanish monarchy discouraged expansion throughout much of the 1800s

• United States took possession of the island in 1899, fully developing the long-demanded railroad on the island and channeling considerable investment into sugar cane production, achieving maximum expansion in the 1920s

• Production peaked around 1950

Page 14: by Daniel A. Griffith Ashbel Smith Professor of Geospatial Information Sciences

Island-wide time series

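[Equation: fitted island-wide time series model for LN(tons/1000), with an indicator variable I and lagged terms at t-1 and t-2; the extracted coefficient values are 7.23, 1.68, 0.00981, 0.84940, 0.15060, and 0.81524]

[Figure: island-wide time series plot, annotated "US intervention"]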

Page 15: by Daniel A. Griffith Ashbel Smith Professor of Geospatial Information Sciences

1924 sugar cane railroad

Finally started by the Spanish Crown, but aggressively completed by US investors

Page 16: by Daniel A. Griffith Ashbel Smith Professor of Geospatial Information Sciences

Covariates of sugar cane production

[Figure: map panels for elevation and distance from San Juan, with the corresponding covariate spatial filters]

Page 17: by Daniel A. Griffith Ashbel Smith Professor of Geospatial Information Sciences

Model specifications

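I-A: initial

$$\mathrm{LN}\!\left(\frac{p_{i,t}}{100 - p_{i,t}}\right) = \sum_{t=1959}^{1974} \beta_{0,t}\, I_{i,t} + \sum_{t=1959}^{1974} \beta_{elev,t}\, I_{i,t}\, elev_i + \sum_{t=1959}^{1974} \beta_{dist,t}\, I_{i,t}\, d_{i,SJ}$$

I-B: with linear time trend

$$\mathrm{LN}\!\left(\frac{p_{i,t}}{100 - p_{i,t}}\right) = \beta_0 + \beta_t (T - 1958) + \sum_{t=1959}^{1974} \beta_{elev,t}\, I_{i,t}\, elev_i + \sum_{t=1959}^{1974} \beta_{dist,t}\, I_{i,t}\, d_{i,SJ}$$

II: with random effect

$$\mathrm{LN}\!\left(\frac{p_{i,t}}{100 - p_{i,t}}\right) = \beta_0 + \beta_t (T - 1958) + \sum_{t=1959}^{1974} \beta_{elev,t}\, I_{i,t}\, elev_i + \sum_{t=1959}^{1974} \beta_{dist,t}\, I_{i,t}\, d_{i,SJ} + \varepsilon_i$$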

Page 18: by Daniel A. Griffith Ashbel Smith Professor of Geospatial Information Sciences

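III: with spatial filter

$$\mathrm{LN}\!\left(\frac{p_{i,t}}{100 - p_{i,t}}\right) = \beta_0 + \beta_t (T - 1958) + \sum_{t=1959}^{1974} \beta_{elev,t}\, I_{i,t}\, elev_i + \sum_{t=1959}^{1974} \beta_{dist,t}\, I_{i,t}\, d_{i,SJ} + \sum_{j=1}^{18} \sum_{t=1959}^{1974} \beta_{E_j,t}\, I_{i,t}\, e_{i,j}$$

IV: with spatially structured random effect

$$\mathrm{LN}\!\left(\frac{p_{i,t}}{100 - p_{i,t}}\right) = \beta_0 + \beta_t (T - 1958) + \sum_{t=1959}^{1974} \beta_{elev,t}\, I_{i,t}\, elev_i + \sum_{t=1959}^{1974} \beta_{dist,t}\, I_{i,t}\, d_{i,SJ} + \sum_{j=1}^{18} \sum_{t=1959}^{1974} \beta_{E_j,t}\, I_{i,t}\, e_{i,j} + \varepsilon_i$$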
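The temporally varying coefficients in these specifications come from interacting each covariate with year indicator variables. A minimal Python sketch of the resulting design matrix for the initial specification follows; the elevation and distance values are randomly generated placeholders rather than the Puerto Rico data, and the function name `year_interaction_design` is hypothetical.

```python
import numpy as np

def year_interaction_design(n_years, elev, dist):
    """Design matrix with one row per (year, areal unit) pair.

    Columns: n_years year-specific intercepts, then n_years elevation
    coefficients, then n_years distance coefficients.
    """
    n = len(elev)
    X = np.zeros((n * n_years, 3 * n_years))
    for t in range(n_years):
        rows = slice(t * n, (t + 1) * n)   # rows where the year-t indicator equals 1
        X[rows, t] = 1.0                   # year-specific intercept
        X[rows, n_years + t] = elev        # indicator times elevation
        X[rows, 2 * n_years + t] = dist    # indicator times distance from San Juan
    return X

years = np.arange(1959, 1975)              # 16 annual cross-sections
rng = np.random.default_rng(0)
elev = rng.uniform(1.0, 900.0, size=73)    # placeholder values for 73 municipalities
dist = rng.uniform(1.0, 150.0, size=73)    # placeholder values
X = year_interaction_design(len(years), elev, dist)
```

Spatial filter eigenvectors would enter the same way, as additional columns multiplied by the year indicators, at the cost of up to 18 more coefficients per year.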

Page 19: by Daniel A. Griffith Ashbel Smith Professor of Geospatial Information Sciences

Sugar cane production: 1958/59-1973/74

[Figure: four map panels for 1958/59, 1963/64, 1968/69, and 1973/74]

Scale: dark red = high; dark green = low

Page 20: by Daniel A. Griffith Ashbel Smith Professor of Geospatial Information Sciences

Covariates (all years): time-based intercept, mean elevation, distance from San Juan

Year     Deviance  Pseudo-R2  MC for %  Residual MC
1958/59  1565      0.503      0.31968   0.04912
1959/60  1503      0.527      0.33317   0.05521
1960/61  1561      0.540      0.35751   0.06663
1961/62  1543      0.559      0.38669   0.08844
1962/63  1490      0.576      0.41571   0.10887
1963/64  1544      0.579      0.42272   0.10598
1964/65  1467      0.599      0.46101   0.12206
1965/66  1523      0.586      0.48383   0.16313
1966/67  1610      0.571      0.49018   0.18957
1967/68  1601      0.545      0.47420   0.17009
1968/69  1259      0.620      0.53851   0.17194
1969/70  1273      0.574      0.47448   0.13531
1970/71  1149      0.518      0.43049   0.18262
1971/72  1164      0.548      0.43207   0.12463
1972/73  1146      0.477      0.42875   0.19466
1973/74  899       0.566      0.39513   0.04261

Page 21: by Daniel A. Griffith Ashbel Smith Professor of Geospatial Information Sciences

Mixed binomial regression: time varying covariate coefficients, spatially unstructured and structured random effects

       Spatially unstructured            Spatially structured
Year   Deviance  Pseudo-R2  Residual MC  Deviance  Pseudo-R2  Residual MC  Selected vectors
58/59  473       0.881      0.33271      378       0.957      -0.02771     E3, E4, E6, E7, E8, E13, E18
59/60  403       0.906      0.34707      321       0.975      -0.07181     E3, E4, E6, E7, E8, E13, E18
60/61  368       0.938      0.31433      326       0.982      -0.03271     E1, E3, E4, E6, E7, E8, E13, E18
61/62  303       0.961      0.33815      279       0.988      0.03076      E3, E4, E6, E7, E11
62/63  261       0.983      0.19217      252       0.992      0.17739      E4
63/64  281       0.986      0.14692      271       0.993      0.09181      E1, E4
64/65  263       0.984      0.17054      254       0.989      0.07083      E3, E4
65/66  266       0.986      0.22023      254       0.988      0.04146      E3

Page 22: by Daniel A. Griffith Ashbel Smith Professor of Geospatial Information Sciences

       Spatially unstructured            Spatially structured
Year   Deviance  Pseudo-R2  Residual MC  Deviance  Pseudo-R2  Residual MC  Selected vectors
66/67  302       0.977      0.33270      273       0.985      0.09299      E3, E6, E8
67/68  329       0.964      0.28672      290       0.976      0.03851      E1, E3, E4, E6, E8
68/69  320       0.966      0.30690      218       0.981      -0.08747     E1, E3, E4, E5, E6, E8, E12, E13, E14, E16
69/70  310       0.956      0.19651      250       0.976      -0.03816     E1, E2, E3, E4, E6, E8, E11, E16, E18
70/71  339       0.914      0.34359      181       0.979      -0.04857     E1, E3, E4, E6, E7, E8, E11, E15, E18
71/72  384       0.893      0.14420      207       0.965      -0.12290     E1, E2, E3, E4, E5, E6, E8, E9, E10, E11, E12, E16, E17, E18
72/73  427       0.806      0.24568      158       0.964      -0.13529     E1, E2, E3, E4, E6, E8, E9, E10, E11, E12, E13, E16, E17, E18
73/74  347       0.906      0.07071      167       0.945      -0.07292     E1, E2, E3, E4, E6, E8, E9, E10, E11, E12, E18

Page 23: by Daniel A. Griffith Ashbel Smith Professor of Geospatial Information Sciences

Spatial filters for space-time spatially structured random effects

1958/59: MC = 0.77, GR = 0.30

1963/64: MC = 0.93, GR = 0.18

1968/69: MC = 0.86, GR = 0.18

1973/74: MC = 0.94, GR = 0.22

Page 24: by Daniel A. Griffith Ashbel Smith Professor of Geospatial Information Sciences

(normally distributed) random intercept: areal unit specific across all years

Feature                       Spatially unstructured            Added to spatial structure
Sample mean                   -0.00864                          -0.00665
Sample variance               1.63044                           1.63797
Moran Coefficient (MC)        0.08672                           0.08778
Geary Ratio (GR)              1.10196                           1.09907
P(Shapiro-Wilk)               < 0.0001 (4 lower tail outliers)  < 0.0001 (4 lower tail outliers)
Correlations with covariates  (-0.17873, 0.32086)               (-0.17833, 0.32095)
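The two spatial autocorrelation indices reported for the random intercept can be computed as in the Python sketch below. It uses the standard Moran Coefficient and Geary Ratio definitions and a hypothetical chain of four areal units; none of the values come from the Puerto Rico analysis.

```python
import numpy as np

def moran_coefficient(y, C):
    """MC: cross-products of mean-centered neighboring values, scaled by the variance."""
    y = np.asarray(y, dtype=float)
    z = y - y.mean()
    return (len(y) / C.sum()) * (z @ C @ z) / (z @ z)

def geary_ratio(y, C):
    """GR: squared differences between neighboring values, scaled by the variance."""
    y = np.asarray(y, dtype=float)
    z = y - y.mean()
    sq_diff = (y[:, None] - y[None, :]) ** 2
    return ((len(y) - 1) / (2.0 * C.sum())) * np.sum(C * sq_diff) / (z @ z)

# Hypothetical chain of four units: 1-2, 2-3, and 3-4 are neighbors.
C = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
y = [1.0, 2.0, 3.0, 4.0]           # a smooth trend, so similar values are adjacent

mc = moran_coefficient(y, C)       # positive for this trend
gr = geary_ratio(y, C)             # below 1 for this trend
```

The two indices move in opposite directions: an MC near 0 paired with a GR near 1, as in the table, indicates little spatial autocorrelation, whereas positive autocorrelation pushes MC up and GR toward 0.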

Page 25: by Daniel A. Griffith Ashbel Smith Professor of Geospatial Information Sciences

Time series plots: intercept & covariate binomial regression coefficients

[Figure: three panels: intercept, mean elevation, distance]

● simple pooled model
■ comparative static model
♦ model with a spatially unstructured random effect
▲ mixed model with spatially structured random effect

Page 26: by Daniel A. Griffith Ashbel Smith Professor of Geospatial Information Sciences

Time series plots: covariate binomial regression coefficient standard errors

[Figure: two panels: mean elevation, distance]

● simple pooled model
■ comparative static model
♦ model with a spatially unstructured random effect
▲ mixed model with spatially structured random effect

Page 27: by Daniel A. Griffith Ashbel Smith Professor of Geospatial Information Sciences

Residual serial correlation

The random effects estimator approximates the degree of serial correlation (or its importance in the model), and hence allows the computation of corrected estimates.

The 73 residual Durbin-Watson statistics have a range of (0.140, 2.513), with a mean of 0.836 and a standard deviation of 0.546.

Determining significance here is complicated because of the small T, the inclusion of a random effects term, and the varying number of spatial filter (SF) eigenvectors.
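For reference, the Durbin-Watson statistic for a single residual series can be computed as in the following Python sketch. The two residual series are artificial examples of length T = 16, not residuals from the models above.

```python
import numpy as np

def durbin_watson(residuals):
    """DW = sum of squared successive differences / sum of squared residuals."""
    e = np.asarray(residuals, dtype=float)
    return np.sum(np.diff(e) ** 2) / np.sum(e ** 2)

# Artificial series of length T = 16.
alternating = [1.0, -1.0] * 8           # sign flips every period
trending = np.linspace(-1.0, 1.0, 16)   # neighbors in time are nearly equal

dw_alternating = durbin_watson(alternating)   # well above 2: negative serial correlation
dw_trending = durbin_watson(trending)         # close to 0: positive serial correlation
```

Values near 2 indicate no first-order serial correlation, so a mean of 0.836 across the 73 series points toward positive serial correlation in many municipalities.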

Page 28: by Daniel A. Griffith Ashbel Smith Professor of Geospatial Information Sciences

Graphical portrayal of DWs

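GLM residuals (heuristic using 4 dfs lost)

[Figure: graphical portrayal of the Durbin-Watson statistics; legend classes 0 – 0.74, 0.74 – 1.93, 1.93 – 2.08, 2.07 – 3.26, and 3.26 – 4, with classes labeled "positive serial correlation" and "undecided"]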

Page 29: by Daniel A. Griffith Ashbel Smith Professor of Geospatial Information Sciences

Summary of results

Page 30: by Daniel A. Griffith Ashbel Smith Professor of Geospatial Information Sciences

STAR-binomial specification

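$$\mathrm{LN}\!\left(\frac{sc_{i,t}}{area}\right) = \mu_t + \beta_{dist}\, dist_i + \beta_{elev}\, elev_i + \rho_T\, \frac{sc_{i,t-1}}{area} + \rho_s \sum_{j=1}^{73} w_{ij}\, \frac{sc_{j,t-1}}{area} + \varepsilon_{i,t}$$

Annotated components: time, space, space-time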

Page 31: by Daniel A. Griffith Ashbel Smith Professor of Geospatial Information Sciences

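Pseudo- & quasi-likelihood estimation

$$\hat{\mu}_T = 1.40 + 0.03\,T$$

$$\hat{\beta}_{dist} = 0.05$$

$$\hat{\beta}_{elev} = 0.19$$

$$\hat{\rho}_T = 7.75$$

$$\hat{\rho}_s = 3.38$$

$$\text{pseudo-}R^2 = 0.885$$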

Page 32: by Daniel A. Griffith Ashbel Smith Professor of Geospatial Information Sciences

Extra binomial variation remains

Deviance statistics by model:

Year     Covariates only  Spatially unstructured random effect  Spatially structured random effect
1958/59  1565             473                                   378
1959/60  1503             403                                   321
1960/61  1561             368                                   326
1961/62  1543             303                                   279
1962/63  1490             261                                   252
1963/64  1544             281                                   271
1964/65  1467             263                                   254
1965/66  1523             266                                   254
1966/67  1610             302                                   273
1967/68  1601             329                                   290
1968/69  1259             320                                   218
1969/70  1273             310                                   250
1970/71  1149             339                                   181
1971/72  1164             384                                   207
1972/73  1146             427                                   158
1973/74  899              347                                   167

[Figure legend: ● pineapple production, ■ milk production, ♦ sugar cane production, ▲ tobacco production]
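One rough way to gauge extra binomial variation is the ratio of a deviance statistic to its residual degrees of freedom, which should be near 1 when the binomial variance assumption holds. The Python sketch below applies this heuristic to the first deviance column only, under the assumption (not stated in the presentation) that each annual fit uses 73 municipalities and 3 parameters: an intercept, mean elevation, and distance.

```python
# Deviance statistics for the annual fits with covariates only (first column above).
deviance = {
    "1958/59": 1565, "1959/60": 1503, "1960/61": 1561, "1961/62": 1543,
    "1962/63": 1490, "1963/64": 1544, "1964/65": 1467, "1965/66": 1523,
    "1966/67": 1610, "1967/68": 1601, "1968/69": 1259, "1969/70": 1273,
    "1970/71": 1149, "1971/72": 1164, "1972/73": 1146, "1973/74": 899,
}

n_units = 73      # municipalities
n_params = 3      # ASSUMPTION: intercept, mean elevation, distance
resid_df = n_units - n_params

# A ratio far above 1 suggests overdispersion relative to the binomial model.
dispersion = {year: d / resid_df for year, d in deviance.items()}
smallest = min(dispersion.values())
```

The same ratio could be formed for the random effects columns, but their degrees of freedom depend on the number of selected eigenvectors and on how the random effect is counted, so they are left out of this sketch.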

Page 33: by Daniel A. Griffith Ashbel Smith Professor of Geospatial Information Sciences

implications

1. spatial autocorrelation appears to be a source of part of the overdispersion

2. random effects (e.g., missing covariates) appear to be a source of part of the overdispersion

3. land use competition may be a source of part of the overdispersion

4. spatial filters for mean elevation and distance have six eigenvectors in common; of these, one is shared with most of the annual comparative static spatial filters, and two with most of the spatially structured random effect term spatial filters

Page 34: by Daniel A. Griffith Ashbel Smith Professor of Geospatial Information Sciences

5. the components of spatial autocorrelation in sugar cane production vary over time

6. a spatially unstructured random effect term that seeks to account for serial correlation in multiple short time series can better highlight latent spatial autocorrelation

7. a spatial filter can effectively structure a random effect term

8. failure to include a spatially structured random effect term can result in biased parameter estimates (largely because of the nonlinear nature of the model specification)

9. spatial and temporal autocorrelation interact in a complex way

Page 35: by Daniel A. Griffith Ashbel Smith Professor of Geospatial Information Sciences

THE END

