
Hidden Regular Variation in Joint Tail Modeling with Likelihood Inference via the MCEM Algorithm

Dan Cooley, Grant Weller

Department of Statistics, Colorado State University

Funding: NSF-DMS-0905315, Weather and Climate Impacts Assessment Program (NCAR), 2011-2012 SAMSI UQ program

1

Motivating Example: Daily Air Pollution, Leeds UK

[Figure: daily maximum air pollution at Leeds, UK (SO2 vs. NO2). Left: original scale, with risk regions A1, A2, A3 marked. Right: the same data on the Fréchet scale.]

Data exhibit asymptotic independence (Heffernan and Tawn, 2004).

2

Outline

• Hidden Regular Variation

• Sum Characterization of HRV

• Estimation via MCEM

• Application: air pollution data

3

When Multivariate Regular Variation Fails

Multivariate Regular Variation:

\[ t\,P\!\left[\frac{R}{b(t)} > r,\; W \in B\right] \xrightarrow{v} r^{-\alpha} H(B). \]

In some cases, the angular measure H degenerates on some regions of N, masking sub-asymptotic dependence features.

Example: asymptotic independence in d = 2:

\[ \lim_{z \to z^{+}} P(Z_1 > z \mid Z_2 > z) = 0, \]

where z^+ denotes the upper endpoint of the common marginal distribution.

• H consists of point masses at {0} and {1} (using ‖ · ‖1)

• e.g. bivariate Gaussian with correlation ρ < 1

Normalization by b(t) kills off sub-asymptotic dependence structure.
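As a quick numerical illustration (a sketch of my own, not from the talk), the R code below simulates a bivariate Gaussian pair with ρ = 0.5 and estimates P(X1 > z | X2 > z) at increasingly high marginal quantiles; the conditional exceedance probability decays toward zero, as asymptotic independence predicts.

## Sketch: empirical check of asymptotic independence for a bivariate
## Gaussian pair with correlation rho = 0.5 (illustrative values).
set.seed(1)
n   <- 1e6
rho <- 0.5
x1  <- rnorm(n)
x2  <- rho * x1 + sqrt(1 - rho^2) * rnorm(n)     # correlated Gaussian pair
for (p in c(0.90, 0.99, 0.999)) {
  z    <- qnorm(p)                               # high marginal quantile
  cond <- mean(x1 > z & x2 > z) / mean(x2 > z)   # estimates P(X1 > z | X2 > z)
  cat(sprintf("p = %.3f:  P(X1 > z | X2 > z) ~ %.4f\n", p, cond))
}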

4

Hidden Regular Variation(Resnick, 2002)

A regular varying random vector Z exhibits hidden regular variation on a subcone C0 ⊂ C if ν(C0) = 0 and there exists {b0(t)}, b0(t) → ∞ with b0(t)/b(t) → 0, such that

\[ t\,P\!\left[\frac{Z}{b_0(t)} \in \cdot\,\right] \xrightarrow{v} \nu_0(\cdot) \]

as t → ∞ in M+(C0).

• Scaling: ν0(tA) = t^(−α0) ν0(A) for measurable A ⊂ C0, with α0 ≥ α
• ν0 is Radon but not necessarily finite.

Equivalently,

\[ t\,P\!\left[\frac{R}{b_0(t)} > r,\; W \in B\right] \xrightarrow{v} r^{-\alpha_0} H_0(B) \]

for B a Borel set of N0 = C0 ∩ N (e.g. N0 = (0,1)).

H0 is called the hidden angular measure.

5

Example: bivariate Gaussian

Consider Z with Fréchet margins and Gaussian dependence, ρ ∈ [0,1). Recall ν places mass only on the axes of C.

Define η = (1+ρ)/2, the coefficient of tail dependence (Ledford and Tawn, 1997).

• Z exhibits hidden regular variation of order α0 = 1/η

• The density of the hidden measure ν0 can be written

\[ \nu_0(dr \times dw) = \frac{1}{\eta}\, r^{-(1+1/\eta)}\,dr \times \underbrace{\frac{1}{4\eta}\,\{w(1-w)\}^{-1/(2\eta)-1}\,dw}_{H_0(dw)} \]

H0 is infinite on (0,1).
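A small numerical sketch (assuming ρ = 0.5, so η = 0.75) makes the divergence concrete: the mass that H0 assigns to [ε, 1 − ε] grows without bound as ε → 0.

## Sketch: the hidden angular density of the bivariate Gaussian,
## h0(w) = (1/(4*eta)) * {w*(1 - w)}^(-1/(2*eta) - 1), has infinite total
## mass on (0,1); its mass on [eps, 1 - eps] blows up as eps shrinks.
rho <- 0.5
eta <- (1 + rho) / 2
h0  <- function(w) (1 / (4 * eta)) * (w * (1 - w))^(-1 / (2 * eta) - 1)
for (eps in c(1e-1, 1e-2, 1e-3, 1e-4)) {
  mass <- integrate(h0, lower = eps, upper = 1 - eps, subdivisions = 1000L)$value
  cat(sprintf("eps = %.0e:  H0([eps, 1 - eps]) = %.1f\n", eps, mass))
}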

6

Tail Equivalence (Maulik and Resnick, 2004)

Two random vectors X and Y are tail equivalent on the cone C∗ if

\[ t\,P\!\left[\frac{X}{b_*(t)} \in \cdot\,\right] \xrightarrow{v} \nu(\cdot) \qquad\text{and}\qquad t\,P\!\left[\frac{Y}{b_*(t)} \in \cdot\,\right] \xrightarrow{v} c\,\nu(\cdot) \]

as t → ∞ in M+(C∗), for some c > 0.

‘Extremes of X and Y samples taken in C∗ will have the same asymptotic properties.’

7

Mixture Characterization of HRV (Maulik and Resnick, 2004)

Suppose Z is regular varying on C with hidden regular variation on C0:

\[ t\,P\!\left[\frac{Z}{b(t)} \in \cdot\,\right] \xrightarrow{v} \nu(\cdot) \ \text{ in } M_+(C) \qquad\text{and}\qquad t\,P\!\left[\frac{Z}{b_0(t)} \in \cdot\,\right] \xrightarrow{v} \nu_0(\cdot) \ \text{ in } M_+(C_0) \]

with ν(C0) = 0 and b0(t)/b(t)→ 0 as t→∞.

• Let Y be RV (α) with support only on C \ C0.

• Let V = R0θ0, where R0 has survival function P(R0 > r) = 1/b0←(r) and θ0 ∼ H0 (normalized; H0 assumed finite).

• Then Z is tail equivalent to a mixture of Y and V on both C and C0.

This works because the support of Y does not interfere with the HRV on C0.

8

[Figure: scatterplots of simulated realizations of Y, V, and Y + V.]

9

Construction of Y + V

Define Y = RW, with P(R > r) ∼ 1/b←(r) and W drawn from the limiting angular measure H. Notice that Y has support only on C \ C0.

Let V ∈ [0,∞)^d be regular varying on C0 with limit measure ν0:

\[ t\,P\!\left[\frac{V}{b_0(t)} \in \cdot\,\right] \xrightarrow{v} \nu_0(\cdot) \ \text{ in } M_+(C_0). \]

Further assume that on C,

\[ P(\|V\| > r) \sim c\,r^{-\alpha_*} \]

as r → ∞, with c > 0 and

\[ \alpha_* > \alpha \vee (\alpha_0 - \alpha). \]

Assume R, W, V are independent.
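To make the construction concrete, here is a minimal simulation sketch in R. The particular settings (α = 1, α0 = 2, H placing equal point masses on the two axes, H0 uniform on (0,1), i.e. a finite hidden angular measure) are illustrative assumptions, not choices made in the talk.

## Sketch: simulate the sum Y + V for one illustrative configuration
## satisfying the conditions above (here alpha* = alpha0 = 2 > alpha = 1).
set.seed(2)
n <- 2000

## Y = R*W: R is Pareto(alpha), so P(R > r) = r^(-alpha) for r > 1; W puts
## all mass on one of the two axes, so Y lives on C \ C0.
alpha <- 1
R  <- runif(n)^(-1 / alpha)
ax <- sample(c(1, 2), n, replace = TRUE)
Y  <- cbind(ifelse(ax == 1, R, 0), ifelse(ax == 2, R, 0))

## V = R0*(W0, 1 - W0): R0 is Pareto(alpha0) with alpha0 > alpha, and
## W0 ~ H0 (taken uniform here), so V lives in the interior subcone C0.
alpha0 <- 2
R0 <- runif(n)^(-1 / alpha0)
W0 <- runif(n)
V  <- cbind(R0 * W0, R0 * (1 - W0))

Z <- Y + V   # tail equivalent to the target on both C and C0
plot(Z, xlab = expression(Z[1]), ylab = expression(Z[2]))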

10

Tail Equivalence Result

Then

\[ t\,P\!\left[\frac{Y + V}{b(t)} \in \cdot\,\right] \xrightarrow{v} \nu(\cdot) \ \text{ in } M_+(C) \]

(Jessen and Mikosch, 2006).

Furthermore, tail equivalence (Maulik and Resnick, 2004) also holds on C0:

Theorem. With Y and V as defined above,

\[ t\,P\!\left[\frac{Y + V}{b_0(t)} \in \cdot\,\right] \xrightarrow{v} \nu_0(\cdot) \ \text{ in } M_+(C_0). \]

View Z as a sum of ‘first-order’ Y and ‘second-order’ V.

The sum Y + V is tail equivalent to Z on both C and C0.

11

Simulation when ν0 is finite.

[Figure: simulated realizations of Y (left) + V (center) = Y + V (right).]

No point falls exactly on an axis.

12

Infinite Measure Example: Bivariate Gaussian

Z has Fréchet margins and Gaussian dependence (ρ < 1). Recall: H0 is infinite on N0 = (0,1).

Poses difficulty near the axes of C.

Proposed construction of V (a simulation sketch follows below):

• Restrict to C0^ε = C0 ∩ N0^ε, where N0^ε = [ε, 1 − ε] for ε ∈ (0, 1/2).
• Simulate W0 from the probability density H0(dw)/H0(N0^ε).
• Let R0 follow a Pareto distribution with tail index 1/η.
• Set V = [R0 W0, R0(1 − W0)]^T.

Y + V is tail equivalent to Z on C and C0^ε.
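A minimal sketch of this ε-restricted construction (assuming ρ = 0.5 and ε = 0.1, with W0 drawn by rejection sampling from the restricted hidden angular density):

## Sketch: simulate V = (R0*W0, R0*(1 - W0)) under the eps-restricted
## hidden angular measure of the bivariate Gaussian (rho = 0.5, eps = 0.1).
set.seed(3)
n   <- 2000
rho <- 0.5
eta <- (1 + rho) / 2
eps <- 0.1

## Unnormalized hidden angular density, U-shaped on [eps, 1 - eps].
h0 <- function(w) (1 / (4 * eta)) * (w * (1 - w))^(-1 / (2 * eta) - 1)

## Rejection sampling from h0 restricted to [eps, 1 - eps]; the density
## is maximal at the endpoints, so h0(eps) is a valid envelope constant.
M  <- h0(eps)
W0 <- numeric(n)
for (i in 1:n) {
  repeat {
    w <- runif(1, eps, 1 - eps)
    if (runif(1) < h0(w) / M) { W0[i] <- w; break }
  }
}

## R0 is Pareto with tail index 1/eta: P(R0 > r) = r^(-1/eta) for r > 1.
R0 <- runif(n)^(-eta)
V  <- cbind(R0 * W0, R0 * (1 - W0))
## Adding Y, simulated as on the earlier construction slide, gives Y + V.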

13

Sum representation of bivariate Gaussian

Example with ρ = 0.5 (n = 2500):

[Figure: scatterplots of Z (left) and of Y + V with ε = 0.01 (center) and ε = 0.1 (right).]

For any set completely contained in C0^ε we achieve the correct limit measure ν0.

The choice of ε involves a trade-off between:
• the size of the subcone on which tail equivalence holds, and
• the threshold at which Y + V is a useful approximation.

Biases due to the choice of ε are calculated.

14

Inference via the EM Algorithm

Observe realizations from Z, tail equivalent to Y + V. Assume parametric forms and perform ML inference via EM.

If we assume Z = Y + V,

\[
\begin{aligned}
\log f(z;\theta) &= \int \log f(z, y, v;\theta)\, f(y, v \mid z;\theta^{(k)})\,dy\,dv - \int \log f(y, v \mid z;\theta)\, f(y, v \mid z;\theta^{(k)})\,dy\,dv \\
&=: Q(\theta \mid \theta^{(k)}) - H(\theta \mid \theta^{(k)}).
\end{aligned}
\]

Here: Z and Y + V are only tail equivalent; θ governs the tail behavior of Y and V. Requires a modification of the EM setup.

15

EM for Extremes

Consider distributions with densities gY(y; θ) and gV(v; θ) which are tail equivalent to the true distributions; i.e.,

\[ g_Y(y;\theta) \cong f_Y(y;\theta) \ \text{ for } \|y\| > r^*_Y, \qquad g_V(v;\theta) \cong f_V(v;\theta) \ \text{ for } \|v\| > r^*_V. \]

The complete likelihood is based on limiting Poisson point processes for Y and V.

• E step: expectation is taken with respect to g(y,v|z; θ).

• M step: maximization is taken over only ‘large’ y and v.

We show that

\[ H(\theta^{(k)} \mid \theta^{(k)}) - H(\theta \mid \theta^{(k)}) \ge 0 \]

using Jensen's inequality.
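For reference, the standard EM argument behind this step, written with generic conditional densities (the talk's modified setup replaces f by the tail-equivalent g):

\[
\begin{aligned}
H(\theta^{(k)} \mid \theta^{(k)}) - H(\theta \mid \theta^{(k)})
&= -\int \log\!\left[\frac{f(y, v \mid z; \theta)}{f(y, v \mid z; \theta^{(k)})}\right] f(y, v \mid z; \theta^{(k)})\,dy\,dv \\
&\ge -\log \int \frac{f(y, v \mid z; \theta)}{f(y, v \mid z; \theta^{(k)})}\, f(y, v \mid z; \theta^{(k)})\,dy\,dv = -\log 1 = 0,
\end{aligned}
\]

since −log is convex (Jensen's inequality).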

16

MCEM

Natural framework for MCEM.

At the E step of the (k + 1)th iteration, simulate from

\[ g(y, v \mid z; \theta^{(k)}) \propto g_Y(y; \theta^{(k)})\, g_V(z - y; \theta^{(k)}) \]

for all z and use the simulated realizations to compute

\[ Q_m(\theta \mid \theta^{(k)}) = \frac{1}{m} \sum_{j=1}^{m} \ell(\theta; z, y_j, v_j), \]

employing Poisson point process likelihoods for large realizations of Y and V.

Key idea: likelihood only depends on θ for ‘large’ y and v!

Uncertainty estimates obtained via Louis’ method.
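The following is a deliberately simplified toy sketch of the MCEM structure above: exponential components stand in for the Poisson point process likelihoods of the talk, the latent split of each observation is drawn from a density proportional to gY(y) gV(z − y), and the M step maximizes the Monte Carlo estimate of the complete-data log-likelihood. All names and distributional choices here are illustrative assumptions.

## Toy MCEM sketch (NOT the extremes likelihood of the talk): observe
## z = y + v with latent y ~ Exp(theta) and v ~ Exp(1) independent;
## recover theta by MCEM.
set.seed(4)
theta_true <- 2
n <- 500
y <- rexp(n, theta_true); v <- rexp(n, 1)
z <- y + v                                   # only z is observed

## E-step sampler: y | z has density proportional to
## g_Y(y; theta) * g_V(z - y), i.e. exp(-(theta - 1) * y) on (0, z);
## draw from it by inverting the truncated-exponential CDF.
sample_y_given_z <- function(z, theta, m) {
  a <- theta - 1
  u <- runif(m)
  if (abs(a) < 1e-8) return(u * z)           # uniform limit when theta ~ 1
  -log(1 - u * (1 - exp(-a * z))) / a
}

theta <- 1                                   # starting value
for (k in 1:25) {                            # MCEM iterations
  m  <- 50                                   # Monte Carlo sample size per z
  ey <- sapply(z, function(zi) mean(sample_y_given_z(zi, theta, m)))
  ## M step: the complete-data log-likelihood is linear in y, so averaging
  ## E[y | z] suffices; the maximizer of sum_i (log(theta) - theta * E[y_i | z_i])
  ## is 1 / mean(E[y | z]).
  theta <- 1 / mean(ey)
}
print(theta)                                 # should be near theta_true = 2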

17

Example w/ Infinite Hidden Measure

Simulate n = 10000 realizations from a bivariate Gaussian distribution with correlation ρ, and transform the marginals to unit Fréchet.
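For concreteness, a short sketch of this setup (taking ρ = 0.5): the probability integral transform z = −1/log Φ(x) maps standard Gaussian margins to unit Fréchet.

## Sketch: bivariate Gaussian sample (rho = 0.5) with margins transformed
## to unit Frechet via z = -1 / log(pnorm(x)).
set.seed(5)
n   <- 10000
rho <- 0.5
x1  <- rnorm(n)
x2  <- rho * x1 + sqrt(1 - rho^2) * rnorm(n)
z1  <- -1 / log(pnorm(x1))                   # unit Frechet margin
z2  <- -1 / log(pnorm(x2))
plot(z1, z2, log = "xy", xlab = expression(z[1]), ylab = expression(z[2]))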

Tail equivalent on C and C0^ε to Y + V, where V has angular measure

\[ H_0(dw) = \frac{1}{4\eta}\,\{w(1-w)\}^{-1/(2\eta)-1}\,dw. \]

Aim: estimate η = (1 + ρ)/2 from the ε-restricted model.

• Must select both ε and r∗V

• Trade-off in finite sample estimation problems

18

Infinite Hidden Measure Results

Shown for η = 0.75 (ρ = 0.5)

[Figure: mean estimates of η (left) and coverage rates of 95% confidence intervals (right), plotted against r∗V ∈ {22, 45, 100, 200} for ε = 0.185, 0.2, 0.215, 0.23.]

19

Air Pollution Data

[Figure: daily maximum air pollution at Leeds, UK (SO2 and NO2), shown on the original scale with risk regions A1, A2, A3 marked, and on the Fréchet scale.]

• Strong evidence for asymptotic independence

• Aim: estimate risk set probabilities

20

Competing Approaches

Examine three modeling approaches:

1. Assume asymptotic dependence; i.e., that ν(·) places mass on the entire cone C. Fit a bivariate logistic angular dependence model to the largest 10% of observations (in terms of the L1 norm). Estimate: β = 0.713.

2. Assume asymptotic independence and ignore any possible hidden regular variation.

3. Assume asymptotic independence and hidden regular variation. Fit the ε-restricted infinite hidden measure model via MCEM. Select r∗V = 7.5 and ε = 0.3. Estimate: η = 0.748.

21

Results - risk set estimates

Model             P(Z ∈ A1)   Expected #   p-value
1 (asy. dep.)     0.0297      59.04        0.480
2 (asy. indep.)   0.0120      23.86        8.17 × 10^−5
3 (Y + V)         0.0261      51.89        0.210
Empirical         0.0292      58           −

22

Results - risk set estimates

Model             P(Z ∈ A2)   Expected #   p-value
1 (asy. dep.)     0.0044      8.74         0.132
2 (asy. indep.)   0.0002      0.40         0.009
3 (Y + V)         0.0018      3.58         0.274
Empirical         0.0025      5            −

23

Results - risk set estimates

Model             P(Z ∈ A3)   Expected #   p-value
1 (asy. dep.)     0.0010      1.99         0.130
2 (asy. indep.)   0           0            1
3 (Y + V)         0.0002      0.40         0.704
Empirical         0           0            −

24

Summary

This work introduces a sum representation for regular varying random vectors possessing hidden regular variation.

• Useful representation for finite samples

• Asymptotically justified by tail equivalence result

• Difficulty arises when H0 is infinite: restrict to a compact cone to simulate V

• Likelihood estimation via modified MCEM algorithm

• Captures tail dependence in the presence of asymptotic independence

• Improved estimation of tail risk set probabilities

25

References

Heffernan, J. E. and Tawn, J. A. (2004). A conditional approach for multivariate extreme values. Journal of the Royal Statistical Society, Series B, 66:497–546.

Jessen, A. and Mikosch, T. (2006). Regularly varying functions. University of Copenhagen, Laboratory of Actuarial Mathematics.

Ledford, A. and Tawn, J. (1997). Modelling dependence within joint tail regions. Journal of the Royal Statistical Society, Series B, 59:475–499.

Maulik, K. and Resnick, S. (2004). Characterizations and examples of hidden regular variation. Extremes, 7(1):31–67.

Resnick, S. (2002). Hidden regular variation, second order regular variation and asymptotic independence. Extremes, 5(4):303–336.

Weller, G. and Cooley, D. (2013). A sum decomposition for hidden regular variation in joint tail modeling with likelihood inference via the MCEM algorithm. Submitted.

26