Hidden Regular Variation in Joint Tail Modeling with Likelihood Inference via the MCEM Algorithm
Dan Cooley, Grant Weller
Department of Statistics, Colorado State University
Funding: NSF-DMS-0905315; Weather and Climate Impacts Assessment Program (NCAR); 2011-2012 SAMSI UQ program
Motivating Example: Daily Air Pollution, Leeds UK
[Figure: two scatterplot panels of NO2 and SO2. One panel is titled "Daily max pollution at Leeds, UK" and the other shows the data on the Frechet scale; the regions A1, A2, A3 are labeled.]
Data exhibit asymptotic independence (Heffernan and Tawn, 2004).
Outline
• Hidden Regular Variation
• Sum Characterization of HRV
• Estimation via MCEM
• Application: air pollution data
When Multivariate Regular Variation Fails
Multivariate Regular Variation:
\[ t P\left[ \frac{R}{b(t)} > r,\ W \in B \right] \xrightarrow{v} r^{-\alpha} H(B). \]
In some cases, the angular measure H degenerates on some regions of N, masking sub-asymptotic dependence features.
Example: asymptotic independence in d = 2:
\[ \lim_{z \to z^+} P(Z_1 > z \mid Z_2 > z) = 0. \]
• H consists of point masses at {0} and {1} (using ‖ · ‖1)
• e.g. bivariate Gaussian with correlation ρ < 1
Normalization by b(t) kills off sub-asymptotic dependence structure.
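The limit in the asymptotic independence example can be checked empirically. The following is a minimal sketch, assuming a standard bivariate Gaussian with ρ = 0.5 and using only the Python standard library; the function name `conditional_exceedance`, the sample size, and the quantile levels are illustrative choices, not taken from the slides. Because both margins are identical and the event only compares each component with a common marginal quantile, the conditional probability is the same on the Gaussian scale as on the Frechet scale.

```python
import math
import random

def conditional_exceedance(rho, probs, n=200_000, seed=1):
    """Empirical P(X1 > q | X2 > q) at marginal quantile levels `probs`
    for a standard bivariate Gaussian with correlation rho."""
    rng = random.Random(seed)
    s = math.sqrt(1.0 - rho * rho)
    pairs = []
    for _ in range(n):
        a = rng.gauss(0.0, 1.0)
        b = rho * a + s * rng.gauss(0.0, 1.0)  # corr(a, b) = rho
        pairs.append((a, b))
    x2_sorted = sorted(b for _, b in pairs)
    out = {}
    for p in probs:
        q = x2_sorted[int(p * n)]  # empirical quantile; both margins are N(0, 1)
        n2 = sum(1 for _, b in pairs if b > q)
        n12 = sum(1 for a, b in pairs if a > q and b > q)
        out[p] = n12 / n2
    return out

chi = conditional_exceedance(rho=0.5, probs=[0.9, 0.99, 0.999])
for p, val in chi.items():
    print(f"quantile level {p}: P(X1 > q | X2 > q) ~ {val:.3f}")
```

The estimated conditional probability shrinks as the quantile level rises, but slowly, which is the sub-asymptotic dependence that the normalization by b(t) discards.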
Hidden Regular Variation (Resnick, 2002)
A regularly varying random vector Z exhibits hidden regular variation on a subcone C0 ⊂ C if ν(C0) = 0 and there exists {b0(t)}, b0(t) → ∞ with b0(t)/b(t) → 0 such that
\[ t P\left[ \frac{Z}{b_0(t)} \in \cdot \right] \xrightarrow{v} \nu_0(\cdot) \]
as t → ∞ in M+(C0).
• Scaling: ν0(tA) = t^{−α0} ν0(A) for measurable A ⊂ C0, α0 ≥ α
• ν0 is Radon but not necessarily finite.
Equivalently,
\[ t P\left[ \frac{R}{b_0(t)} > r,\ W \in B \right] \xrightarrow{v} r^{-\alpha_0} H_0(B) \]
for B a Borel set of N0 = C0 ∩ N (e.g. N0 = (0,1)).
H0 is called the hidden angular measure.
Example: bivariate Gaussian
Consider Z with Frechet margins and Gaussian dependence, ρ ∈ [0,1). Recall ν places mass only on the axes of C.
Define η = (1+ρ)/2, the coefficient of tail dependence (Ledford and Tawn, 1997).
• Z exhibits hidden regular variation of order α0 = 1/η
• The density of the hidden measure ν0 can be written
\[ \nu_0(dr \times dw) = \frac{1}{\eta} r^{-(1 + 1/\eta)} \, dr \times \underbrace{\frac{1}{4\eta} \{w(1 - w)\}^{-1/(2\eta) - 1} \, dw}_{H_0(dw)} \]
H0 is infinite on (0,1).
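The statement that H0 is infinite on (0,1) can be illustrated numerically: the mass H0([ε, 1 − ε]) grows without bound as ε shrinks. This is a minimal sketch using a midpoint rule; the function names `h0_density` and `h0_mass`, the grid size, and the choice η = 0.75 are assumptions made for illustration.

```python
def h0_density(w, eta):
    """Hidden angular density of the Gaussian example:
    (1 / (4 eta)) * {w (1 - w)}^(-1/(2 eta) - 1)."""
    return (w * (1.0 - w)) ** (-1.0 / (2.0 * eta) - 1.0) / (4.0 * eta)

def h0_mass(eps, eta, n_grid=200_000):
    """Midpoint-rule approximation of H0([eps, 1 - eps])."""
    width = (1.0 - 2.0 * eps) / n_grid
    return width * sum(
        h0_density(eps + (i + 0.5) * width, eta) for i in range(n_grid)
    )

eta = 0.75  # corresponds to rho = 0.5
masses = {eps: h0_mass(eps, eta) for eps in (0.1, 0.01, 0.001)}
for eps, mass in masses.items():
    print(f"eps = {eps}: H0([eps, 1 - eps]) ~ {mass:.2f}")
```

The exponent of w(1 − w) is below −1 for every η in (0, 1], so the integrand is not integrable at either endpoint.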
Tail Equivalence (Maulik and Resnick, 2004)
Two random vectors X and Y are tail equivalent on the cone C∗ if
\[ t P\left[ \frac{X}{b^*(t)} \in \cdot \right] \xrightarrow{v} \nu(\cdot) \quad \text{and} \quad t P\left[ \frac{Y}{b^*(t)} \in \cdot \right] \xrightarrow{v} c\,\nu(\cdot) \]
as t → ∞ in M+(C∗) for c > 0.
‘Extremes of X and Y samples taken in C∗ will have the same asymptotic properties.’
Mixture Characterization of HRV (Maulik and Resnick, 2004)
Suppose Z is regularly varying on C with hidden regular variation on C0:
\[ t P\left[ \frac{Z}{b(t)} \in \cdot \right] \xrightarrow{v} \nu(\cdot) \quad \text{in } M_+(C) \]
and
\[ t P\left[ \frac{Z}{b_0(t)} \in \cdot \right] \xrightarrow{v} \nu_0(\cdot) \quad \text{in } M_+(C_0) \]
with ν(C0) = 0 and b0(t)/b(t) → 0 as t → ∞.
• Let Y be RV(α) with support only on C \ C0.
• Let V = R0 θ0, where R0 has tail P(R0 > t) = 1/b0←(t) and θ0 ∼ H0, with H0 finite.
• Then Z is tail equivalent to a mixture of Y and V on both C and C0.
Works because Y’s support doesn’t mess with the HRV.
[Figure: three scatterplot panels on [0, 100] × [0, 100]: Y (axes Y[,1], Y[,2]), V (axes V[,1], V[,2]), and the mixture YV (axes YV[,1], YV[,2]).]
Construction of Y + V
Define Y = RW, with P(R > r) ∼ 1/b←(r) and W drawn from the limiting angular measure H. Notice that Y has support only on C \ C0.
Let V ∈ [0,∞)^d be regularly varying on C0 with limit measure ν0:
\[ t P\left[ \frac{V}{b_0(t)} \in \cdot \right] \xrightarrow{v} \nu_0(\cdot) \quad \text{in } M_+(C_0). \]
Further assume that on C,
\[ P(\|V\| > r) \sim c\, r^{-\alpha^*} \]
as r → ∞, with c > 0 and
\[ \alpha^* > \alpha \vee (\alpha_0 - \alpha). \]
Assume R, W, V are independent.
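The construction can be sketched in d = 2 for a finite hidden angular measure. Everything specific below is an assumption made for illustration: α = 1, α0 = 2, an angular measure H with equal point masses at 0 and 1, exact Pareto radial components, and a Beta(2, 2) density standing in for a finite H0. With these choices ‖V‖ has tail index α∗ = α0 = 2, which satisfies α∗ > α ∨ (α0 − α) = 1.

```python
import random

def pareto(rng, alpha):
    """Inverse-CDF draw from P(R > r) = r^(-alpha), r >= 1."""
    return (1.0 - rng.random()) ** (-1.0 / alpha)

def simulate_y_plus_v(n, alpha=1.0, alpha0=2.0, seed=0):
    """Return lists of Y, V and Y + V points under the assumed model."""
    rng = random.Random(seed)
    ys, vs, sums = [], [], []
    for _ in range(n):
        # First-order part Y = R * (W, 1 - W): W is 0 or 1 with equal
        # probability, so Y always lies on one of the two axes.
        r = pareto(rng, alpha)
        w = 1.0 if rng.random() < 0.5 else 0.0
        y = (r * w, r * (1.0 - w))
        # Second-order part V = R0 * (W0, 1 - W0): lighter tail
        # (alpha0 > alpha) and an angle from a finite density on (0, 1).
        r0 = pareto(rng, alpha0)
        w0 = rng.betavariate(2.0, 2.0)
        v = (r0 * w0, r0 * (1.0 - w0))
        ys.append(y)
        vs.append(v)
        sums.append((y[0] + v[0], y[1] + v[1]))
    return ys, vs, sums

ys, vs, sums = simulate_y_plus_v(2500)
on_axis_y = sum(1 for y in ys if min(y) == 0.0)
on_axis_sum = sum(1 for s in sums if min(s) == 0.0)
print(f"Y points on an axis: {on_axis_y} of {len(ys)}")
print(f"Y + V points on an axis: {on_axis_sum} of {len(sums)}")
```

Adding V moves every point of Y off the axes, while the largest points of Y + V still sit close to them because Y has the heavier tail.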
Tail Equivalence Result
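Then
\[ t P\left[ \frac{Y + V}{b(t)} \in \cdot \right] \xrightarrow{v} \nu(\cdot) \quad \text{in } M_+(C) \]
(Jessen and Mikosch, 2006).
Furthermore, tail equivalence (Maulik and Resnick, 2004) also holds on C0:
Theorem. With Y and V as defined above,
\[ t P\left[ \frac{Y + V}{b_0(t)} \in \cdot \right] \xrightarrow{v} \nu_0(\cdot) \quad \text{in } M_+(C_0). \]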
View Z as a sum of ‘first-order’ Y and ‘second-order’ V.
The sum Y + V is tail equivalent to Z on both C and C0.
Simulation when ν0 is finite.
[Figure: three scatterplot panels on [0, 1200] × [0, 1200], arranged as Y (axes Y1, Y2) + V (axes V1, V2) = Y + V (axes Y1 + V1, Y2 + V2).]
No point falls exactly on an axis.
Infinite Measure Example: Bivariate Gaussian
Z has Frechet margins and Gaussian dependence (ρ < 1). Recall: H0 is infinite on N0 = (0,1).
Poses difficulty near the axes of C.
Proposed construction of V:
• Restrict to C0^ε = C0 ∩ N0^ε, where N0^ε = [ε, 1 − ε] for ε ∈ (0, 1/2).
• Simulate W0 from the probability density H0(dw)/H0(N0^ε)
• Let R0 follow a Pareto distribution with α = 1/η
• V = [R0 W0, R0(1 − W0)]^T
Y + V is tail equivalent to Z on C and C0^ε.
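The proposed construction of V can be sketched as follows. The rejection sampler is one possible way to draw W0 from the restricted density and is an assumption of this sketch, not necessarily the method used by the authors; the function names and the values η = 0.75, ε = 0.1 are illustrative.

```python
import random

def h0_kernel(w, eta):
    """Unnormalized hidden angular density {w (1 - w)}^(-1/(2 eta) - 1)."""
    return (w * (1.0 - w)) ** (-1.0 / (2.0 * eta) - 1.0)

def sample_w0(rng, eta, eps):
    """Rejection sampler for H0 restricted to [eps, 1 - eps].
    The kernel is largest at the endpoints, so its value at eps bounds it
    on the whole interval and a uniform proposal can be used."""
    bound = h0_kernel(eps, eta)
    while True:
        w = rng.uniform(eps, 1.0 - eps)
        if rng.random() * bound <= h0_kernel(w, eta):
            return w

def simulate_v(n, eta, eps, seed=0):
    """Draw n points V = [R0 * W0, R0 * (1 - W0)]."""
    rng = random.Random(seed)
    out = []
    for _ in range(n):
        w0 = sample_w0(rng, eta, eps)
        # Pareto with alpha = 1/eta by inversion: P(R0 > r) = r^(-1/eta), r >= 1.
        r0 = (1.0 - rng.random()) ** (-eta)
        out.append((r0 * w0, r0 * (1.0 - w0)))
    return out

eta, eps = 0.75, 0.1
v_sample = simulate_v(2000, eta, eps)
angles = [v1 / (v1 + v2) for v1, v2 in v_sample]
print(f"angle range: [{min(angles):.3f}, {max(angles):.3f}]")
```

Smaller ε makes the bound at the endpoints larger relative to the kernel in the middle of the interval, so the acceptance rate of this sampler drops as ε decreases.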
Sum representation of bivariate Gaussian
Example with ρ = 0.5 (n = 2500):
[Figure: three scatterplot panels on [0, 800] × [0, 800]: Z (axes z1, z2), Y + V with ε = 0.01, and Y + V with ε = 0.1 (axes Y1 + V1, Y2 + V2).]
For any set completely contained in C0^ε we achieve the correct limit measure ν0.
Choice of ε involves a trade-off between:
• Size of the subcone on which tail equivalence holds
• Threshold at which Y + V is a useful approximation
• Biases due to choice of ε calculated.
Inference via the EM Algorithm
Observe realizations from Z, tail equivalent to Y + V. Assume parametric forms and perform ML inference via EM.
If we assume Z = Y + V,
\begin{align*}
\log f(z; \theta) &= \int \log f(z, y, v; \theta) \, f(y, v \mid z; \theta^{(k)}) \, dy \, dv \\
&\quad - \int \log f(y, v \mid z; \theta) \, f(y, v \mid z; \theta^{(k)}) \, dy \, dv \\
&:= Q(\theta \mid \theta^{(k)}) - H(\theta \mid \theta^{(k)}).
\end{align*}
Here: Z and Y + V are only tail equivalent; θ governs tail behavior of Y and V. Requires a modification of the EM setup.
EM for Extremes
Consider distributions with densities gY(y; θ) and gV(v; θ) which are tail equivalent to the true distributions; i.e.,
\[ g_Y(y; \theta) \cong f_Y(y; \theta) \text{ for } \|y\| > r^*_Y, \qquad g_V(v; \theta) \cong f_V(v; \theta) \text{ for } \|v\| > r^*_V. \]
Complete likelihood is based on limiting Poisson point processes for Y and V.
• E step: expectation is taken with respect to g(y,v|z; θ).
• M step: maximization is taken over only ‘large’ y and v.
We show
\[ H(\theta^{(k)} \mid \theta^{(k)}) - H(\theta \mid \theta^{(k)}) \ge 0 \]
using Jensen’s inequality.
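For reference, the Jensen step can be written out in the standard EM setting, where Z = Y + V holds exactly and expectations are taken under f. The slides state that the same inequality is shown for the modified algorithm, in which the expectation is taken under g; that version is not reproduced here, and the derivation below is only the textbook argument.

```latex
\begin{align*}
H(\theta^{(k)} \mid \theta^{(k)}) - H(\theta \mid \theta^{(k)})
  &= -\int \log\!\left[ \frac{f(y, v \mid z; \theta)}{f(y, v \mid z; \theta^{(k)})} \right]
       f(y, v \mid z; \theta^{(k)}) \, dy \, dv \\
  &\ge -\log \int \frac{f(y, v \mid z; \theta)}{f(y, v \mid z; \theta^{(k)})} \,
       f(y, v \mid z; \theta^{(k)}) \, dy \, dv \\
  &= -\log \int f(y, v \mid z; \theta) \, dy \, dv = -\log 1 = 0 .
\end{align*}
```

Combined with log f(z; θ) = Q(θ|θ(k)) − H(θ|θ(k)), this gives log f(z; θ) − log f(z; θ(k)) ≥ Q(θ|θ(k)) − Q(θ(k)|θ(k)), so any θ that increases Q cannot decrease the likelihood.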
MCEM
Natural framework for MCEM.
At the E step of the (k + 1)th iteration, simulate from
\[ g_Y(y; \theta^{(k)}) \, g_V(z - y; \theta^{(k)}) \propto g(y, v \mid z; \theta^{(k)}) \]
for all z and use the simulated realizations to compute
\[ Q_m(\theta \mid \theta^{(k)}) = \frac{1}{m} \sum_{j=1}^{m} \ell(\theta; z, y_j, v_j), \]
employing Poisson point process likelihoods for large realizations of Y and V.
Key idea: likelihood only depends on θ for ‘large’ y and v!
Uncertainty estimates obtained via Louis’ method.
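The mechanics of the E and M steps can be illustrated on a deliberately simplified toy problem. This is not the model of the slides: the Poisson point process likelihoods are replaced by exponential densities, with Y ~ Exp(rate θ), V ~ Exp(rate 1) and Z = Y + V observed, chosen only because the conditional draw and the maximizer of Q_m are available in closed form. The function names and all settings are hypothetical.

```python
import math
import random

def sample_y_given_z(rng, z, theta):
    """Draw y on [0, z] from the density proportional to
    g_Y(y; theta) * g_V(z - y), i.e. proportional to exp((1 - theta) * y),
    using inversion of its CDF."""
    a = 1.0 - theta
    u = rng.random()
    if abs(a) < 1e-12:
        return u * z  # kernel is flat: uniform on [0, z]
    return math.log1p(u * math.expm1(a * z)) / a

def mcem(z_obs, theta0, n_iter=30, m=10, seed=0):
    """Toy MCEM for the rate of Y when only Z = Y + V is observed."""
    rng = random.Random(seed)
    theta = theta0
    for _ in range(n_iter):
        # Monte Carlo E step: m imputations of the latent y for every z.
        total = 0.0
        for z in z_obs:
            for _ in range(m):
                total += sample_y_given_z(rng, z, theta)
        # M step: up to constants, Q_m averages log(theta) - theta * y,
        # which is maximized at 1 / (mean imputed y).
        theta = (m * len(z_obs)) / total
    return theta

# Toy data with true rate 0.5 for Y.
data_rng = random.Random(42)
z_obs = [data_rng.expovariate(0.5) + data_rng.expovariate(1.0) for _ in range(1000)]
theta_hat = mcem(z_obs, theta0=1.5)
print(f"MCEM estimate of theta: {theta_hat:.3f}")
```

In the setting of the slides the same loop would simulate (y, v) pairs given each z and maximize Poisson point process likelihoods that involve only the ‘large’ simulated values.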
Example w/ Infinite Hidden Measure
Simulate n = 10000 realizations from a bivariate Gaussian distribution with correlation ρ, transform marginals to unit Frechet.
Tail equivalent on C and C0^ε to Y + V, where V has angular measure
\[ H_0(dw) = \frac{1}{4\eta} \{w(1 - w)\}^{-1/(2\eta) - 1} \, dw. \]
Aim: estimate η = (1 + ρ)/2 from the ε-restricted model.
• Must select both ε and r∗V
• Trade-off in finite sample estimation problems
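As a point of comparison only, η can also be estimated without the sum representation by applying the Hill estimator to min(Z1, Z2) on the unit Frechet scale, a standard approach associated with Ledford and Tawn. This is a different technique from the MCEM procedure described in these slides. The sketch below assumes known Gaussian margins, and the function names, n, k, and seed are illustrative choices.

```python
import math
import random

def to_unit_frechet(x):
    """Map a standard normal value to the unit Frechet scale, -1 / log(Phi(x)).
    The upper-tail probability is used so that large x does not round to Phi = 1."""
    upper = 0.5 * math.erfc(x / math.sqrt(2.0))
    upper = min(upper, 1.0 - 2.0 ** -53)  # guard against log1p(-1) for very negative x
    return -1.0 / math.log1p(-upper)

def hill_eta(rho, n=20_000, k=400, seed=0):
    """Hill estimate of eta from the k largest values of min(Z1, Z2)."""
    rng = random.Random(seed)
    s = math.sqrt(1.0 - rho * rho)
    t = []
    for _ in range(n):
        x1 = rng.gauss(0.0, 1.0)
        x2 = rho * x1 + s * rng.gauss(0.0, 1.0)
        t.append(min(to_unit_frechet(x1), to_unit_frechet(x2)))
    t.sort(reverse=True)
    threshold = t[k]  # the (k + 1)-th largest value
    return sum(math.log(v / threshold) for v in t[:k]) / k

eta_hat = hill_eta(rho=0.5)
print(f"Hill estimate of eta: {eta_hat:.3f} (target (1 + rho) / 2 = 0.75)")
```

The choice of k plays a role similar to the threshold r∗V: a lower threshold uses more data but moves further from the asymptotic regime, which for the Gaussian case tends to pull the estimate below (1 + ρ)/2.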
Infinite Hidden Measure Results
Shown for η = 0.75 (ρ = 0.5)
[Figure: two panels plotted against the threshold rV (22, 45, 100, 200) for ε = 0.185, 0.2, 0.215, 0.23. One panel shows mean estimates of η; the other shows coverage rates of 95% CIs.]
Air Pollution Data
[Figure: two scatterplot panels of NO2 and SO2. One panel is titled "Daily max pollution at Leeds, UK" and the other shows the data on the Frechet scale; the regions A1, A2, A3 are labeled.]
• Strong evidence for asymptotic independence
• Aim: estimate risk set probabilities
Competing Approaches
Examine three modeling approaches:
1. Assume asymptotic dependence; i.e. that ν(·) places mass on the entire cone C. Fit a bivariate logistic angular dependence model to largest 10% of observations (in terms of L1 norm). Estimate β = 0.713.
2. Assume asymptotic independence and ignore any possible hidden regular variation.
3. Assume asymptotic independence and hidden regular variation. Fit the ε-restricted infinite hidden measure model via MCEM. Select r∗V = 7.5 and ε = 0.3. Estimate η = 0.748.
Results - risk set estimates
Model             P(Z ∈ A1)   Expected #   p-val
1 (asy. dep.)     0.0297      59.04        0.480
2 (asy. indep.)   0.0120      23.86        8.17 × 10^−5
3 (Y + V)         0.0261      51.89        0.210
Empirical         0.0292      58           −
Results - risk set estimates
Model             P(Z ∈ A2)   Expected #   p-val
1 (asy. dep.)     0.0044      8.74         0.132
2 (asy. indep.)   0.0002      0.40         0.009
3 (Y + V)         0.0018      3.58         0.274
Empirical         0.0025      5            −
Results - risk set estimates
Model             P(Z ∈ A3)   Expected #   p-val
1 (asy. dep.)     0.0010      1.99         0.130
2 (asy. indep.)   0           0            1
3 (Y + V)         0.0002      0.40         0.704
Empirical         0           0            −
Summary
This work introduces a sum representation for regularly varying random vectors possessing hidden regular variation.
• Useful representation for finite samples
• Asymptotically justified by tail equivalence result
• Difficulty arises when H0 is infinite - restrict to a compact cone to simulate V
• Likelihood estimation via modified MCEM algorithm
• Captures tail dependence in the presence of asymptotic independence
• Improved estimation of tail risk set probabilities
References
Heffernan, J. E. and Tawn, J. A. (2004). A conditional approach for multivariate extreme values. Journal of the Royal Statistical Society, Series B, 66:497–546.
Jessen, A. and Mikosch, T. (2006). Regularly varying functions. University of Copenhagen, Laboratory of Actuarial Mathematics.
Ledford, A. and Tawn, J. (1997). Modelling dependence within joint tail regions. Journal of the Royal Statistical Society, Series B, 59:475–499.
Maulik, K. and Resnick, S. (2004). Characterizations and examples of hidden regular variation. Extremes, 7(1):31–67.
Resnick, S. (2002). Hidden regular variation, second order regular variation and asymptotic independence. Extremes, 5(4):303–336.
Weller, G. and Cooley, D. (2013). A sum decomposition for hidden regular variation in joint tail modeling with likelihood inference via the MCEM algorithm. Submitted.