Date posted: 16-Jan-2017
Category: Data & Analytics
Uploaded by: hellebore-capital-limited
Introduction / Statistical distances
Optimal Transport vs. Fisher-Rao distance between Copulas
IEEE SSP 2016
G. Marti, S. Andler, F. Nielsen, P. Donnat
June 28, 2016
Gautier Marti Optimal Transport vs. Fisher-Rao distance between Copulas
Clustering of Time Series
We need a distance $D_{ij}$ between time series $x_i$ and $x_j$
If we look for ‘correlation’, $D_{ij}$ is a decreasing function of $\rho_{ij}$, a measure of ‘correlation’
Several choices are available for $\rho_{ij}$ . . .
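As an illustration of such a decreasing function, the sketch below uses $D_{ij} = \sqrt{2(1 - \rho_{ij})}$ with $\rho_{ij}$ the Pearson correlation. Both choices are assumptions made for this example; the slides do not fix them, and the function name `correlation_distance` is hypothetical.

```python
import numpy as np

def correlation_distance(series):
    """Map time series (one per row) to distances D_ij = sqrt(2 * (1 - rho_ij))."""
    rho = np.corrcoef(series)            # Pearson correlation matrix
    rho = np.clip(rho, -1.0, 1.0)        # guard against rounding outside [-1, 1]
    dist = np.sqrt(2.0 * (1.0 - rho))    # decreasing in rho
    np.fill_diagonal(dist, 0.0)
    return dist

# Toy data (assumption): row 1 is a noisy copy of row 0, row 2 is independent.
rng = np.random.default_rng(0)
base = rng.standard_normal(500)
series = np.vstack([
    base,
    base + 0.1 * rng.standard_normal(500),
    rng.standard_normal(500),
])
D = correlation_distance(series)
print(D.round(3))
```

Swapping `np.corrcoef` for a rank-based or copula-based coefficient changes $\rho_{ij}$ but leaves the mapping to $D_{ij}$ unchanged.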
Copulas
Sklar’s Theorem:
$$F(x_i, x_j) = C_{ij}(F_i(x_i), F_j(x_j))$$
$C_{ij}$, the copula, encodes the dependence structure
Frechet-Hoeffding bounds:
$$\max\{u_i + u_j - 1, 0\} \leq C_{ij}(u_i, u_j) \leq \min\{u_i, u_j\}$$
(left) lower-bound, (mid) independence, (right) upper-bound copulas
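The bounds can be checked numerically for a concrete copula. The sketch below does so for the independence copula $C(u, v) = uv$ on a grid; the function names are illustrative only.

```python
def lower_bound(u, v):
    """Frechet-Hoeffding lower bound (counter-monotonicity)."""
    return max(u + v - 1.0, 0.0)

def upper_bound(u, v):
    """Frechet-Hoeffding upper bound (comonotonicity)."""
    return min(u, v)

def independence(u, v):
    """Independence copula."""
    return u * v

grid = [k / 20 for k in range(21)]
tol = 1e-12  # tolerance for floating-point rounding
bounds_hold = all(
    lower_bound(u, v) - tol <= independence(u, v) <= upper_bound(u, v) + tol
    for u in grid
    for v in grid
)
print("bounds hold on the grid:", bounds_hold)
```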
Copulas - Gaussian Example
Gaussian copula: $C^{\mathrm{Gauss}}_R(u_i, u_j) = \Phi_R(\Phi^{-1}(u_i), \Phi^{-1}(u_j))$
The distribution is parametrized by a correlation matrix $R$.
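A minimal sketch of sampling from a bivariate Gaussian copula: draw $z \sim N(0, R)$, then map each coordinate through the standard normal CDF $\Phi$. The function name, the seed and the value $\rho = 0.8$ are assumptions for illustration.

```python
import math
import numpy as np

def sample_gaussian_copula(rho, n, seed=0):
    """Draw n points (u_i, u_j) in [0, 1]^2 from a bivariate Gaussian copula."""
    rng = np.random.default_rng(seed)
    R = np.array([[1.0, rho], [rho, 1.0]])          # correlation matrix
    z = rng.multivariate_normal(mean=[0.0, 0.0], cov=R, size=n)
    # Standard normal CDF, applied element-wise: Phi(t) = (1 + erf(t / sqrt(2))) / 2
    std_normal_cdf = np.vectorize(
        lambda t: 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))
    )
    return std_normal_cdf(z)

u = sample_gaussian_copula(rho=0.8, n=5000)
print(u[:3])
```

Each margin of `u` is approximately uniform on [0, 1]; only the dependence between the two columns carries information about $R$.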
The Target/Forget (copula-based) Dependence Coefficient
Dependence is measured as the relative distance from independence to the nearest target-dependence: comonotonicity or counter-monotonicity
Which distances are appropriate between copulas for the task of clustering (copulas and time series)?
Definitions - Fisher-Rao geodesic distance
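Metrization of the parameter space $\{\theta \in \mathbb{R}^d \mid \int p(x; \theta)\,dx = 1\}$.
Consider the metric
$$g_{jk}(\theta) = -\int \frac{\partial^2 \log p(x, \theta)}{\partial\theta_j\, \partial\theta_k}\, p(x, \theta)\,dx,$$
the infinitesimal length
$$ds(\theta) = \sqrt{(\nabla\theta)^\top G(\theta)\, \nabla\theta},$$
the Fisher-Rao geodesic distance
$$FR(\theta_1, \theta_2) = \int_{\theta_1}^{\theta_2} ds(\theta).$$
$f$-divergences induce infinitesimal length proportional to Fisher-Rao infinitesimal length:
$$D_f(\theta \,\|\, \theta + d\theta) = \frac{1}{2} (\nabla\theta)^\top G(\theta)\, \nabla\theta.$$
Thus, they have the same local behaviour [1].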
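The local agreement can be illustrated on a simple one-parameter family that is not taken from the slides: $N(0, \sigma^2)$ parametrized by $\sigma$, for which the Fisher information is $G(\sigma) = 2/\sigma^2$ and the Kullback-Leibler divergence (an $f$-divergence) has a closed form. The values of `sigma` and `d_sigma` are arbitrary.

```python
import math

def kl_scale(s1, s2):
    """KL( N(0, s1^2) || N(0, s2^2) )."""
    return math.log(s2 / s1) + s1 ** 2 / (2.0 * s2 ** 2) - 0.5

def fisher_information_scale(s):
    """Fisher information of N(0, s^2) with respect to s."""
    return 2.0 / s ** 2

sigma, d_sigma = 1.5, 1e-3
kl = kl_scale(sigma, sigma + d_sigma)
quadratic = 0.5 * fisher_information_scale(sigma) * d_sigma ** 2
print("KL divergence:      ", kl)
print("0.5 * G * d_sigma^2:", quadratic)
```

The two printed values agree to leading order in `d_sigma`; the gap shrinks as the step gets smaller.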
Definitions - Optimal Transport distances
Wasserstein metric
$$W_p(\mu, \nu)^p = \inf_{\gamma \in \Gamma(\mu, \nu)} \int_{M \times M} d(x, y)^p \, d\gamma(x, y)$$
Image from Optimal Transport for Image Processing, Papadakis
Other transportation distances: regularized discrete optimal transport [3], Sinkhorn distances [2], . . .
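For discrete measures, the entropy-regularized problem behind Sinkhorn distances [2] can be solved with a few lines of matrix scaling. The following is a minimal sketch, not the reference implementation: the grid, the two histograms, the regularization strength `reg` and the iteration count are all assumptions chosen for illustration.

```python
import numpy as np

def sinkhorn(a, b, cost, reg=0.1, n_iter=500):
    """Entropy-regularized OT between histograms a and b (Sinkhorn iterations)."""
    K = np.exp(-cost / reg)              # Gibbs kernel
    u = np.ones_like(a)
    v = np.ones_like(b)
    for _ in range(n_iter):
        u = a / (K @ v)                  # rescale to match row marginals
        v = b / (K.T @ u)                # rescale to match column marginals
    plan = u[:, None] * K * v[None, :]   # approximate transport plan
    return plan, float(np.sum(plan * cost))

# Two histograms on a regular grid of [0, 1], squared Euclidean ground cost.
x = np.linspace(0.0, 1.0, 50)
cost = (x[:, None] - x[None, :]) ** 2

def bump(center, width=0.1):
    w = np.exp(-0.5 * ((x - center) / width) ** 2)
    return w / w.sum()

a, b = bump(0.25), bump(0.75)
plan, transport_cost = sinkhorn(a, b, cost)
print("transport cost between the two bumps:", transport_cost)
```

Smaller values of `reg` bring the result closer to the unregularized Wasserstein cost, at the price of slower convergence and possible numerical underflow in `K`.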
Geometry of covariances
Distances between Gaussian copulas
Copulas $C_1, C_2, C_3$ encoding a correlation of 0.5, 0.99, 0.9999 respectively;
Which pair of copulas is the nearest?
- For Fisher-Rao, Kullback-Leibler, Hellinger and related divergences: $D(C_1, C_2) \leq D(C_2, C_3)$;
- For Wasserstein: $W_2(C_2, C_3) \leq W_2(C_1, C_2)$
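The two orderings can be reproduced with closed-form distances between centered Gaussians $N(0, R)$. This is a sketch under a stated assumption: each Gaussian copula is represented by the Gaussian with its correlation matrix as covariance, with the Fisher-Rao distance $\sqrt{\tfrac{1}{2}\sum_i \log^2 \lambda_i(R_1^{-1} R_2)}$ and the $W_2$ distance $\sqrt{\mathrm{tr}(R_1) + \mathrm{tr}(R_2) - 2\,\mathrm{tr}\big((R_1^{1/2} R_2 R_1^{1/2})^{1/2}\big)}$. The slides do not state which formulas were used.

```python
import numpy as np

def corr(rho):
    """2 x 2 correlation matrix with off-diagonal entry rho."""
    return np.array([[1.0, rho], [rho, 1.0]])

def fisher_rao(S1, S2):
    """Fisher-Rao distance between N(0, S1) and N(0, S2)."""
    lam = np.linalg.eigvals(np.linalg.solve(S1, S2)).real  # eigenvalues of S1^-1 S2
    return float(np.sqrt(0.5 * np.sum(np.log(lam) ** 2)))

def sqrtm_psd(S):
    """Square root of a symmetric positive semi-definite matrix."""
    w, V = np.linalg.eigh(S)
    return (V * np.sqrt(np.clip(w, 0.0, None))) @ V.T

def wasserstein2(S1, S2):
    """2-Wasserstein distance between N(0, S1) and N(0, S2)."""
    root = sqrtm_psd(S1)
    cross = sqrtm_psd(root @ S2 @ root)
    w2_sq = np.trace(S1) + np.trace(S2) - 2.0 * np.trace(cross)
    return float(np.sqrt(max(w2_sq, 0.0)))

C1, C2, C3 = corr(0.5), corr(0.99), corr(0.9999)
print("Fisher-Rao: D(C1,C2), D(C2,C3) =", fisher_rao(C1, C2), fisher_rao(C2, C3))
print("W2:         D(C1,C2), D(C2,C3) =", wasserstein2(C1, C2), wasserstein2(C2, C3))
```

Fisher-Rao stretches distances near perfect correlation, where the covariance becomes nearly singular, whereas $W_2$ compares the matrices on a square-root scale and sees $C_2$ and $C_3$ as almost identical.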
Distances as a function of (ρ1, ρ2)
Distance heatmap and surface as a function of $(\rho_1, \rho_2)$ for Fisher-Rao and for Wasserstein $W_2$
Distances impact on clustering
Datasets of bivariate time series are generated from six Gaussian copulas with correlation .1, .2, .6, .7, .99, .9999
Distance heatmaps for Fisher-Rao (left), $W_2$ (right); Using Ward clustering, Fisher-Rao yields clusters of copulas with correlations {.1, .2, .6, .7}, {.99}, {.9999}, $W_2$ yields {.1, .2}, {.6, .7}, {.99, .9999}
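A simplified sketch of this experiment, working directly on the six correlation parameters rather than on estimated copulas from generated time series (an assumption, as is the use of SciPy). For 2 x 2 correlation matrices, which share eigenvectors, the centered-Gaussian Fisher-Rao and $W_2$ distances reduce to the closed forms used below. Under these assumptions the sketch is expected to recover the groupings reported above.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform

rhos = [0.1, 0.2, 0.6, 0.7, 0.99, 0.9999]

def fisher_rao(r1, r2):
    """Fisher-Rao distance between N(0, R(r1)) and N(0, R(r2)), 2 x 2 case."""
    return np.sqrt(0.5 * (np.log((1 + r2) / (1 + r1)) ** 2
                          + np.log((1 - r2) / (1 - r1)) ** 2))

def wasserstein2(r1, r2):
    """W2 distance between N(0, R(r1)) and N(0, R(r2)), 2 x 2 case."""
    return np.sqrt((np.sqrt(1 + r1) - np.sqrt(1 + r2)) ** 2
                   + (np.sqrt(1 - r1) - np.sqrt(1 - r2)) ** 2)

def ward_clusters(dist, n_clusters=3):
    """Ward hierarchical clustering of the six copulas, cut into n_clusters groups."""
    D = np.array([[dist(a, b) for b in rhos] for a in rhos])
    Z = linkage(squareform(D, checks=False), method="ward")
    labels = fcluster(Z, t=n_clusters, criterion="maxclust")
    groups = {}
    for rho, label in zip(rhos, labels):
        groups.setdefault(label, []).append(rho)
    return sorted(groups.values())

print("Fisher-Rao:", ward_clusters(fisher_rao))
print("W2:        ", ward_clusters(wasserstein2))
```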
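Fisher metric and the Cramer–Rao lower bound
Cramer–Rao lower bound (CRLB)
The variance of any unbiased estimator $\hat{\theta}$ of $\theta$ is bounded by the reciprocal of the Fisher information $G(\theta)$:
$$\mathrm{var}(\hat{\theta}) \geq \frac{1}{G(\theta)}.$$
In the bivariate Gaussian copula case,
$$\mathrm{var}(\hat{\rho}) \geq \frac{(\rho - 1)^2 (\rho + 1)^2}{\rho^2 + 1}.$$
Fisher metric and the Cramer–Rao lower bound
We consider the set of $2 \times 2$ correlation matrices $C = \begin{pmatrix} 1 & \theta \\ \theta & 1 \end{pmatrix}$ parameterized by $\theta$.
Let $x = \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} \in \mathbb{R}^2$.
$$f(x; \theta) = \frac{1}{2\pi\sqrt{1 - \theta^2}} \exp\left(-\frac{1}{2} x^\top C^{-1} x\right) = \frac{1}{2\pi\sqrt{1 - \theta^2}} \exp\left(-\frac{1}{2(1 - \theta^2)} (x_1^2 + x_2^2 - 2\theta x_1 x_2)\right)$$
$$\log f(x; \theta) = -\log\left(2\pi\sqrt{1 - \theta^2}\right) - \frac{1}{2(1 - \theta^2)} (x_1^2 + x_2^2 - 2\theta x_1 x_2)$$
$$\frac{\partial^2 \log f(x; \theta)}{\partial\theta^2} = \frac{\theta^2 + 1}{(\theta^2 - 1)^2} - \frac{x_1^2}{2(\theta + 1)^3} + \frac{x_1^2}{2(\theta - 1)^3} - \frac{x_2^2}{2(\theta + 1)^3} + \frac{x_2^2}{2(\theta - 1)^3} - \frac{x_1 x_2}{(\theta + 1)^3} - \frac{x_1 x_2}{(\theta - 1)^3}$$
Then, we compute $\int_{-\infty}^{\infty} \frac{\partial^2 \log f(x; \theta)}{\partial\theta^2} f(x; \theta)\,dx$.
Since $E[x_1] = E[x_2] = 0$, $E[x_1 x_2] = \theta$, $E[x_1^2] = E[x_2^2] = 1$, we get
$$\int_{-\infty}^{\infty} \frac{\partial^2 \log f(x; \theta)}{\partial\theta^2} f(x; \theta)\,dx = \frac{\theta^2 + 1}{(\theta^2 - 1)^2} - \frac{1}{2(\theta + 1)^3} + \frac{1}{2(\theta - 1)^3} - \frac{1}{2(\theta + 1)^3} + \frac{1}{2(\theta - 1)^3} - \frac{\theta}{(\theta + 1)^3} - \frac{\theta}{(\theta - 1)^3} = -\frac{\theta^2 + 1}{(\theta - 1)^2 (\theta + 1)^2}$$
Thus,
$$G(\theta) = \frac{\theta^2 + 1}{(\theta - 1)^2 (\theta + 1)^2}.$$
Fisher metric and the Cramer–Rao lower bound
In the bivariate Gaussian copula case,
$$\mathrm{var}(\hat{\rho}) \geq \frac{(\rho - 1)^2 (\rho + 1)^2}{\rho^2 + 1}.$$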
Recall that locally Fisher-Rao and the $f$-divergences are a quadratic form of the Fisher metric $(\nabla\theta)^\top G(\theta)\, \nabla\theta$. So, the discriminative power of these distances is well calibrated with respect to statistical uncertainty. For this purpose, they induce the appropriate curvature on the parameter space.
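The calibration can be made concrete by estimating the Fisher information numerically from the local quadratic behaviour of the Kullback-Leibler divergence, $KL(\theta \,\|\, \theta + h) \approx \tfrac{1}{2} G(\theta) h^2$, for the centered bivariate Gaussian with correlation $\theta$. This is a sketch; the step `h` and the evaluation points are arbitrary choices.

```python
import math

def kl_corr(t1, t2):
    """KL( N(0, C(t1)) || N(0, C(t2)) ) for 2 x 2 correlation matrices C(t)."""
    trace_term = (2.0 - 2.0 * t1 * t2) / (1.0 - t2 ** 2)   # tr(C(t2)^-1 C(t1))
    log_det_ratio = math.log((1.0 - t2 ** 2) / (1.0 - t1 ** 2))
    return 0.5 * (trace_term - 2.0 + log_det_ratio)

def fisher_information_estimate(theta, h=1e-5):
    """Estimate G(theta) from KL(theta || theta + h) ~ 0.5 * G(theta) * h^2."""
    return 2.0 * kl_corr(theta, theta + h) / h ** 2

for theta in (0.0, 0.5, 0.9):
    g = fisher_information_estimate(theta)
    print(f"theta={theta}: G ~ {g:.4f}, Cramer-Rao bound ~ {1.0 / g:.6f}")
```

The estimated information grows quickly as $|\theta|$ approaches 1, so the attainable estimation variance shrinks and Fisher-Rao separates highly correlated copulas more sharply.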
Properties of these distances
In addition, for clustering we prefer OT since:
in a parametric setting:
- Fisher-Rao and $f$-divergences are defined on density manifolds, but some important copulas (such as the Frechet-Hoeffding upper bound) do not belong to these manifolds;
- Thus, in case of closed-form formulas (such as in the Gaussian case), they are ill-defined for these copulas (for perfect dependence, covariance is not invertible)
in a non-parametric/empirical setting:
- $f$-divergences are defined for absolutely continuous measures, thus require a pre-processing KDE
- they are not aware of the support geometry, thus badly handle noise on the support
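The support-geometry point can be seen on a toy example (not from the slides): two one-bin histograms whose supports do not overlap. The Hellinger distance, whose square is an $f$-divergence, saturates as soon as the supports are disjoint, while the 1D Wasserstein distance $W_1$ keeps growing with the shift.

```python
import numpy as np

def spike(position, n_bins=20):
    """Histogram with all its mass in a single bin."""
    h = np.zeros(n_bins)
    h[position] = 1.0
    return h

def hellinger(p, q):
    """Hellinger distance between two histograms."""
    return float(np.sqrt(0.5 * np.sum((np.sqrt(p) - np.sqrt(q)) ** 2)))

def wasserstein1(p, q, bin_width=1.0):
    """1D Wasserstein-1 distance: integral of the absolute CDF difference."""
    return float(np.sum(np.abs(np.cumsum(p) - np.cumsum(q))) * bin_width)

reference = spike(5)
for shift in (1, 3, 8):
    moved = spike(5 + shift)
    print(shift, hellinger(reference, moved), wasserstein1(reference, moved))
```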
Barycenters
OT is defined for both discrete/empirical and continuous measures and is support-geometry aware:
[Figure: five copula density heatmaps on $[0, 1]^2$]
5 copulas describing the dependence between $X \sim U([0, 1])$ and $Y \sim (X \pm \varepsilon_i)^2$, where $\varepsilon_i$ is a constant noise specific for each distribution
[Figure: Wasserstein barycenter copula]
Barycenter of the 5 copulas for a divergence and OT
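The contrast between the two kinds of barycenter can be sketched in one dimension, a deliberate simplification of the 2D copula setting. For equally weighted 1D samples of equal size, the $W_2$ barycenter is obtained by averaging sorted samples (quantile functions), whereas the pointwise average of the densities, which minimizes $\sum_i KL(p_i \,\|\, q)$ over $q$, is the mixture. The three Gaussian samples below are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
# Three narrow distributions that differ only by a shift of their support.
samples = [rng.normal(loc=m, scale=0.1, size=1000) for m in (-2.0, 0.0, 2.0)]

# W2 barycenter in 1D: average the quantile functions (sorted samples).
wasserstein_barycenter = np.mean([np.sort(s) for s in samples], axis=0)

# Pointwise average of the densities: pooling the samples gives the mixture.
mixture = np.concatenate(samples)

print("std of Wasserstein barycenter:", wasserstein_barycenter.std())
print("std of mixture:               ", mixture.std())
```

The Wasserstein barycenter keeps the narrow shape of the inputs and sits at their average location, while the mixture is spread over all three supports.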
Future Research
Develop further geometries of copulas
- using Optimal Transport: show that dependence-clustering of time series is improved over standard correlations
- using $f$-divergences: detect efficiently dependence-regime switching in multivariate time series (cf. Frederic Barbaresco’s work on radar signal processing)
Numerical experiments and code:
https://www.datagrapple.com/Tech/fisher-vs-ot.html
[1] Shun-ichi Amari and Andrzej Cichocki. Information geometry of divergence functions. Bulletin of the Polish Academy of Sciences: Technical Sciences, 58(1):183–195, 2010.
[2] Marco Cuturi. Sinkhorn distances: Lightspeed computation of optimal transport. In Advances in Neural Information Processing Systems, pages 2292–2300, 2013.
[3] Sira Ferradans, Nicolas Papadakis, Julien Rabin, Gabriel Peyre, and Jean-Francois Aujol. Regularized discrete optimal transport. Springer, 2013.