
Anisotropic Metropolis Adjusted Langevin Algorithm: convergence and utility in Stochastic EM...

Date post: 06-Dec-2014
Description:
talk by S. Allassonnière (CMAP, Ecole Polytechnique) at BigMC seminar, 12/01/2011

Anisotropic Metropolis adjusted Langevin algorithm: Convergence and utility in stochastic EM algorithm.

Stéphanie Allassonnière

CMAP, École Polytechnique

BigMC, January 2012

Joint work with Estelle Kuhn (INRA, France)

Stephanie Allassonniere (CMAP) AMALA BigMC, January 2012 1 / 42

Introduction

Where does the problem come from?

Image analysis: compare two observations by quantifying the deformation from one to the other (D'Arcy Thompson, 1917)

Registration / Variance

Each element of a population is a smooth deformation of a template

Template estimation / Mean


Deformable template model ($u$ = voxel, $v_u$ its position):

$$ y(u) = I_0\big(v_u - m(v_u)\big) + \sigma \varepsilon(u) $$

Template $I_0$ and geometry $\mathrm{Law}(m)$ estimation

High-dimensional setting, low sample size

Considering the LDDMM framework through the shooting equations
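As an illustration only, the observation model can be simulated in one dimension. This is a minimal sketch: the Gaussian-bump template, the constant displacement field and the function names (`template`, `deformation`, `observe`) are assumptions made for the example and are not taken from the talk.

```python
import math
import random

def template(v):
    """Hypothetical 1D template I0: a Gaussian bump centred at 0."""
    return math.exp(-0.5 * v * v)

def deformation(v, shift=0.5):
    """Hypothetical smooth displacement field m; here a constant shift."""
    return shift

def observe(positions, sigma, rng):
    """Sample y(u) = I0(v_u - m(v_u)) + sigma * eps(u), with eps(u) ~ N(0, 1)."""
    return [template(v - deformation(v)) + sigma * rng.gauss(0.0, 1.0)
            for v in positions]

rng = random.Random(0)
positions = [-3.0 + 0.25 * u for u in range(25)]  # voxel positions v_u
y = observe(positions, sigma=0.1, rng=rng)
```

In the talk's setting both the template and the law of the deformation are unknown and have to be estimated from a small number of such observations.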


Outline:

1. AMALA: simulation of random variables in high dimension

Anisotropic MALA description
Convergence property

2. AMALA within stochastic algorithm for parameter estimation

Maximum likelihood estimation for incomplete data setting
AMALA-SAEM
Convergence properties

3. Experiments

BME-Template model: small deformation setting
BME-Template model: LDDMM setting


Anisotropic Metropolis Adjusted Langevin algorithm (AMALA)

Introduction:

General setting:

Simulation of random variables in high-dimensional settings → Gibbs sampler not useful

Metropolis Adjusted Langevin Algorithm (MALA)

Target distribution: $\pi$

At iteration $k$ of this algorithm, with $X_k$ the current value:

Simulate $X_c$ w.r.t. $\mathcal{N}(X_k + \delta D(X_k), \delta\, \mathrm{Id}_d)$, where

$$ D(x) = \frac{b}{\max(b, |\nabla \log \pi(x)|)} \nabla \log \pi(x). $$

Update $X_{k+1} = X_c$ with probability

$$ \alpha(X_k, X_c) = \min\left(1, \frac{\pi(X_c)\, q_{MALA}(X_c, X_k)}{q_{MALA}(X_k, X_c)\, \pi(X_k)}\right) $$

and $X_{k+1} = X_k$ otherwise.

Problem: isotropic covariance matrix = numerically trapped ($\alpha(X_k, X_c) = 0$)
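The MALA iteration described above can be sketched in plain Python. This is an illustrative sketch, not the speaker's code: the function names, the standard Gaussian target and the values of `delta` and `b` are assumptions chosen for the example.

```python
import math
import random

def truncated_drift(grad, b):
    """D(x) = b / max(b, |grad|) * grad, so that |D(x)| <= b."""
    norm = math.sqrt(sum(g * g for g in grad))
    scale = b / max(b, norm)
    return [scale * g for g in grad]

def log_q_mala(x_from, x_to, grad_log_pi, delta, b):
    """Log density, up to an additive constant, of
    N(x_from + delta * D(x_from), delta * Id) evaluated at x_to."""
    drift = truncated_drift(grad_log_pi(x_from), b)
    sq = sum((t - f - delta * d) ** 2 for t, f, d in zip(x_to, x_from, drift))
    return -sq / (2.0 * delta)

def mala_step(x, log_pi, grad_log_pi, delta, b, rng):
    """One MALA iteration: propose X_c, then accept or reject it."""
    drift = truncated_drift(grad_log_pi(x), b)
    cand = [xi + delta * di + math.sqrt(delta) * rng.gauss(0.0, 1.0)
            for xi, di in zip(x, drift)]
    log_alpha = (log_pi(cand) + log_q_mala(cand, x, grad_log_pi, delta, b)
                 - log_pi(x) - log_q_mala(x, cand, grad_log_pi, delta, b))
    if rng.random() < math.exp(min(0.0, log_alpha)):
        return cand, True
    return x, False

# Hypothetical target: standard Gaussian in dimension 2.
def log_pi(x):
    return -0.5 * sum(xi * xi for xi in x)

def grad_log_pi(x):
    return [-xi for xi in x]

rng = random.Random(0)
x, accepted = [3.0, -3.0], 0
for _ in range(1000):
    x, ok = mala_step(x, log_pi, grad_log_pi, delta=0.5, b=5.0, rng=rng)
    accepted += ok
print("acceptance rate:", accepted / 1000)
```

The acceptance ratio is computed on the log scale, which delays but does not remove the numerical trapping mentioned on the slide when the target has very different scales across directions.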


→ Anisotropic Metropolis Adjusted Langevin Algorithm (AMALA)

Description of the algorithm

How to include anisotropy?

Following the magnitude of the gradient

First approximation: independence of directions

Bounded covariance (same as bounded drift)


Anisotropic Metropolis Adjusted Langevin Algorithm (AMALA)

For all $k = 1 : k_{end}$ (iterates of the Markov chain):

Sample $X_c$ with respect to

$$ \mathcal{N}\big(X_k + \delta D(X_k), \delta \Sigma(X_k)\big) $$

with

$$ D(X_k) = \frac{b}{\max(b, |\nabla \log \pi(X_k)|)} \nabla \log \pi(X_k) $$

and

$$ \Sigma(X_k) = \mathrm{Id}_d + \mathrm{diag}\big([\nabla \log \pi(X_k)]_1^2 \wedge b, \ldots, [\nabla \log \pi(X_k)]_d^2 \wedge b\big). $$

Compute the acceptance ratio

$$ \alpha(X_k, X_c) = \min\left(1, \frac{\pi(X_c)\, q_c(X_c, X_k)}{q_c(X_k, X_c)\, \pi(X_k)}\right) $$

($q_c$ = the pdf of this distribution).

Set $X_{k+1} = X_c$ with probability $\alpha(X_k, X_c)$ and $X_{k+1} = X_k$ with probability $1 - \alpha(X_k, X_c)$ (acceptance/rejection).
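A possible sketch of one AMALA iteration follows. Because $\Sigma$ depends on the current point, the proposal density must keep its normalising term (the log-determinant), unlike in the isotropic case. The helper names, the anisotropic Gaussian target and the values of `delta` and `b` are assumptions made for illustration.

```python
import math
import random

def amala_moments(x, grad_log_pi, delta, b):
    """Mean and diagonal variances of N(x + delta*D(x), delta*Sigma(x))."""
    grad = grad_log_pi(x)
    norm = math.sqrt(sum(g * g for g in grad))
    scale = b / max(b, norm)                       # truncated drift
    mean = [xi + delta * scale * gi for xi, gi in zip(x, grad)]
    var = [delta * (1.0 + min(gi * gi, b)) for gi in grad]
    return mean, var

def log_q_amala(x_from, x_to, grad_log_pi, delta, b):
    """Log pdf of the proposal started at x_from, evaluated at x_to."""
    mean, var = amala_moments(x_from, grad_log_pi, delta, b)
    return -0.5 * sum(math.log(2.0 * math.pi * v) + (t - m) ** 2 / v
                      for t, m, v in zip(x_to, mean, var))

def amala_step(x, log_pi, grad_log_pi, delta, b, rng):
    """One AMALA iteration: anisotropic proposal, then accept or reject."""
    mean, var = amala_moments(x, grad_log_pi, delta, b)
    cand = [m + math.sqrt(v) * rng.gauss(0.0, 1.0) for m, v in zip(mean, var)]
    log_alpha = (log_pi(cand) + log_q_amala(cand, x, grad_log_pi, delta, b)
                 - log_pi(x) - log_q_amala(x, cand, grad_log_pi, delta, b))
    if rng.random() < math.exp(min(0.0, log_alpha)):
        return cand, True
    return x, False

# Hypothetical target: zero-mean Gaussian with variances 1 and 100.
VARIANCES = [1.0, 100.0]

def log_pi(x):
    return -0.5 * sum(xi * xi / s for xi, s in zip(x, VARIANCES))

def grad_log_pi(x):
    return [-xi / s for xi, s in zip(x, VARIANCES)]

rng = random.Random(0)
x, accepted = [0.0, 0.0], 0
for _ in range(1000):
    x, ok = amala_step(x, log_pi, grad_log_pi, delta=0.5, b=4.0, rng=rng)
    accepted += ok
print("acceptance rate:", accepted / 1000)
```

The covariance is diagonal, which reflects the "independence of directions" approximation, and each entry is capped by `b`, which keeps the covariance bounded.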


Geometric ergodicity of the Markov chain

Condition:

$\pi$ super-exponential: smoothness condition on the target distribution

(B1) The density $\pi$ is positive with continuous first derivative such that:

$$ \lim_{|x| \to \infty} n(x) \cdot \nabla \log \pi(x) = -\infty \qquad (1) $$

and

$$ \limsup_{|x| \to \infty} n(x) \cdot m(x) < 0 \qquad (2) $$

where $\nabla$ is the gradient operator in $\mathbb{R}^d$, $n(x) = \frac{x}{|x|}$ is the unit vector pointing in the direction of $x$ and $m(x) = \frac{\nabla \pi(x)}{|\nabla \pi(x)|}$ is the unit vector in the direction of the gradient of the stationary distribution at point $x$.
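As a quick sanity check of (B1), consider the standard Gaussian target, chosen here only as an illustration and not taken from the talk. Both conditions hold:

```latex
\pi(x) \propto \exp\!\left(-\tfrac{1}{2}|x|^2\right)
\quad\Longrightarrow\quad
\nabla \log \pi(x) = -x, \qquad \nabla \pi(x) = -x\,\pi(x).

% Condition (1): the radial component of the log-gradient diverges.
n(x) \cdot \nabla \log \pi(x) = \frac{x}{|x|} \cdot (-x) = -|x|
\xrightarrow[|x| \to \infty]{} -\infty .

% Condition (2): the gradient of the density points towards the origin.
m(x) = \frac{\nabla \pi(x)}{|\nabla \pi(x)|} = -\frac{x}{|x|} = -n(x),
\qquad
n(x) \cdot m(x) = -1 < 0 .
```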

Result:

Existence of a small set

$$ \Pi(x, A) \ge \varepsilon\, \nu(A)\, \mathbb{1}_C(x), \quad \forall x \in \mathcal{X} \text{ and } \forall A \in \mathcal{B} $$

Drift condition: pulls the chain back into the small set

$$ \Pi V(x) \le \lambda V(x) + b\, \mathbb{1}_C(x). $$

Geometric ergodicity

$$ \sup_{x \in \mathcal{X}} \frac{|\Pi^n V(x) - \pi(x)|}{V(x)} \le R \rho^n. \qquad (3) $$


Experiments on synthetic data

Target: 10-dimensional Gaussian distribution with zero mean and diagonal covariance matrix with diagonal coefficients randomly picked between 1 and 2500

Comparison of AMALA and symmetric random walk

500,000 iterations for each algorithm, starting at zero

Mean squared jump distance (MSJD) in stationarity: AMALA 0.1504, random walk 0.0407.
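For reference, the mean squared jump distance of a stored chain can be computed as below. This is a minimal sketch using the usual definition (average squared Euclidean distance between consecutive states); the function name and the tiny example chain are assumptions, and the figures quoted on the slide are not reproduced here.

```python
def mean_squared_jump_distance(chain):
    """Average of |X_{k+1} - X_k|^2 over consecutive states of the chain."""
    jumps = [sum((a - b) ** 2 for a, b in zip(nxt, cur))
             for cur, nxt in zip(chain, chain[1:])]
    return sum(jumps) / len(jumps)

# A rejected move repeats the current state and contributes a zero jump.
chain = [[0.0, 0.0], [3.0, 4.0], [3.0, 4.0]]
msjd = mean_squared_jump_distance(chain)
```

A larger MSJD in stationarity indicates that the sampler moves further per iteration on average, which is why it is used to compare the two samplers.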

Figure: Autocorrelation functions of the AMALA (red) and the random walk (blue) samplers for four of the ten components of the 10-dimensional Gaussian distribution.
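The curves in such a figure are empirical autocorrelation functions of one component of the chain. A minimal sketch of the standard estimator follows; the function name is an assumption and the short series is only there to make the example self-contained.

```python
def autocorrelation(series, max_lag):
    """Empirical autocorrelation of a scalar series for lags 0..max_lag."""
    n = len(series)
    mean = sum(series) / n
    centred = [s - mean for s in series]
    var = sum(c * c for c in centred) / n
    return [sum(centred[t] * centred[t + lag] for t in range(n - lag)) / (n * var)
            for lag in range(max_lag + 1)]

acf = autocorrelation([1.0, 2.0, 3.0, 4.0], max_lag=1)
```

A sampler whose autocorrelation decays faster with the lag produces more nearly independent draws per iteration.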

Why not use existing MALA-like algorithms?

Optimised MALA-like algorithms are usually adaptive

Good performance in practice

Good theoretical properties

However

Numerical problems at the first iterations (not yet stationary): convergence time?

Most important: our goal = parameter estimation

AMALA = one tool inside another algorithm

Adaptive + estimation algorithm = numerical issues: too many degrees of freedom



Applying AMALA within SAEM

Maximum likelihood estimation for incomplete data setting

$y \in \mathbb{R}^n$: observed data

$z \in \mathbb{R}^l$: missing data

$(y, z) \in \mathbb{R}^{n+l}$: complete data

$\mathcal{P} = \{f(y, z; \theta), \theta \in \Theta\}$: family of parametric pdfs on $\mathbb{R}^{n+l}$

Assumption: $\exists\, \theta \in \Theta$ s.t. the complete data likelihood $q(y, z; \theta) = f(y, z; \theta)$

Observed likelihood:

$$ g(y; \theta) = \int f(y, z; \theta)\, \mu(dz). \qquad (4) $$

Given a sample of observations $(y_i)_{1 \le i \le n} = y_1^n$, find $\theta_g$ in $\Theta$ s.t.

$$ \theta_g = \arg\max_{\theta \in \Theta} g(y_1^n; \theta) $$
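To make equation (4) concrete, here is a toy model where the integral can be checked numerically. The model (scalar $z \sim \mathcal{N}(0, 1)$ and $y \mid z \sim \mathcal{N}(\theta + z, 1)$, so that $g(y; \theta)$ is the $\mathcal{N}(\theta, 2)$ density), the function names, the integration bounds and the grid are all assumptions made for this sketch.

```python
import math

def normal_pdf(x, mean, var):
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def complete_likelihood(y, z, theta):
    """Hypothetical toy model: z ~ N(0, 1) and y | z ~ N(theta + z, 1)."""
    return normal_pdf(z, 0.0, 1.0) * normal_pdf(y, theta + z, 1.0)

def observed_likelihood(y, theta, lo=-10.0, hi=10.0, n=400):
    """Trapezoidal approximation of g(y; theta) = integral of f(y, z; theta) dz."""
    h = (hi - lo) / n
    total = 0.5 * (complete_likelihood(y, lo, theta)
                   + complete_likelihood(y, hi, theta))
    total += sum(complete_likelihood(y, lo + i * h, theta) for i in range(1, n))
    return h * total

# Maximise the observed log-likelihood of a small sample over a grid of theta.
ys = [0.2, 1.1, 0.8]
grid = [i / 20.0 for i in range(-20, 41)]
theta_hat = max(grid, key=lambda th: sum(math.log(observed_likelihood(y, th))
                                         for y in ys))
```

In this toy case the maximiser is the sample mean of the observations. In the models of the talk the integral over the missing data has no closed form and cannot be handled by a grid, which motivates the stochastic algorithm.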


AMALA-SAEM

Incomplete data setting + maximum likelihood estimation = EM algorithm

General case → E step not tractable

Stochastic Approximation EM for convergence properties

with MCMC method for simulation step.


→ AMALA-SAEM: using AMALA as the MCMC method

Page 57: Anisotropic Metropolis Adjusted Langevin Algorithm: convergence and utility in Stochastic EM algorithm

Applying AMALA within SAEM Description of the algorithm

Description of the algorithm

Assumption: the model belongs to the exponential family, so all information is carried by the sufficient statistics $S$.

For $k = 1 : k_{\mathrm{end}}$ (iteration of SAEM):

Sample $z_k$ through a single AMALA step (simulation and acceptance/rejection) using the current parameter $\theta_{k-1}$.

Compute the stochastic approximation

$s_k = s_{k-1} + \gamma_k \left( S(z_k) - s_{k-1} \right)$,

where $(\gamma_k)_k$ is a sequence of positive step sizes.

Update the parameter: $\theta_k = \theta(s_k)$.

Can require truncation on random boundaries for convergence purposes.
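The iteration above can be sketched on a toy model. This is only a minimal sketch under stated assumptions: the model ($z_i \sim \mathcal{N}(\theta, 1)$, $y_i \mid z_i \sim \mathcal{N}(z_i, 1)$, so that $S(z)$ is the mean of the $z_i$ and $\theta(s) = s$), the plain isotropic MALA step used in place of AMALA, the step sizes ($\gamma_k = 1$ during a burn-in, then $\gamma_k = (k - \text{burn-in})^{-0.7}$) and every function name are illustrative choices, not taken from the talk. No truncation on random boundaries is implemented.

```python
import math
import random

def log_post(zi, yi, theta):
    """Log density (up to a constant) of z_i given y_i under the assumed toy model."""
    return -0.5 * (zi - theta) ** 2 - 0.5 * (yi - zi) ** 2

def mala_step(zi, yi, theta, delta, rng):
    """One Metropolis-adjusted Langevin step (isotropic stand-in for AMALA)."""
    def proposal_mean(x):
        # x + (delta / 2) * gradient of log_post at x
        return x + 0.5 * delta * (theta + yi - 2.0 * x)
    prop = proposal_mean(zi) + math.sqrt(delta) * rng.gauss(0.0, 1.0)
    log_ratio = (log_post(prop, yi, theta) - log_post(zi, yi, theta)
                 - (zi - proposal_mean(prop)) ** 2 / (2.0 * delta)
                 + (prop - proposal_mean(zi)) ** 2 / (2.0 * delta))
    accept = rng.random() < math.exp(min(0.0, log_ratio))
    return prop if accept else zi

def saem(y, n_iter=1500, burn_in=300, delta=0.5, seed=0):
    """Toy SAEM loop: simulation, stochastic approximation, parameter update."""
    rng = random.Random(seed)
    z, s, theta = list(y), 0.0, 0.0
    for k in range(1, n_iter + 1):
        # Simulation: a single MCMC step with the current parameter theta_{k-1}
        z = [mala_step(zi, yi, theta, delta, rng) for zi, yi in zip(z, y)]
        # Stochastic approximation: s_k = s_{k-1} + gamma_k (S(z_k) - s_{k-1})
        gamma = 1.0 if k <= burn_in else (k - burn_in) ** -0.7
        s += gamma * (sum(z) / len(z) - s)
        # Maximisation: theta_k = theta(s_k), the identity map in this toy model
        theta = s
    return theta
```

In this toy model the observed data satisfy $y_i \sim \mathcal{N}(\theta, 2)$, so the output of `saem(y)` can be compared with the sample mean of `y`, which is the maximum likelihood estimate.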

Page 58: Anisotropic Metropolis Adjusted Langevin Algorithm: convergence and utility in Stochastic EM algorithm

Applying AMALA within SAEM Convergence properties

Convergence properties

Conditions:

Smoothness of the model (classic conditions for convergence of stochastic approximation and EM)

Condition for AMALA geometric ergodicity (B1)

Results:

Convergence of $(s_k)$ a.s. towards a critical point of the mean field of the problem

Convergence of the estimated parameters $(\theta_k)$ a.s. towards a critical point of the observed likelihood

Central limit theorem for $(\theta_k)$ with rate $1/\sqrt{\gamma_k}$

Page 62: Anisotropic Metropolis Adjusted Langevin Algorithm: convergence and utility in Stochastic EM algorithm

Applying AMALA within SAEM Convergence properties

Conditions for the SA to converge

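Define for any $V : \mathcal{X} \to [1, \infty]$ and any $g : \mathcal{X} \to \mathbb{R}^m$ the norm

\[ \|g\|_V = \sup_{z \in \mathcal{X}} \frac{\|g(z)\|}{V(z)} . \]

(A1') $\mathcal{S}$ is an open subset of $\mathbb{R}^m$, $h : \mathcal{S} \to \mathbb{R}^m$ is continuous and there exists a continuously differentiable function $w : \mathcal{S} \to [0, \infty[$ with the following properties.

(i) There exists an $M_0 > 0$ such that

\[ \mathcal{L} \triangleq \{ s \in \mathcal{S}, \langle \nabla w(s), h(s) \rangle = 0 \} \subset \{ s \in \mathcal{S}, w(s) < M_0 \} . \]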

Page 63: Anisotropic Metropolis Adjusted Langevin Algorithm: convergence and utility in Stochastic EM algorithm

Applying AMALA within SAEM Convergence properties

Conditions for the SA to converge (2)

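(ii) There exists a closed convex set $\mathcal{S}_a \subset \mathcal{S}$ for which $s \to s + \rho H_s(z) \in \mathcal{S}_a$ for any $\rho \in [0, 1]$ and $(z, s) \in \mathcal{X} \times \mathcal{S}_a$ ($\mathcal{S}_a$ is absorbing) and such that for any $M_1 \in ]M_0, \infty]$, the set $\mathcal{W}_{M_1} \cap \mathcal{S}_a$ is a compact set of $\mathcal{S}$, where $\mathcal{W}_{M_1} \triangleq \{ s \in \mathcal{S}, w(s) \le M_1 \}$.

(iii) For any $s \in \mathcal{S} \setminus \mathcal{L}$, $\langle \nabla w(s), h(s) \rangle < 0$.

(iv) The closure of $w(\mathcal{L})$ has an empty interior.

(A2') For any $s \in \mathcal{S}$, $H_s : \mathcal{X} \to \mathcal{S}$ is measurable and $\int \|H_s(z)\| \, \pi_s(dz) < \infty$.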

Page 64: Anisotropic Metropolis Adjusted Langevin Algorithm: convergence and utility in Stochastic EM algorithm

Applying AMALA within SAEM Convergence properties

Conditions for the SA to converge (3)

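(A3'') There exists a function $V : \mathcal{X} \to [1, \infty]$ such that $\{ z \in \mathcal{X}, V(z) < \infty \} \neq \emptyset$, and constants $a \in ]0, 1]$, $p \ge 2$, $r > 0$ and $q \ge 1$ such that for any compact subset $\mathcal{K} \subset \mathcal{S}$,

(i)

\[ \sup_{s \in \mathcal{K}} \|H_s\|_V < \infty , \tag{5} \]

\[ \sup_{s \in \mathcal{K}} \left( \|g_s\|_V + \|\Pi_s g_s\|_V \right) < \infty , \tag{6} \]

\[ \sup_{s, s' \in \mathcal{K}} \|s - s'\|^{-a} \left\{ \|g_s - g_{s'}\|_{V^q} + \|\Pi_s g_s - \Pi_{s'} g_{s'}\|_{V^q} \right\} < \infty , \tag{7} \]

where for any $s \in \mathcal{S}$ a solution of the Poisson equation $g - \Pi_s g = H_s - \pi_s(H_s)$ is denoted by $g_s$.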

Page 65: Anisotropic Metropolis Adjusted Langevin Algorithm: convergence and utility in Stochastic EM algorithm

Applying AMALA within SAEM Convergence properties

Conditions for the SA to converge (4)

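(ii) For any sequence $\boldsymbol{\varepsilon} = (\varepsilon_k)_{k \ge 0}$ satisfying $\varepsilon_k < \varepsilon$ for an $\varepsilon$ sufficiently small, and for any sequence $\boldsymbol{\gamma} = (\gamma_k)_{k \ge 0}$, there exists a constant $C$ such that for any $z \in \mathcal{X}$,

\[ \sup_{s \in \mathcal{K}} \sup_{k \ge 0} \mathbb{E}^{\boldsymbol{\gamma}}_{z,s} \left[ V^p(z_k) \mathbf{1}_{\sigma(\mathcal{K}) \wedge \nu(\boldsymbol{\varepsilon}) \ge k} \right] \le C V^{p+r}(z) , \tag{8} \]

where $\nu(\boldsymbol{\varepsilon}) = \inf\{ k \ge 1, \|s_k - s_{k-1}\| \ge \varepsilon_k \}$ and $\sigma(\mathcal{K}) = \inf\{ k \ge 1, s_k \notin \mathcal{K} \}$, and the expectation is related to the non-homogeneous Markov chain $((z_k, s_k))_{k \ge 0}$ using the step-size sequence $\boldsymbol{\gamma} = (\gamma_k)_{k \ge 0}$.

(A4) The sequences $\boldsymbol{\gamma} = (\gamma_k)_{k \ge 0}$ and $\boldsymbol{\varepsilon} = (\varepsilon_k)_{k \ge 0}$ are non-increasing, positive and satisfy:

\[ \sum_{k=0}^{\infty} \gamma_k = \infty, \qquad \lim_{k \to \infty} \varepsilon_k = 0 \qquad \text{and} \qquad \sum_{k=1}^{\infty} \left\{ \gamma_k^2 + \gamma_k \varepsilon_k^{a} + (\gamma_k \varepsilon_k^{-1})^{p} \right\} < \infty , \]

where $a$ and $p$ are defined in (A3'').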
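As a worked illustration of (A4), one can check which power-law sequences qualify. The power-law form, the exponents $\alpha$ and $\beta$, and the specific values $a = 1$, $p = 2$ used at the end are assumptions chosen for this example; they are not stated on the slides.

```latex
% Assumed form: \gamma_k = (k+1)^{-\alpha}, \varepsilon_k = (k+1)^{-\beta}, with \alpha, \beta \ge 0.
\begin{align*}
\sum_k \gamma_k = \infty &\iff \alpha \le 1, \\
\lim_k \varepsilon_k = 0 &\iff \beta > 0, \\
\sum_k \gamma_k^2 < \infty &\iff 2\alpha > 1, \\
\sum_k \gamma_k \varepsilon_k^{a} < \infty &\iff \alpha + a\beta > 1, \\
\sum_k (\gamma_k \varepsilon_k^{-1})^{p} < \infty &\iff p(\alpha - \beta) > 1 .
\end{align*}
% Hence (A4) holds for these sequences iff
%   1/2 < \alpha \le 1  and  (1 - \alpha)/a < \beta < \alpha - 1/p .
% Example with a = 1, p = 2: \alpha = 1, \beta = 1/4 gives
%   2\alpha = 2 > 1, \quad \alpha + a\beta = 5/4 > 1, \quad p(\alpha - \beta) = 3/2 > 1 .
```

With $a = 1$ and $p = 2$ the admissible interval for $\beta$ is non-empty only when $\alpha > 3/4$, so within this power-law family slowly decreasing step sizes would not satisfy (A4) for those values of $a$ and $p$.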

Page 66: Anisotropic Metropolis Adjusted Langevin Algorithm: convergence and utility in Stochastic EM algorithm

Applying AMALA within SAEM Convergence properties

Condition for AMALA-SAEM to converge

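(M1) The parameter space $\Theta$ is an open subset of $\mathbb{R}^p$. The complete data likelihood function is given by:

\[ f(y, z; \theta) = \exp\left\{ -\psi(\theta) + \langle S(z), \phi(\theta) \rangle \right\} , \]

where $S$ is a Borel function on $\mathbb{R}^l$ taking its values in an open subset $\mathcal{S}$ of $\mathbb{R}^m$. Moreover, the convex hull of $S(\mathbb{R}^l)$ is included in $\mathcal{S}$, and, for all $\theta$ in $\Theta$,

\[ \int \|S(z)\| \, p_\theta(z) \, \mu(dz) < \infty . \]

(M2) The functions $\psi$ and $\phi$ are twice continuously differentiable on $\Theta$.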

Page 67: Anisotropic Metropolis Adjusted Langevin Algorithm: convergence and utility in Stochastic EM algorithm

Applying AMALA within SAEM Convergence properties

Condition for AMALA-SAEM to converge (2)

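(M3) The function $s : \Theta \to \mathcal{S}$ defined as

\[ s(\theta) \triangleq \int S(z) \, p_\theta(z) \, \mu(dz) \]

is continuously differentiable on $\Theta$.

(M4) The function $l : \Theta \to \mathbb{R}$ defined as the observed-data log-likelihood

\[ l(\theta) \triangleq \log g(y; \theta) = \log \int f(y, z; \theta) \, \mu(dz) \]

is continuously differentiable on $\Theta$ and

\[ \partial_\theta \int f(y, z; \theta) \, \mu(dz) = \int \partial_\theta f(y, z; \theta) \, \mu(dz) . \]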
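The interchange of derivative and integral in (M4) is what links the gradient of the observed log-likelihood to the mean statistic $s(\theta)$ of (M3). The short derivation below is a sketch under one assumption that is not spelled out on the slide: $p_\theta(z)$ is taken to be the conditional density of $z$ given $y$, that is $p_\theta(z) = f(y, z; \theta) / g(y; \theta)$. The symbol $J_\phi$ (Jacobian of $\phi$) is notation introduced here.

```latex
\begin{align*}
\partial_\theta l(\theta)
  &= \frac{\partial_\theta \int f(y, z; \theta)\, \mu(dz)}{g(y; \theta)}
   = \int \frac{\partial_\theta f(y, z; \theta)}{f(y, z; \theta)}\,
          \frac{f(y, z; \theta)}{g(y; \theta)}\, \mu(dz) \\
  &= \int \partial_\theta \log f(y, z; \theta)\, p_\theta(z)\, \mu(dz)
  && \text{(Fisher identity)} \\
  &= \int \left( -\nabla \psi(\theta) + J_\phi(\theta)^T S(z) \right) p_\theta(z)\, \mu(dz)
  && \text{(exponential form of (M1))} \\
  &= -\nabla \psi(\theta) + J_\phi(\theta)^T s(\theta) .
\end{align*}
```

Under these assumptions the gradient of $l$ depends on the hidden variable only through $s(\theta)$, which is the quantity the stochastic approximation tracks.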

Page 68: Anisotropic Metropolis Adjusted Langevin Algorithm: convergence and utility in Stochastic EM algorithm

Applying AMALA within SAEM Convergence properties

Condition for AMALA-SAEM to converge (3)

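(M5) There exists a function $\theta : \mathcal{S} \to \Theta$, such that:

\[ \forall s \in \mathcal{S}, \ \forall \theta \in \Theta, \quad L(s; \theta(s)) \ge L(s; \theta) . \]

Moreover, the function $\theta$ is continuously differentiable on $\mathcal{S}$.

(M6) The functions $l : \Theta \to \mathbb{R}$ and $\theta : \mathcal{S} \to \Theta$ are $m$ times differentiable.

(M7)

(i) There exists an $M_0 > 0$ such that

\[ \left\{ s \in \mathcal{S}, \ \partial_s l(\theta(s)) = 0 \right\} \subset \left\{ s \in \mathcal{S}, \ -l(\theta(s)) < M_0 \right\} . \]

(ii) For all $M_1 > M_0$, the set $\overline{\mathrm{Conv}}(S(\mathbb{R}^l)) \cap \{ s \in \mathcal{S}, \ -l(\theta(s)) \le M_1 \}$ is a compact set of $\mathcal{S}$.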
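To make the objects in (M1), (M3) and (M5) concrete, here is a worked toy example. Everything in it is an assumption made for illustration: the model itself is not from the talk, and $L(s; \theta)$ is taken to be $-\psi(\theta) + \langle s, \phi(\theta) \rangle$, which is the usual choice for SAEM in exponential families but is not defined on the slides. The factor that does not depend on $\theta$ is absorbed into the reference measure $\mu$, and $p_\theta$ is read as the conditional density of $z$ given $y$.

```latex
% Assumed toy model: z \sim \mathcal{N}(\theta, 1), \; y \mid z \sim \mathcal{N}(z, 1), one scalar observation y.
\begin{align*}
f(y, z; \theta) &= \exp\{ -\tfrac{1}{2}\theta^2 + z\theta \} \cdot
                   \tfrac{1}{2\pi} \exp\{ -\tfrac{1}{2} z^2 - \tfrac{1}{2}(y - z)^2 \}, \\
S(z) &= z, \qquad \phi(\theta) = \theta, \qquad \psi(\theta) = \tfrac{1}{2}\theta^2, \\
L(s; \theta) &= -\tfrac{1}{2}\theta^2 + s\theta
  \;\Longrightarrow\; \theta(s) = s, \\
s(\theta) &= \mathbb{E}[z \mid y; \theta] = \tfrac{1}{2}(y + \theta)
  \;\Longrightarrow\; \theta(s(\theta)) = \theta \iff \theta = y .
\end{align*}
```

The fixed point $\theta = y$ agrees with the maximum likelihood estimate, since marginally $y \sim \mathcal{N}(\theta, 2)$ in this toy model.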

Page 69: Anisotropic Metropolis Adjusted Langevin Algorithm: convergence and utility in Stochastic EM algorithm

Applying AMALA within SAEM Convergence properties

Condition for AMALA-SAEM to converge (4)

(M8) There exists a polynomial function $P$ of degree 2 such that for all $z \in \mathcal{X}$

\[ \|S(z)\| \le |P(z)| . \]

(B3) For any compact subset $\mathcal{K}$ of $\mathcal{S}$, there exists a polynomial function $Q$ of the hidden variable such that

\[ \sup_{s \in \mathcal{K}} \left| \nabla_z \log p_{\theta(s)}(z) \right| \le |Q(z)| . \]

Page 70: Anisotropic Metropolis Adjusted Langevin Algorithm: convergence and utility in Stochastic EM algorithm

Application on Bayesian Mixed effect template estimation

Outline:

1. AMALA: simulation of random variables in high dimension

Anisotropic MALA description
Convergence property

2. AMALA within stochastic algorithm for parameter estimation

Maximum likelihood estimation for incomplete data setting
AMALA-SAEM
Convergence properties

3. Experiments

BME-Template model: small deformation setting
BME-Template model: LDDMM setting

Page 71: Anisotropic Metropolis Adjusted Langevin Algorithm: convergence and utility in Stochastic EM algorithm

Application on Bayesian Mixed effect template estimation Description of the BME Template model

BME Template model with small deformations

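Deformable template model: ($u$ = voxel, $v_u$ its position)

\[ y(u) = I_0\left( v_u - m(v_u) \right) + \sigma \varepsilon(u) , \]

Parametric template and deformation (superscript $j$ indexes the components of $\alpha$ and $z$):

\[ I_\alpha(v) = (K_p \alpha)(v) = \sum_{j=1}^{k_p} K_p(v, r_{p,j}) \, \alpha^j \quad \text{and} \quad m_z(v) = (K_g z)(v) = \sum_{j=1}^{k_g} K_g(v, r_{g,j}) \, z^j . \]

Generative model:

\[
\begin{cases}
z \sim \otimes_{i=1}^{n} \mathcal{N}_{2k_g}(0, \Gamma_g) \mid \Gamma_g , \\
y \sim \otimes_{i=1}^{n} \mathcal{N}_{|\Lambda|}(m_{z_i} I_\alpha, \sigma^2 \mathrm{Id}) \mid z, \alpha, \sigma^2 ,
\end{cases}
\]

Bayesian framework → MAP estimator (= penalised MLE)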
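The small-deformation model above can be simulated in a few lines. The sketch below is one-dimensional and relies on assumptions that are not in the talk: Gaussian kernels for both $K_p$ and $K_g$, a covariance $\Gamma_g = \tau^2 \mathrm{Id}$, hidden variables of dimension $k_g$ rather than $2k_g$ (the factor 2 corresponds to 2D images), and hypothetical function names and scale parameters.

```python
import math
import random

def gaussian_kernel(v, r, scale):
    """Assumed kernel K(v, r) = exp(-(v - r)^2 / (2 scale^2))."""
    return math.exp(-((v - r) ** 2) / (2.0 * scale ** 2))

def kernel_field(v, points, weights, scale):
    """Evaluate sum_j K(v, r_j) w_j, the form shared by I_alpha and m_z."""
    return sum(w * gaussian_kernel(v, r, scale) for r, w in zip(points, weights))

def sample_deformation(k_g, tau, rng):
    """Draw z ~ N(0, tau^2 Id), a simplifying stand-in for N(0, Gamma_g)."""
    return [rng.gauss(0.0, tau) for _ in range(k_g)]

def synthesize(voxels, r_p, alpha, r_g, z, sigma, rng, scale_p=0.1, scale_g=0.2):
    """y(u) = I_alpha(v_u - m_z(v_u)) + sigma * eps(u), with eps(u) ~ N(0, 1)."""
    image = []
    for v in voxels:
        displaced = v - kernel_field(v, r_g, z, scale_g)      # v_u - m_z(v_u)
        intensity = kernel_field(displaced, r_p, alpha, scale_p)
        image.append(intensity + sigma * rng.gauss(0.0, 1.0))
    return image
```

A typical use would draw `z = sample_deformation(len(r_g), tau, rng)` once per observation and call `synthesize` with the shared template coefficients `alpha`; with `z` set to zero and `sigma = 0` the output is the template itself.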

Page 75: Anisotropic Metropolis Adjusted Langevin Algorithm: convergence and utility in Stochastic EM algorithm

Application on Bayesian Mixed effect template estimation Results on the template estimation

Training sets

Figure: Left: Training set (inverse video). Right: Noisy training set (inverse video).

Page 76: Anisotropic Metropolis Adjusted Langevin Algorithm: convergence and utility in Stochastic EM algorithm

Application on Bayesian Mixed effect template estimation Results on the template estimation

Estimated templates

Figure layout: one column per algorithm (FAM-EM, H.G.-SAEM, AMALA-SAEM) and one row per noise level (no noise, noise of variance 1).

Figure: Estimated templates using different algorithms and two levels of noise. The training set includes 20 images per digit.

Page 77: Anisotropic Metropolis Adjusted Langevin Algorithm: convergence and utility in Stochastic EM algorithm

Application on Bayesian Mixed effect template estimation Results on the covariance matrix estimation

Estimated geometric variability

Figure: Synthetic samples generated with respect to the BME template model. Left: No noise. Right: Noisy data.

Page 78: Anisotropic Metropolis Adjusted Langevin Algorithm: convergence and utility in Stochastic EM algorithm

Application on Bayesian Mixed effect template estimation CLT empirical proof

Empirical proof of the CLT

Figure: Evolution of the estimation of the noise variance along the SAEM iterations. Left: original data. Right: Noisy training set.

Page 79: Anisotropic Metropolis Adjusted Langevin Algorithm: convergence and utility in Stochastic EM algorithm

Application on Bayesian Mixed effect template estimation CLT empirical proof

Figure: Evolution of the estimation of the noise variance along the SAEM iterations. Test of convergence towards the Gaussian distribution of the estimated parameters.

Page 80: Anisotropic Metropolis Adjusted Langevin Algorithm: convergence and utility in Stochastic EM algorithm

Application on Bayesian Mixed effect template estimation Medical image template estimation

Corpus callosum data base

Figure: Medical image template estimation: 10 corpus callosum and splenium training images among the 47 available.

Figure: Grey level mean. FAM-EM estimated template. Hybrid Gibbs-SAEM estimated template. AMALA-SAEM estimation.

Page 81: Anisotropic Metropolis Adjusted Langevin Algorithm: convergence and utility in Stochastic EM algorithm

Application on Bayesian Mixed effect template estimation Results on the template estimation using LDDMM shooting

BME Template model with LDDMM

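Deformable template model: ($u$ = voxel, $v_u$ its position)

\[ y(u) = I_0\left( \phi_{\beta(0)}^{-1}(v_u) \right) + \sigma \varepsilon(u) , \]

Parametric template: $I_\alpha(v) = (K_p \alpha)(v) = \sum_{j=1}^{k_p} K_p(v, r_{p,j}) \, \alpha^j$, and $\phi$ is the LDDMM solution of shooting with initial momentum $\beta(0)$.

Generative model:

\[
\begin{cases}
z \sim \otimes_{i=1}^{n} \mathcal{N}_{2k_g}(0, \Gamma_g) \mid \Gamma_g , \\
y \sim \otimes_{i=1}^{n} \mathcal{N}_{|\Lambda|}(\phi_{\beta(0)} I_\alpha, \sigma^2 \mathrm{Id}) \mid z, \alpha, \sigma^2 ,
\end{cases}
\]

Bayesian framework → MAP estimator (= penalised MLE)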

Page 82: Anisotropic Metropolis Adjusted Langevin Algorithm: convergence and utility in Stochastic EM algorithm

Application on Bayesian Mixed effect template estimation Results on the template estimation using LDDMM shooting

LDDMM: parametric deformation:

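Fix some control points: $c(t) = (c_1(t), \dots, c_{n_g}(t))$

Choose a kernel $K_g$

Start from an initial momentum $\beta(0) = (\beta_1(0), \dots, \beta_{n_g}(0))$

Then, Hamiltonian system → time evolution of both momenta and control points:

\[
\begin{cases}
\dfrac{dc}{dt} = \dfrac{\partial H}{\partial \beta}(c, \beta) = K_g(c(t)) \, \beta(t) \\[2mm]
\dfrac{d\beta}{dt} = -\dfrac{\partial H}{\partial c}(c, \beta) = -\dfrac{1}{2} \nabla_{c(t)} K(\beta(t), \beta(t))
\end{cases}
\tag{9}
\]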
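A numerical sketch of the shooting system (9) may help fix ideas. It is written under assumptions that are not in the talk: one-dimensional control points, a Gaussian kernel, the Hamiltonian $H(c, \beta) = \frac{1}{2} \sum_{i,j} \beta_i K_g(c_i, c_j) \beta_j$, and a classical fourth-order Runge-Kutta integrator on $[0, 1]$. All function names are hypothetical.

```python
import math

def kernel(x, y, scale=1.0):
    """Assumed Gaussian kernel K_g(x, y)."""
    return math.exp(-((x - y) ** 2) / (2.0 * scale ** 2))

def hamiltonian(c, beta, scale=1.0):
    """H(c, beta) = 1/2 * sum_ij beta_i K_g(c_i, c_j) beta_j."""
    return 0.5 * sum(bi * kernel(ci, cj, scale) * bj
                     for ci, bi in zip(c, beta) for cj, bj in zip(c, beta))

def rhs(state, scale=1.0):
    """Right-hand side of (9) for state = c + beta (concatenated lists)."""
    n = len(state) // 2
    c, beta = state[:n], state[n:]
    # dc_i/dt = sum_j K_g(c_i, c_j) beta_j
    dc = [sum(kernel(ci, cj, scale) * bj for cj, bj in zip(c, beta)) for ci in c]
    # dbeta_i/dt = -dH/dc_i, written out for the Gaussian kernel
    dbeta = [sum(bi * bj * (ci - cj) / scale ** 2 * kernel(ci, cj, scale)
                 for cj, bj in zip(c, beta))
             for ci, bi in zip(c, beta)]
    return dc + dbeta

def shift(x, k, a):
    """Return x + a * k componentwise."""
    return [xi + a * ki for xi, ki in zip(x, k)]

def shoot(c0, beta0, n_steps=100, scale=1.0):
    """Integrate (9) from t = 0 to t = 1 with RK4; returns (c(1), beta(1))."""
    h = 1.0 / n_steps
    x = list(c0) + list(beta0)
    for _ in range(n_steps):
        k1 = rhs(x, scale)
        k2 = rhs(shift(x, k1, 0.5 * h), scale)
        k3 = rhs(shift(x, k2, 0.5 * h), scale)
        k4 = rhs(shift(x, k3, h), scale)
        x = [xi + h / 6.0 * (a + 2.0 * b + 2.0 * d + e)
             for xi, a, b, d, e in zip(x, k1, k2, k3, k4)]
    n = len(c0)
    return x[:n], x[n:]
```

Two quick sanity checks for such an integrator are that `hamiltonian` is nearly unchanged between $t = 0$ and $t = 1$, and that the total momentum `sum(beta)` is preserved, which follows from the translation invariance of the kernel.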

Page 83: Anisotropic Metropolis Adjusted Langevin Algorithm: convergence and utility in Stochastic EM algorithm

Application on Bayesian Mixed effect template estimation Results on the template estimation using LDDMM shooting

LDDMM: parametric deformation (2):

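Interpolating on any point of the domain:

\[ v_t(r) = (K_g \beta(t))(r) = \sum_{k=1}^{n_g} K_g(r, c_k(t)) \, \beta_k(t) \quad \forall r \in D \tag{10} \]

Deformation = solution of the flow equation:

\[
\begin{cases}
\dfrac{\partial \phi_{\beta(0)}(t)}{\partial t} = v_t \circ \phi_{\beta(0)}(t) \\[2mm]
\phi_{\beta(0)}(0) = \mathrm{Id} .
\end{cases}
\tag{11}
\]

$\phi_{\beta(0)} = \phi_{\beta(0)}(1)$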
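Equations (10) and (11) can be integrated together with the control-point dynamics. The sketch below makes several assumptions that are not in the talk: a one-dimensional domain, a Gaussian kernel, explicit Euler time stepping, and hypothetical function names. It is meant to illustrate the structure, not to be an accurate integrator.

```python
import math

def kernel(x, y, scale=1.0):
    """Assumed Gaussian kernel K_g(x, y)."""
    return math.exp(-((x - y) ** 2) / (2.0 * scale ** 2))

def velocity(r, c, beta, scale=1.0):
    """v_t(r) = sum_k K_g(r, c_k(t)) beta_k(t), as in equation (10)."""
    return sum(kernel(r, ck, scale) * bk for ck, bk in zip(c, beta))

def flow(points, c0, beta0, n_steps=200, scale=1.0):
    """Euler integration of the flow equation (11) on [0, 1].

    Control points and momenta follow the Hamiltonian system, the given
    points follow the velocity field. Returns (phi(1)(points), c(1)).
    """
    h = 1.0 / n_steps
    c, beta, x = list(c0), list(beta0), list(points)
    for _ in range(n_steps):
        # All derivatives are evaluated at the current time before updating
        dc = [velocity(ci, c, beta, scale) for ci in c]
        dbeta = [sum(bi * bj * (ci - cj) / scale ** 2 * kernel(ci, cj, scale)
                     for cj, bj in zip(c, beta))
                 for ci, bi in zip(c, beta)]
        dx = [velocity(xi, c, beta, scale) for xi in x]
        c = [ci + h * d for ci, d in zip(c, dc)]
        beta = [bi + h * d for bi, d in zip(beta, dbeta)]
        x = [xi + h * d for xi, d in zip(x, dx)]
    return x, c
```

With zero initial momentum the returned map is the identity, and a point that starts on a control point travels with it, since both obey $dr/dt = v_t(r)$.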

Page 88: Anisotropic Metropolis Adjusted Langevin Algorithm: convergence and utility in Stochastic EM algorithm

Application on Bayesian Mixed effect template estimation Results on the template estimation using LDDMM shooting

Gradient computation

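\[
E(\underbrace{c_i, \beta_i}_{S_0}) = \underbrace{\sum_k \left( I_0(\phi_1^{-1}(y_k)) - I(y_k) \right)^2}_{A(y_k(0))} + \underbrace{\sigma^2 \, \mathrm{Reg}(\phi_1)}_{L(S_0) = \beta(0)^t \Gamma_g(q(0), q(0)) \beta(0)}
\]

$S_0 = \{ (c_i, \beta_i) \}_i$

\[ \frac{dS(t)}{dt} = F(S(t)), \quad S(0) = S_0 \]

\[ \frac{dy(t)}{dt} = G(S(t), y(t)), \quad y(1) = y \]

\[ \nabla_{S_0} E = d_{S_0} y(0)^T \nabla_{y(0)} A + \nabla_{S_0} L \]

\[ \nabla_{y_k(0)} A = 2 \left( I_0(y_k(0)) - I(y_k(1)) \right) \nabla_{y_k(0)} I_0 \]

- Momenta decrease image discrepancy

- Control points attracted by image contours

\[ \frac{d\eta(t)}{dt} = \partial_{S(t)} G^T \eta(t), \quad \eta(0) = \nabla_{y(0)} A \]

\[ \frac{d\xi(t)}{dt} = \partial_{y(t)} G^T \eta(t) - dF^T \xi(t), \quad \xi(1) = 0 \]

\[ \nabla_{S_0} E = \xi(0) + \nabla_{S_0} L \]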
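The idea of obtaining a gradient with respect to initial conditions by integrating a linearised system backwards in time can be illustrated on a much simpler problem. The sketch below is not the two-stage $(S, y)$ system of the slide: it assumes a single ODE $dS/dt = F(S)$, a terminal cost $\frac{1}{2} \|S(1) - \text{target}\|^2$, an explicit Euler discretisation, and a pendulum vector field standing in for $F$. All of these, and every function name, are illustrative assumptions.

```python
import math

def F(S):
    """Pendulum vector field, an illustrative stand-in for F."""
    return [S[1], -math.sin(S[0])]

def JT_times(S, xi):
    """Product dF(S)^T xi for the Jacobian [[0, 1], [-cos(S[0]), 0]]."""
    return [-math.cos(S[0]) * xi[1], xi[0]]

def forward(S0, n_steps):
    """Explicit Euler trajectory S_0, ..., S_N on [0, 1]."""
    h = 1.0 / n_steps
    path = [list(S0)]
    for _ in range(n_steps):
        S = path[-1]
        path.append([s + h * f for s, f in zip(S, F(S))])
    return path

def cost(S0, target, n_steps=100):
    """Terminal cost 1/2 * ||S(1) - target||^2."""
    S1 = forward(S0, n_steps)[-1]
    return 0.5 * sum((a - b) ** 2 for a, b in zip(S1, target))

def gradient(S0, target, n_steps=100):
    """Gradient of cost with respect to S0 via the discrete adjoint recursion."""
    h = 1.0 / n_steps
    path = forward(S0, n_steps)
    # Terminal condition: gradient of the cost with respect to S(1)
    xi = [a - b for a, b in zip(path[-1], target)]
    # Backward sweep: xi_n = xi_{n+1} + h * dF(S_n)^T xi_{n+1}
    for S in reversed(path[:-1]):
        xi = [x + h * j for x, j in zip(xi, JT_times(S, xi))]
    return xi
```

Because the backward recursion is the exact transpose of the Euler scheme, the result can be checked against central finite differences of `cost`, which is a convenient test before moving to a coupled system such as the one on the slide.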

Page 96: Anisotropic Metropolis Adjusted Langevin Algorithm: convergence and utility in Stochastic EM algorithm

Application on Bayesian Mixed effect template estimation Results on the template estimation using LDDMM shooting

Using LDDMM deformations via shooting (preliminary results)

Figure panels: AMALA; GH.

Page 98: Anisotropic Metropolis Adjusted Langevin Algorithm: convergence and utility in Stochastic EM algorithm

Conclusion

Conclusion

Good performance (as accurate as other algorithms)

Reduced computational time

Can handle the movement of control points in practice (theory to confirm)

Can handle sparsity of the template ($\simeq$ model selection)

Removing control points? In practice, why not... theory?

Thank you!

