Posted 19-Jan-2016 by frank-french
A tutorial on Markov Chain Monte Carlo
Problem

I = ∫ g(x) π(x) dx        (e.g. Bayesian inference)

If X1, X2, ..., XN form a Markov chain with stationary probability π, then

I ≈ (1/N) Σ_{i=1}^{N} g(Xi)

MCMC is then the problem of designing a Markov chain with a pre-specified stationary distribution so that the integral I can be accurately approximated in a reasonable amount of time.
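As a sketch of the estimator above in Python: the AR(1) chain and g(x) = x² are my illustration, chosen because the chain's stationary law N(0, 1/(1-a²)) is known exactly, so the ergodic average can be checked against the true integral.

```python
import numpy as np

def ergodic_average(a=0.5, n_steps=200_000, burn_in=1_000, seed=0):
    """Estimate I = E_pi[g(X)] by the ergodic average (1/N) sum g(X_i),
    where X_n = a*X_{n-1} + e_n, e_n ~ N(0,1), is a Markov chain whose
    stationary distribution is pi = N(0, 1/(1-a^2))."""
    rng = np.random.default_rng(seed)
    e = rng.standard_normal(n_steps + burn_in)
    x, total = 0.0, 0.0
    for n, en in enumerate(e):
        x = a * x + en                 # one Markov-chain step
        if n >= burn_in:
            total += x * x             # g(x) = x^2
    return total / n_steps

# the exact stationary second moment is 1/(1 - a^2) = 4/3
print(ergodic_average())
```

With a = 0.5 the average settles near 4/3; the serial correlation of the chain only slows convergence, it does not bias the limit.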
Metropolis et al. (circa 1953)

[Figure: from the current state x, propose y and accept it with probability p (stay at x with probability 1-p); then propose y′ and accept with probability p′, and so on.]

Draw the proposal y from G(y|x) and accept it with probability

p = min(1, π(y)/π(x))
Theorem: Metropolis works for any proposal distribution G such that

G(y|x) = G(x|y)

provided the MC is irreducible and aperiodic.

Proof: min(1, π(y)/π(x)) G(y|x) π(x) is symmetric in x and y, so detailed balance holds.
Note: it also works for general G provided we change p a bit (Hastings' trick).
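A minimal sketch of the scheme in Python, assuming a one-dimensional target and a Gaussian random-walk proposal (which is symmetric, so the Hastings correction drops out). The target exp(-x²/2) is my illustration, not from the slides.

```python
import numpy as np

def metropolis(log_pi, x0=0.0, n_steps=100_000, step=1.0, seed=0):
    """Random-walk Metropolis with symmetric proposal G(y|x) = N(y; x, step^2),
    so the acceptance probability reduces to p = min(1, pi(y)/pi(x))."""
    rng = np.random.default_rng(seed)
    x, lx = x0, log_pi(x0)
    samples = np.empty(n_steps)
    for n in range(n_steps):
        y = x + step * rng.standard_normal()   # symmetric proposal
        ly = log_pi(y)
        if np.log(rng.uniform()) < ly - lx:    # accept w.p. min(1, pi(y)/pi(x))
            x, lx = y, ly
        samples[n] = x
    return samples

# unnormalized target: pi(x) proportional to exp(-x^2/2), i.e. standard normal
s = metropolis(lambda x: -0.5 * x * x)
print(s.mean(), (s**2).mean())   # ≈ 0 and ≈ 1
```

Note that only the ratio π(y)/π(x) is ever used, so π need not be normalized.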
[Figure: two small chain diagrams over states 1, 2, 3 (transition probabilities 1, .5, .5). One chain alternates deterministically between groups of states: Period = 2. The other has states that cannot reach each other: Reducible.]
Let f(w,z) be the joint density with conditionals u(w|z), v(z|w) and marginals g(w), h(z):

∫ u(w|z) h(z) dz = g(w)
∫ v(z|w) g(w) dw = h(z)
Gibbs sampler
Take X = (W,Z) a vector. To sample X it is sufficient to sample cyclically from the conditionals (W|z), (Z|w).

Gibbs is in fact a special case of Metropolis: take the proposals to be the exact conditionals; then the acceptance probability is p = 1, i.e. a proposed move is always accepted.
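A sketch of cyclic sampling from exact conditionals, using a bivariate normal (my illustration) where both conditionals are available in closed form:

```python
import numpy as np

def gibbs_bivariate_normal(rho=0.8, n_steps=50_000, seed=0):
    """Gibbs sampler for (W, Z) jointly N(0, [[1, rho], [rho, 1]]):
    cycle through the exact conditionals
    W | Z=z ~ N(rho*z, 1 - rho^2),   Z | W=w ~ N(rho*w, 1 - rho^2)."""
    rng = np.random.default_rng(seed)
    sd = np.sqrt(1 - rho**2)
    w = z = 0.0
    out = np.empty((n_steps, 2))
    for n in range(n_steps):
        w = rho * z + sd * rng.standard_normal()  # draw W | z
        z = rho * w + sd * rng.standard_normal()  # draw Z | w
        out[n] = (w, z)
    return out

xy = gibbs_bivariate_normal()
print(np.corrcoef(xy[:, 0], xy[:, 1])[0, 1])   # ≈ 0.8
```

Every move is accepted, exactly as the Metropolis view of Gibbs predicts.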
T(g,h) = (g,h): the true marginals (g,h) are a fixed point of the Gibbs update!
Example: entropic inference on Gaussians

Likelihood, entropic prior ∝ exp(-α I(θ, θ′)), and entropic posterior, written in terms of the sufficient statistics.

The conditionals:
(μ | v) is Gaussian
(v | μ) is generalized inverse Gaussian
Gibbs + Metropolis

% init posterior log likelihood
LL = ((t1-n2*mu).*mu-t3).*v + (n3-1)*log(v) - a2*((mu-m).^2+1./v);
LL1s(1:Nchains,1) = LL;
for t=1:burnin
  % Gibbs step: sample mu from its (Gaussian) conditional given v
  mu = normrnd((v*t1+a1m)./(n*v+a1), 1./(n*v+a1));
  % Metropolis step: sample v from its (gen. inverse Gaussian) conditional
  v = do_metropolis(v, Nmet, n3, beta, a2);
  LL1s(1:Nchains,t+1) = ...
      ((t1-n2*mu).*mu-t3).*v + (n3-1)*log(v) - a2*((mu-m).^2+1./v);
end

function x = do_metropolis(v, Nmet, n3, t3, a2)
% Metropolis update for v with gamma(n3, t3) proposals, run for
% Nmet steps on each of the Nchains chains; returns the current
% states x (not the last proposals, which may have been rejected).
[Nchains, one] = size(v);
x = v;
accept = 0; reject = 0;
lx = log(x);
lfx = (n3-1)*lx - t3*x - a2./x;        % log target (unnormalized)
for t=1:Nmet
  y = gamrnd(n3, t3, Nchains, 1);      % propose
  ly = log(y);
  lfy = (n3-1)*ly - t3*y - a2./y;
  for c=1:Nchains
    if (lfy(c) > lfx(c)) | (rand(1,1) < exp(lfy(c)-lfx(c)))
      x(c) = y(c); lx(c) = ly(c); lfx(c) = lfy(c);
      accept = accept + 1;
    else
      reject = reject + 1;
    end
  end
end
Convergence: Are we there yet?
Looks OK after the second point.
Mixing is Good Segregation is Bad!
The art of simulation
• Run several chains
• Start at over-dispersed points
• Monitor the log lik.
• Monitor the serial correlations
• Monitor acceptance ratios
• Re-parameterize (to get approx. indep.)
• Re-block (Gibbs)
• Collapse (int. over other pars.)
• Run with troubled pars. fixed at reasonable vals.
• Monitor R-hat
• Monitor mean of score functions
• Monitor coalescence
• Use connections, become EXACT!
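A sketch of the R-hat monitor from the checklist above (the Gelman-Rubin potential scale reduction factor; the toy 4-chain data is my illustration):

```python
import numpy as np

def gelman_rubin(chains):
    """Gelman-Rubin R-hat from an (m, n) array: m chains, n draws each.
    Compares between-chain variance B to within-chain variance W;
    values near 1 suggest the chains have mixed."""
    m, n = chains.shape
    means = chains.mean(axis=1)
    B = n * means.var(ddof=1)               # between-chain variance
    W = chains.var(axis=1, ddof=1).mean()   # within-chain variance
    var_hat = (n - 1) / n * W + B / n       # pooled variance estimate
    return np.sqrt(var_hat / W)

rng = np.random.default_rng(0)
mixed = rng.standard_normal((4, 2000))        # 4 chains on the same target
stuck = mixed + np.arange(4)[:, None] * 3.0   # 4 chains stuck at different modes
print(gelman_rubin(mixed))   # close to 1: good mixing
print(gelman_rubin(stuck))   # far above 1: segregation
```

Mixing is good, segregation is bad: the second set of chains never shares territory, and R-hat flags it immediately.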
Get Connected!
Unnormalized posteriors: q(θ|w), with w = w(x) = vector of suff. stats.

[Figure: a path w(t), t ∈ [0,1], through the space of unnormalized posteriors, connecting q(θ|w(0)) at t = 0 to q(θ|w(1)) at t = 1.]

The tangent along the path is U(θ, t) = Σ_k ẇ_k(t) ∂ log q(θ|w(t)) / ∂w_k, and

log(Z1/Z0) ≈ (1/N) Σ_j U(θ_j, t_j)

where t_j is uniform on [0,1] and θ_j is drawn from π(θ|w(t_j)). U is the average tangent direction along the path. Choice of path is equivalent to choice of prior on [0,1]. The best (min. var.) prior (path) is generalized Jeffreys! Information geodesics are the best paths on the manifold of unnormalized posteriors.
Easy paths:
- geometric
- mixture
- scale
Exact rejection constants are known along the mixture path!
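A sketch of the path-sampling estimate along a geometric path, for a pair of one-dimensional Gaussian-shaped q's (my illustration) chosen so that every intermediate q_t is itself Gaussian and can be sampled exactly; the true answer is log(Z1/Z0) = log 4.

```python
import numpy as np

def path_sampling_logZ(n_draws=200_000, seed=0):
    """Path-sampling estimate of log(Z1/Z0) along the geometric path
    q_t = q0^(1-t) * q1^t between
      q0(x) = exp(-x^2/2)           with Z0 = sqrt(2*pi)
      q1(x) = 2*exp(-(x-1)^2/8)     with Z1 = 2*sqrt(8*pi)
    so the true log(Z1/Z0) = log 4.  Each q_t is Gaussian, so the
    intermediate distributions are sampled exactly; the tangent is
    U = d/dt log q_t = log q1 - log q0."""
    rng = np.random.default_rng(seed)
    t = rng.uniform(size=n_draws)              # t_j uniform on [0, 1]
    prec = (1 - t) + t / 4                     # precision of q_t
    mean = (t / 4) / prec                      # mean of q_t
    x = mean + rng.standard_normal(n_draws) / np.sqrt(prec)
    U = np.log(2) - (x - 1)**2 / 8 + x**2 / 2  # log q1(x) - log q0(x)
    return U.mean()

print(path_sampling_logZ())   # ≈ log 4 ≈ 1.386
```

The uniform prior on t corresponds to the geometric path at constant speed; a better path or prior only reduces the variance, not the mean.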
The Present is trying to be Perfectly Exact
New Exact Math
Most MCs are iterations of random functions:

Let {f_θ : θ ∈ Θ} be a family of functions. Choose n points θ1, ..., θn in Θ independently with some p.m. μ defined on Θ.

Forward iter.:  X0 = x0, X1 = f_θ1(x0), ..., Xn = (f_θn ∘ ... ∘ f_θ1)(x0)
Backward iter.: Y0 = x0, Y1 = f_θ1(x0), ..., Yn = (f_θ1 ∘ f_θ2 ∘ ... ∘ f_θn)(x0)

Xn =d Yn for each fixed n, but as processes {Xn} ≠d {Yn}.
E.g. let a < 1. Take S (the space of states) to be the real line, Θ = {+,-}, μ(+) = μ(-) = 1/2, and f+(x) = a x + 1, f-(x) = a x - 1.

Forward:  Xn = a Xn-1 + en keeps moving all over S.
Backward: Yn = e1 + a e2 + ... + a^{n-1} en + a^n x0, which converges to a single point of S.
(Corresponding frames have the same distribution, but the MOVIES are different.)
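The two iteration orders can be sketched in Python with the maps f±(x) = ax ± 1 from the example, using the same random signs in both orders:

```python
import numpy as np

def forward_backward(a=0.5, n=60, seed=0):
    """Iterate the random maps f_+(x) = a*x + 1, f_-(x) = a*x - 1 with
    the same signs e_1, ..., e_n in both orders:
    forward  X_k = f_{e_k}( ... f_{e_1}(x0))  keeps moving,
    backward Y_k = f_{e_1}( ... f_{e_k}(x0))  settles to a constant."""
    rng = np.random.default_rng(seed)
    e = rng.choice([-1.0, 1.0], size=n)
    x0 = 0.0
    fwd = []
    x = x0
    for s in e:                  # compose each new map on the OUTSIDE
        x = a * x + s
        fwd.append(x)
    bwd = []
    for k in range(1, n + 1):    # compose each new map on the INSIDE
        y = x0
        for s in e[:k][::-1]:    # apply f_{e_k} first, f_{e_1} last
            y = a * y + s
        bwd.append(y)
    return np.array(fwd), np.array(bwd)

fwd, bwd = forward_backward()
print(np.ptp(fwd[-20:]))   # forward keeps fluctuating
print(np.ptp(bwd[-20:]))   # backward has frozen
```

Frame by frame the two sequences have the same distribution, but only the backward movie converges: fresh randomness enters ever deeper inside the composition, damped by a^k.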
Dead Leaves Simulation

[Figure: forward simulation = looking down at the pile; backward simulation = looking up from below.]

http://www.warwick.ac.uk/statsdept/Staff/WSK/
Convergence of backward iterations: Yn = (f_θ1 ∘ f_θ2 ∘ ... ∘ f_θn)(x0) converges when the functions are contracting on average.
Propp & Wilson (www.dbwilson.com)

[Figure: a perfectly equilibrated 2D Ising state at critical T = 529K.]

Coupling from the past: start the chains at t = -M and run Gibbs forward to t = 0, reusing the same random numbers and increasing M until all starting states coalesce.
[Figure: a two-state chain on {0,1}: from 1 move to 0 with probability 1; from 0 move to 0 or 1 with probability .5 each.]

Need backward iterations. The first time to coalescence is not distributed as π: here the chains always coalesce at 0 first, BUT π(0) = 2/3, π(1) = 1/3.
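A sketch of Propp-Wilson coupling from the past for this two-state chain (the doubling schedule and the reuse of random numbers are the standard construction), checking that the output frequency of state 0 is π(0) = 2/3 rather than the "always coalesce at 0" law of the forward coupling:

```python
import numpy as np

def cftp_two_state(rng):
    """Coupling from the past for the chain
    P(1->0) = 1, P(0->0) = P(0->1) = 1/2, with pi = (2/3, 1/3).
    Double M until every starting state coalesces by time 0,
    REUSING the same random numbers u_{-1}, u_{-2}, ... each pass."""
    step = lambda s, u: 0 if s == 1 else (0 if u < 0.5 else 1)
    us = []                                  # u_{-1}, u_{-2}, ... going back
    M = 1
    while True:
        while len(us) < M:
            us.append(rng.uniform())
        vals = {0, 1}
        for k in range(M - 1, -1, -1):       # run from time -M up to time 0
            vals = {step(s, us[k]) for s in vals}
        if len(vals) == 1:                   # all starts agree at time 0
            return vals.pop()
        M *= 2

rng = np.random.default_rng(0)
draws = [cftp_two_state(rng) for _ in range(20_000)]
print(np.mean(np.array(draws) == 0))   # ≈ 2/3, the stationary pi(0)
```

The value reported is the state at time 0, not the state at the moment of coalescence, and that is exactly why the output is π-distributed.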
Not Exactly! Yet.