Introduction to MAP Inference in Discrete Models
Andrew Blake, Microsoft Research Cambridge
Modern probabilistic modelling has revolutionized the design and
implementation of machine vision systems. There are now numerous instances
of systems that can see stereoscopically in depth, or separate foreground from
background, or accurately pinpoint objects of a particular class, all in real time.
The underlying advances owe a lot to probabilistic frameworks for inference in
images. In particular, the Markov Random Field (MRF), borrowed originally
from statistical physics, first appeared in image processing in the 70s. It has
staged a resounding comeback in the last decade, for very interesting reasons.
Lecturers:
Pushmeet Kohli, Microsoft Research Cambridge
M. Pawan Kumar, Stanford University
Carsten Rother, Microsoft Research Cambridge
Course programme
9.30-10.00 Introduction (Andrew Blake)
10.00-11.00 Discrete Models in Computer Vision (Carsten Rother)
15min Coffee break
11.15-12.30 Message Passing: DP, TRW, LP relaxation (Pawan Kumar)
12.30-13.00 Quadratic pseudo-boolean optimization (Pushmeet Kohli)
1 hour Lunch break
14:00-15.00 Transformation and move-making methods (Pushmeet Kohli)
15:00-15.30 Speed and Efficiency (Pushmeet Kohli)
15min Coffee break
15:45-16.15 Comparison of Methods (Carsten Rother)
16:30-17.30 Recent Advances: Dual-decomposition, higher-order, etc.
(Carsten Rother + Pawan Kumar)
All material will be available online (after the conference): http://research.microsoft.com/en-us/um/cambridge/projects/tutorial/
An account of how vision might work
Having the ability to test hypotheses
Dealing with the ambiguity of the visual world
Having the ability to “fuse” information
Having the ability to learn
Reasoning with probabilities
Bottom-Up Segmentation (Berkeley)
Better defined problem: Foreground Segmentation
Separability of colour palettes
[Figure: foreground and background colour distributions plotted on red and green axes]
Pixelwise independent decisions?
Maximum likelihood estimate: at each pixel, compare the likelihoods log P(z_i | x_i = 0) and log P(z_i | x_i = 1), and pick the label with the higher one.
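The per-pixel decision rule above can be sketched in a few lines; the function name and the 2×2 toy log-likelihoods are illustrative, not from the lecture.

```python
import numpy as np

def pixelwise_ml_segmentation(log_lik_bg, log_lik_fg):
    """Independent per-pixel maximum-likelihood labelling.

    log_lik_bg[i, j] = log P(z_ij | x_ij = 0)  (background)
    log_lik_fg[i, j] = log P(z_ij | x_ij = 1)  (foreground)
    Returns a binary label map: 1 where foreground is more likely.
    """
    return (log_lik_fg > log_lik_bg).astype(np.uint8)

# Toy example: a 2x2 image of log-likelihoods (values made up).
bg = np.array([[-1.0, -3.0], [-2.0, -0.5]])
fg = np.array([[-2.0, -1.0], [-1.0, -4.0]])
labels = pixelwise_ml_segmentation(bg, fg)
# → [[0, 1], [1, 0]]: each pixel decided in isolation, no spatial coherence
```

Because every pixel is decided independently, camouflage and intertwined colour distributions produce the speckled segmentations shown next, which is exactly what motivates the spatial priors that follow.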
... Segmentation in camouflage
Need for spatial priors
Data in colour space: mixture-of-Gaussians models
-- foreground/background distributions intertwined
Connecting up pixels
How can you express coherence in a way that is practical?
Markov models
1st order Markov chain
Textbook example: Mon → Tues → Wed → Thur → ?
"As I've commented before, really relating to
someone involves standing next to impossible."
"Oh, sorry. Nevermind. I am afraid of it becoming
another island in a nice suit."
Markov models
2nd order Markov chain
Predictive text
Dasher
(Ward, Blackwell & Mackay 2000)
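A second-order Markov chain over words, of the kind behind predictive text, can be sketched in a few lines (a minimal illustration; the toy corpus and function names are my own):

```python
import random
from collections import defaultdict

def build_2nd_order_model(words):
    """Collect observed continuations of each word pair, i.e. an
    empirical model of P(w_t | w_{t-2}, w_{t-1})."""
    model = defaultdict(list)
    for a, b, c in zip(words, words[1:], words[2:]):
        model[(a, b)].append(c)
    return model

def sample(model, seed_pair, length, rng):
    """Generate text by repeatedly sampling the next word
    conditioned on the previous two."""
    out = list(seed_pair)
    for _ in range(length):
        choices = model.get((out[-2], out[-1]))
        if not choices:
            break
        out.append(rng.choice(choices))
    return " ".join(out)

corpus = "the cat sat on the mat and the cat ran".split()
model = build_2nd_order_model(corpus)
print(sample(model, ("the", "cat"), 5, random.Random(0)))
```

Sampling such a model produces locally plausible but globally incoherent text, much like the quoted examples on the previous slide.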
1st order with stochastic observations – Hidden Markov Model
HMMs – ubiquitous in speech recognition (Rabiner 89; Jelinek 98); HTK (Young, Woodland et al. 97)
tractable – Dynamic Programming etc.
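The dynamic-programming inference that makes HMMs tractable is the Viterbi algorithm; a minimal log-space sketch (the toy transition and observation values are illustrative):

```python
import numpy as np

def viterbi(log_init, log_trans, log_obs):
    """MAP state sequence for an HMM by dynamic programming.

    log_init[s]     : log P(x_0 = s)
    log_trans[s, t] : log P(x_k = t | x_{k-1} = s)
    log_obs[k, s]   : log P(z_k | x_k = s)
    """
    T, S = log_obs.shape
    score = log_init + log_obs[0]
    back = np.zeros((T, S), dtype=int)
    for k in range(1, T):
        cand = score[:, None] + log_trans      # cand[s, t]: best-so-far ending s, then s→t
        back[k] = cand.argmax(axis=0)
        score = cand.max(axis=0) + log_obs[k]
    path = [int(score.argmax())]               # backtrack the best path
    for k in range(T - 1, 0, -1):
        path.append(int(back[k][path[-1]]))
    return path[::-1]

# Two sticky states; observations favour state 0, then state 1 twice.
path = viterbi(np.log([0.5, 0.5]),
               np.log([[0.8, 0.2], [0.2, 0.8]]),
               np.log([[0.9, 0.1], [0.1, 0.9], [0.1, 0.9]]))
# → [0, 1, 1]
```

The cost is O(T·S²), linear in sequence length, which is why chains (and trees) are tractable while 2D grids are not.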
Factorial Hidden Markov Model (Ghahramani and Jordan 1995)
Temporal HMMs in vision
Probabilistic Graphical Models (Pearl 1988)
• Probabilistic graphical models
• Inference by message passing
Generalisation (Lauritzen and Spiegelhalter, 1988)
Factor Graphs (Kschischang, Frey, Loeliger, 2001)
-- see also (Bishop, 2006)
2D Markov model?
Tree of connected pixels (Veksler 2005)
Markov Random Field (MRF) – 1st order (Geman & Geman 84; Besag 1974, 1986)
Independence: P(x_i | x_j, j ≠ i) = P(x_i | x_{N_i}), where N_i = neighbours of i
2D MRF – 1st order example: Ising Model
Binary variables: x_i ∈ {0, 1}
Joint probability distribution: P(x) = (1/Z) exp( K Σ_(i,j) [x_i = x_j] ), where the sum runs over neighbouring pairs (i, j) and K sets the strength of the coherence
K = 0.4
2D MRF simulation (Swendsen-Wang MCMC)
Ising Model
K=0.5
K=0.55
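The lecture's simulations use Swendsen-Wang MCMC; as a simpler stand-in that targets the same Ising distribution, a single-site Gibbs sampler can be sketched (it mixes far more slowly near the critical K, which is precisely why Swendsen-Wang is preferred):

```python
import numpy as np

def gibbs_ising(shape, K, sweeps, rng):
    """Single-site Gibbs sampler for the Ising prior
    P(x) ∝ exp(K · #{neighbouring pairs with equal labels})
    over binary labels on a 4-connected grid."""
    H, W = shape
    x = rng.integers(0, 2, size=shape)
    for _ in range(sweeps):
        for i in range(H):
            for j in range(W):
                n1 = n0 = 0                    # count neighbour labels
                for di, dj in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                    a, b = i + di, j + dj
                    if 0 <= a < H and 0 <= b < W:
                        if x[a, b] == 1:
                            n1 += 1
                        else:
                            n0 += 1
                # Conditional P(x_ij = 1 | neighbours) from the Ising energies
                p1 = 1.0 / (1.0 + np.exp(-K * (n1 - n0)))
                x[i, j] = 1 if rng.random() < p1 else 0
    return x

sample = gibbs_ising((32, 32), K=0.5, sweeps=50, rng=np.random.default_rng(0))
```

Running it at K = 0.4, 0.5, 0.55 reproduces the qualitative effect on the slides: larger K yields larger coherent patches.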
2D Hidden MRF
prior P(x), observation likelihood P(z | x)
Inference – MAP: x* = argmax_x P(x | z) ∝ P(z | x) P(x)
Simple segmentation --- Ising prior
MRF – expressed as additive energy terms
P(x | z) = (1/Z) exp( −E(x, z) ), where the “energy” E(x, z) = Σ_i U(x_i, z_i) + Σ_(i,j) V(x_i, x_j)
V: the (−ve) log-prior; U: the (−ve) log-likelihood of the colour observations
MAP estimation is then energy/cost minimization
?? How to compute the MAP estimate, i.e. x* = argmin_x E(x, z)
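Before asking how to minimize it, the energy itself is easy to write down; a minimal sketch for the binary Ising-MRF case (the array layout is my own assumption):

```python
import numpy as np

def ising_energy(x, unary, K):
    """E(x, z) = Σ_i U(x_i, z_i) + K · Σ_(i,j) [x_i ≠ x_j]
    over a 4-connected grid.

    x     : (H, W) binary label map
    unary : (H, W, 2) data costs, unary[i, j, v] = U(x_ij = v, z_ij),
            i.e. the negative log-likelihood of the colour observation
    K     : strength of the Ising smoothness prior
    """
    H, W = x.shape
    # Data term: pick each pixel's cost for its assigned label.
    data = unary[np.arange(H)[:, None], np.arange(W)[None, :], x].sum()
    # Smoothness term: count disagreeing vertical and horizontal neighbours.
    smooth = K * (np.count_nonzero(x[1:, :] != x[:-1, :])
                  + np.count_nonzero(x[:, 1:] != x[:, :-1]))
    return data + smooth

x = np.array([[0, 1], [0, 0]])
e = ising_energy(x, np.zeros((2, 2, 2)), K=1.0)
# → 2.0: two disagreeing neighbour pairs, no data cost
```

MAP inference is then the search for the label map minimizing this function, which the remainder of the course is about.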
Segmentation artefacts --- Ising prior
?? How to overcome artefacts
(Boykov and Kolmogorov ICCV 2003)
Boykov-Jolly contrast-sensitive segmentation
• Conditional Random Field -- CRF (Lafferty et al. 2001; Kumar and Hebert 2003)
where now the log-“prior” V(x, z) is data-dependent: the pairwise coherence terms are weakened where neighbouring pixels differ strongly in colour
(Boykov and Jolly 2001; Rother et al. 2004; Li, Shum et al. 2004 )
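A common concrete form of the contrast-sensitive pairwise strength, in the Boykov-Jolly spirit, can be sketched as follows; setting beta from the mean squared colour difference is one frequent convention, and the function name is illustrative:

```python
import numpy as np

def contrast_weights(z, gamma):
    """Pairwise strengths w_ij = gamma · exp(−beta · ||z_i − z_j||²)
    for vertically and horizontally adjacent pixels; beta is set from
    the mean squared colour difference (exact constants vary by paper).

    z: (H, W, 3) colour image. Returns (w_vert, w_horz).
    """
    dv = ((z[1:, :] - z[:-1, :]) ** 2).sum(axis=-1)   # vertical neighbour diffs
    dh = ((z[:, 1:] - z[:, :-1]) ** 2).sum(axis=-1)   # horizontal neighbour diffs
    mean_sq = np.concatenate([dv.ravel(), dh.ravel()]).mean()
    beta = 1.0 / (2.0 * max(mean_sq, 1e-8))           # guard against flat images
    return gamma * np.exp(-beta * dv), gamma * np.exp(-beta * dh)

wv, wh = contrast_weights(np.ones((3, 3, 3)), gamma=2.0)
# uniform image: no contrast anywhere, so every weight equals gamma
```

Where the image has a strong edge the weight drops towards zero, so the smoothness prior stops penalising a segmentation boundary there; this is what removes the rounded-corner artefacts of the plain Ising prior.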
Approximate variational extremum [Mumford and Shah 1985, 1989]
MAP estimation for Markov Random Fields
– Energy Minimization
Iterated Conditional Modes [Besag 1986]
Simulated annealing [Metropolis, Rosenbluth, Rosenbluth, Teller and Teller, 1953]
Graduated nonconvexity [Blake and Zisserman 1987]
Graph cut [Greig, Porteous and Seheult, 1989]
Gibbs sampling [Geman and Geman 1984]
Generally NP-hard, so approximate:
Loopy Belief Propagation [Freeman and Pasztor, 1999]
“Modern” graph cut [Boykov, Veksler and Zabih, 2001]
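The simplest optimizer in the list above, Iterated Conditional Modes, fits in a few lines (a minimal illustration for the binary Ising-MRF energy, not the lecture's implementation):

```python
import numpy as np

def icm(unary, K, sweeps=10):
    """Iterated Conditional Modes (Besag 1986): greedily set each pixel
    to the label of lowest local energy, holding its neighbours fixed.
    Converges to a local minimum of
    E(x) = Σ_i unary[i, j, x_ij] + K · Σ_(i,j) [x_i ≠ x_j]."""
    H, W, _ = unary.shape
    x = unary.argmin(axis=-1)          # initialise from the data term alone
    for _ in range(sweeps):
        changed = False
        for i in range(H):
            for j in range(W):
                best, best_e = x[i, j], None
                for v in (0, 1):
                    e = unary[i, j, v]
                    for di, dj in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                        a, b = i + di, j + dj
                        if 0 <= a < H and 0 <= b < W and x[a, b] != v:
                            e += K
                    if best_e is None or e < best_e:
                        best, best_e = v, e
                if best != x[i, j]:
                    x[i, j] = best
                    changed = True
        if not changed:                 # a full sweep with no flips: local minimum
            break
    return x

unary = np.zeros((3, 3, 2))
unary[:, :, 0] = 10.0          # every pixel strongly prefers label 1 ...
unary[1, 1] = (0.0, 1.0)       # ... except the centre, which weakly prefers 0
labels = icm(unary, K=1.0)     # smoothness flips the lone dissenting centre pixel
```

Each local flip never increases the energy, so ICM converges, but only to a local minimum; the graph-cut and message-passing methods covered later trade this simplicity for much stronger guarantees.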
Marginals
Inference – MAP: x* = argmax_x P(x | z)
whole distribution?: P(x | z)
pixelwise marginals?: P(x_i | z)
MRFs – a generic framework for low-level vision
Medical Imaging
Image Restoration
Stereo Vision
Recognition
Analysis of Structure
Medical Imaging
Medical image segmentation: GeoS (Criminisi et al. 08)
Image restoration
Stereo matching – solved problem – variational algorithms: Dynamic programming (Ohta & Kanade 85; Belhumeur and Mumford 92)
Graph cut (Roy and Cox 98; Boykov et al. 00)
Live Stereo Segmentation
Background substitution
Kolmogorov, Criminisi, Blake, Cross and Rother (CVPR 2005, PAMI 2006)
Recognising and segmenting objects (Winn and Jojic, 2005)
Unwrap mosaic (Rav-Acha, Kohli, Rother, Fitzgibbon 2008)
Forthcoming book!
“Advances in Markov Random Fields for Computer Vision”
MIT Press, summer 2010
Topics of this course and much, much more
Contributors: usual suspects – lecturers on this course + Boykov,
Kolmogorov, Weiss, Freeman, ....
one for the office and one for home
www.research.microsoft.com/vision/MRFbook