Markov Random Fields for Computer Vision (Part 1)

Machine Learning Summer School (MLSS 2011)

Stephen Gould
[email protected]
Australian National University

13–17 June, 2011
Pixel Labeling

Label every pixel in an image with a class label from some pre-defined set, i.e., $y_p \in \mathcal{L}$.

- Interactive figure-ground segmentation (Boykov and Jolly, 2001; Boykov and Funka-Lea, 2006)
- Surface context (Hoiem et al., 2005)
- Semantic labeling (He et al., 2004; Shotton et al., 2006; Gould et al., 2009)
- Stereo matching (Scharstein and Szeliski, 2002)
- Image denoising (Felzenszwalb and Huttenlocher, 2004; Szeliski et al., 2008)
Digital Photo Montage

(Agarwala et al., 2004)

[demonstration]
Tutorial Overview

Part 1. Pairwise conditional Markov random fields for the pixel labeling problem (45 minutes)
Part 2. Pseudo-boolean functions and graph-cuts (1 hour)
Part 3. Higher-order terms and inference as integer programming (30 minutes)

Please ask lots of questions!
Probability Review

Bayes rule:
$$\underbrace{P(y \mid x)}_{\text{posterior}} = \frac{\overbrace{P(x \mid y)}^{\text{likelihood}} \; \overbrace{P(y)}^{\text{prior}}}{P(x)}$$

Maximum a posteriori (MAP) inference: $y^\star = \operatorname{argmax}_{y} P(y \mid x)$.

Conditional independence: random variables $y$ and $x$ are conditionally independent given $z$ if $P(y, x \mid z) = P(y \mid z)\, P(x \mid z)$.
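As a quick worked example (the numbers are illustrative, not from the slides): suppose a pixel is foreground with prior $P(\text{fg}) = 0.3$, and an observed feature $x$ has likelihoods $P(x \mid \text{fg}) = 0.8$ and $P(x \mid \text{bg}) = 0.2$. Then
$$P(\text{fg} \mid x) = \frac{0.8 \times 0.3}{0.8 \times 0.3 + 0.2 \times 0.7} = \frac{0.24}{0.38} \approx 0.63,$$
so MAP inference labels the pixel foreground even though the prior favours background.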
Graphical Models

We can exploit conditional independence assumptions to represent probability distributions in a way that is both compact and efficient for inference.

This tutorial is all about one particular representation, called a Markov Random Field (MRF), and the associated inference algorithms that are used in computer vision.

[Four-node graph over A, B, C, D with edges A-B, B-D, D-C, C-A.]

The graph encodes the conditional independence $a \perp d \mid b, c$, and the joint distribution factorizes as $\frac{1}{Z}\, \psi(a, b)\, \psi(b, d)\, \psi(d, c)\, \psi(c, a)$.
Graphical Models

For the four-node example,
$$P(a, b, c, d) = \frac{1}{Z}\, \psi(a, b)\, \psi(b, d)\, \psi(d, c)\, \psi(c, a) = \frac{1}{Z} \exp \{ -\phi(a, b) - \phi(b, d) - \phi(d, c) - \phi(c, a) \}$$
where $\phi = -\log \psi$.
Energy Functions

Let $x$ be some observations (i.e., features from the image) and let $y = (y_1, \ldots, y_n)$ be a vector of random variables. Then we can write the conditional probability of $y$ given $x$ as
$$P(y \mid x) = \frac{1}{Z(x)} \exp \{ -E(y; x) \}$$
where $Z(x) = \sum_{y \in \mathcal{L}^n} \exp \{ -E(y; x) \}$ is called the partition function.

The energy function $E(y; x)$ usually has some structured form:
$$E(y; x) = \sum_c \psi_c(y_c; x)$$
where the $\psi_c(y_c; x)$ are clique potentials defined over subsets of random variables $y_c \subseteq y$.
Conditional Markov Random Fields

$$E(y; x) = \sum_c \psi_c(y_c; x) = \underbrace{\sum_{i \in V} \psi^U_i(y_i; x)}_{\text{unary}} + \underbrace{\sum_{ij \in E} \psi^P_{ij}(y_i, y_j; x)}_{\text{pairwise}} + \underbrace{\sum_{c \in C} \psi^H_c(y_c; x)}_{\text{higher-order}}$$

[Grid-structured model: observed pixels $x_1, \ldots, x_9$, one label variable $y_i$ per pixel, with each $y_i$ connected to its observation $x_i$ and to its neighbouring labels.]
Pixel Neighbourhoods

[Two 3×3 label grids: in the 4-connected neighbourhood $\mathcal{N}_4$ each $y_i$ is linked to its horizontal and vertical neighbours; in the 8-connected neighbourhood $\mathcal{N}_8$ the diagonal neighbours are linked as well.]
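To make the two neighbourhoods concrete, here is a small sketch (Python, not from the slides) that enumerates the pairwise edges of an image grid; each edge will later carry one pairwise potential.

```python
# Illustrative sketch: enumerate the pairwise edges of an H x W pixel
# grid under the N4 and N8 neighbourhoods.
def grid_edges(H, W, connectivity=4):
    """Return the list of neighbouring pixel pairs ((r, c), (r', c'))."""
    offsets = [(0, 1), (1, 0)]                # right, down  -> N4
    if connectivity == 8:
        offsets += [(1, 1), (1, -1)]          # diagonals    -> N8
    edges = []
    for r in range(H):
        for c in range(W):
            for dr, dc in offsets:
                rr, cc = r + dr, c + dc
                if 0 <= rr < H and 0 <= cc < W:
                    edges.append(((r, c), (rr, cc)))
    return edges

# A 3x3 grid has 12 N4 edges and 20 N8 edges.
assert len(grid_edges(3, 3, 4)) == 12
assert len(grid_edges(3, 3, 8)) == 20
```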
Binary MRF Example

Consider the following energy function for two binary random variables, $y_1$ and $y_2$:
$$E(y_1, y_2) = \psi_1(y_1) + \psi_2(y_2) + \psi_{12}(y_1, y_2)$$
with potentials

  ψ1:  y1 = 0: 5,  y1 = 1: 2
  ψ2:  y2 = 0: 1,  y2 = 1: 3
  ψ12: (0,0): 0,  (0,1): 3,  (1,0): 4,  (1,1): 0

Writing $\bar{y}_i = 1 - y_i$, the energy can be expressed algebraically as
$$E(y_1, y_2) = 5\bar{y}_1 + 2y_1 + \bar{y}_2 + 3y_2 + 3\bar{y}_1 y_2 + 4 y_1 \bar{y}_2.$$

Graphical model: two nodes $y_1$ and $y_2$ joined by a single edge.

Probability table, with $P(y) = \frac{1}{Z} \exp\{-E(y)\}$:

  y1  y2 |  E  |   P
   0   0 |  6  | 0.244
   0   1 | 11  | 0.002
   1   0 |  7  | 0.090
   1   1 |  5  | 0.664
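The table can be verified by brute-force enumeration; a minimal sketch with the potentials hard-coded from the example above:

```python
import itertools, math

psi1  = {0: 5, 1: 2}                                   # unary on y1
psi2  = {0: 1, 1: 3}                                   # unary on y2
psi12 = {(0, 0): 0, (0, 1): 3, (1, 0): 4, (1, 1): 0}   # pairwise

def E(y1, y2):
    return psi1[y1] + psi2[y2] + psi12[(y1, y2)]

# P(y) = exp(-E(y)) / Z, with Z summed over all 2^2 labelings.
Z = sum(math.exp(-E(y1, y2)) for y1, y2 in itertools.product((0, 1), repeat=2))
for y1, y2 in itertools.product((0, 1), repeat=2):
    print(y1, y2, E(y1, y2), round(math.exp(-E(y1, y2)) / Z, 3))
# Reproduces the table above: (1,1) is the MAP labeling with P = 0.664.
```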
Compactness of Representation

Consider a 1 mega-pixel image, e.g., $1000 \times 1000$ pixels. We want to annotate each pixel with a label from $\mathcal{L}$. Let $L = |\mathcal{L}|$.

There are $L^{10^6}$ possible ways to label such an image.

A naive encoding, i.e., one big table, would require $L^{10^6} - 1$ parameters.

A pairwise MRF over $\mathcal{N}_4$ requires $10^6 L$ parameters for the unary terms and $2 \times 1000 \times (1000 - 1) L^2$ parameters for the pairwise terms, i.e., $O(10^6 L^2)$. Even fewer are required if we share parameters.
Inference and Energy Minimization

We are usually interested in finding the most probable labeling,
$$y^\star = \operatorname{argmax}_{y} P(y \mid x) = \operatorname{argmin}_{y} E(y; x).$$
This is known as maximum a posteriori (MAP) inference or energy minimization.

A number of techniques can be used to find $y^\star$, including:

- message-passing (dynamic programming), as sketched below
- integer programming (part 3)
- graph-cuts (part 2)

However, in general, inference is NP-hard.
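As an illustration of the message-passing case, here is a minimal min-sum dynamic program that computes the exact MAP labeling of a chain-structured MRF (a sketch, not from the slides; function and variable names are ours). Chains and trees are exactly solvable; it is loopy graphs such as grids that make inference NP-hard in general.

```python
import numpy as np

# unary:    (n, L) array, unary[i, l]    = psi_i(l)
# pairwise: (L, L) array, pairwise[l, m] = psi(l, m), shared by all edges
def chain_map(unary, pairwise):
    n, L = unary.shape
    msg  = np.zeros((n, L))   # msg[i, l]: best cost of y_1..y_i with y_i = l
    back = np.zeros((n, L), dtype=int)
    msg[0] = unary[0]
    for i in range(1, n):
        # cost[l_prev, l_cur]: best cost so far + transition penalty.
        cost = msg[i - 1][:, None] + pairwise
        back[i] = cost.argmin(axis=0)
        msg[i] = cost.min(axis=0) + unary[i]
    # Backtrack from the best final label.
    y = [int(msg[-1].argmin())]
    for i in range(n - 1, 0, -1):
        y.append(int(back[i][y[-1]]))
    return y[::-1], float(msg[-1].min())
```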
Characterizing Markov Random Fields

Markov random fields can be characterized along a number of different dimensions:

- Label space: binary vs. multi-label; homogeneous vs. heterogeneous.
- Order: unary vs. pairwise vs. higher-order.
- Structure: chain vs. tree vs. grid vs. general graph; neighbourhood size.
- Potentials: submodular, convex, compressible.

These all affect the tractability of inference.
Markov Random Fields for Pixel Labeling

$$P(y \mid x) \propto P(x \mid y)\, P(y) = \exp \{ -\underbrace{E(y; x)}_{\text{energy}} \}$$

$$E(y; x) = \underbrace{\sum_{i \in V} \psi^U_i(y_i; x)}_{\text{unary}} + \underbrace{\sum_{ij \in \mathcal{N}_8} \psi^P_{ij}(y_i, y_j; x)}_{\text{pairwise}}$$

$$\psi^U_i(y_i; x) = -\sum_{\ell \in \mathcal{L}} [\![y_i = \ell]\!] \log \underbrace{P(x_i \mid \ell)}_{\text{likelihood}}$$

$$\psi^P_{ij}(y_i, y_j; x) = \lambda\, [\![y_i \neq y_j]\!] \quad \text{(Potts prior)}$$

Here the prior acts to smooth predictions (independent of $x$).
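A sketch of evaluating this energy for a candidate labeling (our own illustration; for brevity it uses the $\mathcal{N}_4$ neighbourhood rather than the slide's $\mathcal{N}_8$):

```python
import numpy as np

# neg_log_lik: (H, W, L) array of -log P(x_i | label); y: (H, W) int labels.
def potts_energy(neg_log_lik, y, lam):
    H, W, L = neg_log_lik.shape
    # Unary term: pick out -log P(x_i | y_i) at every pixel and sum.
    unary = neg_log_lik[np.arange(H)[:, None], np.arange(W)[None, :], y].sum()
    # Potts pairwise term: lambda per disagreeing N4 neighbour pair
    # (horizontal + vertical edges).
    pairwise = lam * ((y[:, 1:] != y[:, :-1]).sum() +
                      (y[1:, :] != y[:-1, :]).sum())
    return unary + pairwise
```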
Prior Strength

[Segmentation results for increasing smoothness weight $\lambda$ = 1, 4, 16, 128, 1024; larger $\lambda$ produces smoother labelings.]
Interactive Segmentation Model

Label space: foreground or background,
$$\mathcal{L} = \{0, 1\}$$

Unary term: Gaussian mixture models for foreground and background,
$$\psi^U_i(y_i; x) = \min_k \left\{ \tfrac{1}{2} \log |\Sigma_k| + \tfrac{1}{2} (x_i - \mu_k)^T \Sigma_k^{-1} (x_i - \mu_k) - \log \pi_k \right\}$$
where $(\pi_k, \mu_k, \Sigma_k)$ are the mixture parameters for the class $y_i$.

Pairwise term: contrast-dependent smoothness prior,
$$\psi^P_{ij}(y_i, y_j; x) = \begin{cases} \lambda_0 + \lambda_1 \exp\left( -\frac{\|x_i - x_j\|^2}{2\sigma^2} \right) & \text{if } y_i \neq y_j \\ 0 & \text{otherwise} \end{cases}$$
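A sketch of the two terms using scikit-learn's GaussianMixture (an assumption of ours, not part of the tutorial; note that score_samples scores the full mixture density rather than taking the minimum over components, a closely related choice, and lam0, lam1, sigma2 are illustrative constants):

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def gmm_unaries(img, fg_colours, bg_colours, K=5):
    """-log likelihood of every pixel under fg/bg colour mixtures.

    img: (H, W, 3) float image; fg_colours, bg_colours: (N, 3) arrays of
    colours sampled from the user's foreground/background scribbles.
    """
    X = img.reshape(-1, 3)
    fg = GaussianMixture(K).fit(fg_colours)
    bg = GaussianMixture(K).fit(bg_colours)
    # Column l holds the unary cost of assigning label l (0 = bg, 1 = fg).
    return np.stack([-bg.score_samples(X), -fg.score_samples(X)], axis=1)

def contrast_weight(xi, xj, lam0=1.0, lam1=9.0, sigma2=0.01):
    """Penalty paid when neighbours i and j take *different* labels."""
    return lam0 + lam1 * np.exp(-np.sum((xi - xj) ** 2) / (2 * sigma2))
```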
Stereo Matching Model

Label space: pixel disparity,
$$\mathcal{L} = \{0, 1, \ldots, 127\}$$

Unary term: sum of absolute differences (SAD) or normalized cross-correlation (NCC), e.g.,
$$\psi^U_i(y_i; x) = \sum_{(u,v) \in W_i} \left| x^{\text{left}}(u, v) - x^{\text{right}}(u - y_i, v) \right|$$

Pairwise term: discontinuity preserving prior,
$$\psi^P_{ij}(y_i, y_j) = \min \{ |y_i - y_j|, d_{\max} \}$$
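A sketch of the SAD unary term as a "cost volume" over all disparities (our own illustration for grayscale rectified images; the window size and variable names are assumptions):

```python
import numpy as np

# left, right: (H, W) float arrays; window: half-width of the SAD window W_i.
def sad_cost_volume(left, right, max_disp=127, window=2):
    H, W = left.shape
    # Pixels whose match would fall outside the image keep infinite cost.
    costs = np.full((H, W, max_disp + 1), np.inf)
    for d in range(max_disp + 1):
        # Absolute difference with the right image shifted d pixels.
        diff = np.abs(left[:, d:] - right[:, : W - d])
        # Box filter = sum over the (2*window+1)^2 window around each pixel.
        k = 2 * window + 1
        padded = np.pad(diff, window, mode="edge")
        sad = sum(
            padded[r : r + diff.shape[0], c : c + diff.shape[1]]
            for r in range(k) for c in range(k)
        )
        costs[:, d:, d] = sad
    return costs  # costs[i, j, d] = psi^U for disparity label d
```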
Image Denoising Model

Label space: pixel intensity or colour,
$$\mathcal{L} = \{0, 1, \ldots, 255\}$$

Unary term: squared distance,
$$\psi^U_i(y_i; x) = \|y_i - x_i\|^2$$

Pairwise term: truncated $L_2$ distance,
$$\psi^P_{ij}(y_i, y_j) = \min \{ \|y_i - y_j\|^2, d^2_{\max} \}$$
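Minimizing this energy exactly over a grid is hard, but a simple coordinate-descent sketch (ICM, iterated conditional modes) shows how the two terms interact. This is our illustration, not the tutorial's method; it only finds a local optimum and is written for clarity, not speed:

```python
import numpy as np

def icm_denoise(x, d2max=100.0, lam=1.0, iters=5):
    """Greedy minimization of the denoising energy on a grayscale image x."""
    y = x.astype(float).copy()
    labels = np.arange(256, dtype=float)
    H, W = x.shape
    for _ in range(iters):
        for r in range(H):
            for c in range(W):
                nbrs = [y[rr, cc] for rr, cc in
                        ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1))
                        if 0 <= rr < H and 0 <= cc < W]
                # Unary: squared distance to the observed intensity.
                unary = (labels - x[r, c]) ** 2
                # Pairwise: truncated L2 to each N4 neighbour's current label.
                pair = sum(np.minimum((labels - v) ** 2, d2max) for v in nbrs)
                # Set this pixel to the locally optimal label.
                y[r, c] = labels[np.argmin(unary + lam * pair)]
    return y
```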
Digital Photo Montage Model

Label space: image index,
$$\mathcal{L} = \{1, 2, \ldots, K\}$$

Unary term: none!

Pairwise term: seam penalty,
$$\psi^P_{ij}(y_i, y_j; x) = \|x_{y_i}(i) - x_{y_j}(i)\| + \|x_{y_i}(j) - x_{y_j}(j)\|$$
(or an edge-normalized variant)
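A direct transcription of the seam penalty (a sketch; `sources` holds the K aligned source images and i, j are the two neighbouring pixel coordinates):

```python
import numpy as np

def seam_penalty(sources, i, j, yi, yj):
    """How visible is the seam if pixel i comes from image yi and j from yj?

    sources: list of K aligned (H, W, 3) images; i, j: (row, col) tuples.
    """
    # Colour disagreement between the two source images at both pixels.
    return (np.linalg.norm(sources[yi][i] - sources[yj][i]) +
            np.linalg.norm(sources[yi][j] - sources[yj][j]))
```

The penalty is zero whenever the two sources agree at both pixels, so the optimizer is free to place seams in regions where the photographs already match.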
end of part 1