Markov Random Fields for Computer Vision (Part 1)

Machine Learning Summer School (MLSS 2011)

Stephen Gould
[email protected]
Australian National University

13–17 June, 2011
Pixel Labeling

Label every pixel in an image with a class label from some pre-defined set, i.e., $y_p \in \mathcal{L}$.

- Interactive figure-ground segmentation (Boykov and Jolly, 2001; Boykov and Funka-Lea, 2006)
- Surface context (Hoiem et al., 2005)
- Semantic labeling (He et al., 2004; Shotton et al., 2006; Gould et al., 2009)
- Stereo matching (Scharstein and Szeliski, 2002)
- Image denoising (Felzenszwalb and Huttenlocher, 2004; Szeliski et al., 2008)
Digital Photo Montage

(Agarwala et al., 2004)

[demonstration]
Tutorial Overview

Part 1. Pairwise conditional Markov random fields for the pixel labeling problem (45 minutes)
Part 2. Pseudo-boolean functions and graph-cuts (1 hour)
Part 3. Higher-order terms and inference as integer programming (30 minutes)

Please ask lots of questions!
Probability Review

Bayes rule:
$$\underbrace{P(y \mid x)}_{\text{posterior}} = \frac{\overbrace{P(x \mid y)}^{\text{likelihood}} \; \overbrace{P(y)}^{\text{prior}}}{P(x)}$$

Maximum a posteriori (MAP) inference: $y^\star = \operatorname{argmax}_{y} P(y \mid x)$.

Conditional independence: random variables $y$ and $x$ are conditionally independent given $z$ if $P(y, x \mid z) = P(y \mid z)\, P(x \mid z)$.
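As a quick worked example (the numbers are illustrative, not from the slides): suppose a pixel is foreground with prior $P(\text{fg}) = 0.3$, and an observed feature $x$ has likelihoods $P(x \mid \text{fg}) = 0.8$ and $P(x \mid \text{bg}) = 0.2$. Then
$$P(\text{fg} \mid x) = \frac{0.8 \times 0.3}{0.8 \times 0.3 + 0.2 \times 0.7} = \frac{0.24}{0.38} \approx 0.63,$$
so MAP inference labels the pixel foreground even though the prior favours background.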
Graphical Models

We can exploit conditional independence assumptions to represent probability distributions in a way that is both compact and efficient for inference.

This tutorial is all about one particular representation, called a Markov Random Field (MRF), and the associated inference algorithms that are used in computer vision.

[Four-node graph over A, B, C, D with edges A-B, B-D, D-C, C-A.]

The graph encodes the conditional independence $a \perp d \mid b, c$, and the joint distribution factorizes as $\frac{1}{Z}\, \psi(a, b)\, \psi(b, d)\, \psi(d, c)\, \psi(c, a)$.
Graphical Models

For the four-node example,
$$P(a, b, c, d) = \frac{1}{Z}\, \psi(a, b)\, \psi(b, d)\, \psi(d, c)\, \psi(c, a) = \frac{1}{Z} \exp \{ -\phi(a, b) - \phi(b, d) - \phi(d, c) - \phi(c, a) \}$$
where $\phi = -\log \psi$.
Energy Functions

Let $x$ be some observations (i.e., features from the image) and let $y = (y_1, \ldots, y_n)$ be a vector of random variables. Then we can write the conditional probability of $y$ given $x$ as
$$P(y \mid x) = \frac{1}{Z(x)} \exp \{ -E(y; x) \}$$
where $Z(x) = \sum_{y \in \mathcal{L}^n} \exp \{ -E(y; x) \}$ is called the partition function.

The energy function $E(y; x)$ usually has some structured form:
$$E(y; x) = \sum_c \psi_c(y_c; x)$$
where the $\psi_c(y_c; x)$ are clique potentials defined over subsets of random variables $y_c \subseteq y$.
Conditional Markov Random Fields

$$E(y; x) = \sum_c \psi_c(y_c; x) = \underbrace{\sum_{i \in V} \psi^U_i(y_i; x)}_{\text{unary}} + \underbrace{\sum_{ij \in E} \psi^P_{ij}(y_i, y_j; x)}_{\text{pairwise}} + \underbrace{\sum_{c \in C} \psi^H_c(y_c; x)}_{\text{higher-order}}$$

[Grid-structured model: observed pixels $x_1, \ldots, x_9$, one label variable $y_i$ per pixel, with each $y_i$ connected to its observation $x_i$ and to its neighbouring labels.]
Pixel Neighbourhoods

[Two 3×3 label grids: in the 4-connected neighbourhood $\mathcal{N}_4$ each $y_i$ is linked to its horizontal and vertical neighbours; in the 8-connected neighbourhood $\mathcal{N}_8$ the diagonal neighbours are linked as well.]
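To make the two neighbourhoods concrete, here is a small sketch (Python, not from the slides) that enumerates the pairwise edges of an image grid; each edge will later carry one pairwise potential.

```python
# Illustrative sketch: enumerate the pairwise edges of an H x W pixel
# grid under the N4 and N8 neighbourhoods.
def grid_edges(H, W, connectivity=4):
    """Return the list of neighbouring pixel pairs ((r, c), (r', c'))."""
    offsets = [(0, 1), (1, 0)]                # right, down  -> N4
    if connectivity == 8:
        offsets += [(1, 1), (1, -1)]          # diagonals    -> N8
    edges = []
    for r in range(H):
        for c in range(W):
            for dr, dc in offsets:
                rr, cc = r + dr, c + dc
                if 0 <= rr < H and 0 <= cc < W:
                    edges.append(((r, c), (rr, cc)))
    return edges

# A 3x3 grid has 12 N4 edges and 20 N8 edges.
assert len(grid_edges(3, 3, 4)) == 12
assert len(grid_edges(3, 3, 8)) == 20
```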
Binary MRF Example

Consider the following energy function for two binary random variables, $y_1$ and $y_2$:
$$E(y_1, y_2) = \psi_1(y_1) + \psi_2(y_2) + \psi_{12}(y_1, y_2)$$
with potentials

  ψ1:  y1 = 0: 5,  y1 = 1: 2
  ψ2:  y2 = 0: 1,  y2 = 1: 3
  ψ12: (0,0): 0,  (0,1): 3,  (1,0): 4,  (1,1): 0

Writing $\bar{y}_i = 1 - y_i$, the energy can be expressed algebraically as
$$E(y_1, y_2) = 5\bar{y}_1 + 2y_1 + \bar{y}_2 + 3y_2 + 3\bar{y}_1 y_2 + 4 y_1 \bar{y}_2.$$

Graphical model: two nodes $y_1$ and $y_2$ joined by a single edge.

Probability table, with $P(y) = \frac{1}{Z} \exp\{-E(y)\}$:

  y1  y2 |  E  |   P
   0   0 |  6  | 0.244
   0   1 | 11  | 0.002
   1   0 |  7  | 0.090
   1   1 |  5  | 0.664
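The table can be verified by brute-force enumeration; a minimal sketch with the potentials hard-coded from the example above:

```python
import itertools, math

psi1  = {0: 5, 1: 2}                                   # unary on y1
psi2  = {0: 1, 1: 3}                                   # unary on y2
psi12 = {(0, 0): 0, (0, 1): 3, (1, 0): 4, (1, 1): 0}   # pairwise

def E(y1, y2):
    return psi1[y1] + psi2[y2] + psi12[(y1, y2)]

# P(y) = exp(-E(y)) / Z, with Z summed over all 2^2 labelings.
Z = sum(math.exp(-E(y1, y2)) for y1, y2 in itertools.product((0, 1), repeat=2))
for y1, y2 in itertools.product((0, 1), repeat=2):
    print(y1, y2, E(y1, y2), round(math.exp(-E(y1, y2)) / Z, 3))
# Reproduces the table above: (1,1) is the MAP labeling with P = 0.664.
```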
Compactness of Representation

Consider a 1 mega-pixel image, e.g., $1000 \times 1000$ pixels. We want to annotate each pixel with a label from $\mathcal{L}$. Let $L = |\mathcal{L}|$.

There are $L^{10^6}$ possible ways to label such an image.

A naive encoding, i.e., one big table, would require $L^{10^6} - 1$ parameters.

A pairwise MRF over $\mathcal{N}_4$ requires $10^6 L$ parameters for the unary terms and $2 \times 1000 \times (1000 - 1) L^2$ parameters for the pairwise terms, i.e., $O(10^6 L^2)$. Even fewer are required if we share parameters.
Inference and Energy Minimization

We are usually interested in finding the most probable labeling,
$$y^\star = \operatorname{argmax}_{y} P(y \mid x) = \operatorname{argmin}_{y} E(y; x).$$
This is known as maximum a posteriori (MAP) inference or energy minimization.

A number of techniques can be used to find $y^\star$, including:

- message-passing (dynamic programming), as sketched below
- integer programming (part 3)
- graph-cuts (part 2)

However, in general, inference is NP-hard.
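As an illustration of the message-passing case, here is a minimal min-sum dynamic program that computes the exact MAP labeling of a chain-structured MRF (a sketch, not from the slides; function and variable names are ours). Chains and trees are exactly solvable; it is loopy graphs such as grids that make inference NP-hard in general.

```python
import numpy as np

# unary:    (n, L) array, unary[i, l]    = psi_i(l)
# pairwise: (L, L) array, pairwise[l, m] = psi(l, m), shared by all edges
def chain_map(unary, pairwise):
    n, L = unary.shape
    msg  = np.zeros((n, L))   # msg[i, l]: best cost of y_1..y_i with y_i = l
    back = np.zeros((n, L), dtype=int)
    msg[0] = unary[0]
    for i in range(1, n):
        # cost[l_prev, l_cur]: best cost so far + transition penalty.
        cost = msg[i - 1][:, None] + pairwise
        back[i] = cost.argmin(axis=0)
        msg[i] = cost.min(axis=0) + unary[i]
    # Backtrack from the best final label.
    y = [int(msg[-1].argmin())]
    for i in range(n - 1, 0, -1):
        y.append(int(back[i][y[-1]]))
    return y[::-1], float(msg[-1].min())
```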
Characterizing Markov Random Fields

Markov random fields can be characterized along a number of different dimensions:

- Label space: binary vs. multi-label; homogeneous vs. heterogeneous.
- Order: unary vs. pairwise vs. higher-order.
- Structure: chain vs. tree vs. grid vs. general graph; neighbourhood size.
- Potentials: submodular, convex, compressible.

These all affect the tractability of inference.
Markov Random Fields for Pixel Labeling

$$P(y \mid x) \propto P(x \mid y)\, P(y) = \exp \{ -\underbrace{E(y; x)}_{\text{energy}} \}$$

$$E(y; x) = \underbrace{\sum_{i \in V} \psi^U_i(y_i; x)}_{\text{unary}} + \underbrace{\sum_{ij \in \mathcal{N}_8} \psi^P_{ij}(y_i, y_j; x)}_{\text{pairwise}}$$

$$\psi^U_i(y_i; x) = -\sum_{\ell \in \mathcal{L}} [\![y_i = \ell]\!] \log \underbrace{P(x_i \mid \ell)}_{\text{likelihood}}$$

$$\psi^P_{ij}(y_i, y_j; x) = \lambda\, [\![y_i \neq y_j]\!] \quad \text{(Potts prior)}$$

Here the prior acts to smooth predictions (independent of $x$).
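A sketch of evaluating this energy for a candidate labeling (our own illustration; for brevity it uses the $\mathcal{N}_4$ neighbourhood rather than the slide's $\mathcal{N}_8$):

```python
import numpy as np

# neg_log_lik: (H, W, L) array of -log P(x_i | label); y: (H, W) int labels.
def potts_energy(neg_log_lik, y, lam):
    H, W, L = neg_log_lik.shape
    # Unary term: pick out -log P(x_i | y_i) at every pixel and sum.
    unary = neg_log_lik[np.arange(H)[:, None], np.arange(W)[None, :], y].sum()
    # Potts pairwise term: lambda per disagreeing N4 neighbour pair
    # (horizontal + vertical edges).
    pairwise = lam * ((y[:, 1:] != y[:, :-1]).sum() +
                      (y[1:, :] != y[:-1, :]).sum())
    return unary + pairwise
```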
Prior Strength

[Segmentation results for increasing smoothness weight $\lambda$ = 1, 4, 16, 128, 1024; larger $\lambda$ produces smoother labelings.]
Interactive Segmentation Model

Label space: foreground or background,
$$\mathcal{L} = \{0, 1\}$$

Unary term: Gaussian mixture models for foreground and background,
$$\psi^U_i(y_i; x) = \min_k \left\{ \tfrac{1}{2} \log |\Sigma_k| + \tfrac{1}{2} (x_i - \mu_k)^T \Sigma_k^{-1} (x_i - \mu_k) - \log \pi_k \right\}$$
where $(\pi_k, \mu_k, \Sigma_k)$ are the mixture parameters for the class $y_i$.

Pairwise term: contrast-dependent smoothness prior,
$$\psi^P_{ij}(y_i, y_j; x) = \begin{cases} \lambda_0 + \lambda_1 \exp\left( -\frac{\|x_i - x_j\|^2}{2\sigma^2} \right) & \text{if } y_i \neq y_j \\ 0 & \text{otherwise} \end{cases}$$
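A sketch of the two terms using scikit-learn's GaussianMixture (an assumption of ours, not part of the tutorial; note that score_samples scores the full mixture density rather than taking the minimum over components, a closely related choice, and lam0, lam1, sigma2 are illustrative constants):

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def gmm_unaries(img, fg_colours, bg_colours, K=5):
    """-log likelihood of every pixel under fg/bg colour mixtures.

    img: (H, W, 3) float image; fg_colours, bg_colours: (N, 3) arrays of
    colours sampled from the user's foreground/background scribbles.
    """
    X = img.reshape(-1, 3)
    fg = GaussianMixture(K).fit(fg_colours)
    bg = GaussianMixture(K).fit(bg_colours)
    # Column l holds the unary cost of assigning label l (0 = bg, 1 = fg).
    return np.stack([-bg.score_samples(X), -fg.score_samples(X)], axis=1)

def contrast_weight(xi, xj, lam0=1.0, lam1=9.0, sigma2=0.01):
    """Penalty paid when neighbours i and j take *different* labels."""
    return lam0 + lam1 * np.exp(-np.sum((xi - xj) ** 2) / (2 * sigma2))
```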
Stereo Matching Model

Label space: pixel disparity,
$$\mathcal{L} = \{0, 1, \ldots, 127\}$$

Unary term: sum of absolute differences (SAD) or normalized cross-correlation (NCC), e.g.,
$$\psi^U_i(y_i; x) = \sum_{(u,v) \in W_i} \left| x^{\text{left}}(u, v) - x^{\text{right}}(u - y_i, v) \right|$$

Pairwise term: discontinuity preserving prior,
$$\psi^P_{ij}(y_i, y_j) = \min \{ |y_i - y_j|, d_{\max} \}$$
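A sketch of the SAD unary term as a "cost volume" over all disparities (our own illustration for grayscale rectified images; the window size and variable names are assumptions):

```python
import numpy as np

# left, right: (H, W) float arrays; window: half-width of the SAD window W_i.
def sad_cost_volume(left, right, max_disp=127, window=2):
    H, W = left.shape
    # Pixels whose match would fall outside the image keep infinite cost.
    costs = np.full((H, W, max_disp + 1), np.inf)
    for d in range(max_disp + 1):
        # Absolute difference with the right image shifted d pixels.
        diff = np.abs(left[:, d:] - right[:, : W - d])
        # Box filter = sum over the (2*window+1)^2 window around each pixel.
        k = 2 * window + 1
        padded = np.pad(diff, window, mode="edge")
        sad = sum(
            padded[r : r + diff.shape[0], c : c + diff.shape[1]]
            for r in range(k) for c in range(k)
        )
        costs[:, d:, d] = sad
    return costs  # costs[i, j, d] = psi^U for disparity label d
```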
Image Denoising Model

Label space: pixel intensity or colour,
$$\mathcal{L} = \{0, 1, \ldots, 255\}$$

Unary term: squared distance,
$$\psi^U_i(y_i; x) = \|y_i - x_i\|^2$$

Pairwise term: truncated $L_2$ distance,
$$\psi^P_{ij}(y_i, y_j) = \min \{ \|y_i - y_j\|^2, d^2_{\max} \}$$
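Minimizing this energy exactly over a grid is hard, but a simple coordinate-descent sketch (ICM, iterated conditional modes) shows how the two terms interact. This is our illustration, not the tutorial's method; it only finds a local optimum and is written for clarity, not speed:

```python
import numpy as np

def icm_denoise(x, d2max=100.0, lam=1.0, iters=5):
    """Greedy minimization of the denoising energy on a grayscale image x."""
    y = x.astype(float).copy()
    labels = np.arange(256, dtype=float)
    H, W = x.shape
    for _ in range(iters):
        for r in range(H):
            for c in range(W):
                nbrs = [y[rr, cc] for rr, cc in
                        ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1))
                        if 0 <= rr < H and 0 <= cc < W]
                # Unary: squared distance to the observed intensity.
                unary = (labels - x[r, c]) ** 2
                # Pairwise: truncated L2 to each N4 neighbour's current label.
                pair = sum(np.minimum((labels - v) ** 2, d2max) for v in nbrs)
                # Set this pixel to the locally optimal label.
                y[r, c] = labels[np.argmin(unary + lam * pair)]
    return y
```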
Digital Photo Montage Model

Label space: image index,
$$\mathcal{L} = \{1, 2, \ldots, K\}$$

Unary term: none!

Pairwise term: seam penalty,
$$\psi^P_{ij}(y_i, y_j; x) = \|x_{y_i}(i) - x_{y_j}(i)\| + \|x_{y_i}(j) - x_{y_j}(j)\|$$
(or an edge-normalized variant)
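A direct transcription of the seam penalty (a sketch; `sources` holds the K aligned source images and i, j are the two neighbouring pixel coordinates):

```python
import numpy as np

def seam_penalty(sources, i, j, yi, yj):
    """How visible is the seam if pixel i comes from image yi and j from yj?

    sources: list of K aligned (H, W, 3) images; i, j: (row, col) tuples.
    """
    # Colour disagreement between the two source images at both pixels.
    return (np.linalg.norm(sources[yi][i] - sources[yj][i]) +
            np.linalg.norm(sources[yi][j] - sources[yj][j]))
```

The penalty is zero whenever the two sources agree at both pixels, so the optimizer is free to place seams in regions where the photographs already match.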
end of part 1