ICCV2009: MAP Inference in Discrete Models: Part 4


Pushmeet Kohli

ICCV 2009

Course programme

9.30-10.00 Introduction (Andrew Blake)

10.00-11.00 Discrete Models in Computer Vision (Carsten Rother)

15min Coffee break

11.15-12.30 Message Passing: DP, TRW, LP relaxation (Pawan Kumar)

12.30-13.00 Quadratic pseudo-boolean optimization (Pushmeet Kohli)

1 hour Lunch break

14:00-15.00 Transformation and move-making methods (Pushmeet Kohli)

15:00-15.30 Speed and Efficiency (Pushmeet Kohli)

15min Coffee break

15:45-16.15 Comparison of Methods (Carsten Rother)

16:30-17.30 Recent Advances: Dual-decomposition, higher-order, etc. (Carsten Rother + Pawan Kumar)

All material will be online after the conference: http://research.microsoft.com/en-us/um/cambridge/projects/tutorial/

E(x) = ∑i fi (xi) + ∑ij gij (xi,xj) + ∑c hc (xc)

Unary Pairwise Higher Order

Image Segmentation

E(x) = ∑i ci xi + ∑ij dij |xi − xj|

E: {0,1}n → R

n = number of pixels

Space of Problems

Submodular

Functions

CSP

Tree Structured

Pairwise: O(n³)

MAXCUT

O(n⁶)

n = Number of Variables

Segmentation Energy

NP-Hard

More General Minimization Problems

st-mincut and Pseudo-booleanoptimization

Speed and Efficiency

More General Minimization Problems

st-mincut and Pseudo-booleanoptimization

Speed and Efficiency

Example: n = 2, A = [1,0], B = [0,1]

f([1,0]) + f([0,1]) ≥ f([1,1]) + f([0,0])

Property : Sum of submodular functions is submodular

E(x) = ∑i ci xi + ∑ij dij |xi − xj|

Binary Image Segmentation Energy is submodular

A pseudo-boolean function f: {0,1}ⁿ → ℝ is submodular if

f(A) + f(B) ≥ f(A∨B) + f(A∧B)  for all A,B ∈ {0,1}ⁿ

(∨ = element-wise OR, ∧ = element-wise AND)
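The definition can be checked mechanically on small instances. Below is a brute-force Python sketch (my own illustration, not from the slides) that verifies the submodularity inequality for a tiny 3-variable segmentation energy; the coefficient values are made up:

```python
from itertools import product

# Small instance of the segmentation energy
# E(x) = sum_i c_i x_i + sum_ij d_ij |x_i - x_j| with made-up coefficients.
n = 3
c = [2.0, -1.0, 3.0]                    # unary coefficients
d = {(0, 1): 1.5, (1, 2): 2.0}          # pairwise coefficients, d_ij >= 0

def E(x):
    unary = sum(c[i] * x[i] for i in range(n))
    pairwise = sum(w * abs(x[i] - x[j]) for (i, j), w in d.items())
    return unary + pairwise

def is_submodular(f, n):
    # Check f(A) + f(B) >= f(A or B) + f(A and B) for all A, B in {0,1}^n
    for A in product([0, 1], repeat=n):
        for B in product([0, 1], repeat=n):
            union = tuple(a | b for a, b in zip(A, B))
            inter = tuple(a & b for a, b in zip(A, B))
            if f(A) + f(B) < f(union) + f(inter) - 1e-9:
                return False
    return True

print(is_submodular(E, n))  # True
```

With a negative pairwise coefficient the same check fails, which is why dij ≥ 0 matters.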

Discrete Analogues of Concave Functions[Lovasz, ’83]

Widely applied in Operations Research

Applications in Machine Learning MAP Inference in Markov Random Fields

Clustering [Narasimhan , Jojic, & Bilmes, NIPS 2005]

Structure Learning [Narasimhan & Bilmes, NIPS 2006]

Maximizing the spread of influence through a social network [Kempe, Kleinberg & Tardos, KDD 2003]

Polynomial time algorithms Ellipsoid Algorithm: [Grotschel, Lovasz & Schrijver ‘81]

First strongly polynomial algorithm: [Iwata et al. ’00] [A. Schrijver ’00]

Current Best: O(n⁵ Q + n⁶) [Q is function evaluation time] [Orlin ‘07]

Symmetric functions: E(x) = E(1−x) can be minimized in O(n³)

Minimizing pairwise submodular functions can be transformed to st-mincut/max-flow [Hammer, 1965]

Very low empirical running time ~ O(n)

E(x) = ∑i fi (xi) + ∑ij gij (xi,xj)

[Figure: example network with Source, Sink and nodes v1, v2; edge capacities 2, 5, 9, 4, 2, 1]

Graph (V, E, C)

Vertices V = {v1, v2, ..., vn}

Edges E = {(v1, v2), ...}

Costs C = {c(1, 2), ...}


What is a st-cut?


An st-cut (S,T) divides the nodes between source and sink.

What is the cost of a st-cut?

Sum of cost of all edges going from S to T

5 + 1 + 9 = 15


What is the st-mincut?

st-cut with the minimum cost

[Figure: the st-mincut of the example network]

st-mincut cost = 2 + 2 + 4 = 8

Construct a graph such that:

1. Any st-cut corresponds to an assignment of x

2. The cost of the cut is equal to the energy of x : E(x)

[Figure: the st-mincut (S, T) of the constructed graph gives the minimizer of E(x)]

[Hammer, 1965] [Kolmogorov and Zabih, 2002]

E(x) = ∑i θi (xi) + ∑ij θij (xi,xj)

θij(0,1) + θij(1,0) ≥ θij(0,0) + θij(1,1)  for all ij

E(x) = ∑i ci xi + ∑ij cij xi(1−xj),  cij ≥ 0

Equivalent (transformable)

Source (0), Sink (1)

Build the graph for E term by term; each coefficient becomes the capacity of one edge:

E(a1,a2) = 2a1                                            [edge of capacity 2]

E(a1,a2) = 2a1 + 5ā1                                      [edge of capacity 5]

E(a1,a2) = 2a1 + 5ā1 + 9a2 + 4ā2                          [edges of capacity 9 and 4]

E(a1,a2) = 2a1 + 5ā1 + 9a2 + 4ā2 + 2a1ā2                  [pairwise edge of capacity 2]

E(a1,a2) = 2a1 + 5ā1 + 9a2 + 4ā2 + 2a1ā2 + ā1a2           [pairwise edge of capacity 1]

Example cut: a1 = 1, a2 = 1

E(1,1) = 11 = cost of this cut

The st-mincut: a1 = 1, a2 = 0

E(1,0) = 8 = st-mincut cost

[Figure: the same network with capacities 2, 5, 9, 4, 2, 1]

Solve the dual maximum flow problem:

Compute the maximum flow between Source and Sink such that

Edges: Flow ≤ Capacity
Nodes: Flow in = Flow out

(assuming non-negative capacities)

Min-cut/Max-flow Theorem: in every network, the maximum flow equals the cost of the st-mincut.

Augmenting Path Based Algorithms

1. Find a path from source to sink with positive residual capacity

2. Push the maximum possible flow through this path

3. Repeat until no such path can be found

Run on the example network:

Flow = 0 → push 2 along a direct source–v1–sink path → Flow = 2

Flow = 2 → push 4 along a direct source–v2–sink path → Flow = 6

Flow = 6 → push 2 along a path through the edge between v2 and v1 → Flow = 8

Flow = 8 → no augmenting path remains: the maximum flow (= st-mincut cost) is 8
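The three steps above can be sketched directly. Here is a minimal BFS-based augmenting-path max-flow in Python (Edmonds-Karp style), run on the example network; the node names and dict-of-dicts graph representation are my own choices, not from the tutorial code:

```python
from collections import deque

def max_flow(cap, s, t):
    # cap: dict-of-dicts of residual capacities, modified in place
    flow = 0
    while True:
        # 1. BFS for an augmenting path with positive residual capacity
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:
            u = queue.popleft()
            for v, c in cap[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            return flow  # 3. no augmenting path remains
        # 2. find the bottleneck, then push that much flow along the path
        path, v = [], t
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        push = min(cap[u][v] for u, v in path)
        for u, v in path:
            cap[u][v] -= push
            cap[v].setdefault(u, 0)
            cap[v][u] += push   # reverse edge keeps the residual graph valid
        flow += push

# Capacities follow the example energy E(a1,a2) = 2a1 + 5ā1 + 9a2 + 4ā2 + 2a1ā2 + ā1a2
cap = {
    's':  {'v1': 2, 'v2': 9},
    'v1': {'t': 5, 'v2': 1},
    'v2': {'t': 4, 'v1': 2},
    't':  {},
}
print(max_flow(cap, 's', 't'))  # 8, the st-mincut cost from the slides
```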

Maxflow can be viewed as reparametrizing the energy. Start again from

E(a1,a2) = 2a1 + 5ā1 + 9a2 + 4ā2 + 2a1ā2 + ā1a2     (Source = 0, Sink = 1)

2a1 + 5ā1 = 2(a1 + ā1) + 3ā1 = 2 + 3ā1

E(a1,a2) = 2 + 3ā1 + 9a2 + 4ā2 + 2a1ā2 + ā1a2

9a2 + 4ā2 = 4(a2 + ā2) + 5a2 = 4 + 5a2

E(a1,a2) = 2 + 3ā1 + 5a2 + 4 + 2a1ā2 + ā1a2

E(a1,a2) = 6 + 3ā1 + 5a2 + 2a1ā2 + ā1a2

3ā1 + 5a2 + 2a1ā2 = 2(ā1 + a2 + a1ā2) + ā1 + 3a2 = 2(1 + ā1a2) + ā1 + 3a2

since F1 = ā1 + a2 + a1ā2 and F2 = 1 + ā1a2 agree on every assignment:

a1 a2 | F1 F2
 0  0 |  1  1
 0  1 |  2  2
 1  0 |  1  1
 1  1 |  1  1

E(a1,a2) = 8 + ā1 + 3a2 + 3ā1a2
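The algebra above is easy to verify by enumerating all four assignments; a small Python check (my own, not part of the slides):

```python
from itertools import product

# The original energy and the reparametrized energy obtained after maxflow.
def E_orig(a1, a2):
    return 2*a1 + 5*(1-a1) + 9*a2 + 4*(1-a2) + 2*a1*(1-a2) + (1-a1)*a2

def E_repar(a1, a2):
    return 8 + (1-a1) + 3*a2 + 3*(1-a1)*a2

# They agree on every assignment, and the constant 8 (= total flow) is the minimum.
for a1, a2 in product([0, 1], repeat=2):
    assert E_orig(a1, a2) == E_repar(a1, a2)

print(min(E_orig(a1, a2) for a1, a2 in product([0, 1], repeat=2)))  # 8
```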

E(a1,a2) = 8 + ā1 + 3a2 + 3ā1a2

No more augmenting paths are possible.

Total flow = 8 is a lower bound on the optimal solution; the residual graph (all remaining coefficients positive) carries the rest of the energy.

Tight bound → inference of the optimal solution becomes trivial.

Reading the labels off the residual graph: a1 = 1, a2 = 0

E(1,0) = 8 = st-mincut cost = total flow

[Slide credit: Andrew Goldberg]

Worst-case complexities of Augmenting Path and Push-Relabel algorithms (n: #nodes, m: #edges, U: maximum edge weight; the algorithms assume non-negative edge weights).

Ford Fulkerson: choose any augmenting path.

[Figure: network with two direct Source–Sink paths of capacity 1000 through a1 and a2, joined by an edge of capacity 1]

Good augmenting paths: the two direct capacity-1000 paths.

Bad augmenting path: a path through the capacity-1 edge. It pushes only 1 unit of flow (leaving residual capacities 999, 1000, 1000, 999, 1), and the next path through the reversed middle edge again pushes only 1 unit.

n: #nodes, m: #edges

Choosing paths badly, we will have to perform 2000 augmentations!

Worst case complexity: O(m × total_flow)  (pseudo-polynomial bound: depends on the flow value)

Dinic: choose the shortest augmenting path.

Worst case complexity: O(m n²)

Specialized algorithms for vision problems: grid graphs with low connectivity (m ~ O(n)).

Dual search tree augmenting path algorithm [Boykov and Kolmogorov, PAMI 2004]

• Finds approximate shortest augmenting paths efficiently

• High worst-case time complexity

• Empirically outperforms other algorithms on vision problems

Efficient code available on the web: http://www.adastral.ucl.ac.uk/~vladkolm/software.html

E(x) = ∑i ci xi + ∑ij dij |xi − xj|

x* = arg minx E(x)

How to minimize E(x)?

E: {0,1}ⁿ → ℝ,  0 → fg, 1 → bg

n = number of pixels

[Figure: segmentation graph with Source (0), Sink (1), pixel nodes a1, a2, terminal edges of weight fgCost/bgCost and a pairwise edge of weight cost(p,q)]

Graph *g;

for all pixels p
    /* Add a node to the graph */
    nodeID(p) = g->add_node();
    /* Set cost of terminal edges */
    set_weights(nodeID(p), fgCost(p), bgCost(p));
end

for all adjacent pixels p,q
    add_weights(nodeID(p), nodeID(q), cost(p,q));
end

g->compute_maxflow();

label_p = g->is_connected_to_source(nodeID(p));  // the label of pixel p (0 or 1)

Resulting labelling: a1 = bg, a2 = fg
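As a runnable counterpart to the library-style pseudocode above, here is a self-contained Python sketch of the same recipe on a hypothetical 1-D "image" of 4 pixels. It uses a plain BFS augmenting-path maxflow in place of the graph-cut library, and all costs are made-up illustrative values:

```python
from collections import deque

def segment(fg_cost, bg_cost, lam):
    """Binary segmentation of a 1-D image via st-mincut (label 0 = fg, 1 = bg)."""
    n = len(fg_cost)
    s, t = n, n + 1
    cap = {u: {} for u in range(n + 2)}
    for i in range(n):               # terminal edges (unary costs)
        cap[s][i] = bg_cost[i]       # cut when pixel i takes label 1 (bg)
        cap[i][t] = fg_cost[i]       # cut when pixel i takes label 0 (fg)
    for i in range(n - 1):           # Potts smoothness between neighbours
        cap[i][i + 1] = lam
        cap[i + 1][i] = lam

    def find_path():                 # BFS for an augmenting path
        parent = {s: None}
        q = deque([s])
        while q:
            u = q.popleft()
            for v, c in cap[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    if v == t:
                        return parent
                    q.append(v)
        return None

    flow = 0
    parent = find_path()
    while parent is not None:        # push flow until no path remains
        path, v = [], t
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        push = min(cap[u][v] for u, v in path)
        for u, v in path:
            cap[u][v] -= push
            cap[v][u] = cap[v].get(u, 0) + push
        flow += push
        parent = find_path()

    # label = 0 (fg) if the pixel is still connected to the source
    seen, q = {s}, deque([s])
    while q:
        u = q.popleft()
        for v, c in cap[u].items():
            if c > 0 and v not in seen:
                seen.add(v)
                q.append(v)
    return [0 if i in seen else 1 for i in range(n)], flow

labels, energy = segment([1, 1, 8, 8], [8, 8, 1, 1], 2)
print(labels, energy)  # [0, 0, 1, 1] 6
```

The returned flow equals the minimum of the segmentation energy, mirroring the equivalence established earlier in the tutorial.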

Lunch

MIT Press, summer 2010

Topics of this course and much, much more

Contributors: usual suspects – lecturers on this course + Boykov,

Kolmogorov, Weiss, Freeman, ....

one for the office and one for home

www.research.microsoft.com/vision/MRFbook

Advances in Markov Random Fields for Computer Vision

More General Minimization Problems

st-mincut and Pseudo-booleanoptimization

Speed and Efficiency

Non-submodular Energy Functions

Mixed (Real-Integer) Problems

Higher Order Energy Functions

Multi-label Problems

Ordered Labels ▪ Stereo (depth labels)

Unordered Labels ▪ Object segmentation (‘car’, ‘road’, ‘person’)


Minimizing general non-submodular functions is NP-hard; a commonly used approach is to solve a relaxation of the problem.

E(x) = ∑i θi (xi) + ∑ij θij (xi,xj)

θij(0,1) + θij(1,0) < θij(0,0) + θij(1,1)  for some ij

[Boros and Hammer, ‘02]

The energy is split into unary terms, pairwise submodular terms θpq with

θpq(0,1) + θpq(1,0) ≥ θpq(0,0) + θpq(1,1)

and pairwise non-submodular terms θ̃pq with

θ̃pq(0,1) + θ̃pq(1,0) < θ̃pq(0,0) + θ̃pq(1,1)

[Boros and Hammer, ‘02]

Double the number of variables: xp → (xp, x̄p), with x̄p = 1 − xp  [Boros and Hammer, ‘02]

Ignore the constraint and solve  [Boros and Hammer, ‘02]

Local Optimality  [Boros and Hammer, ‘02]

[Rother, Kolmogorov, Lempitsky, Szummer] [CVPR 2007]

QPBO leaves some variables unlabelled (?), e.g. on nodes p, q, r, s, t.

Probe node p: fix p = 0 and p = 1 in turn and re-run QPBO.

What can we say about the variables?

• r → is always 0
• s → is always equal to q
• t → is 0 when q = 1

Probe nodes in an order until energy unchanged

Simplified energy preserves global optimality and (sometimes) gives the global minimum

Result depends slightly on the order

• Property: E(y’) ≤ E(y) [autarky property]

x (partial, from QPBO)    y (e.g. from BP)    y’ = FUSE(x,y)

[Figure: grids showing a partial labelling x, a complete labelling y, and their fusion y’, which keeps x where x is labelled and takes y elsewhere]

Non-submodular Energy Functions

Mixed (Real-Integer) Problems

Higher Order Energy Functions

Multi-label Problems

Ordered Labels ▪ Stereo (image intensity, depth)

Unordered Labels ▪ Object segmentation (‘car’, ‘road’, ‘person’)

[Kumar et al, 05] [Kohli et al, 06,08]

Image

colour appearance based Segmentation

Need for a human-like segmentation

Segmentation Result

x – binary image segmentation (xi ∊ {0,1})

ω – non-local parameter (lives in some large set Ω)

E(x,ω) = C(ω) + ∑i θi (ω, xi) + ∑ij θij (ω, xi, xj)
          constant   unary potentials   pairwise potentials (≥ 0)

Rough Shape Prior: Stickman Model

ω: Pose

θi (ω, xi): Shape Prior

[Kohli et al, 06,08]

x – binary image segmentation (xi ∊ {0,1})

ω – non-local parameter (lives in some large set Ω)

E(x,ω) = C(ω) + ∑i θi (ω, xi) + ∑ij θij (ω, xi, xj)
          constant   unary potentials   pairwise potentials (≥ 0)

ω: Template Position, Scale, Orientation

[Kohli et al, 06,08] [Lempitsky et al, 08]

x – binary image segmentation (xi ∊ {0,1})

ω – non-local parameter (lives in some large set Ω)

E(x,ω) = C(ω) + ∑i θi (ω, xi) + ∑ij θij (ω, xi, xj)
          constant   unary potentials   pairwise potentials (≥ 0)

{x*, ω*} = arg minx,ω E(x,ω)

• Standard “graph cut” energy if ω is fixed

[Kohli et al, 06,08] [Lempitsky et al, 08]

Local Method: Gradient Descent over ω

ω* = arg minω minx E(x,ω)

(the inner minimization over x is submodular)

[Kohli et al, 06,08]


Dynamic Graph Cuts

15–20× speedup!

E(x,ω1) and E(x,ω2) are similar energy functions

[Kohli et al, 06,08]

[Kohli et al, 06,08]

Global Method: Branch and Mincut

[Lempitsky et al, 08]

Produces the global optimal solution

Exhaustively explores Ω in the worst case

[Figure: hierarchical clustering of Ω into nested subsets Ω0, ...]

Ω (the space of ω) is hierarchically clustered.

Standard best-first branch-and-bound search (always expand the node with the lowest lower bound): only a small fraction of nodes is visited.

30,000,000 shapes

Exhaustive search: 30,000,000 mincuts

Branch-and-Mincut: 12,000 mincuts

Speed-up: 2500× (30 seconds per 312×272 image)

[Lempitsky et al, 08]

Left ventricle epicardium tracking (work in progress)

Branch & Bound segmentation

Shape prior from other sequences

5,200,000 templates

≈20 seconds per frame

Speed-up: 1150×

Data courtesy: Dr Harald Becher, Department of Cardiovascular Medicine, University of Oxford

Original sequence No shape prior

[Lempitsky et al, 08]

Non-submodular Energy Functions

Mixed (Real-Integer) Problems

Higher Order Energy Functions

Multi-label Problems

Ordered Labels ▪ Stereo (depth labels)

Unordered Labels ▪ Object segmentation (‘car’, ‘road’, ‘person’)

Pairwise functions have limited expressive power

Inability to incorporate region based likelihoods and priors

Field of Experts Model [Roth & Black, CVPR 2005] [Potetz, CVPR 2007]

Minimize Curvature [Woodford et al., CVPR 2008]

Other examples: [Rother, Kolmogorov, Minka & Blake, CVPR 2006] [Komodakis and Paragios, CVPR 2009] [Rother, Kohli, Feng, Jia, CVPR 2009] [Ishikawa, CVPR 2009] and many others ...

E(x) = ∑i ci xi + ∑ij dij |xi − xj|

E: {0,1}ⁿ → ℝ,  0 → fg, 1 → bg

n = number of pixels

[Boykov and Jolly ‘ 01] [Blake et al. ‘04] [Rother, Kolmogorov and Blake `04]

Image Unary Cost Segmentation

Patch Dictionary (Tree)

h(Xp) = { C1    if xi = 0 for all i ∈ p
        { Cmax  otherwise

[Kohli et al. ‘07]

E(x) = ∑i ci xi + ∑ij dij |xi − xj| + ∑p hp (Xp)

h(Xp) = { C1    if xi = 0 for all i ∈ p
        { Cmax  otherwise

E: {0,1}ⁿ → ℝ,  0 → fg, 1 → bg

n = number of pixels

[Kohli et al. ‘07]

E(x) = ∑i ci xi + ∑ij dij |xi − xj| + ∑p hp (Xp)

Image → Pairwise Segmentation → Final Segmentation

E: {0,1}ⁿ → ℝ,  0 → fg, 1 → bg

n = number of pixels

[Kohli et al. ‘07]

Higher Order Submodular Functions → Pairwise Submodular Function → st-mincut (S, T)

Billionnet and Minoux [DAM 1985], Kolmogorov & Zabih [PAMI 2004], Freedman & Drineas [CVPR 2005], Kohli Kumar Torr [CVPR 2007, PAMI 2008], Kohli Ladicky Torr [CVPR 2008, IJCV 2009], Ramalingam Kohli Alahari Torr [CVPR 2008], Zivny et al. [CP 2008]

Exact transformation: which higher order functions can be converted to pairwise submodular functions?

Identified transformable families of higher order functions s.t.

1. A constant or polynomial number of auxiliary variables (a) is added

2. All resulting pairwise functions (g) are submodular

Example: H(X) = F(∑i xi) with F concave  [Kohli et al. ‘08]

[Figure: a concave function H(X) plotted against ∑ xi]

Simple example using auxiliary variables:

f(x) = { 0   if all xi = 0
       { C1  otherwise

minx f(x) = minx,a C1·a + C1·∑i ā·xi ,   x ∈ {0,1}ⁿ, a ∈ {0,1}

(higher order submodular function → quadratic submodular function)

∑xi < 1  →  a = 0 (ā = 1)  →  f(x) = 0
∑xi ≥ 1  →  a = 1 (ā = 0)  →  f(x) = C1
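The case analysis above can be confirmed by brute force; a small Python check (my own illustration; C1 and n are arbitrary test values, not from the slides):

```python
from itertools import product

# f(x) = 0 if all x_i = 0, else C1, should equal the minimum over the
# auxiliary variable a in {0,1} of C1*a + C1*(1-a)*sum_i x_i.
C1, n = 7, 4

def f(x):
    return 0 if sum(x) == 0 else C1

def transformed(x):
    return min(C1 * a + C1 * (1 - a) * sum(x) for a in (0, 1))

assert all(f(x) == transformed(x) for x in product([0, 1], repeat=n))
print("transformation exact for all", 2 ** n, "assignments")
```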

minx f(x) = minx,a C1·a + C1·∑i ā·xi

[Figure: C1·a (the a = 1 branch) and C1·∑xi (the a = 0 branch) plotted against ∑xi = 1, 2, 3, ...; the lower envelope of concave functions is concave]

More generally: minx f(x) = minx,a f1(x)·a + f2(x)·ā

(higher order submodular function → quadratic submodular function)

[Figure: f1(x) (a = 1) and f2(x) (a = 0) plotted against ∑xi; the lower envelope of concave functions is concave]

Transforming Potentials with 3 variables [Woodford, Fitzgibbon, Reid, Torr, CVPR 2008]

Transforming general “sparse” higher order functions [Rother, Kohli, Feng, Jia, CVPR 2009][Ishikawa, CVPR 2009][Komodakis and Paragios, CVPR 2009]

Test Image Test Image(60% Noise)

TrainingImage

Result

PairwiseEnergy

P(x)

Minimized usingst-mincut or max-product

message passing


Higher Order Structure not Preserved

Minimize: E(x) = P(x) + ∑c hc (Xc)

where hc: {0,1}^|c| → ℝ is a higher order function over a patch c (|c| = 10×10 = 100) that assigns a cost to each of the 2^100 possible labellings!

Exploit the structure of the function to transform it to a pairwise function.

[Figure: learned patterns p1, p2, p3]

Test Image Test Image(60% Noise)

TrainingImage

PairwiseResult

Higher-Order Result

Learned Patterns

[Joint work with Carsten Rother ]

Non-submodular Energy Functions

Mixed (Real-Integer) Problems

Higher Order Energy Functions

Multi-label Problems

Ordered Labels ▪ Stereo (depth labels)

Unordered Labels ▪ Object segmentation (‘car’, ‘road’, ‘person’)

Exact Transformation to QPBF

Move making algorithms

miny E(y) = ∑i fi (yi) + ∑ij gij (yi,yj)

y ∈ Labels L = {l1, l2, … , lk}

[Roy and Cox ’98] [Ishikawa ’03] [Schlesinger & Flach ’06]

[Ramalingam, Alahari, Kohli, and Torr ’08]

So what is the problem?

Em (y1, y2, ..., yn)  →  Eb (x1, x2, ..., xm)

Multi-label problem → binary label problem

yi ∈ L = {l1, l2, … , lk}   xi ∈ B = {0,1}

such that, with Y and X the sets of feasible solutions:

1. There is a one-to-one encoding function T: X → Y

2. arg min Em(y) = T(arg min Eb(x))

• Popular encoding scheme [Roy and Cox ’98, Ishikawa ’03, Schlesinger & Flach ’06]

# Nodes = n × k

# Pairwise edges = m × k²


Ishikawa’s result:

E(y) = ∑i θi (yi) + ∑ij θij (yi,yj),   y ∈ Labels L = {l1, l2, … , lk}

θij (yi,yj) = g(|yi − yj|) with g a convex function

[Figure: convex g(|yi − yj|) plotted against |yi − yj|]


Schlesinger & Flach ’06:

E(y) = ∑i θi (yi) + ∑ij θij (yi,yj),   y ∈ Labels L = {l1, l2, … , lk}

θij(li+1, lj) + θij(li, lj+1) ≥ θij(li, lj) + θij(li+1, lj+1)
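The condition is easy to test numerically. The sketch below (my own illustration, not from the slides) checks it for the convex cost g(d) = d² on a small ordered label set, and shows that a truncated quadratic fails it:

```python
# Check the multi-label submodularity condition
# theta(li+1, lj) + theta(li, lj+1) >= theta(li, lj) + theta(li+1, lj+1)
# on labels 0..k-1 (stand-ins for l1..lk).
k = 5

def holds(theta):
    return all(
        theta(li + 1, lj) + theta(li, lj + 1)
        >= theta(li, lj) + theta(li + 1, lj + 1)
        for li in range(k - 1) for lj in range(k - 1)
    )

convex_ok = holds(lambda yi, yj: (yi - yj) ** 2)         # convex cost: passes
trunc_ok = holds(lambda yi, yj: min((yi - yj) ** 2, 4))  # truncated cost: fails
print(convex_ok, trunc_ok)  # True False
```

This matches the discussion later in the section: truncated (robust) costs fall outside the exactly solvable family.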

Image MAP Solution

Scanlinealgorithm

[Roy and Cox, 98]

Applicability: cannot handle truncated costs (non-robust)

Computational cost: very high, since the problem size = |Variables| × |Labels|

Gray level image denoising (1 Mpixel image): ~2.5 × 10⁸ graph nodes

θij (yi,yj) = g(|yi − yj|)

[Figure: discontinuity preserving (truncated) potential plotted against |yi − yj|; Blake & Zisserman ’83, ’87]

Transformation              Unary Potentials   Pairwise Potentials     Complexity
Ishikawa [03]               Arbitrary          Convex and Symmetric    T(nk, mk²)
Schlesinger [06]            Arbitrary          Submodular              T(nk, mk²)
Hochbaum [01]               Linear             Convex and Symmetric    T(n, m) + n log k
Hochbaum [01]               Convex             Convex and Symmetric    O(mn log n log nk)

Other “less known” algorithms exist.  T(a,b) = complexity of maxflow with a nodes and b edges.

Exact Transformation to QPBF

Move making algorithms

miny E(y) = ∑i fi (yi) + ∑ij gij (yi,yj)

y ∈ Labels L = {l1, l2, … , lk}

[Boykov , Veksler and Zabih 2001] [Woodford, Fitzgibbon, Reid, Torr, 2008]

[Lempitsky, Rother, Blake, 2008] [Veksler, 2008] [Kohli, Ladicky, Torr 2008]

[Figure: energy plotted over the solution space, showing the current solution, its search neighbourhood, and the optimal move]

Key property of the move space (t): a bigger move space gives

• Better solutions

• But finding the optimal move becomes harder

Minimizing Pairwise Functions [Boykov, Veksler and Zabih, PAMI 2001]

• Series of locally optimal moves

• Each move reduces energy

• Optimal move by minimizing submodular function

Space of solutions (x): Lⁿ    Move space (t): 2ⁿ

n = number of variables, L = number of labels

Kohli et al. ‘07, ‘08, ‘09 extend this to minimize higher order functions

Minimize over move variables t

x = t x1 + (1-t) x2

New solution

Current Solution

Second solution

Em(t) = E(t x1 + (1−t) x2)

For certain x1 and x2, the move energy is a submodular QPBF.

[Boykov , Veksler and Zabih 2001]

• Variables labeled α, β can swap their labels

[Boykov , Veksler and Zabih 2001]

Sky

House

Tree

Ground. Swap Sky, House

• Variables labeled α, β can swap their labels

[Boykov , Veksler and Zabih 2001]

Move energy is submodular if:

Unary Potentials: Arbitrary

Pairwise potentials: Semi-metric

θij (la,lb) ≥ 0,   θij (la,lb) = 0 ⇔ a = b

Examples: Potts model, Truncated Convex

[Boykov , Veksler and Zabih 2001]

• Variables labeled α, β can swap their labels

[Boykov, Veksler, Zabih]

• Variables take label a or retain current label

[Boykov , Veksler and Zabih 2001]

Sky

House

Tree

Ground

Initialize with Tree. Status: Expand Ground → Expand House → Expand Sky

[Boykov, Veksler and Zabih 2001]

• Variables take label a or retain current label

Move energy is submodular if:

Unary Potentials: Arbitrary

Pairwise potentials: Metric

[Boykov, Veksler, Zabih]

Metric = semi-metric + triangle inequality:

θij (la,lb) + θij (lb,lc) ≥ θij (la,lc)

Examples: Potts model, Truncated linear

Cannot solve truncated quadratic

• Variables take label a or retain current label

[Boykov , Veksler and Zabih 2001]

Expansion and Swap can be derived as a primal dual scheme

Get solution of the dual problem which is a lower bound on the energy of solution

Weak guarantee on the solution

[Komodakis et al 05, 07]

E(x) ≤ 2 (dmax / dmin) E(x*)

where dmax and dmin are the largest and smallest values of the pairwise distance θij (li,lj) = g(|li − lj|).

Move Type   First Solution   Second Solution   Guarantee
Expansion   Old solution     All alpha         Metric
Fusion      Any solution     Any solution      (none)

Minimize over move variables t

x = t x1 + (1-t) x2

New solution

First solution

Second solution

Move functions can be non-submodular!!

x = t x1 + (1−t) x2, where x1 and x2 can be continuous proposals

[Figure: fusing two solutions x1 and x2 into x]

Optical Flow Example

Final Solution

Solution from

Method 1

Solution from

Method 2

[Woodford, Fitzgibbon, Reid, Torr, 2008] [Lempitsky, Rother, Blake, 2008]

Move variables can be multi-label

Optimal move found out by using the Ishikawa Transform

Useful for minimizing energies with truncated convex pairwise potentials

θij (yi,yj) = min(|yi − yj|², T)

[Figure: truncated quadratic potential, truncated at T, plotted against |yi − yj|]

x = (t==1) x1 + (t==2) x2 + … + (t==k) xk

[Veksler, 2007]

[Veksler, 2008]

Image / Noisy Image

Range Moves

Expansion Move

Why?

3,600,000,000 pixels, created from about 800 8-megapixel images

[Kopf et al. (MSR Redmond) SIGGRAPH 2007 ]


Processing videos: a 1 minute video at 1M pixel resolution = 3.6B pixels

3D reconstruction: 500 × 500 × 500 = 0.125B voxels

Kohli & Torr (ICCV05, PAMI07)

Can we do better?

Segment the first frame, then segment the second frame by reusing the computation.  [Kohli & Torr, ICCV05, PAMI07]

Image

Flow Segmentation

[Figure: dynamic graph cuts. Minimize EA (frame 1) to obtain solution SA; reparametrize EB (frame 2) into a simpler energy EB* whose cost depends on the differences between A and B; minimizing EB* gives SB. Reusing computation yields a 3–100000× speedup!]

[Kohli & Torr, ICCV05 PAMI07] [Komodakis & Paragios, CVPR07]

Reparametrized Energy

Kohli & Torr (ICCV05, PAMI07)

Original energy:             E(a1,a2) = 2a1 + 5ā1 + 9a2 + 4ā2 + 2a1ā2 + ā1a2
Reparametrized energy:       E(a1,a2) = 8 + ā1 + 3a2 + 3ā1a2

New energy:                  E(a1,a2) = 2a1 + 5ā1 + 9a2 + 4ā2 + 7a1ā2 + ā1a2
New reparametrized energy:   E(a1,a2) = 8 + ā1 + 3a2 + 3ā1a2 + 5a1ā2
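Both reparametrizations can be verified by enumerating the four assignments; a small Python check (my own, not part of the slides):

```python
from itertools import product

# Each (energy, reparametrized energy) pair from the slides should be
# identical on every assignment of (a1, a2).
pairs = [
    (lambda a1, a2: 2*a1 + 5*(1-a1) + 9*a2 + 4*(1-a2) + 2*a1*(1-a2) + (1-a1)*a2,
     lambda a1, a2: 8 + (1-a1) + 3*a2 + 3*(1-a1)*a2),                 # original
    (lambda a1, a2: 2*a1 + 5*(1-a1) + 9*a2 + 4*(1-a2) + 7*a1*(1-a2) + (1-a1)*a2,
     lambda a1, a2: 8 + (1-a1) + 3*a2 + 3*(1-a1)*a2 + 5*a1*(1-a2)),   # new
]
for E, E_repar in pairs:
    assert all(E(a1, a2) == E_repar(a1, a2)
               for a1, a2 in product([0, 1], repeat=2))
print("both reparametrizations are exact")
```

Since the two energies differ in only one coefficient, most of the reparametrization (and hence most of the maxflow computation) carries over, which is the point of dynamic graph cuts.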

[Kohli & Torr, ICCV05 PAMI07] [Komodakis & Paragios, CVPR07]

Original problem (large) → fast partially optimal algorithm [Kovtun ‘03] [Kohli et al. ‘09] → reduced problem (part of the problem solved to global optimality) → (slow) approximation algorithm → approximate solution

[Alahari, Kohli & Torr, CVPR ‘08]

Example: Tree Reweighted Message Passing on the original problem: 9.89 sec. Fast partially optimal algorithm + TRW on the reduced problem: total time 0.30 sec.

[Figure: object segmentation into sky, building, airplane, grass]

3–100× speed up

Tree ReweightedMessage Passing

[ Alahari Kohli & Torr CVPR ‘08]

Fast partially optimal algorithm

[Kovtun ‘03] [ Kohli et al. ‘09]

Minimization with Complex Higher Order Functions

Connectivity

Counting Constraints

Hybrid algorithms

Connections between message passing algorithms (BP, TRW) and graph cuts


Space of Problems

CSP

MAXCUT

NP-Hard

Which functions are exactly solvable?

Approximate solutions of NP-hard problems

Scalability and Efficiency

Which functions are exactly solvable?
Boros Hammer [1965], Kolmogorov Zabih [ECCV 2002, PAMI 2004], Ishikawa [PAMI 2003], Schlesinger [EMMCVPR 2007], Kohli Kumar Torr [CVPR 2007, PAMI 2008], Ramalingam Kohli Alahari Torr [CVPR 2008], Kohli Ladicky Torr [CVPR 2008, IJCV 2009], Zivny Jeavons [CP 2008]

Approximate solutions of NP-hard problems
Schlesinger [76], Kleinberg and Tardos [FOCS 99], Chekuri et al. [01], Boykov et al. [PAMI 01], Wainwright et al. [NIPS 01], Werner [PAMI 2007], Komodakis et al. [PAMI 05, 07], Lempitsky et al. [ICCV 2007], Kumar et al. [NIPS 2007], Kumar et al. [ICML 2008], Sontag and Jaakkola [NIPS 2007], Kohli et al. [ICML 2008], Kohli et al. [CVPR 2008, IJCV 2009], Rother et al. [2009]

Scalability and Efficiency
Kohli Torr [ICCV 2005, PAMI 2007], Juan and Boykov [CVPR 2006], Alahari Kohli Torr [CVPR 2008], Delong and Boykov [CVPR 2008]

Iterated Conditional Modes (ICM)

Simulated Annealing

Dynamic Programming (DP)

Belief Propagation (BP)

Tree-Reweighted (TRW), Diffusion

Graph Cut (GC)

Branch & Bound

Relaxation methods:

Classical Move making algorithms

Combinatorial Algorithms

Message passing

Convex Optimization (Linear Programming, ...)