ICCV2009: MAP Inference in Discrete Models: Part 4


Pushmeet Kohli

ICCV 2009

Course programme

9.30-10.00 Introduction (Andrew Blake)

10.00-11.00 Discrete Models in Computer Vision (Carsten Rother)

15min Coffee break

11.15-12.30 Message Passing: DP, TRW, LP relaxation (Pawan Kumar)

12.30-13.00 Quadratic pseudo-boolean optimization (Pushmeet Kohli)

1 hour Lunch break

14:00-15.00 Transformation and move-making methods (Pushmeet Kohli)

15:00-15.30 Speed and Efficiency (Pushmeet Kohli)

15min Coffee break

15:45-16.15 Comparison of Methods (Carsten Rother)

16:30-17.30 Recent Advances: Dual-decomposition, higher-order, etc. (Carsten Rother + Pawan Kumar)

All material will be online after the conference: http://research.microsoft.com/en-us/um/cambridge/projects/tutorial/

E(x) = ∑i fi (xi) + ∑ij gij (xi,xj) + ∑c hc (xc)

Unary Pairwise Higher Order

Image Segmentation

E(x) = ∑i ci xi + ∑ij dij |xi − xj|

E: {0,1}n → R

n = number of pixels

Space of Problems

Submodular

Functions

CSP

Tree Structured

Pairwise: O(n³)

MAXCUT

O(n⁶)

n = Number of Variables

Segmentation Energy

NP-Hard

More General Minimization Problems

st-mincut and Pseudo-booleanoptimization

Speed and Efficiency

More General Minimization Problems

st-mincut and Pseudo-booleanoptimization

Speed and Efficiency

Example: n = 2, A = [1,0], B = [0,1]

f([1,0]) + f([0,1]) ≥ f([1,1]) + f([0,0])

Property : Sum of submodular functions is submodular

E(x) = ∑i ci xi + ∑ij dij |xi − xj|

Binary Image Segmentation Energy is submodular

A pseudo-boolean function f: {0,1}ⁿ → ℝ is submodular if

f(A) + f(B) ≥ f(A∨B) + f(A∧B)  for all A,B ∈ {0,1}ⁿ

(∨ = element-wise OR, ∧ = element-wise AND)
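The definition can be checked mechanically on small instances. Below is a brute-force Python sketch (my own illustration, not from the slides) that verifies the submodularity inequality for a tiny 3-variable segmentation energy; the coefficient values are made up:

```python
from itertools import product

# Small instance of the segmentation energy
# E(x) = sum_i c_i x_i + sum_ij d_ij |x_i - x_j| with made-up coefficients.
n = 3
c = [2.0, -1.0, 3.0]                    # unary coefficients
d = {(0, 1): 1.5, (1, 2): 2.0}          # pairwise coefficients, d_ij >= 0

def E(x):
    unary = sum(c[i] * x[i] for i in range(n))
    pairwise = sum(w * abs(x[i] - x[j]) for (i, j), w in d.items())
    return unary + pairwise

def is_submodular(f, n):
    # Check f(A) + f(B) >= f(A or B) + f(A and B) for all A, B in {0,1}^n
    for A in product([0, 1], repeat=n):
        for B in product([0, 1], repeat=n):
            union = tuple(a | b for a, b in zip(A, B))
            inter = tuple(a & b for a, b in zip(A, B))
            if f(A) + f(B) < f(union) + f(inter) - 1e-9:
                return False
    return True

print(is_submodular(E, n))  # True
```

With a negative pairwise coefficient the same check fails, which is why dij ≥ 0 matters.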

Discrete Analogues of Concave Functions[Lovasz, ’83]

Widely applied in Operations Research

Applications in Machine Learning MAP Inference in Markov Random Fields

Clustering [Narasimhan , Jojic, & Bilmes, NIPS 2005]

Structure Learning [Narasimhan & Bilmes, NIPS 2006]

Maximizing the spread of influence through a social network [Kempe, Kleinberg & Tardos, KDD 2003]

Polynomial time algorithms Ellipsoid Algorithm: [Grotschel, Lovasz & Schrijver ‘81]

First strongly polynomial algorithm: [Iwata et al. ’00] [A. Schrijver ’00]

Current Best: O(n⁵ Q + n⁶) [Q is function evaluation time] [Orlin ‘07]

Symmetric functions: E(x) = E(1−x) can be minimized in O(n³)

Minimizing pairwise submodular functions can be transformed to st-mincut/max-flow [Hammer, 1965]

Very low empirical running time ~ O(n)

E(x) = ∑i fi (xi) + ∑ij gij (xi,xj)

[Figure: example network with Source, Sink and nodes v1, v2; edge capacities 2, 5, 9, 4, 2, 1]

Graph (V, E, C)

Vertices V = {v1, v2, ..., vn}

Edges E = {(v1, v2), ...}

Costs C = {c(1, 2), ...}


What is a st-cut?


An st-cut (S,T) divides the nodes between source and sink.

What is the cost of a st-cut?

Sum of cost of all edges going from S to T

5 + 1 + 9 = 15


What is the st-mincut?

st-cut with the minimum cost

[Figure: the st-mincut of the example network]

st-mincut cost = 2 + 2 + 4 = 8

Construct a graph such that:

1. Any st-cut corresponds to an assignment of x

2. The cost of the cut is equal to the energy of x : E(x)

[Figure: the st-mincut (S, T) of the constructed graph gives the minimizer of E(x)]

[Hammer, 1965] [Kolmogorov and Zabih, 2002]

E(x) = ∑i θi (xi) + ∑ij θij (xi,xj)

θij(0,1) + θij(1,0) ≥ θij(0,0) + θij(1,1)  for all ij

E(x) = ∑i ci xi + ∑ij cij xi(1−xj),  cij ≥ 0

Equivalent (transformable)

Source (0), Sink (1)

Build the graph for E term by term; each coefficient becomes the capacity of one edge:

E(a1,a2) = 2a1                                            [edge of capacity 2]

E(a1,a2) = 2a1 + 5ā1                                      [edge of capacity 5]

E(a1,a2) = 2a1 + 5ā1 + 9a2 + 4ā2                          [edges of capacity 9 and 4]

E(a1,a2) = 2a1 + 5ā1 + 9a2 + 4ā2 + 2a1ā2                  [pairwise edge of capacity 2]

E(a1,a2) = 2a1 + 5ā1 + 9a2 + 4ā2 + 2a1ā2 + ā1a2           [pairwise edge of capacity 1]

Example cut: a1 = 1, a2 = 1

E(1,1) = 11 = cost of this cut

The st-mincut: a1 = 1, a2 = 0

E(1,0) = 8 = st-mincut cost

[Figure: the same network with capacities 2, 5, 9, 4, 2, 1]

Solve the dual maximum flow problem:

Compute the maximum flow between Source and Sink such that

Edges: Flow ≤ Capacity
Nodes: Flow in = Flow out

(assuming non-negative capacities)

Min-cut/Max-flow Theorem: in every network, the maximum flow equals the cost of the st-mincut.

Augmenting Path Based Algorithms

1. Find a path from source to sink with positive residual capacity

2. Push the maximum possible flow through this path

3. Repeat until no such path can be found

Run on the example network:

Flow = 0 → push 2 along a direct source–v1–sink path → Flow = 2

Flow = 2 → push 4 along a direct source–v2–sink path → Flow = 6

Flow = 6 → push 2 along a path through the edge between v2 and v1 → Flow = 8

Flow = 8 → no augmenting path remains: the maximum flow (= st-mincut cost) is 8
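The three steps above can be sketched directly. Here is a minimal BFS-based augmenting-path max-flow in Python (Edmonds-Karp style), run on the example network; the node names and dict-of-dicts graph representation are my own choices, not from the tutorial code:

```python
from collections import deque

def max_flow(cap, s, t):
    # cap: dict-of-dicts of residual capacities, modified in place
    flow = 0
    while True:
        # 1. BFS for an augmenting path with positive residual capacity
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:
            u = queue.popleft()
            for v, c in cap[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            return flow  # 3. no augmenting path remains
        # 2. find the bottleneck, then push that much flow along the path
        path, v = [], t
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        push = min(cap[u][v] for u, v in path)
        for u, v in path:
            cap[u][v] -= push
            cap[v].setdefault(u, 0)
            cap[v][u] += push   # reverse edge keeps the residual graph valid
        flow += push

# Capacities follow the example energy E(a1,a2) = 2a1 + 5ā1 + 9a2 + 4ā2 + 2a1ā2 + ā1a2
cap = {
    's':  {'v1': 2, 'v2': 9},
    'v1': {'t': 5, 'v2': 1},
    'v2': {'t': 4, 'v1': 2},
    't':  {},
}
print(max_flow(cap, 's', 't'))  # 8, the st-mincut cost from the slides
```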

Maxflow can be viewed as reparametrizing the energy. Start again from

E(a1,a2) = 2a1 + 5ā1 + 9a2 + 4ā2 + 2a1ā2 + ā1a2     (Source = 0, Sink = 1)

2a1 + 5ā1 = 2(a1 + ā1) + 3ā1 = 2 + 3ā1

E(a1,a2) = 2 + 3ā1 + 9a2 + 4ā2 + 2a1ā2 + ā1a2

9a2 + 4ā2 = 4(a2 + ā2) + 5a2 = 4 + 5a2

E(a1,a2) = 2 + 3ā1 + 5a2 + 4 + 2a1ā2 + ā1a2

E(a1,a2) = 6 + 3ā1 + 5a2 + 2a1ā2 + ā1a2

3ā1 + 5a2 + 2a1ā2 = 2(ā1 + a2 + a1ā2) + ā1 + 3a2 = 2(1 + ā1a2) + ā1 + 3a2

since F1 = ā1 + a2 + a1ā2 and F2 = 1 + ā1a2 agree on every assignment:

a1 a2 | F1 F2
 0  0 |  1  1
 0  1 |  2  2
 1  0 |  1  1
 1  1 |  1  1

E(a1,a2) = 8 + ā1 + 3a2 + 3ā1a2
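The algebra above is easy to verify by enumerating all four assignments; a small Python check (my own, not part of the slides):

```python
from itertools import product

# The original energy and the reparametrized energy obtained after maxflow.
def E_orig(a1, a2):
    return 2*a1 + 5*(1-a1) + 9*a2 + 4*(1-a2) + 2*a1*(1-a2) + (1-a1)*a2

def E_repar(a1, a2):
    return 8 + (1-a1) + 3*a2 + 3*(1-a1)*a2

# They agree on every assignment, and the constant 8 (= total flow) is the minimum.
for a1, a2 in product([0, 1], repeat=2):
    assert E_orig(a1, a2) == E_repar(a1, a2)

print(min(E_orig(a1, a2) for a1, a2 in product([0, 1], repeat=2)))  # 8
```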

E(a1,a2) = 8 + ā1 + 3a2 + 3ā1a2

No more augmenting paths are possible.

Total flow = 8 is a lower bound on the optimal solution; the residual graph (all remaining coefficients positive) carries the rest of the energy.

Tight bound → inference of the optimal solution becomes trivial.

Reading the labels off the residual graph: a1 = 1, a2 = 0

E(1,0) = 8 = st-mincut cost = total flow

[Slide credit: Andrew Goldberg]

Worst-case complexities of Augmenting Path and Push-Relabel algorithms (n: #nodes, m: #edges, U: maximum edge weight; the algorithms assume non-negative edge weights).

Ford Fulkerson: choose any augmenting path.

[Figure: network with two direct Source–Sink paths of capacity 1000 through a1 and a2, joined by an edge of capacity 1]

Good augmenting paths: the two direct capacity-1000 paths.

Bad augmenting path: a path through the capacity-1 edge. It pushes only 1 unit of flow (leaving residual capacities 999, 1000, 1000, 999, 1), and the next path through the reversed middle edge again pushes only 1 unit.

n: #nodes, m: #edges

Choosing paths badly, we will have to perform 2000 augmentations!

Worst case complexity: O(m × total_flow)  (pseudo-polynomial bound: depends on the flow value)

Dinic: choose the shortest augmenting path.

Worst case complexity: O(m n²)

Specialized algorithms for vision problems: grid graphs with low connectivity (m ~ O(n)).

Dual search tree augmenting path algorithm [Boykov and Kolmogorov, PAMI 2004]

• Finds approximate shortest augmenting paths efficiently

• High worst-case time complexity

• Empirically outperforms other algorithms on vision problems

Efficient code available on the web: http://www.adastral.ucl.ac.uk/~vladkolm/software.html

E(x) = ∑i ci xi + ∑ij dij |xi − xj|

x* = arg minx E(x)

How to minimize E(x)?

E: {0,1}ⁿ → ℝ,  0 → fg, 1 → bg

n = number of pixels

[Figure: segmentation graph with Source (0), Sink (1), pixel nodes a1, a2, terminal edges of weight fgCost/bgCost and a pairwise edge of weight cost(p,q)]

Graph *g;

for all pixels p
    /* Add a node to the graph */
    nodeID(p) = g->add_node();
    /* Set cost of terminal edges */
    set_weights(nodeID(p), fgCost(p), bgCost(p));
end

for all adjacent pixels p,q
    add_weights(nodeID(p), nodeID(q), cost(p,q));
end

g->compute_maxflow();

label_p = g->is_connected_to_source(nodeID(p));  // the label of pixel p (0 or 1)

Resulting labelling: a1 = bg, a2 = fg
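As a runnable counterpart to the library-style pseudocode above, here is a self-contained Python sketch of the same recipe on a hypothetical 1-D "image" of 4 pixels. It uses a plain BFS augmenting-path maxflow in place of the graph-cut library, and all costs are made-up illustrative values:

```python
from collections import deque

def segment(fg_cost, bg_cost, lam):
    """Binary segmentation of a 1-D image via st-mincut (label 0 = fg, 1 = bg)."""
    n = len(fg_cost)
    s, t = n, n + 1
    cap = {u: {} for u in range(n + 2)}
    for i in range(n):               # terminal edges (unary costs)
        cap[s][i] = bg_cost[i]       # cut when pixel i takes label 1 (bg)
        cap[i][t] = fg_cost[i]       # cut when pixel i takes label 0 (fg)
    for i in range(n - 1):           # Potts smoothness between neighbours
        cap[i][i + 1] = lam
        cap[i + 1][i] = lam

    def find_path():                 # BFS for an augmenting path
        parent = {s: None}
        q = deque([s])
        while q:
            u = q.popleft()
            for v, c in cap[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    if v == t:
                        return parent
                    q.append(v)
        return None

    flow = 0
    parent = find_path()
    while parent is not None:        # push flow until no path remains
        path, v = [], t
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        push = min(cap[u][v] for u, v in path)
        for u, v in path:
            cap[u][v] -= push
            cap[v][u] = cap[v].get(u, 0) + push
        flow += push
        parent = find_path()

    # label = 0 (fg) if the pixel is still connected to the source
    seen, q = {s}, deque([s])
    while q:
        u = q.popleft()
        for v, c in cap[u].items():
            if c > 0 and v not in seen:
                seen.add(v)
                q.append(v)
    return [0 if i in seen else 1 for i in range(n)], flow

labels, energy = segment([1, 1, 8, 8], [8, 8, 1, 1], 2)
print(labels, energy)  # [0, 0, 1, 1] 6
```

The returned flow equals the minimum of the segmentation energy, mirroring the equivalence established earlier in the tutorial.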

Lunch

MIT Press, summer 2010

Topics of this course and much, much more

Contributors: usual suspects – lecturers on this course + Boykov,

Kolmogorov, Weiss, Freeman, ....

one for the office and one for home

www.research.microsoft.com/vision/MRFbook

Advances in Markov Random Fields for Computer Vision

More General Minimization Problems

st-mincut and Pseudo-booleanoptimization

Speed and Efficiency

Non-submodular Energy Functions

Mixed (Real-Integer) Problems

Higher Order Energy Functions

Multi-label Problems

Ordered Labels ▪ Stereo (depth labels)

Unordered Labels ▪ Object segmentation (‘car’, ‘road’, ‘person’)


Minimizing general non-submodular functions is NP-hard; a commonly used approach is to solve a relaxation of the problem.

E(x) = ∑i θi (xi) + ∑ij θij (xi,xj)

θij(0,1) + θij(1,0) < θij(0,0) + θij(1,1)  for some ij

[Boros and Hammer, ‘02]

The energy is split into unary terms, pairwise submodular terms θpq with

θpq(0,1) + θpq(1,0) ≥ θpq(0,0) + θpq(1,1)

and pairwise non-submodular terms θ̃pq with

θ̃pq(0,1) + θ̃pq(1,0) < θ̃pq(0,0) + θ̃pq(1,1)

[Boros and Hammer, ‘02]

Double the number of variables: xp → (xp, x̄p), with x̄p = 1 − xp  [Boros and Hammer, ‘02]

Ignore the constraint and solve  [Boros and Hammer, ‘02]

Local Optimality  [Boros and Hammer, ‘02]

[Rother, Kolmogorov, Lempitsky, Szummer] [CVPR 2007]

QPBO leaves some variables unlabelled (?), e.g. on nodes p, q, r, s, t.

Probe node p: fix p = 0 and p = 1 in turn and re-run QPBO.

What can we say about the variables?

• r → is always 0
• s → is always equal to q
• t → is 0 when q = 1

Probe nodes in an order until energy unchanged

Simplified energy preserves global optimality and (sometimes) gives the global minimum

Result depends slightly on the order

• Property: E(y’) ≤ E(y) [autarky property]

x (partial, from QPBO)    y (e.g. from BP)    y’ = FUSE(x,y)

[Figure: grids showing a partial labelling x, a complete labelling y, and their fusion y’, which keeps x where x is labelled and takes y elsewhere]

Non-submodular Energy Functions

Mixed (Real-Integer) Problems

Higher Order Energy Functions

Multi-label Problems

Ordered Labels ▪ Stereo (image intensity, depth)

Unordered Labels ▪ Object segmentation (‘car’, ‘road’, ‘person’)

[Kumar et al, 05] [Kohli et al, 06,08]

Image

colour appearance based Segmentation

Need for a human-like segmentation

Segmentation Result

x – binary image segmentation (xi ∊ {0,1})

ω – non-local parameter (lives in some large set Ω)

E(x,ω) = C(ω) + ∑i θi (ω, xi) + ∑ij θij (ω, xi, xj)
          constant   unary potentials   pairwise potentials (≥ 0)

Rough Shape Prior: Stickman Model

ω: Pose

θi (ω, xi): Shape Prior

[Kohli et al, 06,08]

x – binary image segmentation (xi ∊ {0,1})

ω – non-local parameter (lives in some large set Ω)

E(x,ω) = C(ω) + ∑i θi (ω, xi) + ∑ij θij (ω, xi, xj)
          constant   unary potentials   pairwise potentials (≥ 0)

ω: Template Position, Scale, Orientation

[Kohli et al, 06,08] [Lempitsky et al, 08]

x – binary image segmentation (xi ∊ {0,1})

ω – non-local parameter (lives in some large set Ω)

E(x,ω) = C(ω) + ∑i θi (ω, xi) + ∑ij θij (ω, xi, xj)
          constant   unary potentials   pairwise potentials (≥ 0)

{x*, ω*} = arg minx,ω E(x,ω)

• Standard “graph cut” energy if ω is fixed

[Kohli et al, 06,08] [Lempitsky et al, 08]

Local Method: Gradient Descent over ω

ω* = arg minω minx E(x,ω)

(the inner minimization over x is submodular)

[Kohli et al, 06,08]


Dynamic Graph Cuts

15–20× speedup!

E(x,ω1) and E(x,ω2) are similar energy functions

[Kohli et al, 06,08]

[Kohli et al, 06,08]

Global Method: Branch and Mincut

[Lempitsky et al, 08]

Produces the global optimal solution

Exhaustively explores Ω in the worst case

[Figure: hierarchical clustering of Ω into nested subsets Ω0, ...]

Ω (the space of ω) is hierarchically clustered.

Standard best-first branch-and-bound search (always expand the node with the lowest lower bound): only a small fraction of nodes is visited.

30,000,000 shapes

Exhaustive search: 30,000,000 mincuts

Branch-and-Mincut: 12,000 mincuts

Speed-up: 2500× (30 seconds per 312×272 image)

[Lempitsky et al, 08]

Left ventricle epicardium tracking (work in progress)

Branch & Bound segmentation

Shape prior from other sequences

5,200,000 templates

≈20 seconds per frame

Speed-up: 1150×

Data courtesy: Dr Harald Becher, Department of Cardiovascular Medicine, University of Oxford

Original sequence No shape prior

[Lempitsky et al, 08]

Non-submodular Energy Functions

Mixed (Real-Integer) Problems

Higher Order Energy Functions

Multi-label Problems

Ordered Labels ▪ Stereo (depth labels)

Unordered Labels ▪ Object segmentation (‘car’, ‘road’, ‘person’)

Pairwise functions have limited expressive power

Inability to incorporate region based likelihoods and priors

Field of Experts Model [Roth & Black, CVPR 2005] [Potetz, CVPR 2007]

Minimize Curvature [Woodford et al., CVPR 2008]

Other examples: [Rother, Kolmogorov, Minka & Blake, CVPR 2006] [Komodakis and Paragios, CVPR 2009] [Rother, Kohli, Feng, Jia, CVPR 2009] [Ishikawa, CVPR 2009] and many others ...

E(x) = ∑i ci xi + ∑ij dij |xi − xj|

E: {0,1}ⁿ → ℝ,  0 → fg, 1 → bg

n = number of pixels

[Boykov and Jolly ‘ 01] [Blake et al. ‘04] [Rother, Kolmogorov and Blake `04]

Image Unary Cost Segmentation

Patch Dictionary (Tree)

h(Xp) = { C1    if xi = 0 for all i ∈ p
        { Cmax  otherwise

[Kohli et al. ‘07]

E(x) = ∑i ci xi + ∑ij dij |xi − xj| + ∑p hp (Xp)

h(Xp) = { C1    if xi = 0 for all i ∈ p
        { Cmax  otherwise

E: {0,1}ⁿ → ℝ,  0 → fg, 1 → bg

n = number of pixels

[Kohli et al. ‘07]

E(x) = ∑i ci xi + ∑ij dij |xi − xj| + ∑p hp (Xp)

Image → Pairwise Segmentation → Final Segmentation

E: {0,1}ⁿ → ℝ,  0 → fg, 1 → bg

n = number of pixels

[Kohli et al. ‘07]

Higher Order Submodular Functions → Pairwise Submodular Function → st-mincut (S, T)

Billionnet and Minoux [DAM 1985], Kolmogorov & Zabih [PAMI 2004], Freedman & Drineas [CVPR 2005], Kohli Kumar Torr [CVPR 2007, PAMI 2008], Kohli Ladicky Torr [CVPR 2008, IJCV 2009], Ramalingam Kohli Alahari Torr [CVPR 2008], Zivny et al. [CP 2008]

Exact transformation: which higher order functions can be converted to pairwise submodular functions?

Identified transformable families of higher order functions s.t.

1. A constant or polynomial number of auxiliary variables (a) is added

2. All resulting pairwise functions (g) are submodular

Example: H(X) = F(∑i xi) with F concave  [Kohli et al. ‘08]

[Figure: a concave function H(X) plotted against ∑ xi]

Simple example using auxiliary variables:

f(x) = { 0   if all xi = 0
       { C1  otherwise

minx f(x) = minx,a C1·a + C1·∑i ā·xi ,   x ∈ {0,1}ⁿ, a ∈ {0,1}

(higher order submodular function → quadratic submodular function)

∑xi < 1  →  a = 0 (ā = 1)  →  f(x) = 0
∑xi ≥ 1  →  a = 1 (ā = 0)  →  f(x) = C1
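The case analysis above can be confirmed by brute force; a small Python check (my own illustration; C1 and n are arbitrary test values, not from the slides):

```python
from itertools import product

# f(x) = 0 if all x_i = 0, else C1, should equal the minimum over the
# auxiliary variable a in {0,1} of C1*a + C1*(1-a)*sum_i x_i.
C1, n = 7, 4

def f(x):
    return 0 if sum(x) == 0 else C1

def transformed(x):
    return min(C1 * a + C1 * (1 - a) * sum(x) for a in (0, 1))

assert all(f(x) == transformed(x) for x in product([0, 1], repeat=n))
print("transformation exact for all", 2 ** n, "assignments")
```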

minx f(x) = minx,a C1·a + C1·∑i ā·xi

[Figure: C1·a (the a = 1 branch) and C1·∑xi (the a = 0 branch) plotted against ∑xi = 1, 2, 3, ...; the lower envelope of concave functions is concave]

More generally: minx f(x) = minx,a f1(x)·a + f2(x)·ā

(higher order submodular function → quadratic submodular function)

[Figure: f1(x) (a = 1) and f2(x) (a = 0) plotted against ∑xi; the lower envelope of concave functions is concave]

Transforming Potentials with 3 variables [Woodford, Fitzgibbon, Reid, Torr, CVPR 2008]

Transforming general “sparse” higher order functions [Rother, Kohli, Feng, Jia, CVPR 2009][Ishikawa, CVPR 2009][Komodakis and Paragios, CVPR 2009]

Test Image Test Image(60% Noise)

TrainingImage

Result

PairwiseEnergy

P(x)

Minimized usingst-mincut or max-product

message passing


Higher Order Structure not Preserved

Minimize: E(x) = P(x) + ∑c hc (Xc)

where hc: {0,1}^|c| → ℝ is a higher order function over a patch c (|c| = 10×10 = 100) that assigns a cost to each of the 2^100 possible labellings!

Exploit the structure of the function to transform it to a pairwise function.

[Figure: learned patterns p1, p2, p3]

Test Image Test Image(60% Noise)

TrainingImage

PairwiseResult

Higher-Order Result

Learned Patterns

[Joint work with Carsten Rother ]

Non-submodular Energy Functions

Mixed (Real-Integer) Problems

Higher Order Energy Functions

Multi-label Problems

Ordered Labels ▪ Stereo (depth labels)

Unordered Labels ▪ Object segmentation (‘car’, ‘road’, ‘person’)

Exact Transformation to QPBF

Move making algorithms

miny E(y) = ∑i fi (yi) + ∑ij gij (yi,yj)

y ∈ Labels L = {l1, l2, … , lk}

[Roy and Cox ’98] [Ishikawa ’03] [Schlesinger & Flach ’06]

[Ramalingam, Alahari, Kohli, and Torr ’08]

So what is the problem?

Em (y1, y2, ..., yn)  →  Eb (x1, x2, ..., xm)

Multi-label problem → binary label problem

yi ∈ L = {l1, l2, … , lk}   xi ∈ B = {0,1}

such that, with Y and X the sets of feasible solutions:

1. There is a one-to-one encoding function T: X → Y

2. arg min Em(y) = T(arg min Eb(x))

• Popular encoding scheme [Roy and Cox ’98, Ishikawa ’03, Schlesinger & Flach ’06]

# Nodes = n × k

# Pairwise edges = m × k²


Ishikawa’s result:

E(y) = ∑i θi (yi) + ∑ij θij (yi,yj),   y ∈ Labels L = {l1, l2, … , lk}

θij (yi,yj) = g(|yi − yj|) with g a convex function

[Figure: convex g(|yi − yj|) plotted against |yi − yj|]


Schlesinger & Flach ’06:

E(y) = ∑i θi (yi) + ∑ij θij (yi,yj),   y ∈ Labels L = {l1, l2, … , lk}

θij(li+1, lj) + θij(li, lj+1) ≥ θij(li, lj) + θij(li+1, lj+1)
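The condition is easy to test numerically. The sketch below (my own illustration, not from the slides) checks it for the convex cost g(d) = d² on a small ordered label set, and shows that a truncated quadratic fails it:

```python
# Check the multi-label submodularity condition
# theta(li+1, lj) + theta(li, lj+1) >= theta(li, lj) + theta(li+1, lj+1)
# on labels 0..k-1 (stand-ins for l1..lk).
k = 5

def holds(theta):
    return all(
        theta(li + 1, lj) + theta(li, lj + 1)
        >= theta(li, lj) + theta(li + 1, lj + 1)
        for li in range(k - 1) for lj in range(k - 1)
    )

convex_ok = holds(lambda yi, yj: (yi - yj) ** 2)         # convex cost: passes
trunc_ok = holds(lambda yi, yj: min((yi - yj) ** 2, 4))  # truncated cost: fails
print(convex_ok, trunc_ok)  # True False
```

This matches the discussion later in the section: truncated (robust) costs fall outside the exactly solvable family.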

Image MAP Solution

Scanlinealgorithm

[Roy and Cox, 98]

Applicability: cannot handle truncated costs (non-robust)

Computational cost: very high, since the problem size = |Variables| × |Labels|

Gray level image denoising (1 Mpixel image): ~2.5 × 10⁸ graph nodes

θij (yi,yj) = g(|yi − yj|)

[Figure: discontinuity preserving (truncated) potential plotted against |yi − yj|; Blake & Zisserman ’83, ’87]

Transformation              Unary Potentials   Pairwise Potentials     Complexity
Ishikawa [03]               Arbitrary          Convex and Symmetric    T(nk, mk²)
Schlesinger [06]            Arbitrary          Submodular              T(nk, mk²)
Hochbaum [01]               Linear             Convex and Symmetric    T(n, m) + n log k
Hochbaum [01]               Convex             Convex and Symmetric    O(mn log n log nk)

Other “less known” algorithms exist.  T(a,b) = complexity of maxflow with a nodes and b edges.

Exact Transformation to QPBF

Move making algorithms

miny E(y) = ∑i fi (yi) + ∑ij gij (yi,yj)

y ∈ Labels L = {l1, l2, … , lk}

[Boykov , Veksler and Zabih 2001] [Woodford, Fitzgibbon, Reid, Torr, 2008]

[Lempitsky, Rother, Blake, 2008] [Veksler, 2008] [Kohli, Ladicky, Torr 2008]

[Figure: energy plotted over the solution space, showing the current solution, its search neighbourhood, and the optimal move]

Key property of the move space (t): a bigger move space gives

• Better solutions

• But finding the optimal move becomes harder

Minimizing Pairwise Functions [Boykov, Veksler and Zabih, PAMI 2001]

• Series of locally optimal moves

• Each move reduces energy

• Optimal move by minimizing submodular function

Space of solutions (x): Lⁿ    Move space (t): 2ⁿ

n = number of variables, L = number of labels

Kohli et al. ‘07, ‘08, ‘09 extend this to minimize higher order functions

Minimize over move variables t

x = t x1 + (1-t) x2

New solution

Current Solution

Second solution

Em(t) = E(t x1 + (1−t) x2)

For certain x1 and x2, the move energy is a submodular QPBF.

[Boykov , Veksler and Zabih 2001]

• Variables labeled α, β can swap their labels

[Boykov , Veksler and Zabih 2001]

Sky

House

Tree

Ground. Swap Sky, House

• Variables labeled α, β can swap their labels

[Boykov , Veksler and Zabih 2001]

Move energy is submodular if:

Unary Potentials: Arbitrary

Pairwise potentials: Semi-metric

θij (la,lb) ≥ 0,   θij (la,lb) = 0 ⇔ a = b

Examples: Potts model, Truncated Convex

[Boykov , Veksler and Zabih 2001]

• Variables labeled α, β can swap their labels

[Boykov, Veksler, Zabih]

• Variables take label a or retain current label

[Boykov , Veksler and Zabih 2001]

Sky

House

Tree

Ground

Initialize with Tree. Status: Expand Ground → Expand House → Expand Sky

[Boykov, Veksler and Zabih 2001]

• Variables take label a or retain current label

Move energy is submodular if:

Unary Potentials: Arbitrary

Pairwise potentials: Metric

[Boykov, Veksler, Zabih]

Metric = semi-metric + triangle inequality:

θij (la,lb) + θij (lb,lc) ≥ θij (la,lc)

Examples: Potts model, Truncated linear

Cannot solve truncated quadratic

• Variables take label a or retain current label

[Boykov , Veksler and Zabih 2001]

Expansion and Swap can be derived as a primal dual scheme

Get solution of the dual problem which is a lower bound on the energy of solution

Weak guarantee on the solution

[Komodakis et al 05, 07]

E(x) ≤ 2 (dmax / dmin) E(x*)

where dmax and dmin are the largest and smallest values of the pairwise distance θij (li,lj) = g(|li − lj|).

Move Type   First Solution   Second Solution   Guarantee
Expansion   Old solution     All alpha         Metric
Fusion      Any solution     Any solution      (none)

Minimize over move variables t

x = t x1 + (1-t) x2

New solution

First solution

Second solution

Move functions can be non-submodular!!

x = t x1 + (1−t) x2, where x1 and x2 can be continuous proposals

[Figure: fusing two solutions x1 and x2 into x]

Optical Flow Example

Final Solution

Solution from

Method 1

Solution from

Method 2

[Woodford, Fitzgibbon, Reid, Torr, 2008] [Lempitsky, Rother, Blake, 2008]

Move variables can be multi-label

Optimal move found out by using the Ishikawa Transform

Useful for minimizing energies with truncated convex pairwise potentials

θij (yi,yj) = min(|yi − yj|², T)

[Figure: truncated quadratic potential, truncated at T, plotted against |yi − yj|]

x = (t==1) x1 + (t==2) x2 + … + (t==k) xk

[Veksler, 2007]

[Veksler, 2008]

Image / Noisy Image

Range Moves

Expansion Move

Why?

3,600,000,000 pixels, created from about 800 8-megapixel images

[Kopf et al. (MSR Redmond) SIGGRAPH 2007 ]


Processing videos: a 1 minute video at 1M pixel resolution = 3.6B pixels

3D reconstruction: 500 × 500 × 500 = 0.125B voxels

Kohli & Torr (ICCV05, PAMI07)

Can we do better?

Segment the first frame, then segment the second frame by reusing the computation.  [Kohli & Torr, ICCV05, PAMI07]

Image

Flow Segmentation

[Figure: dynamic graph cuts. Minimize EA (frame 1) to obtain solution SA; reparametrize EB (frame 2) into a simpler energy EB* whose cost depends on the differences between A and B; minimizing EB* gives SB. Reusing computation yields a 3–100000× speedup!]

[Kohli & Torr, ICCV05 PAMI07] [Komodakis & Paragios, CVPR07]

Reparametrized Energy

Kohli & Torr (ICCV05, PAMI07)

Original energy:             E(a1,a2) = 2a1 + 5ā1 + 9a2 + 4ā2 + 2a1ā2 + ā1a2
Reparametrized energy:       E(a1,a2) = 8 + ā1 + 3a2 + 3ā1a2

New energy:                  E(a1,a2) = 2a1 + 5ā1 + 9a2 + 4ā2 + 7a1ā2 + ā1a2
New reparametrized energy:   E(a1,a2) = 8 + ā1 + 3a2 + 3ā1a2 + 5a1ā2
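Both reparametrizations can be verified by enumerating the four assignments; a small Python check (my own, not part of the slides):

```python
from itertools import product

# Each (energy, reparametrized energy) pair from the slides should be
# identical on every assignment of (a1, a2).
pairs = [
    (lambda a1, a2: 2*a1 + 5*(1-a1) + 9*a2 + 4*(1-a2) + 2*a1*(1-a2) + (1-a1)*a2,
     lambda a1, a2: 8 + (1-a1) + 3*a2 + 3*(1-a1)*a2),                 # original
    (lambda a1, a2: 2*a1 + 5*(1-a1) + 9*a2 + 4*(1-a2) + 7*a1*(1-a2) + (1-a1)*a2,
     lambda a1, a2: 8 + (1-a1) + 3*a2 + 3*(1-a1)*a2 + 5*a1*(1-a2)),   # new
]
for E, E_repar in pairs:
    assert all(E(a1, a2) == E_repar(a1, a2)
               for a1, a2 in product([0, 1], repeat=2))
print("both reparametrizations are exact")
```

Since the two energies differ in only one coefficient, most of the reparametrization (and hence most of the maxflow computation) carries over, which is the point of dynamic graph cuts.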

[Kohli & Torr, ICCV05 PAMI07] [Komodakis & Paragios, CVPR07]

Original problem (large) → fast partially optimal algorithm [Kovtun ‘03] [Kohli et al. ‘09] → reduced problem (part of the problem solved to global optimality) → (slow) approximation algorithm → approximate solution

[Alahari, Kohli & Torr, CVPR ‘08]

Example: Tree Reweighted Message Passing on the original problem: 9.89 sec. Fast partially optimal algorithm + TRW on the reduced problem: total time 0.30 sec.

[Figure: object segmentation into sky, building, airplane, grass]

3–100× speed up

Tree ReweightedMessage Passing

[ Alahari Kohli & Torr CVPR ‘08]

Fast partially optimal algorithm

[Kovtun ‘03] [ Kohli et al. ‘09]

Minimization with Complex Higher Order Functions

Connectivity

Counting Constraints

Hybrid algorithms

Connections between message passing algorithms (BP, TRW) and graph cuts


Space of Problems

CSP

MAXCUT

NP-Hard

Which functions are exactly solvable?

Approximate solutions of NP-hard problems

Scalability and Efficiency

Which functions are exactly solvable?
Boros Hammer [1965], Kolmogorov Zabih [ECCV 2002, PAMI 2004], Ishikawa [PAMI 2003], Schlesinger [EMMCVPR 2007], Kohli Kumar Torr [CVPR 2007, PAMI 2008], Ramalingam Kohli Alahari Torr [CVPR 2008], Kohli Ladicky Torr [CVPR 2008, IJCV 2009], Zivny Jeavons [CP 2008]

Approximate solutions of NP-hard problems
Schlesinger [76], Kleinberg and Tardos [FOCS 99], Chekuri et al. [01], Boykov et al. [PAMI 01], Wainwright et al. [NIPS 01], Werner [PAMI 2007], Komodakis et al. [PAMI 05, 07], Lempitsky et al. [ICCV 2007], Kumar et al. [NIPS 2007], Kumar et al. [ICML 2008], Sontag and Jaakkola [NIPS 2007], Kohli et al. [ICML 2008], Kohli et al. [CVPR 2008, IJCV 2009], Rother et al. [2009]

Scalability and Efficiency
Kohli Torr [ICCV 2005, PAMI 2007], Juan and Boykov [CVPR 2006], Alahari Kohli Torr [CVPR 2008], Delong and Boykov [CVPR 2008]

Iterated Conditional Modes (ICM)

Simulated Annealing

Dynamic Programming (DP)

Belief Propagation (BP)

Tree-Reweighted (TRW), Diffusion

Graph Cut (GC)

Branch & Bound

Relaxation methods:

Classical Move making algorithms

Combinatorial Algorithms

Message passing

Convex Optimization (Linear Programming, ...)