Joint Probabilistic Matching Using m-Best Solutions
S. Hamid Rezatofighi, Anton Milan, Zhen Zhang, Qinfeng Shi, Anthony Dick, Ian Reid
Introduction
One-to-One Graph Matching in Computer Vision
• Action Recognition
• Feature Point Matching
• Multi-Target Tracking
• Person Re-Identification
Most existing works focus on
• Feature and/or metric learning [Zhao et al., CVPR 2014; Liu et al., ECCV 2010]
• Developing better solvers [Cho et al., ECCV 2010; Zhou & De la Torre, CVPR 2013]
The optimal solution does not necessarily yield the correct matching assignment.
To improve the matching results, we propose
• to consider more feasible solutions
• a principled approach to combine these solutions
One-to-One Graph Matching
Formulating it as a constrained binary program

Each assignment indicator is binary, x_i^j ∈ {0,1}, and the full assignment vector is

X = (x_1^0, x_1^1, …, x_i^j, …, x_M^N)^T ∈ 𝔹^{M×(N+1)}

The matching problem is

X* = argmin_{X ∈ 𝒳} f(X), or equivalently X* = argmax_{X ∈ 𝒳} p(X)

where

𝒳 = { X = (x_i^j)_{∀i,j} | x_i^j ∈ {0,1}, ∀j: ∑_i x_i^j ≤ 1, ∀i: ∑_j x_i^j = 1 }

The one-to-one constraints can be written compactly as a linear system AX ≤ B.
One-to-One Graph Matching
Examples of the joint matching distribution p(X) and cost f(X) in different applications
• Multi-target tracking [Zheng et al., CVPR 2008] and person re-identification [Das et al., ECCV 2014]: f(X) = C^T X, or equivalently p(X) ∝ ∏ p(x_i^j)^{x_i^j}
• Feature point matching [Leordeanu et al., IJCV 2011]: f(X) = X^T Q X
• Stereo matching [Meltzer et al., ICCV 2005] and iterative closest point [Zhang, IJCV 1994]: higher-order constraints in addition to one-to-one constraints
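For the linear cost f(X) = C^T X, finding the single best one-to-one assignment is the classical linear assignment problem. A minimal sketch with SciPy (the cost values are made up; the "None of them" option x_i^0 from the formulation is omitted here for brevity):

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

# Toy cost matrix C: rows = M query items, columns = N gallery items.
# Entry C[i, j] is the (made-up) cost of matching item i to item j.
C = np.array([[4.0, 1.0, 3.0],
              [2.0, 0.0, 5.0],
              [3.0, 2.0, 2.0]])

# linear_sum_assignment solves argmin_X C^T X subject to the
# one-to-one constraints (each row matched to exactly one column).
rows, cols = linear_sum_assignment(C)
total_cost = C[rows, cols].sum()

print(list(zip(rows, cols)), total_cost)
```

Padding C with one extra column per query would model the x_i^0 "no match" option of the formulation above.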
Marginalization vs. MAP Estimates
In general, the globally optimal solution

X* = argmin_{X ∈ 𝒳} f(X)   or   X* = argmax_{X ∈ 𝒳} p(X)

may or may not be easy to obtain. Moreover, even the optimal solution does not necessarily yield the correct matching assignment, due to
• visual similarity
• other ambiguities in the matching space

Motivation to use marginalization
• Encoding the entire distribution untangles potential ambiguities; MAP considers only a single value of that distribution
• Marginalization improves the matching ranking thanks to its averaging / smoothing property

Exact marginalization is NP-hard
• It requires all feasible permutations to build the joint distribution

Solution
• Approximation using the m-best solutions
Marginalization Using m-Best Solutions
Marginalize by considering only a fraction of the matching space: the m highest joint probabilities p(X), or equivalently the m lowest costs f(X).

X_k* denotes the k-th optimal solution of X* = argmin_{X ∈ 𝒳} f(X) (equivalently, of X* = argmax_{X ∈ 𝒳} p(X)).

The approximation error bound decreases exponentially with the number of solutions [Rezatofighi et al., ICCV 2015].
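A brute-force sketch of the approximation on a toy linear cost: enumerate feasible assignments, keep the m lowest costs, and average their indicator matrices weighted by p(X) ∝ exp(−f(X)). The exponential weighting is one natural choice for turning costs into unnormalized probabilities; the slides do not pin down the exact weights.

```python
import itertools
import numpy as np

def mbest_marginals(C, m):
    """Approximate matching marginals from the m lowest-cost one-to-one
    assignments of the square cost matrix C (brute force, toy sizes only)."""
    n = C.shape[0]
    # Enumerate all feasible permutations with their linear costs f(X) = C^T X.
    sols = sorted(
        (C[np.arange(n), p].sum(), p) for p in itertools.permutations(range(n))
    )[:m]
    # Weight each of the m best solutions by p(X) proportional to exp(-f(X)).
    weights = np.array([np.exp(-f) for f, _ in sols])
    weights /= weights.sum()
    # Accumulate weighted indicator matrices to get approximate marginals.
    P = np.zeros_like(C, dtype=float)
    for w, (_, p) in zip(weights, sols):
        P[np.arange(n), p] += w
    return P

C = np.array([[4.0, 1.0, 3.0],
              [2.0, 0.0, 5.0],
              [3.0, 2.0, 2.0]])
P = mbest_marginals(C, m=3)
# Each row sums to 1: every query is assigned somewhere in every solution.
print(np.round(P, 3))
```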
Computing the m-Best Solutions
Naïve exclusion strategy

X_1* = argmin f(X)   s.t.  AX ≤ B
X_2* = argmin f(X)   s.t.  AX ≤ B,  ⟨X, X_1*⟩ ≤ ‖X_1*‖_1 − 1
X_3* = argmin f(X)   s.t.  AX ≤ B,  ⟨X, X_1*⟩ ≤ ‖X_1*‖_1 − 1,  ⟨X, X_2*⟩ ≤ ‖X_2*‖_1 − 1
⋮
X_k* = argmin f(X)   s.t.  AX ≤ B,  ⟨X, X_l*⟩ ≤ ‖X_l*‖_1 − 1 for l = 1, …, k−1

The accumulated exclusion constraints form an augmented linear system A′X ≤ B′.
A general approach, but impractical for large values of m.

Binary tree partitioning
Partitions the space into a set of disjoint subspaces [Rezatofighi et al., ICCV 2015].
An efficient approach, but not a good strategy for weak solvers.
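The naïve exclusion strategy can be sketched by brute force on a toy problem, with the exclusion constraint ⟨X, X_l⟩ ≤ ‖X_l‖_1 − 1 checked explicitly; a real implementation would pass these constraints to an ILP solver instead:

```python
import itertools
import numpy as np

def naive_m_best(C, m):
    """Naive exclusion strategy: repeatedly find the best feasible
    assignment that differs from every previously found one, i.e.
    <X, X_l> <= ||X_l||_1 - 1 for all earlier solutions X_l.
    Brute force over permutations; for illustration on toy sizes only."""
    n = C.shape[0]
    found = []
    for _ in range(m):
        best = None
        for p in itertools.permutations(range(n)):
            X = np.zeros_like(C)
            X[np.arange(n), p] = 1
            # Exclusion constraints: a candidate identical to a previous
            # solution has <X, X_l> = ||X_l||_1 and is therefore skipped.
            if any((X * Xl).sum() > Xl.sum() - 1 for Xl in found):
                continue
            cost = (C * X).sum()
            if best is None or cost < best[0]:
                best = (cost, X)
        found.append(best[1])
    return found

C = np.array([[4.0, 1.0, 3.0],
              [2.0, 0.0, 5.0],
              [3.0, 2.0, 2.0]])
sols = naive_m_best(C, m=3)
costs = [(C * X).sum() for X in sols]
print(costs)
```

Each outer iteration re-solves the whole problem with one more constraint, which is why the approach becomes impractical as m grows.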
Experimental Results
Person Re-Identification

X* = argmin_{X ∈ 𝒳} C^T X

[Figure: each query image is matched against the gallery images, with an additional "None of them" option per query.]

Original assignment costs (rows = query images, columns = gallery images, including the "None of them" column):

    c_1^0  c_1^1  ⋯  c_1^N
    c_2^0  c_2^1  ⋯  ⋯
     ⋮      ⋮     ⋱   ⋮
    c_M^0  c_M^1  ⋯  c_M^N

m-best marginalized costs (same layout, computed from the m best solutions):

    𝔠_1^0  𝔠_1^1  ⋯  𝔠_1^N
     ⋮      ⋮     ⋱   ⋮
    𝔠_M^0  𝔠_M^1  ⋯  𝔠_M^N

Ranking is improved.
Experimental Results
Person Re-Identification

Baselines: FT [Das et al., ECCV 2014]; AvgF [Paisitkriangkrai et al., CVPR 2015]

Dataset (Size)    Method (m=100)   Matching rates (%)      Time (Sec.)
RAiD (20×20)      FT               74.0   82.0   96.0
                  mbst-FT          85.0   99.0  100.0      1.6
iLIDS (59×59)     AvgF             51.9   60.7   72.4
                  mbst-AvgF        54.7   63.6   75.4      15.4
VIPeR (316×316)   AvgF             44.9   58.3   76.3
                  mbst-AvgF        50.5   63.0   78.0      201.9
Experimental Results
Feature Matching

X* = argmax_{X ∈ 𝒳} X^T K X

Matching on the PASCAL VOC dataset [Leordeanu et al., IJCV 2011]
Solvers: BP solver [Zhang et al., CVPR 2016]; IPFP solver [Leordeanu et al., IJCV 2011]
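For intuition, the quadratic objective X^T K X can be maximized by brute force at toy sizes. In the sketch below the pairwise affinity matrix K is made up (it simply rewards the identity assignment); solvers such as BP or IPFP are needed at realistic scale:

```python
import itertools
import numpy as np

def brute_force_qap(K, n):
    """Maximize X^T K X over one-to-one assignments of n points to n points.
    x is the flattened n*n assignment matrix; K is the (n*n) x (n*n)
    pairwise affinity matrix. Exponential in n: toy sizes only."""
    best_score, best_perm = -np.inf, None
    for p in itertools.permutations(range(n)):
        x = np.zeros(n * n)
        for i, j in enumerate(p):
            x[i * n + j] = 1.0       # point i assigned to point j
        score = x @ K @ x
        if score > best_score:
            best_score, best_perm = score, p
    return best_score, best_perm

# Made-up affinity matrix: identity, with boosted affinity for i -> i matches.
n = 3
K = np.eye(n * n)
for i in range(n):
    K[i * n + i, i * n + i] = 2.0
score, perm = brute_force_qap(K, n)
print(score, perm)
```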
Discussion & Conclusion
Limitations
• The one-to-one constraint is no longer guaranteed after marginalization
• Calculating the m solutions requires computational overhead

Conclusion
• Graph matching via approximate marginals computed from the m-best solutions instead of the MAP estimate
• A generic approach, applicable to similar problems
• Marginalization improves matching accuracy and ranking

Take-home message
Do not rely on a single solution: explore more solutions.

Future work
Exploring further applications with arbitrary cost functions