Efficient Inference in Fully Connected CRFs with Gaussian Edge Potentials
Philipp Krähenbühl and Vladlen Koltun
Stanford University
Presenter: Yuan-Ting Hu
Conditional Random Field (CRF)
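$$E(x \mid I) = \underbrace{\sum_i \phi_u(x_i \mid I)}_{\text{unary term}} + \underbrace{\sum_i \sum_{j \in \partial i} \phi_p(x_i, x_j \mid I)}_{\text{pairwise term}}$$
• $X$, $I$: random fields
• Application:
  • Image segmentation: achieved state-of-the-art performance (in 2011)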
Image Segmentation
• Example: semantic image segmentation
[Figure: input image and output segmentation]
CRF for Image Segmentation
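$$E(x \mid I) = \underbrace{\sum_i \phi_u(x_i \mid I)}_{\text{unary term}} + \underbrace{\sum_i \sum_{j \in \partial i} \phi_p(x_i, x_j \mid I)}_{\text{pairwise term}}$$
• $X$: a random field defined over a set of variables $\{X_1, \dots, X_N\}$
  • Labels of the pixels (grass, bench, tree, ...)
• $I$: a random field defined over a set of variables $\{I_1, \dots, I_N\}$
  • Image (observation)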
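As a concrete reading of the energy above, here is a minimal sketch that evaluates it for a given labeling on a 4-connected grid. The Potts-style pairwise cost, the `pairwise_weight` parameter, and the function name `grid_energy` are assumptions made for illustration; the slides have not fixed a pairwise term at this point.

```python
import numpy as np

def grid_energy(labels, unary, pairwise_weight=1.0):
    """Energy of a labeling on a 4-connected grid (illustrative sketch).

    labels: (H, W) integer array, labels[r, c] is the label x_i of pixel i
    unary:  (H, W, L) array, unary[r, c, l] is the unary cost of label l
    pairwise_weight: assumed constant cost for two neighbors with different labels
    """
    h, w = labels.shape
    rows, cols = np.indices((h, w))
    # Unary term: sum over pixels of the cost of the chosen label.
    e_unary = unary[rows, cols, labels].sum()
    # Pairwise term: count neighboring pairs whose labels differ.
    horiz = (labels[:, 1:] != labels[:, :-1]).sum()
    vert = (labels[1:, :] != labels[:-1, :]).sum()
    # The double sum over i and j in the neighborhood of i visits each
    # undirected edge twice, hence the factor 2.
    e_pair = 2 * pairwise_weight * (horiz + vert)
    return float(e_unary + e_pair)

# Tiny example: a 2 x 2 image with two labels.
labels = np.array([[0, 0], [0, 1]])
unary = np.zeros((2, 2, 2))
unary[..., 1] = 1.0  # label 1 costs 1 everywhere, label 0 costs 0
energy = grid_energy(labels, unary)
```

Inference then amounts to finding the labeling with the lowest energy, or the marginals of the corresponding distribution.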
CRF for Image Segmentation
• Unary term
  • Trained on a dataset
$$E(x) = \underbrace{\sum_i \psi_u(x_i)}_{\text{unary term}} + \underbrace{\sum_i \sum_{j \in \partial i} \psi_p(x_i, x_j)}_{\text{pairwise term}}$$
CRF for Image Segmentation
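• Pairwise term
  • Imposes consistency of the labeling
  • Defined over neighboring pixels
$$\psi_p(x_i, x_j) = \mu(x_i, x_j) \sum_{m=1}^{K} w^{(m)} k^{(m)}(f_i, f_j)$$
$$E(x) = \underbrace{\sum_i \psi_u(x_i)}_{\text{unary term}} + \underbrace{\sum_i \sum_{j \in \partial i} \psi_p(x_i, x_j)}_{\text{pairwise term}}$$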
CRF for Image Segmentation
• Pairwise term
$$\psi_p(x_i, x_j) = \mu(x_i, x_j) \sum_{m=1}^{K} w^{(m)} k^{(m)}(f_i, f_j)$$
$k^{(m)}(f_i, f_j)$ is a Gaussian kernel
$f_i$, $f_j$ are the feature vectors for pixels $i$ and $j$, e.g., color intensities, ...
$w^{(m)}$ is the weight of the $m$-th kernel
$\mu(x_i, x_j)$ is the label compatibility function
$$E(x) = \underbrace{\sum_i \psi_u(x_i)}_{\text{unary term}} + \underbrace{\sum_i \sum_{j \in \partial i} \psi_p(x_i, x_j)}_{\text{pairwise term}}$$
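The pairwise term can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' code: the kernels are taken to be isotropic Gaussians with one bandwidth `theta` each, and the label compatibility is taken to be a Potts model (1 for different labels, 0 for equal labels). All function names are hypothetical.

```python
import numpy as np

def gaussian_kernel(f_i, f_j, theta):
    """k(f_i, f_j) = exp(-|f_i - f_j|^2 / (2 theta^2)).

    theta is an assumed isotropic bandwidth; a general Gaussian kernel
    may use a full precision matrix instead.
    """
    d = np.asarray(f_i, dtype=float) - np.asarray(f_j, dtype=float)
    return float(np.exp(-np.dot(d, d) / (2.0 * theta ** 2)))

def potts(x_i, x_j):
    """Assumed label compatibility mu: penalize only differing labels."""
    return 1.0 if x_i != x_j else 0.0

def pairwise_potential(x_i, x_j, f_i, f_j, weights, thetas):
    """psi_p(x_i, x_j) = mu(x_i, x_j) * sum_m w^(m) k^(m)(f_i, f_j)."""
    k_sum = sum(w * gaussian_kernel(f_i, f_j, t) for w, t in zip(weights, thetas))
    return potts(x_i, x_j) * k_sum

# Two pixels with identical features and different labels pay the full weight.
cost_same_features = pairwise_potential(0, 1, [0.0, 0.0], [0.0, 0.0], [2.0, 3.0], [1.0, 5.0])
# Equal labels never pay a pairwise cost under the Potts assumption.
cost_same_labels = pairwise_potential(1, 1, [0.0], [4.0], [2.0], [1.0])
```

The cost of assigning different labels shrinks as the feature vectors move apart, so label changes are cheapest where the pixels look different.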
CRF for Image Segmentation
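$$E(x \mid I) = \underbrace{\sum_i \phi_u(x_i \mid I)}_{\text{unary term}} + \underbrace{\sum_i \sum_{j \in \boldsymbol{\partial i}} \phi_p(x_i, x_j \mid I)}_{\text{pairwise term}}$$
• Neighboring pixels
• Local connections
• May not capture sharp boundaries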
Grid CRF for Image Segmentation
Unary term
• Local connections
• May not capture the sharp boundaries
Fully connected CRF for Image Segmentation
• Fully connected CRF
  • Every node is connected to every other node
• MCMC inference, 36 hours!!
Dense CRF
Efficient Inference on Fully connected CRF
• The authors propose an efficient approximate inference algorithm for fully connected CRFs
• Inference in 0.2 seconds
  • ~50,000 nodes (applies to pixel-level segmentation)
• Based on a mean field approximation to the CRF distribution
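To make the jump from local to full connectivity concrete, the short sketch below counts pairwise terms in both models. The 200 x 250 image size is a hypothetical choice that gives the roughly 50,000 nodes mentioned above, and the function names are illustrative.

```python
def grid_edges(height, width):
    """Number of undirected edges in a 4-connected grid."""
    return height * (width - 1) + width * (height - 1)

def dense_edges(n):
    """Number of undirected edges when every node is connected to every other node."""
    return n * (n - 1) // 2

h, w = 200, 250            # hypothetical image size, 50,000 pixels
n = h * w
print(grid_edges(h, w))    # 99550
print(dense_edges(n))      # 1249975000
```

The dense model has over a billion pairwise terms at this size, which is why visiting each pair explicitly is impractical.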
Mean Field Approximation
• Mean field update rule for CRF
$$Q_i(x_i = l) = \frac{1}{Z_i} \exp\Big\{ -\psi_u(x_i) - \sum_{l' \in L} \mu(l, l') \sum_{m=1}^{K} w^{(m)} \sum_{j \neq i} k^{(m)}(f_i, f_j)\, Q_j(l') \Big\}$$
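The update rule can be evaluated directly for a single pixel. The sketch below does this with NumPy, assuming isotropic Gaussian kernels with one bandwidth per kernel; the function name `update_pixel` and the array layout are illustrative choices, not taken from the slides.

```python
import numpy as np

def update_pixel(i, Q, unary, feats, mu, weights, thetas):
    """One mean field update of Q_i, evaluated directly from the update rule.

    Q:       (N, L) current marginals, Q[j, l] = Q_j(l)
    unary:   (N, L) unary potentials psi_u
    feats:   (N, D) feature vectors f_j
    mu:      (L, L) label compatibility, mu[l, l2] = mu(l, l')
    weights: kernel weights w^(m)
    thetas:  assumed isotropic bandwidths, one per kernel
    """
    sq_dist = ((feats - feats[i]) ** 2).sum(axis=1)   # |f_i - f_j|^2 for all j
    msg = np.zeros(Q.shape[1])
    for w, theta in zip(weights, thetas):
        k = np.exp(-sq_dist / (2.0 * theta ** 2))     # k^(m)(f_i, f_j)
        k[i] = 0.0                                    # the sum excludes j = i
        msg += w * (k @ Q)                            # one entry per label l'
    logits = -unary[i] - mu @ msg                     # exponent of the update rule
    logits -= logits.max()                            # numerical stability only
    q = np.exp(logits)
    return q / q.sum()                                # division by Z_i

# Three pixels, two labels: pixel 0 is undecided, its neighbors favor label 0.
Q = np.array([[0.5, 0.5], [0.9, 0.1], [0.9, 0.1]])
unary = np.zeros((3, 2))
feats = np.array([[0.0], [0.1], [0.2]])
mu = 1.0 - np.eye(2)                                  # assumed Potts compatibility
q0 = update_pixel(0, Q, unary, feats, mu, weights=[1.0], thetas=[1.0])
```

Each such update touches all N pixels, so updating every pixel this way costs O(N^2) per iteration.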
Mean Field Approximation
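$$Q_i(x_i = l) = \frac{1}{Z_i} \exp\Big\{ -\psi_u(x_i) - \sum_{l' \in L} \mu(l, l') \sum_{m=1}^{K} w^{(m)} \sum_{j \neq i} k^{(m)}(f_i, f_j)\, Q_j(l') \Big\}$$
Algorithm
• Initialize Q: $Q_i(x_i) = \frac{1}{Z_i} \exp\{-\phi_u(x_i)\}$
• While not converged
  • Message passing: $\tilde{Q}_i^{(m)}(l') = \sum_{j \neq i} k^{(m)}(f_i, f_j)\, Q_j(l')$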
Mean Field Approximation
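$$Q_i(x_i = l) = \frac{1}{Z_i} \exp\Big\{ -\psi_u(x_i) - \sum_{l' \in L} \mu(l, l') \sum_{m=1}^{K} w^{(m)} \tilde{Q}_i^{(m)}(l') \Big\}$$
Algorithm
• Initialize Q: $Q_i(x_i) = \frac{1}{Z_i} \exp\{-\phi_u(x_i)\}$
• While not converged
  • Message passing: $\tilde{Q}_i^{(m)}(l') = \sum_{j \neq i} k^{(m)}(f_i, f_j)\, Q_j(l')$
  • Compatibility transform: $\hat{Q}_i(l) = \sum_{l' \in L} \mu(l, l') \sum_{m=1}^{K} w^{(m)} \tilde{Q}_i^{(m)}(l')$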
Mean Field Approximation
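$$Q_i(x_i = l) = \frac{1}{Z_i} \exp\{-\psi_u(x_i) - \hat{Q}_i(l)\}$$
Algorithm
• Initialize Q: $Q_i(x_i) = \frac{1}{Z_i} \exp\{-\phi_u(x_i)\}$
• While not converged
  • Message passing: $\tilde{Q}_i^{(m)}(l') = \sum_{j \neq i} k^{(m)}(f_i, f_j)\, Q_j(l')$
  • Compatibility transform: $\hat{Q}_i(l) = \sum_{l' \in L} \mu(l, l') \sum_{m=1}^{K} w^{(m)} \tilde{Q}_i^{(m)}(l')$
  • Update to calculate $Q_i(x_i = l)$
  • Normalization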
Mean Field Approximation
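$$Q_i(x_i = l) = \frac{1}{Z_i} \exp\{-\psi_u(x_i) - \hat{Q}_i(l)\}$$
Algorithm
• Initialize Q: $Q_i(x_i) = \frac{1}{Z_i} \exp\{-\phi_u(x_i)\}$
• While not converged
  • Message passing: $\tilde{Q}_i^{(m)}(l') = \sum_{j \neq i} k^{(m)}(f_i, f_j)\, Q_j(l')$ ($O(N^2)$)
  • Compatibility transform: $\hat{Q}_i(l) = \sum_{l' \in L} \mu(l, l') \sum_{m=1}^{K} w^{(m)} \tilde{Q}_i^{(m)}(l')$ ($O(N)$)
  • Update to calculate $Q_i(x_i = l)$ ($O(N)$)
  • Normalization ($O(N)$)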
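The four steps of the loop can be sketched as a naive NumPy implementation in which message passing builds the full N x N kernel matrices, which is where the quadratic cost comes from. This is a minimal sketch under assumptions: isotropic Gaussian kernels, a fixed iteration count in place of a convergence test, and hypothetical names (`naive_mean_field`, `softmax`).

```python
import numpy as np

def softmax(a):
    """Row-wise exp and normalization (the division by Z_i)."""
    a = a - a.max(axis=1, keepdims=True)
    e = np.exp(a)
    return e / e.sum(axis=1, keepdims=True)

def naive_mean_field(unary, feats, mu, weights, thetas, n_iters=10):
    """Mean field inference with explicit O(N^2) message passing.

    unary: (N, L) unary potentials, feats: (N, D) features,
    mu: (L, L) label compatibility with mu[l, l2] = mu(l, l'),
    thetas: assumed isotropic kernel bandwidths.
    """
    # Dense kernel matrices: N^2 entries each.
    sq_dist = ((feats[:, None, :] - feats[None, :, :]) ** 2).sum(axis=2)
    kernels = [np.exp(-sq_dist / (2.0 * t ** 2)) for t in thetas]
    for k in kernels:
        np.fill_diagonal(k, 0.0)                 # the sum excludes j = i
    Q = softmax(-unary)                          # initialization
    for _ in range(n_iters):                     # stands in for "while not converged"
        # Message passing: one filtered copy of Q per kernel.
        q_tilde = [k @ Q for k in kernels]
        # Compatibility transform: weight the kernels, then mix labels with mu.
        combined = sum(w * qt for w, qt in zip(weights, q_tilde))
        q_hat = combined @ mu.T                  # q_hat[i, l] = sum_l' mu[l, l'] combined[i, l']
        # Local update and normalization.
        Q = softmax(-unary - q_hat)
    return Q

# Six pixels on a line; pixel 2 weakly prefers label 1, the rest prefer label 0.
feats = np.arange(6, dtype=float)[:, None]
unary = np.tile([0.0, 2.0], (6, 1))
unary[2] = [0.5, 0.0]
mu = 1.0 - np.eye(2)                             # assumed Potts compatibility
Q = naive_mean_field(unary, feats, mu, weights=[1.0], thetas=[2.0])
```

In this toy example the outlier pixel is pulled toward the label of its surroundings. Only the `k @ Q` products scale quadratically with N; the other steps are per-pixel operations.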
Mean Field Approximation
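$$Q_i(x_i = l) = \frac{1}{Z_i} \exp\{-\psi_u(x_i) - \hat{Q}_i(l)\}$$
Algorithm
• Initialize Q: $Q_i(x_i) = \frac{1}{Z_i} \exp\{-\phi_u(x_i)\}$
• While not converged
  • Message passing: $\boldsymbol{\tilde{Q}_i^{(m)}(l') = \sum_{j \neq i} k^{(m)}(f_i, f_j)\, Q_j(l')}$ ($O(N^2)$)
  • Compatibility transform: $\hat{Q}_i(l) = \sum_{l' \in L} \mu(l, l') \sum_{m=1}^{K} w^{(m)} \tilde{Q}_i^{(m)}(l')$ ($O(N)$)
  • Update to calculate $Q_i(x_i = l)$ ($O(N)$)
  • Normalization ($O(N)$)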
Efficient Message Passing
• Message passing: $\tilde{Q}_i^{(m)}(l') = \sum_{j \neq i} k^{(m)}(f_i, f_j)\, Q_j(l')$
• Gaussian filter $k^{(m)}(f_i, f_j)$
  • Apply convolution to $Q_j(l')$
Efficient Message Passing
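• Message passing: $\tilde{Q}_i^{(m)}(l') = \sum_{j \neq i} k^{(m)}(f_i, f_j)\, Q_j(l') = G^{(m)} \otimes Q(l') - Q_i(l')$
• Gaussian filter $k^{(m)}(f_i, f_j)$
  • Apply convolution to $Q_j(l')$
• Gaussian filtering is a smooth, low-pass operation, so the filtered signal can be reconstructed from a set of samples (by the sampling theorem)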
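The identity between the sum over j ≠ i and the full convolution minus $Q_i$ relies on $k^{(m)}(f_i, f_i) = 1$ for a Gaussian kernel. A quick numerical check, with hypothetical sizes and an assumed isotropic bandwidth `theta`:

```python
import numpy as np

rng = np.random.default_rng(0)
n, n_labels, theta = 50, 3, 2.0            # hypothetical sizes and bandwidth
feats = rng.normal(size=(n, 2))            # random feature vectors f_i
Q = rng.random((n, n_labels))
Q /= Q.sum(axis=1, keepdims=True)          # each row is a distribution over labels

sq_dist = ((feats[:, None, :] - feats[None, :, :]) ** 2).sum(axis=2)
G = np.exp(-sq_dist / (2.0 * theta ** 2))  # full kernel matrix, diagonal equals 1

# Left-hand side: sum over j != i.
K = G.copy()
np.fill_diagonal(K, 0.0)
lhs = K @ Q

# Right-hand side: filter with the full kernel, then subtract Q_i.
rhs = G @ Q - Q

identity_holds = np.allclose(lhs, rhs)
```

This is what allows message passing to be treated as a standard filtering operation followed by a cheap per-pixel correction.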
Efficient Message Passing
• Message passing: $\tilde{Q}_i^{(m)}(l') = \sum_{j \neq i} k^{(m)}(f_i, f_j)\, Q_j(l') = G^{(m)} \otimes Q(l') - Q_i(l')$
• Downsample $Q_j(l')$
• Blur the downsampled signal (apply the convolution operator with kernel $k^{(m)}$)
• Upsample to reconstruct the filtered signal $\approx \tilde{Q}_i^{(m)}(l')$
• Reduces the time complexity to $O(N)$
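The three steps can be illustrated on a one-dimensional signal whose only feature is position. This is a simplified stand-in that uses a regular coarse grid with linear interpolation, not the high-dimensional filtering data structure used in the paper; `approx_filter`, `exact_filter`, `stride`, and the 3-sigma truncation are all assumptions made for the sketch. It computes the $G \otimes Q$ part only; subtracting $Q_i$ is a separate per-pixel step.

```python
import numpy as np

def exact_filter(q, sigma):
    """Reference O(N^2) Gaussian filtering (includes the j = i term)."""
    idx = np.arange(len(q))
    g = np.exp(-(idx[:, None] - idx[None, :]) ** 2 / (2.0 * sigma ** 2))
    return g @ q

def approx_filter(q, sigma, stride):
    """Downsample, blur, upsample. Assumes len(q) > stride and a coarse grid
    at least as long as the truncated blur kernel."""
    n = len(q)
    pos = np.arange(n) / stride                  # fine positions in coarse-grid units
    n_coarse = int(np.ceil(pos[-1])) + 1
    lo = np.minimum(pos.astype(int), n_coarse - 2)
    frac = pos - lo
    # Downsample: spread each sample onto its two nearest coarse nodes.
    coarse = np.zeros(n_coarse)
    np.add.at(coarse, lo, q * (1.0 - frac))
    np.add.at(coarse, lo + 1, q * frac)
    # Blur: Gaussian truncated at 3 standard deviations on the coarse grid.
    s = sigma / stride
    radius = int(np.ceil(3 * s))
    taps = np.exp(-np.arange(-radius, radius + 1) ** 2 / (2.0 * s ** 2))
    blurred = np.convolve(coarse, taps, mode="same")
    # Upsample: linear interpolation back to the fine positions.
    return blurred[lo] * (1.0 - frac) + blurred[lo + 1] * frac

rng = np.random.default_rng(0)
q = rng.random(1001)                             # hypothetical signal, e.g. Q(l') along a scanline
exact = exact_filter(q, sigma=20.0)
approx = approx_filter(q, sigma=20.0, stride=5)
rel_err = np.abs(approx - exact).max() / exact.max()
```

With the stride and truncation radius tied to the kernel bandwidth, the blur touches a constant number of coarse nodes per sample, so the total work grows linearly with N. The price is a small approximation error, measured here by `rel_err`.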
Mean Field Approximation
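$$Q_i(x_i = l) = \frac{1}{Z_i} \exp\{-\psi_u(x_i) - \hat{Q}_i(l)\}$$
Algorithm
• Initialize Q: $Q_i(x_i) = \frac{1}{Z_i} \exp\{-\phi_u(x_i)\}$
• While not converged
  • Message passing: $\tilde{Q}_i^{(m)}(l') = \sum_{j \neq i} k^{(m)}(f_i, f_j)\, Q_j(l')$ ($O(N)$)
  • Compatibility transform: $\hat{Q}_i(l) = \sum_{l' \in L} \mu(l, l') \sum_{m=1}^{K} w^{(m)} \tilde{Q}_i^{(m)}(l')$ ($O(N)$)
  • Update to calculate $Q_i(x_i = l)$ ($O(N)$)
  • Normalization ($O(N)$)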
Results
[Figure: input image and dense CRF results]
Results
Conclusion
• A fully connected CRF model for pixel-level segmentation
• Efficient inference on the fully connected CRF
  • Linear in the number of variables
Dense CRF as Post-processing
Semantic Image Segmentation with Deep Convolutional Nets and Fully Connected CRFs. Chen et al. ICLR’15
Convergent Inference
• Parameter Learning and Convergent Inference for Dense Random Fields. Philipp Krähenbühl and Vladlen Koltun. ICML’13.
  • A new efficient inference algorithm for dense CRFs that is guaranteed to converge for some specific kernels and label compatibility functions.
Questions?
Pairwise Term in the Dense CRF Model
• Pairwise term
$$\psi_p(x_i, x_j) = \mu(x_i, x_j) \sum_{m=1}^{K} w^{(m)} k^{(m)}(f_i, f_j)$$
• They use
$p_i$: position of pixel $i$
$I_i$: color intensity of pixel $i$
$\theta_*$: hyperparameters
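For reference, a sketch of the two-kernel form given in the original paper, which matches the symbols listed above ($p_i$, $I_i$, $\theta$). The subscripts $\alpha$, $\beta$, $\gamma$ are the paper's names for the three bandwidths and are assumed here rather than taken from the slide text:

```latex
k(f_i, f_j) =
  w^{(1)} \exp\!\left( -\frac{\lVert p_i - p_j \rVert^2}{2\theta_\alpha^2}
                       -\frac{\lVert I_i - I_j \rVert^2}{2\theta_\beta^2} \right)
  + w^{(2)} \exp\!\left( -\frac{\lVert p_i - p_j \rVert^2}{2\theta_\gamma^2} \right)
```

The first term is an appearance kernel: nearby pixels with similar color are encouraged to share a label. The second is a smoothness kernel that depends on position only and discourages small isolated regions.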