A Probabilistic U-Net for Segmentation of Ambiguous Images
Simon A. A. Kohl1*,2, Bernardino Romera-Paredes1, Clemens Meyer1, Jeffrey De Fauw1, Joseph R. Ledsam1, Klaus H. Maier-Hein2,
S. M. Ali Eslami1, Danilo Jimenez Rezende1, Olaf Ronneberger1
1 DeepMind, 2 German Cancer Research Center
*work done during an internship at DeepMind
Poster #127
Medical Imaging Workshop Talk: Sat, Dec 8, 9:45 am
Images are often Ambiguous
[Figure: an image with a potential cancer, the segmentations from expert graders, and segmentations from our model (U-Net + conditional VAE)]
Deterministic U-Net: Inference
[Diagram: Image → U-Net]
Probabilistic U-Net: Sampling
[Diagram: the Image is fed to the U-Net and to the Prior Net; the Prior Net outputs 𝛍,𝞂 prior, which define a distribution over the Latent Space; points 1, 2, 3 drawn from the Latent Space each yield one segmentation Sample]
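
The sampling step can be illustrated with a minimal NumPy sketch. It is not the released implementation: the feature map, the prior parameters, and the weights are random stand-ins, and the way a latent sample is merged with the U-Net features (tiling it over the image grid, concatenating, and applying a per-pixel linear map) is an assumption made for illustration. All names (`sample_latents`, `combine`, `weights`) are hypothetical.

```python
import numpy as np

def sample_latents(mu, sigma, n_samples, rng):
    """Draw n_samples latent vectors z ~ N(mu, diag(sigma^2))."""
    eps = rng.standard_normal((n_samples, mu.shape[0]))
    return mu + sigma * eps

def combine(features, z, weights):
    """Tile z over the spatial grid, concatenate it to the feature map, and
    apply a per-pixel linear map (a 1x1 convolution) to get class logits."""
    h, w, _ = features.shape
    z_tiled = np.broadcast_to(z, (h, w, z.shape[0]))
    stacked = np.concatenate([features, z_tiled], axis=-1)
    return stacked @ weights  # shape (h, w, n_classes)

rng = np.random.default_rng(0)
h, w, n_feat, n_latent, n_classes = 8, 8, 4, 6, 2

# Hypothetical stand-ins for the network outputs on one image.
features = rng.standard_normal((h, w, n_feat))              # U-Net feature map
mu_prior = rng.standard_normal(n_latent)                    # Prior Net mean
sigma_prior = np.exp(0.5 * rng.standard_normal(n_latent))   # Prior Net std, kept positive
weights = rng.standard_normal((n_feat + n_latent, n_classes))

# The feature map is computed once; only z changes from sample to sample.
zs = sample_latents(mu_prior, sigma_prior, 3, rng)
segmentations = [combine(features, z, weights).argmax(axis=-1) for z in zs]
```

Under these assumptions, drawing further segmentations for the same image only repeats the cheap `combine` step, since the feature map does not depend on z.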
Probabilistic U-Net: Training
Position in Latent Space for this GT example?
[Diagram: the Posterior Net receives the Image and the Groundtruth and outputs 𝛍,𝞂 post; a Sample 𝐳 from the Latent Space is fed to the U-Net together with the Image; the resulting Sample is compared with the Groundtruth via Cross-Entropy; a KL term links 𝛍,𝞂 post to the Prior Net's 𝛍,𝞂 prior]
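
The two loss terms in the training diagram (Cross-Entropy and KL) can be written out as a small NumPy sketch. It assumes axis-aligned Gaussians for both the prior and the posterior, takes the KL in the direction posterior to prior, and adds a weighting factor `beta`; the function names, `beta`, and the choice to average the cross-entropy over pixels are assumptions for illustration, not details stated on the slide.

```python
import numpy as np

def kl_diag_gaussians(mu_q, sigma_q, mu_p, sigma_p):
    """KL(Q || P) for axis-aligned Gaussians Q = N(mu_q, sigma_q^2), P = N(mu_p, sigma_p^2)."""
    var_q, var_p = sigma_q ** 2, sigma_p ** 2
    return 0.5 * np.sum(
        np.log(var_p / var_q) + (var_q + (mu_q - mu_p) ** 2) / var_p - 1.0
    )

def cross_entropy(logits, labels):
    """Mean pixel-wise cross-entropy between logits (H, W, C) and integer labels (H, W)."""
    shifted = logits - logits.max(axis=-1, keepdims=True)  # numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))
    picked = np.take_along_axis(log_probs, labels[..., None], axis=-1)
    return -picked.mean()

def training_loss(logits, labels, mu_post, sigma_post, mu_prior, sigma_prior, beta=1.0):
    """Cross-entropy of the predicted segmentation plus beta * KL(posterior || prior)."""
    kl = kl_diag_gaussians(mu_post, sigma_post, mu_prior, sigma_prior)
    return cross_entropy(logits, labels) + beta * kl

# Toy inputs standing in for the network outputs (all hypothetical).
rng = np.random.default_rng(0)
logits = rng.standard_normal((4, 4, 2))
labels = rng.integers(0, 2, size=(4, 4))
mu_prior, sigma_prior = np.zeros(6), np.ones(6)
mu_post, sigma_post = 0.5 * np.ones(6), 0.8 * np.ones(6)
loss = training_loss(logits, labels, mu_post, sigma_post, mu_prior, sigma_prior)
```

The KL term is zero only when the posterior and prior coincide, so minimizing it pulls the Prior Net towards the latent positions that the Posterior Net assigns to the ground-truth examples.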
Probabilistic U-Net: Latent Space Analysis
[Figure: an Image and a visualization of the latent space; the labels 0, 1, 2, 3 appear under "Graders"]
Lung Abnormalities Segmentation: Quantitative Results
[Plot: Energy distance (lower is better); axis ticks 1, 4, 8, 16]
Cityscapes segmentation: Qualitative Results
[Figure panels: Input Image; Ground-truth Grader Styles; Samples (Probabilistic U-Net)]
Stochastic flips:
sidewalk → sidewalk 2: 47 %
person → person 2: 41 %
car → car 2: 35 %
veget. → veget. 2: 29 %
road → road 2: 24 %
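
A minimal sketch of how such stochastic flips could be generated for one label map. It assumes that each class is flipped independently and as a whole (all pixels of the class at once) with the percentage listed on the slide; the class ids and the `FLIP_OFFSET` encoding of the "2" variants are hypothetical.

```python
import numpy as np

# Flip probabilities as listed on the slide (rounded percentages).
FLIP_PROBS = {"sidewalk": 0.47, "person": 0.41, "car": 0.35, "veget.": 0.29, "road": 0.24}
# Hypothetical integer ids for the five classes.
CLASS_IDS = {"road": 0, "sidewalk": 1, "person": 2, "car": 3, "veget.": 4}
FLIP_OFFSET = 100  # hypothetical: id of "<class> 2" is id of "<class>" + 100

def flip_labels(label_map, rng):
    """Return a copy of label_map in which each listed class is, independently
    and as a whole, relabeled to its "2" variant with its flip probability."""
    flipped = label_map.copy()
    for name, p in FLIP_PROBS.items():
        if rng.random() < p:
            flipped[label_map == CLASS_IDS[name]] += FLIP_OFFSET
    return flipped

rng = np.random.default_rng(0)
label_map = rng.integers(0, 5, size=(16, 16))  # toy label map
out = flip_labels(label_map, rng)
```

With five independent binary flips, this scheme produces 2^5 = 32 distinct labelings per image, each with a known probability.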
Cityscapes segmentation: Quantitative Results
Conclusions
● Learn conditional probability over segmentation maps
● Each sample is a valid & consistent segmentation
● The likelihoods are well calibrated
● Works on large-scale, real-world data
● Can also be trained with a uni-modal GT
● Can be used to assess annotations under the model
code: https://github.com/SimonKohl/probabilistic_unet
Probabilistic Segmentation: Clinical Use-Cases
● Best-fit could be picked by clinician and adjusted if necessary.
● Hypotheses could be propagated into next diagnostic pipeline steps.
● Hypotheses could inform actions to resolve ambiguities.
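Evaluation Metric for Quantitative Comparison
We use the Energy Distance¹ statistic (aka MMD):
$$D^2(P_{\mathrm{gt}}, P_{\mathrm{out}}) = 2\,\mathbb{E}[d(S, Y)] - \mathbb{E}[d(S, S')] - \mathbb{E}[d(Y, Y')]$$
where d(x,y) = 1 - IoU(x,y), S and S' are independent samples from Pout, and Y and Y' are independent samples from Pgt.
¹ Székely, G.J., Rizzo, M.L.: Energy statistics: A class of statistics based on distances. Journal of Statistical Planning and Inference 143(8) (2013) 1249–1272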
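
A minimal NumPy sketch of how this statistic could be estimated from a finite set of model samples and a finite set of ground-truth segmentations, using d(x,y) = 1 - IoU(x,y) on binary masks. It computes 2 E[d(S, Y)] - E[d(S, S')] - E[d(Y, Y')] with plain averages over all pairs (including each mask paired with itself). Treating two empty masks as identical (distance 0) is an assumption, and the function names are hypothetical.

```python
import numpy as np

def iou_distance(x, y):
    """d(x, y) = 1 - IoU(x, y) for binary masks; two empty masks give distance 0."""
    x, y = x.astype(bool), y.astype(bool)
    union = np.logical_or(x, y).sum()
    if union == 0:
        return 0.0
    return 1.0 - np.logical_and(x, y).sum() / union

def mean_pairwise(a, b):
    """Average distance over all pairs (x, y) with x from a and y from b."""
    return float(np.mean([iou_distance(x, y) for x in a for y in b]))

def energy_distance_sq(samples, ground_truths):
    """Estimate 2 E[d(S, Y)] - E[d(S, S')] - E[d(Y, Y')] from two finite sets."""
    return (2.0 * mean_pairwise(samples, ground_truths)
            - mean_pairwise(samples, samples)
            - mean_pairwise(ground_truths, ground_truths))

# Two disjoint toy masks: top half and bottom half of a 4x4 grid.
a = np.zeros((4, 4), dtype=int); a[:2] = 1
b = np.zeros((4, 4), dtype=int); b[2:] = 1

same = energy_distance_sq([a, b], [a, b])   # identical sets
apart = energy_distance_sq([a], [b])        # fully disjoint sets
```

Identical sets give 0, and the value grows as the samples move away from the ground-truth segmentations or fail to reproduce their diversity.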
Baselines
● Dropout U-Net: samples 1, 2, 3, ...
● U-Net Ensemble: members 1, 2, ..., m
● M-Heads: heads 1, 2, ..., m
● Image2Image VAE: U-Net with samples 𝐳1, 𝐳2, 𝐳3, ... drawn from a Normal Prior, giving outputs 1, 2, 3