Towards Total Scene Understanding: Classification, Annotation and
Segmentation in an Automatic Framework
Li-Jia Li, Richard Socher, Li Fei-Fei
Example image tags: City Travel, Pagoda, Sunrise, Sunshine, Sun
Classification: Weber et al 00; Fergus et al 03; Felzenszwalb et al 04; Fei-Fei et al 05; Sivic et al 05; Bosch et al 06; Oliva et al 01; Lazebnik et al 06
Segmentation: Shi et al 00; Felzenszwalb et al 04; Sali et al 99; Winn et al 05; Kumar et al 05; Cao et al 07; Russell et al 06; Todorovic et al 06
Annotation: Duygulu et al 02; Barnard et al 03; Blei et al 03; Gupta et al 08; Alipr (Li et al 03); Sudderth et al 05
Remark: Approaches in yellow will be used to compare with our model in later Experiments.
Total Scene Understanding
Application
Classification, Annotation, Segmentation: mutually beneficial!
Example (class: Polo): tags Athlete, Horse, Grass, Trees, Sky, Saddle; segmented regions labeled Horse, Athlete, Sky, Tree, Grass.
Related Work:
Tu et al 03: Annotation + Segmentation
Li & Fei-Fei 07: Annotation + Classification
Heitz et al 08: Classification + Segmentation
Outline
• Model
• Learning
• Recognition & Experiment
Model variables: C, O, R, X, Z, S, T; plates: D, Nr, NF, Ar, Nt
Joint distribution of the random variables of a Visual Component and a Text Component, over D images.
Text: Athlete, Horse, Grass, Trees, Sky, Saddle
C: class (e.g. class: Polo)
O: object
R: region features of NF types: Color, Location, Texture, Shape
X: Ar patches per region
Z: “Connector variable”, linking each of the Nt tags to one of the Nr regions
S: “Switch variable”, marking each tag (Athlete, Horse, Grass, Trees, Sky, Saddle) as Visible or Not visible
T: tag (e.g. Horse)
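The variables C, O, R, X, Z, S, T can be read as a generative story. The following is a minimal sketch, assuming a simplified sampling order (class, then one object per region, then region features and patches, then tags through the connector and switch variables). The function `sample_image`, the uniform distributions, the codeword counts, the choice of "Wind" and "Sunshine" as non-visual tags, and the rule that a visible tag simply repeats the object name are all illustrative assumptions, not the distributions used in the talk.

```python
import random

OBJECTS = ["Athlete", "Horse", "Grass", "Trees", "Sky", "Saddle"]
NON_VISUAL_TAGS = ["Wind", "Sunshine"]  # assumed examples of tags with no visible region
FEATURE_TYPES = ["color", "location", "texture", "shape"]  # the NF feature types


def sample_image(rng, n_regions=3, n_patches=2, n_tags=4, n_codewords=5, p_visible=0.8):
    """Sample one toy image from a simplified version of the model (all settings assumed)."""
    c = "Polo"  # C: class; a single class keeps the sketch short
    regions = []
    for _ in range(n_regions):  # plate Nr
        o = rng.choice(OBJECTS)  # O: object of the region (uniform here, not conditioned on C)
        r = {f: rng.randrange(n_codewords) for f in FEATURE_TYPES}  # R: one codeword per type
        x = [rng.randrange(n_codewords) for _ in range(n_patches)]  # X: Ar patch codewords
        regions.append({"O": o, "R": r, "X": x})
    tags = []
    for _ in range(n_tags):  # plate Nt
        s = rng.random() < p_visible  # S: switch, True means the tag is visible
        if s:
            z = rng.randrange(n_regions)  # Z: connector picks one region
            t = regions[z]["O"]  # T: the tag names the connected region's object
        else:
            z = None  # a non-visible tag is not connected to any region
            t = rng.choice(NON_VISUAL_TAGS)
        tags.append({"S": s, "Z": z, "T": t})
    return {"C": c, "regions": regions, "tags": tags}


image = sample_image(random.Random(0))
for tag in image["tags"]:
    print(tag)
```

The point of the sketch is the dependency structure: a visible tag is tied to a region through Z, while a non-visible tag bypasses the visual component entirely.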
Learning
Exact inference is intractable!
Relationship of the random variables:
Top-down force
Bottom-up force from visual information
Bottom-up force from text information
Collapsed Gibbs Sampling
(R. Neal, 2000)
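To illustrate what collapsed Gibbs sampling does, here is a minimal sketch on a much smaller model than the one in the talk: a toy LDA-style model in which every region has one visual codeword and one latent object assignment. The multinomial parameters are integrated out, so only count tables are kept and each assignment is resampled from its conditional given all the others. The function name `collapsed_gibbs`, the hyperparameters `alpha` and `beta`, and the toy data are assumptions for illustration; the full model would additionally sample Z, S and use the class C.

```python
import random


def collapsed_gibbs(docs, n_topics, n_words, alpha=0.5, beta=0.1, n_iters=200, seed=0):
    """Collapsed Gibbs sampling for a toy LDA-style model.

    docs: list of images, each a list of visual codeword ids (one per region).
    Returns the latent assignment (object id) of every region.
    """
    rng = random.Random(seed)
    # Random initial assignments and the count tables they imply.
    z = [[rng.randrange(n_topics) for _ in doc] for doc in docs]
    doc_topic = [[0] * n_topics for _ in docs]
    topic_word = [[0] * n_words for _ in range(n_topics)]
    topic_total = [0] * n_topics
    for d, doc in enumerate(docs):
        for i, w in enumerate(doc):
            k = z[d][i]
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1

    for _ in range(n_iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the current assignment from the counts.
                k = z[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1
                # Conditional with the multinomial parameters integrated out.
                weights = [
                    (doc_topic[d][j] + alpha)
                    * (topic_word[j][w] + beta) / (topic_total[j] + n_words * beta)
                    for j in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]
                # Add the new assignment back into the counts.
                z[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][w] += 1
                topic_total[k] += 1
    return z


# Two groups of toy images that use disjoint codewords (0-2 versus 3-5).
docs = [[0, 1, 2, 0, 1], [1, 2, 0, 2], [3, 4, 5, 3], [4, 5, 3, 5, 4]]
assignments = collapsed_gibbs(docs, n_topics=2, n_words=6)
print(assignments)
```

The first factor of each weight plays the role of a top-down force (what the rest of the image suggests), the second a bottom-up force (how well the codeword fits the object).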
Scene/Event images from the Internet
There is no object-text correspondence… (example tags: Athlete, Horse, Grass, Tree, Saddle)
Our model builds the correspondence…
However, a big obstacle: many objects always co-occur (e.g. Athlete, Horse, Grass appear together in the tag lists "Athlete, Horse, Grass, Trees, Sky, Saddle" and "Athlete, Horse, Grass, Ball").
One solution: a good initialization of O.
Initializing O:
• Obtain internet images for each O (object images).
• Train an object detector for each O on the object images. Any object detection & segmentation algorithm can serve as the black box (here: Cao & Fei-Fei, 2007).
• Initialize O in the scene/event images by the trained object detectors.
Auto-semi-supervised learning: Our Model + small # of initialized images + large # of uninitialized images
Example tag lists of scene/event images:
Athlete, Horse, Grass, Tree, Saddle, Wind
Athlete, Rock, Grass, Tree, Sky, Rope
Athlete, Snow, Tree, Sky, Snowboard
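The mix of a few initialized images and many uninitialized ones can be sketched as a single initialization step that feeds one sampler. This is a minimal sketch under stated assumptions: `initialize_objects` is a hypothetical helper, the detector output is given as plain object ids per region, and uninitialized images start from uniformly random assignments (the talk does not say how they are started).

```python
import random


def initialize_objects(images, detector_labels, n_objects, seed=0):
    """Build initial object assignments O for every region of every image.

    images: dict image_id -> number of regions.
    detector_labels: dict image_id -> list of object ids from a (hypothetical)
        black-box detector, available for a small subset of images only.
    """
    rng = random.Random(seed)
    init = {}
    for image_id, n_regions in images.items():
        if image_id in detector_labels:
            # Small # of initialized images: trust the detector output.
            labels = list(detector_labels[image_id])
            if len(labels) != n_regions:
                raise ValueError(f"detector output does not match regions of {image_id}")
            init[image_id] = labels
        else:
            # Large # of uninitialized images: random start (an assumption).
            init[image_id] = [rng.randrange(n_objects) for _ in range(n_regions)]
    return init


images = {"polo_1": 3, "polo_2": 4, "sailing_1": 2}
detector_labels = {"polo_1": [0, 1, 2]}  # only one image has detector output
print(initialize_objects(images, detector_labels, n_objects=6))
```

All images then go through the same learning procedure, so the initialized subset can steer the assignments of the uninitialized majority.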
Outline
• Model
• Learning
• Recognition & Experiment: Dataset, Learned Model, Results
8 Event/Scene Classes: Badminton, Bocce, Croquet, Polo, Rock climbing, Rowing, Sailing, Snowboarding
Remark: Tags are not used during testing
Learned model: O
Learned model: R (shown for Athlete, Grass, Horse)
Learned model: S
8-way classification: 54%
Annotation baselines: Alipr (Li et al 03), Corr LDA (Blei et al 03)
Effect of top-down class context: Model w/o top-down class (O, R, X, Z, T, S) vs. Full Model (C, O, R, X, Z, T, S)
Example results:
Class: Rock climbing. Regions: Sky, Athlete, Tree, Mountain, Rock. Tags: Athlete, Mountain, Tree, Rock, Sky, Ascent
Class: Sailing. Regions: Sky, Athlete, Water, Tree, Sailboat. Tags: Athlete, Sailboat, Tree, Water, Sky, Wind
Class: Snowboarding. Regions: Tree, Athlete, Snowboard, Snow. Tags: Athlete, Snowboard, Tree, Snow, Sky, Powder
Thanks to Prof. Silvio Savarese, Juan Carlos Niebles, Chong Wang, Barry Chai, Min Sun, Bangpeng Yao, Hao Su, Jia Deng, anonymous reviewers
And You