Perceptual and Sensory Augmented Computing
Discussion Session: Sliding Windows
Sliding Windows – Silver Bullet or Evolutionary Deadend?
Alyosha Efros, Bastian Leibe, Krystian Mikolajczyk
Sicily Workshop, Syracusa, 23.09.2006
What is a Sliding Window Approach?
• Search over space and scale
• Detection as subwindow classification problem
• “In the absence of a more intelligent strategy, any global image classification approach can be converted into a localization approach by using a sliding-window search.”
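The quoted recipe can be written down almost verbatim: scan every position and scale of the image with a fixed-shape window and score each subwindow with a classifier. A minimal sketch, with a hypothetical `score_fn` standing in for the trained classifier:

```python
# Generic sliding-window search over space and scale.
# score_fn(x, y, w, h) is a placeholder for any subwindow classifier.

def sliding_window_detect(image_w, image_h, score_fn,
                          win_w=64, win_h=128, stride=8,
                          scale_step=1.2, threshold=0.5):
    """Search over space and scale; return (x, y, w, h, score) tuples."""
    detections = []
    scale = 1.0
    # Grow the window until it no longer fits inside the image.
    while win_w * scale <= image_w and win_h * scale <= image_h:
        w, h = int(win_w * scale), int(win_h * scale)
        step = max(1, int(stride * scale))
        for y in range(0, image_h - h + 1, step):
            for x in range(0, image_w - w + 1, step):
                s = score_fn(x, y, w, h)  # subwindow classification
                if s >= threshold:
                    detections.append((x, y, w, h, s))
        scale *= scale_step
    return detections
```

Real detectors replace `score_fn` with a classifier evaluated on features of the subwindow; the loop structure is what makes the number of evaluations so large.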
Task: Object Localization in Still Images
• What options do we have to choose from?
• Sliding window approaches
 – Classification problem
 – [Papageorgiou&Poggio,’00], [Schneiderman&Kanade,’00], [Viola&Jones,’01], [Mikolajczyk et al.,’04], [Torralba et al.,’04], [Dalal&Triggs,’05], [Wu&Nevatia,’05], [Laptev,’06],…
• Feature-transform based approaches
 – Part-based generative models, typically with a star topology
 – [Fergus et al.,’03], [Leibe&Schiele,’04], [Fei-Fei et al.,’04], [Felzenszwalb&Huttenlocher,’05], [Winn&Criminisi,’06], [Opelt et al.,’06], [Mikolajczyk et al.,’06],…
• Massively parallel NN architectures
 – e.g. convolutional NNs
 – [LeCun et al.,’98], [Osadchy et al.,’04], [Garcia et al.,??],…
• “Smart segmentation” based approaches
 – Localization based on robustified bottom-up segmentation
 – [Todorovic&Ahuja,’06], [Roth&Ommer,’06]
Sliding-Window Approaches
• Pros:
 – Can draw from vast stock of ML methods.
 – Independence assumption between subwindows: makes classification easier; process can be parallelized.
 – Simple technique, can be tried out very easily: no translation/scale invariance required in model.
 – There are methods to do it very fast: cascades with AdaBoost/SVMs.
 – Good detection performance on many benchmark datasets (e.g. face detection, VOC challenges).
 – Direct control over search range (e.g. on ground plane).
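The "very fast" point is usually realized with attentional cascades: cheap stages reject most windows early, so expensive stages rarely run. A hypothetical sketch of the early-rejection control flow (stage functions and thresholds are illustrative, not any particular detector's):

```python
# Cascade evaluation of one window: each stage adds to a running
# score, and the window is rejected as soon as the score drops
# below that stage's threshold, skipping all remaining stages.

def cascade_score(window, stages):
    """stages: list of (stage_fn, reject_threshold).

    Returns the accumulated score, or None if rejected early."""
    score = 0.0
    for stage_fn, thresh in stages:
        score += stage_fn(window)
        if score < thresh:
            return None  # early rejection: later stages never run
    return score
```

Because background windows vastly outnumber object windows, the average cost per window approaches the cost of the first stage alone.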
Sliding-Window Approaches
• Cons:
 – Can draw from vast stock of ML methods… as long as they can be evaluated in a few ms.
 – Need to evaluate many subwindows (100,000s); needs very fast & accurate classification.
 – Many training examples required; often limited to low training resolution.
 – Can only deal with relatively small occlusions.
 – Still need to fuse resulting detections; hard/suboptimal from binary classification output.
 – Classification task often ill-defined (how to label half a car?).
 – Difficult to deal with changing aspect ratios.
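The detection-fusion problem, merging the many overlapping windows that fire on one object, is commonly attacked with greedy non-maximum suppression over the window scores. A minimal sketch (boxes as `(x, y, w, h, score)` tuples, all names illustrative):

```python
# Greedy non-maximum suppression: visit boxes in order of
# decreasing score and keep a box only if it does not overlap
# an already-kept box too strongly.

def iou(a, b):
    """Intersection-over-union of two (x, y, w, h, ...) boxes."""
    ix = max(0, min(a[0] + a[2], b[0] + b[2]) - max(a[0], b[0]))
    iy = max(0, min(a[1] + a[3], b[1] + b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union else 0.0

def nms(detections, iou_thresh=0.5):
    """Keep the highest-scoring box of each overlapping cluster."""
    kept = []
    for det in sorted(detections, key=lambda d: d[4], reverse=True):
        if all(iou(det, k) < iou_thresh for k in kept):
            kept.append(det)
    return kept
```

This illustrates the "hard/suboptimal" complaint: fusion happens after the fact, on thresholded scores, rather than being part of the model.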
Duality to Feature-Based Approaches…
• How to find maxima in the Hough space efficiently?
• Maxima search = coarse-to-fine sliding window stage!
• Main differences:
 – All features evaluated upfront (instead of in cascade).
 – Generative model instead of discriminative classifier.
 – Maxima search already performs detection fusion.
[Figure: Hough votes → binned accumulator array → candidate maxima → refinement (MSME), each shown in (x, y, s) space]
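The maxima search in the Hough space (binning, candidate maxima, mean-shift refinement) can be illustrated in one dimension; real detectors search over (x, y, s). The bin size, refinement window, and vote threshold below are hypothetical:

```python
# Toy 1-D Hough maxima search: bin continuous votes into an
# accumulator, take locally maximal bins as candidates, then
# refine each candidate with one mean-shift step over the
# original (unbinned) votes.

def hough_maxima(votes, bin_size=10.0, window=10.0, min_votes=2):
    # 1) binned accumulator array
    bins = {}
    for v in votes:
        b = int(v // bin_size)
        bins[b] = bins.get(b, 0) + 1
    # 2) candidate maxima: sufficiently full, locally maximal bins
    candidates = [b for b, c in bins.items()
                  if c >= min_votes
                  and c >= bins.get(b - 1, 0)
                  and c >= bins.get(b + 1, 0)]
    # 3) refinement: mean of the votes near each candidate center
    refined = []
    for b in candidates:
        center = (b + 0.5) * bin_size
        near = [v for v in votes if abs(v - center) <= window]
        if near:
            refined.append(sum(near) / len(near))
    return refined
```

Note how step 2 is exactly a coarse sliding-window scan over the accumulator, which is the duality the slide points out.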
So What is Left to Oppose?
1. Feature-based vs. Window-based?
2. (Almost) exclusive use of discriminative methods
3. Low training resolutions
4. How to deal with changing aspect ratios?
1. Feature-based vs. Window-based
• May be mainly an implementation trade-off
 – Few, localized features → feature-based evaluation better
 – Many, dense features → window-based evaluation better
 – Noticed already by e.g. [Schneiderman,’04]
 – The trade-offs may change as your method develops…
2. Exclusive Use of Discriminative Methods
[Figure: ISM recognition pipeline [Leibe & Schiele,’04] – interest points → matched codebook entries → probabilistic voting → 3D voting space (x, y, s, continuous) → backprojection of maxima → backprojected hypotheses → segmentation with p(figure) probabilities. A generative model inside!]
Generative Models for Sliding Windows
• Continuous confidence scores
 – Smoother maxima in hypothesis space
 – Coarser sampling possible
• Backprojection capability
 – Determine a hypothesis’s support in the image
 – Resolve overlapping cases
• Easier to deal with partial occlusion
 – Part-based models
 – Reasoning about missing parts
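The backprojection capability can be sketched simply: if each vote remembers which image feature cast it, then after a maximum is found we can collect exactly the features that support the hypothesis, and use that evidence to resolve overlapping detections. Votes here are hypothetical `(position, feature_id)` pairs in a 1-D toy space:

```python
# Backprojection in a toy 1-D voting space: recover the set of
# image features whose votes landed near a hypothesis maximum.

def backproject(votes, maximum, window=5.0):
    """votes: iterable of (position, feature_id) pairs.

    Returns the sorted ids of features supporting the maximum."""
    return sorted({fid for pos, fid in votes
                   if abs(pos - maximum) <= window})
```

Two overlapping hypotheses can then be compared by how much image support each would lose if the other claimed the shared features.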
Sliding Windows for Generative Models
• Apply cascade idea to generative models
 – Discriminative training
 – Evaluate most promising features first
• Direct control over search range
 – Only need to evaluate positions in search corridor
 – Only need to consider subset of features
 – Easier to adapt to different geometry (e.g. curved ground surface)
⇒ Should combine discriminative and generative elements!
[Figure: search corridor in (x, y, s) space]
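The search-corridor idea can be made concrete with a ground-plane assumption: an object's image height is tied to its foot position, so only windows inside a narrow corridor of the (y, scale) space need evaluating. The linear size model below is a deliberately simplified stand-in for a real camera calibration, and all parameters are hypothetical:

```python
# Enumerate only the (foot_y, window_height) pairs consistent
# with a flat ground plane: expected object height shrinks
# linearly from the image bottom toward the horizon line.

def corridor_windows(image_h, horizon_y, obj_height_px_at_bottom,
                     stride=8, tolerance=0.25):
    """Yield (foot_y, window_h) pairs inside the search corridor."""
    windows = []
    for foot_y in range(horizon_y + 1, image_h + 1, stride):
        expected_h = (obj_height_px_at_bottom
                      * (foot_y - horizon_y) / (image_h - horizon_y))
        lo = int(expected_h * (1 - tolerance))
        hi = int(expected_h * (1 + tolerance))
        for h in range(max(8, lo), hi + 1, stride):
            windows.append((foot_y, h))
    return windows
```

Compared with the full position×scale grid, only a small fraction of windows survives the geometric constraint, which is exactly the "direct control over search range" advantage.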
3. Low Training Resolutions
• Many current sliding-window detectors operate on tiny images
 – Viola&Jones: 24×24 pixels
 – Torralba et al.: 32×32 pixels
 – Dalal&Triggs: 64×96 pixels (notable exception)
• Main reasons
 – Training efficiency (exhaustive feature selection in AdaBoost)
 – Evaluation speed
 – Want to recognize objects at small scales
• But…
 – Limited information content available at those resolutions
 – Not enough support to compensate for occlusions!
4. Changing Aspect Ratios
• Sliding window requires fixed window size
 – Basis for learning efficient cascade classifier
• How to deal with changing aspect ratios?
 – Fixed window size → wastes training dimensions
 – Adapted window size → difficult to share features
 – “Squashed” views [Dalal&Triggs] → need to squash test image, too
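The "squashed views" option amounts to warping every example to one fixed window size regardless of its original aspect ratio, and applying the same anisotropic scaling to each test subwindow before classification. A minimal nearest-neighbour resampling sketch (pure index arithmetic, images as lists of rows):

```python
# Anisotropically resize an image to a fixed out_w x out_h window,
# ignoring its original aspect ratio ("squashing").

def squash(image, out_w, out_h):
    """image: non-empty list of equal-length rows."""
    in_h, in_w = len(image), len(image[0])
    return [[image[y * in_h // out_h][x * in_w // out_w]
             for x in range(out_w)]
            for y in range(out_h)]
```

The price, as the slide notes, is that every test subwindow must undergo the same warp, and elongated objects lose discriminative detail along the squashed axis.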
• What is wrong with sliding window? Search complexity?
• Is there anything that cannot be done with sliding window?