© 2010 Adobe Systems Incorporated. All Rights Reserved.
Lubomir Bourdev | Sr. Research Scientist
From PostScript to Face Detectors: How Computer Vision is Transforming Adobe
Overview
Overview of computer vision features and papers
Evolution of product features
Challenges of adopting vision
Technology challenges
UI challenges
Common misconceptions about vision
Future
Videos of some recent papers
About Adobe
Founded in 1982 by John Warnock and Chuck Geschke
8300 employees (March 2010)
Most popular products: Photoshop, Acrobat, Flash
Over 100 other products
Computer Vision Features in our Products
2005-2006
Photoshop Elements 4 Auto Red Eye
Photoshop Elements 4 Face Tagging
Photoshop Elements 4 Shadow-Highlight
2007-2008
Photoshop CS3 Auto-Align Layers
Photoshop CS3 Auto-Blend Layers
Photoshop CS4 Improved Seam Carving
Photoshop CS4 Extended Depth of Field
Photoshop CS4 Improved Color Range Selection
Photoshop CS4 Auto Skin Tone Masks
Photoshop CS4 Vignette and Exposure Correction
Photoshop CS4 Fisheye Correction and Alignment
Photoshop CS4 Enhanced Image Correspondence
Photoshop Elements 6 Photomerge Group Shot
Photoshop Elements 7 Photomerge Scene Cleaner
AfterEffects CS4 Fast Bilateral Filtering
2009-2010
(4 unreleased features)
AfterEffects CS5 Roto Brush
Photoshop CS5 Content-Aware Fill
Photoshop CS5 New Sharpen Tool
Photoshop CS5 Color Decontamination
Photoshop CS5 Smart Radius
Photoshop CS5 Tone Mapping
Premiere CS5 Face Detection
Photoshop Elements 8 Recompose
Photoshop Elements 8 People Recognition
Photoshop Elements 8 Photomerge Exposure
CS5 Lens Profile Creator
Computer Vision Papers with Adobe Authors
2005-2006
CVPR05 Bourdev & Brandt; ICCV05 Cohen; ICCV05 Jin et al.; ICCV05 Vedaldi et al.; IJCV05 Jin et al.; SIGGRAPH06 Agarwala et al.
2007-2008
CVPR08 Boiman et al.; CVPR08 Cho et al.; CVPR08 Jin; CVPR08 Simakov et al.; CVPR08 Sunkavalli et al.; CVPR08 Wang et al.; CVPR08 Zadicario et al.; ECCV08 Kuthirummal et al.; ECCV08 Levin et al.; ECCV08 Paris; ECCV08 Wang et al.; IJCV08 Jin et al.; JMIV07 Jin et al.; PAMI08 Szeliski et al.; PAMI07 Zeng et al.; SIGGRAPH08 Hsu et al.; SIGGRAPH08 Rubinstein et al.; SIGGRAPH08 Shan et al.; SIGGRAPH ASIA09 Barnes et al.
2009-2010
CVPR09 Rhemann et al.; CVPR09 Smith et al.; CVPR09 Zhang et al.; CVPR10 Brandt; CVPR10 Price et al.; CVPR10 Price et al.; CVPR10 Shechtman et al.; ECCV10 Bai et al.; ECCV10 Barnes et al.; ECCV10 Bourdev et al.; ECCV10 Kemelmacher-Shlizerman et al.; ECCV10 Lin & Brandt; ECCV10 Tao et al.; ECCV10 Vazquez-Reina et al.; ICCV09 Bourdev & Malik; ICCV09 Dale et al.; ICCV09 Price et al.; ICCV09 Ni et al.; ICCV09 Smith et al.; IJCV09 Paris & Durand; PAMI10 Goldman; PAMI10 Goldman et al.; SIGGRAPH ASIA09 Bousseau et al.; SIGGRAPH ASIA09 Chen et al.; SIGGRAPH09 Bai et al.; SIGGRAPH09 Barnes et al.; SIGGRAPH09 Carroll et al.; SIGGRAPH09 Liu et al.; SIGGRAPH09 Wang & Popovic; SIGGRAPH09 Rubinstein et al.; SIGGRAPH10 Barnes et al.; SIGGRAPH10 Carroll et al.
Academic Collaborators
Brigham Young University
Carnegie Mellon University
Chinese University of Hong Kong
Columbia
Georgia Institute of Technology
Harvard
Hong Kong University of Science and Technology
INRIA/Grenoble
Max Planck Institute
MIT
Northwestern University
NYU
Princeton
Stanford
Tel-Aviv University
Telecom ParisTech
U.C. Berkeley
University of British Columbia
University of Kentucky
University of Michigan
University of Minnesota
University of Toronto
University of Washington
University of Wisconsin
Weizmann Institute
First Computer Vision feature
Adobe Acrobat Capture
Project started in 1992 and shipped in 1994
Based on Adobe-purchased OCR Systems and NTI Technologies
OCR solution using a combination of template matching and neural networks
Can recognize and preserve fonts
Won several product of the year awards
Acrobat ClearScan
Red Eye Correction
Ramesh Gupta, Gregg Wilensky, Jon Brandt
PSE 1.0: Must click on the red portion. Basic segmentation of red area.
PSE 2.0: Must click anywhere on the eye. Template matching to locate the eye.
PSE 4.0: Fully automatic. Face detector to find the face.
Future version: Use other images of the same person. Face detector + face recognizer.
From low-level image processing to high-level computer vision.
From pixel-level context to cross-image context.
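The PSE 2.0 step, template matching to locate the eye, can be illustrated with a small sketch. This is not Adobe's implementation: the sum-of-squared-differences score, the toy grayscale grid, and the names `match_template` and `eye_template` are assumptions chosen for illustration.

```python
def match_template(image, template):
    """Return ((row, col), ssd) for the window that best matches the template.

    The score is the sum of squared differences (SSD); lower is better.
    """
    ih, iw = len(image), len(image[0])
    th, tw = len(template), len(template[0])
    best_pos, best_ssd = None, float("inf")
    for r in range(ih - th + 1):
        for c in range(iw - tw + 1):
            ssd = sum(
                (image[r + dr][c + dc] - template[dr][dc]) ** 2
                for dr in range(th)
                for dc in range(tw)
            )
            if ssd < best_ssd:
                best_pos, best_ssd = (r, c), ssd
    return best_pos, best_ssd


# Toy grayscale image: a dark "pupil" pattern in the lower-right corner.
image = [
    [9, 9, 9, 9, 9],
    [9, 9, 9, 9, 9],
    [9, 9, 8, 1, 8],
    [9, 9, 1, 0, 1],
    [9, 9, 8, 1, 8],
]
eye_template = [
    [8, 1, 8],
    [1, 0, 1],
    [8, 1, 8],
]
position, score = match_template(image, eye_template)
```

An exhaustive scan like this is only practical near a user click, which fits the "click anywhere on the eye" workflow; the fully automatic version needs a face detector to narrow the search first.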
Photo Merge
John Peterson, Hailin Jin, Aseem Agarwala
Photoshop CS 1: Intensity-based registration. Requires careful manual alignment.
Photoshop CS 3: Fully automatic. Feature extraction, RANSAC, bundle adjustment, graph-cut.
Photoshop CS 4: Spherical composition, lens correction, fish-eye.
Photoshop CS5: Recognizes camera model; camera-specific calibration.
From pixel-level context to cross-image context.
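Of the CS 3 pipeline stages, RANSAC is the easiest to show in a few lines. The sketch below is a simplified illustration, not the Photoshop algorithm: it fits a pure 2D translation to point matches, whereas real panorama alignment fits richer models such as homographies or rotations. The function name, threshold, and sample data are assumptions.

```python
import random


def ransac_translation(matches, threshold=1.0, iterations=100, seed=0):
    """Estimate a 2D translation (dx, dy) from point matches that contain outliers.

    matches: list of ((x1, y1), (x2, y2)) pairs between two images.
    Returns the refined translation and the inlier matches that support it.
    """
    rng = random.Random(seed)
    best_inliers = []
    for _ in range(iterations):
        # A translation needs only one match as its minimal sample.
        (x1, y1), (x2, y2) = rng.choice(matches)
        dx, dy = x2 - x1, y2 - y1
        inliers = [
            (a, b) for a, b in matches
            if abs(a[0] + dx - b[0]) <= threshold
            and abs(a[1] + dy - b[1]) <= threshold
        ]
        if len(inliers) > len(best_inliers):
            best_inliers = inliers
    # Refine the model by averaging over all inliers of the best hypothesis.
    n = len(best_inliers)
    dx = sum(b[0] - a[0] for a, b in best_inliers) / n
    dy = sum(b[1] - a[1] for a, b in best_inliers) / n
    return (dx, dy), best_inliers


# Five consistent matches (shift of +10, +5) and two wrong matches.
good = [((x, y), (x + 10, y + 5)) for x, y in [(0, 0), (3, 1), (7, 4), (2, 9), (8, 8)]]
bad = [((1, 1), (40, -3)), ((5, 5), (0, 0))]
translation, inliers = ransac_translation(good + bad)
```

The wrong matches each support only themselves, so the hypothesis drawn from any correct match wins the vote. This is what lets the alignment run without the manual placement that intensity-based registration required.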
Selection Tools
Gregg Wilensky, Scott Cohen, Jue Wang, Jeff Chien
Photoshop 3 Magic Wand: Selection based on color difference. Works only for objects with uniform color.
Photoshop 7 Extract: Tri-map. Requires careful tracing of the object outline.
Photoshop CS 3 Quick Selection: PDEs + GraphCut. Paint some foreground. Optionally paint background.
Future version: Top-down + bottom up information. One or a couple of clicks to select.
From low-level image processing to high-level computer vision.
From lots of user’s time to very quick.
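A selection based on color difference, in the spirit of the Magic Wand, can be sketched as a flood fill from the clicked pixel. This is an illustration under stated assumptions, not the Photoshop implementation: 4-connectivity, a maximum per-channel difference as the color distance, and the name `magic_wand` are all choices made here.

```python
from collections import deque


def magic_wand(image, seed, tolerance):
    """Select the 4-connected region whose colors are within `tolerance` of the seed pixel."""
    rows, cols = len(image), len(image[0])
    seed_color = image[seed[0]][seed[1]]

    def close(color):
        # Largest per-channel difference to the seed color.
        return max(abs(a - b) for a, b in zip(color, seed_color)) <= tolerance

    selected = {seed}
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and (nr, nc) not in selected and close(image[nr][nc]):
                selected.add((nr, nc))
                queue.append((nr, nc))
    return selected


RED, DARK_RED, BLUE = (200, 30, 30), (190, 35, 28), (20, 40, 220)
image = [
    [RED, DARK_RED, BLUE],
    [RED, BLUE, BLUE],
    [BLUE, BLUE, RED],
]
selection = magic_wand(image, seed=(0, 0), tolerance=16)
tight = magic_wand(image, seed=(0, 0), tolerance=5)
```

With the tighter tolerance the slightly darker red pixel drops out of the selection, and the isolated red pixel in the corner is never reached. Both effects show why this approach works only for objects with uniform color.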
Tagging People in Photo Albums
Lubomir Bourdev, Alex Parenteau
PSE 1: Simple tagging field. Manually look at every image and type names of people.
PSE 4: Face detector. User must label each face in a grid of faces.
PSE 8: Face detector + face recognizer. User labels some faces and corrects remaining labels.
Future version: Detect people not facing the camera. Parse clothes.
From low-level image processing to high-level computer vision.
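The PSE 8 workflow, where the user labels some faces and the recognizer proposes names for the rest, can be illustrated with a nearest-neighbor sketch. The two-number descriptors, the Euclidean distance, and the name `suggest_names` are hypothetical; the slides do not describe the actual recognizer.

```python
import math


def suggest_names(labeled, unlabeled):
    """Suggest, for each unlabeled face, the name of its nearest labeled face.

    labeled: list of (descriptor, name) pairs the user has already tagged.
    unlabeled: dict mapping face id to descriptor.
    """
    suggestions = {}
    for face_id, descriptor in unlabeled.items():
        nearest = min(labeled, key=lambda item: math.dist(descriptor, item[0]))
        suggestions[face_id] = nearest[1]
    return suggestions


# Hypothetical 2D face descriptors; real descriptors have many more dimensions.
labeled = [((0.1, 0.9), "Ann"), ((0.8, 0.2), "Bob")]
unlabeled = {"face_3": (0.2, 0.8), "face_4": (0.9, 0.1), "face_5": (0.6, 0.4)}
suggestions = suggest_names(labeled, unlabeled)
```

Every suggestion here is only a proposal: a face such as `face_5`, which sits between the two labeled examples, is the kind the user would be expected to check and correct.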
Technology challenges – “Solved” problems
Is face detection solved?
Why are there no papers about face detection anymore?
Detecting text in images. Solved problem?
Face recognition “in the wild”, with low-res images, motion blur, profile view.
Technology Challenges – Different Context
The best performing technology on standard tests may not be the optimal one for our needs
Face detector
CMU-MIT set vs. our needs:
Low resolution vs. high resolution
Grayscale images vs. color images
Upright faces vs. large in-plane rotation
Face recognizer
Cooperative vs. non-cooperative subject
Controlled environment vs. non-controlled
Sunglasses. Hair style. Clothes.
For traditional FR sunglasses are noise. In our case they are a useful signal.
Scalability Challenges
Methods that are acceptable in academia may not scale well and may not be applicable to industry.
Adding scalability can be non-trivial
Scalability is not always just an engineering problem.
Face tagging: How do we avoid computing all N² face-to-face distances?
The differences in requirements can be staggering
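One generic way to avoid all-pairs distances is to compare each face only against a small set of cluster representatives. The sketch below shows greedy "leader" clustering as an example of the idea; it is an assumption for illustration, not the method used in Elements, and the descriptors, `radius`, and function name are hypothetical.

```python
import math


def cluster_faces(descriptors, radius):
    """Greedy leader clustering: each face is compared only to cluster leaders.

    Returns the cluster index of every face and the number of distance
    computations performed.
    """
    leaders = []       # one representative descriptor per cluster
    assignments = []   # cluster index for each input face
    comparisons = 0
    for d in descriptors:
        best, best_dist = None, radius
        for i, leader in enumerate(leaders):
            comparisons += 1
            dist = math.dist(d, leader)
            if dist <= best_dist:
                best, best_dist = i, dist
        if best is None:
            # No leader is close enough: this face starts a new cluster.
            leaders.append(d)
            best = len(leaders) - 1
        assignments.append(best)
    return assignments, comparisons


faces = [(0.0, 0.1), (5.0, 5.0), (0.1, 0.0), (5.1, 4.9), (0.0, 0.0), (4.9, 5.1)]
assignments, comparisons = cluster_faces(faces, radius=1.0)
```

For these six faces the sketch performs 9 distance computations instead of the 15 needed for all pairs, and the cost grows with N times the number of clusters, not N². The trade-off is that the result depends on input order and on the chosen radius, which is one reason scalability is not just an engineering problem.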
UI Design Challenges
Traditional design model:
UI designer creates a feature spec to optimize user experience
Engineers implement the spec
Any deviations from the spec are “bugs” to be fixed
Computer vision requires designing features that complement the strengths and weaknesses of the underlying technology
Elements chooses which faces to label
Failures of the technology are not bugs but should be part of the workflow
Successful features require active collaboration
UI Designer, Engineer, Computer Vision Researcher
UI Challenges
UI does not leverage the full knowledge of the engine
UI is forced to make specific label proposals, so:
If it makes too few proposals it won’t help the user much
If it makes too many, the user will spend a lot of time correcting wrong labels
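The trade-off between too few and too many proposals can be made concrete with a confidence threshold. The sketch below is a hypothetical illustration: the recognizer output, the confidence values, and the function names are invented for the example and do not come from the product.

```python
def propose_labels(predictions, threshold):
    """Return (face_id, label) proposals for predictions at or above the threshold."""
    return [(face, label) for face, label, conf in predictions if conf >= threshold]


def review_cost(predictions, truth, threshold):
    """Count proposals that are right, proposals the user must fix, and faces left untagged."""
    proposals = propose_labels(predictions, threshold)
    correct = sum(1 for face, label in proposals if truth[face] == label)
    return {
        "correct": correct,
        "wrong": len(proposals) - correct,
        "unlabeled": len(predictions) - len(proposals),
    }


# Hypothetical recognizer output: (face id, proposed name, confidence).
predictions = [
    ("f1", "Ann", 0.95), ("f2", "Bob", 0.90), ("f3", "Ann", 0.60),
    ("f4", "Bob", 0.55), ("f5", "Ann", 0.30), ("f6", "Bob", 0.20),
]
truth = {"f1": "Ann", "f2": "Bob", "f3": "Ann", "f4": "Ann", "f5": "Bob", "f6": "Bob"}

strict = review_cost(predictions, truth, threshold=0.9)
loose = review_cost(predictions, truth, threshold=0.1)
```

The strict setting makes no mistakes but leaves four faces for the user to tag by hand; the loose setting tags everything but produces two wrong labels to correct. A UI that could show the engine's uncertainty, not just a single proposal per face, would not be forced into this choice.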
Computer Vision is Too Hard
“How did you do this??”
“This feels like magic”
“Do you extract my facial features and give them to the government?”
For many people, Photoshop Elements 4.0 (2005) was their first encounter with object recognition and computer vision
Computer Vision is Too Easy
“If you can detect faces, why can’t you detect dogs?”
Computer vision must work perfectly
“The Adobe face detector has a bug: it thinks my chair is a face!”
Future
Intelligent image/video manipulation
Click on a person to select. Press delete to remove
Click on the hair. Change the hairstyle
Turn the head towards the camera.
Intelligent fill
Relighting
Cut and paste for images and video
Leveraging context (cat in one image – correct the other)
Intelligent search
Cloud architecture
Constantly improving via online training
New modalities (HDR, stereo, depth)
Overview
Overview of computer vision features and papers
Evolution of product features
Challenges of adopting vision
Technology challenges
UI challenges
Common misconceptions about vision
Future
Recent work from Adobe