Components of a computer vision system
Lighting
Scene
Camera
Computer
Scene Interpretation
Srinivasa Narasimhan’s slide
Computer Vision vs. Human Vision
What we see
What a computer sees
Srinivasa Narasimhan’s slide
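A minimal sketch of the contrast above, assuming the slide sets a picture beside the grid of pixel intensities a program actually receives. The 5x5 image, the threshold, and the helper names render and mean_intensity are hypothetical and chosen only for illustration.

```python
# A hypothetical 5x5 grayscale image: 0 is black, 255 is white.
image = [
    [0, 0, 255, 0, 0],
    [0, 0, 255, 0, 0],
    [255, 255, 255, 255, 255],
    [0, 0, 255, 0, 0],
    [0, 0, 255, 0, 0],
]


def render(img, threshold=128):
    """Draw bright pixels as '#' and dark pixels as '.' (closer to what we see)."""
    return ["".join("#" if v >= threshold else "." for v in row) for row in img]


def mean_intensity(img):
    """Average pixel value: the kind of quantity a program can compute directly."""
    values = [v for row in img for v in row]
    return sum(values) / len(values)


if __name__ == "__main__":
    # What a computer sees: rows of numbers.
    for row in image:
        print(row)
    # What we see: a shape (here, a plus sign).
    print("\n".join(render(image)))
    print("mean intensity:", mean_intensity(image))
```

The program only ever handles the numbers; recognizing that they form a plus sign is the part that computer vision has to supply.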
A little story about Computer Vision
In 1966, Marvin Minsky at MIT asked his undergraduate student Gerald Jay Sussman to “spend the summer linking a camera to a computer and getting the computer to describe what it saw.” We now know that the problem is slightly more difficult than that. (Szeliski 2009, Computer Vision)
Minsky: Founder, MIT AI project
Sussman: Professor of Electrical Engineering, MIT
Image Understanding
Image Sensing
Continue on CAPTCHA
CAPTCHA stands for "Completely Automated Public Turing test to Tell Computers and Humans Apart".
Picture of a CAPTCHA in use at Yahoo.
http://www.cs.sfu.ca/~mori/research/gimpy/
Breaking a Visual CAPTCHA
http://www.cs.sfu.ca/~mori/research/gimpy/
On EZ-Gimpy: a success rate of 176/191 = 92%!
Other examples: http://www.cs.sfu.ca/~mori/research/gimpy/ez/
On the more difficult Gimpy: a success rate of 33%!
Other examples: http://www.cs.sfu.ca/~mori/research/gimpy/hard/
Breaking a Visual CAPTCHA
YAHOO’s current CAPTCHA format
http://en.wikipedia.org/wiki/CAPTCHA
Face Detection and Recognition
Applications: Security, Law Enforcement, Surveillance
Smart cameras: auto focus, red eye removal, auto color correction
Face Detection and Tracking
Lexus LS600 Driver Monitor System
General Motion Tracking
Crouching Tiger, Hidden Dragon
Application: Andy Serkis as Gollum in The Lord of the Rings
Segmentation
http://www.eecs.berkeley.edu/Research/Projects/CS/vision/bsds/
Segmentation using Graph Cuts
Application: Medical Image Processing
Segmentation using Graph Cuts
Input
Matting: Soft Segmentation
Composition
Segmentation using Graph Cuts
State-of-the-art Tool (videosnapcut.mp4): http://juew.org/projects/SnapCut/snapcut.htm
From 2D to 3D
http://www.eecs.harvard.edu/~zickler/helmholtz.html
Projective Geometry
Single View Metrology
• http://research.microsoft.com/vision/cambridge/3d/default.htm
Stereo
scene point
optical center
image plane
Stereo
Basic Principle: Triangulation
• Gives reconstruction as intersection of two rays
• Requires
  – Camera positions
  – Point correspondence
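The triangulation principle above can be sketched in a few lines. This is an illustrative example rather than the method used in any of the systems shown: it assumes both camera positions (optical centers) are known and that the point correspondence has already been converted into one viewing ray per camera. Because measured rays rarely intersect exactly, the sketch returns the midpoint of the shortest segment between them. The function name triangulate_midpoint and the sample coordinates are hypothetical.

```python
from typing import Sequence, Tuple

Vec3 = Tuple[float, float, float]


def _dot(u: Sequence[float], v: Sequence[float]) -> float:
    return sum(a * b for a, b in zip(u, v))


def triangulate_midpoint(c1: Sequence[float], d1: Sequence[float],
                         c2: Sequence[float], d2: Sequence[float]) -> Vec3:
    """Approximate the intersection of two viewing rays.

    c1, c2: optical centers (camera positions).
    d1, d2: ray directions through the corresponding image points.
    Returns the midpoint of the shortest segment joining the two rays.
    """
    w = [p - q for p, q in zip(c1, c2)]
    a, b, c = _dot(d1, d1), _dot(d1, d2), _dot(d2, d2)
    d, e = _dot(d1, w), _dot(d2, w)
    denom = a * c - b * b  # vanishes when the rays are parallel
    if denom <= 1e-12 * a * c:
        raise ValueError("rays are (nearly) parallel; cannot triangulate")
    s = (b * e - c * d) / denom  # parameter of the closest point on ray 1
    t = (a * e - b * d) / denom  # parameter of the closest point on ray 2
    q1 = [p + s * u for p, u in zip(c1, d1)]
    q2 = [p + t * u for p, u in zip(c2, d2)]
    return tuple((x + y) / 2 for x, y in zip(q1, q2))


if __name__ == "__main__":
    # Two cameras one unit apart, both looking at the scene point (0.5, 0.2, 2.0).
    point = triangulate_midpoint((0.0, 0.0, 0.0), (0.5, 0.2, 2.0),
                                 (1.0, 0.0, 0.0), (-0.5, 0.2, 2.0))
    print(point)  # close to (0.5, 0.2, 2.0)
```

The parallel-ray check reflects why both requirements matter: without a baseline between the camera positions, or with a wrong correspondence, the two rays do not constrain depth.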
Using 3D structure to organize photos
http://phototour.cs.washington.edu/
Using 3D structure to organize photos
http://photosynth.net/
Reconstructing detailed 3D models
example input image
rendered model
http://grail.cs.washington.edu/projects/mvscpc/
Application: View morphing
From Static Statues to Dynamic Targets
http://research.microsoft.com/~larryz/videoviewinterpolation.htm
MSR Image based Reality Project
Video Projectors
Color Cameras
Black & White Cameras
Spacetime Face Capture System
System in Action
Input Videos (640×480, 60 fps)
Spacetime Stereo Reconstruction
Applications
Entertainment: Games & Movies
Medical Practice: Prosthetics
Computational Photography
• High Dynamic Range
Conventional Image
High Dynamic Range Image (Nayar et al. 2002)
Sensor
Optics
Modulator
Assorted-pixel camera
Digital Gain Adjustment
High Dynamic Range Image (Zhang et al. 2010)
Handheld camera
Summary
• Recognize things
• Reconstruct 3D structures
• Enhance Photography
If you are interested:
Courses:
• CS766 Computer Vision
• CS638 Special Topics: Computational Photography
• CS638 Special Topics: Computational Methods in Medical Image Analysis
Faculty: Chuck Dyer, Vikas Singh, Li Zhang
Major Conferences:
• Computer Vision and Pattern Recognition (CVPR)
• International Conference on Computer Vision (ICCV)
• European Conference on Computer Vision (ECCV)
• ACM SIGGRAPH Conference (SIGGRAPH)