January 14, 2014 Sam Siewert
CS A485 Computer and Machine Vision
Lecture 1 – Introduction Part-1
The Course
An introductory course on computer vision and machine vision. Topics covered include the difference between computer and machine vision, image capture and processing, filtering, thresholds, edge detection, shape analysis, shape detection, pattern matching, digital image stabilization, stereo ranging, 3D models from images, real-time vision systems, recognition of targets, and applications including inspection, surveillance, search and rescue, and machine vision navigation.
http://www.cse.uaa.alaska.edu/~ssiewert/a485.html
Richard Szeliski, Computer Vision: Algorithms and Applications, Springer, 2011. (ISBN 978-1-84882-934-3)
Gary Bradski, Adrian Kaehler, Learning OpenCV, 2nd Edition, O’Reilly, 2012. (ISBN 978-1449314651)
Sam Siewert
UC Berkeley – National Research University, Philosophy/Physics
University of Notre Dame, BS – Private, Aerospace/Mechanical
Johnson Space Center, U. of Houston – UHCL Computer Engineering, National R&D Center, Mission Control Center
U. of Colorado, Boulder, MS/PhD – Growing Research University, Gov’t Labs, Start-Ups, Computer Science
Interdisciplinary Teaching & Research – Aerospace/Mechanical, Computer Science, Computer Engineering
CU Boulder Senior Instructor, Adjunct Professor CTO, Architect, Developer/Engineer
1984-85
1985-89
1989-92
1992-today
Related Industry Background
General Experience (~24 Years in Embedded and Scalable Systems)
– Intel Architecture Group (Atom, Scalable Cloud Solutions)
– CTO at Atrato Inc., a Digital Media Storage Start-up in Broomfield
– Consulting with Numerous Digital Media Firms
– 12 Years NASA JSC, NASA JPL / CU, NASA JPL / Ball Aerospace
– 12 Years Commercial Telecom, Storage/Networks, Embedded, Digital Video
Machine Vision
– Spitzer Space Telescope – Sky-scan Mosaics, Super-resolution, Peak-Up
– Optical Navigation – JPL
– Robotics at CU-Boulder
Computer Graphics
– UAV/UAS Digital Video and Graphics Overlays for Frame Latency Indication
– Space Station/Shuttle Mission Control Real-time Avionics Displays
Digital Media
– Real-Time Digital Video Frame Transformation (1080p, 60Hz), Color Enhancement
– Digital Media Storage and Networking
Course Topics
Computer Vision
– Emulation and Replication of Human-like Vision with Computers
– Goal is to Understand, Model, and Augment Human Vision
– Provide Robotics with Human-like Vision Capability
Machine Vision
– Using Digital Cameras to Automate a Process (e.g. Printed Circuit Board Inspection, Sorting Recycling, Security Cameras)
– Compared to Computer Vision (More Practical Application)
– Applications (Optical Navigation, Sorting, Segmentation and Recognition)
Digital Media
– Digital Video Encoding/Decoding
– Some Overlap with A490 Course, but Only to Extent Needed to Understand CV/MV Cameras and Transport
Linux-based Labs
Computer Vision
http://en.wikipedia.org/wiki/File:CVoverview2.svg
Why Linux?
… From Game Consoles to Super-Computing
From Android Mobiles to GIS and Digital Video Services
Huge Value in Open Source Drivers, Tools, and Applications – Speeds Up Time to Market
Focus on Leveraging Linux for Desktop and Embedded Systems for Machine Vision and Graphics
http://www.nscc-tj.gov.cn/en/
Tianhe-1 Pflop Blue Gene PS3 GPGPU
Why OpenCV
Long History of Computer Vision Captured in One C/C++ Library
Open Source
Runs on Linux (Easy Ubuntu Install)
Students and Instructors Love It!
Abstracts Low-Level Algorithmic Details
– We Will Leverage, but Also Implement Ground-Up to Learn
– Provides Algorithm Comparison and Abstracted CV Design
Well Documented on the Web and in Books
Machine Vision Systems
Camera Basics – Extrinsic and Intrinsic
Embedded Systems for Machine Vision
Fundamentals
– Background Elimination
– Edge Enhancement and Other Convolutions
Optical Navigation
– Segmentation Methods
– Tracking (Centroid of Object)
Stereo Vision
– Distance Estimation Methods (Disparity)
– Binocular Vision vs. RGB-Depth Mappers (PrimeSense, Asus Xtion, Creative Cam)
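The disparity-based distance estimation listed under Stereo Vision can be illustrated with the standard pinhole relation for a rectified binocular pair, Z = f · B / d. The following is a minimal sketch, not course lab code: the function name, the focal length of 700 pixels, and the 10 cm baseline are all assumed values chosen only for illustration.

```python
def depth_from_disparity(focal_px, baseline_m, disparity_px):
    """Distance Z in meters for a rectified stereo pair: Z = f * B / d.

    focal_px     -- focal length in pixels (assumed known from calibration)
    baseline_m   -- distance between the two camera centers, in meters
    disparity_px -- horizontal shift of the same point between left and right images
    """
    if disparity_px <= 0:
        # Zero disparity corresponds to a point at infinity.
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px


if __name__ == "__main__":
    f_px = 700.0      # hypothetical focal length in pixels
    baseline = 0.10   # hypothetical 10 cm baseline
    for d in (70.0, 35.0, 7.0):
        z = depth_from_disparity(f_px, baseline, d)
        print(f"disparity {d:5.1f} px -> distance {z:.2f} m")
```

Because distance is inversely proportional to disparity, a one-pixel matching error matters far more for distant objects (small disparity) than for near ones.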
Digital Media Systems
Embedded Media Devices
– Set-Top Boxes (Linux)
– Mobile Media Systems: Smart Phones, Tablet Computing, Readers, Notebooks, DVD Players, iPods, etc.
– Digital Camera Systems (SD, HD, HD-SDI, 2K, 4K, 6K)
Resolutions/Formats - http://en.wikipedia.org/wiki/File:Vector_Video_Standards2.svg
– Game Consoles: Xbox, PS3, etc.
– Gesture Recognition, Augmented Reality
– SD, HD Cameras and Interfaces: Composite, S-Video, Component, DVI, HDMI
Scalable Digital Media Server Systems (Head End)
– Post Production for Digital Cinema, TV, Web
2K, 4K, 6K Streams from Digital Cameras
Frame/Color Editing, CGI (Computer Generated Imagery), Soundtrack, Write to Distribution Media
– Digital Cinema: HD Digital Projectors, 3D Digital Projectors
– Closed Circuit Security Systems: Multi-Camera NTSC/HD
MV vs CV vs Video Analytics
Machine Vision
– Photometers Used in Process Control
– Successful History
– Industrial Automation and Robotics
– Controlled Environments
– Inspection, Optical Navigation, Medical
Computer Vision
– Emulate Human Vision System
– Early Underestimation – Marvin Minsky Summer Project
– Challenge of Uncontrolled Environments
– 50 Years Later, Challenges Better Understood
– Vision Prosthetics, General Automation
– Recent Breakthroughs – USC, DARPA Artificial Retina, Google Car
– Efficiency and Generalization?
Video Analytics – CV from RT or Stored/Networked Video
Spitzer – JPL/Caltech
CU-Boulder ECEN 5623
If Possible, CV => MV Conversion – Cheat!
Practical Solution – Convert CV to MV Problem
– Loss of Generalization (Solves One Problem Rather than Class)
– Controlling Environment May Be Difficult
– Use Non-Visible Spectrum to Advantage (e.g. FLIR)
– Sensor Fusion (Visible + FLIR, RADAR, GPS, …)
– Models and Prior-Knowledge of Problem Exploited
Overhead Camera Dark Background Overhead Lighting Game State / Grid Known Shape Database Active RGB-D (e.g. Kinect)
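One way to see why a controlled scene (overhead camera, dark background) turns a hard CV problem into an easy MV one: a single brightness threshold is then enough to segment the object, and its centroid gives a position to track. The sketch below is only an illustration under that assumption. The function names, the tiny hand-made grayscale frame, and the threshold of 128 are hypothetical and are not taken from the course labs.

```python
def threshold(image, level):
    """Binary segmentation: 1 where a pixel is brighter than level, else 0."""
    return [[1 if px > level else 0 for px in row] for row in image]


def centroid(mask):
    """Return the (row, col) centroid of foreground pixels, or None if there are none."""
    count = row_sum = col_sum = 0
    for r, row in enumerate(mask):
        for c, value in enumerate(row):
            if value:
                count += 1
                row_sum += r
                col_sum += c
    if count == 0:
        return None
    return (row_sum / count, col_sum / count)


if __name__ == "__main__":
    # Hypothetical 4x5 grayscale frame: a bright 2x2 object on a dark background.
    frame = [
        [10,  12,  11, 10,  9],
        [11, 200, 210, 12, 10],
        [10, 205, 220, 11, 10],
        [ 9,  10,  12, 10, 11],
    ]
    mask = threshold(frame, 128)
    print(centroid(mask))  # (1.5, 1.5)
```

With an uncontrolled background, no single threshold separates object from scene, which is where the loss of generalization noted above comes from.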
CU-Boulder ECEN 5623
Why is Human Vision > Computer?
Approximately 100+ Mega-Pixel (Rod/Cone Count)
Cortex=10 Billion Neurons (High fan-out)
Total=100 Billion Neurons
Neuroscience. 2nd edition. Purves D, Augustine GJ, Fitzpatrick D, et al., editors. Sunderland (MA): Sinauer Associates; 2001. http://www.ncbi.nlm.nih.gov/books/NBK10848/
Diagram labels: Red Epic 645 (63 Mega-Pixel), I/O Bus (x16 5Gbps = 8GB/sec), Camera Link Interface Card, Local Bus, CPU, CPU, Memory Controller
5 to 7 billion transistors
1. Neuron > Transistor
2. Better Programming? ROM?
3. More Richly Interconnected
4. Storage + Processing
> 1 Trillion Synapses
Biological Vision vs. Machine Vision (Why A Honey Bee is Better than HPC for CV)
Humans
– 100 million Photoreceptors
– 10 billion Neurons (Cerebral Cortex)
– Brain with 100 billion Neurons
– Millisecond Transfer
– Massively Parallel Analog + Digital Computation
Synapse Match is a Challenge
– 7000 Connections from Each of 10 Billion Neurons
– 3 Year Olds Have 10^15 Synapses
CPU to Digital Camera/HDD
– Connects 10’s of millions of pixels
– to Several Billion transistors
– Through Sequential Logic and I/O Bus
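The I/O bus figure quoted on the earlier slide ("x16 5Gbps = 8GB/sec") can be checked with a little arithmetic, and the same numbers show how the sequential bus bounds the frame rate of a large sensor. This is a rough sketch under stated assumptions: the slide does not name the bus, so the 8b/10b encoding efficiency (typical of a 5 Gbps-per-lane serial link) is assumed in order to reconcile 16 × 5 Gbps with 8 GB/sec, and the 2 bytes per pixel for the 63 Mega-Pixel sensor is likewise assumed. The function names are hypothetical.

```python
def bus_bytes_per_sec(lanes, gbps_per_lane=5.0, encoding_efficiency=8 / 10):
    """Usable bus bandwidth in bytes/sec.

    encoding_efficiency=8/10 is an assumption (8b/10b line coding):
    only 8 of every 10 transmitted bits carry payload.
    """
    payload_bits_per_sec = lanes * gbps_per_lane * 1e9 * encoding_efficiency
    return payload_bits_per_sec / 8  # bits -> bytes


def max_frames_per_sec(bytes_per_sec, megapixels, bytes_per_pixel):
    """Upper bound on frame rate if the bus carried nothing but raw frames."""
    frame_bytes = megapixels * 1e6 * bytes_per_pixel
    return bytes_per_sec / frame_bytes


if __name__ == "__main__":
    bandwidth = bus_bytes_per_sec(lanes=16)
    print(f"x16 bus: {bandwidth / 1e9:.1f} GB/sec")
    # Assumed 2 bytes per pixel for a 63 Mega-Pixel sensor.
    fps = max_frames_per_sec(bandwidth, megapixels=63, bytes_per_pixel=2)
    print(f"63 Mega-Pixel frames: at most about {fps:.0f} per second")
```

Even this optimistic bound moves every pixel through one serial path, in contrast to the massively parallel connections listed for biological vision.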
960K Neurons in flight: Learns locations, complex odors, colors, and shapes; with high efficiency (500 Watt/Kg), 0.218g
Brain plasticity for learning, connectedness, concurrency, integrated sensing, power efficiency, and resiliency
2012 – 8 billion?
NVIDIA GK110 28nm, (7.1 billion)
Intel MICA 22nm (5 billion)
http://en.wikipedia.org/wiki/List_of_animals_by_number_of_neurons
http://en.wikipedia.org/wiki/File:Transistor_Count_and_Moore%27s_Law_-_2011.svg