
COMPUTER VISION: SOME CLASSICAL PROBLEMS

ADWAY MITRA
MACHINE LEARNING LABORATORY

COMPUTER SCIENCE AND AUTOMATION
INDIAN INSTITUTE OF SCIENCE

June 24, 2013


WHAT IS COMPUTER VISION and WHY IS IT DIFFICULT?

Computer Vision, obviously, aims to build computers that can see!

In other words, it deals with analyzing/understanding images and videos through computers

The aim of analysis is either to find known patterns in images (Detection), or to match images with known patterns (Recognition)

For analysis of an image, we first need a representation for it

An image is stored in a computer as a 2-dimensional (grayscale) or 3-dimensional (color) array, each element holding a pixel value

A single pixel carries very little, if any, semantic information!
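The array representation can be illustrated with a minimal sketch using plain nested lists. The pixel values and the helper name `shape` are made up for illustration; real systems typically use an array library instead.

```python
# A tiny 2x3 grayscale image: a 2-D array of intensities (0 = black, 255 = white).
gray = [
    [0, 128, 255],
    [64, 192, 32],
]

# A color image of the same size: a 3-D array with an (R, G, B) triple per pixel.
color = [
    [(255, 0, 0), (0, 255, 0), (0, 0, 255)],
    [(255, 255, 0), (0, 0, 0), (255, 255, 255)],
]


def shape(image):
    """Return (rows, cols) for grayscale or (rows, cols, channels) for color."""
    rows, cols = len(image), len(image[0])
    first = image[0][0]
    if isinstance(first, tuple):
        return (rows, cols, len(first))
    return (rows, cols)


gray_shape = shape(gray)
color_shape = shape(color)
```

Looking at `gray[0][1]` alone (the value 128) says nothing about what the image shows, which is the point made above.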


Representation with Features

For most applications of machine learning, the first and foremost step is to find features

Features are used for representation of the data

Features should live in a metric space, so that distances between data points can be computed - usually they are vectors

Very elaborate features (high-dimensional) need to be avoided for computational reasons

[Diagram: Data -> (Representation) -> Feature Vector, difficult to process -> (Dimensionality Reduction) -> Smaller Feature Vector]
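The slide does not name a specific dimensionality-reduction method. As one possible illustration, the sketch below uses Principal Component Analysis (an assumption, chosen because it is a common choice), with the leading direction found by power iteration. All function names are hypothetical.

```python
import math


def mean_center(data):
    """Subtract the per-dimension mean from every feature vector."""
    n, d = len(data), len(data[0])
    means = [sum(row[j] for row in data) / n for j in range(d)]
    return [[row[j] - means[j] for j in range(d)] for row in data]


def covariance(centered):
    """Sample covariance matrix of mean-centered data."""
    n, d = len(centered), len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
            for i in range(d)]


def top_component(cov, iterations=200):
    """Power iteration: repeatedly apply the matrix and renormalize."""
    d = len(cov)
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    return v


def project(data, direction):
    """Reduce each feature vector to a single coordinate along `direction`."""
    return [sum(x * c for x, c in zip(row, direction))
            for row in mean_center(data)]


# Four 2-D feature vectors lying on the line y = 2x.
points = [[0.0, 0.0], [1.0, 2.0], [2.0, 4.0], [3.0, 6.0]]
direction = top_component(covariance(mean_center(points)))
reduced = project(points, direction)
```

Here the four 2-D vectors are reduced to four scalars without losing information, because all the variation lies along one direction. Power iteration can fail if the starting vector happens to be orthogonal to the leading direction; a production implementation would use a linear-algebra library.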


Features for Computer Vision

Pixel values can serve as features, but are often not very meaningful

Groups of pixels can carry more meaning - but how should such groups be formed?

Groups-of-pixels/sub-images at large number of scales and positions

Image gradients/edges

Various Filter Outputs have also been explored

Difficult to interpret semantically, but found to work well in certain applications

Finding concise, semantically meaningful features remains a major open issue in Computer Vision


SIFT Interest Points

A filter is an operator which processes a signal and removes some undesired components

Difference-of-Gaussian Filters - a popular filter for images

Positions of local extrema (maxima and minima) of this filter output, across position and scale, are the candidate interest points

Some candidates, like low-contrast points and those lying along edges, are discarded

At each interest point, a feature vector is computed using image gradients and their orientations inside small windows around the interest point

This feature is invariant to orientation and scale of the image

SIFT: Scale-Invariant Feature Transform
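The first step can be sketched as follows: blur the image at two scales, subtract, and look for local extrema. This is a simplified single-scale illustration, not the full SIFT pipeline (no scale-space pyramid, no edge rejection, no descriptor). The function names, the sigma values and the threshold are assumptions.

```python
import math


def gaussian_kernel(sigma):
    """Normalized 1-D Gaussian weights, truncated at about three sigma."""
    radius = max(1, int(3 * sigma))
    weights = [math.exp(-(i * i) / (2 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def clamp(index, size):
    return min(max(index, 0), size - 1)


def blur(image, sigma):
    """Separable Gaussian blur; borders are handled by clamping indices."""
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    rows, cols = len(image), len(image[0])
    horizontal = [[sum(kernel[k + radius] * image[r][clamp(c + k, cols)]
                       for k in range(-radius, radius + 1))
                   for c in range(cols)] for r in range(rows)]
    return [[sum(kernel[k + radius] * horizontal[clamp(r + k, rows)][c]
                 for k in range(-radius, radius + 1))
             for c in range(cols)] for r in range(rows)]


def difference_of_gaussians(image, sigma_small=1.0, sigma_large=2.0):
    fine, coarse = blur(image, sigma_small), blur(image, sigma_large)
    return [[f - g for f, g in zip(fine_row, coarse_row)]
            for fine_row, coarse_row in zip(fine, coarse)]


def local_extrema(response, threshold=0.03):
    """Interior pixels strictly above or below all 8 neighbors."""
    points = []
    for r in range(1, len(response) - 1):
        for c in range(1, len(response[0]) - 1):
            value = response[r][c]
            neighbors = [response[r + dr][c + dc]
                         for dr in (-1, 0, 1) for dc in (-1, 0, 1)
                         if (dr, dc) != (0, 0)]
            is_extremum = value > max(neighbors) or value < min(neighbors)
            if abs(value) > threshold and is_extremum:
                points.append((r, c))
    return points


# Synthetic 11x11 image with a single bright spot at the center.
image = [[0.0] * 11 for _ in range(11)]
image[5][5] = 1.0
interest_points = local_extrema(difference_of_gaussians(image))
```

On this synthetic image the bright spot at (5, 5) is reported as an interest point. The threshold plays the role of the low-contrast rejection mentioned above.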


[Figure: SIFT interest points]


FACE DETECTION-PROBLEM

Given an image, find the faces in it.

Used in many places like digital cameras and photo sharing albums, including Facebook

Given a rectangular region in an image, say if it is a face or not!

Repeat this process for every location and every size of the rectangular region
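The exhaustive search over locations and sizes can be sketched as a generator of candidate windows. This is a minimal illustration assuming square windows; `is_face` stands in for whatever classifier is used and is purely hypothetical.

```python
def sliding_windows(image_width, image_height, sizes, stride=1):
    """Yield (x, y, width, height) for every square window inside the image."""
    for size in sizes:
        for y in range(0, image_height - size + 1, stride):
            for x in range(0, image_width - size + 1, stride):
                yield (x, y, size, size)


def detect_faces(image_width, image_height, is_face, sizes, stride=1):
    """Run the (hypothetical) face/non-face classifier on every window."""
    return [window
            for window in sliding_windows(image_width, image_height, sizes, stride)
            if is_face(window)]


windows = list(sliding_windows(8, 6, sizes=[2, 4]))

# Stand-in classifier that accepts exactly one window, for demonstration only.
detections = detect_faces(8, 6, lambda w: w == (1, 1, 4, 4), sizes=[2, 4])
```

Even this 8x6 toy image with only two window sizes produces 50 candidate windows, which hints at why the per-window classifier has to be fast.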


FACE DETECTION-GENERAL APPROACH

Basically a binary classification problem

Requires building model for face

Needs training samples- both positive and negative

Positive samples are face images, negative samples are non-face images

Learning algorithm finds boundary between face and non-face images

[Figure: FACE images and NON-FACE images, with a candidate region to be classified]


FACE DETECTION- BENCHMARK and EVALUATION

Standard face-detection benchmark datasets available

FDDB: Face Detection Data Set and Benchmark, for unconstrained settings

Performance usually measured using Precision and Recall

Precision: Of the reported face detections, how many were actually faces?

Recall: Of the faces actually present, how many were detected?

F-score: Harmonic mean of precision and recall
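These three measures can be computed from simple counts. The sketch below is a minimal illustration; the function name and the example counts are made up.

```python
def precision_recall_f(true_positives, false_positives, false_negatives):
    """Compute precision, recall and F-score from detection counts.

    true_positives:  reported detections that really are faces
    false_positives: reported detections that are not faces
    false_negatives: faces present in the images that were not reported
    """
    reported = true_positives + false_positives
    present = true_positives + false_negatives
    precision = true_positives / reported if reported else 0.0
    recall = true_positives / present if present else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f_score = 2 * precision * recall / (precision + recall)
    return precision, recall, f_score


# Example: 10 detections reported, 8 of them correct; 16 faces actually present.
precision, recall, f_score = precision_recall_f(8, 2, 8)
```

In this example precision is 0.8, recall is 0.5, and the F-score is about 0.615, which is lower than the arithmetic mean because the harmonic mean penalizes the weaker of the two.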


FACE RECOGNITION-PROBLEM

Consists of a training phase and a testing phase

In the training phase we are given many face images, each marked with the identity of the person

In the testing phase, we are given a new face image, belonging to one of these persons

The task is to find out the identity of the person

This is a standard multi-class Classification problem in Machine Learning

First, suitable features and representations have to be found


FACE RECOGNITION-PROBLEM

One approach is to build a model for each person, using the training images provided for that person

Second approach is to compare the test image to each of the training images, and find the closest match

Not every part of a face image helps in recognition - certain things about faces are common to everyone

A good strategy is to find the features that are most distinctive and represent images only by them

Eigenfaces (1991) uses the last two strategies

Recognition accuracy is the obvious evaluation criterion

A good recognition algorithm should work well even with few training images
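The second approach (compare the test image with every training image and take the closest match) can be sketched as a nearest-neighbor search over feature vectors. Euclidean distance, the two-dimensional features and the identities below are assumptions for illustration; Eigenfaces would first project the images onto a small number of distinctive directions.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two feature vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest_neighbor(test_vector, training_set):
    """Return the identity attached to the closest training vector.

    training_set is a list of (feature_vector, identity) pairs.
    """
    _, identity = min(training_set,
                      key=lambda item: euclidean(test_vector, item[0]))
    return identity


# Made-up 2-D feature vectors standing in for face representations.
training = [
    ([0.1, 0.9], "person_a"),
    ([0.2, 0.8], "person_a"),
    ([0.9, 0.1], "person_b"),
]
identity = nearest_neighbor([0.8, 0.3], training)
```

The test vector is assigned to `person_b`, whose training vector is closest.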


FACE RECOGNITION-CURRENT STATUS

Face recognition has traditionally been done with well-cropped, focused face images - Controlled Environment

In that setting it is largely considered a solved problem.

Nowadays face recognition is being revisited for semi-controlled or uncontrolled environments.

LFW (Labeled Faces in the Wild) - a dataset of face images taken in such settings - is a newer benchmark


OBJECT RECOGNITION-PROBLEM

A classification task, like face recognition

In practice much more complex

A large number of images is given from many object categories

The task is to classify a test image into one of these categories

Problem made very difficult by intra-class variations


OBJECT RECOGNITION-GENERAL APPROACH

Once again the idea is to build models for different objects

No single feature may be enough for classification

Some objects may have a distinctive color, others may have a distinctive shape

Multiple Kernel Learning - a sophisticated machine learning formulation that combines several features, generally considered the best approach for this problem at the time of this talk

Caltech-101: a dataset of 101 object categories

Close to 80 % accuracy obtained by Multiple Kernel Learning

Caltech-256: a dataset of 256 object categories - Accuracy of 50 % considered good!

Intra-class variations continue to pose a significant challenge, and even prompt scepticism - is it a valid problem at all?


OBJECT DETECTION

Given an image, find all the birds, trees, and cars in it!

Requires building models for each of these objects

Once again, search the entire image at multiple positions and scales

Part-based Models of objects are considered effective

Instead of modelling the whole object, model its different parts separately

Helps to handle occlusion and perhaps intra-class variations


IMAGE SEGMENTATION

Given an image, divide it such that each segment contains an object

Basically a clustering problem

Can be done directly on pixel values, without separately designed features

Has inspired advanced clustering techniques like spectral clustering

Graph-based method - models the image as a graph, with each pixel as a node and adjacent pixels connected by edges

Each edge is given a weight according to the similarity of the corresponding pixel values

Requires the number of segments to be specified
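The graph construction described above can be sketched for a small grayscale image. The Gaussian similarity weight and the value of `sigma` are assumptions (the slides only say that weights reflect pixel similarity), and the clustering step that would cut this graph into segments is not shown.

```python
import math


def pixel_graph(image, sigma=10.0):
    """Build a 4-connected grid graph over the pixels of a grayscale image.

    Returns a dict mapping ((r1, c1), (r2, c2)) to a weight in (0, 1].
    Similar neighboring pixels get weights near 1, dissimilar ones near 0.
    """
    rows, cols = len(image), len(image[0])
    edges = {}
    for r in range(rows):
        for c in range(cols):
            # Link each pixel to its right and lower neighbors (if any),
            # so every adjacent pair is added exactly once.
            for dr, dc in ((0, 1), (1, 0)):
                nr, nc = r + dr, c + dc
                if nr < rows and nc < cols:
                    diff = image[r][c] - image[nr][nc]
                    weight = math.exp(-(diff * diff) / (2 * sigma * sigma))
                    edges[((r, c), (nr, nc))] = weight
    return edges


# Two dark columns next to one bright column.
image = [
    [10, 12, 200],
    [11, 13, 205],
]
graph = pixel_graph(image)
```

Edges inside the dark region get weights close to 1, while edges crossing into the bright column get weights close to 0. A graph-based segmenter would prefer to cut along those weak edges.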


IMAGE SEGMENTATION

Segmentation is evaluated with respect to a gold standard segmentation

Every pair of pixels falling in the same segment in the gold standard should also be in the same segment in the reported segmentation

(and similarly, each pair of pixels falling in different segments should stay in different segments)
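This pairwise criterion corresponds to what is commonly called the Rand index: the fraction of pixel pairs on which the two segmentations agree. The sketch below is a minimal illustration with segment labels given as flat lists, one label per pixel. The function name is hypothetical, and the loop over all pairs is only practical for small inputs.

```python
from itertools import combinations


def pairwise_agreement(gold, predicted):
    """Fraction of pixel pairs treated the same way by both segmentations.

    A pair agrees if it is in the same segment in both labelings,
    or in different segments in both labelings.
    """
    pairs = list(combinations(range(len(gold)), 2))
    if not pairs:
        return 1.0
    agreeing = sum(
        (gold[i] == gold[j]) == (predicted[i] == predicted[j])
        for i, j in pairs
    )
    return agreeing / len(pairs)


gold = [0, 0, 1, 1]
same_up_to_renaming = pairwise_agreement(gold, [5, 5, 7, 7])
partly_wrong = pairwise_agreement(gold, [0, 0, 0, 1])
```

The first score is 1.0, since segment names do not matter, only the grouping. The second is 0.5, because three of the six pixel pairs are grouped differently.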


Video Problems

Videos are collections of images taken over an interval of time- successive images are quite similar

Having to handle several images rather than one may make video problems tougher

But the temporal continuity of videos provides a way out

Joint modelling of multiple similar images can, in fact, give better performance than modelling single image

For video tasks, additional motion-based features like optical flow can be used

Concept of Interest-points for images is extended to Space-Time Interest Points for videos

Face Recognition, Face Detection, etc. can also be done in videos, often more effectively than in single images


OBJECT TRACKING-PROBLEM

Given a video which shows a person/object moving

Need to find it in each frame

Naive approach - reduce it to an object detection problem in every frame

If the object is at position (x, y) in frame t, it will be very close to (x, y) in frame (t + 1)

So if we know the position at time t, we need to search only around that same position

Reduces the search space greatly!

Main idea is to build an appearance model for the object

The appearance may change over time due to variations in size, illumination, viewpoint, etc.

The appearance model must be adaptive - and recomputed throughout the video
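The local search and the adaptive appearance model can be sketched as follows. This is a simplified illustration: the appearance model is a raw pixel template, matching uses the sum of squared differences, and adaptation is a running average. All of these choices, and the function names, are assumptions rather than a method named in the slides.

```python
def patch_difference(frame, template, top, left):
    """Sum of squared differences between the template and a frame patch."""
    return sum((frame[top + r][left + c] - template[r][c]) ** 2
               for r in range(len(template))
               for c in range(len(template[0])))


def track_step(frame, template, previous, radius=1):
    """Find the best match within `radius` pixels of the previous position."""
    height, width = len(template), len(template[0])
    prev_top, prev_left = previous
    top_range = range(max(0, prev_top - radius),
                      min(len(frame) - height, prev_top + radius) + 1)
    left_range = range(max(0, prev_left - radius),
                       min(len(frame[0]) - width, prev_left + radius) + 1)
    candidates = [(top, left) for top in top_range for left in left_range]
    return min(candidates,
               key=lambda pos: patch_difference(frame, template, *pos))


def update_template(template, frame, position, rate=0.1):
    """Blend the matched patch into the template so the model adapts."""
    top, left = position
    return [[(1 - rate) * template[r][c] + rate * frame[top + r][left + c]
             for c in range(len(template[0]))]
            for r in range(len(template))]


# A 2x2 bright object that has moved one row down from (1, 1) to (2, 1).
template = [[9, 9], [9, 9]]
frame = [[0] * 5 for _ in range(5)]
for r in (2, 3):
    for c in (1, 2):
        frame[r][c] = 9

position = track_step(frame, template, previous=(1, 1))
template = update_template(template, frame, position)
```

With `radius=1` only 9 candidate positions are examined instead of all 16 possible placements in the 5x5 frame, and the object is found at (2, 1). The `rate` parameter controls how quickly the template adapts.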


OBJECT TRACKING- BENCHMARK and EVALUATION

Performance is measured with respect to a gold standard, in which a bounding box is provided for each frame

The score is the proportion of overlap between the areas of the gold standard and reported bounding boxes
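One common way to turn this overlap into a number is intersection-over-union: the area shared by the two boxes divided by the area covered by either. The slides do not say which normalization is used, so this choice is an assumption, as is the box format `(x_min, y_min, x_max, y_max)`.

```python
def overlap_ratio(box_a, box_b):
    """Intersection area divided by union area of two axis-aligned boxes.

    Each box is (x_min, y_min, x_max, y_max). Returns a value in [0, 1].
    """
    ax0, ay0, ax1, ay1 = box_a
    bx0, by0, bx1, by1 = box_b
    inter_width = max(0, min(ax1, bx1) - max(ax0, bx0))
    inter_height = max(0, min(ay1, by1) - max(ay0, by0))
    intersection = inter_width * inter_height
    area_a = (ax1 - ax0) * (ay1 - ay0)
    area_b = (bx1 - bx0) * (by1 - by0)
    union = area_a + area_b - intersection
    return intersection / union if union > 0 else 0.0


gold_box = (0, 0, 4, 4)
reported_box = (2, 2, 6, 6)
score = overlap_ratio(gold_box, reported_box)
```

Here the boxes share a 2x2 region out of a combined area of 28, so the score is 4/28, roughly 0.143. Identical boxes score 1.0 and disjoint boxes score 0.0.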


OBJECT TRACKING-CURRENT STATUS

Considered a solved problem under controlled illumination and background

Current research aims to handle occlusion of the object, and sudden changes in background and illumination

Tracking multiple objects at the same time is another important problem

Tracking is a real-time application. Efforts are ongoing to process as many frames per second as possible

To adapt or not to adapt - this remains the fundamental problem in vision.

A single miss can make the whole tracking go wrong.

Detecting and correcting misses is an important problem to solve


ACTION RECOGNITION IN VIDEOS

Surveillance cameras are nowadays available at many sensitive public locations

The aim is to record activities of people

Requires use of dynamic features, which make use of the motion in videos

Some image-based features can be extended to videos, like space-time interest points

These can be used by viewing the video as a space-time volume

The features can also be in the form of time-series


ACTION RECOGNITION IN VIDEOS

In the presence of a benign background, a static camera and a single actor, the problem is considered solved

Current research aims to handle complex environments, like crowded places, where people frequently get occluded

Multi-person interaction recognition is another recent offshoot of the problem

