
Distinctive Image Features from Scale-Invariant Keypoints
By David G. Lowe, University of British Columbia

Presented by: Tim Havinga, Joël van Neerbos and Robert van der Linden

Organization

• Introduction
• Keypoint extraction
• Applications

Introduction

Matching images across affine transformations:

Change in lighting and 3D viewpoint:

Introduction

• Motion tracking
• Object and scene recognition
• Stereo correspondence

Extracting features

• Extrema detection
• Keypoint localization
• Orientation assignment
• Local image descriptor

Extrema detection

Blur copies of the image with progressively broader Gaussian filters.

Extrema detection

Subtract adjacent blurred images (the difference of Gaussians, DoG) and look for local extrema in the result.

Extrema detection

Calculate the DoGs for different Gaussians, downsampling the image by a factor of 2 for each new octave.

[figure: repeated blurring within each octave; adjacent levels are subtracted to form the DoG pyramid]
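
As an illustration, a minimal Python sketch of this construction (using NumPy and SciPy; the number of scales per octave and the starting sigma are illustrative assumptions, not the paper's tuned settings):

    import numpy as np
    from scipy.ndimage import gaussian_filter

    def dog_pyramid(image, n_octaves=4, scales_per_octave=3, sigma=1.6):
        """Build a difference-of-Gaussians pyramid from a float grayscale image."""
        k = 2 ** (1.0 / scales_per_octave)   # scale multiplier between levels
        octaves = []
        for _ in range(n_octaves):
            # Blur copies of the image with progressively broader Gaussians.
            blurred = [gaussian_filter(image, sigma * k ** i)
                       for i in range(scales_per_octave + 3)]
            # Subtract adjacent blurred images to get the DoG levels.
            octaves.append(np.stack([b2 - b1 for b1, b2 in zip(blurred, blurred[1:])]))
            # Downsample by a factor of 2 for the next octave.
            image = blurred[scales_per_octave][::2, ::2]
        return octaves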

Keypoint localization

Select as candidate keypoints the DoG samples that are larger or smaller than all of their 26 neighbours (8 in the same DoG image plus 9 in each of the scales above and below).

Keypoint localization

Reject all points where the contrast is too low (the paper uses a threshold of 0.03, for pixel values in [0, 1]).

Keypoint localization

Reject all points that lie on an edge, detected through the ratio of principal curvatures of the DoG function.
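
The three tests together, as a Python sketch (assuming dogs is one octave of the DoG pyramid as a 3-D array indexed by scale, y, x; the 0.03 and r = 10 thresholds are the paper's values, and the sub-pixel refinement step is omitted):

    import numpy as np

    def is_keypoint(dogs, s, y, x, contrast_thresh=0.03, edge_ratio=10.0):
        """True if (s, y, x) passes the extremum, contrast and edge tests.

        Assumes the point is not on the border of the array."""
        patch = dogs[s-1:s+2, y-1:y+2, x-1:x+2]   # the point and its 26 neighbours
        value = dogs[s, y, x]
        # Extremum test: larger or smaller than all 26 neighbours.
        if not (value >= patch.max() or value <= patch.min()):
            return False
        # Contrast test: reject low-contrast points.
        if abs(value) < contrast_thresh:
            return False
        # Edge test: ratio of principal curvatures via the 2x2 spatial Hessian.
        dxx = dogs[s, y, x+1] + dogs[s, y, x-1] - 2 * value
        dyy = dogs[s, y+1, x] + dogs[s, y-1, x] - 2 * value
        dxy = (dogs[s, y+1, x+1] - dogs[s, y+1, x-1]
               - dogs[s, y-1, x+1] + dogs[s, y-1, x-1]) / 4.0
        trace, det = dxx + dyy, dxx * dyy - dxy * dxy
        if det <= 0:                  # curvatures of opposite sign: edge-like
            return False
        return trace * trace / det < (edge_ratio + 1) ** 2 / edge_ratio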

Effects of this elimination

[figures: keypoints remaining after extrema detection, after the contrast check, and after the edge check]

Extracting features

• Extrema detection
• Keypoint localization
• Orientation assignment
• Local image descriptor

Orientation assignment

Assign an orientation to each keypoint to make its descriptor invariant to rotation.

Orientation assignment

The orientation of a keypoint is determined in four steps:

1. Determine sample points
2. Determine the gradient magnitude and orientation of each sample point
3. Create an orientation histogram of the sample points
4. Extract the dominant directions from the histogram

Step 1: Determine sample points

• The source image is the Gaussian-smoothed image with the scale closest to the keypoint's scale

• Use all pixels within a certain radius of the keypoint

[figure: the keypoint's actual scale vs. the Gaussian scale used]

Step 2: Determine gradient magnitude and orientation of each sample point

• Gradient magnitude:

  m(x, y) = sqrt( (L(x+1, y) - L(x-1, y))^2 + (L(x, y+1) - L(x, y-1))^2 )

• Gradient orientation:

  θ(x, y) = tan^-1( (L(x, y+1) - L(x, y-1)) / (L(x+1, y) - L(x-1, y)) )

where L is the Gaussian-smoothed image, so both are simple pixel differences.
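
In Python both reduce to one pixel difference each (a sketch; L is the Gaussian-smoothed image as a NumPy array):

    import numpy as np

    def gradient_at(L, y, x):
        """Gradient magnitude and orientation (degrees) from pixel differences."""
        gx = L[y, x+1] - L[y, x-1]
        gy = L[y+1, x] - L[y-1, x]
        return np.hypot(gx, gy), np.degrees(np.arctan2(gy, gx)) % 360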

Step 3: Create an orientation histogram

• The histogram has 36 bins, each covering 10 degrees

• Each sample is weighted by its gradient magnitude and by a Gaussian-weighted circular window centred on the keypoint

Step 4: Extract dominant directions

• Take the peak(s) from the orientation histogram

• Use all peaks greater than 80% of the highest peak

• Every such direction gets its own keypoint
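
Steps 1-4 combined, as a Python sketch (the sampling radius is an illustrative assumption; the 1.5 × scale Gaussian window, the 36 bins and the 80% rule follow the paper):

    import numpy as np

    def dominant_orientations(L, y, x, sigma, radius=8):
        """Histogram gradient orientations around (y, x); return peak directions."""
        hist = np.zeros(36)                       # 36 bins of 10 degrees each
        for dy in range(-radius, radius + 1):
            for dx in range(-radius, radius + 1):
                yy, xx = y + dy, x + dx
                if not (0 < yy < L.shape[0] - 1 and 0 < xx < L.shape[1] - 1):
                    continue
                gx = L[yy, xx+1] - L[yy, xx-1]    # pixel-difference gradients
                gy = L[yy+1, xx] - L[yy-1, xx]
                weight = np.exp(-(dx*dx + dy*dy) / (2 * (1.5 * sigma) ** 2))
                theta = np.degrees(np.arctan2(gy, gx)) % 360
                hist[int(theta // 10) % 36] += weight * np.hypot(gx, gy)
        # Keep every local peak that reaches 80% of the highest peak;
        # each one becomes a separate keypoint.
        thresh = 0.8 * hist.max()
        return [10.0 * b + 5.0 for b in range(36)
                if hist[b] >= thresh
                and hist[b] >= hist[b - 1] and hist[b] >= hist[(b + 1) % 36]]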

The Local image descriptor

• Every keypoint now has a location, scale and orientation, from which a repeatable 2D grid can be determined

• We want distinctive descriptor vectors, partially invariant to illumination and viewpoint changes

Computing the Local image descriptor

• Take the 16 × 16 sample array around the keypoint

• Compute a 4 × 4 grid of orientation histograms from this array

• Use 8 bins per histogram: 4 × 4 × 8 = 128 features
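
A sketch of this binning in Python (for brevity it skips the rotation of the grid to the keypoint orientation and the trilinear interpolation used in the paper, and assumes the keypoint is at least 9 pixels from the image border):

    import numpy as np

    def raw_descriptor(L, y, x):
        """Bin a 16x16 patch around (y, x) into a 4x4 grid of 8-bin histograms."""
        desc = np.zeros((4, 4, 8))
        for dy in range(-8, 8):
            for dx in range(-8, 8):
                gx = L[y+dy, x+dx+1] - L[y+dy, x+dx-1]
                gy = L[y+dy+1, x+dx] - L[y+dy-1, x+dx]
                theta = np.degrees(np.arctan2(gy, gx)) % 360
                cell_y, cell_x = (dy + 8) // 4, (dx + 8) // 4   # 4x4 cell index
                desc[cell_y, cell_x, int(theta // 45) % 8] += np.hypot(gx, gy)
        return desc.ravel()                       # 4 * 4 * 8 = 128 features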

Local image descriptor optimizations

• Normalize the obtained feature vector to unit length to enhance invariance to illumination changes

• Reduce the influence of large gradient magnitudes by clamping each normalized feature to at most 0.2

• Normalize again
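
These three steps map directly onto a few lines of NumPy (a sketch; 0.2 is the cap used in the paper):

    import numpy as np

    def normalize_descriptor(desc):
        """Normalize to unit length, cap at 0.2, then renormalize."""
        desc = desc / np.linalg.norm(desc)    # invariance to contrast changes
        desc = np.minimum(desc, 0.2)          # damp large gradient magnitudes
        return desc / np.linalg.norm(desc)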

Possible applications for SIFT

We now have a feature extraction method that yields useful keypoints; what's next?

Some applications:
• Object recognition in images
• Panorama stitching
• 3D scene modelling
• 3D human action tracking (for example for security surveillance)
• Robot localisation and mapping

Panorama stitching

(from Brown, ICCV 2003)

Panorama stitching

3D modelling

(from Sudderth et al., 2006)

Application: SIFT to object recognition

We can apply SIFT to recognize objects in images. Say we have an image that contains an object; how do we recognize it?

Key idea: compare keypoints; if these are similar, it is likely the same object.

First problem: many features arise from background clutter. How do we remove these?

Possible approach:
- Look for clusters of matching features
- Compare the distance of the closest match to that of the second-closest match (the ratio test, sketched below)
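
A minimal NumPy sketch of the ratio test (0.8 is the threshold used in the paper; exhaustive distance computation stands in for the approximate search discussed next):

    import numpy as np

    def ratio_test_matches(desc_a, desc_b, ratio=0.8):
        """Match rows of desc_a to rows of desc_b, keeping unambiguous matches."""
        matches = []
        for i, d in enumerate(desc_a):
            dists = np.linalg.norm(desc_b - d, axis=1)
            j, j2 = np.argsort(dists)[:2]      # closest and second-closest
            if dists[j] < ratio * dists[j2]:   # accept only distinctive matches
                matches.append((i, j))
        return matches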

Efficiently locating the nearest neighbour

Each keypoint has a 128-dimensional feature vector, and in such a high-dimensional space no exact algorithm is known to beat exhaustive search for finding the nearest neighbour; the paper therefore uses an approximate method, Best-Bin-First (BBF).
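
BBF is a modified k-d tree search that trades exactness for speed. As a stand-in illustration with an exact structure, SciPy's k-d tree can report the two nearest neighbours needed for the ratio test (database_descriptors and query_descriptor are assumed names for an (N, 128) array and a 128-vector):

    from scipy.spatial import cKDTree

    tree = cKDTree(database_descriptors)            # index the database once
    dists, idx = tree.query(query_descriptor, k=2)  # two nearest neighbours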

But: as few as 3 matched features are enough to locate an object, which makes recognition possible even under partial occlusion.

A Hough transform is used to cluster the matched keypoints: each match 'votes' for an object pose, described by location, orientation and scale, and poses with enough consistent votes are accepted.
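
A sketch of this voting in Python (the bin sizes are illustrative assumptions; the paper uses broad bins of 30 degrees for orientation, a factor of 2 for scale, and 0.25 of the projected model size for location):

    import math
    from collections import defaultdict

    def hough_pose_votes(pose_predictions, min_votes=3):
        """Accumulate matches in a coarse (x, y, orientation, scale) pose space.

        pose_predictions holds one (x, y, orientation_deg, scale) tuple per
        matched keypoint pair."""
        bins = defaultdict(list)
        for (x, y, ori, scale) in pose_predictions:
            key = (round(x / 64),             # coarse location bins (assumed)
                   round(y / 64),
                   round(ori / 30) % 12,      # 30-degree orientation bins
                   round(math.log2(scale)))   # factor-of-2 scale bins
            bins[key].append((x, y, ori, scale))
        # Bins with enough consistent votes are candidate object poses.
        return {k: v for k, v in bins.items() if len(v) >= min_votes}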

Application: robot vision, localization and mapping

Se, S., Lowe, D. G. and Little, J., Vision-based Mobile Robot Localization and Mapping using Scale-Invariant Features, 2001

Application of SIFT to mobile robotics:

• SIFT features combined with Simultaneous Localization And Map Building (SLAMB)
• Recognizing landmarks: map estimate of the 10 m by 10 m lab, 3000 features collected
• Preliminary results: quite good

Conclusions from the paper

• The keypoints SIFT extracts are indeed invariant to image rotation and scale, and robust to affine distortion, noise and changes in illumination.

• SIFT can be optimized to run in real time.

• The proposed approach (SIFT combined with a Hough transform for object recognition) has been shown to work reliably.

Discussion

• Is the SIFT method for keypoint extraction the best way to get distinctive features from images?

• Is SIFT biologically plausible? Is it important to have biologically inspired methods in object recognition / localization?

References

Main article:
• Distinctive Image Features from Scale-Invariant Keypoints, D. G. Lowe. International Journal of Computer Vision 60, 91-110, 2004.

Other articles:
• Depth from Familiar Objects: A Hierarchical Model for 3D Scenes, Sudderth et al. Proceedings of the 2006 IEEE Conference on Computer Vision and Pattern Recognition, volume II, 2410-2417, 2006.

• Vision-based Mobile Robot Localization and Mapping using Scale-Invariant Features, Se, S., Lowe, D. G. and Little, J., 2001.