
School of Computer Science & Statistics, Trinity College Dublin, Dublin 2, Ireland, www.scss.tcd.ie

Improvements to SIFT - SURF & CenSurE

Applications of strong features - Photosynth & CBIR


Objectives
  Look at extensions to the SIFT model of strong features
    Detector + Descriptor + Matching
  Examine the use of strong features within two different web-based applications of computer vision
    Building 3D models from web photo collections
      Photosynth, www.arc3d.be
    Content-based or reverse image retrieval on the web
      Google Images / Google Goggles / www.tineye.com



SURF: Speeded Up Robust Features - ECCV 2006
Herbert Bay, Tinne Tuytelaars and Luc Van Gool
  Three stages
    Detection
      Find the key points
    Description
      Build a descriptor of the key points
    Matching / recognition
      Finding key points in a database
  SIFT especially slow in the last step
    Dimensionality of the key point descriptor
  SURF tries to speed up each step



First Speed Up - Integral Image
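The integral image idea can be sketched in a few lines. This is a minimal illustration using NumPy, not the SURF authors' implementation; the helper names `integral_image` and `box_sum` are made up for this example. It shows why a box filter of any size costs only four lookups.

```python
import numpy as np

def integral_image(img):
    """Summed-area table with a zero row/column prepended.

    ii[r, c] holds the sum of img[:r, :c], so lookups need no bounds checks.
    """
    ii = np.zeros((img.shape[0] + 1, img.shape[1] + 1), dtype=np.float64)
    ii[1:, 1:] = np.cumsum(np.cumsum(img, axis=0, dtype=np.float64), axis=1)
    return ii

def box_sum(ii, r0, c0, r1, c1):
    """Sum of img[r0:r1, c0:c1] from four table lookups, whatever the box size."""
    return ii[r1, c1] - ii[r0, c1] - ii[r1, c0] + ii[r0, c0]

img = np.arange(16, dtype=np.float64).reshape(4, 4)
ii = integral_image(img)
total = box_sum(ii, 1, 1, 3, 4)  # same value as img[1:3, 1:4].sum()
```

The table is built once per image; after that every box sum is constant time, which is what makes large box filters cheap.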



SURF - fast key-point detector
  Hessian-based key point detector
    H(x) = [ Lxx(x)  Lxy(x) ; Lxy(x)  Lyy(x) ]
    Where Laa(x) is the second-order partial derivative in direction a and Lab(x) is a mixed second-order partial derivative
  Approximate the partial derivatives with box filters

[Figure: Gaussian second-order partial derivatives Lyy, Lxy and their box-filter approximations Dyy, Dxy]
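As a worked equation for the Hessian-based detector, the blob response that SURF evaluates from the box-filter outputs can be written as follows. Here D_xx, D_yy and D_xy denote the box-filter approximations of L_xx, L_yy and L_xy; the relative weight w of about 0.9 is the value reported in the SURF paper (Bay et al.) and does not appear on the slide.

```latex
\det(\mathcal{H}_{\text{approx}}) = D_{xx}\, D_{yy} - (w\, D_{xy})^2, \qquad w \approx 0.9
```

Key points are then taken at local maxima of this response over position and scale.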


SURF - Scale the detector, not the image
  SURF - one image, scale the detectors
  SIFT - multiple images convolved with Gaussian, then calculate DoG


Detection similar - SIFT & SURF



Orientation Assignment
  SIFT = HOG, Gaussian weighting, multiple orientations
  SURF
    Distribution of Haar wavelet responses in X and Y
    Summed in a [60 deg] window
    Max response is the principal direction
    Other candidates ignored
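The SURF sliding-window step can be sketched as below. This is a simplified illustration: it assumes the Haar responses `dx`, `dy` have already been computed (and weighted) around the key point, and the function name and the `step` parameter are hypothetical choices for this example.

```python
import numpy as np

def dominant_orientation(dx, dy, window=np.pi / 3, step=np.pi / 36):
    """Return the angle (radians) of the strongest summed response vector.

    dx, dy: 1-D arrays of Haar wavelet responses around the key point.
    window: angular width of the sliding sector (60 degrees here).
    step:   hypothetical sampling step for the sector start angle.
    """
    dx = np.asarray(dx, dtype=float)
    dy = np.asarray(dy, dtype=float)
    angles = np.arctan2(dy, dx)
    best_norm, best_angle = -1.0, 0.0
    for start in np.arange(-np.pi, np.pi, step):
        # Angular offset of every response from the sector start, wrapped to [0, 2*pi)
        offset = (angles - start) % (2 * np.pi)
        inside = offset < window
        sx, sy = dx[inside].sum(), dy[inside].sum()
        norm = np.hypot(sx, sy)
        if norm > best_norm:
            best_norm, best_angle = norm, np.arctan2(sy, sx)
    return best_angle
```

Only the single strongest sector is returned, matching the slide's point that other candidates are ignored (unlike SIFT, which may keep several orientations).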



SURF - Key point description
  Split the interest region up into 4 x 4 square sub-regions with 5 x 5 regularly spaced sample points inside
  Calculate Haar wavelet response dx and dy
  Weight with a Gaussian kernel centered at key-point
  Sum the response over each sub-region for dx and dy separately: feature vector of length 32
  In order to bring in information about the polarity of the intensity changes, extract the sum of absolute value of the responses: feature vector of length 64
  SURF-128 - The sums of dx and |dx| are computed separately for dy < 0 and dy > 0
    Similarly for the sums of dy and |dy|, split according to the sign of dx
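The 64-element layout described above can be sketched as follows. This is an illustrative sketch, assuming the 20 x 20 grids of Haar responses `dx` and `dy` are already rotated to the key point orientation and Gaussian weighted; `surf_like_descriptor` is a made-up name, and the final unit-length normalisation is taken from the SURF paper rather than from the slide.

```python
import numpy as np

def surf_like_descriptor(dx, dy):
    """Build a 64-D vector from 20x20 grids of (already weighted) responses.

    The grid is cut into 4x4 sub-regions of 5x5 samples; each sub-region
    contributes (sum dx, sum dy, sum |dx|, sum |dy|).
    """
    dx = np.asarray(dx, dtype=float).reshape(4, 5, 4, 5)
    dy = np.asarray(dy, dtype=float).reshape(4, 5, 4, 5)
    # Sum over the 5x5 samples inside each sub-region (axes 1 and 3).
    parts = [dx.sum(axis=(1, 3)), dy.sum(axis=(1, 3)),
             np.abs(dx).sum(axis=(1, 3)), np.abs(dy).sum(axis=(1, 3))]
    vec = np.stack(parts, axis=-1).ravel()  # 4 * 4 * 4 = 64 values
    norm = np.linalg.norm(vec)
    # Unit length gives some invariance to contrast changes.
    return vec / norm if norm > 0 else vec

rng = np.random.default_rng(0)
desc = surf_like_descriptor(rng.normal(size=(20, 20)), rng.normal(size=(20, 20)))
```

Dropping the two absolute-value sums would give the length-32 variant; splitting each sum by the sign of the other response would give SURF-128.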



SURF - Matching
  Key points tend to be found on blob-like structures
  Sign of the Laplacian separates bright from dark blobs
  This information is free and accelerates matching
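A rough sketch of how the sign of the Laplacian can prune candidates during matching. It assumes each key point carries a stored sign of +1 or -1 alongside its descriptor; the function name and the brute-force nearest-neighbour search are illustrative choices, not the matcher used in the SURF paper.

```python
import numpy as np

def match_with_laplacian_sign(desc_a, sign_a, desc_b, sign_b):
    """Nearest-neighbour matching restricted to key points of equal sign.

    desc_*: (n, d) descriptor arrays; sign_*: (n,) arrays of +1 / -1.
    Returns a list of (index_in_a, index_in_b) pairs.
    """
    desc_a, desc_b = np.asarray(desc_a, float), np.asarray(desc_b, float)
    sign_a, sign_b = np.asarray(sign_a), np.asarray(sign_b)
    matches = []
    for i, (d, s) in enumerate(zip(desc_a, sign_a)):
        candidates = np.flatnonzero(sign_b == s)  # skip opposite-contrast blobs
        if candidates.size == 0:
            continue
        dists = np.linalg.norm(desc_b[candidates] - d, axis=1)
        matches.append((i, int(candidates[np.argmin(dists)])))
    return matches
```

The sign comparison removes candidates before any descriptor distance is computed, which is where the speed-up comes from.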



SURF - Evaluation
  Camera Calibration
  3D object recognition


SURF
  Exploits Integral Images
    Approximation of partial derivatives
    Haar wavelet responses to calculate orientation
      Size of sliding window
      Resolution of the Haar wavelets in the window
  U-SURF
    Avoids the orientation calculation
    Applications where the camera orientation will not change significantly
      Driving, wearable cameras...


CenSurE
  CenSurE: Center Surround Extremas for Realtime Feature Detection and Matching
    Agrawal, Konolige, Blas - ECCV 2008
    http://www.cs.ualberta.ca/~jag/papersVis2/08ECCV/papers/5305/53050102.pdf
  More stable and more accurate than SIFT or SURF
    Test domain: Visual Odometry





CenSurE
  SIFT - DoG
  SURF - approximation to Hessian
  CenSurE - approximation to Laplacian (LoG)
  Centre Surround Function
    Approximate Laplacian - bi-level filter, 1 or -1
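A bi-level centre-surround filter can be sketched with square boxes, the simplest shape that works with an ordinary integral image. The box sizes (inner side 2n+1, outer side 4n+1) and the area normalisation below are assumptions for illustration; the function name is hypothetical and this is not the CenSurE reference code.

```python
import numpy as np

def center_surround_response(img, n):
    """Difference-of-boxes response at every pixel where both boxes fit.

    Inner box side is 2n+1, outer box side is 4n+1 (illustrative sizes).
    Each box sum is divided by its area so a constant image gives zero.
    """
    img = np.asarray(img, dtype=np.float64)
    ii = np.zeros((img.shape[0] + 1, img.shape[1] + 1))
    ii[1:, 1:] = img.cumsum(axis=0).cumsum(axis=1)

    def box_mean(half):
        side = 2 * half + 1
        # Sums of all side x side windows, from four shifted copies of the table.
        sums = ii[side:, side:] - ii[:-side, side:] - ii[side:, :-side] + ii[:-side, :-side]
        return sums / side**2

    inner, outer = box_mean(n), box_mean(2 * n)
    # Crop the inner map so both are centred on the same pixels.
    return outer - inner[n:-n, n:-n]
```

With this sign convention a bright blob on a dark background gives a negative response at its centre, and a dark blob gives a positive one.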


CenSurE - Integral Image
  Uses the integral image
  But... regions are not rectangular, e.g. an octagon?
  Answer: build “slanted” integral images to calculate polygons


CenSurE
  Similar to SURF - scale invariance by upscaling the filters
    n = 1,2,3,4,5,6,7
  Non-maximal suppression in scale
    Maxima in a 3x3x3 neighborhood
    Maxima and minima must pass a threshold
  Octagonal filters rotation invariant
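The 3x3x3 non-maximal suppression step can be sketched as follows. This is a plain, unoptimised illustration; it assumes the filter responses for all scales have been stacked into one array of shape (scales, rows, cols), and both the function name and the `threshold` value are hypothetical.

```python
import numpy as np

def scale_space_extrema(responses, threshold):
    """Find strict local maxima/minima of a (scales, rows, cols) response stack.

    A point is kept when it beats all 26 neighbours in its 3x3x3 block and
    its magnitude passes the threshold (a hypothetical tuning parameter).
    """
    r = np.asarray(responses, dtype=float)
    found = []
    for s in range(1, r.shape[0] - 1):
        for y in range(1, r.shape[1] - 1):
            for x in range(1, r.shape[2] - 1):
                v = r[s, y, x]
                if abs(v) < threshold:
                    continue
                block = r[s - 1:s + 2, y - 1:y + 2, x - 1:x + 2]
                # v itself is in the block, so "strict" means it occurs once.
                unique = (block == v).sum() == 1
                if unique and (v == block.max() or v == block.min()):
                    found.append((s, y, x))
    return found
```

Because the image is never downsampled, the same (y, x) grid is valid at every scale, so neighbours across scales line up directly.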



CenSurE - evaluation
  Visual Odometry
    Extract distinctive features, find potential matches - search 1/5th image
    RANSAC based pose estimation
    If motion estimate is small and large number of inliers - discard the frame (key frames)
    Further refinement using bundle adjustment (beyond scope of course)


CenSurE results



Summary
  Strong feature detectors allow a range of 2D and 3D applications
    Realtime responses on conventional hardware
  Choose feature detector carefully
    SURF vs U-SURF, CenSurE vs SURF?
    Need to measure performance within an application domain empirically
  Very active area of research!
    http://cvlab.epfl.ch/publications/publications/2009/CalonderLKBMF09.pdf





Applications of strong feature detectors
  Photosynth
    Build a 3D image using my photos & videos
    Location based links between images
  Content Based Image Retrieval
    Text based image look up
      “find me images of dogs”
    Reverse image look up
      “find me images like this”


Photosynth
  Multi-View Stereo for Community Photo Collections - M Goesele, N Snavely, B Curless, H Hoppe, SM Seitz
    http://research.microsoft.com/~hoppe/mvscpc.pdf
  Multi View Stereo
    Dense Depth Maps
  Different
    View points
    Illumination conditions
    Camera parameters
    Clutter







Algorithm overview
  Calibrate cameras geometrically and radiometrically
    PTLens
  Global View Selection
    Good set of images for stereo matching
  Local View Selection
    Subsets of these images that will yield good matches
  Stereo match at each pixel
    Match optimised if higher confidence found


Strong Reliance on SIFT
  Strong feature extraction method
    Scale Invariant
    Rotation Invariant
    Illumination Invariant


SIFT in Photosynth
  Common SIFT features used to identify potential Global Views
  but...
    Don’t want views to be too close, stereo is ill conditioned - c.f. Pollefeys
    Function used to encourage view angle difference of > 10 degs
    Views need to be rescaled (SIFT is scale invariant)
      Resample views to lowest common denominator


Local View Selection
  Select a subset A of the N good match candidates (|A| typically 4)
  A updated through the process based on current estimate of depth
    Photometric consistency
      NCC matching
    Sufficiently wide baseline
  Match along epi-polar lines
    Reject outliers
    Depth and normal are optimised
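The photometric consistency test relies on normalised cross-correlation (NCC). A minimal sketch of NCC between two patches is given below; it is a generic formulation and makes no claim about the exact variant or patch sampling used in the multi-view stereo paper.

```python
import numpy as np

def ncc(patch_a, patch_b):
    """Normalised cross-correlation of two equally sized patches, in [-1, 1]."""
    a = np.asarray(patch_a, dtype=float).ravel()
    b = np.asarray(patch_b, dtype=float).ravel()
    a = a - a.mean()  # removing the mean cancels brightness offsets
    b = b - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)  # dividing cancels gain changes
    return float(a @ b / denom) if denom > 0 else 0.0
```

Because the score ignores offset and gain, it tolerates the differing exposure and illumination found in community photo collections.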



106 images - 51 photographers



56 images, 8 photographers



2007



http://photosynth.net 2010


Content Based Image Retrieval

Image Retrieval: Ideas, Influences, and Trends of the New Age - Ritendra Datta, Dhiraj Joshi, Jia Li, and James Z. Wang, ACM Computing Surveys, Vol. 40, No. 2, Article 5, April 2008
http://infolab.stanford.edu/~wangz/project/imsearch/review/JOUR/datta.pdf

Forsyth and Ponce, Computer Vision: A Modern Approach, Ch 25






Content Based Image Retrieval
  Organise images using their visual content
    Google Images vs iPhoto “faces” feature
  Internet becoming more visual
    Flickr, YouTube, Google Images...
    Text based tagging limited, leading to rise in CBIR
      See http://www.gwap.com/gwap/ for “human computing” behind captchas
  Image based Captcha

[Figure from Datta08]

Why understanding images is hard
  Sensory Gap
    Real world object vs information in description of the scene - picture is worth 1000 words...
  Semantic Gap
    Differences between people's interpretation of the same image.
    www.gwap.com

Image processing
  [Figure from Datta08]

Colour Histograms
  2D or 3D histograms of colour
    Colour space: RGB, RG Chromaticity, XYLAB…
      Different tolerance to lighting changes, perceptual weighting of colours
  Tolerant to scale changes, rotations & warping
  Colour manipulation of an image can put CBIR off
  Does not capture spatial organisation
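A 3D colour histogram of the kind described above can be computed in a few lines. This sketch assumes an 8-bit RGB image stored as a NumPy array of shape (H, W, 3); the function name and the choice of 8 bins per channel are illustrative.

```python
import numpy as np

def colour_histogram(img, bins=8):
    """Normalised 3-D RGB histogram of an (H, W, 3) uint8 image."""
    pixels = np.asarray(img).reshape(-1, 3)
    hist, _ = np.histogramdd(pixels, bins=(bins, bins, bins), range=[(0, 256)] * 3)
    return hist / hist.sum()  # dividing by the pixel count removes image size

rng = np.random.default_rng(1)
img = rng.integers(0, 256, size=(16, 12, 3), dtype=np.uint8)
h = colour_histogram(img)
h_rot = colour_histogram(np.rot90(img))  # rotating the image moves pixels, not colours
```

The two histograms are identical, which illustrates both the tolerance to rotation and the loss of spatial organisation noted above.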


Colour Correlogram
  Similar in concept to GLCM
  Autocorrelogram compares only identical colours
  Captures small scale spatial distribution of colours
    Scale of the correlogram
      E.g. 64x64 quantised colours
  Can be used for image sub regions
    Compare with texture segmentation


Texture
  GLCM, Autocorrelation…
  Texture Histograms
    Define bins that correspond to texture types in sub regions in the image
    Histogram counts the occurrence of the texture types
    Compact representation of texture info
  Texture of Textures
    Recursive approach - texture based region segmentation - texture analysis of regions


Global Vs Local
  Global features
    Looking for a specific image
    No distinction between foreground and background
    Simple & fast
  Local features
    Image sub-regions
    Perceptually driven features


Image Sub regions
  Tile based
    Similar to global
    Object “fracture”
    Simple...
  Region based
    Sensible ROIs
    Objects (sometimes)
    Complex...


Shape Detection
  Dense feature approach to shape description
    Histograms of Oriented Gradients (HOG)
      Dalal and Triggs
  Sparse approach involves “Segmentation”
    Edge based approaches
    Region based approaches
    Shape descriptors
  Matching based on user sketch / example image


High Level Semantics
  Extract regions that belong to a large, broadly defined semantic group
    Face detection
    Car, Boat, Airplane, Dog...
  Image tagging in large databases
  Very computationally intensive
  Limitations to approach


Strong Features
  SIFT, SURF et al.
  Image represented by an unordered collection of features
    “Bag of words”
    Features are “words”
  Create “Codebook”
    E.g. use KNN
    Use training set to create clusters, each representing an object
    N clusters = N objects that can be recognised
  Image mapped onto a histogram of codewords
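The codebook idea can be sketched as below. Note the assumption: this example builds the clusters with plain k-means and assigns each feature to its nearest cluster centre, which is one common way to realise the steps listed above; the slide itself names KNN. All function names are made up for the illustration.

```python
import numpy as np

def build_codebook(descriptors, k, iters=20, seed=0):
    """Cluster training descriptors into k codewords with plain k-means."""
    rng = np.random.default_rng(seed)
    data = np.asarray(descriptors, dtype=float)
    centres = data[rng.choice(len(data), size=k, replace=False)]
    for _ in range(iters):
        labels = assign(data, centres)
        for j in range(k):
            members = data[labels == j]
            if len(members):  # leave an empty cluster's centre where it is
                centres[j] = members.mean(axis=0)
    return centres

def assign(descriptors, centres):
    """Index of the nearest codeword for each descriptor."""
    d = np.linalg.norm(descriptors[:, None, :] - centres[None, :, :], axis=2)
    return d.argmin(axis=1)

def bow_histogram(descriptors, centres):
    """Normalised histogram of codeword occurrences for one image."""
    labels = assign(np.asarray(descriptors, dtype=float), centres)
    counts = np.bincount(labels, minlength=len(centres)).astype(float)
    return counts / counts.sum()
```

In use, `build_codebook` would be run once on descriptors from a training set, and `bow_histogram` once per image; two images are then compared through their histograms rather than feature by feature.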


Matching Image Histograms
  Many representations of images are histogram based
  Comparison based on the similarity of the histograms
  Simple measures
    Euclidean / SSD
    Overlap
    Chi-Square Statistic...
  More robust
    Earth Mover Distance (EMD)


Chi Square (χ2)
  Compares two distributions
  Simple approaches are sensitive to the way in which the data is binned
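The simple bin-by-bin measures can be sketched as follows. The chi-square form used here (half the sum of squared differences over sums) is one common symmetric variant and is an assumption, since the slide's own formula is not available; the function names are illustrative.

```python
import numpy as np

def ssd(p, q):
    """Sum of squared differences (squared Euclidean distance)."""
    return float(np.sum((p - q) ** 2))

def overlap(p, q):
    """Histogram intersection: 1.0 for identical normalised histograms."""
    return float(np.sum(np.minimum(p, q)))

def chi_square(p, q, eps=1e-12):
    """Symmetric chi-square statistic; eps guards empty bins."""
    return float(0.5 * np.sum((p - q) ** 2 / (p + q + eps)))

p = np.array([0.5, 0.5, 0.0, 0.0])
q_near = np.array([0.0, 0.5, 0.5, 0.0])   # mass moved two bins along
q_far = np.array([0.0, 0.5, 0.0, 0.5])    # mass moved three bins along
```

All three measures score `q_near` and `q_far` identically against `p`, because they only compare matching bins. That is the binning sensitivity: how far the mass moved is invisible to them.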


Earth Mover Distance (EMD) (Wasserstein metric)
  Each bin viewed as a pile of dirt piled on M
  The metric measures the cost of turning one histogram into the other...
    i.e. the amount of dirt multiplied by the distance it needs to be moved
  We want to find the flow F = [fij], where fij measures the flow between pi and qj, such that it minimises the work
  Subject to 4 constraints
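For reference, the optimisation problem can be written out as follows, using the formulation in Rubner, Tomasi and Guibas. The symbols w_{p_i} and w_{q_j} (the amount of dirt in each bin) and d_{ij} (the ground distance between bins i and j) are notation from that paper and are assumed here, since the slide's own equations are not available.

```latex
\begin{aligned}
\min_{F}\ \ & \mathrm{WORK}(P, Q, F) = \sum_{i}\sum_{j} f_{ij}\, d_{ij} \\
\text{subject to}\ \ & f_{ij} \ge 0, \\
& \sum_{j} f_{ij} \le w_{p_i}, \\
& \sum_{i} f_{ij} \le w_{q_j}, \\
& \sum_{i}\sum_{j} f_{ij} = \min\Big(\sum_{i} w_{p_i},\ \sum_{j} w_{q_j}\Big), \\[4pt]
\mathrm{EMD}(P, Q) &= \frac{\sum_{i}\sum_{j} f_{ij}\, d_{ij}}{\sum_{i}\sum_{j} f_{ij}}
\end{aligned}
```

The four constraints say that dirt only moves from P to Q, no bin gives more than it holds, no bin receives more than it can take, and as much dirt as possible is moved.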


EMD
  Once the optimisation problem has been solved, the EMD is the resulting work normalised by the total flow
  Details in The Earth Mover's Distance as a Metric for Image Retrieval - 1998, Yossi Rubner, Carlo Tomasi, and Leonidas J. Guibas
    http://www.cs.duke.edu/~tomasi/papers/rubner/rubnerTr98.pdf
  Code for EMD
    http://www.cs.duke.edu/~tomasi/software/emd.htm
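For intuition, the special case of two 1-D histograms with equal total mass and unit spacing between bins needs no optimiser: the minimal work is the summed absolute difference of the running totals. The sketch below covers only that special case and is unrelated to the code linked above; `emd_1d` is a made-up name.

```python
import numpy as np

def emd_1d(p, q):
    """EMD between two 1-D histograms of equal total mass, unit bin spacing."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    # Mass that must cross each bin boundary is the gap between the running sums.
    return float(np.abs(np.cumsum(p - q)).sum())

p = np.array([0.5, 0.5, 0.0, 0.0])
q_near = np.array([0.0, 0.5, 0.5, 0.0])   # half the mass moved two bins
q_far = np.array([0.0, 0.5, 0.0, 0.5])    # half the mass moved three bins
d_near, d_far = emd_1d(p, q_near), emd_1d(p, q_far)
```

Unlike a bin-by-bin comparison, the distance grows with how far the mass has to travel, so `d_far` is larger than `d_near`.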






Relevance Feedback
  Image retrieval is iterative
  User target class is unique
    For some images colour may be a good feature
    For others texture, shape, etc.
  User provides feedback on
    Good match
    Bad match
    Neutral match
  Imbalance between new training set and global set - SVM not suitable


CBIR
  Large and rapidly growing field
  Applications to video as well as images
    Violence detection - TCD & Google
    Pornography detection - TCD and PixAlert
  Knowledge of image descriptors' strengths and weaknesses
  Knowledge of image matching methods' strengths and weaknesses
  Role of Relevance Feedback
