Reconnaissance d’objets et vision artificielle

Date post: 26-Feb-2016
Page 1: Reconnaissance d’objets et vision  artificielle

Object recognition and computer vision (Reconnaissance d’objets et vision artificielle)

http://www.di.ens.fr/willow/teaching/recvis09

Lecture 3

A refresher on camera geometry

Image alignment and 3D alignment

Page 2: Reconnaissance d’objets et vision  artificielle

Check it out!

Course on “Computational photography” by Frédo Durand

Thursdays, 9:30 to 12:30, Salle Info 2

http://people.csail.mit.edu/fredo/Classes/Comp_Photo_ENS/

Page 3: Reconnaissance d’objets et vision  artificielle

Don’t forget!

First programming assignment due October 27

http://www.di.ens.fr/willow/teaching/recvis09/assignment1/

Page 4: Reconnaissance d’objets et vision  artificielle

Pinhole perspective equation

$$x' = f'\,\frac{x}{z}, \qquad y' = f'\,\frac{y}{z}$$

NOTE: z is always negative.
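The projection equations on this slide can be tried numerically. The following is a minimal sketch, assuming points are given in the camera frame with z < 0 as in the note; the function name `project_pinhole` and the sample values are illustrative and do not come from the slides.

```python
def project_pinhole(point, f_prime):
    """Project a camera-frame point (x, y, z) onto the image plane.

    Implements x' = f' x / z and y' = f' y / z. The distance f_prime and the
    point coordinates are assumed to share the same length unit.
    """
    x, y, z = point
    if z == 0:
        raise ValueError("point lies in the plane of the pinhole (z = 0)")
    return f_prime * x / z, f_prime * y / z


# Illustrative values: a point 2 units in front of the pinhole (z = -2).
x_img, y_img = project_pinhole((1.0, 0.5, -2.0), f_prime=0.05)
print(x_img, y_img)
```

Because z is negative, both image coordinates change sign relative to the scene point, which is the usual inverted pinhole image.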

Page 5: Reconnaissance d’objets et vision  artificielle

Affine models: Weak perspective projection

$$x' = -m\,x, \qquad y' = -m\,y, \qquad \text{where } m = -\frac{f'}{z_0}$$

is the magnification.

When the scene relief is small compared to its distance from the camera, m can be taken constant: weak perspective projection.
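To see how good the approximation is when the relief is small, here is a hedged sketch comparing the two models on a shallow scene. The function names, the reference depth `z0 = -10`, and the three sample points are assumptions made for illustration only.

```python
def project_pinhole(point, f_prime):
    """Full perspective: x' = f' x / z, y' = f' y / z."""
    x, y, z = point
    return f_prime * x / z, f_prime * y / z


def project_weak_perspective(point, f_prime, z0):
    """Weak perspective: x' = -m x, y' = -m y with m = -f' / z0."""
    m = -f_prime / z0      # magnification, constant for the whole scene
    x, y, _ = point        # the point's own depth is ignored
    return -m * x, -m * y


f_prime, z0 = 1.0, -10.0
# Scene relief of +/- 0.1 around the reference depth z0 (illustrative values).
scene = [(1.0, 1.0, -9.9), (1.0, 1.0, -10.1), (-2.0, 0.5, -10.0)]
max_error = max(
    abs(p - w)
    for point in scene
    for p, w in zip(project_pinhole(point, f_prime),
                    project_weak_perspective(point, f_prime, z0))
)
print(max_error)
```

The two models agree exactly for points at depth z0, and the discrepancy grows with the relief relative to |z0|.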

Page 6: Reconnaissance d’objets et vision  artificielle

Affine models: Orthographic projection

$$x' = x, \qquad y' = y$$

When the camera is at a (roughly constant) distance from the scene, take m = 1 (up to the sign convention chosen for the image axes).

Page 7: Reconnaissance d’objets et vision  artificielle

Analytical camera geometry

Page 8: Reconnaissance d’objets et vision  artificielle

Coordinate Changes: Pure Translations

$$\overrightarrow{O_B P} = \overrightarrow{O_B O_A} + \overrightarrow{O_A P}, \qquad {}^B P = {}^A P + {}^B O_A$$

Page 9: Reconnaissance d’objets et vision  artificielle

Coordinate Changes: Pure Rotations

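$${}^B_A R =
\begin{pmatrix}
\mathbf{i}_A \cdot \mathbf{i}_B & \mathbf{j}_A \cdot \mathbf{i}_B & \mathbf{k}_A \cdot \mathbf{i}_B \\
\mathbf{i}_A \cdot \mathbf{j}_B & \mathbf{j}_A \cdot \mathbf{j}_B & \mathbf{k}_A \cdot \mathbf{j}_B \\
\mathbf{i}_A \cdot \mathbf{k}_B & \mathbf{j}_A \cdot \mathbf{k}_B & \mathbf{k}_A \cdot \mathbf{k}_B
\end{pmatrix}
=
\begin{pmatrix} {}^B\mathbf{i}_A & {}^B\mathbf{j}_A & {}^B\mathbf{k}_A \end{pmatrix}
=
\begin{pmatrix} {}^A\mathbf{i}_B^T \\ {}^A\mathbf{j}_B^T \\ {}^A\mathbf{k}_B^T \end{pmatrix}$$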

Page 10: Reconnaissance d’objets et vision  artificielle

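Coordinate Changes: Rotations about the z Axis

$${}^B_A R =
\begin{pmatrix}
\cos\theta & \sin\theta & 0 \\
-\sin\theta & \cos\theta & 0 \\
0 & 0 & 1
\end{pmatrix}$$

where θ is the angle of the rotation about the z axis.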

Page 11: Reconnaissance d’objets et vision  artificielle

A rotation matrix is characterized by the following properties:

• Its inverse is equal to its transpose, and

• its determinant is equal to 1.

Or equivalently:

• Its rows (or columns) form a right-handed orthonormal coordinate system.
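These properties are easy to check numerically. The sketch below assumes NumPy and uses a rotation about the z axis as the test matrix; the helper name `rotation_about_z`, its sign convention, and the 30 degree angle are illustrative choices, not part of the slides.

```python
import numpy as np


def rotation_about_z(theta):
    """Rotation matrix about the z axis (sign convention assumed for this sketch)."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, s, 0.0],
                     [-s, c, 0.0],
                     [0.0, 0.0, 1.0]])


R = rotation_about_z(np.deg2rad(30.0))

# Property 1: the inverse equals the transpose, i.e. R^T R = I.
inverse_is_transpose = np.allclose(R.T @ R, np.eye(3))
# Property 2: the determinant equals 1.
determinant_is_one = np.isclose(np.linalg.det(R), 1.0)
# Equivalent statement: the rows are orthonormal and right-handed,
# so the cross product of the first two rows gives the third.
right_handed = np.allclose(np.cross(R[0], R[1]), R[2])

print(inverse_is_transpose, determinant_is_one, right_handed)
```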

Page 12: Reconnaissance d’objets et vision  artificielle

Coordinate changes: pure rotations

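$$\overrightarrow{OP}
= \begin{pmatrix} \mathbf{i}_A & \mathbf{j}_A & \mathbf{k}_A \end{pmatrix}
\begin{pmatrix} {}^A x \\ {}^A y \\ {}^A z \end{pmatrix}
= \begin{pmatrix} \mathbf{i}_B & \mathbf{j}_B & \mathbf{k}_B \end{pmatrix}
\begin{pmatrix} {}^B x \\ {}^B y \\ {}^B z \end{pmatrix}
\quad\Longrightarrow\quad
{}^B P = {}^B_A R \; {}^A P$$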

Page 13: Reconnaissance d’objets et vision  artificielle

Coordinate Changes: Rigid Transformations

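$${}^B P = {}^B_A R \; {}^A P + {}^B O_A$$

$$\begin{pmatrix} {}^B P \\ 1 \end{pmatrix}
= \begin{pmatrix} {}^B_A R & {}^B O_A \\ \mathbf{0}^T & 1 \end{pmatrix}
\begin{pmatrix} {}^A P \\ 1 \end{pmatrix}
= {}^B_A T \begin{pmatrix} {}^A P \\ 1 \end{pmatrix}$$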
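The rigid change of coordinates can be written either as a rotation plus a translation or as a single 4x4 matrix acting on homogeneous coordinates. A minimal NumPy sketch follows; the function name `rigid_transform` and all numeric values are made up for illustration.

```python
import numpy as np


def rigid_transform(R_BA, O_A_in_B):
    """Build the 4x4 matrix [[R, O], [0^T, 1]] mapping frame-A to frame-B coordinates."""
    T = np.eye(4)
    T[:3, :3] = R_BA
    T[:3, 3] = O_A_in_B
    return T


# Illustrative rotation about z and origin of frame A expressed in frame B.
R_BA = np.array([[0.0, 1.0, 0.0],
                 [-1.0, 0.0, 0.0],
                 [0.0, 0.0, 1.0]])
O_A_in_B = np.array([1.0, 2.0, 3.0])
T_BA = rigid_transform(R_BA, O_A_in_B)

P_A = np.array([1.0, 0.0, 0.0])
P_B_direct = R_BA @ P_A + O_A_in_B                  # rotation, then translation
P_B_homogeneous = (T_BA @ np.append(P_A, 1.0))[:3]  # one matrix product
print(P_B_direct, P_B_homogeneous)
```

The homogeneous form is convenient because successive changes of frame compose by plain matrix multiplication.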

Page 14: Reconnaissance d’objets et vision  artificielle

Pinhole perspective equation

$$x' = f'\,\frac{x}{z}, \qquad y' = f'\,\frac{y}{z}$$

NOTE: z is always negative.

Page 15: Reconnaissance d’objets et vision  artificielle

The intrinsic parameters of a camera

Normalized image coordinates

Physical image coordinates

Units: k, l: pixel/m; f: m; α, β: pixel

Page 16: Reconnaissance d’objets et vision  artificielle

The intrinsic parameters of a camera

Calibration matrix

The perspective projection equation

Page 17: Reconnaissance d’objets et vision  artificielle

The extrinsic parameters of a camera

Page 18: Reconnaissance d’objets et vision  artificielle

Perspective projections induce projective transformations between planes

Page 19: Reconnaissance d’objets et vision  artificielle

Weak-perspective projection

Paraperspective projection

Affine cameras

Page 20: Reconnaissance d’objets et vision  artificielle

Orthographic projection

Parallel projection

More affine cameras

Page 21: Reconnaissance d’objets et vision  artificielle

Weak-perspective projection model

(p and P are in homogeneous coordinates)

p = A P + b (neither p nor P is in hom. coordinates)

p = M P (P is in homogeneous coordinates)

Page 22: Reconnaissance d’objets et vision  artificielle

Affine projections induce affine transformations from planes onto their images.

Page 23: Reconnaissance d’objets et vision  artificielle

Image alignment task

?

• It helps to be able to compare descriptors of local patches surrounding interest points (cf last lecture).

• This is not strictly necessary. We will concentrate here on the geometry of the problem.

Page 24: Reconnaissance d’objets et vision  artificielle

Dealing with outliers

The set of putative matches still contains a very high percentage of outliers

How do we fit a geometric transformation to a small subset of all possible matches?

Possible strategies:

• RANSAC

• Incremental alignment

• Hough transform

• Hashing

Page 25: Reconnaissance d’objets et vision  artificielle

Strategy 1: RANSAC

RANSAC loop (Fischler & Bolles, 1981):

• Randomly select a seed group of matches

• Compute transformation from seed group

• Find inliers to this transformation

• If the number of inliers is sufficiently large, re-compute least-squares estimate of transformation on all of the inliers

• Keep the transformation with the largest number of inliers

Page 26: Reconnaissance d’objets et vision  artificielle

RANSAC example: Translation

Putative matches

Page 27: Reconnaissance d’objets et vision  artificielle

RANSAC example: Translation

Select one match, count inliers

Page 28: Reconnaissance d’objets et vision  artificielle

RANSAC example: Translation

Select one match, count inliers

Page 29: Reconnaissance d’objets et vision  artificielle

RANSAC example: Translation

Find “average” translation vector
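The translation example can be sketched in a few lines. This is a simplified illustration, not the lecture's code: the function name `ransac_translation`, the pixel threshold, the iteration count, and the synthetic matches are all assumptions, and the least-squares refit is done once at the end on the largest consensus set rather than inside the loop.

```python
import random


def ransac_translation(matches, threshold=3.0, iterations=100, seed=0):
    """matches: list of ((x, y), (x2, y2)) putative correspondences."""
    rng = random.Random(seed)
    best_inliers = []
    for _ in range(iterations):
        # Seed group: a single match is enough to define a translation.
        (x, y), (x2, y2) = rng.choice(matches)
        tx, ty = x2 - x, y2 - y
        # Inliers: matches whose displacement agrees with (tx, ty).
        inliers = [
            (a, b) for a, b in matches
            if ((b[0] - a[0] - tx) ** 2 + (b[1] - a[1] - ty) ** 2) ** 0.5 <= threshold
        ]
        if len(inliers) > len(best_inliers):
            best_inliers = inliers
    # Least-squares translation over the inliers is the average displacement.
    n = len(best_inliers)
    tx = sum(b[0] - a[0] for a, b in best_inliers) / n
    ty = sum(b[1] - a[1] for a, b in best_inliers) / n
    return (tx, ty), best_inliers


# Synthetic data: six matches near translation (10, -5) and four outliers.
good = [(10.0, -5.0), (10.5, -5.2), (9.6, -4.9), (10.2, -5.4), (9.9, -4.7), (10.1, -5.1)]
bad = [(-40.0, 30.0), (25.0, 60.0), (0.0, -70.0), (55.0, 5.0)]
matches = [((7.0 * i, 3.0 * i), (7.0 * i + dx, 3.0 * i + dy))
           for i, (dx, dy) in enumerate(good + bad)]
translation, inliers = ransac_translation(matches)
print(translation, len(inliers))
```

For richer models (affine, homography) only the seed size and the fitting step change; the loop structure stays the same.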

Page 30: Reconnaissance d’objets et vision  artificielle

Strategy 2: Incremental alignment

Take advantage of strong locality constraints: only pick close-by matches to start with, and gradually add more matches in the same neighborhood

Approach introduced in [Ayache & Faugeras, 1982; Hebert & Faugeras, 1983; Gaston & Lozano-Perez, 1984]

Illustrated here with the method from S. Lazebnik, C. Schmid and J. Ponce, “Semi-local affine parts for object recognition”, BMVC 2004

Page 31: Reconnaissance d’objets et vision  artificielle

Incremental alignment: Details

Generating seed groups:

• Identify triples of neighboring features (i, j, k) in first image

• Find all triples (i', j', k') in the second image such that i' (resp. j', k') is a putative match of i (resp. j, k), and j', k' are neighbors of i'

Page 32: Reconnaissance d’objets et vision  artificielle

Incremental alignment: Details

Beginning with each seed triple, repeat:

• Estimate the aligning transformation between corresponding features in current group of matches

• Grow the group by adding other consistent matches in the neighborhood

Until the transformation is no longer consistent or no more matches can be found
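The grow-and-re-estimate loop can be sketched as follows. This is a much simplified illustration under stated assumptions, not the method of the cited paper: the aligning transformation is taken to be affine, "neighborhood" means within a fixed `radius` of some group member, "consistent" means residual below `tol`, and the names `fit_affine` and `grow_group` are hypothetical.

```python
import numpy as np


def fit_affine(src, dst):
    """Least-squares X (3x2) such that [src 1] @ X approximates dst."""
    A = np.hstack([src, np.ones((len(src), 1))])
    X, *_ = np.linalg.lstsq(A, dst, rcond=None)
    return X


def grow_group(src, dst, seed, radius=1.5, tol=0.5):
    """Grow a group of match indices starting from a seed triple."""
    group = list(seed)
    src_h = np.hstack([src, np.ones((len(src), 1))])
    while True:
        X = fit_affine(src[group], dst[group])
        residual = np.linalg.norm(src_h @ X - dst, axis=1)
        if residual[group].max() > tol:
            break  # the transformation is no longer consistent
        new = [
            i for i in range(len(src))
            if i not in group
            and residual[i] <= tol
            and np.linalg.norm(src[group] - src[i], axis=1).min() <= radius
        ]
        if not new:
            break  # no more matches can be found
        group.extend(new)
    return sorted(group)


# Synthetic matches: indices 0-6 follow dst = 2 * src + (5, 5); 7 and 8 are outliers.
src = np.array([[0, 0], [1, 0], [0, 1], [1, 1], [2, 0], [2, 1], [3, 0], [1, 2], [10, 10]], float)
dst = 2.0 * src + np.array([5.0, 5.0])
dst[7] += [4.0, -3.0]
dst[8] += [-6.0, 2.0]
group = grow_group(src, dst, seed=[0, 1, 2])
print(group)
```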

Page 37: Reconnaissance d’objets et vision  artificielle

Strategy 3: Hough transform

Suppose our features are scale- and rotation-covariant

• Then a single feature match provides an alignment hypothesis (translation, scale, orientation)

• Of course, a hypothesis obtained from a single match is unreliable

• Solution: let each match vote for its hypothesis in a Hough space with very coarse bins

David G. Lowe. “Distinctive image features from scale-invariant keypoints”, IJCV 60 (2), pp. 91-110, 2004.

Page 38: Reconnaissance d’objets et vision  artificielle

Hough transform

• An early type of voting scheme

• General outline:

  • Discretize parameter space into bins

  • For each feature point in the image, put a vote in every bin in the parameter space that could have generated this point

  • Find bins that have the most votes

P.V.C. Hough, Machine Analysis of Bubble Chamber Pictures, Proc. Int. Conf. High Energy Accelerators and Instrumentation, 1959

[Figure: image space and Hough parameter space]

Page 39: Reconnaissance d’objets et vision  artificielle

Parameter space representation

• A line y = mx + b in the image corresponds to a point (m, b) in Hough space

[Figure: image space and Hough parameter space]

Source: K. Grauman

Page 40: Reconnaissance d’objets et vision  artificielle

Parameter space representation

• What does a point (x0, y0) in the image space map to in the Hough space?

• Answer: the solutions of b = –x0m + y0

• This is a line in Hough space

[Figure: image space and Hough parameter space]

Source: K. Grauman

Page 41: Reconnaissance d’objets et vision  artificielle

Parameter space representation

• Where is the line that contains both (x0, y0) and (x1, y1)?

• It is the intersection of the lines b = –x0m + y0 and b = –x1m + y1

[Figure: points (x0, y0) and (x1, y1) in image space; the line b = –x1m + y1 in Hough parameter space]

Source: K. Grauman
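The voting idea for lines in (m, b) space can be sketched as below. It assumes a discretized list of candidate slopes and a fixed intercept bin width; the function name `hough_lines`, the bin sizes, and the sample points are illustrative. Note that this slope-intercept form cannot represent vertical lines.

```python
from collections import Counter


def hough_lines(points, slopes, b_step=0.5):
    """Each point (x, y) votes along its Hough-space line b = -x*m + y."""
    votes = Counter()
    for x, y in points:
        for i, m in enumerate(slopes):
            b = -x * m + y
            votes[(i, round(b / b_step))] += 1   # one vote per (slope, intercept) bin
    (i, b_bin), count = votes.most_common(1)[0]  # bin with the most votes
    return slopes[i], b_bin * b_step, count


slopes = [k * 0.25 for k in range(-8, 9)]            # candidate slopes m in [-2, 2]
points = [(0, 1), (1, 3), (2, 5), (3, 7), (4, 0)]    # first four lie on y = 2x + 1
m, b, count = hough_lines(points, slopes)
print(m, b, count)
```

The Hough-space lines of the four collinear points all pass through the bin containing (m, b) = (2, 1), which therefore collects four votes; the stray point votes elsewhere.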

Page 42: Reconnaissance d’objets et vision  artificielle

Hough transform details (D. Lowe’s system)

Training phase: For each model feature, record 2D location, scale, and orientation of model (relative to normalized feature frame)

Test phase: Let each match between a test and a model feature vote in a 4D Hough space

• Use broad bin sizes of 30 degrees for orientation, a factor of 2 for scale, and 0.25 times image size for location

• Vote for two closest bins in each dimension

Find all bins with at least three votes and perform geometric verification

• Estimate least squares affine transformation

• Use stricter thresholds on transformation residual

• Search for additional features that agree with the alignment
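The binning step of the test phase might look like the sketch below. It is an illustration under assumptions, not Lowe's implementation: the pose hypothesis is taken as (x, y, scale, orientation in degrees), `image_size` is a single number standing in for "image size", scale bins are unit intervals of log2(scale), and the helper names are hypothetical.

```python
import math
from collections import Counter
from itertools import product


def two_closest_bins(value, bin_size):
    """Indices of the two bins whose centers are closest to value."""
    lower = math.floor(value / bin_size - 0.5)
    return lower, lower + 1


def pose_votes(x, y, scale, orientation_deg, image_size):
    """Return the 2^4 = 16 bins that one match votes for in the 4D Hough space."""
    loc_bin = 0.25 * image_size                        # 0.25 times image size
    dims = [
        two_closest_bins(x, loc_bin),
        two_closest_bins(y, loc_bin),
        two_closest_bins(math.log2(scale), 1.0),       # a factor of 2 per bin
        tuple(b % 12 for b in two_closest_bins(orientation_deg % 360.0, 30.0)),  # 12 bins of 30 degrees, wrapping around
    ]
    return list(product(*dims))


accumulator = Counter()
votes = pose_votes(x=100.0, y=140.0, scale=2.0, orientation_deg=350.0, image_size=400.0)
accumulator.update(votes)
# Bins holding at least three votes would then go on to geometric verification.
candidates = [bin_ for bin_, n in accumulator.items() if n >= 3]
print(len(votes), candidates)
```

Voting for the two closest bins in each dimension reduces the chance that matches for the same pose are split by a bin boundary.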

Page 43: Reconnaissance d’objets et vision  artificielle

Affine projections induce affine transformations from planes onto their images.

Page 44: Reconnaissance d’objets et vision  artificielle

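Affine transformations

An affine transformation maps a parallelogram onto another parallelogram

$$\begin{pmatrix} u' \\ v' \\ 1 \end{pmatrix}
= \begin{pmatrix} a_{11} & a_{12} & b_1 \\ a_{21} & a_{22} & b_2 \\ 0 & 0 & 1 \end{pmatrix}
\begin{pmatrix} u \\ v \\ 1 \end{pmatrix}$$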

Page 45: Reconnaissance d’objets et vision  artificielle

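Fitting an affine transformation

Equation for affine transformation:

$$\begin{pmatrix} u' \\ v' \\ 1 \end{pmatrix}
= \begin{pmatrix} a_{11} & a_{12} & b_1 \\ a_{21} & a_{22} & b_2 \\ 0 & 0 & 1 \end{pmatrix}
\begin{pmatrix} u \\ v \\ 1 \end{pmatrix}$$

9 entries, 6 degrees of freedom

$$\begin{pmatrix} u & v & 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & u & v & 1 \end{pmatrix}
\begin{pmatrix} a_{11} \\ a_{12} \\ b_1 \\ a_{21} \\ a_{22} \\ b_2 \end{pmatrix}
= \begin{pmatrix} u' \\ v' \end{pmatrix},
\qquad U\,\mathbf{a} = \mathbf{u}'$$

2 equations in 6 unknowns

In general uniquely determined by 3 correspondences

Linear least squares for more correspondences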
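A minimal NumPy sketch of the linear system U a = u' follows, with two rows of U per correspondence and the unknowns ordered (a11, a12, b1, a21, a22, b2). The function name `fit_affine` and the test transformation are illustrative assumptions.

```python
import numpy as np


def fit_affine(src, dst):
    """Least-squares affine fit: returns (A, b) with dst ≈ src @ A.T + b."""
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    n = len(src)
    U = np.zeros((2 * n, 6))
    U[0::2, 0:2] = src      # rows (u, v, 1, 0, 0, 0)
    U[0::2, 2] = 1.0
    U[1::2, 3:5] = src      # rows (0, 0, 0, u, v, 1)
    U[1::2, 5] = 1.0
    u_prime = dst.reshape(-1)                    # (u'_1, v'_1, u'_2, v'_2, ...)
    a, *_ = np.linalg.lstsq(U, u_prime, rcond=None)
    A = np.array([[a[0], a[1]], [a[3], a[4]]])
    b = np.array([a[2], a[5]])
    return A, b


# Illustrative ground truth and four noise-free correspondences.
A_true = np.array([[1.5, -0.5], [0.25, 2.0]])
b_true = np.array([3.0, -1.0])
src = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [2.0, 3.0]])
dst = src @ A_true.T + b_true
A_est, b_est = fit_affine(src, dst)
print(A_est, b_est)
```

With exactly three non-collinear correspondences the system is square and the solution is exact; with more, `lstsq` returns the least-squares estimate.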

Page 48: Reconnaissance d’objets et vision  artificielle

Strategy 4: Hashing

Make each invariant image feature into a low-dimensional “key” that indexes into a table of hypotheses

Given a new test image, compute the hash keys for all features found in that image, access the table, and look for consistent hypotheses

This can even work when we don’t have any feature descriptors: we can take n-tuples of neighboring features and compute invariant hash codes from their geometric configurations
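As one possible illustration of a descriptor-free hash key, the sketch below hashes triples of points by their quantized side-length ratios. This particular key is an assumption chosen for simplicity: it is invariant to translation, rotation, and uniform scaling only (not to general affine maps), and the names `triangle_key` and `model_A` are hypothetical.

```python
import math
from collections import defaultdict


def triangle_key(p, q, r, bins=10):
    """Quantized ratios of the two shorter sides to the longest side."""
    sides = sorted([math.dist(p, q), math.dist(q, r), math.dist(r, p)])
    return (int(bins * sides[0] / sides[2]), int(bins * sides[1] / sides[2]))


# Training: index a model triple in the hash table.
table = defaultdict(list)
model_triple = [(0.0, 0.0), (6.0, 0.0), (1.0, 3.0)]
table[triangle_key(*model_triple)].append("model_A")

# Test: the same triple rotated by 90 degrees, scaled by 2, and translated.
test_triple = [(-2.0 * y + 10.0, 2.0 * x - 4.0) for x, y in model_triple]
hypotheses = table[triangle_key(*test_triple)]
print(triangle_key(*test_triple), hypotheses)
```

Hard quantization can split nearly identical configurations across neighboring bins, so a practical table would need some tolerance around bin boundaries.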

Page 49: Reconnaissance d’objets et vision  artificielle

Beyond affine transformations

What is the transformation between two views of a planar surface?

What is the transformation between images from two cameras that share the same center?

Page 50: Reconnaissance d’objets et vision  artificielle

Perspective projections induce projective transformations between planes

Page 51: Reconnaissance d’objets et vision  artificielle

Beyond affine transformations

Homography: plane projective transformation (transformation taking a quad to another arbitrary quad)

Page 53: Reconnaissance d’objets et vision  artificielle

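Fitting a homography

Recall: homogeneous coordinates

Converting to homogeneous image coordinates:

$$(x, y) \;\Rightarrow\; \begin{pmatrix} x \\ y \\ 1 \end{pmatrix}$$

Converting from homogeneous image coordinates:

$$\begin{pmatrix} x \\ y \\ w \end{pmatrix} \;\Rightarrow\; (x/w,\; y/w)$$

Equation for homography:

$$\lambda \begin{pmatrix} x' \\ y' \\ 1 \end{pmatrix}
= \begin{pmatrix} h_{11} & h_{12} & h_{13} \\ h_{21} & h_{22} & h_{23} \\ h_{31} & h_{32} & h_{33} \end{pmatrix}
\begin{pmatrix} x \\ y \\ 1 \end{pmatrix}$$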
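Applying a homography to image points is a round trip through homogeneous coordinates. Here is a small NumPy sketch; the function names and the matrix `H` are illustrative assumptions.

```python
import numpy as np


def to_homogeneous(points):
    """(x, y) -> (x, y, 1) for an (n, 2) array of points."""
    points = np.asarray(points, dtype=float)
    return np.hstack([points, np.ones((len(points), 1))])


def from_homogeneous(points_h):
    """(x, y, w) -> (x / w, y / w) for an (n, 3) array."""
    points_h = np.asarray(points_h, dtype=float)
    return points_h[:, :2] / points_h[:, 2:3]


def apply_homography(H, points):
    """Map points through H and divide out the scale factor."""
    return from_homogeneous(to_homogeneous(points) @ H.T)


H = np.array([[1.0, 0.0, 2.0],
              [0.0, 1.0, 3.0],
              [0.0, 0.0, 2.0]])
mapped = apply_homography(H, [(0.0, 0.0), (2.0, 4.0)])
mapped_rescaled = apply_homography(5.0 * H, [(0.0, 0.0), (2.0, 4.0)])
print(mapped)
```

Multiplying H by any nonzero constant leaves the mapped points unchanged, which is why the overall scale of H is arbitrary.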

Page 54: Reconnaissance d’objets et vision  artificielle

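Fitting a homography

Equation for homography:

$$\lambda \begin{pmatrix} x'_i \\ y'_i \\ 1 \end{pmatrix}
= \begin{pmatrix} h_{11} & h_{12} & h_{13} \\ h_{21} & h_{22} & h_{23} \\ h_{31} & h_{32} & h_{33} \end{pmatrix}
\begin{pmatrix} x_i \\ y_i \\ 1 \end{pmatrix}$$

$$\lambda\,\mathbf{x}'_i = H\,\mathbf{x}_i
= \begin{pmatrix} \mathbf{h}_1^T \\ \mathbf{h}_2^T \\ \mathbf{h}_3^T \end{pmatrix} \mathbf{x}_i$$

$$\mathbf{x}'_i \times H\,\mathbf{x}_i = \mathbf{0}$$

$$\mathbf{x}'_i \times H\,\mathbf{x}_i =
\begin{pmatrix}
y'_i\,\mathbf{h}_3^T\mathbf{x}_i - \mathbf{h}_2^T\mathbf{x}_i \\
\mathbf{h}_1^T\mathbf{x}_i - x'_i\,\mathbf{h}_3^T\mathbf{x}_i \\
x'_i\,\mathbf{h}_2^T\mathbf{x}_i - y'_i\,\mathbf{h}_1^T\mathbf{x}_i
\end{pmatrix}$$

$$\begin{pmatrix}
\mathbf{0}^T & -\mathbf{x}_i^T & y'_i\,\mathbf{x}_i^T \\
\mathbf{x}_i^T & \mathbf{0}^T & -x'_i\,\mathbf{x}_i^T \\
-y'_i\,\mathbf{x}_i^T & x'_i\,\mathbf{x}_i^T & \mathbf{0}^T
\end{pmatrix}
\begin{pmatrix} \mathbf{h}_1 \\ \mathbf{h}_2 \\ \mathbf{h}_3 \end{pmatrix} = \mathbf{0}$$

3 equations, only 2 linearly independent

9 entries, 8 degrees of freedom (scale is arbitrary)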

Page 55: Reconnaissance d’objets et vision  artificielle

Direct linear transform

H has 8 degrees of freedom (9 parameters, but scale is arbitrary)

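One match gives us two linearly independent equations

Four matches needed for a minimal solution (null space of 8x9 matrix)

More than four: homogeneous least squares

$$\begin{pmatrix}
\mathbf{0}^T & -\mathbf{x}_1^T & y'_1\,\mathbf{x}_1^T \\
\mathbf{x}_1^T & \mathbf{0}^T & -x'_1\,\mathbf{x}_1^T \\
\vdots & \vdots & \vdots \\
\mathbf{0}^T & -\mathbf{x}_n^T & y'_n\,\mathbf{x}_n^T \\
\mathbf{x}_n^T & \mathbf{0}^T & -x'_n\,\mathbf{x}_n^T
\end{pmatrix}
\begin{pmatrix} \mathbf{h}_1 \\ \mathbf{h}_2 \\ \mathbf{h}_3 \end{pmatrix} = \mathbf{0}$$

$$A\,\mathbf{h} = \mathbf{0}$$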
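The direct linear transform can be sketched with NumPy's SVD: stack two rows per match into A and take the right singular vector with the smallest singular value as h. The function name `fit_homography_dlt` and the test homography are assumptions; coordinate normalization, commonly recommended for numerical stability in practice, is omitted to keep the sketch short.

```python
import numpy as np


def fit_homography_dlt(src, dst):
    """Estimate H (up to scale) from n >= 4 matches (x, y) -> (x', y')."""
    rows = []
    for (x, y), (xp, yp) in zip(src, dst):
        X = np.array([x, y, 1.0])
        zero = np.zeros(3)
        rows.append(np.concatenate([zero, -X, yp * X]))   # (0^T, -x^T, y' x^T)
        rows.append(np.concatenate([X, zero, -xp * X]))   # (x^T, 0^T, -x' x^T)
    A = np.array(rows)
    # Null vector (n = 4) or homogeneous least-squares solution (n > 4):
    # the right singular vector associated with the smallest singular value.
    _, _, Vt = np.linalg.svd(A)
    H = Vt[-1].reshape(3, 3)
    return H / np.linalg.norm(H)


# Illustrative ground truth and four matches (corners of a square).
H_true = np.array([[1.0, 0.2, 5.0],
                   [0.1, 1.5, -3.0],
                   [0.001, 0.002, 1.0]])
src = np.array([[0.0, 0.0], [100.0, 0.0], [100.0, 100.0], [0.0, 100.0]])
src_h = np.hstack([src, np.ones((4, 1))])
dst_h = src_h @ H_true.T
dst = dst_h[:, :2] / dst_h[:, 2:3]

H_est = fit_homography_dlt(src, dst)
H_est = H_est / H_est[2, 2]   # fix the arbitrary scale for comparison
print(H_est)
```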

Page 56: Reconnaissance d’objets et vision  artificielle

Application: Panorama stitching

Images courtesy of A. Zisserman.

Page 57: Reconnaissance d’objets et vision  artificielle

Recognizing panoramas

M. Brown and D. Lowe, “Recognizing panoramas”, ICCV 2003.

Given contents of a camera memory card, automatically figure out which pictures go together and stitch them together into panoramas

Page 58: Reconnaissance d’objets et vision  artificielle

1. Estimate homography (RANSAC)

Page 61: Reconnaissance d’objets et vision  artificielle

2. Find connected sets of images
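Grouping images into panoramas amounts to finding connected components of a graph whose edges are verified image pairs. The sketch below uses a small union-find structure; it assumes the list of matched pairs is already given (for example, pairs for which step 1 found a homography with enough inliers), and the function name `connected_image_sets` is hypothetical.

```python
def connected_image_sets(num_images, matched_pairs):
    """Group image indices into connected sets given pairwise matches."""
    parent = list(range(num_images))

    def find(i):
        # Follow parent links to the representative, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for a, b in matched_pairs:
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[rb] = ra   # merge the two sets

    groups = {}
    for i in range(num_images):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Six images; 0-1-2 overlap in a chain, 4-5 overlap, 3 matches nothing.
panoramas = connected_image_sets(6, [(0, 1), (1, 2), (4, 5)])
print(panoramas)
```

Each returned set with more than one image is a candidate panorama for the stitching and blending step.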

Page 64: Reconnaissance d’objets et vision  artificielle

3. Stitch and blend the panoramas

Page 65: Reconnaissance d’objets et vision  artificielle

Results

Page 66: Reconnaissance d’objets et vision  artificielle

Issues in alignment-based applications

Choosing the geometric alignment model

• Tradeoff between “correctness” and robustness (also, efficiency)

Choosing the descriptor

• “Rich” imagery (natural images): high-dimensional patch-based descriptors (e.g., SIFT)

• “Impoverished” imagery (e.g., star fields): need to create invariant geometric descriptors from k-tuples of point-based features

Strategy for finding putative matches

• Small number of images, one-time computation (e.g., panorama stitching): brute force search

• Large database of model images, frequent queries: indexing or hashing

• Heuristics for feature-space pruning of putative matches

Page 67: Reconnaissance d’objets et vision  artificielle

Issues in alignment-based applications

Choosing the geometric alignment model

Choosing the descriptor

Strategy for finding putative matches

Hypothesis generation strategy

• Relatively large inlier ratio: RANSAC

• Small inlier ratio: locality constraints, Hough transform

Hypothesis verification strategy

• Size of consensus set, residual tolerance depend on inlier ratio and expected accuracy of the model

• Possible refinement of geometric model

• Dense verification

Page 68: Reconnaissance d’objets et vision  artificielle

Affine Patches for 3D Alignment

Repeatability, covariance, invariance

Tell & Carlsson (2000); Kadir & Brady (2001); Matas et al. (2001); Tuytelaars & Van Gool (2002)

Page 69: Reconnaissance d’objets et vision  artificielle

Modeling and recognizing 3D rigid solids

Idea:

• The (smooth) surface of a solid is never globally planar,

• but it is always locally planar

Rothganger et al. (CVPR’03)

Tomasi & Kanade (1992)

$$S = M \times N$$

$$S \rightarrow M, N \qquad E \leftarrow |S - MN|$$

Duda & Hart (1972); Weiss (1987); Burns et al. (1992); Mundy et al. (1992, 1994); Rothwell et al. (1992)

Ayache & Faugeras (1982); Hebert & Faugeras (1983); Gaston et al. (1984); Huttenlocher & Ullman (1987)

Johnson & Hebert (1998); Lowe (1999)

Page 70: Reconnaissance d’objets et vision  artificielle

20 images

Page 71: Reconnaissance d’objets et vision  artificielle
Page 72: Reconnaissance d’objets et vision  artificielle

Dataset: 51 test images with 1 to 5 of the 8 objects present in each image.

Page 73: Reconnaissance d’objets et vision  artificielle
Page 74: Reconnaissance d’objets et vision  artificielle
Page 75: Reconnaissance d’objets et vision  artificielle

Some successes

The four failures

