Descriptive Geometry Meets Computer Vision

The Geometry of Two Images (# 82)

Hellmuth Stachel

stachel@dmg.tuwien.ac.at — http://www.geometrie.tuwien.ac.at/stachel

12th International Conference on Geometry and Graphics, August 6–10, 2006, Salvador/Brazil

Table of contents

1. Remarks on linear images

2. Geometry of two images

3. Numerical reconstruction of two images

1. Remarks on linear images

[Figure: a linear image compared with a nonlinear (curved) image]

Central projection

The central projection (according to A. Dürer) can be generalized by a central axonometry.

Central axonometric principle

in space $\mathbb{E}^3$: Cartesian basis $(O; E_1, E_2, E_3)$ and points at infinity $U_1, U_2, U_3$

in the image plane $\mathbb{E}^2$: central axonometric reference system $(O^c; E_1^c, E_2^c, E_3^c, U_1^c, U_2^c, U_3^c)$

Definition of linear images

There is a unique collinear transformation

$$\kappa : \mathbb{E}^3 \to \mathbb{E}^2 \quad\text{with}\quad O \mapsto O^c,\ E_i \mapsto E_i^c,\ U_i \mapsto U_i^c,\quad i = 1, 2, 3.$$

Any two-dimensional image of $\mathbb{E}^3$ under a collinear transformation is called linear.

This means:

• collinear points have collinear or coincident images,

• cross-ratios of any four collinear points are preserved.

Central projection in coordinates

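Notation:

Z . . . center

H . . . principal point

d . . . focal length

$x_1, x_2, x_3$ . . . camera frame

$x'_1, x'_2$ . . . image coordinate frame

[Figure: central projection with center Z, image plane and vanishing plane]

In the camera frame the central projection reads $x'_1 = d\,x_1/x_3$, $x'_2 = d\,x_2/x_3$, or in homogeneous form

$$\begin{pmatrix}\xi'_0\\ \xi'_1\\ \xi'_2\end{pmatrix} = \begin{pmatrix}0&0&0&1\\ 0&d&0&0\\ 0&0&d&0\end{pmatrix}\begin{pmatrix}\xi_0\\ \vdots\\ \xi_3\end{pmatrix}.$$

Transforming from the camera frame $(x_1, x_2, x_3)$ to arbitrary world coordinates, and from the particular image frame $(x'_1, x'_2)$ to an arbitrary image frame, gives in homogeneous form

$$\begin{pmatrix}\xi'_0\\ \xi'_1\\ \xi'_2\end{pmatrix} = \underbrace{\begin{pmatrix}1&0&0\\ h'_1&d f_1&0\\ h'_2&0&d f_2\end{pmatrix}\begin{pmatrix}0&0&0&1\\ 0&1&0&0\\ 0&0&1&0\end{pmatrix}\begin{pmatrix}1&0\ 0\ 0\\ o&R\end{pmatrix}}_{\text{matrix } A}\begin{pmatrix}\xi_0\\ \vdots\\ \xi_3\end{pmatrix}$$

with $o = (o_1, o_2, o_3)^T$ and a $(3\times 3)$-matrix $R$.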

Left-hand matrix: $(h'_1, h'_2)$ are the image coordinates of the principal point H, $(f_1, f_2)$ are possible scaling factors, and d is the focal length.

These parameters are called the intrinsic calibration parameters.

Right-hand matrix: R is an orthogonal matrix.

The position of the camera frame with respect to the world coordinates defines the extrinsic calibration parameters.

Photos with known interior orientation are called calibrated images; others (like central axonometries) are uncalibrated.
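The composition of intrinsic and extrinsic parameters into one matrix A can be sketched numerically. The following is a minimal sketch, not the author's implementation: it assumes that the right-hand matrix has the block form with first row (1, 0, 0, 0) and a translation column o next to R, so that camera coordinates are o + R·(world coordinates), and that homogeneous coordinates carry the index-0 component first. The function names and all numeric values are hypothetical.

```python
import numpy as np

def camera_matrix(d, h, f, R, o):
    """Compose the 3x4 matrix A of a central projection.

    d: focal length, h = (h1, h2): principal point, f = (f1, f2): scaling
    factors (intrinsic parameters); R: orthogonal 3x3 matrix, o: 3-vector
    (extrinsic parameters, assumed form: camera = o + R @ world).
    """
    left = np.array([[1.0, 0.0, 0.0],
                     [h[0], d * f[0], 0.0],
                     [h[1], 0.0, d * f[1]]])
    middle = np.array([[0.0, 0.0, 0.0, 1.0],
                       [0.0, 1.0, 0.0, 0.0],
                       [0.0, 0.0, 1.0, 0.0]])
    right = np.eye(4)
    right[1:, 0] = o
    right[1:, 1:] = R
    return left @ middle @ right

def project(A, point):
    """Image coordinates (x1', x2') of a world point (x1, x2, x3)."""
    xi = A @ np.concatenate(([1.0], point))   # homogeneous (xi0', xi1', xi2')
    return xi[1:] / xi[0]

# Hypothetical example: the camera frame coincides with the world frame.
A = camera_matrix(d=2.0, h=(3.0, 4.0), f=(1.0, 1.0), R=np.eye(3), o=np.zeros(3))
image_point = project(A, np.array([1.0, 2.0, 4.0]))
```

With these values the point (1, 2, 4) lies at depth 4, so the image point is the principal point shifted by d·(1, 2)/4.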

Unknown interior calibration parameters

[Figure: collinear bundle transformation]

The two bundles of the rays of sight are related by a collinear bundle transformation.

2. Geometry of two images

Given: Two linear images or two photographs.

Wanted: Dimensions of the depicted 3D-object.

Historical ‘Stadtbahn’ station Karlsplatz in Vienna (Otto Wagner, 1897)

The geometry of two images is a classical subject of Descriptive Geometry. Its results have become standard (Finsterwalder, Kruppa, Krames, Wunderlich, Hohenberg, Tschupik, Brauner, Havlicek, H.S., . . . ).

Why now? Advantages of digital images:

• less distortion, because no paper prints are needed,

• exact boundary is available, and

• precise coordinate measurements are possible using standard software.

Computer Vision

Why now?

The geometry of two images is important for Computer Vision, a field whose main goal is to endow a computer with a sense of vision.

Basic problems:

• Which information can be extracted from digital images?

• How should this information be preprocessed and represented?

Applications: sensor-guided robots, automatic vehicle control, ‘Big Brother’, . . .

Recent textbooks:

Yi Ma, St. Soatto, J. Kosecka, S. S. Sastry: An Invitation to 3-D Vision. Springer-Verlag, New York 2004

R. Hartley, A. Zisserman: Multiple View Geometry in Computer Vision. Cambridge University Press 2000

Fortunately, the authors of the cited book refer to some of these standard results (Krames, Kruppa, Wunderlich).

Geometry of two images (epipolar geometry)

[Figure: the viewing situation (centers $Z_1$, $Z_2$, baseline z, space point X, epipolar plane $\delta_X$) and the two images $\pi'$, $\pi''$ (views $X'$, $X''$, epipolar lines $l'$, $l''$, epipoles $Z'_2$, $Z''_1$), related by collinear transformations]

Notations:

line $z = Z_1Z_2$ . . . baseline,

$Z'_2, Z''_1$ . . . epipoles (German: Kernpunkte),

$\delta_X$ . . . epipolar plane (it is twice projecting),

$l', l''$ . . . pair of epipolar lines (German: Kernstrahlen),

$(X', X'')$ . . . corresponding views.

Epipolar constraint

Theorem (synthetic version): For any two linear images of a scene, there is a projectivity between two line pencils

$$Z'_2(\delta'_X) \;\barwedge\; Z''_1(\delta''_X)$$

such that the points $X'$, $X''$ are corresponding $\iff$ they are located on corresponding epipolar lines.

Theorem (analytic version): Using homogeneous coordinates for both images, there is a bilinear form $\beta$ of rank 2 such that two points $X' = x'\mathbb{R} = (\xi'_0 : \xi'_1 : \xi'_2)$ and $X'' = x''\mathbb{R} = (\xi''_0 : \xi''_1 : \xi''_2)$ are corresponding

$$\iff\ \beta(x', x'') = \sum_{i,j=0}^{2} b_{ij}\,\xi'_i\,\xi''_j = (\xi'_0\ \xi'_1\ \xi'_2)\cdot(b_{ij})\cdot\begin{pmatrix}\xi''_0\\ \xi''_1\\ \xi''_2\end{pmatrix} = x'^T\cdot B\cdot x'' = 0.$$

Epipolar constraint

Proof (analytic version): Using homogeneous line coordinates, the projectivity between the line pencils can be expressed as

$$\beta:\ (u'_1\lambda_1 + u'_2\lambda_2)\mathbb{R} \mapsto (u''_1\lambda_1 + u''_2\lambda_2)\mathbb{R} \quad\text{for all } (\lambda_1, \lambda_2) \in \mathbb{R}^2\setminus\{(0, 0)\}.$$

$x'$ and $x''$ are corresponding $\iff$ there is a nontrivial pair $(\lambda_1, \lambda_2)$ such that

$$(u'_1\lambda_1 + u'_2\lambda_2)\cdot x' = 0,\qquad (u''_1\lambda_1 + u''_2\lambda_2)\cdot x'' = 0.$$

These two linear homogeneous equations in the unknowns $(\lambda_1, \lambda_2)$ have a nontrivial solution $\iff$ the determinant vanishes, i.e.,

$$\beta(x', x'') := (u'_1\cdot x')(u''_2\cdot x'') - (u'_2\cdot x')(u''_1\cdot x'') = \sum_{i,j=0}^{2} b_{ij}\,\xi'_i\,\xi''_j = 0.$$

There are singular points of this correspondence: $Z'_2$ corresponds to all $X''$, and vice versa all points $X'$ correspond to $Z''_1$ $\implies$ $\mathrm{rk}(b_{ij}) = 2$.
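The determinant form of the proof can be checked numerically. The sketch below builds the matrix of the bilinear form from two base lines per pencil, which gives $B = u'_1 u''^T_2 - u'_2 u''^T_1$, and verifies the rank and the singular points. The line coordinates are hypothetical values chosen only for illustration.

```python
import numpy as np

def bilinear_form_matrix(u1p, u2p, u1pp, u2pp):
    """B with x'^T B x'' = (u1'.x')(u2''.x'') - (u2'.x')(u1''.x'')."""
    return np.outer(u1p, u2pp) - np.outer(u2p, u1pp)

# Hypothetical base lines of the two pencils (homogeneous line coordinates).
u1p, u2p = np.array([1.0, 2.0, 0.0]), np.array([0.0, 1.0, -1.0])
u1pp, u2pp = np.array([2.0, 0.0, 1.0]), np.array([1.0, 1.0, 1.0])
B = bilinear_form_matrix(u1p, u2p, u1pp, u2pp)

# The carriers of the pencils (the epipoles) are the intersection points
# of the base lines.
Z2p = np.cross(u1p, u2p)
Z1pp = np.cross(u1pp, u2pp)

# A corresponding pair: the same (lambda1, lambda2) selects one line in
# each pencil; take one point on each of these two lines.
lam = np.array([3.0, -2.0])
line_p = lam[0] * u1p + lam[1] * u2p
line_pp = lam[0] * u1pp + lam[1] * u2pp
x_p = np.cross(line_p, np.array([1.0, 0.0, 0.0]))    # a point on line_p
x_pp = np.cross(line_pp, np.array([0.0, 1.0, 0.0]))  # a point on line_pp

beta = x_p @ B @ x_pp
rank = np.linalg.matrix_rank(B)
```

Here `beta` vanishes for the corresponding pair, `Z2p @ B` and `B @ Z1pp` are zero vectors, and the rank of B is 2.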

Epipolar constraint in the calibrated case

Theorem: In the calibrated case the essential matrix $B = (b_{ij})$ is the product of a skew-symmetric matrix and an orthogonal one, i.e.,

$$B = S\cdot R.$$

[Figure: the centers $Z_1$, $Z_2$ with the vector $z'$, the space point X with its epipolar plane $\delta_X$, and the vectors $x'$, $x''$]

Proof: We use both camera frames and the homogeneous coordinates

$$x' = \overrightarrow{Z_1X'},\qquad x'' = \overrightarrow{Z_2X''}.$$

For transforming the coordinates from the second camera frame into the first one, there is an orthogonal matrix R such that

$$x''_1 = z' + R\cdot x'' \quad\text{with } R^T = R^{-1} \text{ and } z' = (z'_1, z'_2, z'_3)^T = \overrightarrow{Z_1Z_2}.$$

The points $X_1, X_2, Z_1, Z_2$ are coplanar $\iff$ the triple product of the vectors $x'$, $z'$ and $x''_1 = \overrightarrow{Z_1X_2}$ vanishes, i.e.,

$$\det(x', z', x''_1) = x'\cdot(z'\times x''_1) = 0.$$

We replace the vector product $(z'\times x''_1)$ by

$$z'\times(z' + R\cdot x'') = z'\times R\cdot x'' = S\cdot R\cdot x'' \quad\text{with}\quad S = \begin{pmatrix}0&-z'_3&z'_2\\ z'_3&0&-z'_1\\ -z'_2&z'_1&0\end{pmatrix}.$$

Matrix S is skew-symmetric and R is orthogonal.

Hence, the coplanarity of $x'$, $x''$ and $z'$ is equivalent to

$$0 = x'\cdot(z'\times x''_1) = x'^T\cdot\underbrace{S\cdot R}_{B}\cdot x'', \quad\text{thus } B = S\cdot R.$$

The decomposition of the essential matrix B into these two factors defines the position of the second camera frame relative to the first one!
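A small numerical check of the coplanarity condition, as a sketch under assumed data: the vector z′, the rotation R and the space point X are hypothetical, and `skew` and `rotation_z` are illustrative helper names.

```python
import numpy as np

def skew(z):
    """Skew-symmetric matrix S with S @ x == np.cross(z, x)."""
    return np.array([[0.0, -z[2], z[1]],
                     [z[2], 0.0, -z[0]],
                     [-z[1], z[0], 0.0]])

def rotation_z(angle):
    """Rotation about the third coordinate axis."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

# Hypothetical relative position of the second camera frame.
z = np.array([1.0, 0.2, -0.1])   # vector Z1 -> Z2 in the first frame
R = rotation_z(0.3)              # maps frame-2 coordinates into frame 1
B = skew(z) @ R

# A space point X, given in the first camera frame.
X = np.array([0.5, -0.4, 3.0])
x1 = 0.7 * X                     # any multiple of Z1X serves as x'
x2 = 1.3 * (R.T @ (X - z))       # any multiple of Z2X, in frame-2 coordinates

residual = x1 @ B @ x2           # epipolar constraint x'^T B x''
```

The scale factors 0.7 and 1.3 only illustrate that the constraint is homogeneous in both arguments.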

Essential matrix

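Theorem: The essential matrix B has two equal singular values $\sigma := \sigma_1 = \sigma_2$.

Proof: We have $B = S\cdot R$ with orthogonal R. The vector

$$S\cdot x = z'\times x$$

is orthogonal to the orthogonal view $x^n$ of $x$ in the plane $\Pi \perp z'$, where

$$\|z'\times x\| = |\sin\varphi|\,\|x\|\,\|z'\| = \|x^n\|\,\|z'\| = \sigma\,\|x^n\|.$$

Hence S maps x to its orthogonal view $x^n$, rotated through 90° about $z'$ and scaled by $\sigma = \|z'\|$, so both nonzero singular values of S, and therefore of $B = S\cdot R$, equal $\sigma$.

[Figure: the vectors $z'$, $x$ and $z'\times x$ with the plane $\Pi \perp z'$]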
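The statement on the singular values can be illustrated with a short computation. This is a sketch with hypothetical data: the vector z′ is chosen with norm 3, and the orthogonal factor is taken from a QR factorization of a random matrix merely to obtain some orthogonal matrix.

```python
import numpy as np

z = np.array([2.0, -1.0, 2.0])   # hypothetical vector z' with norm 3
S = np.array([[0.0, -z[2], z[1]],
              [z[2], 0.0, -z[0]],
              [-z[1], z[0], 0.0]])

# Some orthogonal matrix R (from the QR factorization of a random matrix).
R, _ = np.linalg.qr(np.random.default_rng(0).normal(size=(3, 3)))

B = S @ R
sigma = np.linalg.svd(B, compute_uv=False)   # sorted in descending order
```

Up to rounding, `sigma` contains the norm of z′ twice, followed by zero.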

Singular value decomposition

Theorem [Singular value decomposition]:

Any matrix $A \in M(m, n; \mathbb{R})$ can be decomposed into a product

$$A = U\cdot D\cdot V^T \quad\text{with orthogonal } U, V \text{ and } D = \mathrm{diag}(\sigma_1, \dots, \sigma_p)$$

with $D \in M(m, n; \mathbb{R})$, $\sigma_i \ge 0$, and $p = \min\{m, n\}$.

The positive entries in the main diagonal of D are called the singular values of A.

The singular values of A can be seen as principal distortion factors of the affine transformation represented by A, i.e., the semiaxes of the affine image of the unit sphere.

Hence the singular values of an orthogonal projection are (1, 1).

[Figure: the affine transformation $\alpha: x \mapsto A\cdot x$ with $A = U\cdot D\cdot V^T$, acting on a frame $(a_0, a_1, a_2)$ as a rotation $V^T$, followed by the scaling $D$, followed by a rotation $U$]
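The decomposition can be reproduced with a standard numerical routine. The sketch below uses NumPy's `numpy.linalg.svd`; the matrix A is a hypothetical example, and the orthogonal projection is written in orthonormal coordinates of its image plane.

```python
import numpy as np

A = np.array([[2.0, 0.0, 1.0],
              [0.0, 1.0, 1.0]])        # hypothetical (2 x 3)-matrix

U, s, Vt = np.linalg.svd(A)            # s holds sigma_1 >= sigma_2 >= 0
D = np.zeros(A.shape)                  # D has the same shape as A
D[:len(s), :len(s)] = np.diag(s)
reconstructed = U @ D @ Vt             # equals A up to rounding

# Orthogonal projection of space onto a plane, written in orthonormal
# coordinates of that plane: its rows are orthonormal.
P = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0]])
s_proj = np.linalg.svd(P, compute_uv=False)
```

`s_proj` contains the two singular values of the orthogonal projection, which are both equal to 1.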

What does ‘reconstruction’ mean?

Given: Two images, either calibrated or uncalibrated.

[Figure: the two images $\pi'$, $\pi''$ with the corresponding views $(X'_1, X''_1)$ and $(X'_2, X''_2)$]

Wanted: The ‘viewing situation’, i.e., determine

• the relative position of the two camera frames, and

• the location of any space point X from its images $(X', X'')$.

[Figure: the viewing situation with the baseline z]

First fundamental theorem

Theorem: From two uncalibrated images with given projectivity between epipolar lines, the depicted object can be reconstructed up to a collinear transformation.

Sketch of the proof: The two images can be placed in space such that pairs of epipolar lines intersect. Then for arbitrary $Z_1, Z_2$ on the baseline $z = Z_1^2 Z_2^1$ there is a reconstructed 3D object.

Any other choice of the viewing situation gives a collinear transform of the 3D object.

Second fundamental theorem

Theorem (S. Finsterwalder, 1899): From two calibrated images with given projectivity between epipolar lines, the depicted object can be reconstructed up to a similarity.

Sketch of the proof: Now in the two bundles of rays the pencils of epipolar planes $\delta_X$ are congruent, and they can be made coincident by a rigid motion. Then, relative to the first bundle $Z_1$, for any $Z_2 \in z$ there is a reconstructed 3D object.

Any other choice of $Z_2$ gives a similar 3D object.

Determination of epipoles — geometric meaning

Problem of Projectivity:

Given: 7 pairs of corresponding points $(X'_1, X''_1), \dots, (X'_7, X''_7)$.

Wanted: A pair of points $(S', S'')$ (= epipoles) such that there is a projectivity

$$S'([S'X'_1], \dots, [S'X'_7]) \;\barwedge\; S''([S''X''_1], \dots, [S''X''_7]).$$

[Figure: the images $\pi'$ and $\pi''$ with the points $X'_1, \dots, X'_7$ and $X''_1, \dots, X''_7$ and the line pencils through $S'$ and $S''$]

Determination of epipoles — analytic solution

Theorem: If 7 pairs of corresponding points $(X'_1, X''_1), \dots, (X'_7, X''_7)$ are given, the determination of the epipoles is a cubic problem.

Proof: 7 pairs of corresponding points give 7 linear homogeneous equations

$$\beta(x'_i, x''_i) = x'^T_i\cdot B\cdot x''_i = 0,\quad i = 1, \dots, 7,$$

for the 9 entries in the $(3\times 3)$-matrix $B = (b_{ij})$, called the essential matrix.

$\det(b_{ij}) = 0$ gives an additional cubic equation which fixes all $b_{ij}$ up to a common factor.
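The cubic problem can be sketched numerically as follows. This is one possible formulation, not necessarily the author's procedure: the 7 linear equations leave a two-dimensional solution space spanned by matrices $B_1$, $B_2$; the cubic $\det(B_1 + t\,B_2) = 0$ is recovered by interpolation from four samples and solved for its real roots. The function name `seven_point`, the parametrization by t (which misses the special solution $B_2$ itself) and the random test points are assumptions made for illustration.

```python
import numpy as np

def seven_point(xp, xpp):
    """Matrices B (up to scale) with x'_i^T B x''_i = 0 and det B = 0.

    xp, xpp: arrays of shape (7, 3) with homogeneous image coordinates.
    """
    # Row i holds the coefficients of the 9 unknowns b_jk in x'_i^T B x''_i.
    A = np.einsum('ij,ik->ijk', xp, xpp).reshape(len(xp), 9)
    # The last two right singular vectors span the solution space of A b = 0.
    _, _, Vt = np.linalg.svd(A)
    B1, B2 = Vt[-1].reshape(3, 3), Vt[-2].reshape(3, 3)
    # det(B1 + t*B2) is a cubic polynomial in t; four samples determine it.
    ts = np.array([-1.0, 0.0, 1.0, 2.0])
    dets = [np.linalg.det(B1 + t * B2) for t in ts]
    coeffs = np.polyfit(ts, dets, 3)
    roots = np.roots(coeffs)
    real_roots = roots[np.abs(roots.imag) < 1e-8].real
    return [B1 + t * B2 for t in real_roots]

# Hypothetical input: 7 pairs of points in general position.
rng = np.random.default_rng(1)
xp = rng.normal(size=(7, 3))
xpp = rng.normal(size=(7, 3))
candidates = seven_point(xp, xpp)
```

A real cubic has one or three real roots, so in general there are one or three candidate matrices; additional information is needed to select among them.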

For noisy image points it is recommended to use more than 7 points and methods of least-squares approximation for obtaining the ‘best fitting matrix’ B:

1) Let A denote the coefficient matrix in the linear system for the entries of B. Then the ‘least-squares fit’ for this overdetermined system is an eigenvector for the smallest eigenvalue of the symmetric matrix $A^T\cdot A$.

2) As an essential matrix needs to have rank 2, we use the ‘projection into the essential space’. This means, the singular value decomposition of B gives a representation

$$B = U\cdot\mathrm{diag}(\sigma_1, \sigma_2, \sigma_3)\cdot V^T \quad\text{with orthogonal } U, V \text{ and } \sigma_1 \ge \sigma_2 \ge \sigma_3.$$

Then in the uncalibrated case $B = U\cdot\mathrm{diag}(\sigma_1, \sigma_2, 0)\cdot V^T$ is optimal (with respect to the Frobenius norm), and in the calibrated case

$$B = U\cdot\mathrm{diag}(\sigma, \sigma, 0)\cdot V^T \quad\text{with } \sigma = (\sigma_1 + \sigma_2)/2.$$
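The two steps above can be sketched as follows. The function names and the synthetic calibrated scene (vector z′, rotation R, space points, noise level) are hypothetical; the image points are scaled to third coordinate 1 only to keep the linear system well conditioned.

```python
import numpy as np

def fit_bilinear_form(xp, xpp):
    """Least-squares B (unit Frobenius norm) from n >= 8 point pairs."""
    A = np.einsum('ij,ik->ijk', xp, xpp).reshape(len(xp), 9)
    # eigh returns the eigenvalues of the symmetric matrix in ascending
    # order, so column 0 belongs to the smallest eigenvalue of A^T A.
    _, vecs = np.linalg.eigh(A.T @ A)
    return vecs[:, 0].reshape(3, 3)

def project_to_essential_space(B, calibrated):
    """Closest rank-2 matrix; in the calibrated case with equal sigmas."""
    U, s, Vt = np.linalg.svd(B)
    if calibrated:
        sigma = (s[0] + s[1]) / 2.0
        s_new = np.array([sigma, sigma, 0.0])
    else:
        s_new = np.array([s[0], s[1], 0.0])
    return U @ np.diag(s_new) @ Vt

# Hypothetical calibrated scene: x''_1 = z + R x''.
rng = np.random.default_rng(2)
z = np.array([1.0, 0.1, 0.2])
c, sn = np.cos(0.2), np.sin(0.2)
R = np.array([[c, 0.0, sn], [0.0, 1.0, 0.0], [-sn, 0.0, c]])
X = rng.uniform([-1.0, -1.0, 4.0], [1.0, 1.0, 8.0], size=(12, 3))
Y = (X - z) @ R                       # rows are R^T (X - z)
xp = X / X[:, 2:]                     # views in the first camera frame
xpp_clean = Y / Y[:, 2:]              # exact views in the second frame
xpp = xpp_clean + 1e-3 * rng.normal(size=Y.shape)   # noisy measurements

B_fit = fit_bilinear_form(xp, xpp)
B_ess = project_to_essential_space(B_fit, calibrated=True)
```

`B_fit` generally has full rank because of the noise; `B_ess` has the singular values $(\sigma, \sigma, 0)$ required in the calibrated case.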

3. Numerical reconstruction of two images

Step 1: Specify at least 7 reference points

[Figure: the two photographs with the numbered reference points 1 to 20]

Step 2: Compute the essential matrix B, including the pairs of epipolar lines

[Figure: the two photographs with the reference points and the pairs of epipolar lines]

Step 3: Factorize B = S·R

Theorem: There are exactly two ways of decomposing $B = U\cdot D\cdot V^T$ with $D = \mathrm{diag}(\sigma, \sigma, 0)$ into a product $S\cdot R$ with skew-symmetric S and orthogonal R:

$$S = \pm U\cdot R_+\cdot D\cdot U^T \quad\text{and}\quad R = \pm U\cdot R_+^T\cdot V^T \quad\text{with}\quad R_+ = \begin{pmatrix}0&-1&0\\ 1&0&0\\ 0&0&1\end{pmatrix}.$$

Proof:

a) It is sufficient to factorize $U\cdot D = S\cdot R'$, which implies $B = S\cdot(R'\cdot V^T)$, i.e., $R = R'\cdot V^T$.

b) D represents the product of the orthogonal projection into the $x_1x_2$-plane and the scaling with factor $\sigma$. The rotation U transforms the $x_1x_2$-plane into the image plane of $U\cdot D$.

c) Any skew-symmetric matrix S represents the product of an orthogonal projection parallel to $z'$, a 90°-rotation about $z'$ and a scaling with factor $\|z'\|$.

d) $R_+\cdot D$ is skew-symmetric with $z' = (0, 0, \sigma)$. We transform it by U to obtain the required position, i.e., $S = \pm U\cdot(R_+\cdot D)\cdot U^T$.

$R_+$ commutes with D, hence

$$U\cdot D = \underbrace{[\pm U\cdot R_+\cdot D\cdot U^T]}_{S}\cdot\underbrace{[\pm U\cdot R_+^T]}_{R'}.$$

e) B represents an orthogonal axonometry; its column vectors are images of an orthonormal frame. We know from Descriptive Geometry that, apart from translations, there are not more than two different frames with given images.
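The factorization stated in the theorem can be verified numerically. The sketch below implements the two sign choices exactly as written in the theorem, assuming $R_+$ is the 90° rotation about the third axis; it does not enforce any sign of det R. The test matrix B is built from a hypothetical vector z′ and rotation.

```python
import numpy as np

R_PLUS = np.array([[0.0, -1.0, 0.0],
                   [1.0, 0.0, 0.0],
                   [0.0, 0.0, 1.0]])

def factorize(B):
    """Pairs (S, R) with S @ R == B for B with singular values (s, s, 0)."""
    U, s, Vt = np.linalg.svd(B)
    sigma = (s[0] + s[1]) / 2.0
    D = np.diag([sigma, sigma, 0.0])
    S = U @ R_PLUS @ D @ U.T      # skew-symmetric factor
    R = U @ R_PLUS.T @ Vt         # orthogonal factor
    return [(S, R), (-S, -R)]

# Hypothetical essential matrix B = S_true @ R_true.
z = np.array([0.3, -1.0, 0.5])
c, sn = np.cos(0.4), np.sin(0.4)
R_true = np.array([[1.0, 0.0, 0.0], [0.0, c, -sn], [0.0, sn, c]])
S_true = np.array([[0.0, -z[2], z[1]],
                   [z[2], 0.0, -z[0]],
                   [-z[1], z[0], 0.0]])
B = S_true @ R_true

pairs = factorize(B)
S0 = pairs[0][0]
z_rec = np.array([S0[2, 1], S0[0, 2], S0[1, 0]])   # vector z' encoded in S0
```

The recovered vector `z_rec` agrees with the original z′ up to sign, which reflects the fact that B determines the direction of the baseline but not its orientation.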

Summary of algorithm

1) Specify n > 7 pairs $(X'_i, X''_i)$, i = 1, . . . , n.

2) Set up the linear system of equations for the essential matrix B and seek the best fitting matrix (eigenvector of the smallest eigenvalue).

3) Compute the closest rank 2 matrix B with two equal singular values.

4) Factorize B = S·R; this reveals the relative position of the two camera frames.

5) In one of the frames compute the approximate point of intersection between corresponding rays.

6) Transform the recovered coordinates into world coordinates.
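Step 5 of the algorithm above can be sketched as follows. The source does not say how the approximate point of intersection is defined; this sketch assumes the midpoint of the common perpendicular of the two rays, computed in the first camera frame with the relation $x''_1 = z' + R\cdot x''$. The function name and the test data are hypothetical.

```python
import numpy as np

def triangulate_midpoint(x1, x2, R, z):
    """Midpoint of the common perpendicular of two corresponding rays.

    Ray 1: a * x1 through Z1 (origin of the first camera frame).
    Ray 2: z + b * (R @ x2) through Z2, expressed in the first frame.
    """
    d1, d2 = x1, R @ x2
    # Least-squares solution of a*d1 - b*d2 = z gives the pair of
    # closest points on the two rays.
    M = np.column_stack((d1, -d2))
    (a, b), *_ = np.linalg.lstsq(M, z, rcond=None)
    return 0.5 * (a * d1 + (z + b * d2))

# Hypothetical viewing situation and space point.
z = np.array([1.0, 0.2, -0.1])
c, sn = np.cos(0.3), np.sin(0.3)
R = np.array([[c, -sn, 0.0], [sn, c, 0.0], [0.0, 0.0, 1.0]])
X = np.array([0.5, -0.4, 3.0])

x1 = X / X[2]                  # view in the first camera frame
x2 = R.T @ (X - z)             # view in the second camera frame
X_rec = triangulate_midpoint(x1, x2, R, z)
```

For exact data the two rays intersect and `X_rec` reproduces X; for noisy data the rays are skew and the midpoint serves as the approximation.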

Remaining problems

• Analysis of precision,

• automated calibration (autofocus and zooming change the focal distance d),

• critical configurations.

The solution

[Figure: original image with the numbered reference points]

[Figure: the reconstruction (M ∼ 1 : 100)]

Position of centers relative to the depicted object

[Figure: front view and top view]

References

• H. Brauner: Lineare Abbildungen aus euklidischen Räumen. Beitr. Algebra Geom. 21, 5–26 (1986).

• O. Faugeras: Three-Dimensional Computer Vision. A Geometric Viewpoint. MIT Press, Cambridge, Mass., 1993.

• O. Faugeras, Q.-T. Luong: The Geometry of Multiple Images. MIT Press, Cambridge, Mass., 2001.

• R. Hartley, A. Zisserman: Multiple View Geometry in Computer Vision. Cambridge University Press 2000.

• H. Havlicek: On the Matrices of Central Linear Mappings. Math. Bohem. 121, 151–156 (1996).

• E. Kruppa: Zur achsonometrischen Methode der darstellenden Geometrie. Sitzungsber., Abt. II, österr. Akad. Wiss., Math.-Naturw. Kl. 119, 487–506 (1910).

• Yi Ma, St. Soatto, J. Kosecka, S. Sh. Sastry: An Invitation to 3-D Vision. Springer-Verlag, New York 2004.

• H. Stachel: Zur Kennzeichnung der Zentralprojektionen nach H. Havlicek. Sitzungsber., Abt. II, österr. Akad. Wiss., Math.-Naturw. Kl. 204, 33–46 (1995).

• J. Szabó, H. Stachel, H. Vogel: Ein Satz über die Zentralaxonometrie. Sitzungsber., Abt. II, österr. Akad. Wiss., Math.-Naturw. Kl. 203, 3–11 (1994).

• J. Tschupik, F. Hohenberg: Die geometrischen Grundlagen der Photogrammetrie. In Jordan, Eggert, Kneissl (eds.): Handbuch der Vermessungskunde III a/3. 10. Aufl., Metzlersche Verlagsbuchhandlung, Stuttgart 1972, 2235–2295.
