SFM under orthographic projection

Date post: 12-Jan-2016
Upload: patia
Transcript
Page 1: SFM under orthographic projection

SFM under orthographic projection

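$\mathbf{q} = \Pi\,\mathbf{p} + \mathbf{t}$, with sizes $(2\times 1) = (2\times 3)(3\times 1) + (2\times 1)$

– $\mathbf{q}$: 2D image point
– $\Pi$: orthographic projection matrix
– $\mathbf{p}$: 3D scene point
– $\mathbf{t}$: image offset

• Trick
  • Choose scene origin to be centroid of 3D points
  • Choose image origins to be centroid of 2D points
  • Allows us to drop the camera translation: $\mathbf{q} = \Pi\,\mathbf{p}$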

Page 2: SFM under orthographic projection

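factorization (Tomasi & Kanade)

projection of n features in one image:

$$\underbrace{\begin{bmatrix}\mathbf{q}_1 & \mathbf{q}_2 & \cdots & \mathbf{q}_n\end{bmatrix}}_{2\times n} = \underbrace{\Pi}_{2\times 3}\;\underbrace{\begin{bmatrix}\mathbf{p}_1 & \mathbf{p}_2 & \cdots & \mathbf{p}_n\end{bmatrix}}_{3\times n}$$

projection of n features in m images:

$$\underbrace{\begin{bmatrix}\mathbf{q}_{11} & \mathbf{q}_{12} & \cdots & \mathbf{q}_{1n}\\ \mathbf{q}_{21} & \mathbf{q}_{22} & \cdots & \mathbf{q}_{2n}\\ \vdots & \vdots & \ddots & \vdots\\ \mathbf{q}_{m1} & \mathbf{q}_{m2} & \cdots & \mathbf{q}_{mn}\end{bmatrix}}_{2m\times n} = \underbrace{\begin{bmatrix}\Pi_1\\ \Pi_2\\ \vdots\\ \Pi_m\end{bmatrix}}_{2m\times 3}\;\underbrace{\begin{bmatrix}\mathbf{p}_1 & \mathbf{p}_2 & \cdots & \mathbf{p}_n\end{bmatrix}}_{3\times n}$$

W (measurement) = M (motion) S (shape)

Key Observation: rank(W) <= 3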

Page 3: SFM under orthographic projection

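Factorization

$$\underset{2m\times n}{W} = \underset{2m\times 3}{M}\;\underset{3\times n}{S}$$

(W is known; solve for M and S)

• Factorization Technique
  – W is at most rank 3 (assuming no noise)
  – We can use singular value decomposition to factor W: $\underset{2m\times n}{W} = \underset{2m\times 3}{M'}\;\underset{3\times n}{S'}$
  – S’ differs from S by a linear transformation A: $W = M'S' = (MA^{-1})(AS)$
  – Solve for A by enforcing metric constraints on M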

Page 4: SFM under orthographic projection

Metric constraints

• Orthographic Camera
  • Rows of $\Pi_i$ are orthonormal: $\Pi_i \Pi_i^T = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$
• Enforcing “Metric” Constraints
  • Compute A such that rows of M have these properties: $M = M'A$

Trick (not in original Tomasi/Kanade paper, but in followup work)

• Constraints are linear in $AA^T$:
  $\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} = \Pi_i \Pi_i^T = \Pi'_i A A^T {\Pi'_i}^T = \Pi'_i G\, {\Pi'_i}^T$ where $G = AA^T$
• Solve for G first by writing equations for every $\Pi_i$ in M
• Then factor $G = AA^T$ by SVD
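The factorization and metric-upgrade steps can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `factorize` and the row layout of `W` (rows 2i and 2i+1 hold the x and y coordinates seen in image i) are choices made here, every point is assumed visible in every image, and G is factored with a symmetric eigendecomposition, which coincides with the SVD mentioned on the slide when G is positive definite.

```python
import numpy as np

def factorize(W):
    """Rank-3 factorization W ~= M @ S with a metric upgrade (orthographic cameras).

    W is the 2m x n measurement matrix; rows 2i and 2i+1 hold the x and y
    image coordinates of all n points in image i (an assumed layout).
    """
    # Subtracting each row's mean moves every image origin to the 2D centroid,
    # which drops the camera translation.
    W = W - W.mean(axis=1, keepdims=True)

    # Truncating the SVD to rank 3 gives the affine factorization W = M' S'.
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    M_aff = U[:, :3] * np.sqrt(s[:3])            # M' (2m x 3)
    S_aff = np.sqrt(s[:3])[:, None] * Vt[:3]     # S' (3 x n)

    # a^T G b is linear in the 6 unique entries of the symmetric matrix G.
    def row(a, b):
        return [a[0] * b[0], a[0] * b[1] + a[1] * b[0], a[0] * b[2] + a[2] * b[0],
                a[1] * b[1], a[1] * b[2] + a[2] * b[1], a[2] * b[2]]

    rows, rhs = [], []
    for i in range(M_aff.shape[0] // 2):
        x, y = M_aff[2 * i], M_aff[2 * i + 1]
        rows += [row(x, x), row(y, y), row(x, y)]   # unit length, unit length, orthogonal
        rhs += [1.0, 1.0, 0.0]
    g, *_ = np.linalg.lstsq(np.array(rows), np.array(rhs), rcond=None)
    G = np.array([[g[0], g[1], g[2]],
                  [g[1], g[3], g[4]],
                  [g[2], g[4], g[5]]])

    # Factor G = A A^T; clipping guards against tiny negative eigenvalues from noise.
    w, V = np.linalg.eigh(G)
    A = V * np.sqrt(np.clip(w, 1e-12, None))

    return M_aff @ A, np.linalg.solve(A, S_aff)   # M = M' A, S = A^{-1} S'
```

The recovered M and S are only defined up to a common rotation or reflection, since any orthogonal Q gives the same product (MQ)(QᵀS).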

Page 5: SFM under orthographic projection

Results

Page 6: SFM under orthographic projection

Extensions to factorization methods

• Paraperspective [Poelman & Kanade, PAMI 97]
• Sequential Factorization [Morita & Kanade, PAMI 97]
• Factorization under perspective [Christy & Horaud, PAMI 96] [Sturm & Triggs, ECCV 96]
• Factorization with Uncertainty [Anandan & Irani, IJCV 2002]

Page 7: SFM under orthographic projection

Bundle adjustment

Page 8: SFM under orthographic projection

Richard Szeliski CSE 576 (Spring 2005): Computer Vision

Structure from motion

• How many points do we need to match?
• 2 frames:
  (R,t): 5 dof + 3n point locations ≤ 4n point measurements ⇒ n ≥ 5
• k frames:
  6(k–1) – 1 + 3n ≤ 2kn
• always want to use many more
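The counting argument can be checked numerically. The sketch below solves 6(k–1) – 1 + 3n ≤ 2kn for the smallest integer n; the function name `min_points` is hypothetical and chosen here for illustration.

```python
def min_points(k):
    """Smallest n satisfying 6(k-1) - 1 + 3n <= 2kn, for k >= 2 frames."""
    camera_dof = 6 * (k - 1) - 1          # unknown camera parameters
    # Rearranged: n * (2k - 3) >= camera_dof; round up with integer arithmetic.
    return -(-camera_dof // (2 * k - 3))

for k in (2, 3, 4, 10):
    print(k, min_points(k))
```

Two frames give n = 5, matching the slide, and the bound never drops below 4 because (6k – 7)/(2k – 3) = 3 + 2/(2k – 3) stays above 3.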

Page 9: SFM under orthographic projection

Bundle Adjustment

• What makes this non-linear minimization hard?
  • many more parameters: potentially slow
  • poorer conditioning (high correlation)
  • potentially lots of outliers

Page 10: SFM under orthographic projection

Lots of parameters: sparsity

• Only a few entries in Jacobian are non-zero
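To see why the Jacobian is sparse, note that the reprojection error of one observation depends only on the camera that took the image and the point being observed. The sketch below lists the columns that can be non-zero for one observation. It assumes 6 parameters per camera, 3 per point, and a parameter vector ordered as all cameras followed by all points; the function name and that layout are illustrative choices, not taken from the slides.

```python
def nonzero_columns(cam, pt, n_cameras, cam_dof=6, pt_dof=3):
    """Column indices that can be non-zero in the Jacobian rows of one observation."""
    cam_start = cam * cam_dof                       # block of the observing camera
    pt_start = n_cameras * cam_dof + pt * pt_dof    # block of the observed point
    cam_cols = list(range(cam_start, cam_start + cam_dof))
    pt_cols = list(range(pt_start, pt_start + pt_dof))
    return cam_cols + pt_cols

# With 10 cameras and 100 points, each row touches 9 of 360 columns.
n_cameras, n_points = 10, 100
total_cols = 6 * n_cameras + 3 * n_points
density = len(nonzero_columns(0, 0, n_cameras)) / total_cols
print(density)
```

In this example only 2.5% of each row is non-zero, and the fraction shrinks as more cameras and points are added.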

Page 11: SFM under orthographic projection

Robust error models
• Outlier rejection
  • use robust penalty applied to each set of joint measurements
  • for extremely bad data, use random sampling [RANSAC, Fischler & Bolles, CACM’81]
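As one possible robust penalty, the sketch below uses the Huber function, which is quadratic for small errors and linear for large ones. The slides do not name a specific penalty, so Huber, the threshold `delta`, and both function names are assumptions made for illustration. The penalty is applied to the joint (x, y) error norm of each measurement rather than to x and y separately.

```python
from math import hypot

def huber(r, delta=1.0):
    """Huber penalty: quadratic for |r| <= delta, linear beyond it."""
    a = abs(r)
    if a <= delta:
        return 0.5 * a * a
    return delta * (a - 0.5 * delta)

def robust_cost(residuals, delta=1.0):
    """Sum the penalty over 2D reprojection residuals given as (dx, dy) pairs."""
    return sum(huber(hypot(dx, dy), delta) for dx, dy in residuals)

# An outlier with error norm 5 costs 4.5 here instead of 12.5 under squared error.
print(robust_cost([(3.0, 4.0)]))
```

Because the cost grows only linearly for large errors, a single bad match pulls the solution far less than it would under a plain sum of squares.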

Page 12: SFM under orthographic projection

Structure from motion: limitations
• Very difficult to reliably estimate metric structure and motion unless:
  • large (x or y) rotation or
  • large field of view and depth variation
• Camera calibration important for Euclidean reconstructions
• Need good feature tracker
• Lens distortion

Page 13: SFM under orthographic projection

Issues in SFM

• Track lifetime
• Nonlinear lens distortion
• Prior knowledge and scene constraints
• Multiple motions

Page 14: SFM under orthographic projection

Track lifetime

every 50th frame of an 800-frame sequence

Page 15: SFM under orthographic projection

Track lifetime

lifetime of 3192 tracks from the previous sequence

Page 16: SFM under orthographic projection

Track lifetime

track length histogram

Page 17: SFM under orthographic projection

Nonlinear lens distortion

Page 18: SFM under orthographic projection

Nonlinear lens distortion

effect of lens distortion

Page 19: SFM under orthographic projection

Prior knowledge and scene constraints

add a constraint that several lines are parallel

Page 20: SFM under orthographic projection

Prior knowledge and scene constraints

add a constraint that it is a turntable sequence

Page 21: SFM under orthographic projection

Applications of Structure from Motion

Page 22: SFM under orthographic projection

Jurassic Park

Page 23: SFM under orthographic projection

PhotoSynth

http://labs.live.com/photosynth/

Page 24: SFM under orthographic projection

So far focused on 3D modeling
• Multi-Frame Structure from Motion
• Multi-View Stereo

Unknown camera viewpoints

Page 25: SFM under orthographic projection

Next
• Recognition

Page 26: SFM under orthographic projection

Today
• Recognition

Page 27: SFM under orthographic projection

Recognition problems
• What is it?
  • Object detection
• Who is it?
  • Recognizing identity
• What are they doing?
  • Activities
• All of these are classification problems
  • Choose one class from a list of possible candidates

Page 28: SFM under orthographic projection

How do humans do recognition?
• We don’t completely know yet
• But we have some experimental observations.

Page 29: SFM under orthographic projection

Observation 1:

Page 30: SFM under orthographic projection

Observation 1:

The “Margaret Thatcher Illusion”, by Peter Thompson

Page 31: SFM under orthographic projection

Observation 1:

The “Margaret Thatcher Illusion”, by Peter Thompson

• http://www.wjh.harvard.edu/~lombrozo/home/illusions/thatcher.html#bottom
• Humans process upside-down images separately

Page 32: SFM under orthographic projection

Observation 2:

Jim Carrey Kevin Costner

• High frequency information is not enough

Page 33: SFM under orthographic projection

Observation 3:

Page 34: SFM under orthographic projection

Observation 3:

• Negative contrast is difficult

Page 35: SFM under orthographic projection

Observation 4:

• Image Warping is OK

Page 36: SFM under orthographic projection

The list goes on
• Face Recognition by Humans: Nineteen Results All Computer Vision Researchers Should Know About
  http://web.mit.edu/bcs/sinha/papers/19results_sinha_etal.pdf

Page 37: SFM under orthographic projection

Face detection

• How to tell if a face is present?

Page 38: SFM under orthographic projection

One simple method: skin detection

[Figure: skin region, R and G components]

• Skin pixels have a distinctive range of colors
  • Corresponds to region(s) in RGB color space
    – for visualization, only R and G components are shown above

Skin classifier
• A pixel X = (R,G,B) is skin if it is in the skin region
• But how to find this region?

Page 39: SFM under orthographic projection

Skin detection

• Learn the skin region from examples
  • Manually label pixels in one or more “training images” as skin or not skin
  • Plot the training data in RGB space
    – skin pixels shown in orange, non-skin pixels shown in blue
    – some skin pixels may be outside the region, non-skin pixels inside. Why?

Skin classifier
• Given X = (R,G,B): how to determine if it is skin or not?

Page 40: SFM under orthographic projection

Skin classification techniques

Skin classifier
• Given X = (R,G,B): how to determine if it is skin or not?
• Nearest neighbor
  – find labeled pixel closest to X
  – choose the label for that pixel
• Data modeling
  – fit a model (curve, surface, or volume) to each class
• Probabilistic data modeling
  – fit a probability model to each class
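The nearest-neighbor option is the simplest of the three to write down. The sketch below is a minimal version that assumes squared Euclidean distance in RGB space; the function name and the two training pixels are made up for illustration.

```python
def nearest_neighbor_label(x, training):
    """Return the label of the training pixel closest to x.

    x is an (R, G, B) tuple; training is a list of ((R, G, B), label) pairs.
    """
    def sq_dist(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

    _, label = min(training, key=lambda item: sq_dist(item[0], x))
    return label

# Hypothetical hand-labeled training pixels.
training = [((220, 170, 140), "skin"), ((30, 60, 200), "not skin")]
print(nearest_neighbor_label((200, 160, 130), training))
```

Each query scans the whole training set, which is one reason to prefer fitting a compact model to each class when there are many labeled pixels.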

Page 41: SFM under orthographic projection

Probability
• Basic probability
  • X is a random variable
  • P(X) is the probability that X achieves a certain value
    – called a PDF (probability distribution/density function)
    – a 2D PDF is a surface, 3D PDF is a volume
  • $\int P(X)\,dX = 1$ (continuous X) or $\sum_X P(X) = 1$ (discrete X)
  • Conditional probability: P(X | Y)
    – probability of X given that we already know Y

Page 42: SFM under orthographic projection

Probabilistic skin classification

• Now we can model uncertainty
  • Each pixel has a probability of being skin or not skin

Skin classifier
• Given X = (R,G,B): how to determine if it is skin or not?
• Choose interpretation of highest probability
  – set X to be a skin pixel if and only if $P(\text{skin} \mid X) > P(\sim\text{skin} \mid X)$

Where do we get $P(\text{skin} \mid X)$ and $P(\sim\text{skin} \mid X)$?

Page 43: SFM under orthographic projection

Learning conditional PDF’s

• We can calculate P(R | skin) from a set of training images
  • It is simply a histogram over the pixels in the training images
    – each bin $R_i$ contains the proportion of skin pixels with color $R_i$

This doesn’t work as well in higher-dimensional spaces. Why not?

Approach: fit parametric PDF functions
• common choice is rotated Gaussian
  – center
  – covariance
    » orientation, size defined by eigenvecs, eigenvals
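The histogram estimate of P(R | skin) can be sketched as below, assuming 8-bit red-channel values and one bin per intensity level. The function name and the sample values are illustrative only.

```python
def histogram_pdf(values, n_bins=256):
    """Estimate P(R | class) as a normalized histogram of 8-bit intensities."""
    counts = [0] * n_bins
    for v in values:
        counts[v] += 1
    total = len(values)
    # Each bin holds the proportion of training pixels with that intensity.
    return [c / total for c in counts]

# Hypothetical red-channel values of pixels labeled as skin.
skin_reds = [200, 200, 210, 220]
p_r_given_skin = histogram_pdf(skin_reds)
print(p_r_given_skin[200])
```

A one-channel histogram needs 256 bins, while a joint histogram over all three 8-bit channels would need 256³ (about 16.7 million), so most bins would receive no training pixels at all.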

Page 44: SFM under orthographic projection

Learning conditional PDF’s

• We can calculate P(R | skin) from a set of training images
  • It is simply a histogram over the pixels in the training images
    – each bin $R_i$ contains the proportion of skin pixels with color $R_i$

But this isn’t quite what we want
• Why not? How to determine if a pixel is skin?
• We want P(skin | R) not P(R | skin)
• How can we get it?

Page 45: SFM under orthographic projection

Bayes rule

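• In terms of our problem:

  $$P(\text{skin} \mid R) = \frac{P(R \mid \text{skin})\,P(\text{skin})}{P(R)}$$

  – what we measure (likelihood): $P(R \mid \text{skin})$
  – domain knowledge (prior): $P(\text{skin})$
  – what we want (posterior): $P(\text{skin} \mid R)$
  – normalization term: $P(R) = P(R \mid \text{skin})\,P(\text{skin}) + P(R \mid \sim\text{skin})\,P(\sim\text{skin})$

The prior: P(skin)
• Could use domain knowledge
  – P(skin) may be larger if we know the image contains a person
  – for a portrait, P(skin) may be higher for pixels in the center
• Could learn the prior from the training set. How?
  – P(skin) may be proportion of skin pixels in training set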
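Bayes rule for the two-class skin problem can be sketched as below. The function names and the numbers in the usage lines are invented for illustration; the prior is learned as the proportion of training pixels labeled skin, as suggested above.

```python
def learn_prior(labels):
    """P(skin) as the proportion of training pixels labeled "skin"."""
    return sum(1 for lab in labels if lab == "skin") / len(labels)

def posterior_skin(p_r_given_skin, p_r_given_not_skin, prior_skin):
    """P(skin | R) from the two likelihoods and the prior P(skin)."""
    numerator = p_r_given_skin * prior_skin
    # Normalization term P(R), summed over both classes.
    evidence = numerator + p_r_given_not_skin * (1.0 - prior_skin)
    return numerator / evidence

# Same likelihoods, different priors: the prior can flip the decision.
print(posterior_skin(0.2, 0.1, 0.5))
print(posterior_skin(0.2, 0.1, 0.1))
```

With a uniform prior the pixel is more likely skin than not, but with P(skin) = 0.1 the posterior falls below 0.5 even though the likelihood still favors skin.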

Page 46: SFM under orthographic projection

Bayesian estimation

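• Bayesian estimation
  • Goal is to choose the label (skin or ~skin) that maximizes the posterior
    = minimize probability of misclassification
    – this is called Maximum A Posteriori (MAP) estimation
• Suppose the prior is uniform: P(skin) = P(~skin) = 0.5
  – in this case $P(\text{skin} \mid R) \propto P(R \mid \text{skin})$ and $P(\sim\text{skin} \mid R) \propto P(R \mid \sim\text{skin})$,
  – maximizing the posterior is equivalent to maximizing the likelihood
    » $P(\text{skin} \mid R) > P(\sim\text{skin} \mid R)$ if and only if $P(R \mid \text{skin}) > P(R \mid \sim\text{skin})$
  – this is called Maximum Likelihood (ML) estimation

[Figure: likelihood and unnormalized posterior curves]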
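The equivalence of MAP and ML under a uniform prior follows in three steps from Bayes rule. The derivation below is a short worked version, assuming $P(R) > 0$ so that the normalization term can be cancelled.

```latex
\begin{align*}
P(\text{skin} \mid R) > P(\sim\text{skin} \mid R)
  &\iff \frac{P(R \mid \text{skin})\,P(\text{skin})}{P(R)}
        > \frac{P(R \mid \sim\text{skin})\,P(\sim\text{skin})}{P(R)} \\
  &\iff P(R \mid \text{skin}) \cdot 0.5 > P(R \mid \sim\text{skin}) \cdot 0.5 \\
  &\iff P(R \mid \text{skin}) > P(R \mid \sim\text{skin})
\end{align*}
```

When the prior is not uniform, the second step keeps the factors $P(\text{skin})$ and $P(\sim\text{skin})$, and the two decision rules can disagree.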

Page 47: SFM under orthographic projection

Skin detection results

Page 48: SFM under orthographic projection

General classification
• This same procedure applies in more general circumstances
  • More than two classes
  • More than one dimension

Example: face detection
• Here, X is an image region
  – dimension = # pixels
  – each face can be thought of as a point in a high dimensional space

H. Schneiderman, T. Kanade. "A Statistical Method for 3D Object Detection Applied to Faces and Cars". IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2000) http://www-2.cs.cmu.edu/afs/cs.cmu.edu/user/hws/www/CVPR00.pdf

