Post on 17-Dec-2015
Face Recognition
Image Understanding, Xuejin Chen

Face Recognition
• Good website: http://www.face-rec.org/
• Eigenface [Turk & Pentland]
Eigenface
• Projecting a new image into the subspace spanned by the eigenfaces (face space)
• Classifying the face by comparing its position in face space with the positions of known individuals
• Face images used as the training set
• Average face and 7 eigenfaces calculated from the training images
Calculating Eigenfaces
• Face image: an N×N intensity matrix I(x, y), treated as an N²-dimensional vector
• PCA: find the vectors that best account for the distribution of face images within the entire image space
  – Training images Γ_1, Γ_2, Γ_3, ..., Γ_M
  – Average face: Ψ = (1/M) Σ_{n=1}^{M} Γ_n
  – Each face differs from the average by Φ_i = Γ_i − Ψ
  – Seek a set of M orthonormal vectors u_k that best describe the distribution of the data
• The kth vector u_k is chosen to maximize

    λ_k = (1/M) Σ_{n=1}^{M} (u_k^T Φ_n)²

  subject to the orthonormality constraint

    u_l^T u_k = δ_{lk} = 1 if l = k, 0 otherwise
Calculating Eigenfaces
• The vectors u_k and scalars λ_k are the eigenvectors and eigenvalues, respectively, of the covariance matrix

    C = (1/M) Σ_{n=1}^{M} Φ_n Φ_n^T = A A^T,  where A = [Φ_1, Φ_2, ..., Φ_M]

• Intractable task for a typical image size: C is N² × N²
• Need a computationally feasible method to find these eigenvectors
Calculating Eigenfaces
• If the number of images M < N², there are only M meaningful eigenvectors (in fact M − 1, since the Φ_n sum to zero)
• Consider the eigenvectors v_i of A^T A such that

    A^T A v_i = μ_i v_i

• Premultiplying both sides by A gives

    A A^T (A v_i) = μ_i (A v_i)

  so u_i = A v_i are eigenvectors of C = A A^T
• Solve an M×M matrix instead: 16×16 rather than 16384×16384 (for a 128×128 image)
Calculating Eigenfaces
• Construct the M×M matrix L = A^T A, where L_mn = Φ_m^T Φ_n
• Compute the M eigenvectors v_l of L
• The M eigenfaces are linear combinations of the M difference faces:

    u_l = Σ_{k=1}^{M} v_{lk} Φ_k,  l = 1, ..., M

• The computations are greatly reduced
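As a concrete illustration, here is a minimal NumPy sketch of the A^T A trick (the function name, array layout, and random stand-in data are my own, not from the original slides):

```python
import numpy as np

def eigenfaces(images, num_faces):
    """Compute eigenfaces via the small M x M matrix L = A^T A.

    images: (M, N) array, one flattened face per row (assumed layout).
    Returns the average face and `num_faces` unit-norm eigenfaces (rows).
    """
    M = images.shape[0]
    mean = images.mean(axis=0)              # average face Psi
    A = (images - mean).T                   # columns are Phi_n = Gamma_n - Psi
    L = A.T @ A / M                         # M x M instead of N^2 x N^2
    eigvals, V = np.linalg.eigh(L)          # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:num_faces]
    U = A @ V[:, order]                     # u_l = sum_k v_lk * Phi_k
    U /= np.linalg.norm(U, axis=0)          # normalize each eigenface
    return mean, U.T

# Tiny demo with random data standing in for real 64x64 face images
rng = np.random.default_rng(0)
faces = rng.normal(size=(16, 64 * 64))
mean, U = eigenfaces(faces, 7)
```

For a 128×128 image this diagonalizes a 16×16 matrix rather than a 16384×16384 one, which is the whole point of the trick.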
Classify a Face Image
• 40 eigenfaces are sufficient for a good description for an ensemble of M=115 images [Sirovich and Kirby 1987]
• A smaller M′ is sufficient for identification
• Choose M′ significant eigenvectors of the L matrix (M = 16, M′ = 7)
Classify a Face Image
• A new image Γ is transformed into its eigenface components by

    ω_k = u_k^T (Γ − Ψ),  k = 1, ..., M′

  forming the weight vector Ω^T = [ω_1, ω_2, ..., ω_{M′}]
Classify a Face Image
• The weights ω_k describe the contribution of each eigenface in representing the input image, treating the eigenfaces as a basis set for face images
• Find the class of the face
  – A simple distance measure:

    ε_k² = ||Ω − Ω_k||²

  where Ω_k is the average weight vector of a class or individual
Classify a Face Image
• Creating the weight vector Ω amounts to projecting the original face image onto the low-dimensional face space
• Distance from the image to face space:

    ε_f² = ||Φ − Φ_f||²,  where Φ = Γ − Ψ and Φ_f = Σ_{i=1}^{M′} ω_i u_i
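These two quantities, the weight vector Ω and the distance ε_f to face space, can be sketched as follows, assuming the eigenfaces are stored as orthonormal rows of a matrix U (function and variable names are illustrative):

```python
import numpy as np

def project(face, mean, U):
    """Weight vector: omega_k = u_k^T (Gamma - Psi); rows of U are eigenfaces."""
    return U @ (face - mean)

def distance_to_face_space(face, mean, U):
    """eps_f^2 = ||Phi - Phi_f||^2 with Phi = Gamma - Psi, Phi_f = sum_i omega_i u_i."""
    phi = face - mean
    omega = U @ phi
    phi_f = U.T @ omega                     # reconstruction in face space
    return float(np.sum((phi - phi_f) ** 2))
```

A vector that already lies in the span of the eigenfaces has ε_f² = 0; a nonface image typically has a large ε_f², which is exactly the "faceness" test used later for detection.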
Four possibilities for an image and pattern vector
• Near face space and near a face class
• Near face space but not near a known face class
• Distant from face space but near a face class
• Distant from face space and not near a known face class
[Figure: distance to face space: (a) 29.8, (b) 58.5, (c) 5217.4]
Summary of Eigenface Recognition
1. Collect a set of characteristic face images of the known individuals
   – The set should include a number of images for each person, with some variation in expression and lighting (M = 40)
2. Calculate the 40×40 matrix L, find its eigenvalues and eigenvectors, and choose the M′ (~10) eigenvectors with the highest eigenvalues
3. Compute the eigenfaces u_k
4. For each known individual, calculate the class vector Ω_k by averaging the eigenface pattern vectors Ω
   – Choose a threshold θ_ε that defines the maximum allowable distance from any face class, and
   – a threshold θ_f that defines the maximum allowable distance from face space
Summary of Eigenface Recognition
5. For each new face image to be identified, calculate
   – its pattern vector Ω,
   – the distance ε_f to face space,
   – the distance ε_k to each known class
   – If the minimum distance ε_k < θ_ε (and ε_f < θ_f):
     • classify the input face as the individual associated with class vector Ω_k
   – If the minimum distance ε_k > θ_ε:
     • the image may be classified as unknown, and
     • optionally used to begin a new face class
• Optionally, if the new image is classified as a known individual, it may be added to the original set of familiar face images, and the eigenfaces may be recalculated (steps 1–4)
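Step 5's decision logic (covering the four possibilities listed earlier) can be sketched as a small function; the thresholds and class means are assumed to come from steps 1–4, and all names here are illustrative:

```python
import numpy as np

def classify(omega, eps_f, class_means, theta_eps, theta_f):
    """Decision rule sketch: returns a class index, 'unknown face',
    or 'not a face', following the threshold tests above."""
    if eps_f > theta_f:                     # distant from face space
        return "not a face"
    dists = [float(np.sum((omega - m) ** 2)) for m in class_means]
    k = int(np.argmin(dists))
    # near face space: a known individual only if within theta_eps of a class
    return k if dists[k] < theta_eps ** 2 else "unknown face"
```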
Locating and Detecting Faces
• Assume a centered face image, the same size as the training images and the eigenfaces
• Use face space to locate the face in an image
• Images of faces do not change radically when projected into the face space, while the projection of nonface images appears quite different
  – This can be used to detect faces in a scene
Use face space to locate face
• At every location in the image, calculate the distance between the local subimage and face space; this distance serves as a measure of "faceness", producing a face map
• An expensive calculation
Face Map
• For a subimage at (x, y): Φ = Γ − Ψ, with projection Φ_f onto face space

    ε_f² = ||Φ − Φ_f||²
         = (Φ − Φ_f)^T (Φ − Φ_f)
         = Φ^T Φ − 2 Φ^T Φ_f + Φ_f^T Φ_f

• Since Φ_f ⊥ (Φ − Φ_f), we have Φ^T Φ_f = Φ_f^T Φ_f, and therefore

    ε_f² = Φ^T Φ − Φ_f^T Φ_f
Face Map
• Φ_f is a linear combination of the orthonormal eigenface vectors,

    Φ_f = Σ_{i=1}^{L} ω_i u_i

  so that

    Φ_f^T Φ_f = Σ_{i=1}^{L} ω_i²

• The face map is therefore

    ε_f²(x, y) = Φ(x, y)^T Φ(x, y) − Σ_{i=1}^{L} ω_i(x, y)²
Face Map
• Expanding the second term of

    ε_f²(x, y) = Φ(x, y)^T Φ(x, y) − Σ_{i=1}^{L} ω_i(x, y)²

  gives

    Σ_{i=1}^{L} ω_i(x, y)² = Σ_{i=1}^{L} [u_i^T Γ(x, y) − u_i^T Ψ]²

• The first term u_i^T Γ(x, y) is a correlation of the input image I(x, y) with the eigenface u_i (a correlation operator); the second term u_i^T Ψ is a precomputed constant
Face Map
• Expanding the first term as well:

    Φ(x, y)^T Φ(x, y) = [Γ(x, y) − Ψ]^T [Γ(x, y) − Ψ]
                      = Γ(x, y)^T Γ(x, y) − 2 Γ(x, y)^T Ψ + Ψ^T Ψ

  so

    ε_f²(x, y) = Γ(x, y)^T Γ(x, y) − 2 Γ(x, y)^T Ψ + Ψ^T Ψ − Σ_{i=1}^{L} [u_i^T Γ(x, y) − u_i^T Ψ]²

• Ψ^T Ψ and the u_i^T Ψ terms are precomputed; only L + 1 correlations with the input image are needed
• Can be implemented by a simple neural network
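A naive version of the face map (looping over every patch rather than using the L + 1 precomputed correlations; all names are illustrative) makes the formula concrete. It relies on ε_f² = Φ^T Φ − Σ ω_i², which holds when the eigenfaces are orthonormal:

```python
import numpy as np

def face_map(image, mean, U, patch_shape):
    """eps_f^2(x, y) at every patch location: low values = face-like.
    Rows of U are orthonormal eigenfaces for flattened patches."""
    H, W = image.shape
    h, w = patch_shape
    out = np.empty((H - h + 1, W - w + 1))
    for y in range(out.shape[0]):
        for x in range(out.shape[1]):
            gamma = image[y:y + h, x:x + w].ravel()
            phi = gamma - mean
            omega = U @ phi
            out[y, x] = phi @ phi - omega @ omega  # Phi^T Phi - sum omega_i^2
    return out
```

The correlation formulation computes the same map with L + 1 convolutions over the whole image, which is far cheaper than this per-patch loop.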
Learning to Recognize New Faces
• An image that is close to face space but not classified as one of the familiar faces is labeled "unknown"
• If a collection of “unknown” pattern vectors cluster in the pattern space, a new face is postulated
• Check similarity: the distance from each image to the mean is smaller than a threshold
• Add the new face to database (optionally)
Background Issue
• Eigenface analysis cannot distinguish the face from the background
• Segmentation?
• Multiply the image by a 2D Gaussian window centered on the face
  – Deemphasizes the area outside the face
  – Also practical for coping with hairstyle changes
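A minimal sketch of such a window (the size and sigma below are arbitrary choices, not values from the slides):

```python
import numpy as np

def gaussian_window(size, sigma):
    """2D Gaussian centered on the patch; multiplying a (centered) face
    image by this deemphasizes the background and hair."""
    ax = np.arange(size) - (size - 1) / 2.0
    g = np.exp(-(ax ** 2) / (2.0 * sigma ** 2))
    return np.outer(g, g)                   # separable: g(x) * g(y)
```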
Scale and Orientation Issue
• Recognition performance decreases quickly as the size is misjudged
• Motion estimation?
• Multiscale eigenfaces / multiscale input image
• Non-upright faces
  – Orientation estimation using symmetry operators
Distribution in Face Space
• Nearest-neighbor classification assumes Gaussian distribution of an individual feature vector
• There is no prior reason to assume any particular distribution
• Nonlinear networks to learn the distribution by example [Fleming and Cottrell, 1990]
Multiple Views
• Define a number of face classes for each person
  – Frontal view
  – Side view at ±45°
  – Right and left profile views
Experiments
• Database
  – Over 2500 face images under controlled conditions
  – 16 subjects
  – All combinations of 3 head orientations, 3 head sizes, and 3 lighting conditions
• Construct a 6-level Gaussian pyramid, from 512×512 down to 16×16
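The pyramid construction can be sketched as repeated blur-and-halve; to keep the sketch dependency-free, a 2×2 box average stands in for the Gaussian filter (so this is only an approximation of a true Gaussian pyramid):

```python
import numpy as np

def image_pyramid(image, levels):
    """List of images, each half the size of the previous (assumes the
    input side lengths are divisible by 2**(levels - 1))."""
    pyr = [image]
    for _ in range(levels - 1):
        im = pyr[-1]
        # 2x2 average then subsample: a crude low-pass + decimate step
        im = (im[0::2, 0::2] + im[1::2, 0::2]
              + im[0::2, 1::2] + im[1::2, 1::2]) / 4.0
        pyr.append(im)
    return pyr
```

With levels = 6 and a 512×512 input, the final level is 16×16, matching the setup described above.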
Variation of face images for one individual
Experiments with Lighting, Size, Orientation
• Training sets
  – One image of each person, under the same lighting condition, size, and orientation
  – Use seven eigenfaces
• Mean accuracy measured as a function of the difference between training and test conditions
  – Difference in illumination
  – Image size
  – Head orientation
  – Combinations of illumination, size, and orientation
• Changing lighting conditions: few errors
• Changing image size: performance drops dramatically
  – Need a multiscale approach
[Results: (a) lighting 96%, (b) size 85%, (c) orientation 64%; (d) orientation & lighting, (e) orientation & size 1, (f) orientation & size 2, (g) size & lighting 1, (h) size & lighting 2]
Experiments with varying thresholds
• Smaller threshold
  – Fewer errors, but more faces rejected as unknown (false negatives)
• Larger threshold
  – More errors
• To achieve 100% accurate recognition, the unknown rate rises to
  – 19% while varying lighting
  – 39% for orientation
  – 60% for size
• With the unknown rate set to 20%, the correct recognition rate is
  – 100% for lighting
  – 94% for orientation
  – 74% for size
Neural Networks
• Can be implemented using parallel computing elements
Collection of networks to implement computation of the pattern vector, projection into face space, distance from face space, and
identification
Conclusion
• Not a general recognition algorithm
• Practical and well fitted to face recognition
• Fast and simple
• Does not require perfect identification
  – Low false-positive rate
  – A small set of likely matches for user interaction
Eigenface
• Tutorial
Bayesian Face Recognition
Baback Moghaddam, Tony Jebara and Alex Pentland
Pattern Recognition 33(11), Nov. 2000
Novelty
• A direct visual matching of face images
• Probabilistic measure of similarity
• Bayesian (MAP) analysis of image differences
• Simple computation of nonlinear Bayesian similarity
A Bayesian Approach
• Many face recognition systems rely on similarity metrics
  – Nearest-neighbor, cross-correlation
  – Template matching
• Which types of variation are critical in expressing similarity?
Probabilistic Similarity Measure
• Intensity difference: Δ = I_1 − I_2
• Two classes of facial image variations
  – Intrapersonal variations Ω_I
  – Extrapersonal variations Ω_E
• Similarity measure

    S(I_1, I_2) = P(Ω_I | Δ)

  which can be estimated using likelihoods given by Bayes rule
• A non-Euclidean similarity measure
A Bayesian Approach
• First instance of non-Euclidean similarity measure for face recognition
• A generalized extension of
  – Linear Discriminant Analysis (LDA)
  – FisherFace
• Has computational and storage advantages over most linear methods for large databases
Probabilistic Similarity Measures
• Previous Bayesian analysis of facial appearance
• 3 different inter-image representations were analyzed using the binary formulation
  – XYI-warp modal deformation spectra
  – XY-warp optical flow fields
  – Simplified I-(intensity)-only image-based difference Δ = I_1 − I_2
Probabilistic Similarity Measures
• Intrapersonal variations
  – Images of the same individual with different expressions, lighting conditions, ...
• Extrapersonal variations
  – Variations when matching two different individuals
• Both are modeled as Gaussian-distributed; learn the likelihoods P(Δ | Ω_I) and P(Δ | Ω_E)
Probabilistic Similarity Measures
• Similarity score (Bayes rule):

    S(I_1, I_2) = P(Ω_I | Δ) = P(Δ | Ω_I) P(Ω_I) / [ P(Δ | Ω_I) P(Ω_I) + P(Δ | Ω_E) P(Ω_E) ]

  – The priors P(Ω_I), P(Ω_E) can be set from the proportions of images in the database or from specified knowledge
• Maximum a posteriori (MAP) rule: two images are of the same individual if S(I_1, I_2) > 1/2
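The MAP score is easy to state in code once the two class log-likelihood functions are available; the toy 1-D Gaussians below are my own stand-ins for the learned densities, not the paper's:

```python
import numpy as np

def map_similarity(delta, logp_intra, logp_extra, prior_intra=0.5):
    """S(I1, I2) = P(Omega_I | Delta) via Bayes rule, given the two
    class log-likelihood functions (names here are placeholders)."""
    li = np.exp(logp_intra(delta)) * prior_intra
    le = np.exp(logp_extra(delta)) * (1.0 - prior_intra)
    return li / (li + le)

# Toy 1-D likelihoods: intrapersonal N(0, 1), extrapersonal N(0, 9)
logp_intra = lambda d: -0.5 * d**2 - 0.5 * np.log(2 * np.pi)
logp_extra = lambda d: -0.5 * d**2 / 9 - 0.5 * np.log(2 * np.pi * 9)
```

Small differences score above 1/2 (same individual) because the tighter intrapersonal density dominates near zero; large differences score below 1/2.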
Probabilistic Similarity Measures
• M individuals: M classes
  – A many-class classification is reduced to a binary pattern classification
• Maximum likelihood measure

    S′(I_1, I_2) = P(Δ | Ω_I)

• Almost as effective as MAP in most cases
Subspace Density Estimation
• Intensity difference vector Δ: high dimensional
  – Not enough independent training examples
  – Computational cost is very large
  – The intrinsic dimensionality (major degrees of freedom) of the intensity difference is significantly smaller than N
• PCA
  – Divides the vector space R^N into two complementary subspaces [Moghaddam & Pentland]
Subspace Density Estimation
• Two complementary subspaces: the principal subspace F and its orthogonal complement F̄
[Figure: a typical eigenvalue spectrum]
Subspace Density Estimation
• Likelihood estimate

    P̂(Δ | Ω) = [ exp( −(1/2) Σ_{i=1}^{M} y_i²/λ_i ) / ( (2π)^{M/2} Π_{i=1}^{M} λ_i^{1/2} ) ]
              · [ exp( −ε²(Δ) / (2ρ) ) / (2πρ)^{(N−M)/2} ]
              = P_F(Δ | Ω) · P̂_F̄(Δ | Ω)

  – P_F(Δ | Ω): the true marginal density in the principal subspace F
  – P̂_F̄(Δ | Ω): the estimated marginal density in the orthogonal complement F̄
  – y_i: the principal components; ε²(Δ): the residual (DFFS)
  – ρ: the weighting parameter, found by minimizing the cross-entropy:

    ρ = 1/(N − M) Σ_{i=M+1}^{N} λ_i
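The estimate above can be written directly as a log-likelihood (numerically safer than multiplying the two factors); U, lam, and rho are assumed to come from a PCA of the training differences, and the function name is my own:

```python
import numpy as np

def subspace_log_likelihood(delta, mean, U, lam, rho):
    """log P-hat(Delta | Omega): exact Gaussian in the principal subspace F
    (rows of U are its eigenvectors, lam the eigenvalues) times an isotropic
    Gaussian with variance rho in the orthogonal complement F-bar."""
    d = delta - mean
    y = U @ d                               # principal components y_i
    M, N = U.shape[0], d.size
    eps2 = d @ d - y @ y                    # residual DFFS eps^2(Delta)
    log_f = -0.5 * np.sum(y**2 / lam) \
            - 0.5 * (M * np.log(2 * np.pi) + np.sum(np.log(lam)))
    log_fbar = -eps2 / (2 * rho) \
               - 0.5 * (N - M) * np.log(2 * np.pi * rho)
    return log_f + log_fbar
```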
Dual Eigenfaces
• Intrapersonal: variations mostly due to expression changes
• Extrapersonal: variations such as hair, facial hair, glasses, ...
Dual Eigenfaces
• Intensity differences of extrapersonal type span a larger vector space
• Intrapersonal eigenfaces correspond to a more tightly constrained subspace
• This is the key idea behind the probabilistic similarity measure
Efficient Similarity Computation
• Two classes, intrapersonal and extrapersonal, each modeled with a Gaussian distribution:

    P(Δ | Ω_I) = exp( −(1/2) Δ^T Σ_I^{-1} Δ ) / ( (2π)^{D/2} |Σ_I|^{1/2} )

    P(Δ | Ω_E) = exp( −(1/2) Δ^T Σ_E^{-1} Δ ) / ( (2π)^{D/2} |Σ_E|^{1/2} )

• Zero mean, since each pair contributes both I_k − I_j and I_j − I_k
• Use the principal components to evaluate the quadratic forms
Face Recognition and Detection, CSE 576, Spring 2008
Bayesian Face Recognition
Computation
• To get the similarity:
  – Subtract the two images
  – Project the difference onto the principal eigenfaces of both the extrapersonal and intrapersonal Gaussians
  – Take exponentials to obtain the likelihoods
  – Iterate these operations over all members of the database (many I_k images) until the maximum score is found
Offline Transformations
• Preprocess the I_j images with whitening transformations:

    i_j = Λ_I^{-1/2} V_I^T I_j : for intrapersonal

    e_j = Λ_E^{-1/2} V_E^T I_j : for extrapersonal

  where V and Λ are the matrices of the largest eigenvectors and eigenvalues of Σ_I or Σ_E
• Consequently, every image is stored as two vectors of whitened subspace coefficients
Offline Transformations
• Euclidean distances between whitened coefficient vectors are computed in only M_I and M_E dimensions for each similarity
• Likelihoods become

    P(Δ | Ω_I) ∝ exp( −||i_j − i_k||² / 2 ),  P(Δ | Ω_E) ∝ exp( −||e_j − e_k||² / 2 )

• This avoids unnecessary, repeated image differencing and online projection
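A sketch of the whitening step, verifying that the subspace Mahalanobis distance reduces to a plain Euclidean distance between the precomputed whitened vectors (all data here is random stand-in data, and the function name is my own):

```python
import numpy as np

def whiten(images, V, lam):
    """Offline whitening: y_j = Lambda^{-1/2} V^T I_j, so that
    ||y_j - y_k||^2 equals the subspace Mahalanobis exponent.
    V has the eigenvectors as columns, lam the eigenvalues."""
    return (V.T @ images.T / np.sqrt(lam)[:, None]).T

rng = np.random.default_rng(2)
V, _ = np.linalg.qr(rng.normal(size=(50, 5)))   # 5 orthonormal eigenvectors
lam = np.array([5.0, 4.0, 3.0, 2.0, 1.0])
imgs = rng.normal(size=(3, 50))                 # 3 toy "images"
Y = whiten(imgs, V, lam)                        # stored once, offline
```

At query time only Euclidean distances between the stored Y vectors are needed, which is the storage and speed advantage claimed above.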
Experiments
• ARPA FERET database
  – Images taken at different times, locations, and imaging conditions (clothes, lighting)
• Training set
  – 74 pairs of images (2/person)
• Test set
  – 38 pairs of images
  – Differences in clothing, hair, lighting, ...
[Figure: (a) training set, (b) test set]
Face Alignment (Detection)
Bayesian Matching
• Training data
  – 74 intrapersonal differences
  – 296 extrapersonal differences
  – Separate PCA analysis on each
• How are they distributed?
  – Completely enmeshed distributions with the same principal components
  – Hard to distinguish low-amplitude extrapersonal differences from intrapersonal differences
Separate PCA
• Dealing with low-dimensional hyper-ellipsoids that intersect near the origin of a very high-dimensional space
• The key distinguishing factor is their relative orientation
Dual Eigenfaces
• Compute the two sets of eigenfaces and the likelihood estimates P(Δ | Ω_I), P(Δ | Ω_E)
• Use M_I = 10 and M_E = 30 principal dimensions
• Set the priors equal: P(Ω_I) = P(Ω_E)
Performance
• Improvement over the accuracy obtained with a standard eigenface nearest-neighbor matching rule
• Maximum likelihood gets a similar result to MAP
  – 2–3% deficit in recognition rate
  – Computational cost is cut by a factor of 2
Performance
Computation Simplification
• Exact mapping of the probabilistic similarity score without requiring repeated image differencing and eigenface projections
• Nonlinear matching reduces to simple Euclidean norms of whitened feature vectors, which can be precomputed offline
Discussion
• Model larger variations in facial appearance?
  – Pose, facial decorations?
    • Regular glasses
    • Sunglasses, significant changes in beards, hair
  – Add more variation to the intrapersonal training set? ...
  – Views
    • View-based multiple models
Bayesian Face Recognition
Conclusions
• Good performance of probabilistic matching
• Advantageous: the intra/extra density estimates explicitly characterize the types of appearance variation
  – Discovering the principal modes of variation
• Optimal nonlinear decision rule
  – No need to compute and store eigenfaces for each individual
  – One or two global sets of eigenfaces are sufficient
• Maximum Likelihood vs. MAP
View-Based and Modular Eigenspaces for Face Recognition
Alex Pentland, Baback Moghaddam and Thad Starner
CVPR’94
Part-based eigenfeatures
• Learn a separate eigenspace for each face feature
• Boosts performance of regular eigenfaces
Morphable Face Model
• Use subspace to model elastic 2D or 3D shape variation (vertex positions), in addition to appearance variation
Shape S
Appearance T
Morphable Face Model
• 3D models from Blanz and Vetter '99:

    S_model = Σ_{i=1}^{m} a_i S_i

    T_model = Σ_{i=1}^{m} b_i T_i
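The morphable model itself is just these two linear combinations; a sketch with toy arrays (not real 3D scan data, and the function name is my own):

```python
import numpy as np

def morph(shapes, textures, a, b):
    """S_model = sum_i a_i S_i and T_model = sum_i b_i T_i: the linear
    combination at the heart of the Blanz-Vetter morphable model.
    shapes, textures: (m, n_vertices, 3) arrays of basis exemplars."""
    S_model = np.tensordot(a, shapes, axes=1)    # weighted sum of shapes
    T_model = np.tensordot(b, textures, axes=1)  # weighted sum of textures
    return S_model, T_model
```

Fitting the model to an image then amounts to searching for the coefficient vectors a and b (plus camera and lighting parameters) that best reproduce the input.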
Project 3 Eigenfaces
• Given a skeleton, you need to fill in the functions for
  – PCA to compute eigenfaces
  – Projection into the face space
  – Determining if a vector represents a face
  – Verifying a user based on a face; finding a face match given a set of user face information
  – Finding the size and position of a face in an image
Project 3 Eigenfaces
• The skeleton code is large; please take time to get familiar with the classes and methods
  – Vector
  – Image operations
  – Minimum modification: faces.cpp, eigenfaces.cpp