CAP5415 Computer Vision - Lecture 20: Face Recognition, Haar Features, Local Binary Patterns, and Boosting
Reminders
• Project deadlines / return
– 6 December, from 1 pm to 4:30 pm
– 8 December, from 1 pm to 4:30 pm
– Location: HEC 221
– 5-minute PowerPoint presentation + demo
– Return: your .ppt/.pdf presentation and code.
Face Detection and Recognition
(Figure: detection locates the face; recognition identifies it as "Sally".)
• Why wasn't the Massachusetts bomber identified by the Massachusetts Department of Motor Vehicles system from the video surveillance images?
• He was enrolled in the MA DMV database!
DMV Face Recognition System?
(Slide credit: Animetrics, Dr. Marc Valliant, VP & CTO)

Controlled Facial Photo
• Today's FR technology will reliably find a controlled facial photo in a mug-shot database of controlled photos.
• However, there are confounding variables in uncontrolled facial photos:
– Resolution (not enough pixels)
– Facial pose (angulated)
– Illumination
– Occluded facial areas
Further Difficulties
Three Goals
Feature Computation
• Features must be computed as quickly as possible.
Feature Selection
• Select the most discriminating features.
Real-timeliness
• Must focus on potentially positive image areas (those that contain faces).
Face Detection
• Before face recognition can be applied to a general image, the locations and sizes of any faces must first be found.
• Rowley, Baluja, and Kanade (1998)
Face Detection/Recognition Using Mobile Devices
• Face detection (the camera automatically adjusts the focus based on detected faces)
• Auto-login with recognized faces
Face Detection
• Feature-based: eyes, mouth, …
• Template-based: AAM, …
• Appearance-based: patches, …
Some of the representative works:
Rectangle (Haar-like) Features
"Rectangle filters"
Value = ∑ (pixels in white area) − ∑ (pixels in black area)
Fast Computation with Integral Images
• This can quickly be computed in one pass through the image.

Formal definition:
  ii(x, y) = ∑_{x′ ≤ x, y′ ≤ y} i(x′, y′)

Recursive definition:
  s(x, y) = s(x, y − 1) + i(x, y)
  ii(x, y) = ii(x − 1, y) + s(x, y)

IMAGE:
  0 1 1 1
  1 2 2 3
  1 2 1 1
  1 3 1 0

INTEGRAL IMAGE:
  0  1  2  3
  1  4  7 11
  2  7 11 16
  3 11 16 21
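To make the recipe concrete, here is a minimal NumPy sketch (function names are ours, not from the slides) that builds the integral image with two cumulative sums, reproduces the 4×4 table above, and evaluates a two-rectangle Haar-like feature with four lookups per rectangle:

```python
import numpy as np

def integral_image(img):
    """ii(x, y) = sum of i(x', y') over x' <= x, y' <= y (one pass: two cumsums)."""
    return img.cumsum(axis=0).cumsum(axis=1)

def rect_sum(ii, r0, c0, r1, c1):
    """Sum over img[r0:r1+1, c0:c1+1] from four integral-image lookups."""
    total = ii[r1, c1]
    if r0 > 0:
        total -= ii[r0 - 1, c1]
    if c0 > 0:
        total -= ii[r1, c0 - 1]
    if r0 > 0 and c0 > 0:
        total += ii[r0 - 1, c0 - 1]
    return total

img = np.array([[0, 1, 1, 1],
                [1, 2, 2, 3],
                [1, 2, 1, 1],
                [1, 3, 1, 0]])
ii = integral_image(img)          # reproduces the INTEGRAL IMAGE table above

# Two-rectangle Haar-like feature: sum(white area) - sum(black area).
white = rect_sum(ii, 0, 0, 3, 1)  # left 4x2 half
black = rect_sum(ii, 0, 2, 3, 3)  # right 4x2 half
value = white - black
```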
Feature Selection
• For a 24×24 detection region, the number of possible rectangle features is ~160,000!
• One possible remedy: PCA.
Local Binary Patterns (LBP): Alternative Features
• Gray-scale-invariant texture measure
• Derived from the local neighborhood
• Powerful texture descriptor
• Computationally simple
• Robust against monotonic gray-scale changes
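A minimal sketch of the basic 3×3 LBP operator (our own implementation, not from the slides): each pixel's 8 neighbors are thresholded at the center value and the comparison bits are packed into an 8-bit code. Because only comparisons are used, the codes are unchanged by any monotonic gray-scale transformation, which is exactly the robustness claimed above.

```python
import numpy as np

def lbp_8neighbors(img):
    """Basic 3x3 LBP: threshold the 8 neighbors at the center pixel's value
    and pack the comparison bits into an 8-bit code (0..255 per pixel)."""
    h, w = img.shape
    center = img[1:-1, 1:-1]
    codes = np.zeros((h - 2, w - 2), dtype=np.uint8)
    # Fixed neighbor order (clockwise from top-left); each contributes one bit.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    for bit, (dr, dc) in enumerate(offsets):
        neighbor = img[1 + dr:h - 1 + dr, 1 + dc:w - 1 + dc]
        codes |= (neighbor >= center).astype(np.uint8) << np.uint8(bit)
    return codes

# Texture descriptor for a region: histogram of its LBP codes.
region = np.random.randint(0, 256, size=(24, 24))
hist, _ = np.histogram(lbp_8neighbors(region), bins=256, range=(0, 256))
```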
(Figure: LBP extended to dynamic/video textures.)
Simple FR for Mobile Devices
(using LBP: local binary patterns)
Principal Component Analysis (PCA)
• Mapping from the inputs in the original d-dimensional space to a new (k < d)-dimensional space, with minimum loss of information.
• PCA is an unsupervised method; it does not use output information.
• PCA centers the sample and then rotates the axes to line up with the directions of highest variance.
Principal Component Analysis (PCA)
• The projection of x on the direction of w is z = w^T x.

(Figure: scatter of data points, showing the original axes and the first and second principal components.)

• The principal component is w_1, chosen such that the sample, after projection on w_1, is most spread out, so that the differences between the sample points become most apparent.
• To have a unique solution, require ||w_1|| = 1.
• With z_1 = w_1^T x and Cov(x) = Σ, we get Var(z_1) = w_1^T Σ w_1.
• Seek w_1 such that Var(z_1) is maximized!
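The variance formula follows in one line (a standard derivation, included here for completeness; μ denotes the mean of x):

```latex
\operatorname{Var}(z_1)
  = \mathbb{E}\!\left[\left(\mathbf{w}_1^{\top}\mathbf{x}-\mathbf{w}_1^{\top}\boldsymbol{\mu}\right)^{2}\right]
  = \mathbf{w}_1^{\top}\,\mathbb{E}\!\left[(\mathbf{x}-\boldsymbol{\mu})(\mathbf{x}-\boldsymbol{\mu})^{\top}\right]\mathbf{w}_1
  = \mathbf{w}_1^{\top}\Sigma\,\mathbf{w}_1 .
```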
Solution of PCA
• Write it as a Lagrange problem and take derivatives w.r.t. w; then, with
  z = W^T (x − m), where m is the sample mean:
  Cov(z) = W^T S W (= D, diagonal)
  X^T X = W D W^T (spectral decomposition of S)
Solution of PCA
• Say we want to reduce dimensionality to k < d; we take the first k columns of W (those with the highest eigenvalues):
  z_i^t = w_i^T x^t,  i = 1, …, k,  t = 1, …, N
• The w_i are eigenvectors: (X^T X) w_i = λ_i w_i
• Equivalently, via the SVD X = U S V^T:
  U = evec(X X^T), V = evec(X^T X), S² = eval(X X^T)
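A small numerical check of these relations (a sketch with made-up data; variable names are ours): the eigendecomposition of X^T X and the SVD of X yield the same principal directions, and the squared singular values equal the eigenvalues.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))            # N=100 samples (rows), d=5 dimensions
X -= X.mean(axis=0)                      # center the data

# Eigendecomposition route: (X^T X) w_i = lambda_i w_i
evals, W = np.linalg.eigh(X.T @ X)
order = np.argsort(evals)[::-1]          # sort by decreasing eigenvalue
evals, W = evals[order], W[:, order]

# SVD route: X = U S V^T, with V = evec(X^T X) and S^2 the eigenvalues
U, S, Vt = np.linalg.svd(X, full_matrices=False)
assert np.allclose(S ** 2, evals)        # squared singular values = eigenvalues

# Project onto the first k principal directions: z_i^t = w_i^T x^t
k = 2
Z = X @ W[:, :k]                         # (N x k) matrix of projections
```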
Scree Plot: Ability of PCs to Explain Variation in Data
• Keep enough PCs (principal components) that the cumulative variance explained by the PCs is >50-70%.
• Kaiser criterion: keep PCs with eigenvalues > 1.

(Figure: scree plot of eigenvalues λ_1 … λ_N.)
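Both stopping rules are a few lines of NumPy (a sketch; the function name is ours):

```python
import numpy as np

def choose_k(eigenvalues, target=0.7):
    """Two common stopping rules for the number of PCs to keep."""
    ev = np.sort(np.asarray(eigenvalues))[::-1]
    cumulative = np.cumsum(ev) / ev.sum()          # cumulative variance explained
    k_variance = int(np.searchsorted(cumulative, target) + 1)
    # Kaiser criterion: eigenvalues > 1 (meaningful when PCA is run on the
    # correlation matrix, where each original variable has variance 1).
    k_kaiser = int((ev > 1).sum())
    return k_variance, k_kaiser
```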
Recap: PCA Calculations in Cartoon
Steps in PCA, #1: Calculate the adjusted data set
• Data set D: n dimensions × data samples; M holds the mean values.
• Adjusted data set: A = D − M, where M_i is calculated by taking the mean of the values in dimension i.
Steps in PCA, #2: Calculate the covariance matrix C from the adjusted data set A
• C is n × n, with C_ij = cov(i, j).
• Note: since the means of the dimensions in the adjusted data set A are 0, the covariance matrix can simply be written as C = AA^T / (n − 1).
Steps in PCA, #3: Calculate the eigenvectors and eigenvalues of C (matrix E holds the eigenvectors)
• If some eigenvalues are 0 or very small, we can essentially discard those eigenvalues and the corresponding eigenvectors, hence reducing the dimensionality of the new basis.
Steps in PCA, #4: Transform the data set to the new basis
  F = E^T A
where:
• F is the transformed data set,
• E^T is the transpose of the matrix E containing the eigenvectors,
• A is the adjusted data set.
Note that the dimensionality of the new data set F is less than that of A.

To recover A from F:
  (E^T)^{-1} F = (E^T)^{-1} E^T A
  (E^T)^T F = A
  E F = A
(E is orthogonal, therefore E^{-1} = E^T.)
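The four cartoon steps translate directly into NumPy (a sketch on random stand-in data, with dimensions as rows so that C = AA^T / (n − 1) as written above; all names are ours):

```python
import numpy as np

D = np.random.rand(5, 200)                 # data set: 5 dims x 200 samples
M = D.mean(axis=1, keepdims=True)          # M_i: mean of the values in dimension i
A = D - M                                  # step 1: adjusted data set
C = A @ A.T / (A.shape[1] - 1)             # step 2: covariance matrix (5 x 5)
evals, E = np.linalg.eigh(C)               # step 3: eigenvalues/eigenvectors of C
E = E[:, np.argsort(evals)[::-1]][:, :2]   # discard small-eigenvalue directions
F = E.T @ A                                # step 4: transform to the new basis
A_approx = E @ F                           # E F ~= A (exact only if no columns dropped)
```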
Holistic FR: Eigenfaces
• Eigenfaces, Fisherfaces, tensorfaces, …
Gabor Feature-Based FR
• Earlier FR methods are mostly feature-based.
• The most successful feature-based FR is the elastic bunch graph matching system, with Gabor filter coefficients as features.

Gabor Features
(Figure: Gabor filter bank with 5 scales and 8 orientations.)
PCA on Faces: "Eigenfaces"
(Figure: the average face, the first principal component, and other components; for all except the average, "gray" = 0, "white" > 0, "black" < 0.)
Eigenfaces Example
(Figures: training faces; the mean μ; and the top eigenvectors u_1, …, u_k.)
Application to Faces
• Representing faces in this basis
• Face reconstruction (see the sketch below)
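A sketch of what representation and reconstruction amount to, using stand-in data (the orthonormal basis U, mean face mu, and query face x below are placeholders, not a trained model):

```python
import numpy as np

# Stand-in eigenface basis: in practice U's columns are the top-k eigenfaces
# u_1..u_k and mu is the mean face, both learned from training images.
d, k = 64 * 64, 20
U = np.linalg.qr(np.random.randn(d, k))[0]   # placeholder orthonormal basis (d x k)
mu = np.random.rand(d)                       # placeholder mean face
x = np.random.rand(d)                        # placeholder vectorized query face

omega = U.T @ (x - mu)     # representation: k coefficients, one per eigenface
x_hat = mu + U @ omega     # face reconstruction from those coefficients
```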
Simplest Approach to FR
– The simplest approach is to treat recognition as a template-matching problem.
– Problems arise when performing recognition in a high-dimensional space.
– Significant improvements can be achieved by first mapping the data into a lower-dimensional space.
FR Using Eigenfaces
− The distance e_r is called the distance within face space (difs).
− The Euclidean distance can be used to compute e_r:
  ||Ω − Ω_k||² = ∑_{i=1}^{K} (w_i − w_i^k)²
− However, the Mahalanobis distance has been shown to work better.
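In code, the two variants of e_r differ by one weighting term (a sketch; weighting each squared difference by 1/λ_i is the usual simplification of the Mahalanobis distance for PCA coefficients, since their covariance is diagonal):

```python
import numpy as np

def difs(omega, omega_k, eigenvalues=None):
    """Squared distance within face space between coefficient vectors.
    Euclidean by default; passing the PCA eigenvalues lambda_i weights each
    squared difference by 1/lambda_i (diagonal-covariance Mahalanobis)."""
    diff = omega - omega_k
    if eigenvalues is None:
        return float(np.sum(diff ** 2))            # Euclidean
    return float(np.sum(diff ** 2 / eigenvalues))  # Mahalanobis
```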
Face Detection
• (iPhoto)
• Nikon S60 finds 12 faces…
The Viola/Jones Face Detector
• A seminal approach to real-time object detection
• Training is slow, but detection is very fast
• Key ideas:
– Integral images for fast feature evaluation
– Boosting for feature selection
– Attentional cascade for fast rejection of non-face windows

P. Viola and M. Jones. Rapid object detection using a boosted cascade of simple features. CVPR 2001.
P. Viola and M. Jones. Robust real-time face detection. IJCV 57(2), 2004.
The Viola/Jones Face Detector: Training
• Initially, weight each training example equally.
• In each boosting round:
– Find the weak learner that achieves the lowest weighted training error.
– Raise the weights of the training examples misclassified by the current weak learner.
• Compute the final classifier as a linear combination of all weak learners (the weight of each learner is directly proportional to its accuracy).
• The exact formulas for re-weighting and combining weak learners depend on the particular boosting scheme (e.g., AdaBoost).
The Viola/Jones Face Detector: Testing
• First two features selected by boosting: this feature combination can yield a 100% detection rate and a 50% false positive rate.
• A 200-feature classifier can yield a 95% detection rate and a false positive rate of 1 in 14,084.
• Not good enough!
Attentional Cascade

IMAGE SUB-WINDOW → Classifier 1 → (T) → Classifier 2 → (T) → Classifier 3 → (T) → … FACE
(an F output from any classifier → NON-FACE)

• We start with simple classifiers which reject many of the negative sub-windows while detecting almost all positive sub-windows.
• A positive response from the first classifier triggers the evaluation of a second (more complex) classifier, and so on.
• A negative outcome at any point leads to the immediate rejection of the sub-window.
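The control flow of the cascade fits in a few lines (a sketch; `stages` and the classifier interface are our own abstraction, not the Viola/Jones code):

```python
def cascade_predict(window, stages):
    """Attentional-cascade control flow: `stages` holds (classifier, threshold)
    pairs ordered from cheapest/simplest to most complex."""
    for classifier, threshold in stages:
        if classifier(window) < threshold:
            return False      # negative outcome: reject the sub-window immediately
    return True               # positive response from every stage: report a face
```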
(Figure: receiver operating characteristic, % detection (0-100) vs. % false positives (0-50); the trade-off between false positives and false negatives is determined by the classifier thresholds.)
Cascaded Classifiers (Boosting)
(Figure: the input feeds a set of base learners whose combined responses form the output.)
Boosting for FR (illustrated round by round)
• Weak classifier 1 is fit to the equally weighted data.
• Weights are increased for the examples it misclassifies.
• Weak classifier 2 is fit to the re-weighted data.
• Weights are increased again for the remaining mistakes.
• Weak classifier 3 is fit.
• The final classifier is a combination of the weak classifiers.
AdaBoost Algorithm
• Given (x_1, y_1), …, (x_m, y_m), where x_i ∈ X and y_i ∈ Y = {−1, +1}
• Initialize D_1(i) = 1/m, i = 1, …, m
• For t = 1, …, T:
– Find the classifier h_t : X → {−1, +1} that minimizes the error with respect to the distribution D_t:
    h_t = argmin_{h ∈ H} ε_t,  where ε_t = ∑_i D_t(i) [y_i ≠ h_t(x_i)]
  (ε_t is the weighted error rate of classifier h_t)
– If ε_t ≥ 0.5, then stop
– Choose α_t ∈ R, typically α_t = ½ ln((1 − ε_t) / ε_t)
– Update
    D_{t+1}(i) = D_t(i) exp(−α_t y_i h_t(x_i)) / Z_t
  where Z_t is a normalization factor (chosen so that D_{t+1} sums to 1)
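The algorithm above, transcribed into NumPy (a sketch; the weak-learner pool is abstracted as a list of callables, and the pool is searched exhaustively each round, as Viola/Jones do over rectangle features):

```python
import numpy as np

def adaboost(X, y, weak_learners, T):
    """AdaBoost following the slide: X is (m, d) data, y an array in {-1, +1},
    weak_learners a pool of callables h(X) -> array of {-1, +1} predictions.
    Returns (alpha_t, h_t) pairs defining H(x) = sign(sum_t alpha_t h_t(x))."""
    m = len(y)
    D = np.full(m, 1.0 / m)                          # D_1(i) = 1/m
    ensemble = []
    for _ in range(T):
        # h_t = argmin_h eps, with eps = sum_i D(i) [y_i != h(x_i)]
        errors = [np.sum(D * (h(X) != y)) for h in weak_learners]
        best = int(np.argmin(errors))
        h_t, eps_t = weak_learners[best], errors[best]
        if eps_t >= 0.5:                             # no better than chance: stop
            break
        alpha_t = 0.5 * np.log((1 - eps_t) / eps_t)  # alpha_t = 1/2 ln((1-eps)/eps)
        D *= np.exp(-alpha_t * y * h_t(X))           # up-weight misclassified examples
        D /= D.sum()                                 # Z_t: renormalize so D sums to 1
        ensemble.append((alpha_t, h_t))
    return ensemble

def predict(ensemble, X):
    return np.sign(sum(a * h(X) for a, h in ensemble))
```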
Boosting for FR
• Define weak learners based on rectangle features.
• The final strong classifier is
    H(x) = sign( ∑_{t=1}^{T} α_t h_t(x) )
Boosting & SVM
• Advantages of boosting:
– Integrates classification with feature selection
– Complexity of training is linear, instead of quadratic, in the number of training examples
– Flexibility in the choice of weak learners and boosting scheme
– Testing is fast
– Easy to implement
• Disadvantages:
– Needs many training examples
– Often does not work as well as SVMs
References & Slide Credits
• Animetrics, Dr. Marc Valliant, VP & CTO.
• M. Turk and A. Pentland, "Eigenfaces for Recognition," Journal of Cognitive Neuroscience, vol. 3, no. 1, 1991.
• Y. Freund and R. Schapire, "A Short Introduction to Boosting," Journal of Japanese Society for Artificial Intelligence, 14(5):771-780, September 1999.
• S. Li et al., Handbook of Face Recognition, Springer.
• P. A. Viola and M. J. Jones, "Robust Real-Time Face Detection," Intl. J. Computer Vision, 57(2):137-154, 2004 (originally in CVPR 2001).
• Some slides adapted from Bill Freeman, MIT 6.869, April 2005.
• J. Friedman, T. Hastie, and R. Tibshirani, "Additive Logistic Regression: A Statistical View of Boosting," http://www-stat.stanford.edu/~hastie/Papers/boost.ps