ECE 484 Digital Image Processing Lec 18 - Transform Domain Image Processing III
Supervised Subspace Learning
Zhu Li, Dept. of CSEE, UMKC
Office: FH560E, Email: [email protected], Ph: x2346, http://l.web.umkc.edu/lizhu
Z. Li, ECE 484 Digital Image Processing, 2019 p.1
slides created with WPS Office Linux and EqualX equation editor
Outline
Recap: Eigenface, NMF, LEM
Fisherface - Linear Discriminant Analysis
Graph Embedding - Laplacian Embedding
PCA Algorithm
Center the data: X = X - repmat(mean(X), [n, 1]);
Principal component #1 points in the direction of the largest variance.
Each subsequent principal component is orthogonal to the previous ones and points in the direction of the largest variance of the residual subspace.
Solved by finding the eigenvectors of the scatter/covariance matrix of the data: S = cov(X); [A, eigv] = eig(S);
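The recipe above can be sketched in Python/NumPy (a minimal illustration on toy data; variable names mirror the Matlab snippets but are otherwise illustrative):

```python
import numpy as np

def pca(X, kd):
    """PCA via eigendecomposition of the covariance (scatter) matrix.
    X: n x d data, one sample per row; kd: number of components kept."""
    Xc = X - X.mean(axis=0)           # center the data
    S = np.cov(Xc, rowvar=False)      # d x d covariance matrix
    eigval, A = np.linalg.eigh(S)     # eigh returns ascending eigenvalues
    order = np.argsort(eigval)[::-1]  # largest variance first
    return A[:, order[:kd]], eigval[order[:kd]]

# toy data: 100 samples in R^5
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
A, lat = pca(X, kd=2)
Y = (X - X.mean(axis=0)) @ A          # project onto the top-2 subspace
```

Each column of A is a principal direction; projecting is a plain matrix multiply, just like faces*A(:,1:kd) in the Matlab snippets.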
Eigenfaces Implementation
Training images x1, …, xN
load faces-ids-n6680-m417-20x20.mat;
[A, s, lat] = princomp(faces);
h = 20; w = 20;
figure(30);
subplot(1,2,1); grid on; hold on; stem(lat, '.');
f_eng = lat.*lat;
subplot(1,2,2); grid on; hold on; plot(cumsum(f_eng)/sum(f_eng), '.-');
Eigenface
Holistic approach: treat an h x w image as a point in R^(h x w).
Face data set: 20x20 face icon images, 417 subjects, 6680 face images.
Notice that the face images do not fill up the whole R^(20x20) space.
Eigenface basis: imagesc(reshape(A1(:,1), [h, w]));
[Figure: eigenvalue spectrum and info-preserving ratio vs. kd]
Eigenface Projections
Project face images to the Eigenface space. Matlab: x = faces*A(:,1:kd);
[Figure: a face image decomposed over eigenfaces, e.g. face ≈ 10.9*eigf1 + 0.4*eigf2 + 4.7*eigf3]
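The projection and reconstruction steps can be sketched in Python (a toy stand-in: random data replaces the face matrix, and the eigenface basis is obtained via SVD of the centered data):

```python
import numpy as np

# toy stand-in for the face data: 10 "images" of 20x20 = 400 pixels
rng = np.random.default_rng(1)
faces = rng.random((10, 400))
mean_face = faces.mean(axis=0)

# eigenface basis from SVD of the centered data (columns = eigenfaces)
U, s, Vt = np.linalg.svd(faces - mean_face, full_matrices=False)
A = Vt.T                                   # 400 x 10 basis

kd = 3
y = (faces[0] - mean_face) @ A[:, :kd]     # projection coefficients
recon = mean_face + A[:, :kd] @ y          # face ~ mean + sum_k y_k * eigf_k
```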
Nonnegative Matrix Factorization (NMF)
Most images we observe are non-negative; can we factor the data into non-negative components?
Matlab: [W, H] = nnmf(V, k)
Geometry of NMF
NMF has a very nice geometric interpretation. For N observed samples in an F-dimensional feature space, NMF says that all observed samples actually lie in a cone spanned by the columns of W, as a cone is by definition the set of non-negative linear combinations of its K basis vectors.
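As a sketch of V ≈ WH in Python, scikit-learn's NMF stands in for Matlab's nnmf (the data here is random and purely illustrative):

```python
import numpy as np
from sklearn.decomposition import NMF

# nonnegative data matrix V (F x N): columns are samples
rng = np.random.default_rng(0)
V = rng.random((20, 40))

# V ~ W H with W >= 0 (F x K) and H >= 0 (K x N): every sample is a
# non-negative combination of the K columns of W, i.e. it lies in the
# cone spanned by W
model = NMF(n_components=2, init='random', random_state=0, max_iter=500)
W = model.fit_transform(V)   # F x K basis
H = model.components_        # K x N coefficients
```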
NMF Applications
Face recognition: K=2 NMF decomposition
[Figure: 560x400 data matrix factored as (560x2) x (2x400); 20x28 basis images; example 2x1 coefficient vectors (-2.19, -0.02) and (-3.19, 1.02)]
Laplacian Eigenmap
Minimizing (1/2) Σ_ij S_ij ||y_i − y_j||² is equivalent to minimizing yᵀ L y, where L = D − S and D is the degree matrix (diagonal) with D_ii = Σ_j S_ij.
Laplacian Eigen Map Solution
Numerically, solve the generalized eigen problem L y = λ D y; the eigenvectors corresponding to the first d smallest (non-trivial) eigenvalues form a d-dimensional feature for {y_k}.
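A minimal NumPy/SciPy sketch of this solution on toy data (the heat-kernel bandwidth and the data are illustrative choices, not from the slides):

```python
import numpy as np
from scipy.linalg import eigh
from scipy.spatial.distance import cdist

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))              # 50 points in R^3

# heat-kernel affinity S_ij = exp(-||x_i - x_j||^2 / t)
d2 = cdist(X, X, 'sqeuclidean')
S = np.exp(-d2 / d2.mean())

D = np.diag(S.sum(axis=1))                # degree matrix
L = D - S                                 # graph Laplacian

# generalized eigen problem L y = lambda D y (ascending eigenvalues)
eigval, Y = eigh(L, D)
embedding = Y[:, 1:3]                     # skip the trivial constant vector
```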
Outline
Recap: Eigenface, NMF, LEM
Fisherface - Linear Discriminant Analysis
Graph Embedding - Laplacian Embedding
Limitation of PCA
What does PCA do best?
 Label-agnostic data compression / dimension reduction
 Suppressing noise
 It is a linear rotation (note that A*A' = I), introducing minimum distortion to the data
 Helping classification (sort of)
What PCA cannot do:
 It cannot fit data that are not linear (can try kernelized PCA)
 It does not take labeling into consideration (Fisherface!)
Why is Face Recognition Hard?
Many faces of Madonna the bad girl…
Face Recognition Challenges
• Identify similar faces (inter-class similarity)
• Accommodate intra-class variability due to:
 • head pose
 • illumination conditions
 • expressions
 • facial accessories
 • aging effects
 • cartoon/sketch
Inter-class Similarities
• Different persons may have very similar appearance
Twins (www.marykateandashley.com); father and son (news.bbc.co.uk/hi/english/in_depth/americas/2000/us_elections)
Intra-Class Variations
• Faces with intra-subject variations in pose, illumination, expression, accessories, color, occlusions, and brightness
Fisherface Solution
• An n-pixel image x ∈ Rⁿ can be projected to a low-dimensional feature space y ∈ Rᵐ by
  y = Wᵀx
  where W is an n by m matrix.
• Recognition is performed using nearest neighbor in Rᵐ.
• How do we choose a good W?
Ref: P. Belhumeur, J. Hespanha, D. Kriegman, "Eigenfaces vs. Fisherfaces: Recognition Using Class Specific Linear Projection," IEEE Trans. PAMI, July 1997, pp. 711-720.
PCA & Fisher’s Linear Discriminant
• Between-class scatter: S_B = Σ_{i=1..c} |X_i| (μ_i − μ)(μ_i − μ)ᵀ
• Within-class scatter: S_W = Σ_{i=1..c} Σ_{x_k ∈ X_i} (x_k − μ_i)(x_k − μ_i)ᵀ
• Total scatter: S_T = S_B + S_W
• Where
 – c is the number of classes
 – μ_i is the mean of class X_i
 – |X_i| is the number of samples of class X_i
Eigen vs Fisher Projection
• PCA (Eigenfaces) maximizes the projected total scatter: W_pca = argmax_W |Wᵀ S_T W|
• Fisher's Linear Discriminant maximizes the ratio of projected between-class to projected within-class scatter: W_fld = argmax_W |Wᵀ S_B W| / |Wᵀ S_W W|, solved by the generalized eigen problem S_B w = λ S_W w
[Figure: PCA vs. Fisher projection directions on two-class 2-d data]
A problem…
Computing the scatter matrix in d-dimensional space requires at least d data points for it to be non-singular.
E.g., for d = 3 we need at least 3 points, and these points should not be co-planar (co-planar points give rank 2 instead of 3).
In face recognition, d is the number of pixels, say 20x20 = 400, while the number of training samples per class is small, say n_k = 10.
What shall we do?
Computing the Fisher Projection Matrix
• The w_i are orthonormal
• There are at most c−1 non-zero generalized eigenvalues, so m ≤ c−1
• The rank of S_W is at most N−c, where N is the number of training samples (10 per subject in the AT&T data set) and c = 40, so S_W can be singular and present numerical difficulty.
Dealing with Singularity of Sw
• Since S_W has rank at most N−c, first project the training set via PCA to the subspace spanned by its first N−c principal components.
• Then apply FLD in the N−c dimensional subspace, yielding a c−1 dimensional feature space.
• Fisher's Linear Discriminant projects away the within-class variation (lighting, expressions) found in the training set, while preserving the separability of the classes.
Experiment results on 417-6680 data set
Compute Eigenface model and Fisherface model:
% Eigenface: A1
load faces-ids-n6680-m417-20x20.mat;
[A1, s, lat] = princomp(faces);
% Fisherface: A2
n_face = 600; n_subj = length(unique(ids(1:n_face)));
% eigenface kd
kd = 32; opt.Fisherface = 1;
[A2, lat] = LDA(ids(1:n_face), opt, faces(1:n_face,:)*A1(:,1:kd));
% eigenface
x1 = faces*A1(:,1:kd); f_dist1 = pdist2(x1, x1);
% fisherface
x2 = faces*A1(:,1:kd)*A2; f_dist2 = pdist2(x2, x2);
Fisher vs Eigenface performance
Eigenface model: kd=32 Data set: 400 faces/58 subjects, 600 faces/86 subjects
Outline
Recap: Eigenface, NMF, LEM
Fisherface - Linear Discriminant Analysis
Graph Embedding - Laplacian Embedding
Locality Preserving Projection
Recall the dimension reduction formulations: find w s.t. y = wᵀx
PCA: w = argmax_w wᵀ S w, maximizing the total scatter S = S_B + S_W
LDA: w = argmax_w (wᵀ S_B w) / (wᵀ S_W w)
LPP Formulation – Affinity
To preserve the local affinity relationship, define the affinity map via the heat kernel: S_ij = exp(−||x_i − x_j||² / t)
The selection of the heat kernel size t and the threshold is important. Hint: the affinity matrix should be sparse.
[Figure: affinity histogram]
LPP Formulation – Affinity Supervised
How do we utilize the label info? The heat kernel map is good for intra-class affinity modelling, but how about inter-class affinity? One direct solution is to set the affinity to zero for inter-class pairs.
% LPP - compute affinity
f_dist1 = pdist2(x1, x1);
% heat kernel size
mdist = mean(f_dist1(:)); h = -log(0.15)/mdist;
S1 = exp(-h*f_dist1);
id_dist = pdist2(ids, ids);
subplot(2,2,3); imagesc(id_dist); title('label distance');
S2 = S1; S2(find(id_dist~=0)) = 0;
subplot(2,2,4); imagesc(S2); colormap('gray'); title('affinity-supervised');
LPP- Affinity Preserving Projection
Find a projection w that best preserves the affinity matrix: min_w Σ_ij S_ij (wᵀx_i − wᵀx_j)²
LPP Formulation
L = D − S: n×n matrix, called the graph Laplacian
Normalizing factor: the n×n diagonal degree matrix D, with entries D_ii = Σ_j S_ij (sum of a column/row of the affinity); the larger the value, the more important the data point is
Constraint involving D: yᵀ D y = 1, i.e., wᵀ X D Xᵀ w = 1
Generalized Eigen Problem
Now the formulation is: min_w wᵀ X L Xᵀ w, s.t. wᵀ X D Xᵀ w = 1
Lagrangian: J(w, λ) = wᵀ X L Xᵀ w − λ (wᵀ X D Xᵀ w − 1)
By the KKT (Karush-Kuhn-Tucker) conditions, it is solved by the generalized eigen problem: X L Xᵀ w = λ X D Xᵀ w
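The whole supervised-LPP recipe (affinity, Laplacian, generalized eigen problem) can be sketched in NumPy on toy data. Rows are samples here, so the eigen problem reads XᵀLX w = λ XᵀDX w; the small ridge term is an implementation convenience, not part of the formulation:

```python
import numpy as np
from scipy.linalg import eigh
from scipy.spatial.distance import cdist

rng = np.random.default_rng(0)
X = rng.normal(size=(60, 8))                 # n x d, rows are samples
ids = np.repeat(np.arange(6), 10)            # 6 classes, 10 samples each

# heat-kernel affinity, zeroed for pairs with different labels
d2 = cdist(X, X, 'sqeuclidean')
S = np.exp(-d2 / d2.mean())
S[ids[:, None] != ids[None, :]] = 0.0

D = np.diag(S.sum(axis=1))                   # degree matrix
L = D - S                                    # graph Laplacian

# generalized eigen problem; smallest eigenvalues give the projection
A = X.T @ L @ X
B = X.T @ D @ X + 1e-8 * np.eye(X.shape[1])  # tiny ridge for stability
eigval, W = eigh(A, B)
kd = 3
Y = X @ W[:, :kd]                            # supervised LPP embedding
```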
X He, S Yan, Y Hu, P Niyogi, HJ Zhang, “Face Recognition Using Laplacianface”, IEEE Trans PAMI, vol. 27 (3), 328-340, 2005.
Matlab Implementation
laplacianface.m
%LPP
n_face = 1200; n_subj = length(unique(ids(1:n_face)));
% eigenface
kd = 32; x1 = faces(1:n_face,:)*A1(:,1:kd); ids = ids(1:n_face);
% LPP - compute affinity
f_dist1 = pdist2(x1, x1);
% heat kernel size
mdist = mean(f_dist1(:)); h = -log(0.15)/mdist;
S1 = exp(-h*f_dist1);
figure(32);
subplot(2,2,1); imagesc(f_dist1); colormap('gray'); title('d(x_i, x_j)');
subplot(2,2,2); imagesc(S1); colormap('gray'); title('affinity');
% utilize supervised info
id_dist = pdist2(ids, ids);
subplot(2,2,3); imagesc(id_dist); title('label distance');
S2 = S1; S2(find(id_dist~=0)) = 0;
subplot(2,2,4); imagesc(S2); colormap('gray'); title('affinity-supervised');
% laplacian face
lpp_opt.PCARatio = 1;
[A2, eigv2] = LPP(S2, lpp_opt, x1);
Laplacian Face
Now, we can model a face as an LPP projection.
[Figure: Eigenface vs. Laplacian face basis images]
Laplacian vs Eigenface
1200 faces, 144 subjects
LPP and PCA
Graph embedding is a unifying theory of dimension reduction.
PCA becomes a special case of LPP if we do not enforce local affinity.
LPP and LDA
How about LDA? Recall the within-class scatter:
[Figure annotations: the inner sum is the i-th class data covariance; L_i has diagonal entries 1/n_i, corresponding to equal affinity among the data points of class i]
LPP and LDA
Now consider the between-class scatter: C is the data covariance regardless of label, and L is the graph Laplacian computed from the affinity rule that same-class pairs in class c get affinity 1/n_c, and cross-class pairs get 0.
LDA as a special case of LPP
The same generalized Eigen problem
Graph Embedding Interpretation of PCA/LDA/LPP
Affinity graph S determines the embedding subspace W, via the generalized eigen problem X L Xᵀ w = λ X D Xᵀ w
PCA and LDA are special cases of graph embedding:
PCA: equal affinity among all data points
LDA: equal affinity 1/n_c within each class, zero across classes
LPP: heat-kernel affinity among neighboring points
Applications: facial expression embedding
Facial expressions embedded in a 2-d space via LPP
[Figure: facial expressions (frown, sad, happy, neutral) embedded in a 2-d space via LPP]
Application: Compression of SIFT
Compression of SIFT descriptors: preserve the matching relationship, rather than reconstruction.
Summary
Supervised Subspace Learning
 Utilizes label info in learning
 Unified under the Graph Embedding scheme
 Image pixels provide an initial affinity map, refined by the label info
 Graph embedding leads to an eigen problem that solves for the projection matrix
 PCA, LDA and LPP are all unified under the graph embedding scheme
 LPP differs from LEM by having an explicit projection