Page 1

Unsupervised Learning
Dimensionality Reduction

Principal Component Analysis (PCA)
Applications

Dimensionality Reduction

Page 2

Machine Learning

Supervised learning: learn a function from training data given as (input, output) pairs.

regression: predict a continuous value as a function of the input
classification: predict the class label of the input object

Unsupervised learning: fit a model to observations without a priori output.

clustering: identify natural groupings (patterns) in the data

Page 3

Unsupervised Learning

Input: unlabeled data samples {x^(t)}, t = 1..m

Why study unlabeled data?
Collecting labeled data can be costly
Cluster first, label later
Changing pattern characteristics
Identify features that will be useful for categorization
Exploratory data analysis

Page 4

Dimensionality Reduction

Reducing the number of random variables under consideration.
A technique for simplifying a high-dimensional data set by reducing its dimension for analysis.
Projection of high-dimensional data to a low-dimensional space that preserves the “important” characteristics of the data.

Page 5

Principal Component Analysis (PCA)

An orthogonal linear transformation that maps the data into a new coordinate system:

Greatest variance along the first axis
Second greatest variance along the second axis
etc.

Also known as:
Karhunen-Loève Transform (KLT)
Hotelling Transform

Page 6

PCA (1)

Look for the unit vector w that maximizes the variance of the projected data items wᵀx. Maximize: Var(wᵀx) = wᵀ Σx w, subject to ‖w‖ = 1.

Solution: w is the dominant eigenvector of the covariance matrix Σx.
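To make this concrete, here is a minimal NumPy sketch (not from the slides; the synthetic data and variable names are illustrative): estimate Σx from centered samples and take its dominant eigenvector as w.

```python
import numpy as np

rng = np.random.default_rng(0)
# illustrative 2-D data with correlated features
X = rng.multivariate_normal(mean=[0.0, 0.0],
                            cov=[[3.0, 1.0], [1.0, 1.0]], size=500)

Xc = X - X.mean(axis=0)                    # center the data
Sigma = Xc.T @ Xc / len(Xc)                # covariance matrix Σx (2 × 2)

eigvals, eigvecs = np.linalg.eigh(Sigma)   # eigh: Σx is symmetric
w = eigvecs[:, -1]                         # dominant eigenvector (eigh sorts eigenvalues ascending)

# the variance of the projected data wᵀx equals the largest eigenvalue
print(np.var(Xc @ w), w @ Sigma @ w, eigvals[-1])
```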

Page 7

PCA vs. LDA

Page 8

PCA (2)

What about projecting onto two vectors? Maximize: Var(w1ᵀx) + Var(w2ᵀx) = w1ᵀ Σx w1 + w2ᵀ Σx w2, subject to w1 and w2 being orthonormal.

w1 and w2 are the two dominant eigenvectors of Σx. This idea generalizes to any dimension k.

Page 9

PCA (3)
Let X denote the n × m data matrix (each column is a centered vector x^(t) − μx ∈ Rⁿ)

Definition: the principal directions (axes) of {x^(t)}, t = 1..m, are the eigenvectors of the covariance matrix XXᵀ.

Let W = [w1, …, wk] be the matrix whose columns are the k leading principal axes; then the projection WᵀX maximizes the total projected variance trace(Wᵀ XXᵀ W) = ‖WᵀX‖²_F over all orthonormal W.
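A hedged sketch of this projection (the names are illustrative; it assumes the columns of X are already centered, as defined above):

```python
import numpy as np

def pca_project(X, k):
    """X: n × m centered data matrix (columns are samples). Returns the k × m projection WᵀX."""
    Sigma = X @ X.T                        # n × n scatter matrix; same eigenvectors as the covariance
    eigvals, eigvecs = np.linalg.eigh(Sigma)
    W = eigvecs[:, ::-1][:, :k]            # the k leading principal axes, as columns of W
    return W.T @ X                         # k × m projected data

# usage on illustrative data
rng = np.random.default_rng(1)
X = rng.standard_normal((5, 200))
X = X - X.mean(axis=1, keepdims=True)      # center each feature (row)
print(pca_project(X, k=2).shape)           # (2, 200)
```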

Page 10

PCA (4)
Let X denote the centered n × m data matrix as before.

Consider the Singular Value Decomposition of X: X = W Σ Vᵀ, where the columns of W are the left singular vectors, Σ is the diagonal matrix of singular values, and the columns of V are the right singular vectors.

The PCA transform projects X down into the reduced subspace spanned only by the first L left singular vectors, W_L: Y = W_Lᵀ X.
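Below is a small sketch of the SVD route (the data and names are illustrative, not from the slides): the leading left singular vectors of the centered X are the principal axes, so the reduced representation is W_Lᵀ X.

```python
import numpy as np

def pca_svd(X, L):
    """X: centered n × m data matrix (columns are samples)."""
    W, s, Vt = np.linalg.svd(X, full_matrices=False)   # X = W diag(s) Vᵀ
    WL = W[:, :L]                                       # first L left singular vectors
    return WL.T @ X                                     # L × m reduced data

rng = np.random.default_rng(2)
X = rng.standard_normal((10, 300))
X = X - X.mean(axis=1, keepdims=True)
Y = pca_svd(X, L=3)
print(Y.shape)                                          # (3, 300)
```

Since WᵀX = diag(s) Vᵀ, the same reduced representation can also be read directly off the SVD factors as s[:L, None] * Vt[:L].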

Page 11

Computing the Principal Axes

Let A = XXᵀ

Bad idea: compute all eigenvectors of A and keep the leading k.
Power method:
Compute (v, λ), the leading eigenpair of A (iterate v ← Av / ‖Av‖ until convergence; λ = vᵀAv).
Update (deflate): B = A − λ v vᵀ.
Repeat the above with A = B, until we have k vectors.
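A minimal sketch of this procedure (illustrative names; it uses a small symmetric test matrix rather than an actual XXᵀ):

```python
import numpy as np

def top_k_eigenpairs(A, k, iters=1000):
    """Power iteration with deflation on a symmetric matrix A (e.g. A = XXᵀ)."""
    rng = np.random.default_rng(0)
    vals, vecs = [], []
    B = A.astype(float).copy()
    for _ in range(k):
        v = rng.standard_normal(A.shape[0])
        for _ in range(iters):            # power iteration: v ← Bv / ‖Bv‖
            v = B @ v
            v /= np.linalg.norm(v)
        lam = v @ B @ v                   # Rayleigh quotient gives the eigenvalue
        vals.append(lam)
        vecs.append(v)
        B = B - lam * np.outer(v, v)      # deflate: remove the component just found
    return np.array(vals), np.column_stack(vecs)

vals, V = top_k_eigenpairs(np.diag([5.0, 3.0, 1.0]), k=2)
print(vals)                               # approximately [5., 3.]
```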

Page 12

Power method (continued)

If v1 is the leading eigenvector of A (with eigenvalue λ1), then it is an eigenvector of B with eigenvalue 0: Bv1 = Av1 − λ1 v1 (v1ᵀv1) = λ1 v1 − λ1 v1 = 0.

Any other eigenvector vk of A is also an eigenvector of B, with the same eigenvalue λk: Bvk = Avk − λ1 v1 (v1ᵀvk) = λk vk − 0 = λk vk, because the eigenvectors of the symmetric matrix A are orthogonal (v1ᵀvk = 0).

Page 13

Alternative method
Let X be n × m, where m << n. The matrix XXᵀ (n × n) is very large, but XᵀX (m × m) is much smaller!

Idea: compute eigenvectors of XTX instead!

Let (v, λ) be an eigenpair of XᵀX; then (Xv, λ) is an eigenpair of XXᵀ.

Proof: if XᵀXv = λv, then XXᵀ(Xv) = X(XᵀXv) = X(λv) = λ(Xv), so Xv is an eigenvector of XXᵀ with the same eigenvalue λ (normalize Xv to obtain a unit vector).
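A hedged sketch of this trick (illustrative names; assumes X is already centered and k does not exceed the rank of X):

```python
import numpy as np

def principal_axes_via_gram(X, k):
    """X: centered n × m data matrix with m << n.
    Finds the k leading eigenvectors of XXᵀ using only the m × m matrix XᵀX."""
    G = X.T @ X                                 # m × m Gram matrix
    eigvals, V = np.linalg.eigh(G)
    V = V[:, ::-1][:, :k]                       # k leading eigenvectors v of XᵀX
    W = X @ V                                   # each Xv is an eigenvector of XXᵀ (same eigenvalue)
    W /= np.linalg.norm(W, axis=0)              # normalize the axes to unit length
    return W                                    # n × k matrix of principal axes

rng = np.random.default_rng(3)
X = rng.standard_normal((1000, 20))             # n = 1000 features, only m = 20 samples
X = X - X.mean(axis=1, keepdims=True)
print(principal_axes_via_gram(X, k=5).shape)    # (1000, 5)
```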

Page 14

Applications in Comp. Graphics

Bounding box computation

[Figure: a point set with its axis-aligned bounding box in the x–y frame, with extents minX, maxX, minY, maxY]

Page 15

Applications in Comp. Graphics

Bounding box computation

[Figure: the same point set with a tighter bounding box aligned to the principal axes x′, y′]
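A small sketch of how such a PCA-aligned (oriented) bounding box can be computed in 2-D (the function name and the synthetic points are illustrative, not from the slides): express the points in the frame of the principal axes x′, y′, take the min/max there, and map the corners back.

```python
import numpy as np

def pca_bounding_box_2d(points):
    """points: m × 2 array. Returns the 4 corners of a bounding box aligned with the principal axes."""
    mu = points.mean(axis=0)
    C = np.cov((points - mu).T)                  # 2 × 2 covariance of the point set
    _, axes = np.linalg.eigh(C)                  # columns of `axes` are the principal axes x', y'
    local = (points - mu) @ axes                 # coordinates of the points in the PCA frame
    lo, hi = local.min(axis=0), local.max(axis=0)
    corners_local = np.array([[lo[0], lo[1]], [hi[0], lo[1]],
                              [hi[0], hi[1]], [lo[0], hi[1]]])
    return corners_local @ axes.T + mu           # corners back in world coordinates

pts = np.random.default_rng(4).standard_normal((200, 2)) @ np.array([[2.0, 1.0], [0.0, 0.5]])
print(pca_bounding_box_2d(pts))
```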

Page 16

Applications in Comp. Graphics

Normal Estimation in point clouds

[Figure: a point cloud surface patch with the estimated normal at a point]

Page 17

Applications in Comp. Graphics

Normal Estimation in point clouds
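A minimal sketch of a PCA-based normal estimator (this code is not from the slides; it assumes the standard approach of taking, at each point, the direction of least variance among its k nearest neighbours):

```python
import numpy as np

def estimate_normals(points, k=10):
    """points: m × 3 point cloud. Returns one unit normal per point."""
    normals = np.zeros_like(points)
    for i, p in enumerate(points):
        d = np.linalg.norm(points - p, axis=1)
        nbrs = points[np.argsort(d)[:k]]             # k nearest neighbours (brute force)
        C = np.cov((nbrs - nbrs.mean(axis=0)).T)     # 3 × 3 local covariance
        eigvals, eigvecs = np.linalg.eigh(C)
        normals[i] = eigvecs[:, 0]                   # eigenvector with the smallest eigenvalue
    return normals

# points sampled near the z = 0 plane → normals close to ±(0, 0, 1)
pts = np.random.default_rng(5).standard_normal((100, 3)) * np.array([1.0, 1.0, 0.01])
print(estimate_normals(pts)[:3])
```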

Page 18

Applications in Comp. Graphics

Morphable face models: http://gravis.cs.unibas.ch/Sigg99.html

