
Post on 01-Jan-2021


Spatial Data: Dimensionality Reduction

CS444 Techniques, Lecture 3

In this subfield, we think of a data point as a vector in R^n

(what could possibly go wrong?)

“Linear” dimensionality reduction:

Reduction is achieved by applying a single matrix to every point.

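Regular Scatterplots

• Every data point is a vector:

\[ v = \begin{bmatrix} v_0 \\ v_1 \\ v_2 \\ v_3 \end{bmatrix} \]

• Every scatterplot is produced by a very simple matrix:

\[ \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix} \]

\[ \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \end{bmatrix} \]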
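As a minimal sketch of the idea, the two matrices above can be applied to a small data set with NumPy. The data matrix `V` and the names `P_02` and `P_01` are made up for illustration; the only assumption is that each data point is stored as one column.

```python
import numpy as np

# Hypothetical data: 5 points with 4 attributes each, one point per column.
V = np.arange(20, dtype=float).reshape(4, 5)

# Each 2 x 4 matrix selects the two attributes that become the plot axes.
P_02 = np.array([[1, 0, 0, 0],
                 [0, 0, 1, 0]], dtype=float)  # plots v0 against v2
P_01 = np.array([[1, 0, 0, 0],
                 [0, 1, 0, 0]], dtype=float)  # plots v0 against v1

# Multiplying by the matrix gives the 2D plot coordinates of every point:
# row 0 holds the x values, row 1 holds the y values.
xy_02 = P_02 @ V
xy_01 = P_01 @ V
```

Both products simply pick out two rows of `V`, which is all a regular scatterplot does.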

What about other matrices?

Grand Tour (Asimov, 1985)

http://cscheid.github.io/lux/demos/tour/tour.html
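One way to try "other matrices" is to draw a random 2 x n matrix with orthonormal rows, so that the plot is still a rigid view of the data from some direction. The sketch below produces a single such projection; it is not the Grand Tour itself, which moves smoothly through a sequence of projections. The function name `random_projection` and the data are assumptions made for illustration.

```python
import numpy as np

def random_projection(n, rng):
    """Return a random 2 x n matrix whose rows are orthonormal."""
    A = rng.standard_normal((n, 2))
    Q, _ = np.linalg.qr(A)  # Q is n x 2 with orthonormal columns
    return Q.T

rng = np.random.default_rng(0)
P = random_projection(4, rng)

# Hypothetical data: 100 points in R^4, one point per column.
V = rng.standard_normal((4, 100))
xy = P @ V  # 2 x 100 plot coordinates
```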

Is there a best matrix?

How do we think about that?

Linear Algebra review

• Vectors

• Inner Products

• Lengths

• Angles

• Bases

• Linear Transformations and Eigenvectors
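The review items above map directly onto a few NumPy calls. This is a small sketch with made-up vectors and a made-up symmetric matrix, not part of the original slides.

```python
import numpy as np

# Vectors
u = np.array([3.0, 4.0])
v = np.array([4.0, 3.0])

# Inner product: 3*4 + 4*3 = 24
inner = u @ v

# Lengths: the square root of a vector's inner product with itself
length_u = np.sqrt(u @ u)      # 5.0
length_v = np.linalg.norm(v)   # same computation, built in

# Angle: cos(theta) = <u, v> / (|u| |v|)
cos_angle = inner / (length_u * length_v)
angle = np.arccos(cos_angle)

# Linear transformation and its eigenvectors.
# For a symmetric matrix, eigh returns real eigenvalues in ascending order
# and orthonormal eigenvectors (as columns), which form a basis.
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])
eigenvalues, eigenvectors = np.linalg.eigh(A)  # eigenvalues 1 and 3
```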

Principal Component Analysis

[Figure: PCA of the iris data, plotted as PC2 against PC1. Variables: Sepal.Length, Sepal.Width, Petal.Length, Petal.Width. Points colored by Species: setosa, versicolor, virginica.]

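Principal Component Analysis

• Algorithm:

• Given the data set as a matrix X in R^(d x n) (one data point per column),

• Center matrix: $\tilde{X} = X\left(I - \frac{1}{n}\vec{1}\vec{1}^T\right) = XH$

• Compute eigendecomposition of $\tilde{X}^T \tilde{X}$: $\tilde{X}^T \tilde{X} = U \Sigma U^T$

• The principal components are the first few columns of $U \Sigma^{1/2}$ (row i holds the coordinates of data point i)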
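The steps above can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not a production implementation: the function name `pca_gram` and the random data are invented here, eigenvalues are sorted in decreasing order, and the result has one row per data point.

```python
import numpy as np

def pca_gram(X, k):
    """PCA coordinates of the columns of X (d x n), via the n x n Gram matrix."""
    d, n = X.shape
    H = np.eye(n) - np.ones((n, n)) / n    # centering matrix H = I - (1/n) 1 1^T
    X_tilde = X @ H                        # subtracts the mean point from every column
    G = X_tilde.T @ X_tilde                # n x n matrix to be decomposed
    eigvals, U = np.linalg.eigh(G)         # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]      # reorder so the largest come first
    eigvals, U = eigvals[order], U[:, order]
    eigvals = np.clip(eigvals, 0.0, None)  # remove tiny negative round-off
    return U[:, :k] * np.sqrt(eigvals[:k]) # first k columns of U Sigma^(1/2)

# Hypothetical data: 50 points in R^4, one point per column.
rng = np.random.default_rng(0)
X = rng.standard_normal((4, 50))
Y = pca_gram(X, 2)  # 50 x 2: one row of 2D coordinates per point
```

Decomposing the n x n matrix is convenient when n is small; for many points in few dimensions, an SVD of the centered data gives the same coordinates up to sign.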

What if we don’t have coordinates, but distances?

“Classical” Multidimensional Scaling

http://www.math.pku.edu.cn/teachers/yaoy/Fall2011/lecture11.pdf

Borg and Groenen, Modern Multidimensional Scaling

“Classical” Multidimensional Scaling

• Algorithm:

• Given $D_{ij} = |X_i - X_j|^2$, create $B = -\frac{1}{2} H D H^T$

• PCA of B is equal to the PCA of X

• Huh?!
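The reason this works: expanding the squared distance gives |X_i|^2 + |X_j|^2 - 2 X_i·X_j, and multiplying by H on both sides removes the first two terms, so B ends up equal to the matrix of inner products of the centered points. The sketch below checks this numerically and recovers coordinates from distances alone. The names `double_center` and `classical_mds` and the random data are assumptions made for illustration.

```python
import numpy as np

def double_center(D):
    """B = -1/2 H D H^T for an n x n matrix D of squared distances."""
    n = D.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    return -0.5 * H @ D @ H.T

def classical_mds(D, k):
    """Recover k-dimensional coordinates (n x k) from squared distances D."""
    B = double_center(D)
    eigvals, U = np.linalg.eigh(B)           # ascending eigenvalues
    order = np.argsort(eigvals)[::-1][:k]    # indices of the k largest
    top = np.clip(eigvals[order], 0.0, None) # remove tiny negative round-off
    return U[:, order] * np.sqrt(top)

# Hypothetical data: 20 points in R^3, one point per column.
rng = np.random.default_rng(0)
X = rng.standard_normal((3, 20))
diff = X[:, :, None] - X[:, None, :]
D = (diff ** 2).sum(axis=0)                  # D[i, j] = |X_i - X_j|^2

# B matches the Gram matrix of the centered points.
X_tilde = X - X.mean(axis=1, keepdims=True)
B = double_center(D)
gram = X_tilde.T @ X_tilde

# Coordinates recovered from distances only.
Y = classical_mds(D, 3)
```

The recovered points `Y` agree with the original ones only up to rotation, reflection, and translation, since distances carry no information about those.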

“Nonlinear” dimensionality reduction

(i.e., the projection is not a matrix operation)

Data might have “high-order” structure

http://isomap.stanford.edu/Supplemental_Fig.pdf

We might want to minimize something other than the “difference between squared distances”

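t-SNE: difference between neighbor orderings (formally, it minimizes a KL divergence between neighbor-probability distributions in the original and the reduced space)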

Why not distances?

The Curse of Dimensionality

• High-dimensional space looks nothing like low-dimensional space

• Most distances become meaningless: the nearest and farthest neighbors of a point tend to end up almost equally far away
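A small experiment makes the second point concrete. The sketch below assumes points drawn uniformly at random from the unit cube, which is only one possible data model, and measures how much farther the farthest point is than the nearest one, relative to the nearest. The function name `relative_contrast` and all parameters are invented for illustration.

```python
import numpy as np

def relative_contrast(dim, n_points=500, seed=0):
    """(farthest - nearest) / nearest distance from a random query point."""
    rng = np.random.default_rng(seed)
    points = rng.random((n_points, dim))  # uniform samples in the unit cube
    query = rng.random(dim)
    dists = np.linalg.norm(points - query, axis=1)
    return (dists.max() - dists.min()) / dists.min()

# The contrast shrinks as the dimension grows.
for dim in (2, 10, 100, 1000):
    print(dim, relative_contrast(dim))
```

When the printed ratio is close to zero, every point is roughly as far away as every other, so "nearest neighbor" carries little information.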