Date post: 09-Jul-2020
Department of Computer Science
CSCI 5622: Machine Learning

Chenhao Tan
Lecture 16: Dimensionality Reduction

Slides adapted from Jordan Boyd-Graber, Chris Ketelsen

Midterm

A. Review session

B. Flipped classroom

C. Go over the example midterm

D. Clustering!

Learning objectives

• Understand what unsupervised learning is for

• Learn principal component analysis

• Learn singular value decomposition

Supervised learning

• Data: X

• Labels: Y

Unsupervised learning

• Data: X

• Latent structure: Z

When do we need unsupervised learning?

• Acquiring labels is expensive

• You may not even know what labels to acquire

• Exploratory data analysis

• Learn patterns/representations that can be useful for supervised learning (representation learning)

• Generate data

• …

https://qz.com/1090267/artificial-intelligence-can-now-show-you-how-those-pants-will-fit/

Unsupervised learning

• Dimensionality reduction

• Clustering

• Topic modeling

Principal Component Analysis - Motivation

The data’s features are almost certainly correlated.

This makes it hard to see hidden structure.

To make this easier, let's try to reduce the data to one dimension.

We need to shift our perspective

Change the definition of up-down-left-right

Choose new features as linear combinations of old features

Change of feature basis

Important: Center and normalize the data before performing PCA. We will assume that this has already been done in this lecture.
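As a rough illustration of the centering and normalizing step, here is a minimal NumPy sketch. The function name standardize, the toy data, and the choice to leave constant features unscaled are assumptions made for this example; they do not come from the slides.

```python
import numpy as np

def standardize(X):
    """Center each feature to mean 0 and scale it to unit sample variance."""
    X = np.asarray(X, dtype=float)
    mean = X.mean(axis=0)
    std = X.std(axis=0, ddof=1)
    # Assumption: a constant feature (std == 0) is left unscaled to avoid dividing by zero.
    std[std == 0] = 1.0
    return (X - mean) / std

# Hypothetical toy data: 5 samples, 2 features on very different scales.
X = np.array([[1.0, 200.0],
              [2.0, 180.0],
              [3.0, 260.0],
              [4.0, 300.0],
              [5.0, 310.0]])
Z = standardize(X)
```

Without this step, a feature measured in large units would tend to dominate the directions of largest variance simply because of its scale.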

Proceed incrementally:

• What if we could choose only one combination to describe the data?

• Which combination leads to the least loss of information?

• Once we've found that one, look for another one, perpendicular to the first, that retains the next largest amount of information

• Repeat until done (or good enough)

The best vector to project onto is called the 1st principal component.

What properties should it have?

• Should capture the largest variance in the data

• Should probably be a unit vector

After we’ve found the first, look for the second, which:

• Captures the largest amount of leftover variance

• Should probably be a unit vector

• Should be orthogonal to the one that came before it
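The properties listed above (largest variance, unit length, orthogonality) can be checked numerically. The sketch below uses one common way to obtain the components, an eigendecomposition of the sample covariance matrix; the slides' own construction is not preserved in this transcript, so treat the function name principal_components and the synthetic data as assumptions.

```python
import numpy as np

def principal_components(X):
    """Return variances (descending) and unit-length principal directions as columns."""
    Xc = X - X.mean(axis=0)                  # center the data
    cov = np.cov(Xc, rowvar=False)           # sample covariance (divides by n - 1)
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigh returns eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1]        # reorder so the largest variance comes first
    return eigvals[order], eigvecs[:, order]

# Hypothetical correlated 2D data.
rng = np.random.default_rng(0)
x = rng.normal(size=200)
X = np.column_stack([x, 0.5 * x + 0.1 * rng.normal(size=200)])

variances, W = principal_components(X)
w1, w2 = W[:, 0], W[:, 1]

# Variance of the data after projecting onto the first direction.
projected_variance = np.var((X - X.mean(axis=0)) @ w1, ddof=1)
```

Here w1 and w2 come out as unit vectors that are orthogonal to each other, and the variance of the data projected onto w1 equals the largest eigenvalue.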

Main idea: The principal components give a new perpendicular coordinate system in which to view the data, where each successive principal component describes less and less information.

So far: All we’ve done is a change of basis on the feature space.

But when do we reduce the dimension?

Picture data points in a 3D feature space.

What if the points lay mostly along a single vector?

The other two principal components are still there

But they do not carry much information

Throw them away and work with low dimensional representation!

Reduce 3D data to 1D
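A small synthetic sketch of the 3D-to-1D case: points are generated along one direction with a little noise, and only the first principal component is kept. The direction vector, noise level, and sample count are made-up values for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical 3D data lying mostly along one unit vector.
direction = np.array([1.0, 2.0, 2.0]) / 3.0      # norm is sqrt(1 + 4 + 4) / 3 = 1
t = rng.normal(scale=5.0, size=300)              # position along the direction
noise = rng.normal(scale=0.1, size=(300, 3))     # small off-axis noise
X = np.outer(t, direction) + noise

Xc = X - X.mean(axis=0)
eigvals, eigvecs = np.linalg.eigh(np.cov(Xc, rowvar=False))
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

explained = eigvals / eigvals.sum()   # fraction of variance per component
w1 = eigvecs[:, 0]
z = Xc @ w1                           # 1D representation, one number per point

# Map the 1D codes back to 3D to see how little was lost.
X_approx = np.outer(z, w1) + X.mean(axis=0)
```

In this setup almost all of the variance sits in the first component, so the other two can be dropped with little reconstruction error.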

Principal Component Analysis – The How

But how do we find w?
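The slides that answer this question did not survive in the transcript. One standard argument, sketched here under assumed notation (centered samples x_1, ..., x_n, sample covariance Σ, Lagrange multiplier λ), maximizes the variance of the projection onto a unit vector w. Using 1/(n-1) instead of 1/n would rescale Σ but not change w.

```latex
% Assumed notation: centered samples x_1, ..., x_n in R^d and a unit direction w.
\[
\Sigma = \frac{1}{n}\sum_{i=1}^{n} x_i x_i^{\top}
\]
% Variance of the data projected onto w.
\[
\frac{1}{n}\sum_{i=1}^{n} \left(w^{\top} x_i\right)^2 = w^{\top}\Sigma w
\]
% Constrained maximization: largest variance, unit length.
\[
\max_{w}\; w^{\top}\Sigma w \quad \text{subject to} \quad w^{\top} w = 1
\]
% Lagrangian and its stationarity condition.
\[
L(w,\lambda) = w^{\top}\Sigma w - \lambda\left(w^{\top} w - 1\right),
\qquad
\nabla_w L = 2\Sigma w - 2\lambda w = 0
\;\Longrightarrow\; \Sigma w = \lambda w
\]
% At any stationary point the objective equals the eigenvalue.
\[
w^{\top}\Sigma w = \lambda\, w^{\top} w = \lambda
\]
```

So w must be an eigenvector of Σ, and the captured variance is its eigenvalue; the first principal component is therefore the eigenvector with the largest eigenvalue.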

PCA – Dimensionality reduction

Questions:

• How do we reduce dimensionality?

• How much stuff should we keep?
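One common heuristic for both questions, not necessarily the one used in the lecture, is to keep the smallest number of components whose eigenvalues account for a chosen fraction of the total variance and then project onto those components. The function names choose_k and reduce_dimension, the 0.95 default, and the sample values are assumptions for this sketch.

```python
import numpy as np

def choose_k(eigvals, threshold=0.95):
    """Smallest k whose top-k eigenvalues explain at least `threshold` of the variance."""
    eigvals = np.sort(np.asarray(eigvals, dtype=float))[::-1]
    cumulative = np.cumsum(eigvals) / eigvals.sum()
    k = int(np.searchsorted(cumulative, threshold)) + 1
    return min(k, len(eigvals))   # guard against rounding when threshold is near 1

def reduce_dimension(X, k):
    """Project centered data onto its top-k principal directions."""
    Xc = X - X.mean(axis=0)
    eigvals, eigvecs = np.linalg.eigh(np.cov(Xc, rowvar=False))
    top = np.argsort(eigvals)[::-1][:k]
    return Xc @ eigvecs[:, top]

# Hypothetical eigenvalues: the first two carry 90% of the variance.
k = choose_k([6.0, 3.0, 0.5, 0.5], threshold=0.85)

# Hypothetical data: 50 samples with 4 features, reduced to k features.
rng = np.random.default_rng(3)
X = rng.normal(size=(50, 4))
X_reduced = reduce_dimension(X, k)
```

The threshold trades compression against fidelity: a lower value keeps fewer components and discards more variance.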

Quiz

PCA - applications

Connecting PCA and SVD
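The slide body for this topic is not in the transcript. As a hedged illustration of the usual connection, the sketch below compares an eigendecomposition of the sample covariance with an SVD of the centered data matrix; the synthetic data and variable names are assumptions.

```python
import numpy as np

# Hypothetical data: 100 samples, 3 correlated features.
rng = np.random.default_rng(2)
X = rng.normal(size=(100, 3)) @ np.array([[2.0, 0.0, 0.0],
                                          [0.5, 1.0, 0.0],
                                          [0.0, 0.3, 0.2]])
Xc = X - X.mean(axis=0)
n = Xc.shape[0]

# Route 1: eigendecomposition of the sample covariance matrix.
eigvals, eigvecs = np.linalg.eigh(Xc.T @ Xc / (n - 1))
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Route 2: SVD of the centered data, Xc = U @ diag(S) @ Vt.
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)

# Squared singular values, rescaled, match the covariance eigenvalues.
svd_variances = S**2 / (n - 1)

# Projections onto the principal directions are U scaled by the singular values.
scores = Xc @ Vt.T
```

The rows of Vt agree with the covariance eigenvectors up to sign, which is why PCA is often computed through the SVD without forming the covariance matrix explicitly.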

SVD Applications

Wrap up

Dimensionality reduction can be a useful way to

• explore data

• visualize data

• represent data

