
Pattern Recognition and Machine Learning

Date post: 22-Feb-2016
Upload: kevyn
Page 1: Pattern Recognition  and  Machine Learning

Laboratory for Social & Neural Systems Research (SNS)

PATTERN RECOGNITION AND MACHINE LEARNING

Institute of Empirical Research in Economics (IEW)

22-09-2010, Computational Neuroeconomics and Neuroscience

Page 2: Pattern Recognition  and  Machine Learning

Course schedule

Date | Topic | Chapter
13-10-2010 | Density Estimation, Bayesian Inference | 2
    Adrian Etter, Marco Piccirelli, Giuseppe Ugazio
20-10-2010 | Linear Models for Regression | 3
    Susanne Leiberg, Grit Hein
27-10-2010 | Linear Models for Classification | 4
    Friederike Meyer, Chaohui Guo
03-11-2010 | Kernel Methods I: Gaussian Processes | 6
    Kate Lomakina
10-11-2010 | Kernel Methods II: SVM and RVM | 7
    Christoph Mathys, Morteza Moazami
17-11-2010 | Probabilistic Graphical Models | 8
    Justin Chumbley

Page 3: Pattern Recognition  and  Machine Learning

Course schedule

Date | Topic | Chapter
24-11-2010 | Mixture Models and EM | 9
    Bastiaan Oud, Tony Williams
01-12-2010 | Approximate Inference I: Deterministic Approximations | 10
    Falk Lieder
08-12-2010 | Approximate Inference II: Stochastic Approximations | 11
    Kay Brodersen
15-12-2010 | Inference on Continuous Latent Variables: PCA, Probabilistic PCA, ICA | 12
    Lars Kasper
22-12-2010 | Sequential Data: Hidden Markov Models, Linear Dynamical Systems | 13
    Chris Burke, Yosuke Morishima

Page 4: Pattern Recognition  and  Machine Learning

CHAPTER 1: PROBABILITY, DECISION, AND INFORMATION THEORY

Sandra Iglesias

Page 5: Pattern Recognition  and  Machine Learning

Outline

- Introduction
- Probability Theory
  - Probability Rules
  - Bayes' Theorem
  - Gaussian Distribution
- Decision Theory
- Information Theory

Page 6: Pattern Recognition  and  Machine Learning

Pattern recognition

- Computer algorithms for the automatic discovery of regularities in data
- Use of these regularities to take actions, such as classifying the data into different categories
- Classify data (patterns) based on either
  - a priori knowledge, or
  - statistical information extracted from the patterns

Page 7: Pattern Recognition  and  Machine Learning

Machine learning

'How can we program systems to automatically learn and to improve with experience?'

- The machine is programmed to learn from an incomplete set of examples (the training set)
- The core objective of a learner is to generalize from its experience

Page 8: Pattern Recognition  and  Machine Learning

Polynomial Curve Fitting

Function: sin(2πx)
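
The curve-fitting setup can be sketched in a few lines of plain Python. This is a minimal sketch under stated assumptions, not the code behind the slide: the number of points (N = 10), the noise level (standard deviation 0.3), the polynomial orders, and every function name below are chosen here for illustration. It draws noisy samples of sin(2πx), fits polynomials by least squares through the normal equations, and reports the sum-of-squares error and the RMS error.

```python
import math
import random


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs are left untouched.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_polynomial(xs, ts, order):
    """Least-squares coefficients w_0..w_M from the normal equations."""
    powers = [[x ** j for j in range(order + 1)] for x in xs]
    size = order + 1
    lhs = [[sum(p[i] * p[j] for p in powers) for j in range(size)]
           for i in range(size)]
    rhs = [sum(p[i] * t for p, t in zip(powers, ts)) for i in range(size)]
    return solve_linear_system(lhs, rhs)


def predict(w, x):
    """Evaluate the polynomial y(x, w) = sum_j w_j x^j."""
    return sum(w_j * x ** j for j, w_j in enumerate(w))


def sum_of_squares_error(w, xs, ts):
    """E(w) = 1/2 * sum_n (y(x_n, w) - t_n)^2."""
    return 0.5 * sum((predict(w, x) - t) ** 2 for x, t in zip(xs, ts))


random.seed(0)  # fixed seed so the sketch is repeatable
N = 10
xs = [n / (N - 1) for n in range(N)]
ts = [math.sin(2 * math.pi * x) + random.gauss(0.0, 0.3) for x in xs]

errors = {}
for order in (0, 1, 3):
    w = fit_polynomial(xs, ts, order)
    errors[order] = sum_of_squares_error(w, xs, ts)
    rms = math.sqrt(2 * errors[order] / N)
    print(f"M={order}: E(w)={errors[order]:.4f}, E_RMS={rms:.4f}")
```

The orders stop at M = 3 on purpose: the unregularized normal equations become badly conditioned for high orders, so a more careful solver would be needed to reproduce an M = 9 fit.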

Page 9: Pattern Recognition  and  Machine Learning

Sum-of-Squares Error Function
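
The formula on this slide did not survive in the transcript. Assuming the slide follows the usual formulation of polynomial curve fitting, with y(x, w) a polynomial of order M, targets t_n, and N training points (symbols assumed here), the error function would read:

```latex
y(x, \mathbf{w}) = \sum_{j=0}^{M} w_j x^j,
\qquad
E(\mathbf{w}) = \frac{1}{2} \sum_{n=1}^{N} \bigl\{ y(x_n, \mathbf{w}) - t_n \bigr\}^2
```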

Page 10: Pattern Recognition  and  Machine Learning

Plots of polynomials

Page 11: Pattern Recognition  and  Machine Learning

Over-fitting

Root-Mean-Square (RMS) Error: $E_{\mathrm{RMS}} = \sqrt{2E(\mathbf{w}^\star)/N}$

Page 12: Pattern Recognition  and  Machine Learning

Regularization

Penalize large coefficient values

[Figure: two panels, each showing a fit with M = 9]
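
The effect of penalizing large coefficients can be sketched as follows. This is an illustration under assumptions: the penalty is taken to be (λ/2)‖w‖² added to the sum-of-squares error, all coefficients including w_0 are penalized for simplicity, and the two values ln λ = -18 and ln λ = 0, the noise level, and the function names are chosen here rather than taken from the slide.

```python
import math
import random


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_ridge(xs, ts, order, lam):
    """Minimize 1/2 sum (y - t)^2 + lam/2 ||w||^2 by solving
    (A^T A + lam I) w = A^T t, where A holds the powers of x."""
    powers = [[x ** j for j in range(order + 1)] for x in xs]
    size = order + 1
    lhs = [[sum(p[i] * p[j] for p in powers) + (lam if i == j else 0.0)
            for j in range(size)] for i in range(size)]
    rhs = [sum(p[i] * t for p, t in zip(powers, ts)) for i in range(size)]
    return solve_linear_system(lhs, rhs)


random.seed(0)  # fixed seed so the sketch is repeatable
N = 10
xs = [n / (N - 1) for n in range(N)]
ts = [math.sin(2 * math.pi * x) + random.gauss(0.0, 0.3) for x in xs]

norms = {}
for ln_lam in (-18.0, 0.0):
    w = fit_ridge(xs, ts, 9, math.exp(ln_lam))
    norms[ln_lam] = math.sqrt(sum(c * c for c in w))
    print(f"ln(lambda)={ln_lam:+.0f}: ||w|| = {norms[ln_lam]:.3f}")
```

A larger λ shrinks the coefficient vector, which is the sense in which the penalty discourages the wildly oscillating fits seen with M = 9.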

Page 13: Pattern Recognition  and  Machine Learning

Regularization: $E_{\mathrm{RMS}}$ vs. $\ln \lambda$

[Figure: M = 9]

Page 14: Pattern Recognition  and  Machine Learning

Outline

- Introduction
- Probability Theory
- Decision Theory
- Information Theory

Page 15: Pattern Recognition  and  Machine Learning

Probability Theory

Uncertainty arises from:
- Noise on measurements
- Finite size of data sets

Probability theory: a consistent framework for the quantification and manipulation of uncertainty

Page 16: Pattern Recognition  and  Machine Learning

Probability Theory

- Marginal Probability
- Conditional Probability
- Joint Probability

Page 17: Pattern Recognition  and  Machine Learning

Probability Theory

i = 1, …, M
j = 1, …, L

$n_{ij}$: number of trials in which X = $x_i$ and Y = $y_j$

$c_i$: number of trials in which X = $x_i$ irrespective of the value of Y

$r_j$: number of trials in which Y = $y_j$ irrespective of the value of X
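
The counts defined above lead directly to joint, marginal, and conditional probabilities. The sketch below uses a made-up 2 × 3 count table (the numbers are hypothetical) and checks that the sum rule and the product rule hold for the resulting frequencies.

```python
# Hypothetical count table: counts[i][j] = n_ij, the number of trials
# in which X = x_i and Y = y_j.
counts = [
    [3, 1, 2],
    [1, 4, 1],
]

N = sum(sum(row) for row in counts)                      # total number of trials
c = [sum(row) for row in counts]                         # c_i: X = x_i, any Y
r = [sum(row[j] for row in counts)                       # r_j: Y = y_j, any X
     for j in range(len(counts[0]))]

joint = [[n_ij / N for n_ij in row] for row in counts]   # p(X = x_i, Y = y_j)
marginal_x = [c_i / N for c_i in c]                      # p(X = x_i)
marginal_y = [r_j / N for r_j in r]                      # p(Y = y_j)
cond_y_given_x = [[n_ij / c_i for n_ij in row]           # p(Y = y_j | X = x_i)
                  for row, c_i in zip(counts, c)]

# Sum rule: p(X) is the joint summed over Y.
sum_rule_ok = all(
    abs(sum(joint[i]) - marginal_x[i]) < 1e-12 for i in range(len(counts))
)

# Product rule: p(X, Y) = p(Y | X) p(X).
product_rule_ok = all(
    abs(joint[i][j] - cond_y_given_x[i][j] * marginal_x[i]) < 1e-12
    for i in range(len(counts))
    for j in range(len(counts[0]))
)

print(N, c, r, sum_rule_ok, product_rule_ok)
```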

Page 21: Pattern Recognition  and  Machine Learning

Probability Theory

Sum Rule
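
The equation on this slide is missing from the transcript. Using the trial counts from the earlier slide and writing N for the total number of trials (a symbol assumed here), the sum rule follows in one line, because $c_i = \sum_j n_{ij}$:

```latex
p(X = x_i) = \frac{c_i}{N}
           = \frac{1}{N} \sum_{j=1}^{L} n_{ij}
           = \sum_{j=1}^{L} p(X = x_i, Y = y_j)
```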

Page 22: Pattern Recognition  and  Machine Learning

Probability Theory

Product Rule
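
The equation on this slide is missing from the transcript. With the same trial counts, and N again assumed to denote the total number of trials, the product rule comes from multiplying and dividing by $c_i$:

```latex
p(X = x_i, Y = y_j) = \frac{n_{ij}}{N}
                    = \frac{n_{ij}}{c_i} \cdot \frac{c_i}{N}
                    = p(Y = y_j \mid X = x_i)\, p(X = x_i)
```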

Page 23: Pattern Recognition  and  Machine Learning

The Rules of Probability

Sum Rule: $p(X) = \sum_Y p(X, Y)$

Product Rule: $p(X, Y) = p(Y \mid X)\, p(X)$

Page 24: Pattern Recognition  and  Machine Learning

Bayes' Theorem

p(X,Y) = p(Y,X)

T. Bayes (1702-1761)
P.-S. Laplace (1749-1827)
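
The theorem itself is not preserved in the transcript. Starting from the symmetry p(X,Y) = p(Y,X) and applying the product rule to each side, the standard derivation is:

```latex
p(Y \mid X)\, p(X) = p(X \mid Y)\, p(Y)
\quad\Longrightarrow\quad
p(Y \mid X) = \frac{p(X \mid Y)\, p(Y)}{p(X)},
\qquad
p(X) = \sum_{Y} p(X \mid Y)\, p(Y)
```

The last expression is the sum rule applied to the numerator, so the denominator acts as a normalization constant.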

Page 25: Pattern Recognition  and  Machine Learning

Bayes' Theorem

Polynomial curve fitting problem:

$$p(\mathbf{w} \mid D) = \frac{p(D \mid \mathbf{w})\, p(\mathbf{w})}{p(D)}$$

posterior ∝ likelihood × prior
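
The relation posterior ∝ likelihood × prior can be illustrated numerically. The sketch below swaps the curve-fitting parameters for a much simpler stand-in: w is the unknown heads probability of a coin, restricted to five candidate values, and D is a short list of flips. The candidates, the uniform prior, and the data are all made up for illustration.

```python
# Hypothetical setting: w is the probability of heads, D a list of flips.
candidates = [0.1, 0.3, 0.5, 0.7, 0.9]
prior = [1 / len(candidates)] * len(candidates)   # uniform p(w)
data = [1, 1, 0, 1, 1, 1, 0, 1]                   # 1 = heads, 0 = tails


def likelihood(w, flips):
    """p(D | w) for independent Bernoulli flips."""
    result = 1.0
    for flip in flips:
        result *= w if flip == 1 else (1.0 - w)
    return result


# Numerator of Bayes' theorem for every candidate value of w.
unnormalized = [likelihood(w, data) * p for w, p in zip(candidates, prior)]
evidence = sum(unnormalized)                       # p(D), the normalizer
posterior = [u / evidence for u in unnormalized]   # p(w | D)

most_probable = candidates[posterior.index(max(posterior))]
print(most_probable, [round(p, 3) for p in posterior])
```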

Page 26: Pattern Recognition  and  Machine Learning

Probability Densities

Page 27: Pattern Recognition  and  Machine Learning

Expectations

The expectation of f(x) is the average value of a function f(x) under a probability distribution p(x).

Expectation for a discrete distribution: $\mathbb{E}[f] = \sum_x p(x)\, f(x)$

Expectation for a continuous distribution: $\mathbb{E}[f] = \int p(x)\, f(x)\, dx$
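
A short sketch of the discrete case, assuming a fair six-sided die and f(x) = x² as a made-up example. It computes the probability-weighted average exactly and compares it with the plain average of f over random samples, which approximates the same quantity.

```python
import random


def expectation(values, probs, f):
    """Discrete expectation: sum over x of p(x) * f(x)."""
    return sum(p * f(x) for x, p in zip(values, probs))


faces = [1, 2, 3, 4, 5, 6]
probs = [1 / 6] * 6

exact = expectation(faces, probs, lambda x: x ** 2)   # equals 91/6

# Approximate the same expectation by averaging f over samples drawn from p(x).
random.seed(0)
samples = [random.choice(faces) for _ in range(100_000)]
approx = sum(x ** 2 for x in samples) / len(samples)

print(f"exact={exact:.4f}, sample average={approx:.4f}")
```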

Page 28: Pattern Recognition  and  Machine Learning

The Gaussian Distribution

Page 29: Pattern Recognition  and  Machine Learning

Gaussian Parameter Estimation

Likelihood function

Page 30: Pattern Recognition  and  Machine Learning

Maximum (Log) Likelihood
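
The formulas on this slide are missing from the transcript. Assuming independent observations from a single Gaussian with mean μ and variance σ², the maximum likelihood estimates are the sample mean and the sample variance with divisor N. The sketch below uses a tiny made-up data set; the function names are illustrative.

```python
import math


def gaussian_log_likelihood(xs, mu, var):
    """Log-likelihood of i.i.d. data under a Gaussian with mean mu, variance var."""
    n = len(xs)
    return (-sum((x - mu) ** 2 for x in xs) / (2 * var)
            - 0.5 * n * math.log(var)
            - 0.5 * n * math.log(2 * math.pi))


def ml_estimates(xs):
    """Maximum likelihood mean and variance (variance uses divisor N)."""
    n = len(xs)
    mu = sum(xs) / n
    var = sum((x - mu) ** 2 for x in xs) / n
    return mu, var


data = [1.0, 2.0, 3.0, 4.0]
mu_ml, var_ml = ml_estimates(data)

# The ML variance underestimates the true variance on average;
# rescaling by N / (N - 1) gives the usual unbiased estimate.
var_unbiased = var_ml * len(data) / (len(data) - 1)

print(mu_ml, var_ml, var_unbiased)
```

Evaluating `gaussian_log_likelihood` at the ML estimates and at nearby parameter values is a quick way to confirm that the estimates sit at the maximum.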

Page 31: Pattern Recognition  and  Machine Learning

Curve Fitting Re-visited

Page 32: Pattern Recognition  and  Machine Learning

Maximum Likelihood

Determine $\mathbf{w}_{\mathrm{ML}}$ by minimizing the sum-of-squares error, $E(\mathbf{w})$.
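
The link between maximum likelihood and the sum-of-squares error can be written out explicitly. The derivation below assumes the usual Gaussian noise model for the targets, with precision β (inverse variance); that model and the symbol β are assumptions here, since the corresponding slide content is missing from the transcript.

```latex
p(t \mid x, \mathbf{w}, \beta) = \mathcal{N}\bigl(t \mid y(x, \mathbf{w}), \beta^{-1}\bigr)

\ln p(\mathbf{t} \mid \mathbf{x}, \mathbf{w}, \beta)
  = -\frac{\beta}{2} \sum_{n=1}^{N} \bigl\{ y(x_n, \mathbf{w}) - t_n \bigr\}^2
    + \frac{N}{2} \ln \beta - \frac{N}{2} \ln (2\pi)
  = -\beta\, E(\mathbf{w}) + \frac{N}{2} \ln \beta - \frac{N}{2} \ln (2\pi)

\frac{1}{\beta_{\mathrm{ML}}}
  = \frac{1}{N} \sum_{n=1}^{N} \bigl\{ y(x_n, \mathbf{w}_{\mathrm{ML}}) - t_n \bigr\}^2
```

Only the first term depends on w, and it enters with a negative sign, so maximizing the log-likelihood over w is the same as minimizing E(w).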

Page 33: Pattern Recognition  and  Machine Learning

Outline

- Introduction
- Probability Theory
- Decision Theory
- Information Theory

Page 34: Pattern Recognition  and  Machine Learning

Decision Theory

- Used with probability theory to make optimal decisions
- Input vector x, target vector t
  - Regression: t is continuous
  - Classification: t consists of class labels
- The uncertainty associated with x and t is summarized by the joint distribution $p(x, t)$
- Inference problem: obtain $p(x, t)$ from data
- Decision problem: make a specific prediction for the value of t and take specific actions based on t

Inference step: determine either $p(t \mid x)$ or $p(x, t)$.

Decision step: for given x, determine the optimal t.

Page 35: Pattern Recognition  and  Machine Learning

Medical Diagnosis Problem

- X-ray image of a patient
- Question: does the patient have cancer or not?
- Input vector x: set of pixel intensities
- Output variable t: whether cancer or not
  - C1 = cancer; C2 = no cancer
- The general inference problem is to determine $p(x, C_k)$, which gives the most complete description of the situation
- In the end we need to decide whether to give treatment or not. Decision theory helps do this.

Page 36: Pattern Recognition  and  Machine Learning

Bayes' Decision

- How do probabilities play a role in making a decision?
- Given input x and classes $C_k$, using Bayes' theorem:

  $$p(C_k \mid x) = \frac{p(x \mid C_k)\, p(C_k)}{p(x)}$$

- The quantities in Bayes' theorem can be obtained from $p(x, C_k)$ by either marginalizing or conditioning with respect to the appropriate variable
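
Marginalizing and conditioning on the joint distribution can be shown with a small table. The joint probabilities below are hypothetical, and x is reduced to a single discrete feature with two values so that the sums are easy to follow.

```python
# Hypothetical joint distribution p(x, C_k) over a two-valued feature x.
joint = {
    ("low", "C1"): 0.02, ("low", "C2"): 0.58,
    ("high", "C1"): 0.08, ("high", "C2"): 0.32,
}


def marginal_x(joint, x):
    """p(x): marginalize the joint over the classes."""
    return sum(p for (xi, _), p in joint.items() if xi == x)


def prior(joint, c):
    """p(C_k): marginalize the joint over x."""
    return sum(p for (_, ci), p in joint.items() if ci == c)


def class_conditional(joint, x, c):
    """p(x | C_k): condition the joint on the class."""
    return joint[(x, c)] / prior(joint, c)


def posterior(joint, c, x):
    """p(C_k | x): condition the joint on the observed x."""
    return joint[(x, c)] / marginal_x(joint, x)


direct = posterior(joint, "C1", "high")
via_bayes = (class_conditional(joint, "high", "C1") * prior(joint, "C1")
             / marginal_x(joint, "high"))
print(direct, via_bayes)
```

Both routes give the same posterior, which is the content of Bayes' theorem for this table.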

Page 37: Pattern Recognition  and  Machine Learning

Minimum Expected Loss

Example: classify medical images as 'cancer' or 'normal'

- Unequal importance of mistakes
- Loss or cost function given by a loss matrix
- Utility is the negative of loss
- Minimize average loss

[Figure: loss matrix, rows = Truth, columns = Decision]

- Regions $\mathcal{R}_j$ are chosen to minimize $\mathbb{E}[L] = \sum_k \sum_j \int_{\mathcal{R}_j} L_{kj}\, p(x, C_k)\, dx$
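
For a single input x, minimizing the expected loss amounts to picking the decision with the smallest posterior-weighted loss. The loss matrix entries are not preserved in the transcript, so the values below are assumptions chosen only to show unequal costs: missing a cancer is taken to be 1000 times worse than a false alarm.

```python
# Assumed loss matrix LOSS[truth][decision]; the numbers are illustrative.
LOSS = {
    "cancer": {"cancer": 0.0, "normal": 1000.0},
    "normal": {"cancer": 1.0, "normal": 0.0},
}
DECISIONS = ("cancer", "normal")


def expected_loss(decision, posterior):
    """Sum over true classes of loss(truth, decision) * p(truth | x)."""
    return sum(LOSS[truth][decision] * p for truth, p in posterior.items())


def best_decision(posterior):
    """Decision with the smallest expected loss for this x."""
    return min(DECISIONS, key=lambda d: expected_loss(d, posterior))


# Even a 1% posterior probability of cancer leads to the decision 'cancer'.
posterior = {"cancer": 0.01, "normal": 0.99}
print(best_decision(posterior),
      expected_loss("cancer", posterior),
      expected_loss("normal", posterior))
```

With equal off-diagonal losses the same function reduces to picking the class with the largest posterior probability.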

Page 38: Pattern Recognition  and  Machine Learning

Why Separate Inference and Decision?

The classification problem is broken into two separate stages:
- Inference stage: training data is used to learn a model for $p(C_k \mid x)$
- Decision stage: posterior probabilities are used to make optimal class assignments

Three distinct approaches to solving decision problems:
1. Generative models
2. Discriminative models
3. Discriminant functions

Page 39: Pattern Recognition  and  Machine Learning

Generative models

1. Solve the inference problem of determining the class-conditional densities $p(x \mid C_k)$ for each class separately, and the prior probabilities $p(C_k)$
2. Use Bayes' theorem to determine the posterior probabilities:

   $$p(C_k \mid x) = \frac{p(x \mid C_k)\, p(C_k)}{p(x)}$$

3. Use decision theory to determine class membership
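
The three steps above can be sketched for a one-dimensional input. This is an illustration, not the method of any particular library: it assumes Gaussian class-conditional densities, estimates priors from class frequencies, and uses the simplest decision rule (largest posterior). The data and all names are made up.

```python
import math


def gaussian_pdf(x, mu, var):
    """Density of a univariate Gaussian with mean mu and variance var."""
    return math.exp(-(x - mu) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)


def fit_generative(xs, labels):
    """Step 1: class-conditional Gaussians and class priors from data."""
    model = {}
    for c in sorted(set(labels)):
        members = [x for x, label in zip(xs, labels) if label == c]
        mu = sum(members) / len(members)
        var = sum((x - mu) ** 2 for x in members) / len(members)
        model[c] = {"prior": len(members) / len(xs), "mu": mu, "var": var}
    return model


def posterior(model, x):
    """Step 2: Bayes' theorem, p(C_k | x) = p(x | C_k) p(C_k) / p(x)."""
    joint = {c: gaussian_pdf(x, p["mu"], p["var"]) * p["prior"]
             for c, p in model.items()}
    evidence = sum(joint.values())          # p(x)
    return {c: j / evidence for c, j in joint.items()}


def classify(model, x):
    """Step 3: assign the class with the largest posterior probability."""
    post = posterior(model, x)
    return max(post, key=post.get)


xs = [1.0, 1.5, 2.0, 2.5, 5.0, 5.5, 6.0]
labels = ["C1", "C1", "C1", "C1", "C2", "C2", "C2"]
model = fit_generative(xs, labels)
print(classify(model, 2.2), classify(model, 5.4))
```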

Page 40: Pattern Recognition  and  Machine Learning

Discriminative models

1. Solve the inference problem of determining the posterior class probabilities $p(C_k \mid x)$
2. Use decision theory to determine class membership

Page 41: Pattern Recognition  and  Machine Learning

Discriminant functions

1. Find a function f(x) that maps each input x directly to a class label
   e.g. two-class problem: f(·) is binary valued; f = 0 represents C1, f = 1 represents C2

Probabilities play no role

Page 42: Pattern Recognition  and  Machine Learning

Decision Theory for Regression

Inference step: determine $p(x, t)$

Decision step: for given x, make the optimal prediction, y(x), for t

Loss function: $\mathbb{E}[L] = \iint L(t, y(x))\, p(x, t)\, dx\, dt$
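
The specific loss used on this slide is not preserved. Assuming the common squared loss, L(t, y(x)) = (y(x) - t)², the prediction that minimizes the expected loss at a given x is the conditional mean of t. The sketch below checks this numerically for a made-up discrete distribution p(t | x) at one fixed x.

```python
# Hypothetical conditional distribution p(t | x) at one fixed input x.
targets = [0.0, 1.0, 2.0, 5.0]
probs = [0.1, 0.4, 0.3, 0.2]


def expected_squared_loss(y):
    """Expected value of (y - t)^2 under p(t | x) for a candidate prediction y."""
    return sum(p * (y - t) ** 2 for t, p in zip(targets, probs))


# Conditional mean of t given x.
conditional_mean = sum(p * t for t, p in zip(targets, probs))

# Brute-force search over a grid of candidate predictions from 0.0 to 5.0.
candidates = [i / 10 for i in range(51)]
best = min(candidates, key=expected_squared_loss)

print(conditional_mean, best, expected_squared_loss(best))
```

Other loss functions lead to other optimal predictions; the absolute loss, for example, is minimized by the conditional median rather than the mean.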

Page 43: Pattern Recognition  and  Machine Learning

Outline

- Introduction
- Probability Theory
- Decision Theory
- Information Theory

Page 44: Pattern Recognition  and  Machine Learning

Information theory

- Quantification of information
  - Degree of surprise:
    - highly improbable → a lot of information
    - highly probable → less information
    - certain → no information
- Based on probability theory
- Most important quantity: entropy

Page 45: Pattern Recognition  and  Machine Learning

Entropy

[Figure: axes labelled H[x] and p(x)]

Entropy is the average amount of information expected, weighted by the probability of each value of the random variable. It quantifies the uncertainty involved when we encounter this random variable.
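
As a sketch of this definition for a discrete variable, assuming information is measured in bits (base-2 logarithm) and that outcomes with zero probability contribute nothing: the information of an outcome is -log2 p(x), and the entropy is its probability-weighted average. The two example distributions are chosen for illustration.

```python
import math


def entropy_bits(probs):
    """H[x] = -sum_x p(x) log2 p(x); zero-probability outcomes are skipped."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


# Eight equally likely states: maximal uncertainty for eight outcomes.
uniform = [1 / 8] * 8

# Eight states with very unequal probabilities: less uncertainty.
skewed = [1 / 2, 1 / 4, 1 / 8, 1 / 16, 1 / 64, 1 / 64, 1 / 64, 1 / 64]

print(entropy_bits(uniform), entropy_bits(skewed), entropy_bits([1.0]))
```

The uniform distribution has the larger entropy, and a certain outcome has entropy zero, matching the degree-of-surprise picture.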

Page 46: Pattern Recognition  and  Machine Learning

The Kullback-Leibler Divergence

- Non-symmetric measure of the difference between two probability distributions
- Also called relative entropy
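
A minimal sketch for discrete distributions, using the standard definition KL(p‖q) = Σ p(x) ln(p(x)/q(x)) and assuming q(x) > 0 wherever p(x) > 0. The two distributions are made up; the point is that swapping the arguments changes the value.

```python
import math


def kl_divergence(p, q):
    """KL(p || q) in nats for discrete distributions on the same outcomes.

    Assumes q[i] > 0 wherever p[i] > 0; terms with p[i] == 0 contribute nothing.
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


p = [0.5, 0.4, 0.1]
q = [0.3, 0.3, 0.4]

forward = kl_divergence(p, q)   # KL(p || q)
reverse = kl_divergence(q, p)   # KL(q || p)
print(f"KL(p||q)={forward:.4f}, KL(q||p)={reverse:.4f}, "
      f"KL(p||p)={kl_divergence(p, p):.4f}")
```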

Page 47: Pattern Recognition  and  Machine Learning

Mutual Information

Two sets of variables: x and y

If independent: $p(x, y) = p(x)\, p(y)$

If not independent: $I[x, y] = \mathrm{KL}\bigl(p(x, y) \,\|\, p(x)\, p(y)\bigr)$

Page 48: Pattern Recognition  and  Machine Learning

Mutual Information

Mutual information:
- mutual dependence
- shared information
- related to the conditional entropy
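
Mutual information can be computed from a joint table as the KL divergence between the joint distribution and the product of its marginals; equivalently, I[x, y] = H[x] - H[x|y]. The sketch below measures it in bits and uses two made-up 2 × 2 joint distributions, one independent and one fully dependent.

```python
import math


def mutual_information_bits(joint):
    """I[x, y] = sum_{x,y} p(x,y) log2( p(x,y) / (p(x) p(y)) ).

    `joint[i][j]` holds p(x_i, y_j); zero entries contribute nothing.
    """
    px = [sum(row) for row in joint]
    py = [sum(col) for col in zip(*joint)]
    return sum(
        pxy * math.log2(pxy / (px[i] * py[j]))
        for i, row in enumerate(joint)
        for j, pxy in enumerate(row)
        if pxy > 0
    )


independent = [[0.25, 0.25], [0.25, 0.25]]   # p(x, y) = p(x) p(y)
dependent = [[0.5, 0.0], [0.0, 0.5]]         # y is fully determined by x

print(mutual_information_bits(independent), mutual_information_bits(dependent))
```

Independent variables share no information, while in the fully dependent case knowing y removes the entire one bit of uncertainty about x.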

Page 49: Pattern Recognition  and  Machine Learning

Course schedule

Date | Topic | Chapter
22-09-2010 | Probability, Decision, and Information Theory | 1
13-10-2010 | Density Estimation, Bayesian Inference | 2
20-10-2010 | Linear Models for Regression | 3
27-10-2010 | Linear Models for Classification | 4
03-11-2010 | Kernel Methods I: Gaussian Processes | 6
10-11-2010 | Kernel Methods II: SVM and RVM | 7
17-11-2010 | Probabilistic Graphical Models | 8
24-11-2010 | Mixture Models and EM | 9
01-12-2010 | Approximate Inference I: Deterministic Approximations | 10
08-12-2010 | Approximate Inference II: Stochastic Approximations | 11
15-12-2010 | Inference on Continuous Latent Variables: PCA, Probabilistic PCA, ICA | 12
22-12-2010 | Sequential Data: Hidden Markov Models, Linear Dynamical Systems | 13

