Page 1

Probabilistic Graphical Models and Their Applications

Bjoern Andres and Bernt Schiele

Max Planck Institute for Informatics

slides adapted from Peter Gehler

October 26, 2016

Page 2

Intro

Organization

- Lecture 2 hours/week
  - Wed: 14:00 – 16:00, Room: E1.4 024

- Exercises 2 hours/week
  - Thu: 10:00 – 12:00, Room: E1.4 024
  - Starts next Thursday

- Course web page: http://www.d2.mpi-inf.mpg.de/gm
  - Slides
  - Pointers to books and papers
  - Homework assignments

- “Semesterapparat” (course reserve shelf) in the library

- Mailing list: see the web page for how to subscribe

Page 3

Intro

Exercises & Exam

- Exercises:
  - Typically one assignment per week
  - Typically from Wednesday → Wednesday
  - Theoretical and practical exercises
  - Starts this Thursday with a Matlab primer
  - Final grade: 50% exercises, 50% oral exam
    (the oral exam has to be passed, obviously!)

- Exam
  - Oral exam at the end of the semester
  - Can be taken in English or German

- Tutors
  - Eldar Insafutdinov ([email protected])
  - Evgeny Levinkov ([email protected])

Page 4

Intro

Related Classes @UdS

- High-Level Computer Vision (SS), Fritz & Schiele

- Machine Learning (WS), Hein

- Statistical Learning I+II (SS, WS), Lengauer

- Optimization I+II, Convex Optimization (SS, WS), . . .

- Pattern and Speech Recognition (WS), Klakow

Page 5

Intro

Offers in our Research Group

- Master's and Bachelor's theses

- HiWi positions, etc.

in

- Topics in machine learning

- Topics in computer vision

- Topics in machine learning applied to computer vision

- Come talk to us!

Page 6

Intro

Literature

- All books are in a “Semesterapparat”
- Main book for the graphical model part:
  - Barber, Bayesian Reasoning and Machine Learning, Cambridge University Press, 2011, ISBN-13: 978-0521518147, http://tinyurl.com/3flppuo
- Extra references:
  - Bishop, Pattern Recognition and Machine Learning, Springer New York, 2006, ISBN-13: 978-0387310732
  - Koller, Friedman, Probabilistic Graphical Models: Principles and Techniques, The MIT Press, 2009, ISBN-13: 978-0262013192
  - MacKay, Information Theory, Inference and Learning Algorithms, Cambridge University Press, 2003, ISBN-13: 978-0521642989

Page 7

Intro

Literature

Page 8

Page 9

Intro

Topic overview 2016/17

- Recap: Probability and Decision Theory (today)
- Graphical Models
  - Basics (directed, undirected, factor graphs)
  - Inference
  - Learning
- Inference
  - Deterministic inference (sum-product, junction tree)
  - Approximate inference (loopy BP, sampling, variational)
- Application to Computer Vision Problems
  - Body pose estimation
  - Object detection
  - Semantic segmentation
  - Image denoising
  - . . .

Page 10

Intro

Today’s topics

- Overview: Machine Learning
  - What is machine learning?
  - Different problem settings and examples

- Probability theory

- Decision theory, inference and decision

Page 11

Machine Learning

Machine Learning

Overview

Page 12

Machine Learning

Machine learning – what’s that?

- Do you already use machine learning systems?

- Can you think of an application?

- Can you define the term “machine learning”?

Page 13

Machine Learning

- Goal of machine learning:
  - Machines that learn to perform a task from experience

- We can formalize this as

  y = f(x; w)    (1)

  where y is called the output variable, x the input variable, and w the model parameters (typically learned)

- Classification vs. regression:
  - regression: y continuous
  - classification: y discrete (e.g. class membership)

Page 14

Machine Learning

- Goal of machine learning:
  - Machines that learn to perform a task from experience

- We can formalize this as

  y = f(x; w)    (2)

  where y is called the output variable, x the input variable, and w the model parameters (typically learned)

- learn . . . adjust the parameters w

- . . . a task . . . the function f

- . . . from experience, using a training dataset D, where either D = {x1, . . . , xn} or D = {(x1, y1), . . . , (xn, yn)}
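To make the abstraction concrete, here is a minimal sketch (not from the slides): f(x; w) is a linear classifier, and “learning” is nothing more than adjusting w on a toy training dataset D. The perceptron-style update is an illustrative choice, not the course's prescribed method.

```python
# Minimal sketch: y = f(x; w) as a linear classifier whose parameters w
# are adjusted "from experience", i.e. on D = {(x1, y1), ..., (xn, yn)}.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))               # inputs x
y = np.where(X[:, 0] + X[:, 1] > 0, 1, -1)  # made-up, linearly separable labels

def f(x, w):
    """The learned function y = f(x; w)."""
    return 1 if x @ w > 0 else -1

w = np.zeros(2)
for _ in range(10):                  # "experience": passes over D
    for xi, yi in zip(X, y):
        if f(xi, w) != yi:           # mistake-driven adjustment of w
            w += yi * xi

acc = np.mean([f(xi, w) == yi for xi, yi in zip(X, y)])
print("learned w:", w.round(2), "training accuracy:", acc)
```

Here y is discrete ({−1, +1}), so this is classification; returning the raw value x @ w instead of thresholding it would turn the same template into regression.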

Page 15

Machine Learning

Different Scenarios

- Unsupervised Learning

- Supervised Learning

- Reinforcement Learning

- Let’s discuss

Page 16

Machine Learning

Supervised Learning

- We are given pairs of training examples from X × Y

  D = {(x1, y1), (x2, y2), . . . , (xn, yn)}    (3)

- The goal is to learn the relationship between x and y

- Given a new example x, predict y

  y = f(x; w)    (4)

- We want to generalize to unseen data

Page 17

Machine Learning

Supervised Learning – Examples

Face Detection

Page 18

Machine Learning

Supervised Learning – Examples

Image Classification

Page 19

Machine Learning

Supervised Learning – Examples

Semantic Image Segmentation

Page 20

Machine Learning

Supervised Learning – Examples

Body Part Estimation (in Kinect)
Figure from Decision Tree Fields, Nowozin et al., ICCV 2011

Page 21

Machine Learning

Supervised Learning – Examples

- Person identification

- Credit card fraud detection

- Industrial inspection

- Speech recognition

- Action classification in videos

- Human body pose estimation

- Visual object detection

- Predicting the survival rate of a patient

- ...

Page 22

Machine Learning

Supervised Learning - Models

Flashing more keywords

- Multilayer Perceptron (backpropagation)

- (Deep) Convolutional Neural Networks (backpropagation)

- Linear Regression, Logistic Regression

- Support Vector Machine (SVM)

- Boosting

- Graphical models

Page 23

Machine Learning

Unsupervised Learning

- We are given some input data points

  D = {x1, x2, . . . , xn}    (5)

- Goals:
  - Determine the data distribution p(x) → density estimation
  - Visualize the data by projections → dimensionality reduction
  - Find groupings of the data → clustering
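As an illustration of the third goal, here is a minimal k-means clustering sketch in numpy (not from the slides; the data, k, and the initialization are made up):

```python
# Minimal k-means sketch on unlabeled data D = {x1, ..., xn}.
import numpy as np

rng = np.random.default_rng(0)
D = np.vstack([rng.normal(-2, 1, size=(100, 2)),   # two made-up blobs
               rng.normal(+2, 1, size=(100, 2))])

k = 2
centers = D[rng.choice(len(D), size=k, replace=False)]  # random init
for _ in range(20):
    # Assignment step: each point joins its nearest center.
    dists = np.linalg.norm(D[:, None, :] - centers[None, :, :], axis=2)
    assign = dists.argmin(axis=1)
    # Update step: each center moves to the mean of its points
    # (empty clusters are ignored here for brevity).
    centers = np.array([D[assign == j].mean(axis=0) for j in range(k)])

print("cluster centers:\n", centers.round(2))
```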

Page 24

Machine Learning

Unsupervised Learning – Examples

Image Priors for Denoising

Page 25

Machine Learning

Unsupervised Learning – Examples

Image Priors for Inpainting

Image from “A generative perspective on MRFs in low-level vision”, Schmidt et al., CVPR 2010

Black line: statistics from original images; blue and red: statistics after applying two different algorithms

Page 26

Machine Learning

Unsupervised Learning – Examples

Human Shape Model
SCAPE: Shape Completion and Animation of People, Anguelov et al.

Page 27

Machine Learning

Unsupervised Learning – Examples

- Clustering scientific publications according to topics

- A generative model for human motion

- Generating training data for the Microsoft Kinect Xbox controller

- Clustering Flickr images

- Novelty detection, predicting outliers

- Anomaly detection in visual inspection

- Video surveillance

Page 28

Machine Learning

Unsupervised Learning – Models

Just flashing some keywords (→ Machine Learning)

- Mixture Models

- Neural Networks

- K-Means

- Kernel Density Estimation

- Principal Component Analysis (PCA)

- Graphical Models (here)

Page 29

Machine Learning

Reinforcement Learning

- Setting: given a situation, find an action to maximize a reward function

- Feedback:
  - we only get feedback on how well we are doing
  - we do not get feedback on what the best action would be (“indirect teaching”)

- Feedback is given as a reward:
  - each action yields a reward, or
  - a reward is given at the end (e.g. the robot has found its goal, the computer has won a game of Backgammon)

- Exploration: try out new actions

- Exploitation: use known actions that yield high rewards

- Find a good trade-off between exploration and exploitation
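A minimal sketch of this trade-off (not from the slides): an ε-greedy agent on a three-armed bandit explores a random action with probability ε and otherwise exploits its current reward estimates. The arm means and ε are made up.

```python
# Minimal epsilon-greedy sketch on a 3-armed bandit.
import numpy as np

rng = np.random.default_rng(0)
true_means = np.array([0.2, 0.5, 0.8])  # unknown to the agent
estimates = np.zeros(3)                 # running reward estimates
counts = np.zeros(3)
epsilon = 0.1

for t in range(10_000):
    if rng.random() < epsilon:
        a = rng.integers(3)             # exploration: try a random action
    else:
        a = int(estimates.argmax())     # exploitation: best estimate so far
    r = rng.normal(true_means[a], 1.0)  # reward feedback only, no "best action"
    counts[a] += 1
    estimates[a] += (r - estimates[a]) / counts[a]  # incremental mean

print("estimated means:", estimates.round(2))
```

With ε = 0 the agent can get stuck on a suboptimal arm forever; with ε = 1 it never uses what it has learned.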

Page 30

Machine Learning

Variations of the general theme

- All problems fall into these broad categories

- But your problem will surely have some extra twists

- Many variations of the aforementioned problems are studied separately

- Let’s look at some ...

Page 31

Machine Learning

Semi-Supervised Learning

- We are given a dataset of l labeled examples Dl = {(x1, y1), . . . , (xl, yl)}, as in supervised learning

- Additionally, we are given a set of u unlabeled examples Du = {xl+1, . . . , xl+u}, as in unsupervised learning

- The goal is y = f(x; w)

- Question: how can we utilize the extra information in Du?
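One classic answer, shown purely as an illustration, is self-training: fit a classifier on Dl, pseudo-label the unlabeled points the model is most confident about, and refit. The sketch below assumes scikit-learn's LogisticRegression; the data and the confidence threshold are made up.

```python
# Minimal self-training sketch on Dl (labeled) plus Du (unlabeled).
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
Xl = np.vstack([rng.normal(-2, 1, (5, 2)), rng.normal(2, 1, (5, 2))])
yl = np.array([0] * 5 + [1] * 5)                 # small labeled set Dl
Xu = np.vstack([rng.normal(-2, 1, (200, 2)),
                rng.normal(2, 1, (200, 2))])     # unlabeled set Du

clf = LogisticRegression().fit(Xl, yl)
for _ in range(5):
    conf = clf.predict_proba(Xu).max(axis=1)
    keep = conf > 0.95                           # confidently predicted points
    X_aug = np.vstack([Xl, Xu[keep]])
    y_aug = np.concatenate([yl, clf.predict(Xu[keep])])
    clf = LogisticRegression().fit(X_aug, y_aug) # refit on augmented set

print("pseudo-labeled points used:", int(keep.sum()))
```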

Page 32

Machine Learning

Semi-Supervised Learning: Two Moons

- Two labeled examples (red and blue) and additional unlabeled black dots

Figure: Two moons

Page 33

Machine Learning

Transductive Learning

- We are given a set of labeled examples

  D = {(x1, y1), . . . , (xn, yn)}    (6)

- Additionally, we know the test data points {x1^te, . . . , xm^te} (not their labels!)

- Can we do better by including this knowledge?

- This should be easier than making predictions for the entire set X

Page 34

Machine Learning

On-line Learning

- The training data is presented step by step and is never available in its entirety

- At each time step t we are given a new datapoint xt (or (xt, yt))

- When is online learning a sensible scenario?
  - We want to continuously update the model – we can train a model with little data, but the model should become better over time as more data is available (similar to how humans learn)
  - We have limited storage for data and the model – a viable setting for large-scale datasets (e.g. the size of the internet)

- How do we learn in this scenario?
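One common answer, sketched here only as an illustration, is online stochastic gradient descent: each datapoint updates the model once and is then discarded, so only the parameters w are ever stored.

```python
# Minimal online-learning sketch: SGD on a simulated endless stream.
import numpy as np

rng = np.random.default_rng(0)
w_true = np.array([1.0, -2.0, 0.5])  # made-up ground truth, never stored
w = np.zeros(3)
lr = 0.05

for t in range(5_000):
    x = rng.normal(size=3)               # datapoint x_t arrives ...
    y = x @ w_true + 0.1 * rng.normal()  # ... with a noisy label y_t
    w -= lr * (w @ x - y) * x            # one squared-loss gradient step;
                                         # (x, y) is then discarded

print("estimated w:", w.round(2))
```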

Page 35

Machine Learning

Large-Scale Learning

- Learning with millions of examples

- Study fast learning algorithms (e.g. parallelizable, special hardware)

- Problems of storing the data, computing the features, etc.

- There is no strict definition of “large-scale”

- Small-scale learning: the limiting factor is the number of examples

- Large-scale learning: limited by the maximal time for computation (and/or maximal storage capacity)

Page 36

Machine Learning

Active Learning

- We are given a set of examples

  D = {x1, . . . , xn}    (7)

- The goal is to learn y = f(x; w)

- Each label yi costs something, e.g. Ci ∈ R+

- Question: how can we learn well while paying little?

- This is almost always the case; labeling is expensive
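A classic heuristic for this setting, shown as an illustration only, is uncertainty sampling: spend the labeling budget on the points the current model is least sure about. The sketch assumes scikit-learn's LogisticRegression; the data, seed set, and budget are made up.

```python
# Minimal uncertainty-sampling sketch; y_true plays the (costly) oracle.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-1, 1, (500, 2)), rng.normal(1, 1, (500, 2))])
y_true = np.array([0] * 500 + [1] * 500)        # oracle labels, paid per query

labeled = [0, 1, 2, 3, 4, 500, 501, 502, 503, 504]  # small seed set
for _ in range(20):                                 # budget: 20 more labels
    clf = LogisticRegression().fit(X[labeled], y_true[labeled])
    proba = clf.predict_proba(X)[:, 1]
    uncertainty = np.abs(proba - 0.5)       # 0 = model is maximally unsure
    uncertainty[labeled] = np.inf           # never pay twice for a label
    labeled.append(int(uncertainty.argmin()))

print("accuracy with", len(labeled), "labels:", clf.score(X, y_true))
```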

Page 37

Machine Learning

Structured Output Learning

- We are given a set of training examples D = {(x1, y1), . . . , (xn, yn)}, but y ∈ Y contains more structure than y ∈ R or y ∈ {−1, 1}

- Consider binary image segmentation:
  - y is an entire image labeling
  - Y is the set of all labelings, of size 2^#pixels

- Other examples: y could be a graph, a tree, a ranking, . . .

- The goal is to learn a function f(x, y; w) and predict

  y* = argmax_{y∈Y} f(x, y; w)

Page 38

Machine Learning

Some final comments

- All topics are under active development and research

- Supervised classification: basically understood

- Broad range of applications, many exciting developments

- Adopting an “ML view” has far-reaching consequences; it touches problems of the empirical sciences in general

Page 39

Probability Theory

Probability Theory

Brief Review

Page 40

Probability Theory

Brief Review

- A random variable (RV) X can take values from some discrete set of outcomes X.

- We usually use the short-hand notation

  p(x) for p(X = x) ∈ [0, 1]    (8)

  for the probability that X takes value x

- With

  p(X),    (9)

  we denote the probability distribution over X

Page 41

Probability Theory

Brief Review

- Two random variables (RVs) are called independent if

  p(X = x, Y = y) = p(X = x) p(Y = y)    (10)

- Joint probability (of X and Y)

  p(x, y) instead of p(X = x, Y = y)    (11)

- Conditional probability

  p(x|y) instead of p(X = x | Y = y)    (12)

Page 42

Probability Theory

The Rules of Probability

- Sum rule

  p(X) = ∑_{y∈Y} p(X, Y = y)    (13)

  — we “marginalize out y”; p(X = x) is also called a marginal probability

- Product rule

  p(X, Y) = p(Y|X) p(X)    (14)

- And as a consequence: Bayes' Theorem or Bayes' Rule

  p(Y|X) = p(X|Y) p(Y) / p(X)    (15)

Page 43

Probability Theory

Vocabulary

- Joint probability

  p(xi, yj) = nij / N

- Marginal probability

  p(xi) = ci / N

- Conditional probability

  p(yj | xi) = nij / ci

Here nij is the number of observations with X = xi and Y = yj, ci = ∑_j nij is the count for X = xi, and N = ∑_ij nij is the total count.
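These definitions are easy to check numerically. A minimal sketch (not from the slides) with a made-up count table:

```python
# Joint, marginal and conditional probabilities read off a count table
# n[i, j] (rows: outcomes x_i, columns: outcomes y_j).
import numpy as np

n = np.array([[30, 10],
              [20, 40]])
N = n.sum()                      # N   = sum_ij n_ij
c = n.sum(axis=1)                # c_i = sum_j n_ij

joint = n / N                    # p(x_i, y_j) = n_ij / N
marginal = c / N                 # p(x_i)      = c_i / N
conditional = n / c[:, None]     # p(y_j|x_i)  = n_ij / c_i

print(joint, marginal, conditional, sep="\n")
```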

Page 44

Probability Theory

Probability Densities

- Now X is a continuous random variable, e.g. taking values in R

- The probability that X takes a value in the interval (a, b) is

  p(X ∈ (a, b)) = ∫_a^b p(x) dx    (16)

  and we call p(x) the probability density over x

Page 45

Probability Theory

Probability Densities

- p(x) must satisfy the following conditions:

  p(x) ≥ 0    (17)

  ∫_{−∞}^{∞} p(x) dx = 1    (18)

- The probability that x lies in (−∞, z) is given by the cumulative distribution function

  P(z) = ∫_{−∞}^{z} p(x) dx    (19)

Page 46

Probability Theory

Probability Densities

Figure: Probability density of a continuous variable

Page 47

Probability Theory

Expectation and Variances

- Expectation

  E[f] = ∑_{x∈X} p(x) f(x)    (20)

  E[f] = ∫_{x∈X} p(x) f(x) dx    (21)

- Sometimes we denote the distribution that we take the expectation over as a subscript, e.g.

  E_{p(·|y)}[f] = ∑_{x∈X} p(x|y) f(x)    (22)

- Variance

  var[f] = E[(f(x) − E[f(x)])²]    (23)
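A minimal numeric check of Eqs. (20) and (23) for a discrete distribution (not from the slides; the outcomes and probabilities are made up):

```python
# Expectation and variance of f under a discrete distribution p.
import numpy as np

x = np.array([0, 1, 2, 3])
p = np.array([0.1, 0.2, 0.3, 0.4])  # p(x), sums to 1
f = x ** 2                          # f evaluated at each outcome

E_f = np.sum(p * f)                 # E[f]   = sum_x p(x) f(x)
var_f = np.sum(p * (f - E_f) ** 2)  # var[f] = E[(f(x) - E[f(x)])^2]
print("E[f] =", E_f, " var[f] =", var_f)
```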

Page 48

Decision Theory

Decision Theory

Page 49

Decision Theory

Digit Classification

- Classify digits “a” versus “b”

Figure: The digits “a” and “b”

- Goal: classify new digits such that the error probability is minimized

Page 50

Decision Theory

Digit Classification - Priors

Prior Distribution

- How often do the letters “a” and “b” occur?

- Let us assume

  C1 = a,  p(C1) = 0.75    (24)
  C2 = b,  p(C2) = 0.25    (25)

- The prior has to be a distribution; in particular

  ∑_{k=1,2} p(Ck) = 1    (26)

Page 51

Decision Theory

Digit Classification - Class Conditionals

- We describe every digit using some feature vector, e.g.
  - the number of black pixels in each box
  - the relation between width and height

- Likelihood: how likely is it that x was generated from p(· | a), resp. p(· | b)?

Page 52

Decision Theory

Digit Classification

- Which class should we assign x to?

- The answer:

- Class a

Page 53

Decision Theory

Digit Classification

- Which class should we assign x to?

- The answer:

- Class b

Page 54

Decision Theory

Digit Classification

- Which class should we assign x to?

- The answer:

- Class a, since p(a) = 0.75

Page 55

Decision Theory

Bayes Theorem

- How do we formalize this?

- We already mentioned Bayes' Theorem:

  p(Y|X) = p(X|Y) p(Y) / p(X)    (27)

- Now we apply it:

  p(Ck|x) = p(x|Ck) p(Ck) / p(x) = p(x|Ck) p(Ck) / ∑_j p(x|Cj) p(Cj)    (28)
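A minimal numeric sketch of Eq. (28) (not from the slides, though the priors match the digit example; the likelihoods are made up):

```python
# Posterior class probabilities from priors and class conditionals.
import numpy as np

prior = np.array([0.75, 0.25])     # p(C1) = p(a), p(C2) = p(b)
likelihood = np.array([0.1, 0.3])  # p(x|C1), p(x|C2) for the observed x

joint = likelihood * prior         # p(x|Ck) p(Ck)
posterior = joint / joint.sum()    # divide by p(x) = sum_j p(x|Cj) p(Cj)
print("p(Ck|x) =", posterior)      # [0.5, 0.5]: the prior pulls towards "a"
```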

Page 56

Decision Theory

Bayes Theorem

- Some terminology, repeated from the last slide:

  p(Ck|x) = p(x|Ck) p(Ck) / p(x) = p(x|Ck) p(Ck) / ∑_j p(x|Cj) p(Cj)    (29)

- We use the following names:

  Posterior = (Likelihood × Prior) / Normalization Factor    (30)

- Here the normalization factor is easy to compute. Keep an eye out for it; it will haunt us until the end of this class (and longer :) )

- It is also called the Partition Function, common symbol Z

Page 57

Decision Theory

Bayes Theorem

(Plots: Likelihood; Likelihood × Prior; Posterior = (Likelihood × Prior) / Normalization Factor)

Page 58

Decision Theory

How to Decide?

- Two-class problem C1, C2, plotting Likelihood × Prior

Page 59

Decision Theory

Minimizing the Error

p(error) = p(x ∈ R2, C1) + p(x ∈ R1, C2)    (31)
         = p(x ∈ R2 | C1) p(C1) + p(x ∈ R1 | C2) p(C2)    (32)
         = ∫_{R2} p(x|C1) p(C1) dx + ∫_{R1} p(x|C2) p(C2) dx    (33)

Page 60

Decision Theory

General Loss Functions

- So far we considered only the misclassification error

- This is also referred to as the 0/1 loss

- Now suppose we are given a more general loss function

  ∆ : Y × Y → R+    (34)
  (y, ŷ) ↦ ∆(y, ŷ)    (35)

- How do we read this?

- ∆(y, ŷ) is the cost we have to pay if y is the true class but we predict ŷ instead

Page 61

Decision Theory

Example: Predicting Cancer

∆ : Y × Y → R+    (36)
(y, ŷ) ↦ ∆(y, ŷ)    (37)

- Given: an X-ray image. Question: cancer, yes or no? Should the patient have another medical check?

                    diagnosis: cancer   diagnosis: normal
  truth: cancer            0                  1000
  truth: normal            1                     0

- For discrete sets Y this is a loss matrix

Page 62

Decision Theory

Digit Classification

- Which class should we assign x to? (p(a) = p(b) = 0.5)

- The answer:

- It depends on the loss

Page 63

Decision Theory

Minimizing Expected Loss (or Error)

- The expected loss for x (averaged over all decisions)

  E[∆] = ∑_{k=1,...,K} ∑_{j=1,...,K} ∫_{Rj} ∆(Ck, Cj) p(x, Ck) dx    (38)

- And how do we predict? Decide on one y!

  y* = argmin_{y∈Y} ∑_{k=1,...,K} ∆(Ck, y) p(Ck|x)    (39)
     = argmin_{y∈Y} E_{p(·|x)}[∆(·, y)]    (40)
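A minimal sketch of decision rule (39) (not from the slides), using the cancer loss matrix from the earlier example and a made-up posterior:

```python
# Decide by minimizing the expected loss under the posterior.
import numpy as np

classes = ["cancer", "normal"]
loss = np.array([[0, 1000],        # loss[k, j]: truth k, decision j
                 [1,    0]])
posterior = np.array([0.01, 0.99]) # p(cancer|x), p(normal|x): made up

expected = posterior @ loss        # expected loss of each decision
decision = int(expected.argmin())
print("expected losses:", expected, "-> decide:", classes[decision])
```

Even with only a 1% cancer posterior, the asymmetric loss makes “cancer” the expected-loss-optimal decision, whereas the 0/1 loss would decide “normal”.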

Page 64

Decision Theory

Inference and Decision

- We broke the process down into two steps:
  - Inference: obtaining the probabilities p(Ck|x)
  - Decision: obtaining the optimal class assignment

- Two steps!!

- The probabilities p(·|x) represent our belief about the world

- The loss ∆ tells us what to do with it!

- The 0/1 loss implies deciding for the maximum probability (exercise)

Page 65

Decision Theory

Three Approaches to Solve Decision Problems

1. Generative models: infer the class conditionals

   p(x|Ck), k = 1, . . . , K    (41)

   then combine using Bayes' Theorem: p(Ck|x) = p(x|Ck) p(Ck) / p(x)

2. Discriminative models: infer the posterior probabilities directly

   p(Ck|x)    (42)

3. Find a discriminative function minimizing the expected loss ∆

   f : X → {1, . . . , K}    (43)

Let’s discuss these options

Page 66

Decision Theory

Generative Models

Pros:

- The name “generative” comes from the fact that we can generate samples from the learnt distribution

- We can infer p(x|Ck) (or p(x) for short)

Cons:

- With high dimensionality of x ∈ X we need a large training set to determine the class conditionals

- We may not be interested in all quantities

Page 67

Decision Theory

Discriminative Models

Pros:

- No need to model p(x|Ck) (i.e. generally easier)

Cons:

- No access to the model p(x|Ck)

Page 68

Decision Theory

Discriminative Functions

“When solving a problem of interest, do not solve a harder / more general problem as an intermediate step.”
– Vladimir Vapnik

Pros:

- One integrated system; we directly estimate the quantity of interest

Cons:

- Need ∆ at training time – revision requires re-learning

- No access to probabilities or uncertainty, thus difficult to reject a decision

- Prominent example: Support Vector Machines (SVMs)

Page 69

Decision Theory

Next Time ...

- . . . we will meet our new friends:
