8/10/2019 Ch 2_ Part 1.pptx
Introduction

Statistical decision theory deals with situations where decisions must be made under uncertainty; its goal is to provide a rational framework for dealing with such situations.

The Bayesian approach is a particular way of formulating and dealing with statistical decision problems. More specifically, it offers a method of formalizing a priori beliefs and of combining them with the available observations.
Introduction (cont.)

The sea bass/salmon example.

State of nature, prior:
The state of nature is a random variable.
If the catch of salmon and sea bass is equiprobable:
P(ω1) = P(ω2) (uniform priors)
P(ω1) + P(ω2) = 1
Pattern Classification, Chapter 2 (Part 1)

[Figure: a salmon and a sea bass]
Decision Rule From Only Priors

The a priori (prior) probability reflects our knowledge of how likely we expect a certain state of nature to be before we can actually observe it.

What is a reasonable decision rule if
the only available information is the prior, and
the cost of any incorrect classification is equal?

Decide ω1 if P(ω1) > P(ω2); otherwise decide ω2.

What can we say about this decision rule? It seems reasonable, but it will always choose the same fish, and if the priors are uniform it will behave poorly.
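As a sketch, the prior-only rule above can be written as follows (the class names and prior values are illustrative, not taken from the text):

```python
def decide_from_priors(priors):
    """Prior-only rule: always pick the state of nature with the larger prior."""
    return max(priors, key=priors.get)

# Illustrative (assumed) priors; with these, every fish is labeled "salmon".
priors = {"salmon": 0.7, "sea bass": 0.3}
print(decide_from_priors(priors))  # "salmon"
```

Whatever fish actually comes down the conveyor, the rule outputs the same label, which is exactly its weakness when the priors are uniform.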
Class-Conditional Density

The class-conditional probability density function is the probability density function for x, our feature, given that the state of nature is ωj:

p(x | ω1) and p(x | ω2) describe the difference in lightness between the populations of sea bass and salmon.
Class-Conditional Probabilities
Posterior Probability: Bayes Formula

Combine the prior and the class-conditional probabilities. The posterior probability is the probability of a certain state of nature given our observables:

P(ωj | x) = p(x | ωj) P(ωj) / p(x),   where p(x) = Σj p(x | ωj) P(ωj) is the evidence.
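A minimal sketch of Bayes' formula as a computation, assuming two classes and illustrative likelihood values at a fixed observation x:

```python
def posteriors(likelihoods, priors):
    """Bayes formula: P(wj|x) = p(x|wj) P(wj) / p(x),
    with the evidence p(x) = sum_j p(x|wj) P(wj)."""
    evidence = sum(l * p for l, p in zip(likelihoods, priors))
    return [l * p / evidence for l, p in zip(likelihoods, priors)]

# Illustrative (assumed) values of p(x|w1) and p(x|w2) at some fixed x:
post = posteriors([0.6, 0.2], [0.5, 0.5])  # posteriors always sum to 1
```

Dividing by the evidence is what turns the products p(x | ωj) P(ωj) into probabilities that sum to one.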
Posterior Probability: Bayes Formula (cont.)

[Figure: posterior probabilities plotted against the feature x]
Likelihood Ratio Test (LRT)

The rule for a two-class problem:

if P(ω1 | x) > P(ω2 | x), choose ω1; else choose ω2

Or, in a more compact form, applying Bayes rule (the evidence p(x) cancels):

choose ω1 if p(x | ω1) / p(x | ω2) > P(ω2) / P(ω1); else choose ω2
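The compact form of the LRT can be sketched directly; the density and prior values used below are assumed for illustration:

```python
def lrt(p_x_w1, p_x_w2, prior1, prior2):
    """Likelihood ratio test: choose w1 iff
    p(x|w1)/p(x|w2) > P(w2)/P(w1); otherwise choose w2."""
    return 1 if p_x_w1 / p_x_w2 > prior2 / prior1 else 2
```

With uniform priors the threshold is 1, and the test reduces to picking the class whose density is larger at x.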
Probability of Error

For a given observation x, we would be inclined to let the posterior govern our decision. The probability of error is:

P(error | x) = P(ω1 | x) if we decide ω2, and P(ω2 | x) if we decide ω1.
How good is the LRT decision rule?

To answer this question, it is convenient to express P[error] in terms of the posterior p(error | x):

P[error] = ∫ p(error | x) p(x) dx

The optimal decision rule will minimize p(error | x) at every value of x in feature space, so that the integral above is minimized.
Minimizing the Probability of Error

Decide ω1 if P(ω1 | x) > P(ω2 | x); otherwise decide ω2.

Therefore:

P(error | x) = min [ P(ω1 | x), P(ω2 | x) ]   (Bayes decision)
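The minimum-error rule above and its conditional error can be sketched together (posterior values passed in are assumed illustrative inputs):

```python
def min_error_decision(post1, post2):
    """Bayes minimum-error rule: decide the class with the larger posterior.
    The conditional error is the smaller posterior, min[P(w1|x), P(w2|x)]."""
    decision = 1 if post1 > post2 else 2
    return decision, min(post1, post2)
```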
Bayesian Decision Theory: Continuous Features

Generalization of the preceding ideas:
Use of more than one feature (e.g., length and lightness)
Use of more than two states of nature
Allowing actions, not only deciding on the state of nature
Introducing a loss function that is more general than the probability of error (e.g., errors are not equally costly)

Allowing actions other than classification primarily allows the possibility of rejection. The loss function states how costly each action taken is.
Loss Functions

A loss function states exactly how costly each action is.

Let {ω1, ω2, …, ωc} be the set of c states of nature (or categories).
Let {α1, α2, …, αa} be the set of a possible actions.
Let the loss function λ(αi | ωj) be the loss incurred for taking action αi when the state of nature is ωj.

A general decision rule is a function α(x) that tells us which action to take for every possible observation.
Overall Risk

The expected loss (the risk, or conditional risk) from taking action αi is:

R(αi | x) = Σj λ(αi | ωj) P(ωj | x),   j = 1, …, c,   for i = 1, …, a

The overall risk R of a decision rule α(x) is the expected loss over all x:

R = ∫ R(α(x) | x) p(x) dx

Minimizing R implies minimizing the conditional risk R(α(x) | x) at every x.
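A sketch of the conditional risk and the resulting minimum-risk action; the loss matrix and posterior values below are assumed examples:

```python
def conditional_risks(loss, posteriors):
    """R(ai|x) = sum_j loss[i][j] * P(wj|x), one value per action ai."""
    return [sum(lij * p for lij, p in zip(row, posteriors)) for row in loss]

def bayes_action(loss, posteriors):
    """Choose the action with the minimum conditional risk."""
    risks = conditional_risks(loss, posteriors)
    return min(range(len(risks)), key=risks.__getitem__)

zero_one_loss = [[0, 1], [1, 0]]  # assumed example: all errors equally costly
print(bayes_action(zero_one_loss, [0.7, 0.3]))  # action 0, i.e. decide w1
```

With a zero-one loss the conditional risk of each action equals its probability of error, so the minimum-risk action is simply the class with the largest posterior.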
The Minimum Overall Risk

The Bayes decision rule gives us a method for minimizing the overall risk: select the action that minimizes the conditional risk R(αi | x). The resulting overall risk R is minimal and is called the Bayes risk, the best performance that can be achieved.
Two-Category Classification Examples

Consider two classes and two actions:
α1: deciding ω1
α2: deciding ω2
λij = λ(αi | ωj) is the loss incurred for deciding ωi when the true state of nature is ωj.

Conditional risks:
R(α1 | x) = λ11 P(ω1 | x) + λ12 P(ω2 | x)
R(α2 | x) = λ21 P(ω1 | x) + λ22 P(ω2 | x)
Two-Category Classification Examples (cont.)

Our rule is the following: if R(α1 | x) < R(α2 | x), take action α1 (decide ω1).

In terms of posteriors, decide ω1 if:
(λ21 − λ11) P(ω1 | x) > (λ12 − λ22) P(ω2 | x)
and decide ω2 otherwise.

Or, expanding via Bayes rule, decide ω1 if:
(λ21 − λ11) p(x | ω1) P(ω1) > (λ12 − λ22) p(x | ω2) P(ω2)
and decide ω2 otherwise.
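A sketch checking that the posterior form of the rule agrees with comparing the two conditional risks directly (the loss values and posteriors are assumed examples):

```python
def decide_by_risks(posteriors, loss):
    """Compare R(a1|x) and R(a2|x) directly."""
    (l11, l12), (l21, l22) = loss
    p1, p2 = posteriors
    r1 = l11 * p1 + l12 * p2
    r2 = l21 * p1 + l22 * p2
    return 1 if r1 < r2 else 2

def decide_by_margin(posteriors, loss):
    """Equivalent form: decide w1 iff (l21 - l11) P(w1|x) > (l12 - l22) P(w2|x)."""
    (l11, l12), (l21, l22) = loss
    p1, p2 = posteriors
    return 1 if (l21 - l11) * p1 > (l12 - l22) * p2 else 2
```

The second form follows from the first by moving the λ11 and λ22 terms across the inequality, so the two functions agree for every input.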
Likelihood Ratio

The preceding rule is equivalent to the following: take action α1 (decide ω1) if

p(x | ω1) / p(x | ω2) > [(λ12 − λ22) / (λ21 − λ11)] · [P(ω2) / P(ω1)]

Otherwise take action α2 (decide ω2).
Optimal Decision Property

If the likelihood ratio exceeds a threshold value that is independent of the input pattern x, we can take optimal actions.
Likelihood Ratio Test: An Example
Likelihood Ratio Test: Example 2
Example 2: Answer
Likelihood Ratio Test: Example 3

Select the optimal decision where:
Ω = {ω1, ω2}
p(x | ω1) = N(2, 0.5)
p(x | ω2) = N(1.5, 0.2)
P(ω1) = 2/3
P(ω2) = 1/3, and
λ = | λ11 λ12 | = | 1 2 |
    | λ21 λ22 |   | 3 4 |
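A numeric sketch of Example 3, with two assumptions made explicit: N(μ, σ²) is read as mean and variance, and since the loss matrix did not survive extraction cleanly, a zero-one loss is assumed, under which the minimum-risk rule reduces to comparing p(x | ωi) P(ωi):

```python
import math

def normal_pdf(x, mean, var):
    """Univariate normal density; N(mean, var) read as mean/variance (assumed)."""
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def decide_example3(x):
    """Under an assumed zero-one loss, decide w1 iff p(x|w1) P(w1) > p(x|w2) P(w2)."""
    g1 = normal_pdf(x, 2.0, 0.5) * (2 / 3)  # p(x|w1) P(w1)
    g2 = normal_pdf(x, 1.5, 0.2) * (1 / 3)  # p(x|w2) P(w2)
    return 1 if g1 > g2 else 2
```

For x well above 2, the broader and more probable class ω1 wins; near 1.5, the narrow ω2 density dominates despite its smaller prior.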