Bayesian Reasoning A/Prof Geraint Lewis A/Prof Peter Tuthill Thomas Bayes (1702-1761) Pierre-Simon...

Date post: 20-Dec-2015
Bayesian Reasoning

A/Prof Geraint Lewis A/Prof Peter Tuthill

Thomas Bayes (1702-1761)

Pierre-Simon Laplace (1749-1827)

“Probability theory is nothing but common sense, reduced to calculation.”

Laplace

Are you a Bayesian or Frequentist?

“There are 3 kinds of lies: Lies, Damned Lies, and Statistics” Benjamin Disraeli

...and Bayesian Statistics

Frequentists

Fig 1. A Frequentist Statistician

Fig 2. Bayesian Statistics Conference

What is Inference?

Deductive Inference (Logic), Aristotle, 4th Century B.C.

If A is true then B is true (Major Premise)

A = A,B (in Boolean notation)

A is true (Minor Premise), therefore B is true (conclusion)

B is false (Minor Premise), therefore A is false (conclusion)

} STRONG SYLLOGISMS

Inductive Inference (Plausible Reasoning)

A is false (Minor Premise), therefore B is less plausible

B is true (Minor Premise), therefore A is more plausible

} WEAK SYLLOGISMS

A B

T → T

F ← F

F → f

t ← T

(T, F = certainly true, certainly false; t, f = more plausible, less plausible)
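The syllogisms above can be checked numerically. The following is a minimal sketch, assuming a made-up joint distribution over A and B in which "A true, B false" has probability zero (so that A implies B); the numbers and the helper names `prob` and `cond` are illustrative only and do not come from the slides.

```python
from fractions import Fraction

# Hypothetical joint probabilities over (A, B), chosen so that A implies B:
# the state "A true, B false" is impossible.
joint = {
    (True, True): Fraction(3, 10),
    (True, False): Fraction(0),
    (False, True): Fraction(2, 10),
    (False, False): Fraction(5, 10),
}

def prob(event):
    """Total probability of all (A, B) states for which event(a, b) holds."""
    return sum(p for (a, b), p in joint.items() if event(a, b))

def cond(event, given):
    """Conditional probability P(event | given)."""
    return prob(lambda a, b: event(a, b) and given(a, b)) / prob(given)

is_a = lambda a, b: a
is_b = lambda a, b: b
not_a = lambda a, b: not a
not_b = lambda a, b: not b

# Strong syllogisms: certainty is reached.
strong_1 = cond(is_b, is_a)    # P(B | A)
strong_2 = cond(is_a, not_b)   # P(A | not B)

# Weak syllogisms: plausibility shifts without reaching certainty.
weak_1 = cond(is_a, is_b)      # P(A | B), compared with P(A)
weak_2 = cond(is_b, not_a)     # P(B | not A), compared with P(B)

print(strong_1, strong_2)
print(weak_1, prob(is_a))
print(weak_2, prob(is_b))
```

With these numbers, learning that B is true raises the probability of A, and learning that A is false lowers the probability of B, while the two strong syllogisms give exactly 1 and 0.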

What is Inference?

Deductive Logic: Cause → Effects or outcomes

Inductive Logic: Effects or observations → Possible Causes

What is a Probability?

Frequentists

P(A) = long-run relative frequency of A occurring in identical repeats of an observation

“A” is restricted to propositions about random variables

Bayesians

P(A|B) = real-number measure of the plausibility of proposition A, given (conditional upon) the truth of proposition B

“A” can be any logical proposition

All probabilities are conditional; we must be explicit about what our assumptions B are (no such thing as an absolute probability!)

Probability depends on our state of Knowledge

Urn: 7 Red, 5 Blue

1st draw: 7/12 Red, 5/12 Blue

2nd draw: ?
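The question mark for the second draw can be made concrete. This is a small sketch, assuming the first ball is not returned to the urn (the slide does not say so explicitly); the variable names are illustrative. It shows that the probability assigned to the second draw depends on what we know about the first.

```python
from fractions import Fraction

RED, BLUE = 7, 5          # urn contents from the slide
TOTAL = RED + BLUE

# First draw, with no other information.
p_first_red = Fraction(RED, TOTAL)

# Second draw when the colour of the first ball is known
# (assumption: the first ball is not replaced).
p_second_red_given_first_red = Fraction(RED - 1, TOTAL - 1)
p_second_red_given_first_blue = Fraction(RED, TOTAL - 1)

# Second draw when the first ball was removed but its colour was not seen:
# sum over both possibilities for the first draw.
p_second_red_unknown_first = (
    p_first_red * p_second_red_given_first_red
    + (1 - p_first_red) * p_second_red_given_first_blue
)

print(p_second_red_given_first_red)    # first ball seen to be red
print(p_second_red_given_first_blue)   # first ball seen to be blue
print(p_second_red_unknown_first)      # first ball not seen
```

The urn is physically the same in all three cases; only the state of knowledge differs, and when the first ball is not seen the probability of red on the second draw is the same as on the first.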

The Desiderata of Bayesian Probability Theory

• Degrees of plausibility are represented by real numbers (higher degree of belief represented by a larger number)

• With extra evidence supporting a proposition, the plausibility should increase monotonically up to a limit (certainty).

• Consistency. Multiple ways to arrive at a conclusion must all produce the same answer (see book for additional details)

Logic and Probability

• In the certainty limit, where probabilities go to zero (falsehood) or one (truth), then the sum and product rules reduce to formal Boolean deductive logic (strong syllogisms).

• Bayesian Probability is therefore an extension of formal logic into intermediate states of knowledge.

• Bayesian inference gives a measure of our state of knowledge about nature, not a measure of nature itself.

The two rules underlying probability theory

SUM RULE: $P(A|B) + P(\bar{A}|B) = 1$, where $\bar{A}$ means "A is false"

PRODUCT RULE: P(A,B|C) = P(A|C) P(B|A,C) = P(B|C) P(A|B,C)

[Figure: diagram of all kangaroos divided into Left Handed / Right Handed and Blue Eyes / Brown Eyes, with the "Blue, Left" region marked]
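The sum and product rules can be checked on a table of counts. The sketch below assumes entirely hypothetical kangaroo counts (the slides give no numbers) and verifies that both factorisations in the product rule give the same joint probability for "blue eyes and left handed".

```python
from fractions import Fraction

# Hypothetical counts of kangaroos by (eye colour, handedness).
counts = {
    ("blue", "left"): 10,
    ("blue", "right"): 20,
    ("brown", "left"): 30,
    ("brown", "right"): 40,
}
total = sum(counts.values())

def p(eye=None, hand=None):
    """Probability that a kangaroo matches the given eye colour and/or handedness."""
    matching = sum(
        n for (e, h), n in counts.items()
        if (eye is None or e == eye) and (hand is None or h == hand)
    )
    return Fraction(matching, total)

# Sum rule: P(blue) + P(not blue) = 1.
sum_rule = p(eye="blue") + p(eye="brown")

# Product rule, both orderings.
p_joint = p(eye="blue", hand="left")
p_left_given_blue = p_joint / p(eye="blue")
p_blue_given_left = p_joint / p(hand="left")
via_blue = p(eye="blue") * p_left_given_blue
via_left = p(hand="left") * p_blue_given_left

print(sum_rule, p_joint, via_blue, via_left)
```

Here the background information C of the product rule is simply "the animal is a kangaroo from this population".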

Bayes’ Theorem

Bayes Theorem:

$$P(H_i \mid D, I) = \frac{P(H_i \mid I)\, P(D \mid H_i, I)}{P(D \mid I)}$$

Hi = proposition asserting truth of a hypothesis of interest

I = proposition representing prior information

D = proposition representing the data

P(D|Hi I) = Likelihood: probability of obtaining the data given that the hypothesis is true

P(Hi|I) = Prior: probability of hypothesis before new data

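P(D|I) = Normalization factor (ensures that the posterior probabilities of all hypotheses Hi sum to 1)

P(Hi|D,I) = Posterior: probability of hypothesis after taking the new data into account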
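For a finite set of hypotheses, the terms defined above combine as in the following sketch. The function name `posterior` and the two-hypothesis example (a fair coin versus a two-headed coin, with one observed head) are assumptions made for illustration; they are not taken from the slides.

```python
from fractions import Fraction

def posterior(priors, likelihoods):
    """Apply Bayes' theorem to a discrete set of hypotheses.

    priors[h]      = P(H_h | I)
    likelihoods[h] = P(D | H_h, I)
    Returns a dict of P(H_h | D, I).
    """
    # Normalization factor P(D | I): sum of prior times likelihood over hypotheses.
    evidence = sum(priors[h] * likelihoods[h] for h in priors)
    return {h: priors[h] * likelihoods[h] / evidence for h in priors}

# Hypothetical example: is the coin fair or two-headed, given one observed head?
priors = {"fair": Fraction(1, 2), "two_headed": Fraction(1, 2)}
likelihoods = {"fair": Fraction(1, 2), "two_headed": Fraction(1)}

post = posterior(priors, likelihoods)
print(post)
```

Because the normalization factor is the same for every hypothesis, it can be dropped whenever only the relative merit of the hypotheses matters.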

Example: The Gambler’s coin problem

$$P(H \mid D, I) = \frac{P(H \mid I)\, P(D \mid H, I)}{P(D \mid I)}$$

Likelihood: if we assume the data D gives R heads in N tosses:

$$P(D \mid H, I) \propto H^{R} (1-H)^{N-R}$$

The full distribution, assuming independence of throws, is the Binomial Distribution. We omit terms not containing H, and use a proportionality.

Prior: what do we know about the coin? Assume H, the probability of a head on a single toss, is uniformly distributed between 0 and 1.

Normalization factor: ignore this for now, as we only need the relative merit of different values of H.
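The posterior for the coin can be evaluated on a grid of H values. This is a minimal sketch using only the standard library, assuming a uniform prior and normalising numerically with the trapezoid rule; the function name `coin_posterior`, the grid size, and the example data (3 heads in 4 tosses) are illustrative assumptions.

```python
def coin_posterior(heads, tosses, grid_size=1001):
    """Posterior density of H on an evenly spaced grid over [0, 1].

    Uses a uniform prior and the likelihood H^R (1 - H)^(N - R),
    with R = heads and N = tosses.
    """
    step = 1.0 / (grid_size - 1)
    grid = [i * step for i in range(grid_size)]
    # Uniform prior is constant, so the unnormalised posterior is the likelihood.
    unnorm = [h ** heads * (1.0 - h) ** (tosses - heads) for h in grid]
    # Trapezoid-rule area, used so the density integrates to 1 over the grid.
    area = step * (sum(unnorm) - 0.5 * (unnorm[0] + unnorm[-1]))
    return grid, [u / area for u in unnorm]

# Hypothetical data: 3 heads in 4 tosses.
grid, post = coin_posterior(heads=3, tosses=4)
peak = grid[post.index(max(post))]
print(peak)
```

With a uniform prior the posterior peaks at R/N; calling `coin_posterior(0, 0)` returns the flat prior itself, which is the state of knowledge before any tosses.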

Example: A fair coin?

[Figure: plots for the coin example; surviving labels: Data, H, H, TT]

Example: A fair coin?

The effects of the Prior

