Post on 29-Mar-2015

1

WHY MAKING BAYESIAN NETWORKS BAYESIAN MAKES SENSE.

Dawn E. Holmes

Department of Statistics and Applied Probability

University of California, Santa Barbara

CA 93106, USA

2

Subjective Probability

Rational degrees of belief.

Keynes's consensual rational degrees of belief.

3

What is a Bayesian Network?

Directed acyclic graph

• Nodes are variables (discrete or continuous)

• Arcs indicate dependence between variables.

Conditional Probabilities (local distributions)

Missing arcs imply conditional independence

Independencies + local distributions => specification of a joint distribution
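
As a minimal sketch of how independencies plus local distributions specify a joint distribution, the Python below builds the joint for a three-node network with arcs A -> B and A -> C. The probability tables and all names (`p_a`, `p_b_given_a`, `p_c_given_a`, `joint`) are hypothetical and chosen only for illustration; none of the numbers come from the slides.

```python
from itertools import product

VALUES = (1, 2, 3)

# Hypothetical local distributions (illustrative numbers only).
p_a = {1: 0.5, 2: 0.3, 3: 0.2}
p_b_given_a = {1: {1: 0.7, 2: 0.2, 3: 0.1},
               2: {1: 0.1, 2: 0.6, 3: 0.3},
               3: {1: 0.3, 2: 0.3, 3: 0.4}}
p_c_given_a = {1: {1: 0.2, 2: 0.5, 3: 0.3},
               2: {1: 0.6, 2: 0.3, 3: 0.1},
               3: {1: 0.1, 2: 0.1, 3: 0.8}}


def joint(a, b, c):
    """P(a, b, c) = P(a) P(b | a) P(c | a); no arc joins B and C."""
    return p_a[a] * p_b_given_a[a][b] * p_c_given_a[a][c]


states = list(product(VALUES, repeat=3))
total = sum(joint(*s) for s in states)  # should be 1 up to rounding


def p_bc_given_a(b, c, a):
    """P(b, c | a), recovered from the joint alone."""
    norm = sum(joint(a, bb, cc) for bb in VALUES for cc in VALUES)
    return joint(a, b, c) / norm


def p_b_from_joint(b, a):
    return sum(p_bc_given_a(b, cc, a) for cc in VALUES)


def p_c_from_joint(c, a):
    return sum(p_bc_given_a(bb, c, a) for bb in VALUES)


# Largest violation of P(b, c | a) = P(b | a) P(c | a) over all states.
indep_gap = max(abs(p_bc_given_a(b, c, a)
                    - p_b_from_joint(b, a) * p_c_from_joint(c, a))
                for a, b, c in states)
```

Here `total` checks that the product of local distributions is a proper joint distribution, and `indep_gap` checks that the missing arc between B and C shows up as conditional independence given A.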

4

Bayesian Networks

In classical Bayesian network theory a prior distribution must be specified.

When information is missing, we are able to find a minimally prejudiced prior distribution using MaxEnt.

5

A Simple Bayesian Network

A

B C

$E_a = \{e_a^1, e_a^2, e_a^3\}, \quad E_b = \{e_b^1, e_b^2, e_b^3\}, \quad E_c = \{e_c^1, e_c^2, e_c^3\}$

$P(e_b^1 \mid e_a^1) = \beta(b_1, a_1), \quad P(e_b^2 \mid e_a^1) = \beta(b_2, a_1), \quad P(e_b^3 \mid e_a^1) = \beta(b_3, a_1)$

6

Priors for Bayesian Networks

● Using frequentist probabilities results in a rigid network.

● The results obtained using Bayesian networks are only as good as their prior distribution.

● The maximum entropy formalism.

7

Maximum Entropy and the Principle of Insufficient Reason.

The principle of maximum entropy is a generalization of the principle of insufficient reason.

It is capable of determining a probability distribution for any combination of partial knowledge and partial ignorance.
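
A small sketch may make the generalization concrete. It uses a six-sided die whose only known property is its mean, a textbook MaxEnt exercise that is not taken from the slides. The function name `maxent_die` and the bisection bounds are assumptions made for this illustration. With no information beyond the fair mean of 3.5, the result is the uniform distribution of the principle of insufficient reason; a different mean tilts the distribution while keeping it as spread out as the constraint allows.

```python
import math


def maxent_die(mean, faces=(1, 2, 3, 4, 5, 6), tol=1e-12):
    """Maximum entropy distribution on `faces` with a prescribed mean.

    The solution has the form p_i proportional to exp(-lam * x_i).
    The multiplier lam is found by bisection, because the mean of that
    distribution decreases strictly as lam increases.
    """
    def dist(lam):
        weights = [math.exp(-lam * x) for x in faces]
        z = sum(weights)
        return [w / z for w in weights]

    def mean_of(lam):
        return sum(x * p for x, p in zip(faces, dist(lam)))

    lo, hi = -50.0, 50.0  # assumed bracket, wide enough for any interior mean
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if mean_of(mid) > mean:
            lo = mid  # mean still too high, so lam must grow
        else:
            hi = mid
    return dist((lo + hi) / 2)


uniform = maxent_die(3.5)  # no information beyond the fair mean
skewed = maxent_die(4.5)   # partial knowledge: the mean is 4.5
```

In this sketch `uniform` has six equal probabilities, while `skewed` puts increasing weight on the higher faces and still satisfies the mean constraint.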

8

Two Results

An iterative algorithm for updating probabilities in a multivalued multiway tree is given.

A Lagrange multiplier technique is used to find the probability of an arbitrary state in a Bayesian tree using only MaxEnt.
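
The slides do not reproduce the iterative algorithm itself, so the sketch below substitutes a standard technique, iterative proportional fitting (IPF), to show what a MaxEnt solution for a small tree looks like. Started from the uniform distribution, IPF converges to the maximum entropy distribution that matches the given marginals. The network (A -> B, A -> C), the constraint values, and the names `fit`, `target_ab`, and `target_ac` are all assumptions made for illustration.

```python
from itertools import product

VALUES = (1, 2, 3)

# Hypothetical expert constraints (illustrative numbers only).
p_a = {1: 0.5, 2: 0.3, 3: 0.2}
p_b_given_a = {1: {1: 0.7, 2: 0.2, 3: 0.1},
               2: {1: 0.1, 2: 0.6, 3: 0.3},
               3: {1: 0.3, 2: 0.3, 3: 0.4}}
p_c_given_a = {1: {1: 0.2, 2: 0.5, 3: 0.3},
               2: {1: 0.6, 2: 0.3, 3: 0.1},
               3: {1: 0.1, 2: 0.1, 3: 0.8}}

# Fixing P(a) and P(b | a) is the same as fixing the marginal P(a, b).
target_ab = {(a, b): p_a[a] * p_b_given_a[a][b]
             for a, b in product(VALUES, repeat=2)}
target_ac = {(a, c): p_a[a] * p_c_given_a[a][c]
             for a, c in product(VALUES, repeat=2)}


def fit(joint, target, key):
    """One IPF step: rescale `joint` so its marginal on `key` equals `target`."""
    marginal = {}
    for state, p in joint.items():
        marginal[key(state)] = marginal.get(key(state), 0.0) + p
    return {s: p * target[key(s)] / marginal[key(s)] for s, p in joint.items()}


# Start from the uniform distribution: maximum entropy with no constraints.
joint = {s: 1 / 27 for s in product(VALUES, repeat=3)}
for _ in range(20):
    joint = fit(joint, target_ab, lambda s: (s[0], s[1]))
    joint = fit(joint, target_ac, lambda s: (s[0], s[2]))

# Compare with the product form P(a) P(b | a) P(c | a).
max_gap = max(abs(joint[(a, b, c)]
                  - p_a[a] * p_b_given_a[a][b] * p_c_given_a[a][c])
              for a, b, c in joint)
```

Under these assumptions `max_gap` is zero up to rounding: the MaxEnt distribution built from the tree's constraints alone coincides with the usual product form, with B and C conditionally independent given A.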

9

FRAGMENT OF STATE TABLE

$S_0 : e_a^1 e_b^1 e_c^1 \qquad S_1 : e_a^1 e_b^1 e_c^2 \qquad S_2 : e_a^1 e_b^1 e_c^3$

$S_3 : e_a^1 e_b^2 e_c^1 \qquad S_4 : e_a^1 e_b^2 e_c^2 \qquad S_5 : e_a^1 e_b^2 e_c^3$

$S_6 : e_a^1 e_b^3 e_c^1 \qquad S_7 : e_a^1 e_b^3 e_c^2 \qquad S_8 : e_a^1 e_b^3 e_c^3$
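
The fragment covers the nine joint states in which A takes its first value. A short sketch of how the full table could be enumerated follows. It assumes three values per variable and an ordering in which a varies slowest and c fastest; this agrees with the fragment, but the rest of the table is not shown on the slide. The helper names `state_index` and `label` are hypothetical.

```python
from itertools import product

VALUES = (1, 2, 3)

# Assumed ordering: a varies slowest, c fastest (consistent with S_0..S_8).
states = list(product(VALUES, repeat=3))


def state_index(a, b, c):
    """Index k of the state S_k in which A, B, C take values a, b, c."""
    return 9 * (a - 1) + 3 * (b - 1) + (c - 1)


def label(k):
    """Render state k in the notation of the state table."""
    a, b, c = states[k]
    return f"S_{k}: e_a^{a} e_b^{b} e_c^{c}"


fragment = [label(k) for k in range(9)]  # the nine states with a = 1
```

With this convention a three-valued network of three variables has 27 states in total, and the fragment is the first third of the table.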

10

A Simple Bayesian Network

A

B C

$\dfrac{P(S_0)}{P(S_2)} = \dfrac{\beta(c_1, a_1)}{\beta(c_3, a_1)} = \exp(-\lambda_5)$

11

A Simple BN with Maximum Entropy

A

B C

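$$
\begin{aligned}
P(S_0) = {} & \exp(-\lambda_0)\,\exp\{-\lambda_1[1 - \beta(a_1)]\} \\
& \times \exp\{\lambda_2\,\beta(a_2)\}\,\exp\{-\lambda_3[1 - \beta(b_1, a_1)]\} \\
& \times \exp\{\lambda_4\,\beta(b_2, a_1)\}\,\exp\{-\lambda_5[1 - \beta(c_1, a_1)]\} \\
& \times \exp\{\lambda_6\,\beta(c_2, a_1)\}
\end{aligned}
$$

$P(S_0) = P(e_a^1 e_b^1 e_c^1) = P(e_a^1)\,P(e_b^1 \mid e_a^1)\,P(e_c^1 \mid e_a^1) = \beta(a_1)\,\beta(b_1, a_1)\,\beta(c_1, a_1)$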

12

Maximum Entropy in Bayesian Networks

Maximum entropy provides a technique for eliciting knowledge from incomplete information.

We use the maximum entropy formalism to optimally estimate the prior distribution of a Bayesian network.

All and only the information provided by expert knowledge is used.

13

What use are Subjective Bayesian Prior Distributions?

Why determine the prior distribution for a Bayesian network using maximum entropy?

Any problem involving probabilities can be represented by a Bayesian network.

14

Independence

Proofs must not use techniques outside of MaxEnt.

Proofs have already been given elsewhere.

15

Thank You!