Post on 04-Jan-2016

Bayesian Nets and Applications

2

Naïve Bayes

What happens if we have more than one piece of evidence?
If we can assume conditional independence: overslept and trafficjam are independent, given late
A and B are conditionally independent given C just in case B doesn't tell us anything about A if we already know C
$P(\mathit{late} \mid \mathit{overslept} \land \mathit{trafficjam}) = \alpha P(\mathit{overslept} \land \mathit{trafficjam} \mid \mathit{late}) P(\mathit{late}) = \alpha P(\mathit{overslept} \mid \mathit{late}) P(\mathit{trafficjam} \mid \mathit{late}) P(\mathit{late})$
Naïve Bayes: a single cause directly influences a number of effects, all conditionally independent given the cause
Independence is often assumed even when it does not hold
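
A minimal sketch of the late/overslept/trafficjam calculation, assuming Boolean variables. The probabilities are hypothetical (the slides give none), and the function name `naive_bayes_posterior` is illustrative only.

```python
def naive_bayes_posterior(prior, likelihoods, evidence):
    """Return P(cause | evidence), assuming the effects are
    conditionally independent given the cause."""
    unnormalized = {}
    for cause_value, p_cause in prior.items():
        score = p_cause
        for effect, observed in evidence.items():
            p_true = likelihoods[effect][cause_value]  # P(effect=true | cause)
            score *= p_true if observed else 1.0 - p_true
        unnormalized[cause_value] = score
    alpha = 1.0 / sum(unnormalized.values())  # the normalizing constant
    return {value: alpha * score for value, score in unnormalized.items()}


# Hypothetical numbers, not taken from the slides.
prior = {True: 0.1, False: 0.9}  # P(late)
likelihoods = {
    "overslept": {True: 0.6, False: 0.05},   # P(overslept | late), P(overslept | not late)
    "trafficjam": {True: 0.5, False: 0.2},   # P(trafficjam | late), P(trafficjam | not late)
}
posterior = naive_bayes_posterior(
    prior, likelihoods, {"overslept": True, "trafficjam": True}
)
```

With these made-up numbers, `posterior[True]` is 0.03 / 0.039, about 0.77: two weak pieces of evidence combine into a fairly strong belief that the person is late.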

3

Bayesian Networks

A directed acyclic graph in which each node is annotated with quantitative probability information
A set of random variables makes up the network nodes
A set of directed links connects pairs of nodes. If there is an arrow from node X to node Y, X is a parent of Y
Each node $X_i$ has a conditional probability distribution $P(X_i \mid \mathrm{Parents}(X_i))$ that quantifies the effect of the parents on the node

4

Example

Topology of network encodes conditional independence assumptions

5

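[Figure: student network with nodes Smart, Hard working, Good test taker, Understands material, Exam Grade, Homework Grade. Edges, as given by the conditional probability tables: Smart → Good test taker; Smart and Hard working → Understands material; Good test taker and Understands material → Exam Grade; Understands material → Homework Grade.]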

7

Conditional Probability Tables

Smart
True    False
.5      .5

Hard Working
True    False
.7      .3

Good Test Taker
S       True    False
True    .75     .25
False   .25     .75

Understands Material
S       HW      True    False
True    True    .95     .05
True    False   .6      .4
False   True    .6      .4
False   False   .2      .8

Exam Grade
GTT     UM      A       B       C       D       F
True    True    .7      .25     .03     .01     .01
True    False   .3      .4      .2      .05     .05
False   True    .4      .3      .2      .08     .02
False   False   .05     .2      .3      .3      .15

Homework Grade
UM      A       B       C       D       F
True    .7      .25     .03     .01     .01
False   .2      .3      .4      .05     .05

8

Compactness

A CPT for Boolean $X_i$ with $k$ Boolean parents has $2^k$ rows for the combinations of parent values
Each row requires one number $p$ for $X_i = \text{true}$ (the number for $X_i = \text{false}$ is just $1 - p$)
If each variable has no more than $k$ parents, the complete network requires $O(n \cdot 2^k)$ numbers
Grows linearly with $n$ vs. $O(2^n)$ for the full joint distribution
Student net: $1 + 1 + 2 + 4 + 16 + 8 = 32$ numbers, since each five-valued grade row needs 4 numbers (vs. $2^4 \cdot 5^2 - 1 = 399$ for the full joint distribution)
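
The parameter count for the student network can be reproduced with a short sketch. It assumes the structure implied by the CPT headers (S → GTT; S, HW → UM; GTT, UM → EG; UM → HG) and that the two grade variables take five values each; the function names are illustrative.

```python
from math import prod


def independent_parameters(cardinality, parents):
    """Count the free numbers in all CPTs: one row per combination of
    parent values, and (values - 1) free numbers per row."""
    total = 0
    for var, n_values in cardinality.items():
        rows = prod(cardinality[p] for p in parents.get(var, ()))
        total += rows * (n_values - 1)
    return total


def full_joint_parameters(cardinality):
    """Free numbers in the full joint table (entries must sum to 1)."""
    return prod(cardinality.values()) - 1


# Student network as read off the conditional probability tables.
CARDINALITY = {"S": 2, "HW": 2, "GTT": 2, "UM": 2, "EG": 5, "HG": 5}
PARENTS = {"GTT": ["S"], "UM": ["S", "HW"], "EG": ["GTT", "UM"], "HG": ["UM"]}

network_size = independent_parameters(CARDINALITY, PARENTS)  # 1+1+2+4+16+8
joint_size = full_joint_parameters(CARDINALITY)              # 2*2*2*2*5*5 - 1
```

Counted this way the network needs 32 numbers against 399 for the full joint table. If every variable were Boolean, the same functions would give 14 against 63.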

9

Conditional Probability

11

Global Semantics

Global semantics defines the full joint distribution as the product of the local conditional distributions:

$P(X_1, \dots, X_n) = \prod_{i=1}^{n} P(X_i \mid \mathrm{Parents}(X_i))$

e.g., observations: S, HW, not UM. Will I get an A?

$P(EG{=}A \land GTT \land \lnot UM \land S \land HW) = P(EG{=}A \mid GTT \land \lnot UM)\, P(GTT \mid S)\, P(\lnot UM \mid HW \land S)\, P(S)\, P(HW)$
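
As a sketch of how this product is evaluated, the following encodes the slide's conditional probability tables and multiplies the local entries. The variable and function names are illustrative; the second calculation, which sums out the unobserved GTT to answer "will I get an A?", is an added step not spelled out on the slide.

```python
from itertools import product

GRADES = "ABCDF"
P_S = 0.5                                    # P(S = true)
P_HW = 0.7                                   # P(HW = true)
P_GTT = {True: 0.75, False: 0.25}            # P(GTT = true | S)
P_UM = {(True, True): 0.95, (True, False): 0.6,
        (False, True): 0.6, (False, False): 0.2}   # P(UM = true | S, HW)
P_EG = {(True, True):   [0.70, 0.25, 0.03, 0.01, 0.01],
        (True, False):  [0.30, 0.40, 0.20, 0.05, 0.05],
        (False, True):  [0.40, 0.30, 0.20, 0.08, 0.02],
        (False, False): [0.05, 0.20, 0.30, 0.30, 0.15]}  # P(EG | GTT, UM)
P_HG = {True:  [0.70, 0.25, 0.03, 0.01, 0.01],
        False: [0.20, 0.30, 0.40, 0.05, 0.05]}           # P(HG | UM)


def bern(p_true, value):
    """Probability of a Boolean value given P(value = true)."""
    return p_true if value else 1.0 - p_true


def joint(s, hw, gtt, um, eg, hg):
    """Full joint probability as the product of the local CPT entries."""
    return (bern(P_S, s) * bern(P_HW, hw)
            * bern(P_GTT[s], gtt) * bern(P_UM[(s, hw)], um)
            * P_EG[(gtt, um)][GRADES.index(eg)]
            * P_HG[um][GRADES.index(hg)])


# Slide example: P(EG=A, GTT, not UM, S, HW). HG is not mentioned, so sum it out.
p_example = sum(joint(True, True, True, False, "A", hg) for hg in GRADES)

# P(EG=A | S, HW, not UM): sum out the unobserved GTT and HG, then normalize.
numerator = sum(joint(True, True, gtt, False, "A", hg)
                for gtt, hg in product([True, False], GRADES))
denominator = sum(joint(True, True, gtt, False, eg, hg)
                  for gtt, eg, hg in product([True, False], GRADES, GRADES))
p_a_given_obs = numerator / denominator
```

With the slide's tables the product is 0.3 × 0.75 × 0.05 × 0.5 × 0.7 ≈ 0.0039, and the conditional probability of an A given S, HW, and not UM comes to 0.2375.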

12

Conditional Independence and Network Structure

The graphical structure of a Bayesian network forces certain conditional independences to hold regardless of the CPTs.
This can be determined by the d-separation criterion.

13

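[Figure: three ways a path can pass through an interior node, drawn with nodes a, b, c. Linear: the arrows pass through the interior node in one direction. Converging: both arrows point into the interior node. Diverging: both arrows point out of the interior node.]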

14

D-separation (opposite of d-connecting)

A path from q to r is d-connecting with respect to the evidence nodes E if every interior node n in the path has the property that either
  it is linear or diverging and is not a member of E, or
  it is converging and either n or one of its descendants is in E.

If no path from q to r is d-connecting (q and r are d-separated), the nodes are conditionally independent given E.

15

[Figure: the student network (Smart, Hard working, Good test taker, Understands material, Exam Grade, Homework Grade).]

16

S and EG are not independent given GTT
S and HG are independent given UM
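
These two statements can be checked mechanically against the d-connecting path definition. The sketch below enumerates every simple path in the student network, which is fine for a six-node graph but would not scale to large networks. The edge list is read off the CPT headers, and the function names are illustrative.

```python
# Edges of the student network (parent, child), read off the CPT headers.
EDGES = [
    ("S", "GTT"), ("S", "UM"), ("HW", "UM"),
    ("GTT", "EG"), ("UM", "EG"), ("UM", "HG"),
]


def descendants(node, edges):
    """All nodes reachable from node by following arrows."""
    seen, stack = set(), [node]
    while stack:
        current = stack.pop()
        for parent, child in edges:
            if parent == current and child not in seen:
                seen.add(child)
                stack.append(child)
    return seen


def simple_paths(start, goal, edges):
    """Yield every simple path from start to goal, ignoring arrow direction."""
    neighbours = {}
    for parent, child in edges:
        neighbours.setdefault(parent, set()).add(child)
        neighbours.setdefault(child, set()).add(parent)

    def extend(path):
        if path[-1] == goal:
            yield path
            return
        for nxt in neighbours.get(path[-1], ()):
            if nxt not in path:
                yield from extend(path + [nxt])

    yield from extend([start])


def is_d_connecting(path, evidence, edges):
    """Apply the interior-node test to one path."""
    edge_set = set(edges)
    for prev, node, nxt in zip(path, path[1:], path[2:]):
        converging = (prev, node) in edge_set and (nxt, node) in edge_set
        if converging:
            # Open only if the node or one of its descendants is observed.
            if node not in evidence and not (descendants(node, edges) & evidence):
                return False
        elif node in evidence:
            # A linear or diverging node blocks the path when observed.
            return False
    return True


def d_separated(q, r, evidence, edges=EDGES):
    """True if no path between q and r is d-connecting given the evidence."""
    evidence = set(evidence)
    return not any(is_d_connecting(path, evidence, edges)
                   for path in simple_paths(q, r, edges))
```

For example, `d_separated("S", "EG", {"GTT"})` is false because the path S → UM → EG stays open, while `d_separated("S", "HG", {"UM"})` is true. Observing UM also makes S and HW dependent, since UM is a converging node on the path between them.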

Medical Application of Bayesian Networks: Pathfinder

18

Pathfinder

Domain: hematopathology diagnosis
Microscopic interpretation of lymph-node biopsies
Given: hundreds of histologic features appearing in lymph node sections
Goal: identify disease type, malignant or benign
Difficult for physicians

19

Pathfinder System

Bayesian Net implementation
Reasons about 60 malignant and benign diseases of the lymph node
Considers evidence about status of up to 100 morphological features presenting in lymph node tissue
Contains 105,000 subjectively derived probabilities

20

21

Commercialization

Intellipath
Integrates with videodisc libraries of histopathology slides
Pathologists working with the system make significantly more correct diagnoses than those working without
Several hundred commercial systems in place worldwide

22

Sequential Diagnosis

23

Features

Each feature is structured into a set of 2-10 mutually exclusive values
Example: pseudofollicularity takes the values absent, slight, moderate, prominent
Represent evidence provided by a feature as $F_1, F_2, \dots, F_n$

24

Value of information

User enters findings from microscopic analysis of tissue
Probabilistic reasoner assigns level of belief to different diagnoses
Value of information determines which tests to perform next
Full disease utility model for life-and-death decision making
  Cost of tests
  Cost of misdiagnoses
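
One common way to compute value of information is the expected gain in best achievable expected utility from observing a feature, minus the cost of the test. The sketch below illustrates that general definition and is not taken from Pathfinder: every probability, utility, action name, and function name in it is hypothetical.

```python
def best_expected_utility(belief, utility):
    """Expected utility of the best action under the current belief."""
    return max(
        sum(belief[d] * utility[action][d] for d in belief)
        for action in utility
    )


def value_of_information(prior, likelihood, utility, cost=0.0):
    """Expected utility gain from observing one feature, minus its cost.

    likelihood[d][v] is P(feature = v | disease = d).
    """
    feature_values = next(iter(likelihood.values())).keys()
    expected_after = 0.0
    for v in feature_values:
        unnormalized = {d: prior[d] * likelihood[d][v] for d in prior}
        p_v = sum(unnormalized.values())  # P(feature = v)
        if p_v == 0.0:
            continue  # this value can never be observed
        posterior = {d: p / p_v for d, p in unnormalized.items()}
        expected_after += p_v * best_expected_utility(posterior, utility)
    return expected_after - best_expected_utility(prior, utility) - cost


# Hypothetical two-disease example; none of these numbers come from the slides.
prior = {"malignant": 0.3, "benign": 0.7}
likelihood = {  # P(pseudofollicularity value | disease)
    "malignant": {"absent": 0.1, "slight": 0.1, "moderate": 0.3, "prominent": 0.5},
    "benign":    {"absent": 0.6, "slight": 0.2, "moderate": 0.1, "prominent": 0.1},
}
utility = {  # utility[action][disease], capturing the cost of misdiagnosis
    "treat":    {"malignant": 90.0, "benign": 40.0},
    "no_treat": {"malignant": 0.0,  "benign": 100.0},
}
voi = value_of_information(prior, likelihood, utility, cost=0.0)
```

A reasoner of this kind would evaluate each candidate feature this way and ask about the one with the highest value. A feature whose likelihoods are identical for every disease scores zero before cost, so it is never worth paying for.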

25

26

27

Group Discrimination Strategy

Select questions based on their ability to discriminate between disease classes
For a given differential diagnosis, select the most specific level of the hierarchy and select questions to discriminate among groups
Less efficient
  Larger number of questions asked

28

29

30

Other Bayesian Net Applications

Lumiere: who knows what it is?

31

Other Bayesian Net Applications

Lumiere
  Single most widely distributed application of BN
  Microsoft Office Assistant
  Infer a user's goals and needs using evidence about user background, actions and queries
VISTA
  Help NASA engineers in round-the-clock monitoring of each of the Space Shuttle orbiter's subsystems
  Time critical, high impact
  Interpret telemetry and provide advice about likely failures
  Direct engineers to the best information
  In use for several years
Microsoft Pregnancy and Child Care
  What questions to ask next to diagnose illness of a child

32

Other Bayesian Net Applications

Speech Recognition
Text Summarization
Language processing tasks in general