Transcript
Probabilities: Bayes' Theorem, Bayesian Networks

Slide 31 (CSC9T6 Information Systems, Computing Science & Mathematics, University of Stirling)

Recap:

SUM RULE
P(A or B) = P(A ∪ B) = P(A) + P(B) with A, B disjoint.
P(A or B) = P(A) + P(B) - P(A ∩ B) otherwise.

CONDITIONAL PROBABILITY OF X GIVEN Y
P(X=A | Y=A)

PRODUCT RULE
P(X=A, Y=A) = P(X=A | Y=A) * P(Y=A)

INDEPENDENT EVENTS
P(X,Y) = P(X) P(Y), i.e. P(X | Y) = P(X)

Note that:
P(A or B) = P(B or A)
P(A, B) = P(B, A)
P(A | B) ≠ P(B | A) in general

Probabilities

Slide 32

Clearly P(X, Y) = P(Y, X) by symmetry of ∩, BUT P(X | Y) ≠ P(Y | X) in general.

P(X | Y) P(Y) = P(Y | X) P(X) by definition and the product rule.

Note again that if the knowledge about Y does not change the probability of X, i.e. P(X | Y) = P(X) then the two events are said to be independent, and P(X,Y) = P(X) P(Y), as in the case of picking two cards from different decks (or reinserting the card after each test).

It follows

P(X | Y) = P(Y | X) P(X) / P(Y)    (Bayes' Theorem)

Important result. Informally: by knowing the probability of Y given X, and the probabilities of X and Y, I can derive the probability of X given Y. We will see an example soon.

Probabilities: Bayes' Theorem
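As a quick numerical sanity check, the theorem holds for any joint distribution. A minimal Python sketch with a small made-up joint (the numbers are illustrative only, not from the slides):

```python
# A toy joint distribution over X in {x0, x1} and Y in {y0, y1}.
p_joint = {
    ("x0", "y0"): 0.10, ("x0", "y1"): 0.30,
    ("x1", "y0"): 0.20, ("x1", "y1"): 0.40,
}

def p_x(x):
    # Marginal P(X=x), summing the joint over Y.
    return sum(p for (xi, _), p in p_joint.items() if xi == x)

def p_y(y):
    # Marginal P(Y=y), summing the joint over X.
    return sum(p for (_, yi), p in p_joint.items() if yi == y)

def p_x_given_y(x, y):
    # Conditional by definition: P(X|Y) = P(X,Y) / P(Y).
    return p_joint[(x, y)] / p_y(y)

def p_y_given_x(y, x):
    return p_joint[(x, y)] / p_x(x)

# Bayes' theorem: P(X | Y) == P(Y | X) P(X) / P(Y)
lhs = p_x_given_y("x1", "y0")
rhs = p_y_given_x("y0", "x1") * p_x("x1") / p_y("y0")
assert abs(lhs - rhs) < 1e-12
```

Any joint distribution would do here; the identity is a direct consequence of the product rule.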

Slide 33

Summing up:

SUM RULE
P(A or B) = P(A ∪ B) = P(A) + P(B) with A, B disjoint.
P(A or B) = P(A) + P(B) - P(A ∩ B) otherwise.

CONDITIONAL PROBABILITY OF X GIVEN Y
P(X=A | Y=A)

PRODUCT RULE
P(X=A, Y=A) = P(X=A | Y=A) * P(Y=A)

INDEPENDENT EVENTS
P(X,Y) = P(X) P(Y), i.e. P(X | Y) = P(X)

BAYES' THEOREM
P(X | Y) = P(Y | X) P(X) / P(Y)

Probabilities

Slide 34

We have Box and Fruit, random variables. In the red box (B=r) there are 2 apples (a) and 6 oranges (o). In the blue box (B=b) there are 3 apples and 1 orange.

- If I pick a fruit from the red box, what would you expect?
- How can you express this?
- Map all conditional probabilities of fruit | box.
- WORKING hypothesis: P(B=r) = 4/10, P(B=b) = 6/10

Apples and oranges in boxes [Bishop]


Slide 35

P(B=r) = 0.4, P(B=b) = 0.6. P(F=a) = ?

It is the sum of the probabilities of the disjoint events "pick an apple from red" and "pick an apple from blue":

P( (F=a, B=r) or (F=a, B=b) )
= P(F=a, B=r) + P(F=a, B=b)                     (sum rule)
= P(F=a | B=r) P(B=r) + P(F=a | B=b) P(B=b)     (product rule)
= 0.25 * 0.4 + 0.75 * 0.6 = 0.55

Hence P(F=o) = 1 - 0.55 = 0.45 (sum rule).

Apples and oranges in boxes [Bishop]
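The same computation, written out as a short Python sketch (the variable names are mine):

```python
# Law of total probability for the boxes-and-fruit example.
p_box = {"r": 0.4, "b": 0.6}                # working hypothesis: prior over boxes
p_fruit_given_box = {                       # P(Fruit | Box)
    "r": {"a": 2 / 8, "o": 6 / 8},          # red box: 2 apples, 6 oranges
    "b": {"a": 3 / 4, "o": 1 / 4},          # blue box: 3 apples, 1 orange
}

# Sum + product rules: P(F=a) = sum over boxes of P(F=a | B) P(B)
p_apple = sum(p_box[b] * p_fruit_given_box[b]["a"] for b in p_box)
p_orange = 1 - p_apple                      # sum rule

assert abs(p_apple - 0.55) < 1e-12
assert abs(p_orange - 0.45) < 1e-12
```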

Slide 36

Let us try to "invert" our reasoning.

Suppose the two boxes have the same probability. If I observe an orange, on which box would you bet?

Apples and oranges in boxes [Bishop]

Slide 37

Can we make this more precise, even in the original case, where the boxes have associated probabilities (0.4 and 0.6)? How? Which probability are we looking for?

P(B=r | F=o) = P(F=o | B=r) P(B=r) / P(F=o)    (Bayes' theorem)
= (0.75 * 0.4) / 0.45 = 0.6666...

P(B=b | F=o) ?

Apples and oranges in boxes [Bishop]
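The posterior just computed, sketched in Python; it also answers the question P(B=b | F=o) posed above:

```python
# Bayes' theorem for the boxes-and-fruit example:
# P(B | F=o) = P(F=o | B) P(B) / P(F=o)
p_box = {"r": 0.4, "b": 0.6}
p_orange_given_box = {"r": 6 / 8, "b": 1 / 4}

# Denominator P(F=o), by total probability.
p_orange = sum(p_box[b] * p_orange_given_box[b] for b in p_box)   # 0.45

posterior = {b: p_orange_given_box[b] * p_box[b] / p_orange for b in p_box}

assert abs(posterior["r"] - 2 / 3) < 1e-9   # P(B=r | F=o) = 0.666...
assert abs(posterior["b"] - 1 / 3) < 1e-9   # P(B=b | F=o) = 0.333...
assert abs(sum(posterior.values()) - 1.0) < 1e-9
```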

Slide 38


Are B and F independent ?

P(F=o | B=r) = 0.75 ≠ 0.45 = P(F=o), so no: B and F are not independent.

Apples and oranges in boxes [Bishop]

Page 3: Probabilities Probabilities: Bayes' Theorem Bayesian Networks... · – based on Bayes Theorem (see following slides) • They can predict the probability that a particular sample

Slide 39

A note on the prior/posterior "inversion" we are going through...

Suppose the two boxes have the same probability. If I observe an orange, on which box would you bet? Probably, on the red one (0.75). Remember that, without any knowledge of the fruit, one knows that the blue box is more probable (0.6). Via Bayes' theorem, a subsequent observation (F=o) modifies our prior probability into a posterior probability.

Apples and oranges in boxes [Bishop]

Slide 40

P(rain) = 0.4. P(grass is wet) = 0.6 (wet because of rain, watering, ...).

P(R, W) ?

Independent? No: P(W | R) = 1 ≠ 0.6 = P(W).

Easily P(W | R) = 1 (if it rains, the grass certainly gets wet), so P(R, W) = P(W | R) * P(R) = 1 * 0.4 = 0.4.

P(R | W) ? P(R | W) = P(W, R) / P(W) = 0.4 / 0.6 = 0.666...

NOTE: P(A | B) cannot be calculated from P(A) and P(B) alone.

Grass and rain
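As a sanity check, the grass-and-rain numbers in Python:

```python
# Grass-and-rain example: P(R)=0.4, P(W)=0.6, and rain always wets the grass.
p_rain = 0.4
p_wet = 0.6
p_wet_given_rain = 1.0

p_rain_and_wet = p_wet_given_rain * p_rain   # product rule: 0.4
p_rain_given_wet = p_rain_and_wet / p_wet    # definition of conditional: 0.666...

assert abs(p_rain_and_wet - 0.4) < 1e-12
assert abs(p_rain_given_wet - 2 / 3) < 1e-9
# Not independent: P(W | R) = 1 differs from P(W) = 0.6
assert p_wet_given_rain != p_wet
```

Note, as the slide says, that P(R | W) needed the joint (here via P(W | R)); it cannot be obtained from P(R) and P(W) alone.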

Slide 41

Thomas Bayes (1701 - 1761) Logic and theology at Edinburgh. FRS 1742. Probability relevant for gambling and the new concept of insurance. "Essay towards solving a problem in the doctrine of chances" (1764 - three years after his death). [Then Laplace independently rediscovered the theory in a general form].

(Informally) An interpretation of probability as measure of uncertainty, based on so-far observed facts that can be revised. Remember the case of oranges and boxes.

Pierre-Simon Laplace (1749-1827) "Probability theory is nothing else but common sense reduced to calculation". Discussion about the same ideas of Bayes (inverse probability calculation), with applications to life expectancy, jurisprudence, planetary masses, error estimation (1812).

Probabilities

Slide 42

Outline.

1. Introduction
2. Probability
3. Bayesian classification


Slide 43

•  Bayesian Classifiers are statistical classifiers
–  based on Bayes' Theorem (see following slides)

•  They can predict the probability that a particular sample is a member of a particular class

•  Perhaps the simplest Bayesian Classifier is known as the Naïve Bayesian Classifier based on an independence assumption (see later on …)

•  In very simple terms, this means that we assume that values given for one variable are not influenced by values given to another variable. No relationship exists between them

•  Although the independence assumption is often a bold assumption to make, performance is still often comparable to Decision Trees and Neural Network classifiers (explore on wikipedia! and references therein about "the surprisingly good performance of Naive Bayes in classification" )

What is a Bayesian Classifier?

Slide 44

Bayes' Classifier: An Example. Examples of things we can derive from our dataset:

•  4 males took advantage of the Mag.(azine) Promo, and these 4 males represent 2/3 of the total male population;

•  3/4 of the females purchased the Mag. Promo.

Mag. Promotion   TV Promotion   Life Ins. Promotion   Credit Card Ins.   Sex
Y                N              N                     N                  M
Y                Y              Y                     N                  F
N                N              N                     N                  M
Y                Y              Y                     Y                  M
Y                N              Y                     N                  F
N                N              N                     N                  F
Y                N              Y                     Y                  M
N                Y              N                     N                  M
Y                N              N                     N                  M
Y                Y              Y                     Y                  F

For our example, let's use sex as the output attribute whose value is to be predicted.

Slide 45

Bayes' Classifier: An Example. Suppose we want to classify a new instance (or customer), called Lee. We are told the following holds true for our new customer, i.e. this is our evidence:

E = (Mag. Promo = Y and TV Promo = Y and LI Promo = N and C.C. Ins. = N)

We want to know if Lee is male (H1) or female (H2). Note that there was no example of YYNN in the data. We apply Bayes' classifier and compute a probability for each hypothesis.

(same dataset table as on slide 44)

Slide 46

•  Firstly we list the distribution of the output attribute values for each input attribute. •  This is done using a distribution table.

Bayes' Classifier: An Example

              Mag Promo   TV Promo   LI Promo   C.C. Ins
Sex            M    F      M    F     M    F     M    F
Y              4    3      2    2     2    3     2    1
N              2    1      4    2     4    1     4    3
Ratio Y/Tot   4/6  3/4    2/6  2/4   2/6  3/4   2/6  1/4
Ratio N/Tot   2/6  1/4    4/6  2/4   4/6  1/4   4/6  3/4

So for example, 4 males answered Y to the Mag Promo

2 out of the total of 6 males answered Y to the LI Promo

Ratio for Y/T and N/T sum to 1 for each column


Slide 47

H1: Lee is male

P(sex = M | E) = P(E | sex = M) P(sex = M) / P(E)

Starting with P(E | sex = M)... this is

P(Mag. Promo = Y, TV Promo = Y, LI Promo = N, C.C. Ins = N | sex = M)

Mathematical justification, writing E = E1 ∩ E2 ∩ E3 ∩ E4:

P( E1 ∩ E2 ∩ E3 ∩ E4 | M ) P(M)
= P( E1 ∩ E2 ∩ E3 ∩ E4 ∩ M )                          [product rule: P(A|B) P(B) = P(A ∩ B)]
= P( E1 | E2 ∩ E3 ∩ E4 ∩ M ) P( E2 ∩ E3 ∩ E4 ∩ M )    [product rule again]
= P( E1 | M ) P( E2 ∩ E3 ∩ E4 ∩ M )                   (*)
= ...
= P( E1 | M ) P( E2 | M ) P( E3 | M ) P( E4 | M ) P( M )

(*) Assumption: E1, ..., E4 are conditionally independent given the class C, i.e. the information added by knowing that E2, ..., E4 have happened does not change P(E1 | C), so it is dropped. This is not always correct (it is an approximation), but it often works well, and fast!

Bayes' Classifier: back to hypotheses H1 and H2

Bayes’ Theorem

Slide 48

H1: Lee is male

P(sex = M | E) = P(E | sex = M) P(sex = M) / P(E)

Then we can calculate the conditional probability values for each piece of evidence as

explained:

P(Mag. Promo = Y | sex = M) = 4/6 P(TV Promo = Y | sex = M) = 2/6 P(LI Promo = N | sex = M) = 4/6 P(C.C. Ins = N | sex = M) = 4/6

and P(sex = M) = 6/10 = 3/5

These values are easily obtained from our distribution table. It follows:

P(E | sex = M) P(sex = M) = (4/6) * (2/6) * (4/6) * (4/6) * (3/5) = (8/81) * (3/5)

Bayes' Classifier: back to hypotheses H1 and H2

Bayes’ Theorem

Slide 49

Analogously for H2: Lee is female

P(sex = F | E) = P(E | sex = F) P(sex = F) / P(E)

And we have

P(Mag. Promo = Y | sex = F) = 3/4 P(TV Promo = Y | sex = F) = 2/4 P(LI Promo = N | sex = F) = 1/4 P(C.C. Ins = N | sex = F) = 3/4

P(sex = F) = 2/5

It follows:

P(sex = F | E) = (9/128) * (2/5) / P(E)

Bayes' Classifier: back to hypotheses H1 and H2

Bayes’ Theorem

Slide 50

Finally,

P(sex = F | E) = (9/128) * (2/5) / P(E)  <  (8/81) * (3/5) / P(E) = P(sex = M | E)

Hence, Bayes' classifier tells us that Lee is most likely male. Calculating also the value of P(E), i.e. the product of the (assumed independent!) probabilities of

Mag. Promo, TV Promo, not LI Promo and not C.C. Ins

P(E) = (7/10) * (4/10) * (5/10) * (7/10) = 0.098

we have:

P(sex = F | E) ≈ 0.287 < 0.605 ≈ P(sex = M | E)

Bayes' Classifier: back to hypotheses H1 and H2
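The whole Lee classification can be reproduced from the dataset in a few lines. This is a sketch of the Naïve Bayes computation just described, with the rows transcribed from the slide (column order: Mag, TV, LI, CC, Sex):

```python
# Naive Bayes for the promotion dataset.
rows = [
    ("Y", "N", "N", "N", "M"), ("Y", "Y", "Y", "N", "F"),
    ("N", "N", "N", "N", "M"), ("Y", "Y", "Y", "Y", "M"),
    ("Y", "N", "Y", "N", "F"), ("N", "N", "N", "N", "F"),
    ("Y", "N", "Y", "Y", "M"), ("N", "Y", "N", "N", "M"),
    ("Y", "N", "N", "N", "M"), ("Y", "Y", "Y", "Y", "F"),
]
evidence = ("Y", "Y", "N", "N")   # Lee: Mag=Y, TV=Y, LI=N, CC=N

def naive_bayes_score(sex):
    """P(E | sex) P(sex), assuming conditionally independent attributes."""
    subset = [r for r in rows if r[4] == sex]
    prior = len(subset) / len(rows)
    likelihood = 1.0
    for i, value in enumerate(evidence):
        likelihood *= sum(1 for r in subset if r[i] == value) / len(subset)
    return prior * likelihood

score_m = naive_bayes_score("M")   # (4/6)(2/6)(4/6)(4/6) * 3/5
score_f = naive_bayes_score("F")   # (3/4)(2/4)(1/4)(3/4) * 2/5

assert abs(score_m - (8 / 81) * (3 / 5)) < 1e-12
assert abs(score_f - (9 / 128) * (2 / 5)) < 1e-12
assert score_m > score_f           # Lee is classified as male
```

Dividing both scores by the same P(E) does not change the comparison, which is why a Naïve Bayes classifier can skip the denominator entirely.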


Slide 51

1. What proportion of Glasgow customers buy books?
2. What proportion of all customers buy DVDs?
3. Given a new customer that we know buys Videos, is it more likely that they live in Glasgow or Stirling? Classifying according to further evidence!

CDs   Books   DVDs   Videos   Region
Y     Y       N      N        Stirling
Y     N       Y      N        Glasgow
Y     N       Y      Y        Glasgow
Y     Y       Y      N        Glasgow
N     N       Y      N        Stirling
N     Y       Y      Y        Stirling
Y     N       Y      N        Stirling
Y     Y       Y      Y        Glasgow

Items Bought from Amazon

Slide 52

After a bit of feature extraction...
1. P(Books = Y | Region = G) = 2/4 = 1/2
2. P(DVDs = Y) = 7/8
3. ??

              CDs         Books       DVDs        Videos
              G     S     G     S     G     S     G     S
Y             4     2     2     2     4     3     2     1
N             0     2     2     2     0     1     2     3
Ratio Y/Tot   1     1/2   1/2   1/2   1     3/4   1/2   1/4
Ratio N/Tot   0     1/2   1/2   1/2   0     1/4   1/2   3/4

Items Bought from Amazon

Slide 53

P(Glasgow | Videos) = P(Videos | Glasgow) P(Glasgow) / P(Videos) = (1/2 * 1/2) / (3/8) = 2/3

(distribution table as above)

Items Bought from Amazon

Slide 54

P(Stirling | Videos) = P(Videos | Stirling) P(Stirling) / P(Videos) = (1/4 * 1/2) / (3/8) = 1/3

(distribution table as above)

Items Bought from Amazon


Slide 55

P(Stirling | Videos) = P(Videos | Stirling) P(Stirling) / P(Videos) = (1/4 * 1/2) / (3/8) = 1/3
P(Glasgow | Videos) = P(Videos | Glasgow) P(Glasgow) / P(Videos) = (1/2 * 1/2) / (3/8) = 2/3
... most likely from Glasgow! Note: P(Stirling | Videos) + P(Glasgow | Videos) = 1

Exercise. Is it true in general that P(a|r) + P(b|r) = 1? Note that in our case a ∪ b = T (one either comes from Glasgow or from Stirling, but not from both places; indeed a ∩ b = ∅, too). Is it true when assuming a ∪ b = T and a ∩ b = ∅?

P(a|r) + P(b|r)
= P(a,r)/P(r) + P(b,r)/P(r)      (by definition)
= P(a ∪ b, r) / P(r)             (by the sum rule, since a ∩ b = ∅)
= P(T, r) / P(r) = P(r) / P(r) = 1

Note: Revisit the Naïve Bayesian Classifier example about promotions. Does the above result hold there? If not, why ? How is the situation different here ?

Items Bought from Amazon
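A sketch of the Glasgow/Stirling computation in Python, counting directly from the dataset; it yields the same posteriors as the Bayes'-theorem calculation above:

```python
# Amazon example: columns are CDs, Books, DVDs, Videos, Region.
data = [
    ("Y", "Y", "N", "N", "Stirling"), ("Y", "N", "Y", "N", "Glasgow"),
    ("Y", "N", "Y", "Y", "Glasgow"),  ("Y", "Y", "Y", "N", "Glasgow"),
    ("N", "N", "Y", "N", "Stirling"), ("N", "Y", "Y", "Y", "Stirling"),
    ("Y", "N", "Y", "N", "Stirling"), ("Y", "Y", "Y", "Y", "Glasgow"),
]

def posterior(region):
    """P(region | Videos=Y) via Bayes' theorem."""
    in_region = [r for r in data if r[4] == region]
    p_region = len(in_region) / len(data)                             # 1/2
    p_videos_given_region = sum(r[3] == "Y" for r in in_region) / len(in_region)
    p_videos = sum(r[3] == "Y" for r in data) / len(data)             # 3/8
    return p_videos_given_region * p_region / p_videos

assert abs(posterior("Glasgow") - 2 / 3) < 1e-9
assert abs(posterior("Stirling") - 1 / 3) < 1e-9
# With a single piece of evidence the two posteriors sum to 1, as the
# exercise above proves.
assert abs(posterior("Glasgow") + posterior("Stirling") - 1.0) < 1e-9
```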

Slide 56

Why use Bayesian Classifiers?

•  There are several classification methods; no single one has been found to be superior to all others in every case (i.e. on every data set drawn from a particular domain of interest)

•  Methods can be compared based on: –  accuracy –  interpretability of the results –  robustness of the method with different datasets –  training time –  scalability

Slide 57

An Important Remark

Mind the difference between

calculating probabilities from a known set of outcomes (for example, tossing a coin, where we know the two outcomes are heads or tails), and estimating the probability of an event from data, where the event is not directly expressed by the data.

The probabilities calculated from data are estimates, not true values. If we tossed a coin 10 times to generate data, we might easily get 6 heads and 4 tails. Without knowing how coins work, we would estimate the probability of getting heads as 6/10: not a bad estimate, but incorrect. The more data we have, the more reliable our estimates get.

Results can be dramatically sensitive to the specific evidence you have. E.g. suppose your evidence gives the estimate P(Head) = 0.6; then you can estimate the probability of getting 6 heads in a row, even if it didn't happen in your data:

0.6 * 0.6 * 0.6 * 0.6 * 0.6 * 0.6 ≈ 0.047, i.e. about 1 in 21 tries! (With P(Head) = 0.5 it is 0.016.)
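The arithmetic, spelled out as a trivial check:

```python
# Probability of 6 heads in a row under the two values of P(Head).
p_estimated = 0.6 ** 6   # from the data-derived estimate P(Head) = 0.6
p_fair = 0.5 ** 6        # from a fair coin

assert abs(p_estimated - 0.046656) < 1e-9    # about 0.047, i.e. 1 in ~21 tries
assert abs(p_fair - 0.015625) < 1e-9         # about 0.016
```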

Slide 58

Bayesian Belief Networks


Slide 59

Modern pattern recognition is based on probabilities, mainly on the sum and product rules.

All of it could be treated algebraically. However, graphical models offer advantages in terms of:

•  visualisation, e.g. of structure and relationships;
•  communication: they are easy to grasp;
•  expressiveness, e.g. graphical manipulations corresponding to mathematical operations.

Probabilistic graphical models.

Slide 60

•  What are they?

–  Bayesian Belief Networks (BBNs) are a way of modelling probabilities based on data or knowledge to allow probabilities of new events to be estimated

•  What are they used for?

–  Intelligent decision aids, data fusion, intelligent diagnostic aids, automated free text understanding, data mining

•  Where did they come from?

–  Cross-fertilization of ideas between the artificial intelligence, decision analysis, and statistics communities

Bayesian Belief Networks

Slide 61

Definition of a Bayesian Network

•  Factored joint probability distribution as a directed graph:

•  structure for representing knowledge about uncertain variables •  computational architecture for computing (propagating) the impact of evidence

on beliefs

•  Knowledge structure:

•  random variables are depicted as nodes •  arcs represent probabilistic dependence between variables •  conditional probabilities encode the strength of the dependencies

•  Computational architecture:

•  computes posterior probabilities given evidence about selected nodes •  exploits probabilistic independence for efficient computation

(Figure: a minimal two-node network with nodes B and F.)

Slide 62

(Figure: the "Asia" network. Patient Information: Visit to Asia, Smoking. Medical Difficulties: Tuberculosis, Lung Cancer, Bronchitis, Tuberculosis or Cancer. Diagnostic Tests: XRay Result, Dyspnea.)

Example from Medical Diagnostics


Slide 63

•  Relationship knowledge is modeled by deterministic functions, logic and conditional probability distributions

(Figure: the same network, annotated with example tables.)

"Tuberculosis or Cancer" is a deterministic OR of its parents:

Tuberculosis   Lung Cancer   Tub or Can
Present        Present       True
Present        Absent        True
Absent         Present       True
Absent         Absent        False

Conditional probability table for Dyspnea:

Tub or Can   Bronchitis   Dyspnea Present   Dyspnea Absent
True         Present      0.90              0.10
True         Absent       0.70              0.30
False        Present      0.80              0.20
False        Absent       0.10              0.90

Example from Medical Diagnostics

Slide 64

•  Propagation algorithm processes relationship information to provide likelihood information for occurrence of each state for each node

Beliefs (%):

Tuberculosis             Present 1.04    Absent 99.0
XRay Result              Abnormal 11.0   Normal 89.0
Tuberculosis or Cancer   True 6.48       False 93.5
Lung Cancer              Present 5.50    Absent 94.5
Dyspnea                  Present 43.6    Absent 56.4
Bronchitis               Present 45.0    Absent 55.0
Visit To Asia            Visit 1.00      No Visit 99.0
Smoking                  Smoker 50.0     NonSmoker 50.0

Example from Medical Diagnostics

Slide 65

•  As a finding is entered, the propagation algorithm updates the beliefs attached to each relevant node in the network

•  Interviewing the patient produces the information that "Visit to Asia" is "Visit"
•  This finding propagates through the network and the belief functions of several nodes are updated

Beliefs (%):

Tuberculosis             Present 5.00    Absent 95.0
XRay Result              Abnormal 14.5   Normal 85.5
Tuberculosis or Cancer   True 10.2       False 89.8
Lung Cancer              Present 5.50    Absent 94.5
Dyspnea                  Present 45.0    Absent 55.0
Bronchitis               Present 45.0    Absent 55.0
Visit To Asia            Visit 100       No Visit 0
Smoking                  Smoker 50.0     NonSmoker 50.0

Example from Medical Diagnostics
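The displayed beliefs can be reproduced by hand for the Tuberculosis node. A sketch: P(Tub | visit) = 0.05 is read off the beliefs after the Visit finding is entered; P(Tub | no visit) = 0.01 is an assumed CPT value not shown on the slides, chosen to be consistent with the displayed numbers:

```python
# Marginal belief for Tuberculosis before any finding is entered,
# via the law of total probability over "Visit to Asia".
p_visit = 0.01                  # prior: Visit 1.00%
p_tub_given_visit = 0.05        # belief shown once Visit is observed
p_tub_given_no_visit = 0.01     # ASSUMPTION: not shown on the slides

p_tub = (p_tub_given_visit * p_visit
         + p_tub_given_no_visit * (1 - p_visit))

assert abs(p_tub - 0.0104) < 1e-9    # matches the displayed 1.04%
```

This is exactly the propagation the network automates, node by node, whenever a finding is entered.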

Slide 66

Beliefs (%):

Tuberculosis             Present 5.00    Absent 95.0
XRay Result              Abnormal 18.5   Normal 81.5
Tuberculosis or Cancer   True 14.5       False 85.5
Lung Cancer              Present 10.0    Absent 90.0
Dyspnea                  Present 56.4    Absent 43.6
Bronchitis               Present 60.0    Absent 40.0
Visit To Asia            Visit 100       No Visit 0
Smoking                  Smoker 100      NonSmoker 0

•  Further interviewing of the patient produces the finding "Smoking" is "Smoker"
•  This information propagates through the network

Example from Medical Diagnostics


Slide 67

Beliefs (%):

Tuberculosis             Present 0.12    Absent 99.9
XRay Result              Abnormal 0      Normal 100
Tuberculosis or Cancer   True 0.36       False 99.6
Lung Cancer              Present 0.25    Absent 99.8
Dyspnea                  Present 52.1    Absent 47.9
Bronchitis               Present 60.0    Absent 40.0
Visit To Asia            Visit 100       No Visit 0
Smoking                  Smoker 100      NonSmoker 0

•  Finished with interviewing the patient, the physician begins the examination
•  The physician now moves to specific diagnostic tests such as an X-Ray, which results in a "Normal" finding, which propagates through the network

•  Note that the information from this finding propagates backward and forward through the arcs

Example from Medical Diagnostics

Slide 68

Beliefs (%):

Tuberculosis             Present 0.19    Absent 99.8
XRay Result              Abnormal 0      Normal 100
Tuberculosis or Cancer   True 0.56       False 99.4
Lung Cancer              Present 0.39    Absent 99.6
Dyspnea                  Present 100     Absent 0
Bronchitis               Present 92.2    Absent 7.84
Visit To Asia            Visit 100       No Visit 0
Smoking                  Smoker 100      NonSmoker 0

•  The physician also determines that the patient is having difficulty breathing; the finding "Present" is entered for "Dyspnea" and is propagated through the network

•  The doctor might now conclude that the patient has bronchitis and does not have tuberculosis or lung cancer

Example from Medical Diagnostics

Slide 69

Applications

•  Industrial
–  Processor Fault Diagnosis, by Intel
–  Auxiliary Turbine Diagnosis (GEMS), by GE
–  Diagnosis of space shuttle propulsion systems (VISTA), by NASA/Rockwell
–  Situation assessment for nuclear power plant, NRC

•  Military
–  Automatic Target Recognition, MITRE
–  Autonomous control of unmanned underwater vehicle, Lockheed Martin
–  Assessment of Intent

•  Medical Diagnosis
–  Internal Medicine
–  Pathology diagnosis (Intellipath), by Chapman & Hall
–  Breast Cancer Manager with Intellipath

•  Commercial
–  Financial Market Analysis
–  Information Retrieval
–  Software troubleshooting and advice, Windows 95 & Office 97
–  Pregnancy and Child Care, Microsoft
–  Software debugging, American Airlines' SABRE online reservation system

Slide 70

Glandular fever

•  Suppose we know that on average, 1% of the population have had glandular fever.

P(had_GF) = 0.01

•  Suppose we have a test for having had glandular fever such that:

–  For a person who has had GF the test would give a positive result with probability 0.977

–  For a person who has not had GF the test would give a negative result with probability 0.926

Q: How could this information be represented as a BBN?
Q: How could the BBN be used to find out new information?


Slide 71

Step 1: "feature extraction"

List the information we are given and determine what we can deduce from it.

•  "on average, 1% of the population have had glandular fever"

   P(had GF) = 0.01  =>  P(not had GF) = 0.99

•  "for a person who has had GF the test would give a positive result with probability 0.977"

   Note this is not "a person has had GF"! Rather:

   P(+ve test | person has had GF) = 0.977

   from which the probability of having had GF but receiving a -ve test result is

   P(-ve test | person has had GF) = 1 - P(+ve test | person has had GF) = 0.023

Slide 72

Step 1: "feature extraction"

•  “for a person who has not had GF the test would give a negative result with probability 0.926”

P(-ve test | person has not had GF) = 0.926

from which

P(+ve test | person has not had GF) = 1 - P(-ve test | person has not had GF) = 0.074

Summing up:

P(had GF) = 0.01 P(not had GF) = 0.99

P(+ve test | had GF) = 0.977 P(-ve test | had GF) = 0.023

P(+ve test | not had GF) = 0.074 P(-ve test | not had GF) = 0.926

Slide 73

Define the structure of the BBN. Important issue! Nodes are random variables. Two choices:

1.  Top-level nodes. These have no probabilistic dependency; they could be understood as observations.
2.  Dependency relationships. These describe our interpretation of the dependencies in our model.

Both choices contribute to the structure of the network. Different choices are generally possible. The final structure must not have cycles in the dependency relationship, i.e. the directed graph must be acyclic (a DAG), in order to guarantee efficient propagation algorithms.

In our example:
•  Nodes: Had_GF, Test_Result
•  Relationships: the Had_GF node influences the state of the Test_Result node

Step 2: BBN construction
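The acyclicity requirement can be checked mechanically. A sketch using Kahn's topological-sort algorithm (the two-node GF network passes; a two-node cycle fails):

```python
from collections import deque

def is_dag(nodes, edges):
    """Kahn's algorithm: a directed graph is a DAG iff a topological order exists."""
    indegree = {n: 0 for n in nodes}
    for _, child in edges:
        indegree[child] += 1
    queue = deque(n for n in nodes if indegree[n] == 0)
    visited = 0
    while queue:
        node = queue.popleft()
        visited += 1
        # Remove the node's outgoing edges; enqueue children with no remaining parents.
        for parent, child in edges:
            if parent == node:
                indegree[child] -= 1
                if indegree[child] == 0:
                    queue.append(child)
    return visited == len(nodes)

assert is_dag(["Had_GF", "Test_Result"], [("Had_GF", "Test_Result")])
assert not is_dag(["A", "B"], [("A", "B"), ("B", "A")])
```

BBN tools perform this kind of check when a network is built; here it is just the graph-theoretic condition stated above.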

Slide 74

1 parent related to 1 child model


Slide 75

Finally,

3. Establish values for the Conditional Probability Tables (CPTs), i.e. a representation of the conditional probability of the values of the random variable represented by a node, conditioned on its parent random variable.

Values must be consistent! E.g. each row (over the union of all the possible values) must sum to 1.

              Test +ve (True)   Test -ve (False)
Had GF True     0.977 (97.7%)     0.023 (2.3%)
Had GF False    0.074 (7.4%)      0.926 (92.6%)

Step 2: BBN construction
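The consistency requirement (each row of the CPT summing to 1) is easy to verify mechanically. A sketch, using a hypothetical dictionary layout for the CPT:

```python
# CPT for Test_Result given Had_GF: outer keys are parent states,
# inner keys are child states. Each row must sum to 1.
cpt = {
    "had_gf":     {"pos": 0.977, "neg": 0.023},
    "not_had_gf": {"pos": 0.074, "neg": 0.926},
}

# Consistency check: every row sums to 1 (within floating-point tolerance).
rows_consistent = all(
    abs(sum(row.values()) - 1.0) < 1e-9 for row in cpt.values()
)
```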


[Figure: the two-node network. The Had_GF node has prior probabilities True 0.01 (1%), False 0.99 (99%), and feeds the Test_Result node with the CPT below.]

              Test +ve (True)   Test -ve (False)
Had GF True     0.977 (97.7%)     0.023 (2.3%)
Had GF False    0.074 (7.4%)      0.926 (92.6%)

Step 2: BBN construction


Use the network to determine new information.

Examples of information we may wish to determine:

Q1: Given a person has had GF, what is the probability of a negative test result?
Q2: What is the probability of a +ve test result?
Q3: Given a positive test, what is the probability that the person has had GF?

We will start by looking at the calculation that these questions require. This is actually computed by the network, once it has been properly designed.

Step 3: Use of the BBN


Q1: Given a person has had GF, what is the probability of a negative test?

This is easy, as we have already determined this information in Step 2. Formulated in terms of probability, we wish to find out:

P(-ve test | has had GF)

Looking back a few slides, P(-ve test | has had GF) = 0.023

Step 3: Use of the BBN


Q2: What is the probability of a +ve test result?

We need to think of all the possible situations that could happen which would lead to a +ve test result.

Situation 1. +ve test result and had GF: P(+ve test result ∩ had GF)
Situation 2. +ve test result and not had GF: P(+ve test result ∩ not had GF)

We don’t know the above information, BUT we can now use what we know of conditional probability to calculate it…

Step 3: Use of the BBN


Q2: What is the probability of a +ve test result?

Again from P(A ∩ B) = P(B | A) P(A):

P(+ve test ∩ had GF) = P(had GF | +ve test) * P(+ve test)
P(+ve test ∩ not had GF) = P(not had GF | +ve test) * P(+ve test)

Still, P(+ve test) has not yet been defined. BUT since P(A ∩ B) = P(B ∩ A), then

P(had GF ∩ +ve test) = P(+ve test | had GF) * P(had GF) = 0.977 * 0.01 = 0.00977

P(not had GF ∩ +ve test) = P(+ve test | not had GF) * P(not had GF) = 0.074 * 0.99 = 0.07326

Step 3: Use of the BBN


Q2: What is the probability of a +ve test result? Finally, by the sum rule:

P(+ve test) = P(+ve test ∩ had GF) + P(+ve test ∩ not had GF) = 0.00977 + 0.07326 = 0.08303

If we chose a random person and gave them a test, they would have a probability of about 0.083 of showing positive.

Step 3: Use of the BBN
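The Q2 derivation, P(+ve test) as a sum over both parent states, in a short Python sketch (variable names are mine):

```python
# Law of total probability: marginalise the test result over Had_GF.
p_gf, p_not_gf = 0.01, 0.99
p_pos_given_gf, p_pos_given_not_gf = 0.977, 0.074

p_pos = (p_pos_given_gf * p_gf             # P(+ve ∩ had GF)     = 0.00977
         + p_pos_given_not_gf * p_not_gf)  # P(+ve ∩ not had GF) = 0.07326
# p_pos = 0.08303
```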


Q3: Given a positive test, what is the probability that the person has had GF?

P(has had GF | +ve test) ?

Again, this isn’t a piece of information that has already been defined. We can calculate the answer by exploiting the network and Bayes’ theorem:

P(A | B) = P(B | A) P(A) / P(B)

From Q2 we have P(+ve test) = 0.08303, so

P(has had GF | +ve test) = P(+ve test | has had GF) P(has had GF) / P(+ve test) = (0.977 * 0.01) / 0.08303 = 0.118

Step 3: Use of the BBN
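And Q3 is one line of Bayes’ theorem. A sketch, with assumed variable names:

```python
# Posterior P(had GF | +ve test) via Bayes' theorem.
p_pos_given_gf = 0.977
p_gf = 0.01
p_pos = 0.08303  # marginal P(+ve test), computed in Q2

p_gf_given_pos = p_pos_given_gf * p_gf / p_pos  # ≈ 0.118
```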


Q3: Given a positive test, what is the probability that the person has had GF?

P(has had GF | +ve test) = 0.118

So the test isn’t so useful after all:

– remember that about 7% of those who’ve never had GF get a positive result
– and they far outnumber the people who have had it and got a positive result!

Step 3: Use of the BBN


A BBN model with a hierarchy of 3 nodes


[Figure: the two-node network again. The Had_GF node has prior probabilities True 0.01 (1%), False 0.99 (99%), and feeds the Test_Result node with the CPT below.]

              Test +ve (True)   Test -ve (False)
Had GF True     0.977 (97.7%)     0.023 (2.3%)
Had GF False    0.074 (7.4%)      0.926 (92.6%)

Facts we deduced last time:

P(-ve test | has had GF) = 0.023

P(+ve test) = 0.08303

P(had GF | +ve test) = 0.118

Extension to the Glandular Fever Model


•  Suppose we are informed that the school nurse sends home 80% of students that have a positive GF test.

•  She also sends home 5% of students for other medical reasons (i.e. students that have not had a positive GF test).

Q1. How do we incorporate this new information into our network?

Q2. What is the probability of being sent home?

Q3. Given that a child is sent home, what is the probability of them having had a negative test?

Extension to the Glandular Fever Model


Q1. How do we incorporate new information?

[Figure: the extended three-node network. Had_GF (prior: True 0.01 (1%), False 0.99 (99%)) feeds Test_Result, which in turn feeds the new Sent_home node.]

              Test +ve (True)   Test -ve (False)
Had GF True     0.977 (97.7%)     0.023 (2.3%)
Had GF False    0.074 (7.4%)      0.926 (92.6%)

                 Sent home True   Sent home False
Test +ve True      0.8 (80%)        0.2 (20%)
Test +ve False     0.05 (5%)        0.95 (95%)


We need to add up all the combinations of scenarios which would lead to being sent home:

- due to a +ve test
- due to another reason

P(home) = P(home | +ve test) * P(+ve test) + P(home | -ve test) * P(-ve test)
        = 0.8 * P(+ve test) + 0.05 * P(-ve test)

We need to calculate P(+ve test) and P(-ve test). We already know

P(+ve test) = P(+ve test | had GF) * P(had GF) + P(+ve test | not had GF) * P(not had GF) = 0.00977 + 0.07326 = 0.08303

from which P(-ve test) = 1 - 0.08303 = 0.91697.

And finally

P(home) = 0.8 * P(+ve test) + 0.05 * P(-ve test) = 0.8 * 0.08303 + 0.05 * 0.91697 = 0.1122725 = 0.112 (3 d.p.)

Q2. What is the probability of being sent home?
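The same two-step marginalisation in code (a sketch; names are mine):

```python
# P(home): marginalise over the test result.
p_pos = 0.08303
p_neg = 1 - p_pos          # 0.91697
p_home_given_pos = 0.8     # nurse sends home 80% of +ve-test students
p_home_given_neg = 0.05    # and 5% of the rest, for other reasons

p_home = p_home_given_pos * p_pos + p_home_given_neg * p_neg
# p_home = 0.1122725, i.e. 0.112 (3 d.p.)
```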


We’ve already done a similar calculation to this before:

P(-ve test | home) = P(home| -ve test) * P(-ve test) / P(home)

= (0.05 * 0.91697) / 0.1122725 = 0.408 (3 d.p.)

Q3. Given that a child is sent home, what is the probability of them having had a negative test?
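Again Bayes’ theorem, this time reasoning from the child (Sent_home) back to the parent (Test_Result). A sketch:

```python
# P(-ve test | home) via Bayes' theorem.
p_home_given_neg = 0.05
p_neg = 0.91697
p_home = 0.1122725

p_neg_given_home = p_home_given_neg * p_neg / p_home  # ≈ 0.408
```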


Things to remember about calculations in BBNs

1.  You know parent information and want to find out child-node information: use conditional probability.

2.  You know child information and want to find out parent-node information: use Bayes’ Theorem.

3.  If a node has more than one parent, remember the parents are independent (unless they share an arc). Thus P(parent A ∩ parent B) = P(parent A) * P(parent B).


A BBN model with 2 parents and 1 child


•  Diane does her shopping each week. The bill for her shop is sometimes under £30, sometimes £30 or over. Two factors influence the cost of her shopping: whether she takes her 2 year old son with her, and whether she takes her 40 year old husband with her.

•  If we know Diane has gone shopping by herself, the likelihood that the bill will be less than £30 is 90%. If we know that Diane was accompanied only by her husband, the likelihood that the bill will be less than £30 is 70%, and if we know that Diane took only her son, the likelihood that the bill will be less than £30 is 80%. Given we know both son and husband accompanied Diane to the shops, then the likelihood that the bill is under £30 reduces to 60%.

•  50% of the time Diane’s husband accompanies her to the shops. 60% of the time Diane is accompanied by her son.

How do you imagine a BBN could represent the above information?

Diane’s shopping model (2p 1c)




Questions:
Q1. What is the probability of the bill being under £30?
Q2. Given that the bill is under £30, what is the probability that Diane’s husband (with or without her son) accompanied her to the shops?

Diane’s shopping model (2p 1c)


Firstly, let’s list what we know, ...

P(under_30 | no husband ∩ no son) = 0.9
P(under_30 | husband ∩ no son) = 0.7
P(under_30 | no husband ∩ son) = 0.8
P(under_30 | husband ∩ son) = 0.6
P(husband) = 0.5
P(son) = 0.6

... and what we can deduce ( 1 - P(...) ):

P(not_under_30 | no husband ∩ no son) = 0.1
P(not_under_30 | husband ∩ no son) = 0.3
P(not_under_30 | no husband ∩ son) = 0.2
P(not_under_30 | husband ∩ son) = 0.4
P(no husband) = 0.5
P(no son) = 0.4

Diane’s shopping model (2p 1c)


CPT (Bill under £30?)    True    False
Husband T, Son T          60%     40%
Husband T, Son F          70%     30%
Husband F, Son T          80%     20%
Husband F, Son F          90%     10%

Diane’s shopping model (2p 1c)


Firstly, list the situations that lead to the bill being under £30:

P(under_30) = P(under_30 | no husband ∩ no son) * P(no husband ∩ no son)
            + P(under_30 | no husband ∩ son) * P(no husband ∩ son)
            + P(under_30 | husband ∩ no son) * P(husband ∩ no son)
            + P(under_30 | husband ∩ son) * P(husband ∩ son)

What do we know about the presence of Diane’s husband and the presence of her son? Are these two events dependent on each other in any way? No, these events are independent. Independent events are those where the occurrence of one event does not impact on the occurrence of the other:

P(A | B) = P(A)

Therefore, if A and B are independent events,

P(A, B) = P(A | B) * P(B) = P(A) * P(B)

Q1. What is the probability of the bill being under £30?


Firstly, list the situations that lead to the bill being under £30:

P(under_30) = P(under_30 | no husband ∩ no son) * P(no husband ∩ no son)
            + P(under_30 | no husband ∩ son) * P(no husband ∩ son)
            + P(under_30 | husband ∩ no son) * P(husband ∩ no son)
            + P(under_30 | husband ∩ son) * P(husband ∩ son)

P(no husband ∩ no son) = P(no husband) * P(no son) = 0.5 * 0.4 = 0.2

P(no husband ∩ son) = P(no husband) * P(son) = 0.5 * 0.6 = 0.3

P(husband ∩ no son) = P(husband) * P(no son) = 0.5 * 0.4 = 0.2

P(husband ∩ son) = P(husband) * P(son) = 0.5 * 0.6 = 0.3

Q1. What is the probability of the bill being under £30?
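Since husband and son are independent, all four joint probabilities factorise. A quick check in Python (names are mine):

```python
# Joint probabilities of the two independent parent variables.
p_husband, p_son = 0.5, 0.6

joints = {
    ("no_husband", "no_son"): (1 - p_husband) * (1 - p_son),  # 0.2
    ("no_husband", "son"):    (1 - p_husband) * p_son,        # 0.3
    ("husband", "no_son"):    p_husband * (1 - p_son),        # 0.2
    ("husband", "son"):       p_husband * p_son,              # 0.3
}

# The four joints cover every parent combination, so they sum to 1.
total = sum(joints.values())
```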


Firstly, list the situations that lead to the bill being under £30:

P(under_30) = P(under_30 | no husband ∩ no son) * P(no husband ∩ no son)
            + P(under_30 | no husband ∩ son) * P(no husband ∩ son)
            + P(under_30 | husband ∩ no son) * P(husband ∩ no son)
            + P(under_30 | husband ∩ son) * P(husband ∩ son)

P(under_30) = (0.9 * 0.2) + (0.8 * 0.3) + (0.7 * 0.2) + (0.6 * 0.3) = 0.74

Q1. What is the probability of the bill being under £30?
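Combining the CPT rows with the joint parent probabilities gives the marginal. In code (a sketch; the dictionary layout is mine):

```python
# Marginal P(under_30): weight each CPT entry by its parent joint.
# Keys are (husband present?, son present?).
cpt_under30 = {  # P(under_30 | husband?, son?)
    (False, False): 0.9,
    (False, True):  0.8,
    (True,  False): 0.7,
    (True,  True):  0.6,
}
joint = {        # P(husband?, son?) = product of independent marginals
    (False, False): 0.5 * 0.4,
    (False, True):  0.5 * 0.6,
    (True,  False): 0.5 * 0.4,
    (True,  True):  0.5 * 0.6,
}

p_under30 = sum(cpt_under30[k] * joint[k] for k in cpt_under30)  # 0.74
```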


Q2. Given that the bill is under £30, what is the probability that Diane’s husband (with or without her son) accompanied her to the shops?

Two scenarios:

P(husband | under £30) = P(husband ∩ son | under £30) + P(husband ∩ no son | under £30)

P(husband ∩ son | under £30) {using Bayes’}
  = P(under £30 | husband ∩ son) * P(husband ∩ son) / P(under £30)
  = (0.6 * 0.3) / 0.74 = 0.2432... (recurring)

P(husband ∩ no son | under £30) {using Bayes’}
  = P(under £30 | husband ∩ no son) * P(husband ∩ no son) / P(under £30)
  = (0.7 * 0.2) / 0.74 = 0.1891... (recurring)

Overall:

P(husband | under £30) = 0.2432... + 0.1891... = 0.432 (3 d.p.)
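The Q2 posterior, summing the two Bayes terms (a sketch; names are mine):

```python
# P(husband | under £30): sum of the two husband scenarios.
p_under30 = 0.74
p_h_and_s_given_u = 0.6 * 0.3 / p_under30  # husband and son present
p_h_no_s_given_u  = 0.7 * 0.2 / p_under30  # husband present, no son

p_husband_given_u = p_h_and_s_given_u + p_h_no_s_given_u  # ≈ 0.432
```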


BBNs:
•  Development of propagation algorithms was followed by the availability of easy-to-use commercial software
•  Growing number of creative applications, e.g. dementia diagnosis, cancer-care symptom modelling, likelihood of car purchase, ...
•  Different from other knowledge-based systems tools because uncertainty is handled in a mathematically rigorous yet efficient and simple way
•  Different from other probabilistic analysis tools because of:
   - the network representation of problems,
   - the use of Bayesian statistics, and
   - the synergy between these.
•  Issue: how do we build a network?

Why are BBNs interesting?


Outline.

1.  Introduction 2.  Probability 3.  Bayesian classification 4.  Bayesian Belief Networks

