Probabilistic Graphical Models: Bayesian Networks
Vasant Honavar
Artificial Intelligence Research Laboratory
Department of Computer Science
Bioinformatics and Computational Biology Program
Center for Computational Intelligence, Learning, & Discovery
Iowa State University
[email protected]
www.cs.iastate.edu/~honavar/
www.cild.iastate.edu | www.bcb.iastate.edu | www.igert.iastate.edu
Inference by enumeration
• Start with the joint probability distribution, e.g., over Toothache, Catch, Cavity:

           toothache,catch  toothache,¬catch  ¬toothache,catch  ¬toothache,¬catch
  cavity        0.108             0.012             0.072              0.008
  ¬cavity       0.016             0.064             0.144              0.576

• For any proposition φ, sum the atomic events where it is true: P(φ) = Σ_{ω : ω ⊨ φ} P(ω)
• P(toothache) = 0.108 + 0.012 + 0.016 + 0.064 = 0.2
• Can also compute conditional probabilities:
  P(¬cavity | toothache) = P(¬cavity ∧ toothache) / P(toothache)
                         = (0.016 + 0.064) / (0.108 + 0.012 + 0.016 + 0.064) = 0.4
Normalization
The denominator can be viewed as a normalization constant α:
  P(Cavity | toothache) = α P(Cavity, toothache)
    = α [P(Cavity, toothache, catch) + P(Cavity, toothache, ¬catch)]
    = α [⟨0.108, 0.016⟩ + ⟨0.012, 0.064⟩] = α ⟨0.12, 0.08⟩ = ⟨0.6, 0.4⟩
General idea: compute distribution on query variable by fixing evidence variables and summing over unobserved variables
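A minimal Python sketch of inference by enumeration and normalization on the toothache example above (the dictionary encoding and helper names are my own):

```python
# Full joint P(Cavity, Toothache, Catch); values from the table above.
joint = {
    (True,  True,  True):  0.108, (True,  True,  False): 0.012,
    (True,  False, True):  0.072, (True,  False, False): 0.008,
    (False, True,  True):  0.016, (False, True,  False): 0.064,
    (False, False, True):  0.144, (False, False, False): 0.576,
}

def prob(phi):
    """P(phi): sum the atomic events (cavity, toothache, catch) where phi holds."""
    return sum(p for event, p in joint.items() if phi(*event))

print(prob(lambda cavity, toothache, catch: toothache))  # 0.2

# P(Cavity | toothache): fix the evidence, sum out Catch, then normalize.
unnormalized = {cv: prob(lambda c, t, k: c == cv and t) for cv in (True, False)}
alpha = 1.0 / sum(unnormalized.values())
print({cv: alpha * p for cv, p in unnormalized.items()})  # {True: 0.6, False: 0.4}
```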
Inference by enumeration, continued
Typically, we are interested in the posterior joint distribution of the query variables Y given specific values e for the evidence variables E
Let the hidden (unobserved) variables be H = X − Y − E
Then the required summation of joint entries is done by summing out the hidden variables:
  P(Y | E = e) = α P(Y, E = e) = α Σ_h P(Y, E = e, H = h)
• The terms in the summation are joint entries because Y, E, and H together exhaust the set of random variables
Inference by enumeration, continued
• Obvious problems:
  1. Worst-case time complexity O(dⁿ), where d is the largest arity
  2. Space complexity O(dⁿ) to store the joint distribution
  3. How to find the numbers for O(dⁿ) entries?
Independence
• A and B are independent iff
  P(A | B) = P(A) or P(B | A) = P(B) or P(A, B) = P(A) P(B)
• P(Toothache, Catch, Cavity, Weather) = P(Toothache, Catch, Cavity) P(Weather)
• 32 entries reduced to 12; for n independent biased coins, O(2ⁿ) → O(n)
• Absolute independence is powerful but rare
• How can we manage a large number of variables?
Conditional independence
• P(Toothache, Cavity, Catch) has 2³ − 1 = 7 independent entries
• If I have a cavity, the probability that the probe catches in it doesn't depend on whether I have a toothache:
  – P(catch | toothache, cavity) = P(catch | cavity)
• The same independence holds if I haven't got a cavity:
  – P(catch | toothache, ¬cavity) = P(catch | ¬cavity)
• Catch is conditionally independent of Toothache given Cavity:
  – P(Catch | Toothache, Cavity) = P(Catch | Cavity)
Conditional independence
• Catch is conditionally independent of Toothache given Cavity:
  – P(Catch | Toothache, Cavity) = P(Catch | Cavity)
• Equivalent statements:
  – P(Toothache | Catch, Cavity) = P(Toothache | Cavity)
  – P(Toothache, Catch | Cavity) = P(Toothache | Cavity) P(Catch | Cavity)
Conditional independence
• Write out the full joint distribution using the chain rule:
  P(Toothache, Catch, Cavity)
  = P(Toothache | Catch, Cavity) P(Catch, Cavity)
  = P(Toothache | Catch, Cavity) P(Catch | Cavity) P(Cavity)
  = P(Toothache | Cavity) P(Catch | Cavity) P(Cavity)
  i.e., 2 + 2 + 1 = 5 independent numbers
• Conditional independence
  – often reduces the size of the representation of the joint distribution from exponential in n to linear in n
  – is one of the most basic and robust forms of knowledge about uncertain environments
Conditional Independence
X is conditionally independent of Y given Z if the probability distribution governing X is independent of the value of Y given the value of Z:
  P(X | Y, Z) = P(X | Z), that is,
  ∀(xᵢ, yⱼ, z_k)  P(X = xᵢ | Y = yⱼ, Z = z_k) = P(X = xᵢ | Z = z_k)
Independence and Conditional Independence
Let Z₁, …, Zₙ and W be pairwise disjoint sets of random variables on a given event space.
Z₁ ∪ … ∪ Zₙ are mutually independent given W if
  P(Z₁, …, Zₙ | W) = ∏_{i=1}^{n} P(Zᵢ | W)
Z₁ and Z₂ are independent given W if
  P(Z₁, Z₂ | W) = P(Z₁ | W) P(Z₂ | W)
Note that these represent sets of equations, for all possible value assignments to the random variables.
Independence Properties of Random Variables
Let W, X, Y, Z be pairwise disjoint sets of random variables on a given event space.
Let I(X, Z, Y) denote that X and Y are independent given Z. That is,
  P(X | Y, Z) = P(X | Z), or P(X, Y | Z) = P(X | Z) P(Y | Z)
Then:
  a. I(X, Z, Y) ⇒ I(Y, Z, X)                        (symmetry)
  b. I(X, Z, Y ∪ W) ⇒ I(X, Z, Y)                    (decomposition)
  c. I(X, Z, Y ∪ W) ⇒ I(X, Z ∪ Y, W)                (weak union)
  d. I(X, Z, Y) ∧ I(X, Z ∪ Y, W) ⇒ I(X, Z, Y ∪ W)   (contraction)
Proof: Follows from the definition of independence.
Bayes Rule
Does the patient have cancer or not?
A patient takes a lab test and the result comes back positive. The test returns a correct positive result in only 98% of the cases in which the disease is actually present, and a correct negative result in only 97% of the cases in which the disease is not present. Furthermore, 0.008 of the entire population has this cancer.
  P(cancer) =             P(¬cancer) =
  P(+ | cancer) =         P(− | cancer) =
  P(+ | ¬cancer) =        P(− | ¬cancer) =
Bayes Rule
Does the patient have cancer or not?
  P(cancer) = 0.008          P(¬cancer) = 0.992
  P(+ | cancer) = 0.98       P(− | cancer) = 0.02
  P(+ | ¬cancer) = 0.03      P(− | ¬cancer) = 0.97

  P(cancer | +) = P(+ | cancer) P(cancer) / P(+)
  P(¬cancer | +) = P(+ | ¬cancer) P(¬cancer) / P(+)
  P(+ | cancer) P(cancer) = 0.98 × 0.008 = 0.0078
  P(+ | ¬cancer) P(¬cancer) = 0.03 × 0.992 = 0.0298
  P(+) = 0.0078 + 0.0298 = 0.0376
  P(cancer | +) ≈ 0.21;  P(¬cancer | +) ≈ 0.79
The patient, more likely than not, does not have cancer.
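A quick numeric check of the computation above (a sketch; the priors and likelihoods are the slide's numbers):

```python
# Bayes rule on the cancer test example.
p_cancer = 0.008
p_pos_given_cancer = 0.98       # P(+ | cancer)
p_pos_given_no_cancer = 0.03    # P(+ | ¬cancer)

# Contribution of each hypothesis to the positive result.
joint_cancer = p_pos_given_cancer * p_cancer              # 0.0078
joint_no_cancer = p_pos_given_no_cancer * (1 - p_cancer)  # 0.0298

p_pos = joint_cancer + joint_no_cancer                    # normalizer P(+)
print(joint_cancer / p_pos, joint_no_cancer / p_pos)      # ~0.21, ~0.79
```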
Bayes Rule
• Product rule:
  – P(a ∧ b) = P(a | b) P(b) = P(b | a) P(a)
  – Bayes' rule: P(a | b) = P(b | a) P(a) / P(b)
• In distribution form:
  P(Y | X) = P(X | Y) P(Y) / P(X) = α P(X | Y) P(Y)
Bayes' Rule and conditional independence
P(Cavity | toothache ∧ catch)
  = α P(toothache ∧ catch | Cavity) P(Cavity)
  = α P(toothache | Cavity) P(catch | Cavity) P(Cavity)
• This is an example of a naïve Bayes (idiot Bayes) model:
  – P(Cause, Effect₁, …, Effectₙ) = P(Cause) ∏ᵢ P(Effectᵢ | Cause)
• Total number of parameters is linear in n
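A minimal sketch of this naïve Bayes computation for the cavity example (the CPT numbers below are illustrative placeholders; the slides do not give them):

```python
# P(Cause, e1..en) = P(Cause) * prod_i P(ei | Cause); numbers are made up.
p_cavity = {True: 0.2, False: 0.8}
p_toothache = {True: 0.6, False: 0.1}  # P(toothache = T | Cavity)
p_catch = {True: 0.9, False: 0.2}      # P(catch = T | Cavity)

def posterior_cavity(toothache, catch):
    """P(Cavity | toothache, catch) via the naive Bayes factorization."""
    score = {}
    for cavity in (True, False):
        lt = p_toothache[cavity] if toothache else 1 - p_toothache[cavity]
        lc = p_catch[cavity] if catch else 1 - p_catch[cavity]
        score[cavity] = p_cavity[cavity] * lt * lc
    alpha = 1.0 / sum(score.values())
    return {cv: alpha * s for cv, s in score.items()}

print(posterior_cavity(True, True))
```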
Bayesian Networks
Exploiting conditional independence and graphical representation for reasoning under uncertainty
• Review of graphs
• Review of independence and conditional independence
• Directed graphical models and probability distributions
• Querying a probability distribution – inference
Review of basic concepts of graphs
Undirected graph G1 = (V, E1); directed graph G2 = (V, E2)
Vertex set V = { A, B, C, D, E }
Edge set E1 = { A–B, B–D, D–E, A–C, C–E }
Edge set E2 = { A→B, B→D, D→E, A→C, C→E }
(Figure: G1 and G2 drawn on the vertices A, B, C, D, E)
Review of basic concepts of graphs
Adjacency set of a node – immediate neighbors reachable through undirected (directed) links
In G1, Adj(A) = {C, D}; Adj(B) = {D}; Adj(D) = {A, B, E}; Adj(E) = ∅
In G2, Adj(A) = {C, D}; Adj(B) = {D}; Adj(D) = {E}; Adj(E) = {C, D}
Path between two nodes – an ordered list of nodes starting with the first node and ending with the second, in which each successive node is in the adjacency list of the preceding node
(Figure: G1 and G2 on the vertices A, B, C, D, E)
Properties of undirected graphs
• Complete graph – there is a link between every pair of nodes
• Complete set – a subset of nodes in a graph is said to be complete if there is a link between every pair of nodes in the subset
• Clique – a complete set of nodes is said to be a clique if it is maximal, i.e., it is not a proper subset of another complete set
The vertex set {A, B, C, D, E} of the complete graph G1 shown is a complete set and also the only clique in G1
(Figure: the complete graph G1)
Properties of undirected graphs
Identify the cliques in the graph shown.
Two cliques: {A, C, D} and {B, C, D, E}
(Figure: a graph on A, B, C, D, E)
Properties of undirected graphs
Neighbors of a node: in G1, Neighbors(A) = {C, D}
Boundary of a set of nodes S – the union of the neighbors of nodes in S, excluding the nodes in S:
  Boundary({C, D}) = {A, E, D} ∪ {A, B, C, E} − {C, D}
                   = {A, B, C, D, E} − {C, D} = {A, B, E}
(Figure: G1 on A, B, C, D, E)
Properties of undirected graphs
A graph is said to be connected if there exists at least one path between any pair of nodes
G1 is connected; G2 is not connected
(Figure: G1 connected and G2 disconnected, each on A, C, D, E)
Properties of undirected graphs
A connected undirected graph is a tree if for every pair of nodes there is a unique path
G1 is a tree
An undirected graph is said to be multiply connected if at least one pair of nodes is connected by more than one path – i.e., there is at least one loop
G2 is not a tree (it is multiply connected)
(Figure: G1 a tree and G2 multiply connected, each on A, C, D, E)
Properties of undirected graphs
Chord of a loop – a link between two nodes in a loop that is not itself part of the loop
C–D is a chord of the loop A–C–E–D–A
A–E is a chord of the loop A–C–E–D–A
A chord decomposes a loop into two smaller loops
The loop A–E–D–A does not have a chord
(Figure: G1 on A, C, D, E)
Properties of undirected graphs
Triangulated graph – an undirected graph is said to be triangulated if every loop of length 4 or greater has at least one chord
C–D is a chord of the loop A–C–E–D–A
G1 is not triangulated; G2 is triangulated
Triangulation does not mean dividing the graph into triangles!
(Figure: G1 without the chord C–D and G2 with it, each on A, C, D, E)
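To check triangulation programmatically, networkx's `is_chordal` tests exactly this property; a small sketch on the 4-cycle discussed above (the edge lists are my encoding of it):

```python
import networkx as nx

# G1: the loop A-C-E-D-A with no chord -- not triangulated.
g1 = nx.Graph([("A", "C"), ("C", "E"), ("E", "D"), ("D", "A")])

# G2: the same loop plus the chord C-D -- triangulated.
g2 = nx.Graph(list(g1.edges) + [("C", "D")])

print(nx.is_chordal(g1))  # False
print(nx.is_chordal(g2))  # True
```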
Properties of undirected graphs
Triangulation – the process of adding chords to make the graph triangulated
There may be multiple ways to triangulate a graph
A triangulation is said to be minimal if it contains the minimum number of chords
G2 is a minimal triangulation; G3 is not a minimal triangulation
Finding a minimal triangulation is NP-hard
There is a greedy algorithm for triangulating a graph (Tarjan and Yannakakis, 1984)
(Figure: G1 untriangulated; G2 and G3 triangulations of G1, each on A, C, D, E)
Properties of undirected graphs
Triangulated graphs have the running intersection property:
there exists an ordering of cliques C1 … Cn such that Ci ∩ (C1 ∪ C2 ∪ … ∪ Ci−1) is contained in at least one of the cliques C1, …, Ci−1, for all i = 1..n
An ordering of cliques satisfying the running intersection property is called a chain of cliques
An undirected graph has an associated chain of cliques iff it is triangulated
Ordering: {A, C, E, D}, {C, D, E, F}
(Figure: G1 on A, C, D, E, F)
Properties of undirected graphs
Cluster – a subset of nodes of a graph
Cluster graph:
• nodes are clusters
• there is an edge between two nodes if and only if the clusters contain common nodes
Clique graph of an undirected graph – a cluster graph in which the clusters correspond to the cliques of the original graph
A clique graph is called a join (or junction) graph if it contains all the possible links between cliques with a common node
The join graph of an undirected graph is unique
Properties of undirected graphs
A clique graph is a join tree (junction tree) if it is a tree and every node that belongs to two clusters also belongs to every cluster on the path between the two clusters
An undirected graph has a join tree if and only if it is triangulated
(Figure: a graph on A, B, C, D, E, F, G, H, I; its join graph; and a join tree with clusters A,B,C; B,C,E; B,D,E; C,F; D,G; D,H; E,I)
Properties of undirected graphs
An undirected graph has a join tree if and only if it is triangulated
There is no join tree for this graph
(Figure: a non-triangulated 4-cycle on A, B, C, D and its join graph with clusters A,B; A,C; B,D; C,D)
Properties of Directed Graphs
Parents(D) = {A, B}
Children(A) = {C, D}
Family(D) = {A, B, D} (a node and its parents)
Ancestors(E) = {A, B, C, D}
Ancestors(A) = ∅
Ancestral numbering – a numbering of nodes such that the number of any node is less than that of its children
(Figure: a DAG on A, B, C, D, E)
Properties of Directed Graphs
Undirected graph associated with a directed graph – drop the directionality of the links
Moral graph – obtained by linking every pair of nodes that share a common child and then dropping the directions on the links
(Figure: a directed graph, its associated undirected graph, and its moral graph, each on A, B, C, D, E)
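A sketch of moral graph construction with networkx (the helper name `moralize` is mine, and the example DAG is illustrative rather than the slide's figure):

```python
import networkx as nx
from itertools import combinations

def moralize(dag: nx.DiGraph) -> nx.Graph:
    """Marry all co-parents of each node, then drop edge directions."""
    moral = dag.to_undirected()
    for node in dag.nodes:
        for p1, p2 in combinations(dag.predecessors(node), 2):
            moral.add_edge(p1, p2)  # link co-parents sharing the child `node`
    return moral

# Illustrative DAG in which A and B are co-parents of D.
dag = nx.DiGraph([("A", "C"), ("A", "D"), ("B", "D"), ("D", "E")])
print(sorted(moralize(dag).edges))  # includes the added moral edge ('A', 'B')
```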
Properties of Directed Graphs
A cycle in a directed graph – a closed directed path
A directed graph is acyclic (a DAG) if it has no directed cycles
A directed graph is connected if the associated undirected graph is connected
A connected directed graph is a tree if the associated undirected graph is a tree; otherwise it is multiply connected
Simple directed tree – every node has at most one parent; otherwise it is a polytree
(Figure: a directed acyclic graph, a simple tree, and a polytree)
Representation of graphs
Graphical representation vs. numerical representation:
• Adjacency matrix – entry (i, j) is 1 if there is an edge from node i to node j
• Successive powers A¹, A², … of the adjacency matrix give the number of paths of length 1, 2, …
• Attainability matrix – entry (i, j) is 1 if there is a path from node i to node j
• If there is a path between two nodes, there is a path of length less than N, where N is the number of nodes in the graph
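A small numeric illustration of these facts (a sketch with numpy; the 4-node digraph is an arbitrary example):

```python
import numpy as np

# Adjacency matrix of a small directed graph: 0->1, 0->2, 1->2, 2->3.
A = np.array([[0, 1, 1, 0],
              [0, 0, 1, 0],
              [0, 0, 0, 1],
              [0, 0, 0, 0]])

# (A^k)[i, j] counts the directed paths of length k from i to j.
print(np.linalg.matrix_power(A, 2))  # entry (0, 2) is 1: the path 0->1->2

# Attainability: i reaches j iff some A^k (1 <= k < N) is nonzero at (i, j),
# since any path can be shortened to one of length less than N.
N = A.shape[0]
reach = sum(np.linalg.matrix_power(A, k) for k in range(1, N)) > 0
print(reach.astype(int))
```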
Building Probabilistic Models – Conditional Independence
• Random variable X is conditionally independent of Y given Z if the probability distribution governing X is independent of the value of Y given the value of Z:
• P(X | Y, Z) = P(X | Z), that is,
  ∀(xᵢ, yⱼ, z_k)  P(X = xᵢ | Y = yⱼ, Z = z_k) = P(X = xᵢ | Z = z_k)
Conditional Independence
P(Thunder = 1 | Rain = 1, Lightning = 1) = P(Thunder = 1 | Lightning = 1) = P(Thunder = 1 | Rain = 0, Lightning = 1)
P(Thunder = 1 | Rain = 1, Lightning = 0) = P(Thunder = 1 | Lightning = 0) = P(Thunder = 1 | Rain = 0, Lightning = 0)
P(Thunder = 0 | Rain = 1, Lightning = 1) = P(Thunder = 0 | Lightning = 1) = P(Thunder = 0 | Rain = 0, Lightning = 1)
P(Thunder = 0 | Rain = 1, Lightning = 0) = P(Thunder = 0 | Lightning = 0) = P(Thunder = 0 | Rain = 0, Lightning = 0)
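A sketch verifying these equations numerically on a joint distribution constructed so that Thunder is conditionally independent of Rain given Lightning (the CPT numbers are made up for illustration):

```python
from itertools import product

p_l = {1: 0.3, 0: 0.7}    # P(Lightning)
p_r = {1: 0.9, 0: 0.2}    # P(Rain = 1 | Lightning)
p_t = {1: 0.8, 0: 0.05}   # P(Thunder = 1 | Lightning)

joint = {}
for l, r, t in product([0, 1], repeat=3):
    pr = p_r[l] if r else 1 - p_r[l]
    pt = p_t[l] if t else 1 - p_t[l]
    joint[(l, r, t)] = p_l[l] * pr * pt

def p_thunder(t, l, r=None):
    """P(Thunder = t | Lightning = l [, Rain = r])."""
    match = [e for e in joint if e[0] == l and (r is None or e[1] == r)]
    den = sum(joint[e] for e in match)
    num = sum(joint[e] for e in match if e[2] == t)
    return num / den

# P(T=1 | R=1, L=1) = P(T=1 | L=1) = P(T=1 | R=0, L=1)
print(p_thunder(1, l=1, r=1), p_thunder(1, l=1), p_thunder(1, l=1, r=0))
```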
Conditional Independence
Let Z₁, …, Zₙ and W be random variables on a given event space.
Z₁, …, Zₙ are mutually independent given W if
  P(Z₁, Z₂, …, Zₙ | W) = ∏_{i=1}^{n} P(Zᵢ | W)
Z₁ and Z₂ are independent given W if
  P(Z₁, Z₂ | W) = P(Z₁ | W) P(Z₂ | W)
Note that these represent sets of equations, for all possible value assignments to the random variables.
Independence and Conditional Independence
Let Z₁, …, Zₙ and W be pairwise disjoint sets of random variables on a given event space.
Z₁ ∪ … ∪ Zₙ are mutually independent given W if
  P(Z₁, …, Zₙ | W) = ∏_{i=1}^{n} P(Zᵢ | W)
Z₁ and Z₂ are independent given W if
  P(Z₁, Z₂ | W) = P(Z₁ | W) P(Z₂ | W)
Note that these represent sets of equations, for all possible value assignments to the random variables.
Independence Properties of Random Variables
Let W, X, Y, Z be pairwise disjoint sets of random variables on a given event space.
Let I(X, Z, Y) denote that X and Y are independent given Z. That is,
  P(X | Y, Z) = P(X | Z), or P(X, Y | Z) = P(X | Z) P(Y | Z)
Then:
  a. I(X, Z, Y) ⇒ I(Y, Z, X)                        (symmetry)
  b. I(X, Z, Y ∪ W) ⇒ I(X, Z, Y)                    (decomposition)
  c. I(X, Z, Y ∪ W) ⇒ I(X, Z ∪ Y, W)                (weak union)
  d. I(X, Z, Y) ∧ I(X, Z ∪ Y, W) ⇒ I(X, Z, Y ∪ W)   (contraction)
Proof: Follows from the definition of independence.
Implications of Independence
• Suppose we have 5 binary features and a binary class label
• Without independence, to specify the joint distribution we need a probability for each possible assignment of values to the variables – a table of size 2⁶ = 64
• If the features are independent given the class label, we only need 5 × (2 × 2) = 20 entries
Bayesian Networks
Smoking → Cancer
S ∈ {no, light, heavy};  C ∈ {none, benign, malignant}
P(S = no)    0.80
P(S = light) 0.15
P(S = heavy) 0.05

              Smoking =  no     light   heavy
P(C = none)              0.96   0.88    0.60
P(C = benign)            0.03   0.08    0.25
P(C = malig)             0.01   0.04    0.15
Product Rule
• P(C,S) = P(C|S) P(S)
S⇓ C⇒    none    benign   malignant
no       0.768   0.024    0.008
light    0.132   0.012    0.006
heavy    0.035   0.010    0.005
Marginalization
S⇓ C⇒    none    benign   malig   total
no       0.768   0.024    0.008   0.80
light    0.132   0.012    0.006   0.15
heavy    0.035   0.010    0.005   0.05
total    0.935   0.046    0.019

Column totals give P(Cancer); row totals give P(Smoking).
Bayes Rule Revisited
P(S | C) = P(C | S) P(S) / P(C) = P(C, S) / P(C)

S⇓ C⇒    none          benign        malig
no       0.768/0.935   0.024/0.046   0.008/0.019
light    0.132/0.935   0.012/0.046   0.006/0.019
heavy    0.035/0.935   0.010/0.046   0.005/0.019

              Cancer =  none    benign   malignant
P(S = no)               0.821   0.522    0.421
P(S = light)            0.141   0.261    0.316
P(S = heavy)            0.037   0.217    0.263
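A sketch recomputing the product rule, marginalization, and Bayes rule tables directly from the CPTs given above (computed this way, the heavy-smoker row comes out slightly different from the printed joint table):

```python
# P(S) and P(C | S) from the slides.
p_s = {"no": 0.80, "light": 0.15, "heavy": 0.05}
p_c_given_s = {
    "no":    {"none": 0.96, "benign": 0.03, "malig": 0.01},
    "light": {"none": 0.88, "benign": 0.08, "malig": 0.04},
    "heavy": {"none": 0.60, "benign": 0.25, "malig": 0.15},
}
cancers = ("none", "benign", "malig")

# Product rule: P(C, S) = P(C | S) P(S).
joint = {(s, c): p_s[s] * p_c_given_s[s][c] for s in p_s for c in cancers}

# Marginalization: P(C) = sum_s P(C, S = s).
p_c = {c: sum(joint[(s, c)] for s in p_s) for c in cancers}

# Bayes rule: P(S | C) = P(C, S) / P(C).
p_s_given_c = {(s, c): joint[(s, c)] / p_c[c] for (s, c) in joint}
print(p_s_given_c[("no", "none")])  # ~0.826 with these CPTs
```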
A Bayesian Network
(Figure: BN with edges Age → Exposure to Toxics, Age → Smoking, Gender → Smoking, Exposure to Toxics → Cancer, Smoking → Cancer, Cancer → Serum Calcium, Cancer → Lung Tumor)
Independence
Age and Gender are independent.
  P(A | G) = P(A)   A ⊥ G
  P(G | A) = P(G)   G ⊥ A
  P(A, G) = P(G | A) P(A) = P(G) P(A)
  P(A, G) = P(A | G) P(G) = P(A) P(G)
(Figure: nodes Age and Gender with no connecting edge)
Conditional Independence
Cancer is independent of Age and Gender given Smoking:
  P(C | A, G, S) = P(C | S)   C ⊥ A, G | S
(Figure: Age → Smoking ← Gender; Smoking → Cancer)
More Conditional Independence: Naïve Bayes
Serum Calcium is independent of Lung Tumor, given Cancer:
  P(L | SC, C) = P(L | C)
Serum Calcium and Lung Tumor are (marginally) dependent
(Figure: Cancer → Serum Calcium, Cancer → Lung Tumor)
Naïve Bayes in general
(Figure: H → E₁, E₂, E₃, …, Eₙ)
2n + 1 parameters:
  P(h);  P(eᵢ | h), P(eᵢ | ¬h),  i = 1, …, n
More Conditional Independence: Explaining Away
Exposure to Toxics and Smoking are (marginally) independent:  E ⊥ S
Exposure to Toxics is dependent on Smoking, given Cancer:
  P(E = heavy | C = malignant) > P(E = heavy | C = malignant, S = heavy)
(Figure: Exposure to Toxics → Cancer ← Smoking)
Put it all together
P(A, G, E, S, C, L, SC) =
  P(A) · P(G) · P(E | A) · P(S | A, G) · P(C | E, S) · P(L | C) · P(SC | C)
(Figure: the Age / Gender / Smoking / Exposure to Toxics / Cancer / Serum Calcium / Lung Tumor network)
General Product (Chain) Rule for Bayesian Networks
P(X₁, X₂, …, Xₙ) = ∏_{i=1}^{n} P(Xᵢ | Paᵢ),  where Paᵢ = parents(Xᵢ)
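A minimal sketch of this chain rule with explicit CPT tables (a toy three-variable network; structure and numbers are illustrative, not from the slides):

```python
# Network: A and G are roots; S has parents (A, G).
parents = {"A": (), "G": (), "S": ("A", "G")}

# CPTs: cpt[var][(value, parent_values)] = probability; numbers are made up.
cpt = {
    "A": {(True, ()): 0.4, (False, ()): 0.6},
    "G": {(True, ()): 0.5, (False, ()): 0.5},
    "S": {(True, (True, True)): 0.3,   (False, (True, True)): 0.7,
          (True, (True, False)): 0.2,  (False, (True, False)): 0.8,
          (True, (False, True)): 0.15, (False, (False, True)): 0.85,
          (True, (False, False)): 0.1, (False, (False, False)): 0.9},
}

def joint(assign):
    """P(assign) = prod_i P(x_i | parents(x_i))."""
    p = 1.0
    for var, pa in parents.items():
        p *= cpt[var][(assign[var], tuple(assign[q] for q in pa))]
    return p

print(joint({"A": True, "G": False, "S": True}))  # 0.4 * 0.5 * 0.2 = 0.04
```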
Bayesian Networks
• The naive assumption that the variables are independent (e.g., the naïve Bayes assumption that the variables are independent given the class) can be too restrictive
• But representing joint distributions is intractable without some independence assumptions
• Bayesian networks explicitly model conditional independence among subsets of variables, yielding a graphical representation of probability distributions that admit such independence
Bayesian network
• A Bayesian network is a directed acyclic graph (DAG) in which the nodes represent random variables
• Each node is annotated with a conditional probability distribution P(Xᵢ | Parents(Xᵢ)) representing the dependency of that node on its parents in the DAG
• Each node is asserted to be conditionally independent of its non-descendants, given its immediate predecessors (parents)
• Arcs represent direct dependencies
Bayesian Networks
Efficient factorized representation of probability distributions via conditional independence
(Figure: Earthquake → Alarm ← Burglary; Earthquake → Radio; Alarm → Call)
P(A | E, B):
  E   B    P(a)   P(¬a)
  e   b    0.9    0.1
  e   ¬b   0.2    0.8
  ¬e  b    0.9    0.1
  ¬e  ¬b   0.01   0.99
Bayesian Networks
• Qualitative part: statistical independence statements, represented in the form of a directed acyclic graph (DAG)
  • Nodes – random variables
  • Edges – direct influence
• Quantitative part: conditional probability distributions – one for each random variable, conditioned on its parents
(Figure: the Earthquake/Burglary/Alarm/Radio/Call network annotated with the P(A | E, B) table above)
Qualitative part
• Nodes are independent of non-descendants given their parents
d-separation:
• a graph-theoretic criterion for reading independence statements
• can be computed in linear time (in the number of edges)
(Figure: the Earthquake/Burglary/Alarm/Radio/Call network)
Directed graphs and joint probabilities
• Let {X₁, X₂, …, Xₙ} be a set of random variables
• Let X_{πᵢ} denote the set of parents of Xᵢ
• Associate with each vertex of the directed acyclic graph a random variable Xᵢ and a function of the form fᵢ(xᵢ, x_{πᵢ})
• Then p(x₁, …, xₙ) = ∏_{i=1}^{n} fᵢ(xᵢ, x_{πᵢ})
What independences does a Bayes net model?
• In order for a Bayesian network to model a probability distribution, the following must be true by definition: each variable is conditionally independent of all its non-descendants in the graph, given the value of all its parents.
• This implies
  P(X₁, …, Xₙ) = ∏_{i=1}^{n} P(Xᵢ | parents(Xᵢ))
  e.g., P(E, B, R, A, C) = P(E) P(B) P(R | E) P(A | E, B) P(C | A)
• But what else does it imply?
(Figure: the Earthquake/Burglary/Alarm/Radio/Call network)
What Independences does a Bayes Network model?
Example: the chain Z → Y → X
Given Y, does learning the value of Z tell us nothing new about X?
i.e., is P(X | Y, Z) equal to P(X | Y)?
Yes. Since we know the value of all of X's parents (namely, Y), and Z is not a descendant of X, X is conditionally independent of Z.
Also, since independence is symmetric, P(Z | Y, X) = P(Z | Y).
Quick proof that independence is symmetric
• Assume: P(X | Y, Z) = P(X | Y), i.e., X and Z are independent given Y
  P(Z | X, Y) = P(X, Y | Z) P(Z) / P(X, Y)                 (Bayes' rule)
              = P(X | Y, Z) P(Y | Z) P(Z) / P(X, Y)        (chain rule)
              = P(X | Y) P(Y | Z) P(Z) / P(X, Y)           (by assumption)
              = P(X | Y) P(Y | Z) P(Z) / (P(X | Y) P(Y))   (chain rule)
              = P(Y | Z) P(Z) / P(Y) = P(Z | Y)            (Bayes' rule)
What Independences does a Bayes Network model?
• Let I(X, Y, Z) represent X and Z being conditionally independent given Y.
(Figure: Y is the parent of both X and Z)
• I(X, Y, Z)? Yes, just as in the previous example: all of X's parents are given, and Z is not a descendant of X.
What Independences does a Bayes Network model?
• I(X, {U}, Z)? No.
• I(X, {U, V}, Z)? Yes.
(Figure: U and V are the parents of X; Z is a non-descendant of X connected to X only through its parents)
Things get a little more confusing
• X has no parents, so we know all its parents' values trivially
• Z is not a descendant of X
• So I(X, {}, Z), even though there is an undirected path from X to Z through an unobserved variable Y
• What if we do know the value of Y? Or one of its descendants?
(Figure: X → Y ← Z)
The Burglar Alarm example
• Your house has a twitchy burglar alarm that is also sometimes triggered by earthquakes.
• The Earth arguably doesn't care whether your house is currently being burgled.
• While you are on vacation, one of your neighbors calls and tells you your home's burglar alarm is ringing.
(Figure: Burglar → Alarm ← Earthquake; Alarm → Phone Call)
The Burglar Alarm example (contd.)
• But now suppose you learn that there was a medium-sized earthquake in your neighborhood. …Probably not a burglar after all.
• The earthquake "explains away" the hypothetical burglar.
• But then it must NOT be the case that I(Burglar, {Phone Call}, Earthquake), even though I(Burglar, {}, Earthquake)!
(Figure: Burglar → Alarm ← Earthquake; Alarm → Phone Call)
d-separation to the rescue
• Fortunately, there is a relatively simple algorithm for determining whether two variables in a Bayesian network are conditionally independent: d-separation.
d-separation
Two variables are independent if all paths between them are blocked by evidence.
Three cases:
• Common cause
• Intermediate cause
• Common effect
d-separation: common cause (diverging connection)
(Figure: E with children R and A; the path is blocked when E is instantiated, unblocked otherwise)
If we do not know whether an earthquake occurred, then the radio announcement can influence our belief about the alarm having gone off. If we know that an earthquake occurred, then the radio announcement gives no information about the alarm.
Evidence may be transmitted through a diverging connection unless the common cause is instantiated.
d-separation: intermediate cause (serial connection)
(Figure: a serial chain such as Earthquake → Alarm → Call; the path is blocked when the intermediate variable is instantiated, unblocked otherwise)
Evidence may be transmitted through a serial connection unless the intermediate node is instantiated (blocking the path).
d-separation: common effect (converging connection)
(Figure: E and B with a common child A, and A → C; the path is blocked when neither A nor any of its descendants has received evidence, unblocked otherwise)
Evidence may be transmitted through a converging connection only if either the variable or one of its descendants has received evidence.
Example
I(X, Y, Z) denotes that X and Z are independent given Y
– Surely I(R, {}, B)
– Possibly ¬I(R, {A}, B)
– Surely I(R, {E, A}, B)
– Possibly ¬I(R, {B}, C)
(Figure: E → R; E → A ← B; A → C)
d-separation
Definition: X and Z are d-separated by a set of evidence variables E iff every undirected path from X to Z is “blocked” by evidence E
d-separation
• Theorem [Verma & Pearl, 1988]: If a set of evidence variables E d-separates X and Z in a Bayesian network's graph, then I(X, E, Z).
• d-separation can be computed in linear time using a depth-first-search-like algorithm.
• We now have a fast algorithm for automatically inferring whether learning the value of one variable might give us any additional hints about some other variable, given what we already know.
• Variables may actually be independent even when they are not d-separated, depending on the actual probabilities involved.
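A sketch of checking d-separation programmatically, assuming a networkx version that provides `d_separated` (newer releases rename it `is_d_separator`); the graph is the example network above:

```python
import networkx as nx

# E -> R, E -> A <- B, A -> C (the example network above).
g = nx.DiGraph([("E", "R"), ("E", "A"), ("B", "A"), ("A", "C")])

# I(R, {}, B): the converging connection at A blocks the only path.
print(nx.d_separated(g, {"R"}, {"B"}, set()))       # True

# Observing A (or its descendant C) unblocks the converging connection.
print(nx.d_separated(g, {"R"}, {"B"}, {"A"}))       # False

# I(R, {E, A}, B): observing E blocks the diverging connection at E.
print(nx.d_separated(g, {"R"}, {"B"}, {"E", "A"}))  # True
```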
d-separation example
(Figure: a DAG on the nodes A, B, C, D, E, F, G, H, I, J)
I(C, {}, D)?
I(C, {A}, D)?
I(C, {A, B}, D)?
I(C, {A, B, J}, D)?
I(C, {A, B, E, J}, D)?
Markov Blanket
• A node is conditionally independent of all other nodes in the network given its parents, children, and children's parents – its Markov blanket
(Figure: Burglary → Alarm ← Earthquake; Alarm → JohnCalls, Alarm → MaryCalls)
Burglary is independent of JohnCalls and MaryCalls given Alarm and Earthquake
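A small sketch that reads a node's Markov blanket off a DAG with networkx (the helper name `markov_blanket` is mine):

```python
import networkx as nx

def markov_blanket(dag: nx.DiGraph, node) -> set:
    """Parents, children, and children's other parents of `node`."""
    parents = set(dag.predecessors(node))
    children = set(dag.successors(node))
    spouses = {p for c in children for p in dag.predecessors(c)} - {node}
    return parents | children | spouses

g = nx.DiGraph([("Burglary", "Alarm"), ("Earthquake", "Alarm"),
                ("Alarm", "JohnCalls"), ("Alarm", "MaryCalls")])
print(markov_blanket(g, "Burglary"))  # {'Alarm', 'Earthquake'}
```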
Bayesian Networks: Summary
• Bayesian networks offer an efficient representation of probability distributions
• Efficient:
  • local models
  • independence (d-separation)
• Effective: algorithms take advantage of structure to
  • compute posterior probabilities
  • compute the most probable instantiation
  • support decision making
Inference in Bayesian networks
• A BN compactly models the full joint distribution by taking advantage of the independences that exist between variables
• Inference tasks:
  – Diagnostic inference (from effect to cause): P(Burglary | JohnCalls = T)
  – Predictive inference (from cause to effect): P(JohnCalls | Burglary = T)
  – Other probabilistic queries (queries on joint distributions)
• Can we take advantage of independences to construct special algorithms that speed up inference?
Bayesian network inference
(Figure: Burglary → Alarm ← Earthquake; Alarm → JohnCalls, Alarm → MaryCalls)
P(B) = 0.001    P(E) = 0.002
P(A | B, E) = 0.95     P(A | B, ¬E) = 0.94
P(A | ¬B, E) = 0.29    P(A | ¬B, ¬E) = 0.001
P(J | A) = 0.90    P(J | ¬A) = 0.05
P(M | A) = 0.70    P(M | ¬A) = 0.01
Example
• A device is operating normally or malfunctioning
• A sensor indirectly monitors the operation of the device
• The sensor reading is either high or low
Diagnostic inference: example
Compute the probability that the device is operating normally given that the sensor reading is high (S).
Inference in Bayesian networks
Bad news:
– Exact inference in BNs is NP-hard (Cooper)
– Approximate inference is NP-hard (Dagum and Luby)
In practice, things are not so bad:
• Exact inference
  – inference in simple chains
  – variable elimination
  – clustering / join tree algorithms
• Approximate inference
  – stochastic simulation / sampling methods
  – Markov chain Monte Carlo methods
  – mean field theory
Computing joint probability distributions using a Bayesian network
Any entry in the joint probability distribution can be calculated from the Bayesian network:
  P(J, M, A, ¬B, ¬E) = P(J | M, A, ¬B, ¬E) P(M, A, ¬B, ¬E)
                     = P(J | A) P(M | A, ¬B, ¬E) P(A, ¬B, ¬E)
                     = P(J | A) P(M | A) P(A | ¬B, ¬E) P(¬B, ¬E)
                     = P(J | A) P(M | A) P(A | ¬B, ¬E) P(¬B) P(¬E)
(We're just using the chain rule and conditional independence.)
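A numeric check of this factorization, using the CPTs from the burglary network slide above (helper names are mine):

```python
# CPTs from the burglary network slide.
p_b, p_e = 0.001, 0.002
p_a = {(True, True): 0.95, (True, False): 0.94,   # P(A | B, E)
       (False, True): 0.29, (False, False): 0.001}
p_j = {True: 0.90, False: 0.05}  # P(J | A)
p_m = {True: 0.70, False: 0.01}  # P(M | A)

# P(J, M, A, ¬B, ¬E) = P(J|A) P(M|A) P(A|¬B,¬E) P(¬B) P(¬E)
p = p_j[True] * p_m[True] * p_a[(False, False)] * (1 - p_b) * (1 - p_e)
print(p)  # ~0.000628
```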
Computing joint probabilities
• The joint distribution can be used to answer any query about the domain
• A Bayesian network represents the joint distribution, so any query about the domain can be answered using a BN
• Tradeoff: a BN can be much more concise, but you need to calculate probabilities from the joint distribution rather than look them up in a table
General formula:
  P(X₁ = x₁, …, Xₙ = xₙ) = ∏_{i=1}^{n} P(Xᵢ = xᵢ | Parents(Xᵢ))
Inference in Bayesian networks
Blind approach:
• sum out all un-instantiated variables from the full joint
• express the joint distribution as a product of conditionals
Computational cost:
• number of additions: 15
• number of products: 16 × 4 = 64
Inference in Bayesian networks
Interleave sums and products
• Combine sums and products in a smart way (multiplicative constants can be moved outside a sum)
Computational cost:
• number of additions: 1 + 2·[1 + 1 + 2] = 9
• number of products: 2·[2 + 2·(1 + 2)] = 16
Inference in Bayesian networks
• Smart interleaving of sums and products can speed up the computation of joint probability queries
• What if we want to compute P(B = T, J = T)?
• Smart caching of results that would otherwise be recomputed can save time
Inference in Bayesian networks
• When does caching of results become handy?
• There are other queries for which results can be shared
• General technique: variable elimination
Inference in Bayesian networks
• When does caching of results become handy?
• What if we want to compute a diagnostic query?
• Exactly the probabilities we have just computed!
• There are other queries for which caching and ordering of sums and products can be shared to save computation
• General technique: variable elimination
Inference in Bayesian networks
General idea of variable elimination:
results are cached in a tree structure
Inference in Bayesian Networks
Find P(Q = q | E = e)
– Q: the query variable(s)
– E: the set of evidence variables
P(q | e) = P(q, e) / P(e)
With X₁, …, Xₙ the network variables other than Q and E:
  P(q, e) = Σ_{x₁, x₂, …, xₙ} P(q, e, X₁ = x₁, X₂ = x₂, …, Xₙ = xₙ)
Basic Inference
A → B:  P(b) = ?
  P(b) = Σₐ P(a, b) = Σₐ P(b | a) P(a)
Basic Inference
A → B → C
  P(b) = Σₐ P(a, b) = Σₐ P(b | a) P(a)
  P(c) = Σ_b P(c | b) P(b)
Equivalently, starting from the full joint:
  P(c) = Σ_{a,b} P(a, b, c) = Σ_{a,b} P(c | a, b) P(a, b)
       = Σ_{a,b} P(c | b) P(b | a) P(a)
       = Σ_b P(c | b) Σₐ P(b | a) P(a)
       = Σ_b P(c | b) P(b)
Inference in trees
(Figure: Y₁ → X ← Y₂)
  P(X) = Σ_{y₁,y₂} P(X, y₁, y₂) = Σ_{y₁,y₂} P(X | y₁, y₂) P(y₁, y₂)
       = Σ_{y₁,y₂} P(X | y₁, y₂) P(y₁) P(y₂)
Polytrees
A network is singly connected (a polytree) if it contains no undirected loops.
(Figure: a multiply connected network, which is not a polytree, and a polytree)
Inference in polytrees
• Theorem: Inference in polytrees can be performed in time that is polynomial in the number of variables.
• Main idea: in variable elimination, need only maintain distributions over single nodes.
Inference with Bayesian Networks
• Inference in polytrees can be performed efficiently
• Inference with general DAGs is NP-hard – proof by reduction of SAT to Bayesian network inference
Approaches to inference
• Exact inference
  – inference in simple chains
  – variable elimination
  – clustering / join tree algorithms
• Approximate inference
  – stochastic simulation / sampling methods
  – Markov chain Monte Carlo methods
  – mean field theory
Inference – A more complicated example
(Figure: Cloudy → Sprinkler, Cloudy → Rain, Sprinkler → WetGrass ← Rain)
  P(w) = Σ_{c,s,r} P(w | r, s) P(r | c) P(s | c) P(c)
       = Σ_{s,r} P(w | r, s) Σ_c P(r | c) P(s | c) P(c)
       = Σ_{s,r} P(w | r, s) f₁(r, s)
Because of the structure of the BN, some sub-expressions in the joint depend only on a small number of variables.
By computing them once and caching the result, we can avoid generating them exponentially many times.
Variable Elimination
• General idea:
  • Write the query in the form
    P(Xₙ, e) = Σ_{x_k} ⋯ Σ_{x₃} Σ_{x₂} ∏ᵢ P(xᵢ | paᵢ)
  • Iteratively
    – move all irrelevant terms outside of the innermost sum
    – perform the innermost sum, getting a new term
    – insert the new term into the product
Variable Elimination
• A factor over X is a function from Domain(X) to numbers in the interval [0, 1]
• A conditional probability table is a factor
• A joint distribution is a factor
• In Bayesian network inference:
  • factors are multiplied to generate new factors
  • variables in factors are summed out (marginalization)
• A variable can be summed out as soon as all the factors in which the variable appears have been multiplied
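A minimal sketch of these two factor operations, with a factor represented as a tuple of variable names plus a table mapping boolean value tuples to numbers (the representation and names are my own):

```python
from itertools import product

def factor_product(f, g):
    """Pointwise product: multiply entries that agree on shared variables."""
    fvars, ftab = f
    gvars, gtab = g
    out_vars = fvars + tuple(v for v in gvars if v not in fvars)
    table = {}
    for vals in product([False, True], repeat=len(out_vars)):
        a = dict(zip(out_vars, vals))
        table[vals] = (ftab[tuple(a[v] for v in fvars)]
                       * gtab[tuple(a[v] for v in gvars)])
    return out_vars, table

def sum_out(var, f):
    """Marginalize `var` out of the factor f."""
    fvars, ftab = f
    i = fvars.index(var)
    out_vars = fvars[:i] + fvars[i + 1:]
    table = {}
    for vals, p in ftab.items():
        key = vals[:i] + vals[i + 1:]
        table[key] = table.get(key, 0.0) + p
    return out_vars, table

# P(B) = sum_a P(B | a) P(a) on a tiny chain A -> B (made-up numbers).
pa = (("A",), {(True,): 0.6, (False,): 0.4})
pba = (("B", "A"), {(True, True): 0.9, (False, True): 0.1,
                    (True, False): 0.2, (False, False): 0.8})
print(sum_out("A", factor_product(pba, pa)))  # P(B=T) = 0.62
```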
A More Complex Example
(Figure: the Asia network – Visit to Asia → Tuberculosis; Smoking → Lung Cancer, Smoking → Bronchitis; Tuberculosis → Abnormality in Chest ← Lung Cancer; Abnormality in Chest → X-Ray; Abnormality in Chest → Dyspnea ← Bronchitis)
(Figure: the Asia network with nodes V, S, T, L, A, B, X, D)
  P(v) P(s) P(t | v) P(l | s) P(b | s) P(a | t, l) P(x | a) P(d | a, b)
• We want to compute P(d)
• Need to eliminate: v, s, x, t, l, a, b
• We want to compute P(d)
• Need to eliminate: v, s, x, t, l, a, b
• Initial factors:
  P(v) P(s) P(t | v) P(l | s) P(b | s) P(a | t, l) P(x | a) P(d | a, b)
Eliminate v. Compute f_v(t) = Σ_v P(v) P(t | v):
  ⇒ f_v(t) P(s) P(l | s) P(b | s) P(a | t, l) P(x | a) P(d | a, b)
Note: f_v(t) = P(t). In general, the result of elimination is not necessarily a probability term.
• We want to compute P(d)
• Need to eliminate: s, x, t, l, a, b
• Current factors:
  f_v(t) P(s) P(l | s) P(b | s) P(a | t, l) P(x | a) P(d | a, b)
Eliminate s. Compute f_s(b, l) = Σ_s P(s) P(b | s) P(l | s):
  ⇒ f_v(t) f_s(b, l) P(a | t, l) P(x | a) P(d | a, b)
Summing over s results in a factor with two arguments, f_s(b, l). In general, the result of elimination may be a function of several variables.
• We want to compute P(d)
• Need to eliminate: x, t, l, a, b
• Current factors:
  f_v(t) f_s(b, l) P(a | t, l) P(x | a) P(d | a, b)
Eliminate x. Compute f_x(a) = Σ_x P(x | a):
  ⇒ f_v(t) f_s(b, l) f_x(a) P(a | t, l) P(d | a, b)
Note: f_x(a) = 1 for all values of a.
• We want to compute P(d)
• Need to eliminate: t, l, a, b
• Current factors:
  f_v(t) f_s(b, l) f_x(a) P(a | t, l) P(d | a, b)
Eliminate t. Compute f_t(a, l) = Σ_t f_v(t) P(a | t, l):
  ⇒ f_s(b, l) f_x(a) f_t(a, l) P(d | a, b)
• We want to compute P(d)
• Need to eliminate: l, a, b
• Current factors:
  f_s(b, l) f_x(a) f_t(a, l) P(d | a, b)
Eliminate l. Compute f_l(a, b) = Σ_l f_s(b, l) f_t(a, l):
  ⇒ f_x(a) f_l(a, b) P(d | a, b)
• We want to compute P(d)
• Need to eliminate: a, b
• Current factors:
  f_x(a) f_l(a, b) P(d | a, b)
Eliminate a, then b. Compute
  f_a(b, d) = Σₐ f_x(a) f_l(a, b) P(d | a, b), then f_b(d) = Σ_b f_a(b, d):
  ⇒ f_a(b, d) ⇒ f_b(d) = P(d)
Basic operations
• Multiplying two factors
• Summing out a variable from a product of factors – marginalization
Example: multiplying factors (pointwise product)
• The pointwise product is NOT
  – matrix multiplication
  – element-by-element multiplication
Dealing with evidence
• How do we deal with evidence?
• Suppose we get evidence V = 1, S = 0, D = 1
• We want to compute P(L, V = 1, S = 0, D = 1)
(Figure: the Asia network)
Dealing with Evidence
• We start by writing the factors:
  P(v) P(s) P(t | v) P(l | s) P(b | s) P(a | t, l) P(x | a) P(d | a, b)
• Since we know that V = 1, we don't need to eliminate V
• Instead, we can replace the factors P(V) and P(T | V) with
  f_{P(V)} = P(V = 1)   and   f_{P(T|V)}(T) = P(T | V = 1)
• These "select" the appropriate parts of the original factors given the evidence
• Note that f_{P(V)} is a constant, and thus does not appear in the elimination of other variables
(Figure: the Asia network)
Variable Elimination
• We can now understand variable elimination as a sequence of rewriting operations
• The actual computation is done in the elimination steps
• Computation depends on the order of elimination
Dealing with Evidence
• Given evidence V = 1, S = 0, D = 1
• Compute P(L, V = 1, S = 0, D = 1)
• Initial factors, after setting evidence:
  f_{P(v)} f_{P(s)} f_{P(t|v)}(t) f_{P(l|s)}(l) f_{P(b|s)}(b) P(a | t, l) P(x | a) f_{P(d|a,b)}(a, b)
(Figure: the Asia network)
Dealing with Evidence
• Given evidence V = 1, S = 0, D = 1
• Compute P(L, V = 1, S = 0, D = 1)
• Initial factors, after setting evidence:
  f_{P(v)} f_{P(s)} f_{P(t|v)}(t) f_{P(l|s)}(l) f_{P(b|s)}(b) P(a | t, l) P(x | a) f_{P(d|a,b)}(a, b)
• Eliminating x, we get
  f_{P(v)} f_{P(s)} f_{P(t|v)}(t) f_{P(l|s)}(l) f_{P(b|s)}(b) P(a | t, l) f_x(a) f_{P(d|a,b)}(a, b)
(Figure: the Asia network)
Dealing with Evidence
• Given evidence V = 1, S = 0, D = 1
• Compute P(L, V = 1, S = 0, D = 1)
• Eliminating x, we get
  f_{P(v)} f_{P(s)} f_{P(t|v)}(t) f_{P(l|s)}(l) f_{P(b|s)}(b) P(a | t, l) f_x(a) f_{P(d|a,b)}(a, b)
• Eliminating t, we get
  f_{P(v)} f_{P(s)} f_{P(l|s)}(l) f_{P(b|s)}(b) f_t(a, l) f_x(a) f_{P(d|a,b)}(a, b)
(Figure: the Asia network)
Dealing with Evidence
• Given evidence V = 1, S = 0, D = 1
• Compute P(L, V = 1, S = 0, D = 1)
• Eliminating t, we get
  f_{P(v)} f_{P(s)} f_{P(l|s)}(l) f_{P(b|s)}(b) f_t(a, l) f_x(a) f_{P(d|a,b)}(a, b)
• Eliminating a, we get
  f_{P(v)} f_{P(s)} f_{P(l|s)}(l) f_{P(b|s)}(b) f_a(b, l)
(Figure: the Asia network)
Variable Elimination Algorithm
• Let X₁, …, Xₘ be an ordering on the non-query variables:
  Σ_{X₁} Σ_{X₂} … Σ_{Xₘ} ∏ⱼ P(Xⱼ | Parents(Xⱼ))
• For i = m, …, 1:
  • leave in the summation for Xᵢ only the factors mentioning Xᵢ
  • multiply those factors, getting a factor that contains a number for each value of the variables mentioned, including Xᵢ
  • sum out Xᵢ, getting a factor f that contains a number for each value of the variables mentioned, not including Xᵢ
  • replace the multiplied factors in the summation with f
Complexity of variable elimination
• Suppose in one elimination step we compute
  f_x(y₁, …, y_k) = Σ_x f′_x(x, y₁, …, y_k),  where
  f′_x(x, y₁, …, y_k) = ∏_{i=1}^{m} fᵢ(x, y_{i,1}, …, y_{i,lᵢ})
• This requires m · |Domain(X)| · ∏ᵢ |Domain(Yᵢ)| multiplications
  – for each value of x, y₁, …, y_k, we do m multiplications
• and |Domain(X)| · ∏ᵢ |Domain(Yᵢ)| additions
  – for each value of y₁, …, y_k, we do |Domain(X)| additions
• Complexity is (not surprisingly) exponential in the number of variables in the intermediate factor!
Understanding Variable Elimination
• We want to select "good" elimination orderings that reduce complexity
• This can be done by examining a graph-theoretic property of the "induced" graph; we will not cover this in class
• This reduces the problem of finding a good ordering to a graph-theoretic operation that is well understood – unfortunately, computing it is NP-hard!
Exercise: Variable elimination
(Figure: BN with edges Smart → Prepared, Study → Prepared, Smart → Pass, Prepared → Pass, Fair → Pass)
p(smart) = 0.8,  p(study) = 0.6,  p(fair) = 0.9
Query: What is the probability that a student is smart, given that he/she passes the exam?

P(Prepared | Smart, Study):
  Sm  St   P(Pr = T)
  T   T    0.9
  T   F    0.5
  F   T    0.7
  F   F    0.1

P(Pass | Smart, Prepared, Fair):
  Sm  Pr  Fair   P(Pa = T)
  T   T   T      0.9
  T   T   F      0.1
  T   F   T      0.7
  T   F   F      0.1
  F   T   T      0.7
  F   T   F      0.1
  F   F   T      0.2
  F   F   F      0.1
Bayesian Network Inference in polytrees – Message Passing algorithm
Decomposing the probabilities
• Suppose we want P(Xᵢ | E), where E is some set of evidence variables.
• Let's split E into two parts:
  – Eᵢ⁻ is the part consisting of assignments to variables in the subtree rooted at Xᵢ
  – Eᵢ⁺ is the rest of the variables in E
(Figure: a tree with the node Xᵢ marked)
Decomposing the probabilities
  P(Xᵢ | E) = P(Xᵢ | Eᵢ⁻, Eᵢ⁺)
            = P(Eᵢ⁻ | Xᵢ, Eᵢ⁺) P(Xᵢ | Eᵢ⁺) / P(Eᵢ⁻ | Eᵢ⁺)
            = P(Eᵢ⁻ | Xᵢ) P(Xᵢ | Eᵢ⁺) / P(Eᵢ⁻ | Eᵢ⁺)
            = α π(Xᵢ) λ(Xᵢ)
where:
• α is a constant independent of Xᵢ
• π(Xᵢ) = P(Xᵢ | Eᵢ⁺)
• λ(Xᵢ) = P(Eᵢ⁻ | Xᵢ)
Using the decomposition for inference
• We can use this decomposition to do inference as follows. First, compute λ(Xᵢ) = P(Eᵢ⁻ | Xᵢ) for all Xᵢ recursively, using the leaves of the tree as the base case.
Quick aside: “Virtual evidence”
• For theoretical simplicity, but without loss of generality, let us assume that all variables in E (the evidence set) are leaves in the tree.
• Observing Xᵢ is equivalent to observing a child Xᵢ′ of it, where P(Xᵢ′ | Xᵢ) = 1 if Xᵢ′ = Xᵢ, and 0 otherwise.
(Figure: observe Xᵢ directly, or equivalently add Xᵢ → Xᵢ′ and observe Xᵢ′)
Calculating λ(Xi) for non-leaves
• Suppose Xᵢ has one child, X_j.
• Then:
  λ(Xᵢ) = P(Eᵢ⁻ | Xᵢ) = Σ_{x_j} P(x_j, Eᵢ⁻ | Xᵢ)
        = Σ_{x_j} P(x_j | Xᵢ) P(Eᵢ⁻ | Xᵢ, x_j)
        = Σ_{x_j} P(x_j | Xᵢ) P(Eᵢ⁻ | x_j)
        = Σ_{x_j} P(x_j | Xᵢ) λ(x_j)
(Figure: Xᵢ with its single child)
Calculating λ(Xi) for non-leaves
• Now suppose Xᵢ has a set of children, C.
• Since Xᵢ d-separates each of its subtrees, the contribution of each subtree to λ(Xᵢ) is independent:
  λ(Xᵢ) = P(Eᵢ⁻ | Xᵢ) = ∏_{X_j ∈ C} λ_j(Xᵢ) = ∏_{X_j ∈ C} [ Σ_{x_j} P(x_j | Xᵢ) λ(x_j) ]
• where λ_j(Xᵢ) is the contribution to P(Eᵢ⁻ | Xᵢ) of the part of the evidence lying in the subtree rooted at Xᵢ's child X_j.
We are now λ-happy
• We have a way to recursively compute all the λ(Xi)’s, starting from the root and using the leaves as the base case.
• We can think of each node in the network as an autonomous processor that passes a little “λ message” to its parent.
(Figure: λ messages passed upward through the tree)
Computing π(Xᵢ)
(Figure: X_p is the parent of Xᵢ)
  π(Xᵢ) = P(Xᵢ | Eᵢ⁺) = Σ_{x_p} P(Xᵢ, x_p | Eᵢ⁺)
        = Σ_{x_p} P(Xᵢ | x_p, Eᵢ⁺) P(x_p | Eᵢ⁺)
        = Σ_{x_p} P(Xᵢ | x_p) P(x_p | Eᵢ⁺)
        = Σ_{x_p} P(Xᵢ | x_p) πᵢ(x_p)
• where πᵢ(X_p) is defined as P(X_p | Eᵢ⁺), the "π message" that Xᵢ receives from its parent X_p
Bayesian network inference in trees
• Thus we can compute all the π(Xᵢ)'s and, in turn, all the P(Xᵢ | E)'s.
• We can think of the nodes as autonomous processors passing λ and π messages to their neighbors.
(Figure: λ messages flowing up the tree, π messages flowing down)
Conjunctive queries
• What if we want, e.g., P(A, B | C) instead of just the marginal distributions P(A | C) and P(B | C)?
• Just use the chain rule:
  – P(A, B | C) = P(A | C) P(B | A, C)
  – Each of the latter probabilities can be computed using the technique just discussed.
Polytrees
• The previous technique can be generalized to polytrees: the undirected version of the graph is still a tree, but nodes can have more than one parent
Dealing with cycles
• Can deal with undirected cycles in a graph by
  • clustering variables together
  • conditioning
(Figure: the network A → B, A → C, B → D, C → D; conditioning instantiates A (set to 0, set to 1); clustering merges B and C into a compound node BC)
Dealing with cycles
• Can deal with undirected cycles in a graph by clustering variables together
(Figure: A → B, A → C, B → D, C → D becomes the chain A → BC → D)
Join trees or junction trees
An arbitrary Bayesian network can be transformed, via a graph-theoretic trick, into a join tree (a structure also used in databases) in which a similar message-passing method can be employed.
(Figure: a network on A, B, C, D, E, F, G clustered into a join tree)
In the worst case, the join tree nodes must take on values whose number grows exponentially with the number of nodes clustered together, but this often works well in practice when the number of nodes per cluster is small.
Junction Tree
• Why junction tree?
– Variable elimination is inefficient if the undirected graph underlying the Bayesian network contains cycles
– We can avoid cycles if we turn highly interconnected subsets of the nodes into “supernodes” (clusters)
• Objective
– Compute P(V = v | E = e), where v is a value of a variable V and e is the evidence for a set of variables E
Potentials
• Potentials:
– A potential φ_X maps each instantiation of a set of variables X to a nonnegative real number: φ_X : X → R⁺ ∪ {0}
• Marginalization
– φ_Y = Σ_{X\Y} φ_X, the marginalization of φ_X into Y, where Y ⊆ X
• Multiplication
– φ_Z = φ_X φ_Y, the multiplication of φ_X and φ_Y, where Z = X ∪ Y
Properties of Junction Tree
• An undirected tree
• Each node is a cluster (nonempty set) of variables
• Running intersection property:
– Given two clusters X and Y, all clusters on the path between X and Y contain X ∩ Y
• Separator sets (sepsets):
– The intersection of adjacent clusters

[Figure: clusters ABD, ADE, DEF in a chain, with sepset AD between ABD and ADE and sepset DE between ADE and DEF]
Properties of Junction Tree
• Belief potentials:
– Map each instantiation of clusters or sepsets into a real number
• Constraints:
– Consistency: for each cluster X and neighboring sepset S,  Σ_{X\S} φ_X = φ_S
– The joint distribution:  P(U) = Π_i φ_{Xi} / Π_j φ_{Sj}
Properties of Junction Tree
• If a junction tree satisfies these properties, it follows that:
– For each cluster (or sepset) X,  φ_X = P(X)
– The probability distribution of any variable V can be computed using any cluster (or sepset) X that contains V:  P(V) = Σ_{X\{V}} φ_X
Building Junction Trees
DAG → Moral Graph → Triangulated Graph → Identifying Cliques → Junction Tree
Constructing the Moral Graph
• Add undirected edges between all co-parents that are not currently joined (“marrying parents”)
• Drop the directions of the arcs

[Figure: the example network over nodes A–H with its co-parents joined and arc directions dropped]
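A short sketch of the two moralization steps; the DAG’s edge list below is invented, since the slide’s arrows are not recoverable from the figure:

```python
import itertools
import networkx as nx

def moralize(dag: nx.DiGraph) -> nx.Graph:
    """Moral graph: marry co-parents, then drop edge directions."""
    moral = dag.to_undirected()
    for node in dag.nodes:
        # Join every pair of parents of `node` not already joined.
        for u, v in itertools.combinations(dag.predecessors(node), 2):
            moral.add_edge(u, v)
    return moral

# Hypothetical edges loosely following the slide's A..H network.
dag = nx.DiGraph([("A","C"), ("B","C"), ("C","E"), ("D","E"),
                  ("E","G"), ("F","G"), ("G","H")])
print(sorted(moralize(dag).edges))
```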
Triangulating
• An undirected graph is triangulated iff every cycle of length > 3 contains an edge that connects two nonadjacent nodes of the cycle

[Figure: the moral graph over nodes A–H with fill-in edges added so that it is triangulated]
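One common way to triangulate is to simulate variable elimination and record the fill-in edges. A sketch (the elimination order here is arbitrary; in practice a good order matters a great deal):

```python
import networkx as nx

def triangulate(graph: nx.Graph, order) -> nx.Graph:
    """Triangulate by eliminating nodes in `order`, adding fill-in edges
    between each eliminated node's remaining neighbors."""
    work = graph.copy()     # graph we eliminate from
    filled = graph.copy()   # graph we return, with fill-ins added
    for node in order:
        neighbors = list(work.neighbors(node))
        for i, u in enumerate(neighbors):
            for v in neighbors[i + 1:]:
                work.add_edge(u, v)     # fill-in edge
                filled.add_edge(u, v)
        work.remove_node(node)
    return filled

# e.g. triangulated = triangulate(moralize(dag), order=list(dag.nodes))
```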
Identifying Cliques
• A clique is a subgraph of an undirected graph that is complete (has an edge between each pair of vertices) and maximal

[Figure: the triangulated graph over A–H with its maximal cliques ABD, ADE, ACE, DEF, CEG, EGH highlighted]
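networkx can enumerate the maximal cliques of the triangulated graph directly. The edge list below is illustrative, chosen so that the maximal cliques match the slide’s list:

```python
import networkx as nx

g = nx.Graph([("A","B"), ("A","D"), ("B","D"), ("A","E"), ("D","E"),
              ("A","C"), ("C","E"), ("D","F"), ("E","F"),
              ("C","G"), ("E","G"), ("E","H"), ("G","H")])
print([sorted(c) for c in nx.find_cliques(g)])
# Expected maximal cliques: ABD, ADE, ACE, DEF, CEG, EGH
```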
Junction Tree
• A junction tree is a subgraph of the clique graph that
– is a tree
– contains all the cliques
– satisfies the junction tree property
• Junction tree property: for each pair U, V of cliques with intersection S, all cliques on the path between U and V contain S.

[Figure: the clique graph over {ABD, ADE, ACE, DEF, CEG, EGH} and a junction tree for it: ABD –AD– ADE –AE– ACE –CE– CEG –EG– EGH, with DEF attached to ADE through sepset DE]
Inference
• Choose a root
• For each distribution (CPT) in the original Bayesian network, put the distribution into one of the clique nodes that contains all the variables referenced by the CPT (at least one such node must exist because of the moralization step)
• For each clique node, take the product of the distributions (as in variable elimination)
Example: Create Join Tree
[Figure: BN with X1 → X2, Y1 a child of X1, and Y2 a child of X2]

Junction tree: (X1,Y1) –[X1]– (X1,X2) –[X2]– (X2,Y2)
Example: Initialization
Variable | Associated cluster | Potential function
X1       | X1,Y1              | φ_{X1,Y1} = P(X1)
Y1       | X1,Y1              | φ_{X1,Y1} = P(X1) P(Y1|X1)
X2       | X1,X2              | φ_{X1,X2} = P(X2|X1)
Y2       | X2,Y2              | φ_{X2,Y2} = P(Y2|X2)

Junction tree: (X1,Y1) –[X1]– (X1,X2) –[X2]– (X2,Y2), with the sepset potentials φ_{X1} and φ_{X2} initialized to 1.
Example: Collect Evidence
• Choose an arbitrary clique, e.g. (X1,X2), where all potential functions will be collected.
• Recursively call the neighboring cliques for messages:
• 1. Call (X1,Y1):
– 1. Projection:  φ_{X1} = Σ_{{X1,Y1}\{X1}} φ_{X1,Y1} = Σ_{Y1} P(X1,Y1) = P(X1)
– 2. Absorption:  φ_{X1,X2} ← φ_{X1,X2} · φ_{X1}/φ_{X1}^old = P(X2|X1) P(X1) = P(X1,X2)
Example: Collect Evidence (cont.)
• 2. Call (X2,Y2):
– 1. Projection:  φ_{X2} = Σ_{{X2,Y2}\{X2}} φ_{X2,Y2} = Σ_{Y2} P(Y2|X2) = 1
– 2. Absorption:  φ_{X1,X2} ← φ_{X1,X2} · φ_{X2}/φ_{X2}^old = P(X1,X2)
Example: Distribute Evidence
• Pass messages recursively to the neighboring nodes
• Pass message from (X1,X2) to (X1,Y1):
– 1. Projection:  φ_{X1} = Σ_{{X1,X2}\{X1}} φ_{X1,X2} = Σ_{X2} P(X1,X2) = P(X1)
– 2. Absorption:  φ_{X1,Y1} ← φ_{X1,Y1} · φ_{X1}/φ_{X1}^old = P(X1,Y1) P(X1)/P(X1) = P(X1,Y1)
Example: Distribute Evidence (cont.)
• Pass message from (X1,X2) to (X2,Y2):
– 1. Projection:  φ_{X2} = Σ_{{X1,X2}\{X2}} φ_{X1,X2} = Σ_{X1} P(X1,X2) = P(X2)
– 2. Absorption:  φ_{X2,Y2} ← φ_{X2,Y2} · φ_{X2}/φ_{X2}^old = P(Y2|X2) P(X2)/1 = P(Y2,X2)
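Putting the whole worked example together numerically, here is a sketch with invented CPT values (binary variables, numpy tables); it reproduces the projection/absorption steps above:

```python
import numpy as np

# Made-up CPTs for the chain X1 -> X2 with observations Y1, Y2 unobserved.
P_X1 = np.array([0.6, 0.4])
P_X2_given_X1 = np.array([[0.7, 0.3], [0.2, 0.8]])   # rows: x1
P_Y1_given_X1 = np.array([[0.9, 0.1], [0.4, 0.6]])
P_Y2_given_X2 = np.array([[0.8, 0.2], [0.5, 0.5]])

# Initialization: cluster and sepset potentials.
phi_X1Y1 = P_X1[:, None] * P_Y1_given_X1   # phi(X1,Y1) = P(X1) P(Y1|X1)
phi_X1X2 = P_X2_given_X1.copy()            # phi(X1,X2) = P(X2|X1)
phi_X2Y2 = P_Y2_given_X2.copy()            # phi(X2,Y2) = P(Y2|X2)
sep_X1, sep_X2 = np.ones(2), np.ones(2)

# Collect evidence into (X1,X2): project, then absorb (divide by old sepset).
new_sep = phi_X1Y1.sum(axis=1)                        # = P(X1)
phi_X1X2 *= (new_sep / sep_X1)[:, None]; sep_X1 = new_sep
new_sep = phi_X2Y2.sum(axis=1)                        # = 1
phi_X1X2 *= (new_sep / sep_X2)[None, :]; sep_X2 = new_sep

# Distribute evidence from (X1,X2) back out.
new_sep = phi_X1X2.sum(axis=1)                        # = P(X1)
phi_X1Y1 *= (new_sep / sep_X1)[:, None]; sep_X1 = new_sep
new_sep = phi_X1X2.sum(axis=0)                        # = P(X2)
phi_X2Y2 *= (new_sep / sep_X2)[:, None]; sep_X2 = new_sep

print(phi_X1X2)   # now P(X1,X2)
print(phi_X2Y2)   # now P(X2,Y2); its rows sum to the marginal P(X2)
```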
Approximate Inference
• With large and highly connected graphical models, the cliques of the junction tree (or the intermediate factors in the variable elimination algorithm) grow in size, causing an exponential blowup in the number of computations performed
Inference in Bayesian network
Exact inference algorithms:
• Variable elimination
• Symbolic inference (D’Ambrosio)
• Message passing algorithm (Pearl)
• Clustering and join tree approach (Lauritzen, Spiegelhalter)
Approximate inference algorithms:
• Monte Carlo methods: forward sampling, likelihood sampling
• Variational methods
Stochastic simulation
• Suppose you are given values for some subset of the variables, G, and want to infer values for unknown variables, U
• Randomly generate a very large number of instantiations from the BN
– Generate instantiations for all variables: start at the root variables and work your way “forward”
• Only keep those instantiations that are consistent with the values for G
• Use the frequency of values for U to get estimated probabilities
• Accuracy of the results depends on the size of the sample (asymptotically approaches the exact results)
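A minimal sketch of this procedure (forward sampling plus rejection). The cpts dictionary format and all names are illustrative, not a library API; variables are assumed binary:

```python
import random

def forward_sample(cpts, order):
    """Draw one instantiation of every variable, parents before children.

    `cpts`: var -> (parents, table), where table maps a tuple of parent
    values to P(var = 1 | parents).
    """
    sample = {}
    for var in order:                        # topological order: roots first
        parents, table = cpts[var]
        p_true = table[tuple(sample[p] for p in parents)]
        sample[var] = 1 if random.random() < p_true else 0
    return sample

def rejection_estimate(cpts, order, query, given, n=100_000):
    """Estimate P(query=1 | given) by keeping only the samples consistent
    with `given` (a dict var -> value) and counting the query among them."""
    kept = hits = 0
    for _ in range(n):
        s = forward_sample(cpts, order)
        if all(s[v] == val for v, val in given.items()):
            kept += 1
            hits += s[query]
    return hits / kept if kept else float("nan")
```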
Stochastic Simulation
[Figure: BN with Cloudy → Sprinkler, Cloudy → Rain, and Sprinkler, Rain → WetGrass]

Query: P(WetGrass | Cloudy)?
1. Draw N samples from the BN by repeating 1.1 and 1.2:
 1.1. Guess Cloudy at random according to P(Cloudy)
 1.2. For each guess of Cloudy, guess Sprinkler and Rain, then WetGrass
2. Compute the ratio of the number of runs where WetGrass and Cloudy are true to the number of runs where Cloudy is true

P(WetGrass | Cloudy) = P(WetGrass, Cloudy) / P(Cloudy)
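Continuing the sampler sketched above on this network; all CPT numbers below are invented for illustration only:

```python
# Reuses forward_sample / rejection_estimate from the previous sketch.
cpts = {
    "Cloudy":    ((), {(): 0.5}),
    "Sprinkler": (("Cloudy",), {(0,): 0.5, (1,): 0.1}),
    "Rain":      (("Cloudy",), {(0,): 0.2, (1,): 0.8}),
    "WetGrass":  (("Sprinkler", "Rain"),
                  {(0,0): 0.0, (0,1): 0.9, (1,0): 0.9, (1,1): 0.99}),
}
order = ["Cloudy", "Sprinkler", "Rain", "WetGrass"]
print(rejection_estimate(cpts, order, query="WetGrass", given={"Cloudy": 1}))
```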
Stochastic simulation
• The probability is approximated using sample frequencies
• BN sampling:
– Generate samples in a top-down manner, following the links in the BN
– A sample is an assignment of values to all variables
BN Sampling Example

[Figure sequence: a step-by-step trace of drawing one sample from the network, assigning each variable in turn in topological order]
Rejection Sampling
Rejection sampling:
• Generate samples from the full joint by sampling the BN
• Use only the samples that agree with the condition; the remaining samples are rejected
• Problem: many samples can be rejected
Likelihood weighting
• Avoids the inefficiency of rejection sampling
• Idea: generate only samples consistent with the evidence (or conditioning event)
• If a variable’s value is set by the evidence, it is not sampled
• Problem: using simple counts is not enough, since the samples may occur with different probabilities
• Likelihood weighting: with every sample, keep a weight with which it should count towards the estimate
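A sketch of likelihood weighting for binary variables, reusing the cpts format of the earlier sampler (again an illustration, not a library API): evidence variables are clamped rather than sampled, and each sample is weighted by the likelihood of the evidence it was forced to match.

```python
import random

def likelihood_weighting(cpts, order, query, evidence, n=100_000):
    """Estimate P(query=1 | evidence) with weighted samples."""
    total = hit = 0.0
    for _ in range(n):
        sample, weight = {}, 1.0
        for var in order:                 # topological order: roots first
            parents, table = cpts[var]
            p_true = table[tuple(sample[p] for p in parents)]
            if var in evidence:
                sample[var] = evidence[var]            # clamp, don't sample
                weight *= p_true if evidence[var] == 1 else 1.0 - p_true
            else:
                sample[var] = 1 if random.random() < p_true else 0
        total += weight
        hit += weight * sample[query]
    return hit / total

# e.g. likelihood_weighting(cpts, order, "Rain", {"WetGrass": 1})
```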
Likelihood weighting Example

[Figure sequence: a step-by-step trace of likelihood weighting on the example network, clamping the evidence variables and accumulating the sample weight]
Likelihood Sampling
Likelihood Weighting