Probabilistic Reasoning Systems
Chapter 15
• Capturing uncertain knowledge
• Probabilistic inference
CS 471/598 by H. Liu 2
Knowledge representation
The joint probability distribution:
• can answer any question about the domain
• can become intractably large as the number of random variables grows
• can be difficult to specify, since probabilities are needed for atomic events
Conditional independence can simplify the probability assignments.
A data structure - a belief network - that represents the dependencies between variables and gives a concise specification of the joint distribution.
A belief network is a graph in which:
• a set of nodes represents random variables
• a set of directed links connects pairs of nodes
• each node has a conditional probability table (CPT) that quantifies the effects the parents have on the node
• the graph has no directed cycles (it is a DAG)
It is usually much easier for an expert to decide on conditional dependence relationships than to specify the probabilities themselves.
Once the network is specified, we need only specify conditional probabilities for the nodes that participate in direct dependencies, and use those to compute any other probabilities.
An example: the burglary-alarm-call network (Fig 15.1)
The topology of the network can be thought of as the general structure of the causal process.
Many details (Mary listening to loud music, or the phone ringing and confusing John) are summarized in the uncertainty associated with the links from Alarm to JohnCalls and MaryCalls.
The probabilities actually summarize a potentially infinite set of possible circumstances
Specifying the CPT for each node (p. 438):
• A conditioning case is a possible combination of values for the parent nodes (2^n cases for n Boolean parents)
• Each row in a CPT must sum to 1
• A node with no parents has only one row (the prior probabilities)
Fig 15.2 shows the complete network for the burglary example.
The semantics of belief networks
Two equivalent views of a belief network:
• Representing the joint probability distribution (JPD) - helpful in constructing networks
• Representing conditional independence relations - helpful in designing inference procedures
1. Representing JPD - constructing a BN
A belief network provides a complete description of the domain. Every entry in the JPD can be calculated from the info in the network.
A generic entry in the joint is the probability of a conjunction of particular assignments to each variable.
P(x1,…,xn) = ∏_{i=1}^{n} P(xi | Parents(xi))   (15.1)
What is the probability of the event J ∧ M ∧ A ∧ ¬B ∧ ¬E?
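The question above can be answered directly from Eq. 15.1: multiply one CPT entry per variable. A minimal sketch, assuming the standard CPT values from the textbook's burglary network (Fig 15.2):

```python
# Joint entry P(J, M, A, ~B, ~E) via Eq. 15.1, using the CPT values
# of the textbook's burglary network (Fig 15.2).
P_B = 0.001          # P(Burglary)
P_E = 0.002          # P(Earthquake)
P_A_nB_nE = 0.001    # P(Alarm | ~Burglary, ~Earthquake)
P_J_A = 0.90         # P(JohnCalls | Alarm)
P_M_A = 0.70         # P(MaryCalls | Alarm)

# One factor per variable, each conditioned only on its parents:
p = P_J_A * P_M_A * P_A_nB_nE * (1 - P_B) * (1 - P_E)
print(round(p, 8))   # ~0.00062811
```

Note that the factorization lets us read each factor straight out of a CPT; no entry of the full joint table ever has to be stored.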
A method for constructing belief networks
Eq. 15.1 defines what a given BN means, and it implies certain conditional independence relationships that can be used to guide the construction of the network.
By the chain rule, P(x1,…,xn) = P(xn | xn-1,…,x1) P(xn-1,…,x1) = ∏_{i=1}^{n} P(xi | xi-1,…,x1)
This agrees with Eq. 15.1 provided that
P(Xi | Xi-1,…,X1) = P(Xi | Parents(Xi))   (15.2)
The BN is a correct representation of the domain only if each node is conditionally independent of its predecessors in the node ordering, given its parents.
P(M|J,A,E,B)=P(M|A)
Incremental network construction
• Choose relevant variables describing the domain
• Choose an ordering for the variables
• While there are variables left:
  • Pick a variable and add a node for it to the network
  • Set its parents to some minimal set of nodes already in the net such that Eq. 15.2 is satisfied
  • Define the CPT for the variable
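The loop above can be sketched in code. This is a minimal sketch that records only the structure (each node's parents), using the burglary example's variables in a causal ordering; the minimal parent sets shown are the ones that satisfy Eq. 15.2 for that ordering:

```python
# Incremental construction: add nodes in order, each with a minimal
# parent set drawn from nodes already in the network (Eq. 15.2).
order = ["Burglary", "Earthquake", "Alarm", "JohnCalls", "MaryCalls"]
minimal_parents = {
    "Burglary": [], "Earthquake": [],          # root causes: no parents
    "Alarm": ["Burglary", "Earthquake"],
    "JohnCalls": ["Alarm"], "MaryCalls": ["Alarm"],
}

network = {}                                   # node -> list of parents
for var in order:
    # Parents must already be in the net, so no directed cycle can form.
    assert all(p in network for p in minimal_parents[var])
    network[var] = minimal_parents[var]
print(network["Alarm"])                        # ['Burglary', 'Earthquake']
```

Because every parent precedes its child in the ordering, the resulting graph is guaranteed to be a DAG.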
Compactness
A belief network can often be far more compact than the full joint.
In a locally structured system, each sub-component interacts directly with only a bounded number of other components.
Local structure is usually associated with linear rather than exponential growth in complexity.
With 20 nodes, if a node is directly influenced by 5 nodes, what’s the difference between BN & joint?
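The arithmetic behind the question above: with n = 20 Boolean nodes, each with at most k = 5 parents, the network needs n·2^k conditional probabilities, versus 2^n − 1 independent entries for the full joint:

```python
n, k = 20, 5                    # 20 Boolean nodes, at most 5 parents each
bn_numbers = n * 2 ** k         # one CPT row per parent-value combination
joint_numbers = 2 ** n - 1      # full joint (minus 1: entries sum to 1)
print(bn_numbers, joint_numbers)   # 640 1048575
```

So the belief network needs 640 numbers where the full joint needs over a million: linear in n (for bounded k) rather than exponential.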
Node ordering
The correct order to add nodes is to add the “root causes” first, then the variables they influence, and so on until we reach the leaves that have no direct causal influence on the other variables.
What happens if we happen to choose the wrong order? Fig 15.3 shows an example.
If we stick to a true causal model, we end up having to specify fewer numbers, and the numbers will often be easier to come up with.
Representation of CPTs
Given canonical distributions, the complete table can be specified by naming the distribution with some parameters.
A deterministic node has its value specified exactly by the values of its parents.
Uncertain relationships can often be characterized by “noisy” logical relationships.
An example on page 444 (the noisy-OR relation): the probability that the output node is False is just the product of the noise parameters for all the input nodes that are True.
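A minimal sketch of that noisy-OR rule, assuming the textbook's Fever example on p. 444 with noise parameters 0.6, 0.2, and 0.1 for Cold, Flu, and Malaria:

```python
from math import prod

def noisy_or_false(noise, true_parents):
    """P(effect = False) = product of the noise parameters of the
    parents that are True (noisy-OR)."""
    return prod(noise[p] for p in true_parents)

# Noise parameters from the textbook's Fever example (p. 444).
noise = {"Cold": 0.6, "Flu": 0.2, "Malaria": 0.1}

p_no_fever = noisy_or_false(noise, ["Cold", "Flu"])   # Malaria is False
print(round(p_no_fever, 2))   # 0.12, so P(Fever | Cold, Flu, ~Malaria) = 0.88
```

This is the payoff of a canonical distribution: the whole 2^3-row CPT for Fever is determined by just three parameters.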
2. Conditional independence relations
In designing inference algorithms, we need to know whether more general conditional independences hold.
Given a network, can we know if a set of nodes X is independent of another set Y, given a set of evidence nodes E? It boils down to d-separation.
If every undirected path from a node in X to a node in Y is d-separated by E, then X and Y are conditionally independent given E.
E d-separates X and Y if every undirected path from a node in X to a node in Y is blocked given E.
Three conditions make it possible for a path to be blocked given E (Fig 15.4).
Fig 15.5 shows examples of the three conditions.
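The three blocking conditions can be sketched as a small predicate. This is an illustrative sketch, not the book's algorithm: for a node Z on an undirected path, the path is blocked at Z depending on whether the connection is a chain (X→Z→Y), a fork (X←Z→Y), or a collider (X→Z←Y); the caller supplies Z's descendants for the collider case:

```python
def blocked(kind, z, evidence, descendants_of_z=frozenset()):
    """Is a path blocked at node z given evidence set E?
    kind: 'chain' (X->Z->Y), 'fork' (X<-Z->Y), or 'collider' (X->Z<-Y)."""
    if kind in ("chain", "fork"):
        return z in evidence                  # blocked iff Z is observed
    if kind == "collider":
        # blocked iff neither Z nor any of its descendants is observed
        return z not in evidence and evidence.isdisjoint(descendants_of_z)
    raise ValueError(kind)

# Burglary network: Burglary -> Alarm <- Earthquake is a collider at Alarm.
print(blocked("collider", "Alarm", {"JohnCalls"},
              {"JohnCalls", "MaryCalls"}))
# -> False: observing JohnCalls (a descendant of Alarm) unblocks the path,
# so Burglary and Earthquake become dependent (intercausal reasoning).
```

X and Y are d-separated by E only if every undirected path between them is blocked at some node in this sense.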
Inference in belief networks
The basic task is to compute P(Query | Evidence).
The nature of probabilistic inferences (Fig 15.6):
• Diagnostic (from effects to causes): P(Burglary | JohnCalls)
• Causal (from causes to effects): P(JohnCalls | Burglary)
• Intercausal (between causes of a common effect): P(Burglary | Alarm ∧ Earthquake) vs. P(Burglary | Alarm)
• Mixed (combining two or more of the above): P(Alarm | JohnCalls ∧ ¬Earthquake), P(Burglary | JohnCalls ∧ ¬Earthquake)
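Any of these queries can be answered by brute-force enumeration over Eq. 15.1: sum the joint over the hidden variables and normalize. A sketch assuming the standard CPT values of the burglary network (Fig 15.2), computing the diagnostic query P(Burglary | JohnCalls ∧ MaryCalls):

```python
from itertools import product

def p_joint(b, e, a, j, m):
    """One entry of the joint via Eq. 15.1 (textbook CPT values, Fig 15.2)."""
    pb = 0.001 if b else 0.999
    pe = 0.002 if e else 0.998
    pa_true = {(True, True): 0.95, (True, False): 0.94,
               (False, True): 0.29, (False, False): 0.001}[(b, e)]
    pa = pa_true if a else 1 - pa_true
    pj = (0.90 if a else 0.05) if j else (0.10 if a else 0.95)
    pm = (0.70 if a else 0.01) if m else (0.30 if a else 0.99)
    return pb * pe * pa * pj * pm

def query_burglary(j, m):
    """P(Burglary | JohnCalls=j, MaryCalls=m) by enumeration."""
    num = {True: 0.0, False: 0.0}
    for b, e, a in product([True, False], repeat=3):  # sum out hidden vars
        num[b] += p_joint(b, e, a, j, m)
    return num[True] / (num[True] + num[False])       # normalize

print(round(query_burglary(True, True), 3))  # ~0.284
```

Enumeration is exponential in the number of hidden variables, which is why the polytree algorithms below matter; here it just makes the semantics concrete.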
Answering queries
Singly connected networks - polytrees (Fig 15.6)
Causal and evidential support (Fig 15.7)
The general strategy:
• express P(X|E) in terms of E+ and E-
• compute the contribution of E+
• compute the contribution of E-
Two basic functions (Fig 15.8, p. 452):
• support-except(X,V)
• evidence-except(X,V)
Inference in multiply connected belief networks
A multiply connected graph is one in which two nodes are connected by more than one path.
An example: Fig 15.9
Three basic classes of algorithms for evaluating multiply connected networks:
• Clustering (Fig 15.10)
• Conditioning (Fig 15.11)
• Stochastic simulation - logic sampling, or likelihood weighting
Knowledge engineering for uncertain reasoning
• Decide what to talk about
• Decide on a vocabulary of random variables
• Encode general knowledge about the dependencies
• Encode a description of the specific problem instance
• Pose queries to the inference procedure and get answers
Case study
The PATHFINDER system (p. 457):
• PATHFINDER I - pure logical reasoning
• PATHFINDER II - certainty factors, Dempster-Shafer theory, a simplified Bayesian model (independence assumption)
• PATHFINDER III - the simplified Bayesian model, paying attention to low-probability events
• PATHFINDER IV - a belief network to represent the dependencies that couldn't be handled in the simplified Bayesian model
Other approaches to uncertain reasoning
Different generations of expert systems:
• Strict logical reasoning (ignores uncertainty)
• Probabilistic techniques using the full joint distribution
• Default reasoning - a conclusion is believed until a better reason is found to believe something else
• Rules with certainty factors
• Handling ignorance - Dempster-Shafer theory
• Vagueness - something is sort of true (fuzzy logic)
Probability makes the same ontological commitment as logic: the event is true or false
Default reasoning
The conclusion that a car has four wheels is reached by default.
New evidence can cause the conclusion to be retracted - unlike FOL, which is strictly monotonic.
Representative approaches are default logic, nonmonotonic logic, and circumscription.
There are problematic issues (see pp. 459-460).
Rule-based methods
Logical reasoning systems have properties such as:
• Monotonicity
• Locality
• Detachment
• Truth-functionality
These properties give obvious computational advantages, but they are inappropriate for uncertain reasoning.
Summary
Reasoning properly:
• In FOL, it means that conclusions follow from premises
• In probability, it means having beliefs that allow an agent to act rationally
Conditional independence information is vital.
A belief network is a complete representation of the JPD, but often exponentially smaller in size.
Belief networks can reason causally, diagnostically, intercausally, or by combining two or more of the three.
For polytrees, the computation time is linear in the network size.