Uncertainty
• Agents almost never have access to the whole truth about their environment.
• Eg: the wumpus world: the agent cannot be certain which square contains a pit, so it faces uncertainty whichever square it enters.
Handling Uncertain Knowledge
• Diagnosis always involves uncertainty.
• Eg: Dental diagnosis with first-order logic. The rule
Toothache ⇒ Cavity
is wrong: not all patients with toothaches have cavities. It can be changed to:
Toothache ⇒ Cavity ∨ GumProblem ∨ Abscess ∨ ...
The causal rule for this:
Cavity ⇒ Toothache
is also wrong: not all cavities cause pain.
• Why FOL fails for a domain like medical diagnosis:
Laziness: Too much work to list the complete set of antecedents & consequents needed to ensure an exceptionless rule.
Theoretical ignorance: No complete theory exists for the domain.
Practical ignorance: Even if all the rules are known, we may be uncertain about a particular patient because not all tests can be run.
Degree of Belief
• The agent's knowledge can at best provide a degree of belief in the relevant sentences.
• The main tool for dealing with degrees of belief is probability theory.
• Probability theory assigns each belief a value between 0 & 1.
• Probability provides a way of summarizing the uncertainty that comes from our laziness & ignorance.
• Beliefs depend on the percepts; the agent updates its probabilities after receiving new percepts.
• Two types:
1. Unconditional probability: prior probability, before the evidence is obtained.
2. Conditional probability: posterior probability, after the evidence is obtained.
Rational Decisions
• An agent has preferences between different possible outcomes.
• Utility theory is used to represent & reason with preferences (utility: the quality of being useful).
Decision theory = probability theory + utility theory
• The agent chooses the action that yields the highest expected utility (Maximum Expected Utility, MEU), as in the sketch below.
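A minimal sketch of the MEU rule; the actions, outcome probabilities, and utilities here are invented purely for illustration:

```python
# Pick the action with the highest expected utility.
actions = {
    "treat": [(0.9, 100), (0.1, -40)],  # list of (P(outcome | action), utility)
    "wait":  [(1.0, 0)],
}

def expected_utility(outcomes):
    return sum(p * u for p, u in outcomes)

best = max(actions, key=lambda a: expected_utility(actions[a]))
print(best, expected_utility(actions[best]))  # treat 86.0
```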
Basic Probability
• Degrees of belief are always applied to propositions.
• The basic element is the random variable.
• Types:
1. Boolean random variable: Eg: Cavity has the domain <true, false>.
2. Discrete random variable: Eg: Weather has the domain <sunny, rainy, cloudy, snow>.
3. Continuous random variable: takes on values from the real numbers.
Eg: X = 3.75
Atomic Events
• An atomic event is a complete specification of the state of the world about which the agent is uncertain.
• Eg: Cavity & Toothache have four distinct atomic events:
Cavity = false ∧ Toothache = true
Cavity = true ∧ Toothache = true
Cavity = false ∧ Toothache = false
Cavity = true ∧ Toothache = false
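A tiny sketch that enumerates these atomic events programmatically:

```python
# Enumerate all complete assignments to Cavity and Toothache.
from itertools import product

for cavity, toothache in product([True, False], repeat=2):
    print(f"Cavity = {cavity}, Toothache = {toothache}")
```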
Prior Probability
• The degree of belief in a proposition in the absence of any other information (unconditional).
• Written as P(a). Eg: if the prior probability of a cavity is 0.1, then P(Cavity = true) = 0.1, or P(cavity) = 0.1.
• P(Weather) denotes a vector of values.
P(Weather = sunny) = 0.7, P(Weather = rain) = 0.2, P(Weather = cloudy) = 0.08, P(Weather = snow) = 0.02
• Can be written as P(Weather) = <0.7, 0.2, 0.08, 0.02>
• This is called the prior probability distribution for Weather.
Probability Distributions
• Joint probability distribution: P(Weather, Cavity) is represented by a 4 × 2 table of probabilities.
• Full joint probability distribution: P(Cavity, Toothache, Weather) is represented by a 2 × 2 × 4 table with 16 entries.
• Probability distributions for continuous variables are called probability density functions.
Eg: P(X = 5.67)
Posterior Probability (Conditional)
• P(a|b): "the probability of a, given that all we know is b"
• Eg: P(cavity | toothache) = 0.8
• Conditional probabilities are defined in terms of unconditional ones: P(a|b) = P(a ∧ b) / P(b)
• Can be written as P(a ∧ b) = P(a|b) P(b) [Product rule]
Also P(a ∧ b) = P(b|a) P(a)
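A tiny numeric check of the product rule; the value P(toothache) = 0.2 is an assumed illustration:

```python
# P(a ∧ b) = P(a|b) P(b)
p_toothache = 0.2                  # P(b), assumed for illustration
p_cavity_given_toothache = 0.8     # P(a|b), from the slide above
p_both = p_cavity_given_toothache * p_toothache
print(p_both)                      # P(cavity ∧ toothache) = 0.16
```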
Axioms of Probability
• Basic axioms [Kolmogorov's axioms]:
1. All probabilities are between 0 & 1: 0 ≤ P(a) ≤ 1
2. Valid propositions have probability 1, unsatisfiable ones have probability 0: P(true) = 1 & P(false) = 0
3. The probability of a disjunction is given by: P(a ∨ b) = P(a) + P(b) − P(a ∧ b)
Inference Using the Full Joint Distribution
• Probabilistic inference: the computation, from observed evidence, of posterior probabilities for query propositions.
• Eg: a domain consisting of 3 Boolean variables: Toothache, Cavity, Catch.
Its full joint distribution is a 2 × 2 × 2 table.
• P(cavity ∨ toothache) = 0.108 + 0.012 + 0.072 + 0.008 + 0.016 + 0.064 = 0.28
• Adding the entries in the first row of the table gives a marginal probability:
P(cavity) = 0.108 + 0.012 + 0.072 + 0.008 = 0.2
• This process is called marginalization or summing out.
• 1/P(toothache) stays constant no matter which value of Cavity we compute, so it can be treated as a normalization constant α.
• It can be written as: P(Cavity | toothache) = α P(Cavity, toothache)
• Let X be the query variable, E the set of evidence variables, e the observed values, and Y the remaining unobserved (hidden) variables.
• The query P(X|e) is evaluated as: P(X|e) = α P(X, e) = α Σy P(X, e, y) [posterior probabilities]
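A minimal sketch of inference from the full joint distribution for this domain; the six probabilities quoted above come from the slides, while the two remaining entries (0.144 and 0.576, which make the table sum to 1) follow the standard textbook table and are assumptions here:

```python
# Full joint distribution: (cavity, toothache, catch) -> probability
JOINT = {
    (True,  True,  True):  0.108, (True,  True,  False): 0.012,
    (True,  False, True):  0.072, (True,  False, False): 0.008,
    (False, True,  True):  0.016, (False, True,  False): 0.064,
    (False, False, True):  0.144, (False, False, False): 0.576,
}

def prob(event):
    """P(event) by summing all atomic events consistent with `event`."""
    total = 0.0
    for (cavity, toothache, catch), p in JOINT.items():
        world = {"cavity": cavity, "toothache": toothache, "catch": catch}
        if all(world[var] == val for var, val in event.items()):
            total += p
    return total

p_cavity = prob({"cavity": True})                    # marginal: 0.2
p_toothache = prob({"toothache": True})              # 0.2
p_both = prob({"cavity": True, "toothache": True})   # 0.12
print(p_cavity + p_toothache - p_both)               # P(cavity ∨ toothache) = 0.28
print(p_both / p_toothache)                          # P(cavity | toothache) = 0.6
```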
Probability Inference
• From the product rule: P(a ∧ b) = P(a|b) P(b) and P(a ∧ b) = P(b|a) P(a)
• Equating the two right-hand sides and dividing by P(a): P(b|a) = P(a|b) P(b) / P(a)
• This is Bayes' rule (also Bayes' law or Bayes' theorem).
Independence
• The full joint distribution over all the variables is P(Weather, Toothache, Catch, Cavity).
• Since the weather is independent of the dental variables, it can be decomposed as: P(Weather = cloudy, toothache, catch, cavity) = P(Weather = cloudy) P(toothache, catch, cavity)
Example: Bayes' rule
• Disease: meningitis.
• It causes the patient to have a stiff neck 50% of the time.
• The prior probability that a patient has meningitis is 1/50000.
• The prior probability that a patient has a stiff neck is 1/20.
• Let s be stiff neck & m be Meningitis.
P(s|m) = 0.5, P(m) = 1/50000, P(s) = 1/20
P(m|s) = P(s|m) P(m) / P(s) = (0.5 × 1/50000) / (1/20) = 0.0002
• So we expect only 1 in 5000 patients with a stiff neck to have meningitis.
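A quick check of the calculation:

```python
# Bayes' rule: P(m|s) = P(s|m) P(m) / P(s)
p_s_given_m = 0.5        # stiff neck given meningitis
p_m = 1 / 50000          # prior probability of meningitis
p_s = 1 / 20             # prior probability of a stiff neck

print(p_s_given_m * p_m / p_s)   # 0.0002, i.e. 1 in 5000
```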
Probabilistic Reasoning: Bayesian Networks
• A Bayesian network is a systematic way to represent independence relationships explicitly.
• It is a data structure that represents the dependencies among variables.
• It is a directed graph in which each node is annotated with quantitative probability information.
Specification of a Bayesian Network
1. A set of random variables makes up the nodes.
2. A set of directed links or arrows connects pairs of nodes; an arrow X → Y makes X the parent of Y.
3. Each node Xi has a conditional probability distribution P(Xi | Parents(Xi)).
4. The graph has no directed cycles.
Topology
• The nodes & links specify the conditional independence relationships that hold in the domain.
• Eg: variables Weather, Toothache, Catch, Cavity.
Example: the burglary network. Burglary and EarthQuake are parents of Alarm; Alarm is the parent of JohnCalls and MaryCalls. The CPTs are:

P(Burglary) = 0.001, P(EarthQuake) = 0.002

Alarm:
B E P(A)
T T 0.95
T F 0.94
F T 0.29
F F 0.001

JohnCalls:
A P(J)
T 0.90
F 0.05

MaryCalls:
A P(M)
T 0.70
F 0.01
Semantics of Bayesian network
• 2 ways to understand a network:
1. See it as a representation of the joint probability distribution.
2. View it as an encoding of a collection of conditional independence statements.
Full joint distribution
• The joint distribution is the probability of a conjunction of assignments to each variable, given by the product of the local conditional distributions:
P(x1, ..., xn) = Π i=1..n P(xi | parents(Xi))
Example
• Calculate the probability that the alarm sounds but neither a burglary nor an earthquake has occurred, and both John & Mary call.
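Working this through with the formula above and the CPTs of the burglary network:

P(j ∧ m ∧ a ∧ ¬b ∧ ¬e) = P(j|a) P(m|a) P(a|¬b ∧ ¬e) P(¬b) P(¬e)
= 0.90 × 0.70 × 0.001 × 0.999 × 0.998 ≈ 0.000628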
Inference in Bayesian Networks
• Exact inference.
• Approximate inference.
• Exact inference: compute the posterior probabilities P(X|e) for a set of query variables, given an observed event.
Eg: burglary network, with observed event JohnCalls = true & MaryCalls = true:
P(Burglary | JohnCalls = true, MaryCalls = true) = <0.284, 0.716>
Inference by Enumeration
• Conditional probabilities can be computed by summing terms from the full joint distribution: the query P(X|e) is evaluated as P(X|e) = α Σy P(X, e, y).
Eg: P(Burglary | JohnCalls = true, MaryCalls = true).
The hidden variables are Alarm & EarthQuake.
• Applying the semantics of the Bayesian network:
P(b | j, m) = α Σe Σa P(b) P(e) P(a|b,e) P(j|a) P(m|a)
• P(b) is constant and can be moved outward; P(e) can be moved outward of the summation over a:
P(b | j, m) = α P(b) Σe P(e) Σa P(a|b,e) P(j|a) P(m|a)
• Evaluating this structure computationally gives P(Burglary | j, m) ≈ <0.284, 0.716>, as sketched below.
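A minimal sketch of inference by enumeration for the burglary network; the variable names and CPT encoding are my own illustration of the algorithm:

```python
from typing import Dict

VARS = ["B", "E", "A", "J", "M"]  # a topological order of the network

def cpt(var: str, val: bool, ev: Dict[str, bool]) -> float:
    """Return P(var = val | parents(var)) using the CPTs above."""
    if var == "B":
        p = 0.001
    elif var == "E":
        p = 0.002
    elif var == "A":
        p = {(True, True): 0.95, (True, False): 0.94,
             (False, True): 0.29, (False, False): 0.001}[(ev["B"], ev["E"])]
    elif var == "J":
        p = 0.90 if ev["A"] else 0.05
    else:  # var == "M"
        p = 0.70 if ev["A"] else 0.01
    return p if val else 1.0 - p

def enumerate_all(variables, ev):
    """Sum the product of CPT entries over all hidden-variable assignments."""
    if not variables:
        return 1.0
    first, rest = variables[0], variables[1:]
    if first in ev:
        return cpt(first, ev[first], ev) * enumerate_all(rest, ev)
    return sum(cpt(first, v, ev) * enumerate_all(rest, {**ev, first: v})
               for v in (True, False))

def enumeration_ask(X, ev):
    """Return the normalized distribution P(X | ev)."""
    dist = {v: enumerate_all(VARS, {**ev, X: v}) for v in (True, False)}
    norm = sum(dist.values())
    return {v: p / norm for v, p in dist.items()}

print(enumeration_ask("B", {"J": True, "M": True}))
# -> approximately {True: 0.284, False: 0.716}
```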
Approximate Inference
• Exact inference is not feasible in large, multiply connected networks.
• Monte Carlo algorithms are used instead to provide approximate answers from randomly generated samples.
• 2 ways to calculate:
1. Direct sampling methods.
2. Markov chain simulation.
Direct Sampling
• Generates samples from a known probability distribution.
• Eg: assuming the topological ordering [Cloudy, Sprinkler, Rain, WetGrass], each variable is sampled in turn, conditioned on the values already assigned to its parents.
Rejection Sampling
• Estimates conditional probabilities P(X|e) by discarding samples that do not match the evidence e.
• Eg: estimate P(Rain | Sprinkler = true) using 100 samples.
• 73 samples have Sprinkler = false and are rejected; 27 have Sprinkler = true.
• Of the 27, 8 have Rain = true & 19 have Rain = false, so P(Rain | Sprinkler = true) ≈ <8/27, 19/27> ≈ <0.296, 0.704>. A sketch of both sampling methods follows.
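A minimal sketch of direct (prior) sampling and rejection sampling for the Cloudy/Sprinkler/Rain/WetGrass network; the CPT values follow the standard textbook example and should be treated as assumptions here:

```python
import random

def prior_sample():
    """Direct sampling in the order [Cloudy, Sprinkler, Rain, WetGrass]."""
    c = random.random() < 0.5                       # P(Cloudy) = 0.5 (assumed)
    s = random.random() < (0.1 if c else 0.5)       # P(Sprinkler | Cloudy)
    r = random.random() < (0.8 if c else 0.2)       # P(Rain | Cloudy)
    w = random.random() < {(True, True): 0.99, (True, False): 0.90,
                           (False, True): 0.90, (False, False): 0.0}[(s, r)]
    return {"Cloudy": c, "Sprinkler": s, "Rain": r, "WetGrass": w}

def rejection_sample(query, evidence, n=10000):
    """Estimate P(query = true | evidence) by rejecting inconsistent samples."""
    consistent = true_count = 0
    for _ in range(n):
        sample = prior_sample()
        if all(sample[var] == val for var, val in evidence.items()):
            consistent += 1
            true_count += sample[query]
    return true_count / consistent if consistent else float("nan")

print(rejection_sample("Rain", {"Sprinkler": True}))  # ≈ 0.3 (8/27 in the 100-sample run above)
```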
Probabilistic Reasoning over Time
• Until now we considered a static world; we now turn to the dynamic aspects of a problem.
• States & observations:
Xt – the set of unobserved state variables at time t.
Et – the set of observed evidence variables at time t.
et – the set of observed values at time t.
• Eg: the umbrella problem: R0, R1, R2, ... are the state variables (Rain) and U1, U2, ... are the evidence variables (Umbrella).
Stationary Processes & the Markov Assumption
• The set of variables is unbounded, since state & evidence change over time.
• This causes 2 problems:
1. An unbounded number of conditional probability tables (one for each variable at each time step).
2. An unbounded number of parents for each variable.
• Stationary process: changes in the world are caused by a stationary process, so the conditional distributions are identical for all t.
Eg: P(Ut | Parents(Ut)) is the same for all t.
• Markov assumption: handles the infinite number of parents; the current state depends only on a finite history of previous states.
• Processes satisfying this assumption are called Markov processes or Markov chains; the common cases are first-order and second-order Markov processes.
Solutions
• First-order Markov process: the current state depends only on the previous state and not on earlier states:
P(Xt | X0:t−1) = P(Xt | Xt−1) – the transition model.
• Second-order Markov process: the current state depends on the 2 previous states: P(Xt | Xt−2, Xt−1)
• We also restrict the parents of the evidence variable Et:
P(Et | X0:t, E0:t−1) = P(Et | Xt) – the sensor model.
Temporal Model
• HTM (Hierarchical Temporal Memory) is a biomimetic model based on the memory-prediction theory of brain function described by Jeff Hawkins in his book On Intelligence.
Inference in Temporal Models
• Basic inference tasks:
Filtering or monitoring: computing the belief state over the current state, given all observations so far.
Prediction: computing the distribution over a future state.
Smoothing or hindsight: computing the distribution over a past state, given observations up to the present.
Most likely explanation: finding the sequence of states that most likely generated the observations.
Eg: Rain = [true, true, false, true]
• Filtering: P(Xt+1 | e1:t+1) = α P(et+1 | Xt+1) Σxt P(Xt+1 | xt) P(xt | e1:t), i.e. f1:t+1 = FORWARD(f1:t, et+1)
• Prediction: P(Xt+k+1 | e1:t) = Σxt+k P(Xt+k+1 | xt+k) P(xt+k | e1:t)
• Smoothing: P(Xk | e1:t) for 1 ≤ k ≤ t: P(Xk | e1:t) = α P(Xk | e1:k) P(ek+1:t | Xk) = α f1:k × bk+1:t
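A minimal sketch of the filtering (forward) update for the umbrella world; the transition probability P(rain t | rain t−1) = 0.7 and sensor probabilities P(umbrella | rain) = 0.9, P(umbrella | ¬rain) = 0.2 follow the standard textbook example and are assumptions here:

```python
# Transition model P(Rain_t | Rain_{t-1}) and sensor model P(Umbrella | Rain_t).
T = {True: {True: 0.7, False: 0.3},
     False: {True: 0.3, False: 0.7}}
S = {True: 0.9, False: 0.2}

def forward(belief, umbrella):
    """One filtering step: predict via the transition model, then weight by the evidence."""
    predicted = {r: sum(T[r0][r] * belief[r0] for r0 in belief) for r in (True, False)}
    weighted = {r: (S[r] if umbrella else 1 - S[r]) * p for r, p in predicted.items()}
    norm = sum(weighted.values())
    return {r: p / norm for r, p in weighted.items()}

belief = {True: 0.5, False: 0.5}   # uniform prior P(Rain_0)
for u in (True, True):             # umbrella observed on days 1 and 2
    belief = forward(belief, u)
print(belief)                      # P(Rain_2 | u_1, u_2) ≈ {True: 0.883, False: 0.117}
```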
Hidden Markov Models
• The state of the process is described by a single discrete random variable.
• This allows a simplified matrix implementation of the algorithms, with transition matrix entries Tij = P(Xt = j | Xt−1 = i).
• Eg: umbrella world transition model: T = (0.7 0.3 ; 0.3 0.7), as in the sketch below.
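A minimal sketch of the matrix form of the forward step, f1:t+1 = α Ot+1 Tᵀ f1:t, for the umbrella world; the matrix values follow the standard textbook example and are assumptions here:

```python
import numpy as np

T = np.array([[0.7, 0.3],          # T[i][j] = P(X_t = j | X_{t-1} = i)
              [0.3, 0.7]])
O_umbrella = np.diag([0.9, 0.2])   # diagonal sensor matrix for Umbrella = true

def forward_step(f, O):
    """One step of f_{1:t+1} = alpha * O_{t+1} T^T f_{1:t}."""
    f = O @ T.T @ f
    return f / f.sum()

f = np.array([0.5, 0.5])           # uniform prior over [rain, no rain]
for _ in range(2):                 # umbrella observed on days 1 and 2
    f = forward_step(f, O_umbrella)
print(f)                           # ≈ [0.883, 0.117]
```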