Artificial Intelligence Notes


UNIT 4

Uncertainty

• Agents almost never have access to the whole truth about their environment.

• Eg: Wumpus world – the agent cannot know which square contains the pit; uncertainty arises as soon as it enters any square.

Handling Uncertain knowledge

• Diagnosis always involves uncertainty.

• Eg: Dental diagnosis: a first attempt at a rule is
∀p Symptom(p, Toothache) ⇒ Disease(p, Cavity)

This is wrong: not every toothache is caused by a cavity. It can be changed to:
∀p Symptom(p, Toothache) ⇒ Disease(p, Cavity) ∨ Disease(p, GumDisease) ∨ Disease(p, Abscess) ∨ ...

The causal rule for this:
∀p Disease(p, Cavity) ⇒ Symptom(p, Toothache)
is also wrong, since not all cavities cause pain.

• Why FOL fails for medical diagnosis:
Laziness: It is too much work to list the complete set of antecedents and consequents needed to make the rule exceptionless.

Theoretical ignorance: There is no complete theory for the domain. Practical ignorance: Even if all the rules are known, we may be uncertain about a particular patient because not all the necessary tests can be run.

Degree of belief
• The agent's knowledge can at best provide a degree of belief in the relevant sentences.

• The main tool for dealing with degrees of belief is probability theory.

• Probability theory assigns each sentence a degree of belief between 0 and 1.
• Probability provides a way of summarizing the uncertainty that comes from our laziness and ignorance.

• Beliefs depend on the percepts; the agent updates its probabilities after receiving new percepts.

• Two types: 1. Unconditional probability (prior probability) – before the evidence is obtained.

2. Conditional probability (posterior probability) – after the evidence is obtained.

Rational decisions
• An agent has preferences between different possible outcomes.

• Utility theory is used to represent and reason with preferences (utility: the quality of being useful).

Decision theory = probability theory + Utility theory

• The agent chooses the action that yields the highest expected utility (the principle of Maximum Expected Utility, MEU), as in the sketch below.
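Below is a minimal Python sketch of the MEU principle; the actions, outcome probabilities, and utility values are illustrative assumptions, not values from the notes.

```python
# Minimal sketch of the Maximum Expected Utility (MEU) principle.
# Actions, outcome probabilities, and utilities are illustrative assumptions.
actions = {
    # action: list of (probability of outcome, utility of outcome)
    "treat_cavity": [(0.9, 100), (0.1, -50)],
    "wait_and_see": [(0.4, 100), (0.6, -200)],
}

def expected_utility(outcomes):
    """EU(action) = sum over outcomes of P(outcome) * U(outcome)."""
    return sum(p * u for p, u in outcomes)

# The rational agent picks the action with the highest expected utility.
best_action = max(actions, key=lambda a: expected_utility(actions[a]))
print(best_action)   # 'treat_cavity' (EU = 85 vs. -80)
```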

Basic Probability
• Degrees of belief are always applied to propositions.

• Basic element – the random variable.
• Types: 1. Boolean random variable: Eg: Cavity has the domain <true, false>

2. Discrete random variable: Eg: Weather has the domain <sunny, rainy, cloudy, snow>

3. Continuous random variable: takes on values from the real numbers.

Eg: X = 3.75

Atomic events
• An atomic event is a complete specification of the state of the world about which the agent is uncertain.

• Eg: the variables Cavity and Toothache give four distinct atomic events:

Cavity = false ∧ Toothache = true
Cavity = true ∧ Toothache = true
Cavity = false ∧ Toothache = false
Cavity = true ∧ Toothache = false
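These atomic events can be enumerated mechanically; a small Python sketch (illustrative only):

```python
from itertools import product

# Enumerate all atomic events (complete assignments) for two Boolean variables.
variables = ["Cavity", "Toothache"]
for values in product([True, False], repeat=len(variables)):
    print(dict(zip(variables, values)))
# 2 Boolean variables -> 2^2 = 4 distinct atomic events, as listed above
```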

Prior Probability
• The degree of belief in a proposition in the absence of any other information (unconditional).

• Written as P(a).
• Eg: if the prior probability of a cavity is 0.1, then P(Cavity = true) = 0.1, or P(cavity) = 0.1.

• P(Weather) denotes a vector of values.

P(Weather = sunny) = 0.7
P(Weather = rain) = 0.2
P(Weather = cloudy) = 0.08
P(Weather = snow) = 0.02
• Can be written as P(Weather) = <0.7, 0.2, 0.08, 0.02>

• This is called the prior probability distribution for Weather.

Probability distributions
• Joint probability distribution: P(Weather, Cavity) – represented by a 4 × 2 table of probabilities.

• Full joint probability distribution: P(Cavity, Toothache, Weather) – represented by a 2 × 2 × 4 table with 16 entries.

• Probability distributions for continuous variables are called probability density functions.

Eg: P(x=5.67)

Posterior probability (conditional)

• P(a|b)- “the probability of a given that all we know is b”

• Eg: P(Cavity | toothache) = 0.8
• Conditional probabilities can be defined in terms of unconditional ones: P(a|b) = P(a ∧ b) / P(b)
• Can be written as P(a ∧ b) = P(a|b) P(b) [Product rule]

Also P(a ∧ b) = P(b|a) P(a)

Axioms of Probability
• Basic axioms (Kolmogorov's axioms):
1. All probabilities are between 0 and 1:

0 ≤ P(a) ≤ 1
2. Necessarily true (valid) propositions have probability 1, and necessarily false (unsatisfiable) propositions have probability 0:

P(true) = 1 and P(false) = 0
3. The probability of a disjunction is given by:

P(a ∨ b) = P(a) + P(b) – P(a ∧ b)
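A brief worked instance of the disjunction axiom, using values that follow from the toothache/cavity table given later in these notes (P(cavity) = 0.2, P(toothache) = 0.2, P(cavity ∧ toothache) = 0.12):

P(cavity ∨ toothache) = 0.2 + 0.2 – 0.12 = 0.28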

Inference Using Full joint distribution

• Probabilistic inference – the computation of posterior probabilities for query propositions from observed evidence.

• Eg: a domain consisting of 3 Boolean variables: Toothache, Cavity, Catch.

The full joint distribution is a 2 × 2 × 2 table with 8 entries.

• P(cavity ∨ toothache) = 0.108 + 0.012 + 0.072 + 0.008 + 0.016 + 0.064 = 0.28
• Adding the entries in the first row gives a marginal probability:

P(Cavity) = 0.108+0.012+0.072+0.008 = 0.2

• This process is called marginalization or summing out.

• Conditional probabilities can be computed using the product rule, e.g. P(cavity | toothache) = P(cavity ∧ toothache) / P(toothache).

• The general form of a query is a conditional probability P(X | e).

• 1/P(toothache) is a constant that normalizes the distribution – the normalization constant, often written α.

• It can be written as: P(Cavity | toothache) = α P(Cavity, toothache).

• Let X be the query variable, E the set of evidence variables, e the observed values, and Y the remaining unobserved (hidden) variables.

• The query P(X|e) is evaluated as P(X|e) = α P(X, e) = α Σy P(X, e, y)  [posterior probabilities], as in the sketch below.
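A minimal Python sketch of inference over the full joint distribution for the Toothache/Catch/Cavity example. Six of the eight joint entries appear in these notes; the two marked as assumed are the standard textbook values.

```python
# Inference by summing entries of the full joint distribution.
joint = {
    # (toothache, catch, cavity): probability
    (True,  True,  True):  0.108,
    (True,  False, True):  0.012,
    (True,  True,  False): 0.016,
    (True,  False, False): 0.064,
    (False, True,  True):  0.072,
    (False, False, True):  0.008,
    (False, True,  False): 0.144,  # assumed (standard textbook value)
    (False, False, False): 0.576,  # assumed (standard textbook value)
}

# Marginalization (summing out): P(Cavity = true)
p_cavity = sum(p for (t, c, cav), p in joint.items() if cav)
print(p_cavity)                       # 0.2

# P(cavity OR toothache): sum every atomic event where either holds
p_cav_or_tooth = sum(p for (t, c, cav), p in joint.items() if cav or t)
print(p_cav_or_tooth)                 # 0.28

# Conditional probability using the normalization constant 1/P(toothache)
p_toothache = sum(p for (t, c, cav), p in joint.items() if t)
p_cavity_given_toothache = sum(
    p for (t, c, cav), p in joint.items() if cav and t) / p_toothache
print(round(p_cavity_given_toothache, 3))   # 0.6
```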

Probabilistic Inference
• Start from the observed evidence.
• Product rule: P(a ∧ b) = P(a|b) P(b) and P(a ∧ b) = P(b|a) P(a)
• Equating the two right-hand sides and dividing by P(a): P(b|a) = P(a|b) P(b) / P(a)

• This is Bayes' rule (also called Bayes' law or Bayes' theorem).

Independence

• The full joint distribution over the variables is P(Weather, Toothache, Catch, Cavity).

• Since Weather is independent of the dental variables, this can be factored, e.g. P(Weather = cloudy, Toothache, Catch, Cavity) = P(Weather = cloudy) P(Toothache, Catch, Cavity).

Bayes' rule

P(b|a) = P(a|b) P(b) / P(a)
• Bayes' rule (Bayes' law or Bayes' theorem)

Example: Bayes' rule
• Disease: meningitis.
• It causes the patient to have a stiff neck 50% of the time.

• The prior probability that a patient has meningitis is 1/50000.

• The prior probability that a patient has a stiff neck is 1/20.

• Let s be stiff neck and m be meningitis.

P(s|m) = 0.5, P(m) = 1/50000, P(s) = 1/20
P(m|s) = P(s|m) P(m) / P(s)

= (0.5 × 1/50000) / (1/20) = 0.0002
• That is, we expect 1 in 5000 patients with a stiff neck to have meningitis.
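The same calculation as a short Python sketch (variable names are illustrative):

```python
# Bayes' rule: P(m|s) = P(s|m) * P(m) / P(s), using the values from the notes.
p_s_given_m = 0.5        # P(stiff neck | meningitis)
p_m = 1 / 50000          # prior P(meningitis)
p_s = 1 / 20             # prior P(stiff neck)

p_m_given_s = p_s_given_m * p_m / p_s
print(p_m_given_s)       # 0.0002, i.e. 1 in 5000
```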

Probabilistic Reasoning – Bayesian Networks

• A Bayesian network is a systematic way to represent independence relationships explicitly.

• It is a data structure that represents the dependencies among variables.

• It is a directed graph in which each node is annotated with quantitative probability information.

Specification of a Bayesian network
1. A set of random variables makes up the nodes.
2. A set of directed links or arrows connects pairs of nodes; if there is an arrow from X to Y, X is a parent of Y.
3. Each node Xi has a conditional probability distribution P(Xi | Parents(Xi)).
4. The graph has no directed cycles.

X → Y

Topology
• The nodes and links together specify the conditional independence relationships that hold in the domain.

• Eg: the variables Weather, Toothache, Catch, Cavity.

Example: the burglary network
Nodes: Burglary, Earthquake, Alarm, JohnCalls, MaryCalls.
Burglary and Earthquake are parents of Alarm; Alarm is the parent of JohnCalls and MaryCalls.

P(B) = 0.001    P(E) = 0.002

Alarm:
B E | P(A)
T T | 0.95
T F | 0.94
F T | 0.29
F F | 0.001

JohnCalls:
A | P(J)
T | 0.90
F | 0.05

MaryCalls:
A | P(M)
T | 0.70
F | 0.01

Semantics of Bayesian network

• There are 2 ways to understand the semantics: 1. To see the network as a representation of the joint probability distribution.

2. To view it as an encoding of a collection of conditional independence statements.

Full joint distribution

• Each entry in the joint distribution is the probability of a conjunction of particular assignments to the variables: P(x1, ..., xn) = Πi P(xi | parents(Xi)).

Example
• Calculate the probability that the alarm sounds but neither a burglary nor an earthquake has occurred, and both John and Mary call (see the sketch below).
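A minimal Python sketch of this calculation using the Bayesian-network semantics and the CPT values listed above:

```python
# One entry of the joint distribution: P(j, m, a, ~b, ~e)
#   = P(j|a) * P(m|a) * P(a|~b,~e) * P(~b) * P(~e)
p_j_given_a     = 0.90        # P(JohnCalls = true | Alarm = true)
p_m_given_a     = 0.70        # P(MaryCalls = true | Alarm = true)
p_a_given_nb_ne = 0.001       # P(Alarm = true | Burglary = false, Earthquake = false)
p_not_b         = 1 - 0.001   # P(Burglary = false)
p_not_e         = 1 - 0.002   # P(Earthquake = false)

p = p_j_given_a * p_m_given_a * p_a_given_nb_ne * p_not_b * p_not_e
print(p)   # ≈ 0.000628
```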

Node ordering

Inference in Bayesian Networks
• Exact inference.

• Approximate inference.
• Exact inference: compute the posterior probability for a set of query variables, given an observed event.

Posterior probabilities P(X|e). Eg: Burglary network, observed event: JohnCalls = true and MaryCalls = true.

P(Burglary | JohnCalls = true, MaryCalls = true)

= <0.284,0.716>

Inference by Enumeration
• Conditional probabilities are computed by summing terms from the full joint distribution. The query P(X|e) is evaluated as P(X|e) = α P(X, e) = α Σy P(X, e, y).

Eg: P(Burglary | JohnCalls = true, MaryCalls = true).

The hidden variables are Alarm and Earthquake.

• Applying the semantics of the Bayesian network: P(b | j, m) = α Σe Σa P(b) P(e) P(a|b,e) P(j|a) P(m|a)

• P(b) is a constant and can be moved outside the summations; P(e) can be moved outside the summation over a: P(b | j, m) = α P(b) Σe P(e) Σa P(a|b,e) P(j|a) P(m|a)

• The structure of this computation: the expression is evaluated by looping over the values of the hidden variables, as in the sketch below.
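A minimal Python sketch of inference by enumeration for this query, assuming the CPTs are stored in simple dictionaries built from the tables above:

```python
# Inference by enumeration for P(Burglary | JohnCalls=true, MaryCalls=true).
P_B = {True: 0.001, False: 0.999}
P_E = {True: 0.002, False: 0.998}
P_A = {(True, True): 0.95, (True, False): 0.94,
       (False, True): 0.29, (False, False): 0.001}   # P(A=true | B, E)
P_J = {True: 0.90, False: 0.05}                       # P(J=true | A)
P_M = {True: 0.70, False: 0.01}                       # P(M=true | A)

def unnormalized(b):
    """P(b) * sum_e P(e) * sum_a P(a|b,e) P(j|a) P(m|a), with j = m = true."""
    total = 0.0
    for e in (True, False):
        inner = 0.0
        for a in (True, False):
            p_a = P_A[(b, e)] if a else 1 - P_A[(b, e)]
            inner += p_a * P_J[a] * P_M[a]
        total += P_E[e] * inner
    return P_B[b] * total

scores = {b: unnormalized(b) for b in (True, False)}
alpha = 1 / sum(scores.values())                      # normalization constant
print({b: round(alpha * v, 3) for b, v in scores.items()})
# ≈ {True: 0.284, False: 0.716}
```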

Multiply connected network

Approximate Inference
• Exact inference is not practical in large multiply connected networks.

• Monte Carlo algorithms are used to provide approximate answers by generating samples.

• 2 families of methods: 1. Direct sampling methods. 2. Markov chain simulation.

Direct Sampling
• Generates samples from a known probability distribution.

• Eg: samples are generated following the topological ordering [Cloudy, Sprinkler, Rain, WetGrass].

Rejection Sampling
• Used to estimate conditional probabilities P(X|e).
• Eg: estimate P(Rain | Sprinkler = true) using 100 samples.

• 73 samples have Sprinkler = false and are rejected; 27 have Sprinkler = true.

• Of the 27, 8 have Rain = true and 19 have Rain = false, so the estimate is P(Rain | Sprinkler = true) ≈ <8/27, 19/27> ≈ <0.296, 0.704>. A sketch of the procedure follows.
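A minimal Python sketch of rejection sampling for this query. The sampling and rejection procedure follows the notes; the sprinkler-network CPT values themselves are the standard textbook numbers, assumed here because the notes do not list them.

```python
import random

# Rejection sampling estimate of P(Rain = true | Sprinkler = true).
P_CLOUDY = 0.5
P_SPRINKLER = {True: 0.10, False: 0.50}   # P(Sprinkler=true | Cloudy) - assumed
P_RAIN = {True: 0.80, False: 0.20}        # P(Rain=true | Cloudy)      - assumed

def prior_sample():
    """Direct sampling: sample each variable in topological order."""
    cloudy = random.random() < P_CLOUDY
    sprinkler = random.random() < P_SPRINKLER[cloudy]
    rain = random.random() < P_RAIN[cloudy]
    return sprinkler, rain

def estimate_rain_given_sprinkler(n=100):
    # Reject every sample that disagrees with the evidence Sprinkler = true.
    accepted = [r for s, r in (prior_sample() for _ in range(n)) if s]
    if not accepted:
        return None
    return sum(accepted) / len(accepted)  # fraction of kept samples with Rain=true

print(estimate_rain_given_sprinkler(100))
```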

Probabilistic Reasoning over Time
• So far the world has been static; we now consider the dynamic aspects of the problem.

• States & observations:
Xt – the set of unobservable state variables at time t.
Et – the set of observable evidence variables at time t.
et – the observed values at time t.
• Eg: Umbrella world: R0, R1, R2, ... are the state variables (Rain) and U1, U2, ... are the evidence variables (Umbrella).

Stationary Processes & the Markov Assumption

• The set of variables is unbounded, since the state and evidence change over time.

• This raises 2 problems: 1. An unbounded number of conditional probability tables (one for each variable at each time step).

2. An unbounded number of parents for each variable.

• Stationary process assumption: changes in the world are caused by a stationary process, so the conditional probabilities do not change over time.

Eg: P(Ut | Parents(Ut)) is the same for all t.
• Markov assumption – handles the unbounded number of parents: the current state depends only on a finite history of previous states.

• Markov processes (chains): first-order Markov process, second-order Markov process.

Solutions

• First-order Markov process: the current state depends only on the previous state, not on earlier states.

P(Xt | X0:t−1) = P(Xt | Xt−1) – the transition model

• Second-order Markov process: the current state depends on the 2 previous states: P(Xt | Xt−2, Xt−1).
• We also restrict the parents of the evidence variable Et:

P(Et | X0:t, E0:t−1) = P(Et | Xt) – the sensor model

Example: first-order vs. second-order Markov process (transition diagrams)

If the first-order approximation proves too inaccurate in predicting, there are 2 solutions:
1. Increase the order of the Markov process model.
2. Increase the set of state variables.

Temporal Model

• HTM (Hierarchical Temporal Memory) is a biomimetic model based on the memory-prediction theory of brain function described by Jeff Hawkins in his book On Intelligence.

Inference in Temporal Model

• Basic inference tasks:
Filtering or monitoring: computing the belief state for the current state, given all evidence to date.

Prediction: computing the distribution over a future state, given all evidence to date.
Smoothing or hindsight: computing the distribution over a past state, given all evidence up to the present.

Most likely explanation: finding the sequence of states most likely to have generated the observations.

Eg: the most likely Rain sequence given the observations [true, true, false, true].

• Filtering: P(Xt+1 | e1:t+1) = f(et+1, P(Xt | e1:t)) – the new estimate is computed from the new evidence et+1 and the previous estimate P(Xt | e1:t) (the forward message f1:t).

• Prediction: P(Xt+k+1 | e1:t) = Σxt+k P(Xt+k+1 | xt+k) P(xt+k | e1:t)

• Smoothing: P(Xk | e1:t) for 1 ≤ k ≤ t:
P(Xk | e1:t) = α P(Xk | e1:k) P(ek+1:t | Xk) = α f1:k × bk+1:t


• Finding the most likely sequence:

Hidden Markov Models
• The state of the process is described by a single discrete random variable.

• Simplified matrix algorithms: the transition model becomes a matrix T, where Tij = P(Xt = j | Xt−1 = i).
• Eg: Umbrella world (transition model).

• Sensor model: a diagonal matrix Ot whose diagonal entries are P(et | Xt = i). Eg: Umbrella world with U1 = true:
O1 = | 0.9  0   |
     | 0    0.2 |
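A minimal Python sketch of one matrix-form filtering (forward) step for the umbrella world. The sensor matrix uses the 0.9 / 0.2 values from the notes; the transition matrix values (0.7 / 0.3) and the uniform prior are the standard textbook numbers, assumed here because the notes do not list them.

```python
import numpy as np

# One forward (filtering) step in matrix form: f_{1:t+1} = alpha * O_{t+1} * T^T * f_{1:t}
T = np.array([[0.7, 0.3],      # assumed: P(Rain_t | Rain_{t-1}=true), P(~Rain_t | ...)
              [0.3, 0.7]])     # assumed: P(Rain_t | Rain_{t-1}=false), P(~Rain_t | ...)
O1 = np.diag([0.9, 0.2])       # from the notes: P(U_1=true | Rain_1), P(U_1=true | ~Rain_1)

f0 = np.array([0.5, 0.5])      # assumed uniform prior P(Rain_0)
f1 = O1 @ T.T @ f0             # unnormalized forward message
f1 = f1 / f1.sum()             # normalize (the alpha constant)
print(f1)                      # ≈ [0.818, 0.182] = P(Rain_1 | U_1=true)
```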

