A LOGIC BASED CLASSIFICATION TECHNIQUE
General-to-Specific Ordering
Logic Based Classification 28/29/03
Sky    AirTemp  Humidity  Wind    Water  Forecast  EnjoySport
Sunny  Warm     Normal    Strong  Warm   Same      Yes
Sunny  Warm     High      Strong  Warm   Same      Yes
Rainy  Cold     High      Strong  Warm   Change    No
Sunny  Warm     High      Strong  Cool   Change    Yes
Logic Based
Like a decision tree, it asks questions in sequence: Sky? Sunny, OK. Wind? Strong, OK. Then yes, enjoy sport.

Expression
<Sunny,?,?,Strong,?,?>
Means we will enjoy sport only when the sky is sunny and the wind is strong; the other attributes don't matter.

Candidate Elimination
With candidate elimination, the object is to predict the class through the use of expressions.
?'s are like wild cards.
Expressions represent conjunctions.

First Approach
Finding a maximally specific hypothesis
Start with the most restrictive (specific) hypothesis available and relax it just enough to satisfy each positive training sample.
Most general (every dimension can take any value): <?,?,?,?,?,?>
Most restrictive (no dimension can take any value): <Ø,Ø,Ø,Ø,Ø,Ø>
A Ø means nothing will match it.

That pesky Ø
What if an expression contains a single Ø?
Remember, the expression is a conjunction, so one Ø is enough for the whole expression to match nothing. It is equivalent to Ø.

Find-S Algorithm
1. Initialize h to the most specific hypothesis in H (<Ø,Ø,Ø,Ø,Ø,Ø>)
2. For each positive training instance x
     For each attribute constraint ai in h
       If the constraint ai is satisfied by x, then do nothing
       Else replace ai in h by the next more general constraint that is satisfied by x
3. Return h
Order of generality
? is more general than a specific attribute value, which is more general than Ø
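
The Find-S steps can be sketched in Python. This is a minimal illustration, not a reference implementation: the names find_s, TRAINING, WILDCARD and EMPTY are made up for this sketch, and the data is the EnjoySport table from the slides.

```python
WILDCARD = "?"  # matches any value
EMPTY = "Ø"     # matches nothing

# EnjoySport training data from the slides: (instance, label)
TRAINING = [
    (("Sunny", "Warm", "Normal", "Strong", "Warm", "Same"), "Yes"),
    (("Sunny", "Warm", "High", "Strong", "Warm", "Same"), "Yes"),
    (("Rainy", "Cold", "High", "Strong", "Warm", "Change"), "No"),
    (("Sunny", "Warm", "High", "Strong", "Cool", "Change"), "Yes"),
]


def find_s(examples):
    """Return the maximally specific hypothesis covering every positive example."""
    h = [EMPTY] * len(examples[0][0])
    for x, label in examples:
        if label != "Yes":
            continue  # Find-S ignores negative examples
        for i, value in enumerate(x):
            if h[i] == EMPTY:
                h[i] = value      # Ø relaxes to the observed value
            elif h[i] != value:
                h[i] = WILDCARD   # a conflicting value relaxes to ?
    return tuple(h)


print(find_s(TRAINING))
```

On the slide data this reproduces the hypothesis <Sunny,Warm,?,Strong,?,?> that the worked example arrives at.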

Example
Set h to <Ø,Ø,Ø,Ø,Ø,Ø>
First positive (x): <Sunny,Warm,Normal,Strong,Warm,Same>
Which constraints in h are satisfied by x? None.
Replace each ai with the next more general constraint satisfied by x:
<Sunny,Warm,Normal,Strong,Warm,Same>

Example
h is now <Sunny,Warm,Normal,Strong,Warm,Same>
Next positive: <Sunny,Warm,High,Strong,Warm,Same>
Which constraints in h are satisfied by x? All except humidity.
Replace h with <Sunny,Warm,?,Strong,Warm,Same>

Example
h is now <Sunny,Warm,?,Strong,Warm,Same>
The third record is negative, so Find-S skips it.
Next positive: <Sunny,Warm,High,Strong,Cool,Change>
Which constraints in h are satisfied by x? All except water and forecast.
Replace h with <Sunny,Warm,?,Strong,?,?>
Return <Sunny,Warm,?,Strong,?,?>
Can one use this to "test" a new instance?

Next: Version Space
What if we want all hypotheses that are consistent with a training set (called a version space)?
A hypothesis h is consistent with a set of training examples if and only if h(x)=c(x) for each training example.
For the EnjoySport data, from most specific to most general:
<Sunny,Warm,?,Strong,?,?>
<Sunny,?,?,Strong,?,?>   <Sunny,Warm,?,?,?,?>   <?,Warm,?,Strong,?,?>
<Sunny,?,?,?,?,?>   <?,Warm,?,?,?,?>

List-Then-Eliminate Algorithm
1. Start with a list containing every hypothesis in H
2. For each training example <x, c(x)>
     Remove from the list any hypothesis h for which h(x) ≠ c(x)
3. Output the list of hypotheses that remain
Exhaustive
• Number of hypotheses that can be represented: 5,120 (5*4*4*4*4*4)
• But a single Ø represents an empty set
• So semantically distinct hypotheses: 973
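
A brute-force sketch of List-Then-Eliminate in Python, assuming the attribute domains implied elsewhere in the slides (Sky has three values, the others two). The names DOMAINS, matches and list_then_eliminate are made up for this illustration.

```python
from itertools import product

# Assumed attribute domains: Sky has three values, the rest two.
DOMAINS = [
    ("Sunny", "Cloudy", "Rainy"),
    ("Warm", "Cold"),
    ("Normal", "High"),
    ("Strong", "Light"),
    ("Warm", "Cool"),
    ("Same", "Change"),
]

# EnjoySport training data: (instance, c(x)) with True meaning Yes
TRAINING = [
    (("Sunny", "Warm", "Normal", "Strong", "Warm", "Same"), True),
    (("Sunny", "Warm", "High", "Strong", "Warm", "Same"), True),
    (("Rainy", "Cold", "High", "Strong", "Warm", "Change"), False),
    (("Sunny", "Warm", "High", "Strong", "Cool", "Change"), True),
]


def matches(h, x):
    """h(x): True when every constraint is ? or equals the attribute value."""
    return all(c == "?" or c == v for c, v in zip(h, x))


def list_then_eliminate(domains, examples):
    # All semantically distinct hypotheses: 972 without Ø, plus one all-Ø.
    hypotheses = list(product(*[d + ("?",) for d in domains]))
    hypotheses.append(("Ø",) * len(domains))
    for x, c_x in examples:
        # Keep only hypotheses with h(x) == c(x)
        hypotheses = [h for h in hypotheses if matches(h, x) == c_x]
    return hypotheses


version_space = list_then_eliminate(DOMAINS, TRAINING)
for h in version_space:
    print(h)
```

With no training examples the list holds all 973 semantically distinct hypotheses; after the four EnjoySport records only the six hypotheses of the version space remain.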

Next: Candidate Elimination
More compact representation
Keep just those hypotheses at the extreme ends: those that are the most general and those that are the most specific.
All else in between would necessarily be in the version space.
Process of Elimination

Definitions
And now for something totally formal:
The general boundary G, with respect to hypothesis space H and training data D, is the set of maximally general members of H consistent with D.
G is the set of all g in H such that g is consistent with D and there does not exist a g' in H that is more general than g and also consistent with D.
\[ G \equiv \{\, g \in H \mid \mathit{Consistent}(g, D) \land (\neg \exists g' \in H)[(g' >_g g) \land \mathit{Consistent}(g', D)] \,\} \]

Definitions
The specific boundary S, with respect to hypothesis space H and training data D, is the set of minimally general (that is, maximally specific) members of H consistent with D.
S is the set of all s in H such that s is consistent with D and there does not exist an s' in H that is more specific than s and also consistent with D.
\[ S \equiv \{\, s \in H \mid \mathit{Consistent}(s, D) \land (\neg \exists s' \in H)[(s >_g s') \land \mathit{Consistent}(s', D)] \,\} \]

Example
All yes's are sunny, warm, and strong.
But "strong" isn't enough to identify a yes.
S: {<Sunny,Warm,?,Strong,?,?>}   (3 ?'s)
<Sunny,?,?,Strong,?,?>   <Sunny,Warm,?,?,?,?>   <?,Warm,?,Strong,?,?>   (4 ?'s)
G: {<Sunny,?,?,?,?,?>, <?,Warm,?,?,?,?>}   (5 ?'s)

Approach
Start with two extremes
Most general (every dimension can take any value): <?,?,?,?,?,?>
Most restrictive (no dimension can take any value): <Ø,Ø,Ø,Ø,Ø,Ø>
Slowly work inward from both the specific end and the general end

Algorithm
Initialize G to the set of maximally general hypotheses in H
Initialize S to the set of maximally specific hypotheses in H
For each training example d, do
  If d is a positive example
    Remove from G any hypothesis inconsistent with d
    For each hypothesis s in S that is not consistent with d
      Remove s from S
      Add to S all minimal generalizations h of s such that h is consistent with d and some member of G is more general than h
      Remove from S any hypothesis that is more general than another hypothesis in S
  If d is a negative example
    Remove from S any hypothesis inconsistent with d
    For each hypothesis g in G that is not consistent with d
      Remove g from G
      Add to G all minimal specializations h of g such that h is consistent with d and some member of S is more specific than h
      Remove from G any hypothesis that is less general than another hypothesis in G
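
The algorithm can be sketched in Python for conjunctive hypotheses over discrete attributes. This is one possible reading of the steps above, not a definitive implementation. All function names are made up for this sketch, and the attribute domains are assumed (Sky has three values, the rest two).

```python
ANY, NONE = "?", "Ø"
DOMAINS = [("Sunny", "Cloudy", "Rainy"), ("Warm", "Cold"), ("Normal", "High"),
           ("Strong", "Light"), ("Warm", "Cool"), ("Same", "Change")]
TRAINING = [  # (instance, is_positive)
    (("Sunny", "Warm", "Normal", "Strong", "Warm", "Same"), True),
    (("Sunny", "Warm", "High", "Strong", "Warm", "Same"), True),
    (("Rainy", "Cold", "High", "Strong", "Warm", "Change"), False),
    (("Sunny", "Warm", "High", "Strong", "Cool", "Change"), True),
]


def matches(h, x):
    return all(c == ANY or c == v for c, v in zip(h, x))


def more_general_or_equal(h1, h2):
    """True if h1 covers every instance that h2 covers."""
    if NONE in h2:
        return True  # h2 covers nothing
    return all(a == ANY or a == b for a, b in zip(h1, h2))


def min_generalization(s, x):
    """The single minimal generalization of s that covers x."""
    return tuple(v if c == NONE else (c if c == v else ANY)
                 for c, v in zip(s, x))


def min_specializations(g, x, domains):
    """Replace one ? in g with a value that differs from x, excluding x."""
    return [g[:i] + (v,) + g[i + 1:]
            for i, c in enumerate(g) if c == ANY
            for v in domains[i] if v != x[i]]


def candidate_elimination(examples, domains):
    S = {(NONE,) * len(domains)}
    G = {(ANY,) * len(domains)}
    for x, positive in examples:
        if positive:
            G = {g for g in G if matches(g, x)}
            new_S = set()
            for s in S:
                if matches(s, x):
                    new_S.add(s)
                    continue
                h = min_generalization(s, x)
                if any(more_general_or_equal(g, h) for g in G):
                    new_S.add(h)
            # Drop members that are more general than another member of S
            S = {s for s in new_S if not any(
                s != t and more_general_or_equal(s, t) for t in new_S)}
        else:
            S = {s for s in S if not matches(s, x)}
            new_G = set()
            for g in G:
                if not matches(g, x):
                    new_G.add(g)
                    continue
                for h in min_specializations(g, x, domains):
                    if any(more_general_or_equal(h, s) for s in S):
                        new_G.add(h)
            # Drop members that are less general than another member of G
            G = {g for g in new_G if not any(
                g != t and more_general_or_equal(t, g) for t in new_G)}
    return S, G


S, G = candidate_elimination(TRAINING, DOMAINS)
print("S:", S)
print("G:", G)
```

On the EnjoySport data the sketch ends with S = {<Sunny,Warm,?,Strong,?,?>} and G = {<Sunny,?,?,?,?,?>, <?,Warm,?,?,?,?>}. Feeding it two identical records with different classes leaves both sets empty.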

Example
Initialize
S0: {<Ø,Ø,Ø,Ø,Ø,Ø>}
G0: {<?,?,?,?,?,?>}

Example
First record (positive)
S1: {<Sunny,Warm,Normal,Strong,Warm,Same>}
G0 = G1: {<?,?,?,?,?,?>}

Example
Second record (positive)
S2: {<Sunny,Warm,?,Strong,Warm,Same>}
G0 = G1 = G2: {<?,?,?,?,?,?>}
Modify the previous S minimally to keep it consistent with d

Example
Third record (negative)
S2 = S3: {<Sunny,Warm,?,Strong,Warm,Same>}
G3: {<Sunny,?,?,?,?,?>, <?,Warm,?,?,?,?>, <?,?,?,?,?,Same>}
Replace {<?,?,?,?,?,?>} with its minimal specializations (expressions with a single specified attribute) that exclude d and are still more general than S3

Example
Fourth record (positive)
S4: {<Sunny,Warm,?,Strong,?,?>}
G4: {<Sunny,?,?,?,?,?>, <?,Warm,?,?,?,?>}
Back to a positive: in S, replace the Water value "Warm" and the Forecast value "Same" with "?", and remove <?,?,?,?,?,Same> from G
Then the interior expressions can be calculated:
<Sunny,?,?,Strong,?,?>   <Sunny,Warm,?,?,?,?>   <?,Warm,?,Strong,?,?>

What if
We have two identical records but different classes?
Sky    AirTemp  Humidity  Wind    Water  Forecast  EnjoySport
Sunny  Warm     Normal    Strong  Warm   Same      Yes
Sunny  Warm     Normal    Strong  Warm   Same      No
If the positive shows up first:
S1: {<Sunny,Warm,Normal,Strong,Warm,Same>}   (established by the first positive)
G0 = G1: {<?,?,?,?,?,?>}
The first step in evaluating the negative states "Remove from S any hypothesis that is not consistent with d" (S is now empty)
For each hypothesis g in G that is not consistent with d
  Remove g from G (all ?'s is inconsistent with the No, so G is empty)
  Add to G all minimal specializations h of g such that h is consistent with d and some member of S is more specific than h
  No matter what we add to G, it will violate either d or S (which remains empty)
Both are empty: broken. Known as converging to an empty version space

What if
We have two identical records but different classes?
Sky    AirTemp  Humidity  Wind    Water  Forecast  EnjoySport
Sunny  Warm     Normal    Strong  Warm   Same      No
Sunny  Warm     Normal    Strong  Warm   Same      Yes
If the negative shows up first:
S0 = S1: {<Ø,Ø,Ø,Ø,Ø,Ø>}
G1: {<Rainy,?,?,?,?,?>, <Cloudy,?,?,?,?,?>, <?,Cold,?,?,?,?>, <?,?,High,?,?,?>, <?,?,?,Light,?,?>, <?,?,?,?,Cool,?>, <?,?,?,?,?,Change>}   (established by the first negative)
The first step in evaluating the positive states "Remove from G any hypothesis that is not consistent with d"
This is all of them, leaving an empty set
For each hypothesis s in S that is not consistent with d
  Remove s from S
  Add to S all minimal generalizations h of s such that h is consistent with d and some member of G is more general than h
  No generalization qualifies: G is empty, so no member of G is more general than any candidate h
Again both S and G end up empty

Brittle
Bad with noisy data
Similar effect with false positives or false negatives

Will it converge?
Yes, provided
1. There are no errors in the training examples
2. There is some hypothesis in H that correctly describes the target concept
Condition 2 fails, for example, if the target concept is a disjunction (∨) of feature attributes and the hypothesis space supports only conjunctions

Classifying Never-Before-Seen Data
New instance:
Sky    AirTemp  Humidity  Wind   Water  Forecast  EnjoySport
Sunny  Warm     Normal    Light  Warm   Same      ?
Vote (all training samples had strong wind):
S4: {<Sunny,Warm,?,Strong,?,?>}   No
<Sunny,?,?,Strong,?,?> No   <Sunny,Warm,?,?,?,?> Yes   <?,Warm,?,Strong,?,?> No
G4: {<Sunny,?,?,?,?,?>, <?,Warm,?,?,?,?>}   Yes, Yes
Proportion can be a confidence metric
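
A small sketch of the vote in Python. It assumes the six-hypothesis version space from the EnjoySport example (S4, G4 and the three interior expressions). The names VERSION_SPACE, matches and vote are made up for this illustration.

```python
# Version space for the EnjoySport data: S4, the interior expressions, G4
VERSION_SPACE = [
    ("Sunny", "Warm", "?", "Strong", "?", "?"),
    ("Sunny", "?", "?", "Strong", "?", "?"),
    ("Sunny", "Warm", "?", "?", "?", "?"),
    ("?", "Warm", "?", "Strong", "?", "?"),
    ("Sunny", "?", "?", "?", "?", "?"),
    ("?", "Warm", "?", "?", "?", "?"),
]


def matches(h, x):
    """True when every constraint in h is ? or equals the attribute value."""
    return all(c == "?" or c == v for c, v in zip(h, x))


def vote(version_space, x):
    """Return (yes_votes, no_votes) cast by the hypotheses for instance x."""
    yes = sum(matches(h, x) for h in version_space)
    return yes, len(version_space) - yes


new_instance = ("Sunny", "Warm", "Normal", "Light", "Warm", "Same")
yes, no = vote(VERSION_SPACE, new_instance)
print(f"yes={yes} no={no} proportion positive={yes / (yes + no):.2f}")
```

For the light-wind instance the six hypotheses split evenly, so the proportion gives no confidence either way. An instance such as <Sunny,Warm,Normal,Strong,Cool,Change> gets a unanimous yes.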

A Unanimous Vote
Same confidence as if already converged to the single correct target concept
Regardless of which hypothesis in the version space is eventually found to be correct, it is one of the hypotheses in the current set, and all of them classify the test case as positive
A 100% positive vote is as good as a match against the most specific boundary S

Best for…
Discrete data
Binary classes

Now for…
Have seen 4 classifiers
• Naïve Bayesian
• KNN
• Decision Tree
• Candidate Elimination
Now for some theory

Have already…
• Curse of dimensionality
• Overfitting
• Lazy/Eager
• Radial basis
• Normalization
• Gradient descent
• Entropy/Information gain
• Occam's razor

Biased Hypothesis Space
Another way of measuring whether a hypothesis captures the learning concept
Candidate Elimination: conjunction of constraints on the attributes

Biased Hypothesis Space
In regression: biased toward linear solutions
Naïve Bayes: biased to a given distribution or bin selection
KNN: biased toward solutions that assume cohabitation of similarly classed instances
Decision Tree: biased toward short trees

Unbiased learner?
Must be able to accommodate every distinct subset as a class definition
96 distinct instances (3*2*2*2*2*2)
Sky has three possible answers; the rest have two
Number of distinct subsets: 2^96
Think binary: 1 indicates membership

Search Space
Number of hypotheses that can be represented: 5,120 (5*4*4*4*4*4)
But a single Ø represents an empty set
So semantically distinct hypotheses: 973 = 1+(4*3*3*3*3*3)
Each hypothesis represents a subset (due to wild cards)
S0: {<Ø,Ø,Ø,Ø,Ø,Ø>}
G0: {<?,?,?,?,?,?>}
• Candidate elimination can represent 973 different subsets
• But 2^96 is the number of distinct subsets
• Very biased
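
The counts can be checked with a few lines of Python. The list VALUES_PER_ATTRIBUTE encodes the assumption stated on the slides (Sky has three values, the other five attributes two); the variable names are made up for this sketch.

```python
from math import prod

# Number of values per attribute: Sky has 3, the other five have 2
VALUES_PER_ATTRIBUTE = [3, 2, 2, 2, 2, 2]

# Every combination of attribute values is one distinct instance
instances = prod(VALUES_PER_ATTRIBUTE)

# Syntactically, each slot may also hold ? or Ø
syntactic = prod(v + 2 for v in VALUES_PER_ATTRIBUTE)

# Any hypothesis containing Ø denotes the empty set, so count it once
semantic = 1 + prod(v + 1 for v in VALUES_PER_ATTRIBUTE)

# An unbiased learner must be able to represent every subset of instances
subsets = 2 ** instances

print(instances, syntactic, semantic)
print(f"fraction of subsets expressible: {semantic / subsets:.3e}")
```

The last line makes the bias concrete: the 973 conjunctive hypotheses cover only a vanishing fraction of the 2^96 possible class definitions.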

Bias
I think of bias as inflexibility in expressing hypotheses
Or, alternatively, as the implicit assumptions of the approach
[Figure: Bias, viewed as Inflexibility or as Implicit Assumptions]

Next term: inductive inference
The process by which a conclusion is inferred from multiple observations
What we've been doing:
TRAINING DATA → CLASSIFIER → MAKE PREDICTION ON NEW DATA

The Hypothesis
Inductive learning hypothesis
Any hypothesis found to approximate the target function well over a sufficiently large set of training examples will also approximate the target function well over other unobserved examples

Next Term
Concept learning
Automatically inferring the general definition of some concept, given examples labeled as members or nonmembers of the concept
Roughly equate "Concept" to "Class"

Hypotheses
H is the set of all possible hypotheses that the learner may consider; it is determined by the choice of hypothesis representation.
In general, each hypothesis h in H represents a boolean-valued function defined over the instance space X; that is, h: X → {0,1}. Note that this is for a two-class system.
The goal of the learner is to find a hypothesis h such that h(x)=c(x) for all x in X, where c is the target concept.

Target Concept
In regression: the various "y" values of the training instances (function approximation)
Naïve Bayes, KNN, and Decision Tree: the class

Hypotheses
In regression: the line; the coefficients (or other equation members such as exponents)
Naïve Bayes: the class of an instance is predicted by determining the most probable class given the training data. That is, by finding the probability for each class for each dimension, multiplying these probabilities (across the dimensions for each class), and taking the class with the maximum probability as the predicted class
KNN: the class of an instance is predicted by examining the instance's neighborhood
Decision Tree: the tree itself
Candidate Elimination: conjunction of constraints on the attributes

Something Else We've Been Doing
Supervised Learning
Supervision from an oracle that knows the classes of the training data
Is there unsupervised learning? Yes, covered in pattern rec
Seeks to determine how the data are organized
• Clustering
• PCA
• Edge detection

Finally
Definition of Machine Learning
Machine learning addresses the question of how to build computer programs that improve their performance at some task through experience.

Learning Checkers
All about representation
Our representation
End game is to develop a function that returns the best next move

chooseNextMove
1. Look at every legal move
2. Determine the goodness (score) of the resultant board state
3. Return the move with the highest score (argmax)

How to Assess a Board State
Score function; we will keep it simple
Work with a polynomial with just a few variables
X1: the number of black pieces on the board
X2: the number of red pieces on the board
X3: the number of black kings on the board
X4: the number of red kings on the board
X5: the number of black pieces threatened by red
X6: the number of red pieces threatened by black

Score(b)
\[ \mathit{Score}(b) = w_0 + w_1 x_1 + w_2 x_2 + w_3 x_3 + w_4 x_4 + w_5 x_5 + w_6 x_6 \]
where x1 through x6 are the six board counts (pieces, kings, and threatened pieces for black and for red)
Gotta learn them weights
But how?
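
A sketch of the score function and of chooseNextMove in Python. Board generation is left out: each candidate move is assumed to arrive with the six feature counts of its resulting board already computed. The weights and feature vectors below are made-up numbers for illustration only, as are the names score and choose_next_move.

```python
def score(weights, features):
    """Linear score: w0 + w1*x1 + ... + w6*x6.

    weights is [w0, ..., w6]; features is [x1, ..., x6].
    """
    return weights[0] + sum(w * x for w, x in zip(weights[1:], features))


def choose_next_move(weights, successors):
    """Return the move whose resulting board scores highest (argmax).

    successors maps a move label to the feature vector of the board
    that the move produces (hypothetical representation).
    """
    return max(successors, key=lambda move: score(weights, successors[move]))


# Hypothetical weights, seen from black's side
W = [0.0, 1.0, -1.0, 3.0, -3.0, -0.5, 0.5]

# Hypothetical successor boards: [x1, x2, x3, x4, x5, x6]
SUCCESSORS = {
    "a": [12, 11, 0, 0, 1, 0],  # captured a red piece, one black piece threatened
    "b": [12, 12, 0, 0, 0, 2],  # no capture, two red pieces threatened
}

best = choose_next_move(W, SUCCESSORS)
print(best, score(W, SUCCESSORS[best]))
```

With these made-up weights, move "a" scores 0.5 and move "b" scores 1.0, so the argmax picks "b".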

Training
Train the scoring function
A bunch of board states (a series of games)
Use them to jiggle the weights
Must know the current real "score" vs. the "predicted score" using the polynomial

A trick
If my predictor is good, then it will be self-consistent
That is, the score of my best move should lead to a good scoring board state
If it doesn't, maybe we should adjust our predictor
PRECOGNITION

ScoreBasedUponSuccessor
\[ \mathit{ScoreBasedUponSuccessor}(b) = \mathit{Score}(\mathit{successor}(b)) \]
successor returns the board state of the best move (returned by chooseNextMove(b))
It has been found to be surprisingly successful

Learning
For each training sample (board states from a series of games)
If the game is over, could give some fixed score (100 if win, that is, zero opponent pieces on the board; -100 if lose)
\[ w_i \leftarrow w_i + \eta \, (\mathit{ScoreBasedUponSuccessor}(b) - \mathit{Score}(b)) \, x_i \]
Look familiar? LMS (least mean squares) weight update rule
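
One LMS step can be sketched as follows. The target value stands for ScoreBasedUponSuccessor(b), or for the fixed end-of-game score. Treating w0 as a weight on a constant feature x0 = 1 is an assumption of this sketch, as are the learning rate and the example numbers.

```python
def score(weights, features):
    """Linear score: w0 + w1*x1 + ... + w6*x6."""
    return weights[0] + sum(w * x for w, x in zip(weights[1:], features))


def lms_update(weights, features, target, eta=0.01):
    """Return new weights after one LMS step toward the target score.

    Applies w_i <- w_i + eta * (target - Score(b)) * x_i.
    Assumption: w0 is updated as if it had a constant feature x0 = 1.
    """
    error = target - score(weights, features)
    new_weights = [weights[0] + eta * error]
    new_weights += [w + eta * error * x for w, x in zip(weights[1:], features)]
    return new_weights


# Hypothetical example: untrained weights, a board with x1=2 and x2=1,
# and a target of 100 (for instance, a board that led to a win)
weights = [0.0] * 7
features = [2, 1, 0, 0, 0, 0]
updated = lms_update(weights, features, target=100)

print(updated)
print("prediction before:", score(weights, features))
print("prediction after: ", score(updated, features))
```

Each step moves the prediction for that board toward its target. Weights whose feature is zero are left unchanged, since the error is scaled by x_i.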

Is this a classifier?
Is it machine learning?

Makes a big deal…
At the beginning of candidate elimination (pg 29): the difference between "satisfies" and "consistent with"
x satisfies h when h(x)=1, regardless of whether x is a positive or negative example
x is consistent with h depending on the target concept: whether h(x)=c(x)
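
The distinction can be made concrete with a short Python sketch. The function names h_of, satisfies and consistent are made up here, and the two records are taken from the EnjoySport table.

```python
def h_of(h, x):
    """h(x): True when every constraint in h is ? or equals the value in x."""
    return all(c == "?" or c == v for c, v in zip(h, x))


def satisfies(x, h):
    """x satisfies h when h(x) = 1; the label of x plays no role."""
    return h_of(h, x)


def consistent(h, x, c_x):
    """h is consistent with the example <x, c(x)> when h(x) = c(x)."""
    return h_of(h, x) == c_x


h = ("Sunny", "?", "?", "?", "?", "?")
positive = ("Sunny", "Warm", "Normal", "Strong", "Warm", "Same")   # c(x) = Yes
negative = ("Rainy", "Cold", "High", "Strong", "Warm", "Change")   # c(x) = No

print(satisfies(positive, h), consistent(h, positive, True))
print(satisfies(negative, h), consistent(h, negative, False))
```

The negative record does not satisfy h, yet h is consistent with it, because h correctly predicts No for that record.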