CS 782 Machine Learning
3. Inductive Learning from Examples: Version Space Learning
Prof. Gheorghe Tecuci
Learning Agents Laboratory, Computer Science Department, George Mason University

Transcript
Page 1: Learning Agents Laboratory Computer Science Department George Mason University

© 2003, G. Tecuci, Learning Agents Laboratory

Learning Agents Laboratory, Computer Science Department
George Mason University

Prof. Gheorghe Tecuci

3. Inductive Learning from Examples: Version Space Learning

Page 2

Overview

Instances, concepts and generalization

Concept learning from examples

Version spaces and the candidate elimination algorithm

The LEX system

The learning bias

Discussion

Recommended reading

Page 3

Basic ontological elements: instances and concepts

An instance is a representation of a particular entity from the application domain.

A concept is a representation of a set of instances.

Example: government_of_US_1943 and government_of_Britain_1943 are related by instance_of to state_government.

“instance_of” is the relationship between an instance and the concept to which it belongs.

“state_government” represents the set of all entities that are governments of states. This set includes “government_of_US_1943” and “government_of_Britain_1943”, which are called positive examples.

An entity which is not an instance of a concept is called a negative example of that concept.

Page 4

Concept generality

A concept P is more general than another concept Q if and only if the set of instances represented by P includes the set of instances represented by Q.

Example: state_government is more general than democratic_government and totalitarian_government; democratic_government is in turn more general than representative_democracy and parliamentary_democracy.

“subconcept_of” is the relationship between a concept and a more general concept (e.g., democratic_government subconcept_of state_government).

Page 5

A generalization hierarchy

[Figure: a generalization hierarchy rooted at governing_body, with subconcepts state_government (totalitarian_government, democratic_government, theocratic_government, monarchy, feudal_god_king_government, other_state_government), group_governing_body (dictator, deity_figure, chief_and_tribal_council, autocratic_leader, democratic_council_or_board, other_group_governing_body), and other_type_of_governing_body; further subconcepts include fascist_state, communist_dictatorship, police_state, military_dictatorship, religious_dictatorship, representative_democracy, parliamentary_democracy, theocratic_democracy, ad_hoc_governing_body, and established_governing_body; instances include government_of_US_1943, government_of_Britain_1943, government_of_Germany_1943, government_of_Italy_1943, and government_of_USSR_1943.]

Page 6

Overview

Instances, concepts and generalization

Concept learning from examples

Version spaces and the candidate elimination algorithm

The LEX system

The learning bias

Discussion

Recommended reading

Page 7

Empirical inductive concept learning from examples

Illustration

Given
Positive examples of cups: P1, P2, ...
Negative examples of cups: N1, ...

Learn
A description of the cup concept: has-handle(x), ...

Approach: compare the positive and the negative examples of a concept, in terms of their similarities and differences, and learn the concept as a generalized description of the similarities of the positive examples.

Why is concept learning important?

Concept learning allows the agent to recognize other entities as being instances of the learned concept.

Page 8

The learning problem

Given
• a language of instances;
• a language of generalizations;
• a set of positive examples (E1, ..., En) of a concept;
• a set of negative examples (C1, ..., Cm) of the same concept;
• a learning bias;
• other background knowledge.

Determine
• a concept description which is a generalization of the positive examples and does not cover any of the negative examples.

Purpose of concept learning: predict whether an instance is an example of the learned concept.

Page 9

Generalization and specialization rules

A generalization rule is a rule that transforms an expression into a more general expression.

A specialization rule is a rule that transforms an expression into a less general expression.

The reverse of any generalization rule is a specialization rule.

Learning a concept from examples is based on generalization and specialization rules.

Page 10

Discussion

Indicate several generalizations of the following sentence:
Students who have lived in Fairfax for more than 3 years.

Indicate several specializations of the following sentence:
Students who have lived in Fairfax for more than 3 years.

Page 11

Generalization (and specialization) rules

Climbing the generalization hierarchy

Dropping condition

Generalizing numbers

Adding alternatives

Turning constants into variables

Page 12

Turning constants into variables

Generalizes an expression by replacing a constant with a variable.

?O1 is multi_group_force
  number_of_subgroups 5
The set of multi_group_forces with 5 subgroups.

    generalization: 5 → ?N1        specialization: ?N1 → 5

?O1 is multi_group_force
  number_of_subgroups ?N1
The set of multi_group_forces with any number of subgroups (e.g., Allied_forces_operation_Husky, Axis_forces_Sicily, Japan_1944_Armed_Forces).
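This rule can be sketched in a few lines of Python (an illustrative reconstruction, not from the slides; the list-of-facts encoding of an expression is an assumption):

```python
def turn_constant_into_variable(facts, constant, variable):
    """Generalize a list of (attribute, value) facts by replacing every
    occurrence of a specific constant with a variable symbol."""
    return [(attr, variable if val == constant else val)
            for attr, val in facts]

# The slide's expression, generalized from 5 subgroups to any number:
concept = [("is", "multi_group_force"), ("number_of_subgroups", 5)]
generalized = turn_constant_into_variable(concept, 5, "?N1")
```

The reverse substitution (binding ?N1 back to 5) is the corresponding specialization rule.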

Page 13

Climbing the generalization hierarchies

Generalizes an expression by replacing a concept with a more general one (in the hierarchy, democratic_government is the parent of representative_democracy and parliamentary_democracy).

?O1 is single_state_force
  has_as_governing_body ?O2
?O2 is representative_democracy
The set of single state forces governed by representative democracies.

    generalization: representative_democracy → democratic_government
    specialization: democratic_government → representative_democracy

?O1 is single_state_force
  has_as_governing_body ?O2
?O2 is democratic_government
The set of single state forces governed by democracies.
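The minimal climbing step amounts to finding the closest common ancestor of two concepts in the hierarchy. A small Python sketch (not from the slides; the PARENT table is a hypothetical fragment of the deck's hierarchy):

```python
# A fragment of a generalization hierarchy, as child -> parent links.
PARENT = {
    "representative_democracy": "democratic_government",
    "parliamentary_democracy": "democratic_government",
    "democratic_government": "state_government",
    "totalitarian_government": "state_government",
}

def ancestors(concept):
    """The concept itself, followed by its chain of generalizations."""
    chain = [concept]
    while concept in PARENT:
        concept = PARENT[concept]
        chain.append(concept)
    return chain

def climb_generalize(c1, c2):
    """Minimal generalization of two concepts reached by climbing:
    their closest common ancestor in the hierarchy."""
    up = set(ancestors(c2))
    for a in ancestors(c1):
        if a in up:
            return a
    return None
```

For example, climbing from representative_democracy and parliamentary_democracy meets at democratic_government, the minimal generalization covering both.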

Page 14

Dropping conditions

Generalizes an expression by removing a constraint from its description.

?O1 is multi_member_force
  has_international_legitimacy “yes”
The set of multi-member forces that have international legitimacy.

    generalization: drop the constraint        specialization: restore the constraint

?O1 is multi_member_force
The set of multi-member forces (that may or may not have international legitimacy).
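As a one-function Python sketch (an illustration, not from the slides; the dict encoding of a concept description is an assumption):

```python
def drop_condition(concept, attribute):
    """Generalize a concept description by deleting one constraint,
    leaving the original description untouched."""
    return {attr: val for attr, val in concept.items() if attr != attribute}

concept = {"is": "multi_member_force", "has_international_legitimacy": "yes"}
generalized = drop_condition(concept, "has_international_legitimacy")
```

Every instance covered by the original description is still covered by the generalized one, so dropping a condition can only enlarge the represented set.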

Page 15

Extending intervals

Generalizes an expression by replacing a number with an interval, or by replacing an interval with a larger interval.

?O1 is multi_group_force
  number_of_subgroups 5
The set of multi_group_forces with exactly 5 subgroups.

    generalization: 5 → [3 .. 7]        specialization: [3 .. 7] → 5

?O1 is multi_group_force
  number_of_subgroups ?N1
?N1 is-in [3 .. 7]
The set of multi_group_forces with at least 3 and at most 7 subgroups.

    generalization: [3 .. 7] → [2 .. 10]        specialization: [2 .. 10] → [3 .. 7]

?O1 is multi_group_force
  number_of_subgroups ?N1
?N1 is-in [2 .. 10]
The set of multi_group_forces with at least 2 and at most 10 subgroups.
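The least such extension is easy to compute. A minimal Python sketch (not from the slides; a single number n is treated as the degenerate interval [n, n], and note the slide's jump from [3 .. 7] to [2 .. 10] is one of many admissible, larger generalizations, while this returns the smallest one):

```python
def extend_interval(interval, value):
    """Minimally extend a closed interval so that it also covers value."""
    lo, hi = interval
    return (min(lo, value), max(hi, value))
```

For example, extending [3 .. 7] to cover 10 yields [3 .. 10]; extending it to cover 5 changes nothing.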

Page 16

Adding alternatives

Generalizes an expression by replacing a concept C1 with the union (C1 U C2), which is a more general concept.

?O1 is alliance
  has_as_member ?O2
The set of alliances.

    generalization: alliance → alliance OR coalition
    specialization: alliance OR coalition → alliance

?O1 is alliance OR coalition
  has_as_member ?O2
The set including both the alliances and the coalitions.

Page 17

Generalization and specialization rules

Generalization rules:
Turning constants into variables
Climbing the generalization hierarchies
Dropping conditions
Extending intervals
Adding alternatives

Corresponding specialization rules:
Turning variables into constants
Descending the generalization hierarchies
Adding conditions
Reducing intervals
Dropping alternatives

Page 18

Types of generalizations and specializations

Operational definition of generalization/specialization

Generalization/specialization of two concepts

Minimally general generalization of two concepts

Least general generalization of two concepts

Maximally general specialization of two concepts

Page 19

Operational definition of generalization

Non-operational definition:
A concept P is said to be more general than another concept Q if and only if the set of instances represented by P includes the set of instances represented by Q.

Why isn’t this an operational definition?
This definition is not operational because it requires showing that each instance I from a potentially infinite set Q is also in the set P.

Operational definition:
A concept P is said to be more general than another concept Q if and only if Q can be transformed into P by applying a sequence of generalization rules.

Page 20

Generalization of two concepts

Definition:
The concept Cg is a generalization of the concepts C1 and C2 if and only if Cg is more general than C1 and Cg is more general than C2.

Is the above definition operational? How would you define this operationally?

Operational definition:
The concept Cg is a generalization of the concepts C1 and C2 if and only if both C1 and C2 can be transformed into Cg by applying generalization rules (assuming the existence of a complete set of rules).

Example: MANEUVER-UNIT is a generalization of ARMORED-UNIT and INFANTRY-UNIT.

Page 21

Generalization of two concepts: example

C1: ?O1 IS COURSE-OF-ACTION
      TOTAL-NUMBER-OF-OFFENSIVE-ACTIONS 10
      TYPE OFFENSIVE

C2: ?O1 IS COURSE-OF-ACTION
      TOTAL-NUMBER-OF-OFFENSIVE-ACTIONS 5

C:  ?O1 IS COURSE-OF-ACTION
      TOTAL-NUMBER-OF-OFFENSIVE-ACTIONS ?N1
    ?N1 IS-IN [5 .. 10]

From C1: generalize 10 to [5 .. 10] and drop “?O1 TYPE OFFENSIVE”.
From C2: generalize 5 to [5 .. 10].

Remark: COA = Course of Action.

Page 22

Specialization of two concepts

Definition:
The concept Cs is a specialization of the concepts C1 and C2 if and only if Cs is less general than C1 and Cs is less general than C2.

Operational definition:
The concept Cs is a specialization of the concepts C1 and C2 if and only if both C1 and C2 can be transformed into Cs by applying specialization rules (or Cs can be transformed into both C1 and C2 by applying generalization rules). This assumes a complete set of rules.

Example: PENETRATE-MILITARY-TASK is a specialization of MILITARY-MANEUVER and MILITARY-ATTACK.

Page 23

Other useful definitions

Minimally general generalization: the concept G is a minimally general generalization of A and B if and only if G is a generalization of A and B, and G is not more general than any other generalization of A and B.

Least general generalization: if there is only one minimally general generalization of two concepts A and B, then this generalization is called the least general generalization of A and B.

Maximally general specialization: the concept C is a maximally general specialization of two concepts A and B if and only if C is a specialization of A and B and no other specialization of A and B is more general than C.

(Also of interest: the specialization of a concept with a negative example.)

Page 24

Concept learning: another illustration

Positive examples:
Allied_Forces_1943 is equal_partner_multi_state_alliance
  has_as_member US_1943

Negative examples:
European_Axis_1943 is dominant_partner_multi_state_alliance
  has_as_member Germany_1943
Somali_clans_1992 is equal_partner_multi_group_coalition
  has_as_member Isasq_somali_clan_1992

Concept learned by a cautious learner:
?O1 is multi_state_alliance
  has_as_member ?O2
?O2 is single_state_force
A multi-state alliance that has as member a single state force.

Page 25

Discussion

What could be said about the predictions of a cautious learner?

[Figure: the concept learned by a cautious learner is contained within the concept to be learned.]

Page 26

Concept learning: yet another illustration

Positive examples:
Allied_Forces_1943 is equal_partner_multi_state_alliance
  has_as_member US_1943

Negative examples:
European_Axis_1943 is dominant_partner_multi_state_alliance
  has_as_member Germany_1943
Somali_clans_1992 is equal_partner_multi_group_coalition
  has_as_member Isasq_somali_clan_1992

Concept learned by an aggressive learner:
?O1 is multi_member_force
  has_as_member ?O2
?O2 is single_state_force
A multi-member force that has as member a single state force.

Page 27

Discussion

What could be said about the predictions of an aggressive learner?

[Figure: the concept learned by an aggressive learner contains the concept to be learned.]

Page 28

Discussion

How could one synergistically integrate a cautious learner with an aggressive learner, taking advantage of their qualities to compensate for each other’s weaknesses?

[Figure: the concept to be learned contains the concept learned by a cautious learner and is contained in the concept learned by an aggressive learner.]

Page 29

Overview

Instances, concepts and generalization

Concept learning from examples

Version spaces and the candidate elimination algorithm

The LEX system

The learning bias

Discussion

Recommended reading

Page 30

Basic idea of version space concept learning

Consider the examples E1, ..., En in sequence.

1. Initialize the lower bound to the first positive example (LB = E1) and the upper bound (UB) to the most general generalization of E1.

2. If the next example is a positive one, then generalize LB as little as possible to cover it.

3. If the next example is a negative one, then specialize UB as little as possible to uncover it and to remain more general than LB.

4. Repeat steps 2 and 3 with the rest of the examples until UB = LB. This is the learned concept.

Page 31

The candidate elimination algorithm (Mitchell, 1978)

Let us suppose that we have an example e1 of a concept to be learned. Then any sentence of the representation language which is more general than this example is a plausible hypothesis for the concept.

The version space is:
H = { h | h is more general than e1 }

[Figure: the version space of hypotheses, bounded below by S: {e1} and above by the most general generalization G: {g}.]

Page 32

The candidate elimination algorithm (cont.)

As new examples and counterexamples are presented to the program, candidate concepts are eliminated from H.

This is practically done by updating the set G (which is the set of the most general elements in H) and the set S (which is the set of the most specific elements in H).

[Figure: the version space between the more specific lower bound LB (S) and the more general upper bound UB (G).]

Page 33

The candidate elimination algorithm

1. Initialize S to the first positive example and G to its most general generalization.

2. Accept a new training instance I.
• If I is a positive example then:
  - remove from G all the concepts that do not cover I;
  - generalize the elements in S as little as possible to cover I but remain less general than some concept in G;
  - keep in S the minimally general concepts.
• If I is a negative example then:
  - remove from S all the concepts that cover I;
  - specialize the elements in G as little as possible to uncover I and be more general than at least one element from S;
  - keep in G the maximally general concepts.

3. Repeat step 2 until G = S and they contain a single concept C (this is the learned concept).

Page 34

Illustration of the candidate elimination algorithm

Language of instances: (shape, size)
  shape: {ball, brick, cube}
  size: {large, small}

Language of generalizations: (shape, size)
  shape: {ball, brick, cube, any-shape}
  size: {large, small, any-size}

Input examples:
  shape  size   class
  ball   large  +
  brick  small  –
  cube   large  –
  ball   small  +

Learning process:
1. +(ball, large):  S = {(ball, large)}
                    G = {(any-shape, any-size)}
2. –(brick, small): G = {(ball, any-size), (any-shape, large)}
3. –(cube, large):  G = {(ball, any-size)}
4. +(ball, small):  S = {(ball, any-size)} = G
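The trace above can be reproduced with a short Python sketch of the candidate elimination algorithm for this flat attribute-tuple language (an illustrative reconstruction, not from the slides; the helper names and the assumption that the first example is positive are mine):

```python
ANY = "any"

def covers(h, x):
    """True if hypothesis h covers instance x (attribute match or ANY)."""
    return all(hv == ANY or hv == xv for hv, xv in zip(h, x))

def more_general(h1, h2):
    """True if h1 covers at least everything h2 covers."""
    return all(a == ANY or a == b for a, b in zip(h1, h2))

def min_generalize(s, x):
    """Minimal generalization of hypothesis s that covers instance x."""
    return tuple(sv if sv == xv else ANY for sv, xv in zip(s, x))

def min_specializations(g, x, domains):
    """Minimal specializations of g that no longer cover the negative x."""
    out = []
    for i, gv in enumerate(g):
        if gv == ANY:
            for v in domains[i]:
                if v != x[i]:
                    out.append(g[:i] + (v,) + g[i + 1:])
    return out

def candidate_elimination(examples, domains):
    """Track the S and G boundary sets (examples[0] is assumed positive)."""
    x0, _ = examples[0]
    S = {x0}                              # most specific boundary
    G = {tuple(ANY for _ in x0)}          # most general boundary
    for x, positive in examples[1:]:
        if positive:
            G = {g for g in G if covers(g, x)}
            S = {min_generalize(s, x) for s in S}
            S = {s for s in S if any(more_general(g, s) for g in G)}
        else:
            S = {s for s in S if not covers(s, x)}
            new_G = set()
            for g in G:
                if not covers(g, x):
                    new_G.add(g)
                    continue
                for g2 in min_specializations(g, x, domains):
                    if any(more_general(g2, s) for s in S):
                        new_G.add(g2)
            G = {g for g in new_G             # keep maximally general only
                 if not any(h != g and more_general(h, g) for h in new_G)}
    return S, G

# The (shape, size) example from this slide:
domains = (("ball", "brick", "cube"), ("large", "small"))
examples = [
    (("ball", "large"), True),
    (("brick", "small"), False),
    (("cube", "large"), False),
    (("ball", "small"), True),
]
S, G = candidate_elimination(examples, domains)
```

On this data S and G converge to the single concept (ball, any-size), matching step 4 of the trace.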

Page 35

Overview

Instances, concepts and generalization

Concept learning from examples

Version spaces and the candidate elimination algorithm

The LEX system

The learning bias

Discussion

Recommended reading

Page 36

The LEX system

LEX is a system that uses the version space method to learn heuristics for suggesting when the integration operators should be applied for solving symbolic integration problems.

The problem of learning control heuristics

Given
Operators for symbolic integration:
OP1: ∫ r f(x) dx --> r ∫ f(x) dx
OP2: ∫ u dv --> uv - ∫ v du, where u = f1(x) and dv = f2(x) dx
OP3: 1 · f(x) --> f(x)
OP4: ∫ (f1(x) + f2(x)) dx --> ∫ f1(x) dx + ∫ f2(x) dx
OP5: ∫ sin(x) dx --> -cos(x) + C
OP6: ∫ cos(x) dx --> sin(x) + C

Find
Heuristics for applying the operators, as, for instance, the following one:
To solve ∫ r x transc(x) dx, apply OP2 with u = r x and dv = transc(x) dx.

Page 37

Remarks

The integration operators assure a satisfactory level of competence to the LEX system. That is, LEX is able in principle to solve a significant class of symbolic integration problems. However, in practice, it may not be able to solve many of these problems because this would require too many resources of time and space.

The description of an operator shows when the operator is applicable, while a heuristic associated with an operator shows when the operator should be applied in order to solve a problem.

LEX tries to discover, for each operator OPi, the definition of the concept: situations in which OPi should be used.

Page 38

The architecture of LEX

LEX consists of four modules working in a loop: the PROBLEM GENERATOR, the PROBLEM SOLVER, the CRITIC, and the LEARNER.

Version space of a proposed heuristic:
G: ∫ f1(x) f2(x) dx --> Apply OP2 with u = f1(x), dv = f2(x) dx
S: ∫ 3x cos(x) dx --> Apply OP2 with u = 3x, dv = cos(x) dx

One of the suggested positive training instances:
∫ 3x cos(x) dx --> Apply OP2 with u = 3x, dv = cos(x) dx

[Figure: a solution trace ∫ 3x cos(x) dx --(OP2 with u = 3x, dv = cos(x) dx)--> 3x sin(x) - ∫ 3 sin(x) dx --(OP1)--> 3x sin(x) - 3 ∫ sin(x) dx --(OP5)--> 3x sin(x) + 3cos(x) + C.]

1. What search strategy to use for problem solving?
2. How to characterize individual problem solving steps?
3. How to learn from these steps? How is the initial VS defined?
4. How to generate a new problem?

Page 39

Generalization hierarchy for functions

[Figure: a generalization hierarchy for functions rooted at f, with branches prim, op (+, -, *, /, ^), and transc; transc splits into trig (sin, cos, tan) and explog (ln, exp); prim includes poly and monom, which generalize terms such as r x^n, k x^n, r x, k x, and 3 x.]

Page 40

Illustration of the learning process

Continue learning the heuristic for applying OP2:

The problem generator generates a new problem to solve that is useful for learning.

The problem solver solves this problem.

The critic extracts positive and negative examples from the problem solving tree.

The learner refines the version space of the heuristic.

Page 41

Overview

Instances, concepts and generalization

Concept learning from examples

Version spaces and the candidate elimination algorithm

The LEX system

The learning bias

Discussion

Recommended reading

Page 42

The learning bias

A bias is any basis for choosing one generalization over another, other than strict consistency with the observed training examples.

Types of bias:
- restricted hypothesis space bias;
- preference bias.

Page 43

Restricted hypothesis space bias

The hypothesis space H (i.e. the space containing all the possible concept descriptions) is defined by the generalization language. This language may not be capable of expressing all possible classes of instances. Consequently, the hypothesis space in which the concept description is searched is restricted.

Some of the restricted spaces investigated:
- logical conjunctions (i.e. the learning system will look for a concept description in the form of a conjunction);
- linear threshold functions (for exemplar-based representations);
- three-layer neural networks with a fixed number of hidden units.

Page 44

Restricted hypothesis space bias: example

The language of instances consists of triples of bits, as, for example: (0, 1, 1), (1, 0, 1).

How many concepts are in this space?
The total number of subsets of instances is 2^8 = 256.

The language of generalizations consists of triples of 0, 1, and *, where * means any bit, for example: (0, *, 1), (*, 0, 1).

How many concepts could be represented in this language?
This hypothesis space consists of 3×3×3 = 27 elements.
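Both counts can be verified by enumeration. A short Python sketch (an illustration, not from the slides) that builds every hypothesis, computes its extension (the set of instances it covers), and confirms that only 27 of the 256 possible concepts are expressible in the restricted language:

```python
from itertools import product

# Instances: bit triples. Hypotheses: triples over {0, 1, *}.
instances = list(product("01", repeat=3))
hypotheses = list(product("01*", repeat=3))

def extension(h):
    """The set of instances covered by hypothesis h."""
    return frozenset(x for x in instances
                     if all(hv == "*" or hv == xv for hv, xv in zip(h, x)))

total_concepts = 2 ** len(instances)            # all subsets: 2^8 = 256
expressible = {extension(h) for h in hypotheses} # distinct coverable sets
```

Each of the 27 patterns covers a distinct subset, so the bias restricts learning to 27 of the 256 possible concepts.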

Page 45

Preference bias

A preference bias places a preference ordering over the hypotheses in the hypothesis space H. The learning algorithm can then choose the most preferred hypothesis f in H that is consistent with the training examples, and produce this hypothesis as its output.

Most preference biases attempt to minimize some measure of syntactic complexity of the hypothesis representation (e.g. shortest logical expression, smallest decision tree).

These are variants of Occam's Razor, which is the bias first defined by William of Occam (1300-1349): given two explanations of the data, all other things being equal, the simpler explanation is preferable.

Page 46

Preference bias: representation

How could the preference bias be represented?

In general, the preference bias may be implemented as an order relationship 'better(f1, f2)' over the hypothesis space H. Then the system will choose the "best" hypothesis f, according to the "better" relationship.

An example of such a relationship: "less-general-than", which produces the least general expression consistent with the data.

Page 47

Overview

Instances, concepts and generalization

Concept learning from examples

Version spaces and the candidate elimination algorithm

The LEX system

The learning bias

Discussion

Recommended reading

Page 48

Problem

Language of instances: an instance is defined by a triplet of the form (specific-color, specific-shape, specific-size).

Language of generalizations: (color-concept, shape-concept, size-concept)

Background knowledge (generalization hierarchies):
any-color: warm-color {red, yellow, orange}, cold-color {black, blue, green}
any-shape: polygon {triangle, rectangle, square}, round {circle, ellipse}
any-size: {large, small}

Set of examples:
  color   shape      size   class
  orange  square     large  +  i1
  blue    ellipse    small  -  i2
  red     triangle   small  +  i3
  green   rectangle  small  -  i4
  yellow  circle     large  +  i5

Task: apply the candidate elimination algorithm to learn the concept represented by the above examples.

Page 49

Solution:

+i1: (color = orange) & (shape = square) & (size = large)
S: {[(color = orange) & (shape = square) & (size = large)]}
G: {[(color = any-color) & (shape = any-shape) & (size = any-size)]}

-i2: (color = blue) & (shape = ellipse) & (size = small)
S: {[(color = orange) & (shape = square) & (size = large)]}
G: {[(color = warm-color) & (shape = any-shape) & (size = any-size)],
    [(color = any-color) & (shape = polygon) & (size = any-size)],
    [(color = any-color) & (shape = any-shape) & (size = large)]}

+i3: (color = red) & (shape = triangle) & (size = small)
S: {[(color = warm-color) & (shape = polygon) & (size = any-size)]}
G: {[(color = warm-color) & (shape = any-shape) & (size = any-size)],
    [(color = any-color) & (shape = polygon) & (size = any-size)]}

-i4: (color = green) & (shape = rectangle) & (size = small)
S: {[(color = warm-color) & (shape = polygon) & (size = any-size)]}
G: {[(color = warm-color) & (shape = any-shape) & (size = any-size)]}

+i5: (color = yellow) & (shape = circle) & (size = large)
S: {[(color = warm-color) & (shape = any-shape) & (size = any-size)]}
G: {[(color = warm-color) & (shape = any-shape) & (size = any-size)]}

The learned concept is:
(color = warm-color) & (shape = any-shape) & (size = any-size)  ; a warm-color object
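The S-boundary updates in this solution can be checked with a small Python sketch (an illustrative reconstruction, not from the slides): a cautious, S-only learner that generalizes the current hypothesis with each positive example via the minimal common generalization in the background hierarchies.

```python
# Background knowledge, as child -> parent links in the hierarchies.
PARENT = {
    "red": "warm-color", "yellow": "warm-color", "orange": "warm-color",
    "black": "cold-color", "blue": "cold-color", "green": "cold-color",
    "warm-color": "any-color", "cold-color": "any-color",
    "triangle": "polygon", "rectangle": "polygon", "square": "polygon",
    "circle": "round", "ellipse": "round",
    "polygon": "any-shape", "round": "any-shape",
    "large": "any-size", "small": "any-size",
}

def chain(value):
    """The value followed by its generalizations up the hierarchy."""
    out = [value]
    while value in PARENT:
        value = PARENT[value]
        out.append(value)
    return out

def lca(a, b):
    """Minimal common generalization of two values."""
    bs = set(chain(b))
    return next(v for v in chain(a) if v in bs)

def covers(h, x):
    """Hypothesis h covers instance x if each attribute of x
    generalizes (possibly trivially) to the corresponding value of h."""
    return all(hv in chain(xv) for hv, xv in zip(h, x))

examples = [
    (("orange", "square", "large"), True),     # i1
    (("blue", "ellipse", "small"), False),     # i2
    (("red", "triangle", "small"), True),      # i3
    (("green", "rectangle", "small"), False),  # i4
    (("yellow", "circle", "large"), True),     # i5
]

S = examples[0][0]
for x, positive in examples[1:]:
    if positive:
        S = tuple(lca(a, b) for a, b in zip(S, x))

negatives = [x for x, positive in examples if not positive]
```

After all positives, S is (warm-color, any-shape, any-size), and it covers neither negative example, matching the slide's result.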

Page 50

Does the order of the examples count? Why and how?

Consider the following order:
  color   shape      size   class
  orange  square     large  +  i1
  red     triangle   small  +  i3
  yellow  circle     large  +  i5
  blue    ellipse    small  -  i2
  green   rectangle  small  -  i4

Page 51

Discussion

What happens if there are not enough examples for S and G to become identical?

Could we still learn something useful?

How could we classify a new instance?

Could we be sure that the classification is correct?

When could we be sure that the classification is the same as the one made if the concept were completely learned?


What happens if there are not enough examples for S and G to become identical?

Let us assume that one learns only from the first 3 examples:

color    shape     size   class
orange   square    large    +    i1
blue     ellipse   small    -    i2
red      triangle  small    +    i3

The final version space will be:

S: {[(color = warm-color) & (shape = polygon) & (size = any-size)]}

G: {[(color = warm-color) & (shape = any-shape) & (size = any-size)],
    [(color = any-color) & (shape = polygon) & (size = any-size)]}


Assume that the final version space is:

S: {[(color = warm-color) & (shape = polygon) & (size = any-size)]}

G: {[(color = warm-color) & (shape = any-shape) & (size = any-size)],
    [(color = any-color) & (shape = polygon) & (size = any-size)]}

How could we classify the following examples, how certain would we be of each classification, and why?

color    shape    size   class
blue     circle   large    -            ; covered by no element of G
orange   square   small    +            ; covered by every element of S
red      ellipse  large    don't know
blue     polygon  small    don't know
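These classification rules can be checked with a small illustrative Python sketch (the hierarchy encoding and helper names are our own): an instance is surely positive if it is covered by every element of S, surely negative if it is covered by no element of G, and undecidable otherwise.

```python
# Classifying with a partially learned version space (after i1-i3 only).

PARENT = {  # child -> parent for the generalization hierarchies
    "red": "warm-color", "yellow": "warm-color", "orange": "warm-color",
    "black": "cold-color", "blue": "cold-color", "green": "cold-color",
    "warm-color": "any-color", "cold-color": "any-color",
    "triangle": "polygon", "rectangle": "polygon", "square": "polygon",
    "circle": "round", "ellipse": "round",
    "polygon": "any-shape", "round": "any-shape",
    "large": "any-size", "small": "any-size",
}

def ancestors(v):
    chain = [v]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain

def covers(h, x):
    """True if hypothesis h matches instance x."""
    return all(hv in ancestors(xv) for hv, xv in zip(h, x))

S = [("warm-color", "polygon", "any-size")]
G = [("warm-color", "any-shape", "any-size"),
     ("any-color", "polygon", "any-size")]

def classify(x):
    if all(covers(s, x) for s in S):
        return "+"             # covered by every hypothesis in the version space
    if not any(covers(g, x) for g in G):
        return "-"             # covered by no hypothesis in the version space
    return "don't know"

for x in [("blue", "circle", "large"), ("orange", "square", "small"),
          ("red", "ellipse", "large"), ("blue", "polygon", "small")]:
    print(x, classify(x))
```

The output matches the table: negative, positive, don't know, don't know.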


Could the examples contain errors?

What kind of errors could be found in an example?

What will be the result of the learning algorithm if there are errors in examples?

What could we do if we know that there are errors?

Discussion


Could the examples contain errors?

What kind of errors could be found in an example?

Discussion

- Classification errors:
  - positive examples labeled as negative
  - negative examples labeled as positive

- Measurement errors:
  - errors in the values of the attributes


What will be the result of the learning algorithm if there are errors in examples?

Let us assume that the 4th example is incorrectly classified:

color    shape      size   class
orange   square     large    +    i1
blue     ellipse    small    -    i2
red      triangle   small    +    i3
green    rectangle  small    +    i4  (incorrect classification)
yellow   circle     large    +    i5

The version space after the first three examples is:

S: {[(color = warm-color) & (shape = polygon) & (size = any-size)]}

G: {[(color = warm-color) & (shape = any-shape) & (size = any-size)],
    [(color = any-color) & (shape = polygon) & (size = any-size)]}

Continue the learning process with i4 and i5 to see how the mislabeled example affects S and G.


What could we do if we know that there might be errors in the examples?

If we cannot find a concept consistent with all the training examples, then we may try to find a concept that is consistent with all but one of the examples.

If this fails, then we may try to find a concept that is consistent with all but two of the examples, and so on.

What is a problem with this approach?

Combinatorial explosion.
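The explosion is easy to quantify: with n training examples there are C(n, k) ways to choose which k examples to ignore, so every additional tolerated error multiplies the number of candidate subsets to try. A quick illustration (the value of n is hypothetical):

```python
from math import comb

n = 100                  # hypothetical number of training examples
for k in range(4):
    # number of ways to choose which k examples to treat as erroneous
    print(k, comb(n, k))
# 0 1, 1 100, 2 4950, 3 161700
```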


What happens if we extend the generalization language to include conjunction, disjunction and negation of examples?

Task: learn the concept represented by the examples below by applying the Version Space method.

Set of examples:

color    shape      size   class
orange   square     large    +    i1
blue     ellipse    small    -    i2
red      triangle   small    +    i3
green    rectangle  small    -    i4
yellow   circle     large    +    i5

Background knowledge:

any-color
  warm-color: red, yellow, orange
  cold-color: black, blue, green

any-shape
  polygon: triangle, rectangle, square
  round: circle, ellipse

any-size
  large, small


Set of examples:

color    shape      size   class
orange   square     large    +    i1
blue     ellipse    small    -    i2
red      triangle   small    +    i3
green    rectangle  small    -    i4
yellow   circle     large    +    i5

+i1:  S = { i1 }              G = { all the examples }
-i2:  S = { i1 }              G = { ¬i2 }           ; all the examples except i2
+i3:  S = { i1 or i3 }        G = { ¬i2 }
-i4:  S = { i1 or i3 }        G = { ¬i2 & ¬i4 }     ; all the examples except i2 and i4
+i5:  S = { i1 or i3 or i5 }  G = { ¬i2 & ¬i4 }

These are the minimal generalizations and specializations: S is simply the disjunction of the positive examples and G simply excludes the negative ones, so the method no longer generalizes beyond the observed data.


The futility of bias-free learning

A learner that makes no a priori assumptions regarding the identity of the target concept has no rational basis for classifying any unseen instance.


What happens if we extend the generalization language to include internal disjunction? Does the algorithm still generalize over the observed data?

Task: learn the concept represented by the examples below by applying the Version Space method.

Set of examples:

color    shape      size   class
orange   square     large    +    i1
blue     ellipse    small    -    i2
red      triangle   small    +    i3
green    rectangle  small    -    i4
yellow   circle     large    +    i5

Background knowledge:

any-color
  warm-color: red, yellow, orange
  cold-color: black, blue, green

any-shape
  polygon: triangle, rectangle, square
  round: circle, ellipse

any-size
  large, small

Generalization(i1, i3) = (orange or red, square or triangle, large or small)
Is it different from: i1 or i3?


How is the generalization language extended by the internal disjunction?

Consider the following generalization hierarchy:

any-shape
  polygon
    triangle
    rectangle
  circle


How is the generalization language extended by the internal disjunction?

The hierarchy

any-shape
  polygon
    triangle
    rectangle
  circle

is replaced with the following one:

top:     triangle or rectangle or circle   (= any-shape; also = polygon or circle)
middle:  triangle or rectangle (= polygon),  triangle or circle,  rectangle or circle
bottom:  triangle,  rectangle,  circle
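Internal disjunction can be modeled directly as sets of leaf values, with minimal generalization being set union; the three shape leaves then yield 2^3 - 1 = 7 distinct values instead of the 5 nodes of the original tree. A small illustrative sketch:

```python
from itertools import combinations

leaves = ["triangle", "rectangle", "circle"]

# Every non-empty subset of leaves is a legal internal-disjunction value.
values = [frozenset(c) for r in range(1, len(leaves) + 1)
          for c in combinations(leaves, r)]
print(len(values))  # 7, i.e. 2**3 - 1

# The minimal generalization of two values is simply their union.
g = frozenset({"triangle"}) | frozenset({"circle"})
print(sorted(g))  # ['circle', 'triangle']
```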


Consider now the following generalization hierarchy:

any-color
  warm-color: red, yellow, orange
  cold-color: black, blue, green

What is the corresponding hierarchy containing disjunctions?


Could you think of another approach to learning a disjunctive concept with the candidate elimination algorithm?

Find a concept1 that is consistent with some of the positive examples and none of the negative examples.

Remove the covered positive examples from the training set and repeat the procedure for the rest of the examples, computing another concept2 that covers some of the remaining positive examples, and so on, until no positive example is left.

The learned concept is “concept1 or concept2 or …”

Could you specify this algorithm better?

Hint: Initialize S with the first positive example, …
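The covering procedure hinted at above can be sketched as follows (illustrative Python; the hierarchy encoding and helper names are our own, and only the S side of the version space is maintained): seed a hypothesis with the first uncovered positive example, generalize it minimally over further positives whenever the result stays consistent with every negative, then repeat on the positives still uncovered.

```python
# Covering-style learning of a disjunctive concept (illustrative sketch).

PARENT = {  # child -> parent for the generalization hierarchies
    "red": "warm-color", "yellow": "warm-color", "orange": "warm-color",
    "black": "cold-color", "blue": "cold-color", "green": "cold-color",
    "warm-color": "any-color", "cold-color": "any-color",
    "triangle": "polygon", "rectangle": "polygon", "square": "polygon",
    "circle": "round", "ellipse": "round",
    "polygon": "any-shape", "round": "any-shape",
    "large": "any-size", "small": "any-size",
}

def ancestors(v):
    chain = [v]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain

def covers(h, x):
    return all(hv in ancestors(xv) for hv, xv in zip(h, x))

def lca(a, b):
    anc_b = set(ancestors(b))
    return next(v for v in ancestors(a) if v in anc_b)

def learn_disjunction(pos, neg):
    remaining = list(pos)
    disjuncts = []
    while remaining:
        h = remaining[0]                    # seed: first uncovered positive
        for x in remaining[1:]:
            cand = tuple(lca(hv, xv) for hv, xv in zip(h, x))
            if not any(covers(cand, n) for n in neg):
                h = cand                    # accept the consistent generalization
        disjuncts.append(h)
        remaining = [x for x in remaining if not covers(h, x)]
    return disjuncts

pos = [("orange", "square", "large"), ("red", "triangle", "small"),
       ("yellow", "circle", "large")]
neg = [("blue", "ellipse", "small"), ("green", "rectangle", "small")]
print(learn_disjunction(pos, neg))
# here a single disjunct suffices: [('warm-color', 'any-shape', 'any-size')]
```

On this example set one conjunctive disjunct already covers all positives, so the loop terminates after one round; on truly disjunctive data it would return several disjuncts.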


Consider the following:

Instance language:
  color  {red, orange, yellow, blue, green, black}

Generalization language:
  color  {red, orange, yellow, blue, green, black, warm-color, cold-color, any-color}

a sequence of positive and negative examples of a concept:

  example1(+): orange    example2(-): blue    example3(+): red

and the background knowledge represented by the following hierarchy:

any-color
  warm-color: red, yellow, orange
  cold-color: black, blue, green

Apply the candidate elimination algorithm to learn the concept represented by the above examples.
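A single-attribute sketch for checking your answer (illustrative Python; the encoding and names are our own):

```python
# Candidate elimination over the color attribute only (illustrative sketch).

PARENT = {
    "red": "warm-color", "yellow": "warm-color", "orange": "warm-color",
    "black": "cold-color", "blue": "cold-color", "green": "cold-color",
    "warm-color": "any-color", "cold-color": "any-color",
}
CHILDREN = {}
for c, p in PARENT.items():
    CHILDREN.setdefault(p, []).append(c)

def ancestors(v):
    chain = [v]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain

S, G = ["orange"], ["any-color"]            # S seeded with example1(+)
for value, label in [("orange", True), ("blue", False), ("red", True)]:
    if label:                               # positive example
        G = [g for g in G if g in ancestors(value)]
        S = [next(a for a in ancestors(s) if a in ancestors(value)) for s in S]
    else:                                   # negative example
        S = [s for s in S if s not in ancestors(value)]
        new_G = []
        for g in G:
            if g not in ancestors(value):   # already excludes the negative
                new_G.append(g)
                continue
            for c in CHILDREN.get(g, []):   # minimal specializations
                if c not in ancestors(value) and any(c in ancestors(s) for s in S):
                    new_G.append(c)
        G = new_G

print(S, G)  # ['warm-color'] ['warm-color']
```

S and G converge on warm-color: blue forces G down from any-color, and red pulls S up from orange.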

Exercise


• In its original form learns only conjunctive descriptions.

• However, it could be applied successively to learn disjunctive descriptions.

• Requires an exhaustive set of examples.

• Conducts an exhaustive bi-directional breadth-first search.

• The sets S and G can be very large for complex problems.

• It is very important from a theoretical point of view, clarifying the process of inductive concept learning from examples.

• Has very limited practical applicability because of the combinatorial explosion of the S and G sets.

• It forms the basis of the powerful Disciple multistrategy learning method, which has practical applications.

Features of the version space method


Recommended reading

Mitchell T.M., Machine Learning, Chapter 2: Concept learning and the general to specific ordering, pp. 20-51, McGraw Hill, 1997.

Mitchell, T.M., Utgoff P.E., Banerji R., Learning by Experimentation: Acquiring and Refining Problem-Solving Heuristics, in Readings in Machine Learning.

Tecuci, G., Building Intelligent Agents, Chapter 3: Knowledge representation and reasoning, pp. 31-75, Academic Press, 1998.

Barr A. and Feigenbaum E. (Eds.), The Handbook of Artificial Intelligence, vol III, pp.385-400, pp.484-493.

