Page 1: EC: Lecture 17: Classifier Systems Ata Kaban University of Birmingham.

EC: Lecture 17: Classifier Systems

Ata Kaban

University of Birmingham

Page 2:

Classifier Systems

• Rule-based systems with an input and an output interface

(Basic CFS diagram: Input Interface, Message List, Classifier List, Output Interface)

• Message list: a database of messages = facts = binary strings
• Classifiers = rules of the form IF cond AND cond … THEN action, represented as ternary strings over {0,1,#}

Page 3:

The message list

• A finite memory (database) of messages, or facts
  – Messages are binary strings of fixed length
• Can be visible or not to the external world
• Messages can arrive from several modules
  – The input interface
  – Classifiers
• Messages can be deleted by the output interface (after processing them)

Page 4:

The classifier list

• A finite-size rule base
• Classifier = production rule here
  – IF cond1 AND cond2 … AND condN THEN action
  – Represented as a fixed number of words of the form cond1, cond2, … condN / action
• The number of conditions is fixed
• Conditions can contain NOT
• Conditions have a fixed length = the length of messages
• Conditions are over the alphabet {0,1,#}, where '#' = 'don't care'
• Conditions are matched against messages in the message list (in terms of Hamming distance)

Page 5:

Matching

• A non-negated condition is satisfied if there is at least one message in the message list whose Hamming distance from the condition string, disregarding '#' characters, is 0
• E.g. message list: a: 0101, b: 1010, c: 1111

  condition 0101: satisfied (by a)
            1101: not satisfied
            #101: satisfied (by a)
            1###: satisfied (by both b and c)
            ##00: not satisfied
            ####: satisfied (by a, b and c)

• Conditions with many '#'s tend to be unspecific
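The matching rule above can be sketched in a few lines of Python (the function names are illustrative, not from the lecture):

```python
def matches(condition: str, message: str) -> bool:
    """A message matches a condition when every non-'#' position agrees,
    i.e. the Hamming distance outside the '#' positions is 0."""
    return all(c in ('#', m) for c, m in zip(condition, message))

def satisfied(condition: str, message_list: list[str]) -> bool:
    """A non-negated condition needs at least one matching message."""
    return any(matches(condition, m) for m in message_list)

msgs = ["0101", "1010", "1111"]        # a, b, c from the slide
print(satisfied("0101", msgs))         # True  (a)
print(satisfied("1101", msgs))         # False
print(satisfied("#101", msgs))         # True  (a)
print(satisfied("1###", msgs))         # True  (b and c)
print(satisfied("##00", msgs))         # False
print(satisfied("####", msgs))         # True  (a, b and c)
```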

Page 6:

Matching

• A negated condition is satisfied if no message in the message list matches it
• E.g. message list: a: 0101, b: 1010, c: 1111

  condition ~0101: not satisfied (matched by a)
            ~1101: satisfied
            ~#101: not satisfied (a)
            ~1###: not satisfied (b, c)
            ~##00: satisfied
            ~####: not satisfied (a, b, c)

• Negated conditions with many '#'s tend to be very specific (i.e. can be satisfied by few messages)
  – A condition with M '#' symbols is matched by 2^M of the 2^L possible messages (L being the message length), so its negation is satisfied by only the remaining 2^L – 2^M messages
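Negated matching, and the counting argument above, can be checked directly (again an illustrative sketch, not lecture code):

```python
from itertools import product

def matches(condition, message):
    return all(c in ('#', m) for c, m in zip(condition, message))

def satisfied_negated(condition, message_list):
    """A negated condition is satisfied only if NO message matches it."""
    return not any(matches(condition, m) for m in message_list)

msgs = ["0101", "1010", "1111"]         # a, b, c
print(satisfied_negated("1101", msgs))  # True
print(satisfied_negated("####", msgs))  # False (a, b and c all match)

# Counting check: a length-4 condition with M = 2 '#' symbols is matched
# by exactly 2**M = 4 of the 2**4 = 16 possible messages.
cond = "1#0#"
count = sum(matches(cond, "".join(bits)) for bits in product("01", repeat=4))
print(count)  # 4
```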

Page 7:

Actions

• Strings of fixed length over {0,1,#}
• When all the conditions of a classifier are matched, the classifier is activated and a message is built as follows:
  – 0's and 1's in the action string are copied into the message
  – #'s are substituted by the corresponding characters of the message that matches the first condition of the classifier

Page 8:

E.g. actions

• Message list: a: 0101, b: 1010, c: 1111
• Classifier list:
  i:   #11#, ~#110 / 00##
  ii:  ###1, ~#110 / ###0
  iii: ##1#, ~1110 / 0##0
• The following set of messages will be produced:
  0011 (posted by i, c matches cond1)
  0100 (posted by ii, a matches cond1)
  1110 (posted by ii, c matches cond1)
  0010 (posted by iii, b matches cond1)
  0110 (posted by iii, c matches cond1)
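The whole example can be reproduced by a small message-production routine (the function names are mine, not from the slides):

```python
def matches(cond, msg):
    return all(c in ('#', m) for c, m in zip(cond, msg))

def fire(classifiers, message_list):
    """One matching round. Each classifier is (cond1, negated_cond, action):
    the negated condition must be matched by no message, and one new message
    is built per message matching cond1, with '#' characters in the action
    replaced by the corresponding characters of that message."""
    out = []
    for cond1, neg, action in classifiers:
        if any(matches(neg, m) for m in message_list):
            continue                      # the negated condition fails
        for m in message_list:
            if matches(cond1, m):
                out.append("".join(mc if a == '#' else a
                                   for a, mc in zip(action, m)))
    return out

msgs = ["0101", "1010", "1111"]                  # a, b, c
cls = [("#11#", "#110", "00##"),                 # i
       ("###1", "#110", "###0"),                 # ii
       ("##1#", "1110", "0##0")]                 # iii
print(fire(cls, msgs))  # ['0011', '0100', '1110', '0010', '0110']
```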

Page 9:

• Many classifiers can be activated in parallel!
• Conflict resolution is only necessary if the active classifiers can produce more messages than the message list can hold

Page 10:

The Input Interface

• Initially the message list is empty
• The input interface is a mechanism by which the CFS can obtain information about the environment, through messages
  – These messages are often descriptions of the state of a set of (binary) detectors that can sense various features of the environment
• E.g. they may indicate the position of a robot controlled by a CFS with respect to some obstacles in the environment

Page 11:

The Output Interface

• The output interface is a mechanism for using (and deleting) some messages in the message list
  – These usually represent (and control) the state of a set of effectors which act on the environment, e.g. control the actions of a robot
• Must distinguish between output messages and other messages!
• Types of messages:
  – Input messages (posted by the input interface)
  – Internal messages (posted by classifiers, to be later matched and processed by other classifiers)
  – Messages meant to be output messages
• A tag can be used to distinguish between the types of messages

Page 12:

E.g.

• A CFS which has to control a robot might have some classifiers devoted to obstacle avoidance:

  1####### / 00 0100 00
  ##1##### / 00 0001 00
  ####1### / 00 1000 00
  ######1# / 00 0010 00

• The left-hand sides are conditions (any obstacles left, right, up, down?); each right-hand side consists of a tag (marking the message as an output message) and the action bits.

Page 13:

The CFS Main Cycle

• The blocks indicated in the basic CFS diagram are activated according to the following main cycle:

1. Activate the input interface & post the input messages it generates to the message list
2. Match all the conditions of all classifiers against the message list
3. Activate the classifiers whose conditions are satisfied and add their messages to the message list
4. Activate the output interface, i.e. remove the output messages from the message list & perform the actions they describe
5. Repeat the previous steps
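The five steps above can be sketched as a loop (a single-condition toy: the names read_sensors/act and the convention that output messages start with '1' are assumptions of mine, not from the lecture):

```python
def matches(cond, msg):
    return all(c in ('#', m) for c, m in zip(cond, msg))

def cfs_main_cycle(read_sensors, classifiers, act, n_cycles):
    message_list = []
    for _ in range(n_cycles):
        message_list += read_sensors()                       # step 1
        new = ["".join(mc if a == '#' else a                 # steps 2-3
                       for a, mc in zip(action, m))
               for cond, action in classifiers
               for m in message_list if matches(cond, m)]
        for m in new:                                        # step 4
            if m[0] == '1':                                  # output tag
                act(m)
        message_list = [m for m in new if m[0] != '1']       # step 5: repeat

posted = []
cfs_main_cycle(lambda: ["0101"], [("0101", "1###")], posted.append, 2)
print(posted)   # ['1101', '1101']
```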

Page 14:

What is Missing?

• It has been proved that, given repetition of the main cycle, '#' characters and at least 2 conditions per classifier, a CFS is capable of arbitrarily complex computations and of representing structures like stacks, lists, etc.
• CFSs can be programmed to do useful things
• BUT: they are quite difficult to program
• We need to add adaptation mechanisms to the basic architecture, so that CFSs can learn to behave appropriately in the environment or to perform useful tasks.

Page 15:

The Need for Competition

• Some classifiers in the system are better than others
  – i.e. the messages they produce lead to better actions
• E.g. for an autonomous agent, a good action is one that leads to getting some food.
• In its basic form, a CFS gives all classifiers the same chance to post their messages
• We would like to prioritise good classifiers
• Classifiers should compete to post their messages!
  – Based on some measure of quality

Page 16:

Quality of classifiers

• Usefulness, or strength: the capability of producing a good performance of the whole system
• Relevance to a particular situation, or specificity: (L – number of '#' symbols) / L, where L is the length of the message
  – Classifiers that match a particular message (or only a few messages) are 'specialists' for that kind of situation; these are preferred over classifiers that match many situations and therefore provide a kind of default behaviour for the system.
  – This measure is needed to ensure the competition

Page 17:

Quality of classifiers

• Bid: strength and specificity combined:
  Bid = const * strength * specificity (const is usually circa 1/10)
• Notes:
  – To maintain parallelism, we must allow for more than one winner in the competition
  – To avoid premature convergence, we use probabilistic competition, with bid-proportionate winning probability
  – For a given classifier, specificity is a constant, so only the strength can be varied to influence the behaviour of a CFS.
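The specificity and bid formulas combine as follows (the strength values here are hypothetical, chosen just to illustrate):

```python
def specificity(condition: str) -> float:
    """(L - number of '#' symbols) / L"""
    return (len(condition) - condition.count('#')) / len(condition)

def bid(strength: float, condition: str, const: float = 0.1) -> float:
    """Bid = const * strength * specificity, with const circa 1/10."""
    return const * strength * specificity(condition)

# With equal strength, a fully specific classifier outbids a default rule:
print(bid(100.0, "0101"))   # 10.0
print(bid(100.0, "0###"))   # 2.5
```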

Page 18:

How to adapt?

• By credit assignment
  – Information about the quality of the behaviour of the system comes from the environment in the form of rewards (e.g. +1, -1)
  – We need to decide which classifiers are responsible, and to what extent, for the good or bad overall behaviour

The Bucket Brigade Algorithm
• If there is a reward (or punishment), add it to the strength of all classifiers active in the current major cycle
• Make each active classifier pay its bid to the classifiers that prepared the stage for it (i.e. posted the messages matched by its conditions)
  – In time, strength is propagated backwards and each classifier receives the correct share of credit for the good (or bad) behaviour of the system as a whole
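A minimal sketch of one bucket-brigade update along a single chain of classifiers (the chain representation, the constant specificity of 1 and the example strengths are simplifying assumptions of mine):

```python
def bucket_brigade_step(strength, chain, reward, const=0.1, spec=1.0):
    """'chain' lists classifier ids in activation order: each one posted
    the message matched by the next. Every classifier pays its bid to its
    stage-setter; the environment rewards the last, acting classifier."""
    for i, c in enumerate(chain):
        b = const * strength[c] * spec
        strength[c] -= b                  # pay the bid ...
        if i > 0:
            strength[chain[i - 1]] += b   # ... to the stage-setter
    strength[chain[-1]] += reward         # external reward at the end
    return strength

s = {"c1": 100.0, "c2": 100.0}
print(bucket_brigade_step(s, ["c1", "c2"], reward=50.0))
# {'c1': 100.0, 'c2': 140.0} -- repeated cycles pass the reward back to c1
```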

Page 19:

How to adapt?

• By rule discovery
  – Discovery of good classifiers by means of genetic algorithms
  – GAs can be used in 2 ways:
    • Pittsburgh approach: consider the classifier list as a single individual, whose chromosome is obtained by concatenating the conditions and actions of all classifiers
      – Different sets of classifiers compete
      – The fitness of each CFS is determined by observing the system for some time
    • Michigan approach: consider each classifier as a separate individual
      – Different classifiers compete or cooperate
      – Requires a fitness measure for each classifier; the bucket brigade algorithm can be used

Page 20:

Main Cycle of Learning Classifier Systems

1. Read input messages from sensors
2. Find the classifiers that are activated & select those with the highest importance (fitness)
3. Reduce the fitness of the matching classifiers
4. Clear the message list
5. Use the matching classifiers
6. Evaluate the resulting behaviour
7. Punish or reward the classifiers that were active (reinforcement learning)
8. Generate a new population of classifiers with an evolutionary algorithm
9. Iterate

Page 21:

Summary

• LCS
  – Rule-based system
• Evolving LCS
  – Pitt approach: every individual contains all rules
  – Michigan approach: every individual contains a single rule
  – Fitness based on accuracy or strength value

