3. BUCKET BRIGADE ALGORITHM FOR HIERARCHICAL CENSORED PRODUCTION RULE-BASED SYSTEM
In this chapter, we study the suitability of the standard bucket brigade
algorithm for Variable Precision Logic systems and propose a modified version of
the algorithm. Under the standard algorithm, correct rules that are activated by
incorrect rules get weakened; a solution to this problem is also incorporated
in the modified version.
3.1. Bucket Brigade Algorithm: An Overview
Many learning systems face the problem of temporal credit allocation: the
proper reinforcement of activities that do not directly result in need satisfaction
or external reward but are nevertheless essential precursors to such outcomes
[Wilson]. Therefore, the development of an appropriate credit apportionment
scheme for adaptive rule-based systems is considered to be one of the main
issues in machine learning (e.g., [1], [2], [22], [33], [42], [43], [46], [65], [66], [72],
[74], [75], [78], [79], [80], [81], [83]). One well-known algorithm in this regard is
the bucket brigade algorithm of Holland [42], [43], which is designed for
message-passing, rule-based classifier systems (the genetic algorithms approach).
De Jong [19] remarked: "I think it is fair to say that most people who encounter
classifier systems after becoming familiar with the traditional genetic algorithms
literature are somewhat surprised at the emergence of the rather elaborate
(bucket brigade) mechanism to deal with apportionment of credit issues".
In certain other AI systems that learn to perform multi-step tasks, such as the LEX
system [58] for symbolic integration and the ACT* cognitive architecture [1], credit
is assigned to early steps by keeping and analysing a record of all pre-payoff
actions, both considered and taken, and the associated reasoning. Keeping track
of all paths of activation actually followed by the rule chains is not possible,
because the number of these paths grows exponentially [22] (a rule chain is a
set of rules activated in sequence, starting with a rule activated by
environmental messages and ending with a rule performing an action on the
environment). The bucket brigade technique therefore does not depend on
retrospective analysis but operates locally, during performance, in the strength
transactions between steps, with the better classifiers at each step being
selected statistically over time. The bucket brigade implements an "economy" in
which "information" flows forward with inference and "strength" flows backward
in a "trickle down" distribution pattern. Therefore, the bucket brigade principle
would appear appropriate for systems (such as animals and autonomous robots in
on-going interaction with uncertain environments) where storage and analysis of
raw experience is expensive or impractical [82].
Many attempts have been made to study the bucket brigade algorithm and
to overcome its limitations. Antonisse [2] explored the adaptation of the
bucket brigade to production systems, where he addressed the problem of
unsupervised credit assignment. He concluded that the bucket brigade is an
inexpensive and stable method that appears to be convergent. However,
according to Antonisse, the main limitations of the bucket brigade are that it
does not give absolute measures of performance for rules in the system and that
the propagation of credit along paths of inference is proportional to the lengths
of these paths, rather than constant. Huang [46] observed that the usefulness of
the action conducted by a rule is a function of the context in which the rule
activates, and that scalar-valued strengths in bucket brigade algorithms are not
appropriate approximators of the usefulness of rule actions. He therefore
proposed an algorithm called the context-array bucket brigade algorithm, in which
more sophisticated rule strengths such as array-valued strengths are used. To
handle the default hierarchy formation problem in the standard bucket brigade
algorithm (for a default hierarchy to work properly, classifiers that implement
exception rules must be able to control the system when they are applicable,
thus preventing the default rules from making mistakes), Riolo [65] suggested a
simple modification to the standard algorithm. The modification involves
biasing the calculation of the effective bid so that general classifiers (those with
low bid ratios) have much lower effective bids than do more specific classifiers
(those with high bid ratios). Dorigo [23] addressed the same problem and
proposed an apportionment of credit algorithm, called the message-based bucket
brigade, in which messages instead of rules are evaluated and a rule's quality is
considered a function of the value of the messages matching the rule conditions,
of the rule specificity, and of the value of the message the rule tries to post.
Spiessens [72] has shown that the bucket brigade is not very efficient in
providing accurate predictions of the expected payoff from the environment and
presented an alternative algorithm for a classifier system called PCS. The
strength of a classifier in PCS is a reliability measure of the prediction of this
classifier.
The problem that the bucket brigade does not adequately reinforce long action
sequences has been discussed by Wilson [81], who suggested an alternative form of
the algorithm. The algorithm was designed to induce behavior
hierarchies in which the modularity of the hierarchy would keep all the bucket
brigade chains short, and thus more reinforceable and more rapidly learned, while
overall action sequences could be long. The problem of long action chains was
first mentioned by Holland [85], who suggested a way to
implement "bridging classifiers" that speed up the flow of strength down a long
sequence of classifiers. Riolo [66] presented a comparison study between the
standard bucket brigade algorithm and the "bridging classifiers" using the
CFS-C/FSW1 classifier system [63], [64]. The results show that "bridging classifiers"
dramatically decrease the number of times a long sequence must
be executed in order to reallocate strength to all the classifiers in the sequence.
Westerdale [78] claims that the bucket brigade algorithm is a subgoal reward
scheme which may lead to inaccurate payoff, and he therefore suggested another
scheme, called the genetic scheme, in which the payoff is distributed in a different
way from the strategy used in the bucket brigade algorithm. Westerdale [80] further
examined the bucket brigade algorithm in a simple illustrative production system
(Railway). He employed a penalty scheme to produce an interpretable bucket
brigade reward scheme. The result is a reward scheme that looks like a genetic
scheme with altruism added.
Grefenstette [35] presented a preliminary study of the use of the rule-level
credit assignment provided by the bucket brigade in a system using the Pitt
approach to genetic learning. The system shows that the information concerning
the strength of individual rules can be useful in altering the physical
representation of the rule sets in order to promote the linkage of co-adapted sets
of rules.
To modify the bucket brigade algorithm so that it is suitable for pattern-matched
systems, it has to be noted that there are three main differences between
classifier systems and production systems: (1) knowledge representation, (2)
probabilistic control, and (3) parallel versus sequential orientation [2].
Following Antonisse [2], because of the difference in knowledge
representation between classifier systems (a classifier is represented over the set
{0, 1, *}) and pattern-matched production systems (symbolic representation), the
determination of rule accuracy is highly problematic in production systems
(and unproblematic in classifier systems).
Classifier systems apply classifiers probabilistically, in proportion to (an
exponent of) the amount of their bids. In a production system the branch with the
highest payoff is chosen, whereas classifier systems choose inference branches
with a probability that is (exponentially) proportional to expected payoff.
A classifier system is a massively parallel system: it expects all matching
classifiers to fire but adjudicates among classifiers that post incompatible results,
and hence it resolves the problem of conflict resolution. In contrast, most
production systems (e.g., OPS-like systems) support parallelism in the match
phase but expect a single rule firing following conflict resolution. It is left to the
system builder to determine the retractions that remove productions from the
conflict set without firing.
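The difference in control strategy can be illustrated with a minimal sketch. The function names are hypothetical, and weighting each rule by `bid ** exponent` is only one possible reading of "an exponent of the bid"; the text does not fix the exact weighting.

```python
import random

def select_probabilistic(bids, exponent=1.0, rng=random):
    """Classifier-system style: draw one rule with probability
    proportional to bid ** exponent (assumed weighting)."""
    names = list(bids)
    weights = [bids[name] ** exponent for name in names]
    return rng.choices(names, weights=weights, k=1)[0]

def select_highest(bids):
    """Production-system style: always take the rule with the highest bid."""
    return max(bids, key=bids.get)

bids = {"r1": 2.0, "r2": 5.0, "r3": 1.0}   # illustrative values only
print(select_highest(bids))
print(select_probabilistic(bids, rng=random.Random(0)))
```

Under the probabilistic policy a weaker rule still fires occasionally, which is what lets a classifier system keep testing alternatives; the deterministic policy never does.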
3.2. Motivation
Under the standard bucket brigade algorithm, correct rules initiated
by incorrect rules get weakened. To make this problem clear,
let us consider the set of CPRs given below (we shall neglect the UNLESS part
for the time being). A full match of the preconditions is necessary for any rule
to enter the conflict set. It is further assumed that every rule in the set is
given an equal initial strength of 10. The value of b is 0.5 and the environment
reward is 40.
Rule 1.  IF [Is-holiday] THEN [Sunday]
         UNLESS [National-day, Religious-day]
Rule 2.  IF [Good-climate] THEN [Moderate-weather]
         UNLESS [Is-raining, Is-very-hot]
Rule 3.  IF [John-like-swimming] THEN [John-like-the-sea]
         UNLESS [John-fear-waves, John-hate-sea-water]
Rule 4.  IF [John-like-swimming] THEN [John-like-swimming-pool]
         UNLESS [Empty-swimming-pool]
Rule 5.  IF [John-like-swimming] THEN [John-like-skiing]
         UNLESS [ ]
Rule 6.  IF [Sunday, Moderate-weather] THEN [John-interested-in-outing]
         UNLESS [John-is-ill, John-has-guest, John-watch-TV]
Rule 7.  IF [John-like-the-sea, John-interested-in-outing]
         THEN [John-may-go-to-the-sea]
         UNLESS [sea-is-far]
Rule 8.  IF [John-like-swimming-pool, John-interested-in-outing]
         THEN [John-may-go-to-swimming-pool]
         UNLESS [swimming-pool-is-far]
Rule 9.  IF [John-like-swimming-pool]
         THEN [John-may-go-for-mountaineering]
         UNLESS [ ]
Rule 10. IF [John-like-skiing, John-interested-in-outing]
         THEN [John-may-go-to-snow-hills]
         UNLESS [ ]
In the following explanation, we assume that the system receives
environment inputs in the order (a) to (c) as given in Fig. 3.1.

Fig. 3.1. The chosen solution paths and the modifications of rule strengths under the standard bucket brigade algorithm according to environment inputs (a) Is-holiday, Good-climate, and John-like-swimming; (b) Is-holiday, Good-climate, and John-like-skiing; and (c) sequence of inputs (a), (b), and (a).

case(a): Environment input(a).
Is-holiday, Good-climate, John-like-swimming
Rules 1, 2, 3, 4, and 5 enter the conflict set, each making a bid of 5
(10*0.5), and all get fired. Each of them pays the amount of its bid out of its
capital, which changes its strength from 10 to 5 (10 - 5). The
firing of rules 1 and 2 initiates rule 6, whereas the firing of rule 4 initiates rule
9. Rules 6 and 9 are now in the conflict set, each making a bid of 5; both get
fired and their strength is changed to 5 (10 - 5). Accordingly, rule 6 pays out 2.5
each to rules 1 and 2, changing their strength to 7.5 (5 + 2.5). Rule 4 gets the bid
paid by rule 9 and its strength is changed to 10 (5 + 5). The strength of rule 9
remains 5 since it does not initiate any other rule. The firing of rules 6 and 3
initiates rule 7, the firing of rules 6 and 4 initiates rule 8, and the firing
of rules 6 and 5 initiates rule 10. Consequently, rules 7, 8, and 10 enter the
conflict set and each makes a bid of 5, changing their strength to 5 (10 - 5).
Rule 7 pays out 2.5 each to rules 6 and 3, so that their
strengths become 7.5 (5 + 2.5). Similarly, rule 8 pays out 2.5 each to rules 6 and 4,
modifying their strengths to 10 (7.5 + 2.5) and 12.5 (10 + 2.5) respectively. Rule 6
again gets a payoff from rule 10 when the latter pays out 2.5 each to rules 5 and
6, changing their strengths to 7.5 (5 + 2.5) and 12.5 (10 + 2.5) respectively. Now
no more inference is possible, and the results achieved by rules 7, 8, 9, and 10
are sent to the environment for judgment. Rules 7 and 8 produce correct
conclusions and rules 9 and 10 produce wrong conclusions (according to the
environment input). The environment rewards rules 7 and 8 by giving 20 (40/2) to
each, which changes their strength to 25 each, while the strengths of rules 9 and
10 do not change (they remain 5 each). At the end of this stage the strengths of
the rules are as shown in Fig. 3.1(a) (in this figure and in the figures that
follow, the encircled numbers represent rule numbers while the numbers on top
of the circles indicate the strength of the nodes) and as listed below:
Rule 1 : 7.5
Rule 2 : 7.5
Rule 3 : 7.5
Rule 4 : 12.5
Rule 5 : 7.5
Rule 6 : 12.5
Rule 7 : 25
Rule 8 : 25
Rule 9 : 5
Rule 10 : 5
case(b): Environment input(b)
Is-holiday, Good-climate, John-like-skiing
Rules 1 and 2 enter the conflict set and both fire, each paying a bid of
3.75, which changes their strength to 3.75 (7.5 - 3.75). Rule 6 is initiated by
rules 1 and 2 and gets fired, paying a bid of 6.25; this changes its strength to
6.25 (12.5 - 6.25) and the strength of each of rules 1 and 2 to 6.875
(3.75 + 6.25/2). Rule 6 subsequently initiates rule 10, which fires, paying out a
bid of 2.5 to rule 6. Thus the strength of rule 6 becomes 8.75 (6.25 + 2.5),
whereas the strength of rule 10 is decremented to 2.5 (5 - 2.5). Since no more
rules can be fired, the action of rule 10 is considered to be the final
conclusion. Comparing this conclusion with the expected one (the correct
conclusion), the environment pays its reward to rule 10, changing its strength to
42.5 (2.5 + 40). As shown in Fig. 3.1(b), at the end of
this stage the strengths of the rules are as follows:
Rule 1 : 6.875
Rule 2 : 6.875
Rule 3 : 7.5
Rule 4 : 12.5
Rule 5 : 7.5
Rule 6 : 8.75
Rule 7 : 25
Rule 8 : 25
Rule 9 : 5
Rule 10 : 42.5
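The bookkeeping of case(a) and case(b) can be replayed with a short simulation. This is a sketch under stated assumptions rather than the implementation used in this work: the UNLESS parts are ignored as in the text, every rule whose preconditions are satisfied fires in the same round, a bid is split equally among the rules that supplied the bidder's preconditions (facts supplied by the environment attract no payment), and the set of rules judged correct by the environment (`correct_rules`) is passed in by hand. The names `RULES`, `run_case`, and `correct_rules` are hypothetical.

```python
B = 0.5          # bid coefficient b from the text
REWARD = 40.0    # environment reward from the text

# rule number -> (preconditions, conclusion); UNLESS parts are neglected
RULES = {
    1: ({"Is-holiday"}, "Sunday"),
    2: ({"Good-climate"}, "Moderate-weather"),
    3: ({"John-like-swimming"}, "John-like-the-sea"),
    4: ({"John-like-swimming"}, "John-like-swimming-pool"),
    5: ({"John-like-swimming"}, "John-like-skiing"),
    6: ({"Sunday", "Moderate-weather"}, "John-interested-in-outing"),
    7: ({"John-like-the-sea", "John-interested-in-outing"},
        "John-may-go-to-the-sea"),
    8: ({"John-like-swimming-pool", "John-interested-in-outing"},
        "John-may-go-to-swimming-pool"),
    9: ({"John-like-swimming-pool"}, "John-may-go-for-mountaineering"),
    10: ({"John-like-skiing", "John-interested-in-outing"},
         "John-may-go-to-snow-hills"),
}

def run_case(strength, env_facts, correct_rules):
    """Apply one environment input and update `strength` in place."""
    facts = set(env_facts)
    produced_by = {}   # fact -> rule that concluded it during this run
    fired = set()
    while True:
        ready = [r for r, (pre, _) in RULES.items()
                 if r not in fired and pre <= facts]
        if not ready:
            break
        # Each ready rule bids b times the strength it holds right now.
        bids = {r: B * strength[r] for r in ready}
        for r in ready:
            strength[r] -= bids[r]
            suppliers = sorted({produced_by[f] for f in RULES[r][0]
                                if f in produced_by})
            for s in suppliers:
                strength[s] += bids[r] / len(suppliers)
        for r in ready:
            fired.add(r)
            produced_by.setdefault(RULES[r][1], r)
            facts.add(RULES[r][1])
    # The reward is shared equally by the fired rules judged correct.
    winners = [r for r in fired if r in correct_rules]
    for r in winners:
        strength[r] += REWARD / len(winners)
    return strength

strength = {r: 10.0 for r in RULES}
run_case(strength, {"Is-holiday", "Good-climate", "John-like-swimming"}, {7, 8})
print(strength)   # strengths after case(a)
run_case(strength, {"Is-holiday", "Good-climate", "John-like-skiing"}, {10})
print(strength)   # strengths after case(b)
```

Under these assumptions the two printed dictionaries agree with the strength lists given for case(a) and case(b). The sketch is not claimed to reproduce the case(c) figures.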
case(c): Environment inputs as input(a), then input(b), followed again by
input(a).
Fig. 3.1(c) shows the result of applying this sequence of environment
messages. The strengths of the rules at the end of this stage will be as follows:
Rule 1 : 19.375
Rule 2 : 19.375
Rule 3 : 13.125
Rule 4 : 12.5
Rule 5 : 24.656
Rule 6 : 42.626
Rule 7 : 36.25
Rule 8 : 36.25
Rule 9 : 1.25
Rule 10 : 25.312
From the above discussion, it is clear that rule 10 is a good rule: it
proved able to produce a correct conclusion (in case(b)), even though
it produced a wrong conclusion in case(a). The wrong conclusion is due to the
wrong rule (rule 5) that initiated it. This made rule 10 lose its bid without
gaining any reward from the environment, and yet rule 5, which is wrong, is not
easily weakened. That is, when case(a) is tried, rule 10 is weakened and rule 5 is
strengthened (because it gets a payoff from rule 10 which may be higher than
what it pays out). To weaken rule 5, the messages of case(a) may be tried many
times, but this solution is still not acceptable, because rule 10 will also be
weakened and, moreover, the environment messages should be chosen
at random.
In the following sections, we propose a modified version of the bucket brigade
algorithm that overcomes the above-mentioned problem and is suitable for CPRs
as well as HCPRs systems.
3.3. Modified Bucket Brigade For CPRs System
In the standard bucket brigade, the bid is the product of the rule's
strength, its accuracy, and a moderating factor b which is less than one:
$B_j = S_j \cdot ACC_j \cdot b$. In classifier system language, $ACC_j$ is the
number of non-don't-care positions divided by the length of the condition string.
For CPRs, we use the same bid formula but with a different way of computing
$ACC_j$. To enter the conflict set, any CPR should be matched completely
(preconditions) and the censor conditions checked should be false. The number of
censor conditions to be checked can be found by GCS (implicitly).
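As a small illustration of the standard bid formula, the sketch below computes the classifier-system accuracy as the fraction of non-don't-care positions and multiplies it into the bid. The function names are hypothetical, and '*' is taken as the don't-care symbol, following the alphabet {0, 1, *} used earlier in this chapter.

```python
def classifier_accuracy(condition, dont_care="*"):
    """Fraction of positions in the condition string that are not don't-cares."""
    return sum(ch != dont_care for ch in condition) / len(condition)

def bid(strength, acc, b):
    """Standard bucket brigade bid: B_j = S_j * ACC_j * b."""
    return strength * acc * b

acc = classifier_accuracy("1*0*")
print(acc, bid(10.0, acc, 0.5))
print(bid(10.0, 1.0, 0.5))   # full match, as in the worked example of Section 3.2
```

With a full match (accuracy 1), a strength of 10 and b = 0.5, the bid is 5, which is the value used throughout the worked example in Section 3.2.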
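To compute the value of $ACC_j$, we consider the following four cases
($\delta_j^{\max}$ below means that all the existing censor conditions are checked).
1) $P \rightarrow D \lfloor C \vee UNK$: $\gamma_j \le \delta_j \le \delta_j^{\max} < 1$
2) $P \rightarrow D \lfloor C$ ($UNK = \phi$): $\gamma_j \le \delta_j \le \delta_j^{\max} = 1$
3) $P \rightarrow D \lfloor UNK$: $\gamma_j = \delta_j = \delta_j^{\max} < 1$
4) $P \rightarrow D$: $\gamma_j = \delta_j = \delta_j^{\max} = 1$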
The first case may be analysed as follows:
a) Some of the censor conditions are unknown, giving rise to $\delta_j^{\max} < 1$.
b) None of the censor conditions are checked (depending on system
constraints and user requirements), giving rise to $\gamma_j = \delta_j$.
c) Some of the censor conditions are checked, giving rise to $\gamma_j < \delta_j < \delta_j^{\max}$.
d) All the censor conditions are checked, giving rise to
$\gamma_j < \delta_j = \delta_j^{\max} < 1$.
The other cases can be analysed in the same way as the first. In the
second case all the censor conditions are known and there are no further
exceptions. In the third case, none of the exceptional cases is known, while in
the fourth case the rule is perfect and has no exceptions.
Based on the likelihood of censor conditions, the censor conditions may
be put in two forms:
1) Censor conditions in all the rules have equal likelihood.
2) Censor conditions have unequal likelihood.
According to GCS, all the rules are considered to be at level 0 (no
specificities). Thus the certainty factor to be achieved for every rule is the same,
but the number of censor conditions to be checked still depends on the value
of $\gamma$. In the case of unequal likelihood, the censor
conditions may be ordered in decreasing order of likelihood.
From the previous discussion, we can conclude that $\delta_j^{\max}$ is computed as
follows:
$$\delta_j^{\max} = \gamma_j + \sum_{k=1}^{n} CF(C_k)$$
where n is the number of censor conditions attached to the jth rule and
$CF(C_k)$ is the certainty factor associated with the kth censor condition.
The value of $\delta_j$ can then be computed using the following formula:
$$\delta_j = \gamma_j + \sum_{d=1}^{m} CF(C_d)$$
where m is the number of censor conditions checked in rule j, and $CF(C_d)$ is the
certainty factor associated with the dth checked censor condition $C_d$.
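The two sums can be sketched directly in code. The function names and the numeric values are hypothetical and purely illustrative, and the censor conditions are assumed to be checked in the order in which they are listed.

```python
def delta_after_checks(gamma, censor_cfs, m):
    """delta_j: gamma_j plus the certainty factors of the first m checked censors."""
    if not 0 <= m <= len(censor_cfs):
        raise ValueError("m must lie between 0 and the number of censors")
    return gamma + sum(censor_cfs[:m])

def delta_max(gamma, censor_cfs):
    """delta_j^max: the value reached when every attached censor is checked."""
    return delta_after_checks(gamma, censor_cfs, len(censor_cfs))

gamma = 0.5                  # illustrative gamma_j
censor_cfs = [0.25, 0.125]   # illustrative CF(C_1), CF(C_2)
for m in range(len(censor_cfs) + 1):
    print(m, delta_after_checks(gamma, censor_cfs, m))
print(delta_max(gamma, censor_cfs))
```

With no censor checked the value equals $\gamma_j$, and it grows towards $\delta_j^{\max}$ as more censors are checked, which matches sub-cases (b) to (d) of the first case.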
Finally, we define $ACC_j$ in terms of $\delta_j$ and $\delta_j^{\max}$:
The other issue considered here is that the bucket brigade should be
able to weaken wrong rules which affect good ones, so that their strengths
become very low. Such rules then no longer qualify to enter the conflict set,
and therefore they will not initiate any rule.
To solve this problem, every rule has a rule-status in addition to its
strength. The rule-status shows whether the rule is correct or incorrect. There
are five rule-status values: '#', 'C', 'NC', 'D', and 'X'. '#' means that it is
not yet known whether the rule is correct or incorrect (it is the initial status
of every rule), 'C' denotes that the rule is correct, 'NC' denotes that the rule
is incorrect, 'D' represents a doubtful rule (its status should be changed to 'C'
or 'NC'), and 'X' is the status of confirmed correct rules.
The proposed algorithm is very simple: it uses the same technique
for bid distribution as the standard bucket brigade algorithm. The only difference
is that rules with 'NC' status pay their bids to the rules that activated them but
do not get any payoff from the rules they activate (if a rule with 'NC' status is
a final node, it may still get a payoff from the environment if it produces a
correct conclusion). The other difference is that all the intermediate actions
are stored, because some rules may be activated by more than one condition
(needing more than one intermediate rule in order to fire), and these conditions
may not be satisfied at the same time (this is not true in classifier systems;
cf. step 5).
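The payment rule for 'NC' rules can be sketched as follows. The function name `pay_bid` is hypothetical. The text does not say what happens to the share that an 'NC' activator would have received; the sketch assumes that the bidder is still debited its full bid and that the withheld share is simply discarded.

```python
def pay_bid(strength, status, payer, activators, b=0.5):
    """Debit the payer's bid and share it equally among its activators,
    skipping activators whose rule-status is 'NC' (assumed: their share is lost)."""
    bid = b * strength[payer]
    strength[payer] -= bid
    if activators:
        share = bid / len(activators)
        for rule in activators:
            if status[rule] != "NC":
                strength[rule] += share
    return bid

# Illustrative state: rule 10 was activated by rules 5 and 6, and rule 5 is 'NC'.
strength = {5: 7.5, 6: 12.5, 10: 5.0}
status = {5: "NC", 6: "C", 10: "#"}
pay_bid(strength, status, payer=10, activators=[5, 6])
print(strength)
```

Because rule 5 keeps paying its own bids when it fires but no longer collects from rule 10, its strength can only decline each time it is used.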
3.3.1. Algorithm I
For every rule A that activates rule B, one of the following rules is
applicable:
s1. IF (B is not the final node)
    THEN IF ((status(B) = 'D' or status(B) = '#') AND
             (status(A) = 'D' or status(A) = 'X' or
              status(A) = 'NC' or status(A) = '#' or
              status(A) = 'C')) OR
            ((status(B) = 'NC') AND (status(A) = 'X' or
              status(A) = 'C' or status(A) = 'NC' or
              status(A) = 'D')) OR
            ((status(A) = 'NC' or status(A) = 'D') AND
              status(B) = 'C')
         THEN BEGIN
              status(A) = status(A)
              status(B) = status(B)
         END
s2. IF (B is the final node) AND (status(A) = 'C' or
        status(A) = 'X' or status(A) = 'D' or status(A) = 'NC')
    THEN
    BEGIN
      IF (status(B) = 'C' or status(B) = 'X')
      THEN
        BEGIN
          IF (status(A) = 'NC' or status(A) = 'D')
          THEN BEGIN
               status(B) = status(B)
               IF (correct conclusion)
               THEN status(A) = 'C'
               ELSE status(A) = 'NC'
          END
        END
      ELSE (* status(B) = '#' or 'NC' or 'D' *)
        IF (status(A) <> 'NC' and status(A) <> 'D')
        THEN BEGIN
             status(A) = status(A)
             IF (correct conclusion)
             THEN status(B) = 'C'
             ELSE status(B) = 'NC'
        END
        ELSE BEGIN (* status(A) = 'NC' or 'D' *)
             IF (correct conclusion)
             THEN BEGIN
                  status(A) = 'C'
                  status(B) = 'C'
             END
             ELSE BEGIN (* incorrect conclusion *)
                  status(A) = 'NC'
                  status(B) = 'NC'
             END
        END
    END
s3. IF (status(B) = 'X' or status(B) = 'C')
    THEN IF (status(A) = 'X' or status(A) = 'C')
         THEN BEGIN
              status(A) = 'X'
              status(B) = 'X'
         END
         ELSE IF (status(A) = '#')
              THEN BEGIN
                   status(A) = 'D'
                   status(B) = 'X'
              END
s4. IF (status(A) = '#') AND (B is the final node)
    THEN IF (status(B) = 'C' or status(B) = 'NC' or
             status(B) = 'D' or status(B) = 'X')
         THEN IF (correct conclusion)
              THEN BEGIN
                   status(A) = 'C'
                   IF (status(B) = 'D' or status(B) = 'NC')
                   THEN status(B) = 'C'
                   ELSE status(B) = status(B)
              END
              ELSE BEGIN
                   status(A) = 'NC'
                   IF (status(B) = 'D')
                   THEN status(B) = 'NC'
                   ELSE status(B) = status(B)
              END
         ELSE BEGIN
              status(A) = status(A) (* status(B) = '#' *)
              IF (correct conclusion)
              THEN status(B) = 'C'
              ELSE status(B) = 'NC'
         END
s5. IF (status(B) = 'X') AND (B is not the final node) AND
       (status(A) = 'D' or status(A) = 'NC')
    THEN
    BEGIN
      A1 = A
      WHILE (status(B) = 'X' and B is not the final node) DO
      BEGIN
        A = B
        B = rule activated by A
      END
      IF (B is the final node)
      THEN
        BEGIN
          IF (status(B) = 'X')
          THEN IF (correct conclusion)
               THEN status(A1) = 'C'
               ELSE status(A1) = 'NC'
        END
      ELSE (* B is not the final node *)
        BEGIN
          A = B
          B = rule activated by A
        END
    END
s6. IF (status(B) = 'NC') AND (status(A) = '#') AND
       (B is not the final node)
    THEN status(A) = 'NC'
         status(B) = status(B)
Note: In case A activates more than one rule, i.e.,
A → B1, A → B2, ..., A → Bj, ..., A → Bk,
then the above steps are applied to the rule A → Bj such
that status(Bj) is the highest in the order 'X', 'C', 'D', 'NC', or '#'.
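The note and two of the simpler steps translate almost directly into code. The sketch below is illustrative only: the function names are hypothetical, only steps s3 and s6 are transcribed, and ties between successors of equal status are broken in favour of the first one listed, which the text does not specify.

```python
STATUS_ORDER = ["X", "C", "D", "NC", "#"]   # highest priority first, as in the note

def pick_successor(successors, status):
    """Among the rules activated by A, choose the one with the highest status."""
    return min(successors, key=lambda rule: STATUS_ORDER.index(status[rule]))

def apply_s3(status, a, b):
    """Step s3: a correct B confirms a correct A, or makes an unknown A doubtful."""
    if status[b] in ("X", "C"):
        if status[a] in ("X", "C"):
            status[a] = status[b] = "X"
        elif status[a] == "#":
            status[a], status[b] = "D", "X"

def apply_s6(status, a, b, b_is_final):
    """Step s6: an unknown A that activates an incorrect, non-final B becomes 'NC'."""
    if status[b] == "NC" and status[a] == "#" and not b_is_final:
        status[a] = "NC"

status = {1: "#", 3: "#", 4: "C", 5: "NC"}   # illustrative rule-status values
b = pick_successor([3, 4, 5], status)
apply_s3(status, 1, b)
print(b, status)
```

In this illustrative run rule 4 is selected because 'C' outranks 'NC' and '#', and step s3 then marks rule 1 as doubtful and rule 4 as confirmed.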
To understand the process of the modified algorithm, we shall consider
Fig. 3.2, where the encircled numbers represent rule numbers, the values on top
of the nodes are the strengths of the corresponding rules, and arrows indicate
solution paths. We shall assume that there are three solution paths, path1: 1, 2,
3, and 4; path2: 6, 2, and 5; and path3: 7, 3, and 4, and that path1 produces a
wrong conclusion while path2 and path3 produce correct conclusions. This means
that rule 1 is an incorrect rule. First, we consider the standard bucket brigade
algorithm, where all the rules initially have a strength of 10. In the case of a
correct conclusion the environment reward is 10. Fig. 3.3 shows a sequence of
solution paths corresponding to different environment inputs and the modified
rule strengths under the standard bucket brigade algorithm.

Fig. 3.2. Sequence of possible solution paths. Encircled numbers denote the rule numbers and the numbers on top of the nodes represent the rule strength.
The same sequence of solution paths is also shown in Fig. 3.4, but this time
the modified bucket brigade algorithm is applied. Comparing the two figures
(Fig. 3.3 and Fig. 3.4), we notice that rule 1 (the incorrect rule) participates in
solution paths 1, 2, 4, 9, 10, 15, 16, and 17 and has almost the same strength
value in both figures in solution paths 1, 2, 4, 9, and 10. This is because
rule 1 under the modified version is not recognized as an incorrect rule ('NC'
rule-status) until solution path 10 is chosen. Recognition of rule 1 as a wrong rule
in solution path 10 causes a continuous decrease (solution paths 15, 16, and
17) in its strength, from 10.781 in solution path 10 to 1.349 in
solution path 17. In contrast, under the standard bucket brigade algorithm
(Fig. 3.3), the strength of rule 1 fluctuates only slightly after solution path 10
and remains high compared to the other nodes. It is
also to be noticed that the strength of the incorrect rule 1 in solution path
17 in Fig. 3.3 is much higher than the strength of rule 4, which is a correct rule,
whereas for the same solution path in Fig. 3.4 the strength of rule 1 is less than
the strength of rule 4. The above discussion implies that, under the
standard bucket brigade algorithm, weakening rule 1 amounts to weakening
rule 4, and strengthening rule 4 also strengthens rule 1. On the
contrary, under the modified algorithm, after a certain stage is reached, the
incorrect rule 1 is weakened whenever it is used, and strengthening rule 4 does not
strengthen rule 1. This leads to a decline in the strength of rule 1, and
eventually its strength falls below the threshold value, thereby preventing
such incorrect rules from participating in solution paths.

Fig. 3.3. Sequence of solution paths and modification of rule strengths according to the standard bucket brigade algorithm. Encircled numbers denote the rule numbers. On top of each node the rule strength is shown.
Fig. 3.5 shows the performance of the modified bucket brigade algorithm
using the same set of rules and the cases case(a), case(b), and case(c) of the
environment inputs as given in Section 3.2. A comparison of Fig. 3.1 and Fig. 3.5
shows that during the first two cases, case(a) and case(b), the strengths of the
rules change in the same manner. That is, until the end of these two cases, the
modified algorithm produces the same results as the standard bucket
brigade. The reason is that rule 5, which is an incorrect rule, has not yet been
recognized as a wrong rule. Under case(c) it is observed that the modified
algorithm outperforms the standard bucket brigade algorithm as soon as rule 5 is
recognized as a wrong rule. Consequently, the strength of rule 5 is decremented
to 12, whereas for the same rule under the standard bucket brigade algorithm the
strength is 24.656. Recognition of rule 5 as a wrong rule causes a continuous
decrease in its strength whenever it is used, and eventually it will not be able to
enter the conflict set. However, the incorrect rule 9, which is a final node, gets
weakened in both algorithms and has the same strength value (1.25).

Fig. 3.4. Sequence of solution paths showing the performance of the modified bucket brigade algorithm. With each solution path the steps of Algorithm I used are indicated, e.g., (s1, s2). Encircled numbers denote the rule numbers. On top of each node the rule strength and rule-status are shown.

Fig. 3.5. The chosen solution paths and the modifications of rule strengths under the proposed algorithm according to inputs (a) Is-holiday, Good-climate, and John-like-swimming; (b) Is-holiday, Good-climate, and John-like-skiing; and (c) sequence of inputs (a), (b), and (a). The steps used under the modified algorithm are shown below each figure.
3.4. Modified Bucket Brigade For HCPRs System
In this section, we consider the extension of the proposed algorithm to
HCPRs systems. The extension is based on the following modifications:
1) The GCS [3] provides a trade-off between certainty factor and specificity,
where in a HCPR-tree a less specific node at a higher level has a higher
certainty and a more specific node at a lower level has a lower certainty.
According to our proposed bid formula for CPRs systems, the certainty
factor of a rule is included in computing its accuracy. This would mean
that in a HCPRs system, a HCPR at a higher level with lower specificity may
always pay a bid which is more than or equal to the bid paid by a lower
node with higher specificity. Therefore, the bid formula as such would not
provide a fair bid payment scheme for HCPRs systems. To resolve this
problem, rules compete according to the same suggested bid formula,
but accuracy (ACC) is removed from the formula when the bid is paid.
2) Whenever an intermediate node in a given HCPR-tree is matched but its
strength is less than the threshold value required to enter the conflict set
(a weak rule), the rule is not fired, while the computation for the lower nodes in
the hierarchy is continued. This process facilitates the firing of the
next rule in the hierarchy and avoids the firing of the weak rule. The rule fired
after the weak rule in the hierarchy will not pay its bid to the weak rule but to
the rule above the weak rule in the hierarchy.
It is to be noticed in this regard that if the root node of any
HCPR-tree itself has a strength less than the threshold value, the
rule corresponding to the root will not be fired; subsequently the
HCPR-tree is considered to be wrong, and no inference is allowed through
this tree.
3) When the conclusions are reached (no more rules can be fired), the
environment checks the output of all the nodes fired in the last HCPR-tree
activated and their status is modified accordingly. The last HCPR fired
receives the environment reward in case it produces a correct
conclusion.
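The weak-rule modification amounts to walking up the HCPR-tree until a sufficiently strong ancestor is found. The sketch below is illustrative only: the function name, the parent map, and the threshold value are hypothetical, and the 6, 8, 9 chain is borrowed from the HCPR-tree 2 example, where rule 9 pays its bid to rule 6 rather than to rule 8.

```python
def bid_receiver(node, parent, strength, threshold):
    """Return the nearest ancestor of `node` whose strength reaches the threshold.

    Weak ancestors are skipped. None is returned when no ancestor qualifies,
    which includes the case of a weak root (the whole tree is then unusable).
    """
    ancestor = parent.get(node)
    while ancestor is not None and strength[ancestor] < threshold:
        ancestor = parent.get(ancestor)
    return ancestor

parent = {6: None, 8: 6, 9: 8}         # rule 6 is the root, 8 its child, 9 below 8
strength = {6: 10.0, 8: 1.0, 9: 10.0}  # rule 8 has become weak (illustrative values)
print(bid_receiver(9, parent, strength, threshold=2.0))
```

Skipping rule 8 in this way lets the more specific rule 9 continue to fire and be credited, while the weak rule 8 receives nothing.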
In order to understand the various steps involved in the algorithm, we
consider the four HCPR-trees given in Fig. 3.6. Let us assume that the initial
strength of all the rules is 10, the environment reward is 20, and the value of b
is 0.5. Initially all rules have '#' status. In the following, we
assume that the system receives environment inputs in the order (a) to (e) as
given in Fig. 3.7.
case (a): Environment input (a)
John-live-in-Delhi, New-day, Time-6AM, John-interested-in-ball-games,
John-interested-in-football
Fig. 3.6. Four HCPR-trees to decide what X is doing. Nodes are listed as rule number, conclusion, and [condition]:
HCPR-tree 1
 1. Is-in-city(X,Y) [Live-in-city(X,Y)]
 2. Is-at-home(X) [Time-night]
 3. Is-outside-home(X) [Time-day]
 4. Is-working-outdoor(X) [Day-working]
 5. Is-entertaining-outdoor(X) [Day-sunday]
HCPR-tree 2
 6. Is-doing-exercise [Is-outside-home(X), Time-early-morning]
 7. Is-jogging [Interested-in-athletics]
 8. Is-entertaining-outdoor(X) [Interested-in-ball-games]
 9. Is-playing-football [Interested-in-foot-ball]
10. Is-playing-hand-games [Interested-in-hand-ball-games]
11. Is-playing-basket-ball [In-basket-ball-field]
12. Is-playing-volley-ball [In-volley-ball-field]
13. Is-playing-hand-ball [In-hand-ball-field]
HCPR-tree 3
14. Is-in-social-activities(X) [Is-entertaining-outdoor]
15. Is-in-club(X)/Is-visiting-friends(X) [Time-after-noon]
16. Is-in-movie(X) [Time-evening]
17. Is-in-club(X) [3pm-4pm]
18. Is-visiting-friends(X) [4pm-5pm]
HCPR-tree 4
19. Time-day/Time-night [new-day]
20. Time-day [4:30am-7pm]
21. Time-night [> 7pm < 12pm]
22. Time-early-morning [5pm-7am]
23. Time-after-noon [12am <= 5pm]

Fig. 3.7. Application of the modified algorithm on the HCPRs in Fig. 3.6. The changes in the strength values and rule-status are shown for the environment inputs (a) John-live-in-Delhi, New-day, Time-6AM, John-interested-in-ball-games, and John-interested-in-football; (b) John-live-in-Delhi, New-day, Time-6PM, and Day-sunday; (c) the input as in (a); (d) the input as in (b); (e) several trials of inputs as in (a) or (b) and finally input as in (a). (The dotted arrow denotes that rule 9 pays its bid to rule 6 and not to rule 8.)
Rules 19, 20, and 23 are matched from HCPR-tree 4 and rule 1 is matched
from HCPR-tree 1. Rule 20 cannot fire unless rule 19 has fired, and similarly rule
23 cannot fire unless rule 20 has fired (because of the hierarchy). Rules 19 and 1
enter the conflict set, each making a bid of 5, and both are subsequently fired,
decrementing their strengths to 5 (10 - 5). Rule 20 from HCPR-tree 4 is activated
by rule 19 and pays out a bid of 5 to rule 19. The strength of rule 20 is
decremented to 5 (10 - 5) and the strength of rule 19 becomes 10 (5 + 5). At this
stage, rule 23 from HCPR-tree 4 gets initiated by rule 20 of the same HCPR-tree (the
generality of rule 23), and rule 3 of HCPR-tree 1 gets initiated by its generality,
rule 1, and by rule 20 of HCPR-tree 4. Rules 3 and 23 enter the conflict set, each
making a bid of 5. Both rules are fired and rule 23 pays out its bid to rule 20. This
results in a strength of 10 (5 + 5) for rule 20 and 5 (10 - 5) for rule 23. Rule 3
pays out 2.5 each to rule 20 and rule 1, so rule 3 gets its strength decremented
to 5 (10 - 5), the strength of rule 20 is incremented to 12.5 (10 + 2.5), and the
strength of rule 1 is incremented to 7.5 (5 + 2.5). Rule 6 gets initiated by rules
3 and 23; it pays out its bid of 5 to these rules (2.5 each), reducing its strength
to 5 (10 - 5) and increasing the strength of both rules 3 and 23 to 7.5 (5 + 2.5).
Rule 8 is then initiated by rule 6 and fires, paying a bid of 5 to rule 6, which
changes its own strength to 5 (10 - 5) and increases the strength of rule 6 to 10
(5 + 5). Firing of rule 8 initiates rule 9 (rule 8's specificity) and rule 14 from
HCPR-tree 3. Both rules 9 and 14 are fired, each paying a bid of 5 to rule 8;
thereby the strength of rule 8 becomes 15 (5 + 5 + 5) and the strengths of rules 14
and 9 become 5 each. Since no more rules
can be fired, the last rules fired are rule 14 of HCPR-tree 3 and rule 9 of
HCPR-tree 2. Considering these two trees as the conclusive HCPR-trees, the
results corresponding to all the nodes fired within these two trees are checked
against the expected output. Rules 6 and 9 of HCPR-tree 2 produce correct
conclusions and therefore get their status changed to 'C', whereas rule 8
gets its status changed to 'NC' since it produces a wrong conclusion (wrong
specificity). Rule 14 of HCPR-tree 3 gets a status of 'NC' because it produces
an incorrect conclusion. Rule 9 gets a reward from the environment,
changing its strength to 25 (5 + 20). This process is illustrated
in Fig. 3.7(a).
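The strength bookkeeping of case (a) can be replayed with a short Python sketch. It is only an illustration under stated assumptions: the names `BID`, `REWARD`, `strength`, and `fire` are hypothetical, every rule starts at strength 10 and bids a flat 5 as in the walkthrough, a bid is split equally among the rules that activated the bidder, and rules matched directly by the environment (19 and 1) pay their bid to no other rule.

```python
# Hypothetical sketch: replays the bid payments described for case (a).
BID = 5.0      # bid made by every firing rule in the walkthrough
REWARD = 20.0  # environment reward received by rule 9

# All rules involved start with strength 10, as in the walkthrough.
strength = {r: 10.0 for r in (1, 3, 6, 8, 9, 14, 19, 20, 23)}

def fire(rule, activators):
    """Deduct the bid from `rule` and split it equally among its activators."""
    strength[rule] -= BID
    for a in activators:
        strength[a] += BID / len(activators)

# (rule, rules that activated it), in the firing order of case (a).
firing_order = [
    (19, []), (1, []),   # matched by the environment input only
    (20, [19]),
    (23, [20]),
    (3, [20, 1]),
    (6, [3, 23]),
    (8, [6]),
    (9, [8]),
    (14, [8]),
]

for rule, activators in firing_order:
    fire(rule, activators)

strength[9] += REWARD  # rule 9 produced the rewarded conclusion

for rule in sorted(strength):
    print(rule, strength[rule])
```

Under these assumptions the final values agree with the text: rule 20 ends at 12.5, rules 1, 3, and 23 at 7.5, rule 8 at 15, and rule 9 at 25.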
case (b): Environment input (b)
John-live-in-Delhi, New-day, Time-6PM, Day-sunday
This time rules 19, 20, and 22 are fired from HCPR-tree 4 and rules 1 and
3 are fired from HCPR-tree 1, by the same process as explained in case (a). Firing
of rules 22 and 3 results in the firing of rule 5 of HCPR-tree 1, and
subsequently rule 14 of HCPR-tree 3 is fired. Finally, rule 16 of the same
HCPR-tree is fired. After rule 16 has fired, no further inference is possible.
Checking the last HCPR-tree fired (HCPR-tree 3), rule 14 produces a correct
conclusion, hence its status is changed from 'NC' to 'C'; rule 16 also produces
a correct conclusion, changing its status from '#' to 'C'. This process is shown in
Fig. 3.7(b).
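The status changes in cases (a) and (b) can be sketched as follows. This is a minimal illustration, not the thesis's implementation: the function `update_status` and the dictionary layout are hypothetical, '#' is taken to be the status of a rule that has not been checked yet, and the status is assumed to be simply overwritten by the latest check, which is what the change of rule 14 from 'NC' to 'C' suggests.

```python
# Hypothetical sketch of the rule-status update after each trial.
status = {6: "#", 8: "#", 9: "#", 14: "#", 16: "#"}  # '#': not yet checked (assumed)

def update_status(status, fired, correct):
    """Mark every fired rule of the conclusive trees as 'C' or 'NC'."""
    for rule in fired:
        status[rule] = "C" if rule in correct else "NC"

# Case (a): rules 6, 8, 9 (HCPR-tree 2) and 14 (HCPR-tree 3) are checked.
update_status(status, fired=[6, 8, 9, 14], correct={6, 9})
after_a = dict(status)

# Case (b): rules 14 and 16 (HCPR-tree 3) are checked.
update_status(status, fired=[14, 16], correct={14, 16})
after_b = dict(status)

print(after_a)
print(after_b)
```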
case (c): Environment input is the same as input (a).
The whole process, along with the strength and status modifications, is shown
in Fig. 3.7(c).
case (d): Environment input is the same as input (b).
Fig. 3.7(d) shows the latest modifications of strengths and rule-status.
case (e): After several trials of input (a) or input (b) are applied.
The strength of rule 8 decreases because its status is 'NC'
and it does not gain any advantage from the strengthening of rule 14 or rule 9.
After many trials the strength of rule 8 falls below the threshold
necessary for entering the conflict set. The strategy is that the conditions of rule
8 are matched without its action being taken, permitting rule 9, which is correct,
to fire; rule 14 is not fired since it depends on the action of rule 8. Note
that rule 9 pays its bid to rule 6 and not to rule 8 (rule 8 is not fired).
Fig. 3.7(e) shows the sequence of rules fired if, at the end of several trials of
input (a) or input (b), the environment messages of case (a) are again given as input
(the dotted line in the figure denotes that rule 9 pays its bid to rule 6 and not to
rule 8).
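The redirection of rule 9's bid in case (e) can be illustrated with a small Python sketch. The threshold value, the `activator` map, and the function `payee` are assumptions introduced for illustration only; the sketch encodes just the chain 6 -> 8 -> 9 from the walkthrough and the idea that a rule below the threshold is matched but not fired, so the bid goes to the nearest rule upstream that did fire.

```python
# Hypothetical sketch: who receives rule 9's bid once rule 8 is too weak to fire.
THRESHOLD = 2.0  # assumed value; the text does not state the threshold here

# Activation chain from the walkthrough: rule 6 initiates 8, rule 8 initiates 9.
activator = {9: 8, 8: 6}

def payee(rule, strength):
    """Walk up the chain, skipping rules below the threshold (matched, not fired)."""
    parent = activator.get(rule)
    while parent is not None and strength[parent] < THRESHOLD:
        parent = activator.get(parent)
    return parent

# Early trials: rule 8 is strong enough to fire, so it collects rule 9's bid.
early = payee(9, {6: 10.0, 8: 15.0, 9: 25.0})

# After many trials: rule 8 has fallen below the threshold, so rule 6 is paid.
late = payee(9, {6: 10.0, 8: 1.0, 9: 25.0})

print(early, late)
```

The strength values passed to `payee` in the second call are illustrative only; the text does not give rule 8's strength after the repeated trials.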