3. BUCKET BRIGADE ALGORITHM FOR HIERARCHICAL CENSORED PRODUCTION RULE-BASED SYSTEM

In this chapter, we study the suitability of the standard bucket brigade
algorithm for Variable Precision Logic systems and propose a modified version
of the bucket brigade algorithm. The modified version also incorporates a
solution to a problem created by the standard bucket brigade algorithm: the
weakening of correct rules that are activated by incorrect rules.

3.1. Bucket Brigade Algorithm: An Overview

Many learning systems face the problem of temporal credit allocation: the
proper reinforcement of activities that do not directly result in need
satisfaction or external reward but are nevertheless essential precursors to
such outcomes [Wilson]. Therefore, the development of an appropriate credit
apportionment scheme for adaptive rule-based systems is considered to be one
of the main issues in machine learning (e.g., [1], [2], [22], [33], [42], [43],
[46], [65], [66], [72], [74], [75], [78], [79], [80], [81], [83]). One
well-known algorithm in this regard is the bucket brigade algorithm of Holland
[42], [43], which is basically designed for message-passing, rule-based
classifier systems (the genetic algorithms approach). De Jong [19] remarked:
"I think it is fair to say that most people who encounter classifier systems
after becoming familiar with the traditional genetic algorithms literature are
somewhat surprised at the emergence of the rather elaborate (bucket brigade)
mechanism to deal with apportionment of credit issues".

In certain other AI systems that learn to perform multi-step tasks, such as
the LEX system [58] for symbolic integration and the ACT* cognitive
architecture [1], credit is assigned to early steps by keeping and analysing a
record of all pre-payoff actions, both considered and taken, and the
associated reasoning. It is not possible to keep track of all paths of
activation actually followed by the rule chains (a rule chain is a set of
rules activated in sequence, starting with a rule activated by environmental
messages and ending with a rule performing an action on the environment),
because the number of these paths grows exponentially [22]. The bucket brigade
technique therefore does not depend on retrospective analysis but operates
locally, during performance, in the strength transactions between steps, with
the better classifiers at each step being selected statistically over time.
The bucket brigade implements an "economy" in which "information" flows
forward with inference and "strength" flows backward in a "trickle down"
distribution pattern. The bucket brigade principle would therefore appear
appropriate for systems, such as animals and autonomous robots in ongoing
interaction with uncertain environments, where storage and analysis of raw
experience is expensive or impractical [82].

Many attempts have been made to study the bucket brigade algorithm and to
overcome its limitations. Antonisse [2] explored the adaptation of the bucket
brigade to production systems, where he addressed the problem of unsupervised
credit assignment. He concluded that the bucket brigade is an inexpensive and
stable method that appears to be convergent. However, according to Antonisse,
the main limitations of the bucket brigade are that it does not give absolute
measures of performance for rules in the system and that the propagation of
credit along paths of inference is proportional to the lengths of these paths,
rather than constant. Huang [46] observed that the usefulness of the action
conducted by a rule is a function of the context in which the rule activates,
and that the scalar-valued strengths in bucket brigade algorithms are not
appropriate approximators of the usefulness of rule actions. He therefore
proposed the context-array bucket brigade algorithm, in which more
sophisticated rule strengths, such as array-valued strengths, are used. To
handle the default hierarchy formation problem in the standard bucket brigade
algorithm (for a default hierarchy to work properly, classifiers that
implement exception rules must be able to control the system when they are
applicable, thus preventing the default rules from making mistakes), Riolo
[65] suggested a simple modification to the standard bucket brigade algorithm.
The modification involves biasing the calculation of the effective bid so that
general classifiers (those with low bid ratios) have much lower effective bids
than more specific classifiers (those with high bid ratios). Dorigo [23]
addressed the same problem and proposed an apportionment of credit algorithm,
called the message-based bucket brigade, in which messages instead of rules
are evaluated and rule quality is considered a function of the value of the
messages matching the rule conditions, of the rule specificity, and of the
value of the message the rule tries to post. Spiessens [72] showed that the
bucket brigade is not very efficient in providing accurate predictions of the
expected payoff from the environment and presented an alternative algorithm
for a classifier system called PCS. The strength of a classifier in PCS is a
reliability measure of the prediction of this classifier.

The problem that the bucket brigade does not adequately reinforce long action
sequences was discussed by Wilson [81], who suggested an alternative form of
the algorithm. The algorithm was designed to induce behavior hierarchies in
which the modularity of the hierarchy would keep all the bucket brigade chains
short, and thus more reinforceable and more rapidly learned, while overall
action sequences could still be long. The problem of long action chains was
first mentioned by Holland [85], who suggested a way to implement "bridging
classifiers" that speed up the flow of strength down a long sequence of
classifiers. Riolo [66] presented a comparison between the standard bucket
brigade algorithm and the "bridging classifiers" using the CFS-C/FSW1
classifier system [63], [64]. The results show that "bridging classifiers"
dramatically decrease the number of times a long sequence must be executed in
order to reallocate strength to all the classifiers in the sequence.

Westerdale [78] claims that the bucket brigade algorithm is a subgoal reward
scheme which may lead to inaccurate payoff, and he therefore suggested another
scheme, called the genetic scheme, in which the payoff is distributed
differently from the strategy used in the bucket brigade algorithm. Westerdale
[80] further examined the bucket brigade algorithm in a simple illustrative
production system (Railway). He employed a penalty scheme to produce an
interpretable bucket brigade reward scheme. The result is a reward scheme that
looks like a genetic scheme with altruism added.

Grefenstette [35] presented a preliminary study of the use of the rule-level
credit assignment provided by the bucket brigade in a system using the Pitt
approach to genetic learning. The system shows that the information concerning
the strength of individual rules can be useful in altering the physical
representation of the rule sets in order to promote the linkage of co-adapted
sets of rules.

To modify the bucket brigade algorithm to be suitable for pattern-matched
systems, it has to be noted that there are three main differences between
classifier systems and production systems: (1) knowledge representation,
(2) probabilistic control, and (3) parallel versus sequential orientation [2].

Following Antonisse [2], because of the difference in knowledge representation
between classifier systems (a classifier is represented over the set
{0, 1, *}) and pattern-matched production systems (symbolic representation),
the determination of rule accuracy is highly problematic in production systems
(and unproblematic in classifier systems).

Classifier systems apply classifiers probabilistically, in proportion to (an
exponent of) the amount of their bids. In a production system the branch with
the highest payoff is chosen, whereas classifier systems choose inference
branches with probability (exponentially) proportional to expected payoff.
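The contrast between the two control regimes can be illustrated with a small
Python sketch. It is only an illustration under stated assumptions: the
function names, the dictionary of bids, and the default exponent of 1.0 are
hypothetical and are not taken from any of the cited systems.

```python
import random


def pick_production_rule(bids):
    """Deterministic control: choose the candidate with the highest bid."""
    return max(bids, key=bids.get)


def pick_classifier(bids, exponent=1.0, rng=random):
    """Probabilistic control: choose a candidate with probability
    proportional to bid ** exponent (the exponent is an assumed parameter)."""
    rules = list(bids)
    weights = [bids[r] ** exponent for r in rules]
    return rng.choices(rules, weights=weights, k=1)[0]


bids = {"r1": 2.0, "r2": 5.0, "r3": 3.0}
best = pick_production_rule(bids)   # always the highest bidder
sampled = pick_classifier(bids)     # any of r1, r2, r3, weighted by bid
```

With a larger exponent the probabilistic choice concentrates on the highest
bidder and approaches the deterministic one.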

A classifier system is a massively parallel system: it expects all matching
classifiers to fire but adjudicates among classifiers that post incompatible
results, and hence it resolves the problem of conflict resolution. In
contrast, most production systems (e.g., OPS-like systems) support parallelism
in the match phase but expect a single rule firing following conflict
resolution. It is left to the system builder to determine the retractions that
remove productions from the conflict set without firing.

3.2. Motivation

Under the standard bucket brigade algorithm, correct rules initiated by
incorrect rules get weakened. To make this problem clear, let us consider the
set of CPRs given below (we shall neglect the UNLESS part for the time being).
A full match of the preconditions is necessary for any rule to enter the
conflict set. It is further assumed that an equal strength of 10 is given to
every rule in the set. The value of b is 0.5 and the environment reward is 40.

Rule 1.   IF [Is-holiday]
          THEN [Sunday]
          UNLESS [National-day, Religious-day]

Rule 2.   IF [Good-climate]
          THEN [Moderate-weather]
          UNLESS [Is-raining, Is-very-hot]

Rule 3.   IF [John-like-swimming]
          THEN [John-like-the-sea]
          UNLESS [John-fear-waves, John-hate-sea-water]

Rule 4.   IF [John-like-swimming]
          THEN [John-like-swimming-pool]
          UNLESS [Empty-swimming-pool]

Rule 5.   IF [John-like-swimming]
          THEN [John-like-skiing]
          UNLESS [ ]

Rule 6.   IF [Sunday, Moderate-weather]
          THEN [John-interested-in-outing]
          UNLESS [John-is-ill, John-has-guest, John-watch-TV]

Rule 7.   IF [John-like-the-sea, John-interested-in-outing]
          THEN [John-may-go-to-the-sea]
          UNLESS [sea-is-far]

Rule 8.   IF [John-like-swimming-pool, John-interested-in-outing]
          THEN [John-may-go-to-swimming-pool]
          UNLESS [swimming-pool-is-far]

Rule 9.   IF [John-like-swimming-pool]
          THEN [John-may-go-for-mountaineering]
          UNLESS [ ]

Rule 10.  IF [John-like-skiing, John-interested-in-outing]
          THEN [John-may-go-to-snow-hills]
          UNLESS [ ]

In the following explanation, we assume that the system receives environment
inputs in the order (a) to (c) as given in Fig. 3.1.

case(a): Environment input(a).

Is-holiday, Good-climate, John-like-swimming

Rules 1, 2, 3, 4, and 5 enter the conflict set, each making a bid of 5
(10 * 0.5), and all of them are fired. Each of these rules pays the amount of
its bid out of its capital, which changes its strength from 10 to 5 (10 - 5).
The firing of rules 1 and 2 initiates rule 6, whereas the firing of rule 4
initiates rule 9. Rules 6 and 9 are now in the conflict set, each making a bid
of 5; both are fired and their strengths change to 5 (10 - 5). Accordingly,
rule 6 pays out 2.5 each to rules 1 and 2, changing their strengths to 7.5
(5 + 2.5). Rule 4 gets the bid paid by rule 9 and its strength changes to 10
(5 + 5). The strength of rule 9 remains 5 since it does not initiate any other
rule. The firing of rules 6 and 3 initiates rule 7, the firing of rules 6 and
4 initiates rule 8, and the firing of rules 6 and 5 initiates rule 10.
Consequently, rules 7, 8, and 10 enter the conflict set, each making a bid of
5, which changes their strengths to 5 (10 - 5). Rule 7 pays out 2.5 each to
rules 6 and 3, whose strengths thereby become 7.5 (5 + 2.5). Similarly, rule 8
pays out 2.5 each to rules 6 and 4, modifying their strengths to 10
(7.5 + 2.5) and 12.5 (10 + 2.5) respectively. Rule 6 again gets a payoff from
rule 10 when the latter pays out 2.5 each to rules 5 and 6, changing their
strengths to 7.5 (5 + 2.5) and 12.5 (10 + 2.5) respectively. Now no more
inference is possible, and the results achieved by rules 7, 8, 9, and 10 are
sent to the environment for judgment. Rules 7 and 8 produce correct
conclusions and rules 9 and 10 produce wrong conclusions (according to the
environment input). The environment rewards rules 7 and 8 by giving 20 (40/2)
to each, which changes the strength of each of these rules to 25 (5 + 20),
while the strengths of rules 9 and 10 do not change (they remain 5 each).

Fig. 3.1. The chosen solution paths and the modifications of rule strengths
under the standard bucket brigade algorithm according to environment inputs
(a) Is-holiday, Good-climate, and John-like-swimming; (b) Is-holiday,
Good-climate, and John-like-skiing; and (c) the sequence of inputs (a), (b),
and (a).

At the end of this stage the strengths of the rules are as shown in
Fig. 3.1(a) (in this figure and in the figures that follow, the encircled
numbers represent rule numbers, while the numbers on top of the circles
indicate the strengths of the nodes) and as listed below:

Rule 1  : 7.5
Rule 2  : 7.5
Rule 3  : 7.5
Rule 4  : 12.5
Rule 5  : 7.5
Rule 6  : 12.5
Rule 7  : 25
Rule 8  : 25
Rule 9  : 5
Rule 10 : 5

case(b): Environment input(b)

Is-holiday, Good-climate, John-like-skiing

Rules 1 and 2 enter the conflict set and both fire, each paying a bid of 3.75,
which changes their strengths to 3.75 (7.5 - 3.75). Rule 6 is initiated by
rules 1 and 2 and is fired, paying a bid of 6.25; this changes its strength to
6.25 (12.5 - 6.25) and the strength of each of rules 1 and 2 to 6.875
(3.75 + 6.25/2). Rule 6 subsequently initiates rule 10, which fires and pays
out a bid of 2.5 to rule 6. Thus the strength of rule 6 becomes 8.75
(6.25 + 2.5), whereas the strength of rule 10 is decremented to 2.5 (5 - 2.5).
Since no more rules can be fired, the action of rule 10 is considered to be
the final conclusion. After comparing this conclusion with the expected
(correct) one, the environment pays its reward to rule 10, changing its
strength to 42.5 (2.5 + 40). As shown in Fig. 3.1(b), at the end of this stage
the strengths of the rules are as follows:

Rule 1  : 6.875
Rule 2  : 6.875
Rule 3  : 7.5
Rule 4  : 12.5
Rule 5  : 7.5
Rule 6  : 8.75
Rule 7  : 25
Rule 8  : 25
Rule 9  : 5
Rule 10 : 42.5
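The arithmetic of case(a) and case(b) can be checked with a short simulation.
The following Python sketch is an illustration only: the function name
run_case, the bookkeeping that maps each fact to the rule that produced it,
and the equal split of a bid among the rules that supplied a rule's
preconditions are assumptions chosen to match the worked figures above, not
the implementation used in this work. The UNLESS parts are ignored, as in the
text.

```python
B = 0.5          # bid ratio b used in the text
REWARD = 40.0    # environment reward used in the text

# rule number -> (preconditions, conclusion); UNLESS parts are ignored here
RULES = {
    1: (("Is-holiday",), "Sunday"),
    2: (("Good-climate",), "Moderate-weather"),
    3: (("John-like-swimming",), "John-like-the-sea"),
    4: (("John-like-swimming",), "John-like-swimming-pool"),
    5: (("John-like-swimming",), "John-like-skiing"),
    6: (("Sunday", "Moderate-weather"), "John-interested-in-outing"),
    7: (("John-like-the-sea", "John-interested-in-outing"),
        "John-may-go-to-the-sea"),
    8: (("John-like-swimming-pool", "John-interested-in-outing"),
        "John-may-go-to-swimming-pool"),
    9: (("John-like-swimming-pool",), "John-may-go-for-mountaineering"),
    10: (("John-like-skiing", "John-interested-in-outing"),
         "John-may-go-to-snow-hills"),
}


def run_case(strength, inputs, correct):
    """Apply one environment input and return the updated strengths."""
    s = dict(strength)
    # fact -> rule that produced it (None means the environment)
    source = {fact: None for fact in inputs}
    fired = []
    while True:
        ready = [r for r, (conds, _) in RULES.items()
                 if r not in fired and all(c in source for c in conds)]
        if not ready:
            break
        produced = {}
        for r in ready:
            conds, conclusion = RULES[r]
            bid = B * s[r]
            s[r] -= bid                       # the firing rule pays its bid
            payees = [source[c] for c in conds if source[c] is not None]
            for p in payees:                  # assumed: equal split
                s[p] += bid / len(payees)
            produced[conclusion] = r
            fired.append(r)
        source.update(produced)
    winners = [r for r in fired if RULES[r][1] in correct]
    for r in winners:                         # reward shared by correct rules
        s[r] += REWARD / len(winners)
    return s


start = {r: 10.0 for r in RULES}
after_a = run_case(start,
                   ["Is-holiday", "Good-climate", "John-like-swimming"],
                   {"John-may-go-to-the-sea", "John-may-go-to-swimming-pool"})
after_b = run_case(after_a,
                   ["Is-holiday", "Good-climate", "John-like-skiing"],
                   {"John-may-go-to-snow-hills"})
```

Under these assumptions the sketch yields the strengths listed for case(a) and
case(b). It does not attempt to reproduce case(c).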

case(c): Environment inputs as input(a), then input(b), followed again by
input(a).

Fig. 3.1(c) shows the result of applying this sequence of environment
messages. The strengths of the rules at the end of this stage will be as
follows:

Rule 1 : 19.375

Rule 2 : 19.375

Rule 3 : 13.125

Rule 4 : 12.5

Rule 5 : 24.656

Rule 6 : 42.626

Rule 7 : 36.25

Rule 8 : 36.25

Rule 9 : 1.25

Rule 10 : 25.312

From the above discussion, it is clear that rule 10 is a good rule: it proved
able to produce a correct conclusion (i.e., in case(b)) even though it
produced a wrong conclusion in case(a). The wrong conclusion arose because
rule 10 was initiated by a wrong rule (rule 5). This made rule 10 lose its bid
without gaining any reward from the environment, and yet rule 5, which is
wrong, is not easily weakened. That is, when case(a) is tried, rule 10 is
weakened and rule 5 is strengthened (because it gets a payoff from rule 10
which may be higher than what it pays out). To weaken rule 5, the messages of
case(a) could be tried many times, but this solution is still not acceptable,
because rule 10 would also be weakened and, in addition, the environment
messages should be chosen at random.

In the following sections, we propose a modified version of the bucket brigade
algorithm that overcomes the above-mentioned problem and is suitable for CPRs
as well as HCPRs systems.

3.3. Modified Bucket Brigade For CPRs System

In the standard bucket brigade, the bid of a rule is the product of its
strength, its accuracy, and a moderating factor which is less than one:
$B_j = S_j \cdot ACC_j \cdot b$. In classifier system terms, $ACC_j$ is the
number of non-don't-care symbols divided by the length of the condition
string. For CPRs, we use the same bid formula but with a different way of
computing $ACC_j$. To enter the conflict set, any CPR should be matched
completely (preconditions), and the censor conditions that are checked should
be false. The number of censor conditions to be checked can be found by GCS
(implicitly).
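For the classifier-system form of the bid formula, the accuracy term can be
computed directly from the condition string. The sketch below is a minimal
illustration: the function names are hypothetical, the don't-care symbol is
taken to be '*' (the alphabet {0, 1, *} mentioned earlier in this chapter),
and a non-empty condition string is assumed.

```python
def accuracy(condition, dont_care="*"):
    """Number of non-don't-care symbols divided by the condition length."""
    specific = sum(1 for ch in condition if ch != dont_care)
    return specific / len(condition)


def bid(strength, acc, b):
    """Bid B_j = S_j * ACC_j * b, with b a moderating factor below one."""
    return strength * acc * b


acc = accuracy("10*1")          # three of four positions are specific
example_bid = bid(10.0, acc, 0.5)
```

A fully general condition such as "****" has accuracy 0 and therefore bids
nothing, while a fully specific condition bids the whole of $S_j \cdot b$.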

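To compute the value of $ACC_j$, we consider the following four cases
($\delta\mathrm{max}_j$ below means that all the existing censor conditions
are checked).

1) $P \rightarrow D \lfloor C \vee UNK$          $\gamma_j \le \delta_j \le \delta\mathrm{max}_j < 1$
2) $P \rightarrow D \lfloor C$ ($UNK = \phi$)    $\gamma_j \le \delta_j \le \delta\mathrm{max}_j = 1$
3) $P \rightarrow D \lfloor UNK$                 $\gamma_j = \delta_j = \delta\mathrm{max}_j < 1$
4) $P \rightarrow D$                             $\gamma_j = \delta_j = \delta\mathrm{max}_j = 1$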

The first case may be analysed as follows:

a) Some of the censor conditions are unknown, giving rise to
   $\delta\mathrm{max}_j < 1$.
b) None of the censor conditions is checked (depending on system constraints
   and user requirements), giving rise to $\gamma_j = \delta_j$.
c) Some of the censor conditions are checked, giving rise to
   $\gamma_j < \delta_j < \delta\mathrm{max}_j$.
d) All the censor conditions are checked, giving rise to
   $\gamma_j < \delta_j = \delta\mathrm{max}_j < 1$.

The other cases can be analysed in the same way as the first. In the second
case all the censor conditions are known and there are no further exceptions.
In the third case none of the exceptional cases is known, while in the fourth
case the rule is perfect and there are no exceptions.

Based on the likelihood of censor conditions, the censor conditions may

be put in two forms:

1) Censor conditions in all the rules have equal likelihood.

2) Censor conditions have unequal likelihood.

According to GCS, all the rules are considered to be at level 0 (no
specificities). Thus the certainty factor to be achieved for every rule is the
same, but the number of censor conditions to be checked still depends on the
value of $\gamma$. In the case of unequal likelihood, the censor conditions
may be arranged in decreasing order of likelihood.

From the previous discussion, we can conclude that $\delta\mathrm{max}_j$ is
computed as follows:

$$\delta\mathrm{max}_j = \gamma_j + \sum_{k=1}^{n} CF(C_k)$$

where $n$ is the number of censor conditions attached to the $j$th rule and
$CF(C_k)$ is the certainty factor associated with the $k$th censor condition.
The value of $\delta_j$ can then be computed using the following formula:

$$\delta_j = \gamma_j + \sum_{d=1}^{m} CF(C_d)$$

where $m$ is the number of censor conditions checked in rule $j$, and
$CF(C_d)$ is the certainty factor associated with the $d$th censor condition
$C_d$ checked.
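The two sums can be sketched in a few lines of Python. This is an illustration
under assumptions: the function names are hypothetical, and "checking the
first m censor conditions in decreasing order of likelihood" is modelled by
sorting on the certainty factor itself, which the text does not state
explicitly.

```python
def delta_max(gamma, censor_cfs):
    """gamma_j plus the certainty factors of all n censor conditions."""
    return gamma + sum(censor_cfs)


def delta(gamma, censor_cfs, m):
    """gamma_j plus the certainty factors of the m censor conditions checked.

    Assumption: conditions are checked in decreasing order of certainty
    factor, standing in for decreasing order of likelihood.
    """
    ordered = sorted(censor_cfs, reverse=True)
    return gamma + sum(ordered[:m])


gamma_j = 0.5
cfs = [0.125, 0.25]                 # hypothetical CF(C_1), CF(C_2)
d_none = delta(gamma_j, cfs, 0)     # no censor checked: equals gamma_j
d_one = delta(gamma_j, cfs, 1)      # most likely censor checked
d_all = delta(gamma_j, cfs, 2)      # all censors checked: equals delta_max
```

The three values reproduce sub-cases (b), (c), and (d) of the first case:
$\gamma_j = \delta_j$, then $\gamma_j < \delta_j < \delta\mathrm{max}_j$, then
$\delta_j = \delta\mathrm{max}_j < 1$.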

Finally, we define $ACC_j$ in terms of $\delta_j$ and $\delta\mathrm{max}_j$:

The other issue considered here is that the bucket brigade should be able to
weaken wrong rules which affect good ones, so that their strengths become low
enough to prevent such rules from qualifying to enter the conflict set; they
will then not initiate any rule.

To solve this problem, every rule has a rule-status in addition to its
strength. The rule-status shows whether the rule is correct or incorrect.
There are five cases for the rule-status: '#', 'C', 'NC', 'D', and 'X'. '#'
means that it is not yet known whether the rule is correct or incorrect (it is
the initial status of every rule), 'C' denotes that the rule is correct, 'NC'
says that the rule is incorrect, 'D' represents a doubtful rule (its status
should later be changed to 'C' or 'NC'), and 'X' is the status of confirmed
correct rules.

The proposed algorithm is very simple in that it uses the same technique for
bid distribution as the standard bucket brigade algorithm. The first
difference is that rules with 'NC' status pay their bids to the rules that
activated them but do not get any payoff from the rules they activate (if a
rule with 'NC' status is a final node, it may get a payoff from the
environment if it produces a correct conclusion). The other difference is that
all the intermediate actions are stored, because some rules may be activated
by more than one condition (i.e., they need more than one intermediate rule to
fire), and these conditions may not be satisfied at the same time (this is not
the case in classifier systems; see step 5).
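The effect of the 'NC' status on bid distribution can be sketched as follows.
This is a hedged illustration, not the implementation used in this work: the
function name pay_bid is hypothetical, the bid is split equally among the
activating rules, and the share withheld from an 'NC' rule is simply discarded
(the text does not say where that share goes).

```python
def pay_bid(strength, status, payer, activators, b=0.5):
    """The payer pays its bid; activators with 'NC' status receive nothing."""
    bid = b * strength[payer]
    strength[payer] -= bid            # every rule, 'NC' or not, pays its bid
    if activators:
        share = bid / len(activators)
        for rule in activators:
            if status[rule] != "NC":  # assumed: withheld share is discarded
                strength[rule] += share
    return bid


strength = {1: 10.0, 2: 10.0, 3: 10.0}
status = {1: "NC", 2: "C", 3: "#"}
paid = pay_bid(strength, status, payer=3, activators=[1, 2])
```

Here rule 3 loses its bid of 5, rule 2 gains 2.5, and rule 1, being 'NC',
gains nothing, so repeated use can only lower the strength of rule 1.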

3.3.1. Algorithm I

For every rule A that activates rule B, one of the following rules is
applicable:

s1. IF (B is not the final node)
    THEN IF (((status(B) = 'D' or status(B) = '#') AND
              (status(A) = 'D' or status(A) = 'X' or
               status(A) = 'NC' or status(A) = '#' or
               status(A) = 'C')) OR
             ((status(B) = 'NC') AND (status(A) = 'X' or
               status(A) = 'C' or status(A) = 'NC' or
               status(A) = 'D')) OR
             ((status(A) = 'NC' or status(A) = 'D') AND
               status(B) = 'C'))
         THEN BEGIN
                status(A) = status(A)
                status(B) = status(B)
              END

s2. IF (B is the final node) AND (status(A) = 'C' or
       status(A) = 'X' or status(A) = 'D' or status(A) = 'NC')
    THEN
    BEGIN
      IF (status(B) = 'C' or status(B) = 'X')
      THEN
      BEGIN
        IF (status(A) = 'NC' or status(A) = 'D')
        THEN BEGIN
               IF (correct conclusion)
               THEN status(A) = 'C'
               ELSE status(A) = 'NC'
             END
      END
      ELSE (* status(B) = '#' or 'NC' or 'D' *)
        IF (status(A) <> 'NC' AND status(A) <> 'D')
        THEN BEGIN
               status(A) = status(A)
               IF (correct conclusion)
               THEN status(B) = 'C'
               ELSE status(B) = 'NC'
             END
        ELSE BEGIN (* status(A) = 'NC' or 'D' *)
               IF (correct conclusion)
               THEN BEGIN
                      status(A) = 'C'
                      status(B) = 'C'
                    END
               ELSE BEGIN (* Incorrect conclusion *)
                      status(A) = 'NC'
                      status(B) = 'NC'
                    END
             END
    END

s3. IF (status(B) = 'X' or status(B) = 'C')
    THEN IF (status(A) = 'X' or status(A) = 'C')
         THEN BEGIN
                status(A) = 'X'
                status(B) = 'X'
              END
         ELSE IF (status(A) = '#')
              THEN BEGIN
                     status(A) = 'D'
                     status(B) = 'X'
                   END

s4. IF (status(A) = '#') AND (B is the final node)
    THEN IF (status(B) = 'C' or status(B) = 'NC' or
             status(B) = 'D' or status(B) = 'X')
         THEN IF (correct conclusion)
              THEN BEGIN
                     status(A) = 'C'
                     IF (status(B) = 'D' or status(B) = 'NC')
                     THEN status(B) = 'C'
                     ELSE status(B) = status(B)
                   END
              ELSE BEGIN
                     status(A) = 'NC'
                     IF (status(B) = 'D')
                     THEN status(B) = 'NC'
                     ELSE status(B) = status(B)
                   END
         ELSE BEGIN
                status(A) = status(A)  (* status(B) = '#' *)
                IF (correct conclusion)
                THEN status(B) = 'C'
                ELSE status(B) = 'NC'
              END

s5. IF (status(B) = 'X') AND (B is not the final node) AND
       (status(A) = 'D' or status(A) = 'NC')
    THEN
    BEGIN
      A1 = A
      WHILE (status(B) = 'X' & B is not the final node) DO
      BEGIN
        A = B
        B = rule activated by A
      END
      IF (B is the final node)
      THEN
      BEGIN
        IF (status(B) = 'X')
        THEN IF (correct conclusion)
             THEN status(A1) = 'C'
             ELSE status(A1) = 'NC'
      END
      ELSE (* B is not the final node *)
      BEGIN
        A = B
        B = rule activated by A
      END
    END

s6. IF (status(B) = 'NC') AND (status(A) = '#') AND
       (B is not the final node)
    THEN status(A) = 'NC'


Note: In case A activates more than one rule, i.e.,

    A --> B1, A --> B2, ..., A --> Bi, ..., A --> Bk,

then the above steps are applied to the rule A --> Bj such that status(Bj) is
the highest in the order 'X', 'C', 'D', 'NC', '#'.
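The selection rule in the note, together with the two shortest steps of
Algorithm I (s3 and s6), can be sketched in Python as follows. The function
names and the dictionary-based status table are hypothetical, and only these
two steps are covered; the sketch is not a full implementation of the
algorithm.

```python
STATUS_ORDER = ["X", "C", "D", "NC", "#"]   # highest priority first


def select_successor(successors, status):
    """Among the rules activated by A, pick the one with the highest status."""
    return min(successors, key=lambda r: STATUS_ORDER.index(status[r]))


def step_s3(status, a, b):
    """Step s3: B is 'X' or 'C'; confirm both, or mark an unknown A doubtful."""
    if status[b] in ("X", "C"):
        if status[a] in ("X", "C"):
            status[a] = "X"
            status[b] = "X"
        elif status[a] == "#":
            status[a] = "D"
            status[b] = "X"


def step_s6(status, a, b, b_is_final):
    """Step s6: an unknown A that activates an incorrect non-final B."""
    if status[b] == "NC" and status[a] == "#" and not b_is_final:
        status[a] = "NC"


status = {"A": "#", "B1": "#", "B2": "C", "B3": "NC"}
chosen = select_successor(["B1", "B2", "B3"], status)   # 'C' outranks the rest
step_s3(status, "A", chosen)
```

In this example the steps are applied to the pair (A, B2), after which A is
doubtful ('D') and B2 is confirmed ('X').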

To understand the process of the modified algorithm, we shall consider
Fig. 3.2, where the encircled numbers represent rule numbers, the values on
top of the nodes are the strengths of the corresponding rules, and arrows
indicate solution paths. We shall assume that there are three solution paths
(path1: 1, 2, 3, and 4; path2: 6, 2, and 5; and path3: 7, 3, and 4), and that
path1 produces a wrong conclusion while path2 and path3 produce correct
conclusions. This means that rule 1 is an incorrect rule. First, we consider
the standard bucket brigade algorithm, where all the rules initially have a
strength of 10. In the case of a correct conclusion the environment reward is
10. Fig. 3.3 shows a sequence of solution paths corresponding to different
environment inputs and the modified rule strengths under the standard bucket
brigade algorithm.

Fig. 3.2. Sequence of possible solution paths. Encircled numbers denote the
rule numbers and the numbers on top of the nodes represent the rule strength.

The same sequence of solution paths is also shown in Fig. 3.4, but this time
the modified bucket brigade algorithm is applied. Comparing the two figures
(Fig. 3.3 and Fig. 3.4), we notice that rule 1 (the incorrect rule)
participates in solution paths 1, 2, 4, 9, 10, 15, 16, and 17 and has almost
the same strength value in both figures in solution paths 1, 2, 4, 9, and 10.
This is because rule 1 under the modified version is not recognized as an
incorrect rule ('NC' rule-status) until solution path 10 is chosen.
Recognition of rule 1 as a wrong rule in solution path 10 causes a continuous
decrease (solution paths 15, 16, and 17) in its strength, from 10.781 in
solution path 10 to 1.349 in solution path 17. In contrast, under the standard
bucket brigade algorithm (Fig. 3.3), the strength of rule 1 fluctuates
slightly after solution path 10 but remains high compared to the other nodes.
It should also be noted that the strength of the incorrect rule 1 in solution
path 17 in Fig. 3.3 is much higher than the strength of rule 4, which is a
correct rule, whereas for the same solution path in Fig. 3.4 the strength of
rule 1 is less than the strength of rule 4. The above discussion implies that,
under the standard bucket brigade algorithm, weakening rule 1 amounts to
weakening rule 4, and strengthening rule 4 strengthens rule 1 too. On the
contrary, under the modified algorithm, after a certain stage is reached, the
incorrect rule 1 is weakened whenever it is used, and strengthening rule 4
does not strengthen rule 1. This leads to a decline in the strength of rule 1,
and eventually its strength falls below the threshold value, thereby
preventing such incorrect rules from participating in the solution path.

Fig. 3.3. Sequence of solution paths and modification of rule strengths
according to the standard bucket brigade algorithm. Encircled numbers denote
the rule numbers. On top of each node the rule strength is shown.

Fig. 3.5 shows the performance of the modified bucket brigade algorithm using
the same set of rules and the cases case(a), case(b), and case(c) of the
environment inputs as given in Section 3.2. A comparison of Fig. 3.1 and
Fig. 3.5 shows that during the first two cases, case(a) and case(b), the
strengths of the rules change in the same manner. That is, until the end of
these two cases, the modified algorithm produces the same results as the
standard bucket brigade. The reason is that rule 5, which is an incorrect
rule, has not yet been recognized as a wrong rule. Under case(c), the modified
algorithm outperforms the standard bucket brigade algorithm as soon as rule 5
is recognized as a wrong rule. Consequently, the strength of rule 5 is
decremented to 12, whereas for the same rule under the standard bucket brigade
algorithm the strength is 24.656. Recognition of rule 5 as a wrong rule causes
a continuous decrease in its strength whenever it is used, and eventually it
will not be able to enter the conflict set. However, the incorrect rule 9,
which is a final node, gets weakened under both algorithms and has the same
strength value (1.25).

Fig. 3.4. Sequence of solution paths showing the performance of the modified
bucket brigade algorithm. With each solution path the steps of Algorithm I
used are indicated, e.g., (s1, s2). Encircled numbers denote the rule numbers.
On top of each node the rule strength and rule-status are shown.

Fig. 3.5. The chosen solution paths and the modifications of rule strengths
under the proposed algorithm according to inputs (a) Is-holiday, Good-climate,
and John-like-swimming; (b) Is-holiday, Good-climate, and John-like-skiing;
and (c) the sequence of inputs (a), (b), and (a). The steps used under the
modified algorithm are shown below each figure.

3.4. Modified Bucket Brigade For HCPRs System

In this section, we consider the extension of the proposed algorithm to HCPRs
systems. The extension is based on the following modifications:

The GCS [3] provides a trade-off between certainty factor and specificity: in
an HCPR-tree, a less specific node at a higher level has a higher certainty
and a more specific node at a lower level has a lower certainty. According to
our proposed bid formula for CPRs systems, the certainty factor of a rule is
included in computing its accuracy. This would mean that, in an HCPRs system,
an HCPR at a higher level with lower specificity may always pay a bid which is
greater than or equal to the bid paid by a lower node with higher specificity.
Therefore, the bid formula as such would not provide a fair bid payment scheme
for HCPRs systems. To resolve this problem, rules compete according to the
same bid formula, but accuracy (ACC) is removed from the formula when the bid
is paid.

Whenever an intermediate node in a given HCPR-tree is matched but its strength
is less than the threshold value required to enter the conflict set (a weak
rule), the rule is not fired, while the computation for the lower nodes in the
hierarchy continues. This process facilitates the firing of the next rule in
the hierarchy and avoids firing the weak rule. The rule fired after the weak
rule in the hierarchy does not pay its bid to the weak rule but to the rule
above the weak rule in the hierarchy.

It should be noted in this regard that if the root node of any HCPR-tree
itself has a strength less than the threshold value, the rule corresponding to
the root is not fired; the HCPR-tree is consequently considered to be wrong,
and no inference is allowed through this tree.

When the conclusions are reached (no more rules can be fired), the environment
checks the output of all the nodes fired in the last HCPR-tree activated, and
their statuses are modified accordingly. The last HCPR fired receives the
environment reward if it produces a correct conclusion.
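Two of these modifications, the separation of the competing bid from the paid
bid and the redirection of payment past a weak rule, can be sketched in
Python. The sketch rests on assumptions: the function names, the parent table,
the node names, and the threshold value of 2.0 are all hypothetical, since the
text does not give a numerical threshold.

```python
def competing_bid(strength, acc, b):
    """Bid used to compete for the conflict set: S * ACC * b."""
    return strength * acc * b


def paid_bid(strength, b):
    """Amount actually paid once the rule fires: accuracy is left out."""
    return strength * b


def payee(rule, parent, strength, threshold):
    """Nearest node above `rule` in the hierarchy that is not weak.

    Returns None when no such node exists.
    """
    node = parent.get(rule)
    while node is not None and strength[node] < threshold:
        node = parent.get(node)
    return node


# Hypothetical three-level chain in one HCPR-tree: root -> mid -> leaf
parent = {"mid": "root", "leaf": "mid"}
strength = {"root": 10.0, "mid": 1.0, "leaf": 10.0}
target = payee("leaf", parent, strength, threshold=2.0)
```

Because the intermediate rule is weak, the rule below it pays the rule above
it. When every node above a rule is weak, including the root, the function
returns None, which corresponds to the case in which no inference is allowed
through the tree.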

In order to understand the various steps involved in the algorithm, we
consider the four HCPR-trees given in Fig. 3.6. Let us assume that the initial
strength of all the rules is 10, the environment reward is 20, and the value
of b is 0.5. Initially all rules have '#' status. In the following
illustration we assume that the system receives environment inputs in the
order (a) to (e) as given in Fig. 3.7.

case (a): Environment input (a)

John-live-in-Delhi, New-day, Time-6AM, John-interested-in-ball-games,
John-interested-in-football

HCPR-tree 1

1. Is-in-city(X,Y) [Live-in-city(X,Y)]
2. Is-at-home(X) [Time-night]
3. Is-outside-home(X) [Time-day]
4. Is-working-outdoor(X) [Day-working]
5. Is-entertaining-outdoor(X) [Day-sunday]

HCPR-tree 2

6. Is-doing-exercise [Is-outside-home(X), Time-early-morning]
7. Is-jogging [Interested-in-athletics]
8. Is-entertaining-outdoor(X) [Interested-in-ball-games]
9. Is-playing-football [Interested-in-foot-ball]
10. Is-playing-hand-games [Interested-in-hand-ball-games]
11. Is-playing-basket-ball [In-basket-ball-field]
12. Is-playing-volley-ball [In-volley-ball-field]
13. Is-playing-hand-ball [In-hand-ball-field]

HCPR-tree 3

14. Is-in-social-activities(X) [Is-entertaining-outdoor]
15. Is-in-club(X)/Is-visiting-friends(X) [Time-after-noon]
16. Is-in-movie(X) [Time-evening]
17. Is-in-club(X) [3pm-4pm]
18. Is-visiting-friends(X) [4pm-5pm]

HCPR-tree 4

19. Time-day/Time-night [New-day]
20. Time-day [4:30am-7pm]
21. Time-night [> 7pm < 12pm]
22. Time-early-morning [5pm-7am]
23. Time-after-noon [12am <= 5pm]

Fig. 3.6. Four HCPR-trees to decide what X is doing.

Fig. 3.7. Application of the modified algorithm on the HCPRs in Fig. 3.6. The changes in the strength values and rule-status are shown for the environment inputs (a) John-live-in-Delhi, New-day, Time-6AM, John-interested-in-ball-games, and John-interested-in-football; (b) John-live-in-Delhi, New-day, Time-6PM, and Day-sunday; (c) the input as in (a); (d) the input as in (b); (e) several trials of inputs as in (a) or (b) and finally the input as in (a). (The dotted arrow denotes that rule 9 pays its bid to rule 6 and not to rule 8.)

Rules 19, 20, and 23 are matched from HCPR-tree 4 and rule 1 is matched from HCPR-tree 1. Rule 20 cannot fire unless rule 19 is fired, and similarly rule 23 cannot fire unless rule 20 is fired (because of the hierarchy). Rules 19 and 1 enter the conflict set, each making a bid of 5, and both are fired, decrementing their strengths to 5 (10-5). Rule 20 from HCPR-tree 4 is activated by rule 19 and pays out a bid of 5 to rule 19. The strength of rule 20 is decremented to 5 (10-5) and the strength of rule 19 becomes 10. At this stage, rule 23 from HCPR-tree 4 is initiated by rule 20 of the same HCPR-tree (the generality of rule 23), and rule 3 of HCPR-tree 1 is initiated by its generality, rule 1, and by rule 20 of HCPR-tree 4. Rules 3 and 23 enter the conflict set, each making a bid of 5. Both rules are fired, and rule 23 pays out its bid to rule 20. This results in a strength of 10 (5+5) for rule 20 and 5 (10-5) for rule 23. Rule 3 pays out 2.5 each to rule 20 and rule 1, so rule 3 has its strength decremented to 5, the strength of rule 20 is incremented to 12.5 (10+2.5), and the strength of rule 1 is incremented to 7.5 (5+2.5). Rule 6 is initiated by rules 3 and 23; it pays out its bid of 5 to these rules, reducing its strength to 5 (10-5) and increasing the strength of both rules 3 and 23 to 7.5 (5+2.5). Rule 8 is then initiated by rule 6 and fires, paying a bid of 5 to rule 6; this changes its strength to 5 (10-5) and increases the strength of rule 6 to 10. Firing of rule 8 initiates rule 9 (rule 8's specificity) and rule 14 from HCPR-tree 3. Both rules 9 and 14 are fired, each paying a bid of 5 to rule 8, so the strength of rule 8 becomes 15 and the strengths of rules 14 and 9 become 5 each.

Since no more rules can be fired, the last rules fired are rule 14 of HCPR-tree 3 and rule 9 of HCPR-tree 2. Considering these two trees as the conclusive HCPR-trees, the results corresponding to all the nodes fired within these two trees are checked against the expected output. Rules 6 and 9 of HCPR-tree 2 produce correct conclusions and therefore have their status changed to 'C', whereas rule 8 has its status changed to 'NC' since it produces a wrong conclusion (wrong specificity). Rule 14 of HCPR-tree 3 gets a status of 'NC' because it produces an incorrect conclusion. Rule 9 therefore gets a reward from the environment, changing its strength to 25 (5+20). This process is illustrated in Fig. 3.7(a).
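The strength bookkeeping of case (a) can be replayed mechanically as a check on the arithmetic. The sketch below makes three assumptions that should be read as such: a bid is b times the current strength, a bid is split equally among the rules that activated the bidder (as with the 2.5 paid by rule 3 to each of rules 20 and 1), and the bids of rules 19 and 1, which are activated directly by the environment input, simply leave the system. The helper `fire` is hypothetical.

```python
B = 0.5
REWARD = 20.0

# Only the rules that take part in case (a); all start at strength 10.
strength = {rule: 10.0 for rule in (1, 3, 6, 8, 9, 14, 19, 20, 23)}


def fire(rule, payees=()):
    """Deduct the rule's bid and share it equally among its activators."""
    amount = B * strength[rule]
    strength[rule] -= amount
    for payee in payees:
        strength[payee] += amount / len(payees)


fire(19)           # activated by the environment input
fire(1)            # activated by the environment input
fire(20, [19])
fire(23, [20])
fire(3, [20, 1])   # pays 2.5 to each of rules 20 and 1
fire(6, [3, 23])   # pays 2.5 to each of rules 3 and 23
fire(8, [6])
fire(9, [8])
fire(14, [8])
strength[9] += REWARD  # rule 9 is the last correct rule fired

for rule in sorted(strength):
    print(rule, strength[rule])
```

The final values agree with the figures given in the text: rule 19 at 10, rule 20 at 12.5, rules 1, 3, and 23 at 7.5, rule 6 at 10, rule 8 at 15, rule 14 at 5, and rule 9 at 25.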

case (b): Environment input (b)

John-live-in-Delhi, New-day, Time-6PM, Day-sunday

This time rules 19, 20, and 22 are fired from HCPR-tree 4 and rules 1 and 3 are fired from HCPR-tree 1, by the same process as explained in case (a). Firing of rules 22 and 3 results in the firing of rule 5 of HCPR-tree 1, and subsequently rule 14 of HCPR-tree 3 is fired. Finally rule 16 of the same HCPR-tree is fired. After rule 16 has fired, no further inference is possible. Checking the last HCPR-tree fired (HCPR-tree 3), rule 14 produces a correct conclusion, hence its status is changed from 'NC' to 'C'; rule 16 also produces a correct conclusion, changing its status from '#' to 'C'. This process is shown in Fig. 3.7(b).


case (c): Environment input is the same as input (a).

The whole process, along with the strength and status modifications, is shown in Fig. 3.7(c).

case (d): Environment input is the same as input (b).

Fig. 3.7(d) shows the latest modifications of strengths and rule-status.

case (e): After several trials of input (a) or input (b) are applied.

The strength of rule 8 will be decreased because its status is 'NC', and it will not get any advantage from the strengthening of rule 14 or rule 9. After many trials the strength of rule 8 falls below the threshold necessary for entering the conflict set. The strategy is that the conditions of rule 8 are matched without its action being taken; this permits rule 9, which is correct, to fire, while rule 14 is not fired since it depends on the action of rule 8. Note that rule 9 pays its bid to rule 6 and not to rule 8 (rule 8 is not fired). Fig. 3.7(e) shows the sequence of rules fired if, at the end of several trials of input (a) or input (b), the environment messages of case (a) are again given as input (the dotted line in the figure denotes that rule 9 pays its bid to rule 6 and not to rule 8).
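The redirection of rule 9's bid can be pictured as a walk up the hierarchy that skips any ancestor which was matched but did not fire. The sketch below is only an illustration of that idea: the function `payee_for`, the `parent` map, the threshold of 2.0, and the strength values are all hypothetical and are not taken from Fig. 3.7.

```python
def payee_for(rule, parent, strength, threshold):
    """Return the nearest ancestor that actually fired, or None.

    An ancestor whose strength is below the threshold is matched without
    taking action, so it is skipped and receives no bid.
    """
    node = parent.get(rule)
    while node is not None and strength[node] < threshold:
        node = parent.get(node)
    return node


# Part of HCPR-tree 2: rule 6 -> rule 8 -> rule 9.
parent = {9: 8, 8: 6}
THRESHOLD = 2.0  # hypothetical threshold

weakened = {6: 10.0, 8: 1.25, 9: 25.0}  # hypothetical strengths, rule 8 below threshold
initial = {6: 10.0, 8: 10.0, 9: 10.0}   # rule 8 still above threshold

print(payee_for(9, parent, weakened, THRESHOLD))  # 6
print(payee_for(9, parent, initial, THRESHOLD))   # 8
```

Rule 14 is handled differently: it depends on the action of rule 8 rather than on its conditions, so in this situation it is not fired at all.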
