
Posted on 19-Dec-2015


Slide 1

Augmented Bayesian Classifier

Beijie Xu

Slide 2

problem to be solved

• Naïve Bayesian Classifier
  – Why naïve? It assumes conditional independence among attributes.
  – Is that true in real-world situations?
• Augmented Bayesian Classifier
  – Allows for relationships between attributes

Slide 3

Assumption / limitation

• Each node/attribute may have at most one (non-class) parent.
• An arc from A1 to A2 indicates that attribute A2 depends on A1.

Slide 4

Naïve Bayesian Classifier

• The probability of an instance j belonging to class $C_i$ is proportional to

  $P(C_i) \prod_k P(A_k = V_{kj} \mid C_i)$ ---- ①

• The augmented Bayesian classifier modifies this probability to account for dependencies between attributes.
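As a toy illustration of equation ① (the priors and conditional tables below are made-up numbers, not from the slides):

```python
# Sketch of equation ①: score(i, j) = P(C_i) * prod_k P(A_k = V_kj | C_i).
# All probabilities here are hypothetical, for illustration only.
prior = {"c0": 0.6, "c1": 0.4}
cond = {  # cond[class][k][v] = P(A_k = v | class); two binary attributes
    "c0": [{0: 0.8, 1: 0.2}, {0: 0.3, 1: 0.7}],
    "c1": [{0: 0.4, 1: 0.6}, {0: 0.9, 1: 0.1}],
}

def naive_bayes_score(instance, c):
    """Unnormalized class score from equation ① for one instance."""
    score = prior[c]
    for k, v in enumerate(instance):
        score *= cond[c][k][v]
    return score

instance = (0, 1)  # the instance's attribute values V_kj
scores = {c: naive_bayes_score(instance, c) for c in prior}
predicted = max(scores, key=scores.get)
```

Because the scores are only proportional to the class posterior, the predicted class is simply the one with the largest score.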

Slide 5

If A1 → A2

• Multiply ① by the ratio below.
• Denominator: factors OUT the effect of the orphan node $A_a$.
• Numerator: factors IN the effect of the arc from $A_b$ to $A_a$ (the mutual effect of a and b given class $C_i$).

  $\dfrac{P(A_a = V_{aj} \mid C_i \,\&\, A_b = V_{bj})}{P(A_a = V_{aj} \mid C_i)}$
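A minimal sketch of this adjustment for a single arc, with hypothetical probabilities (`naive_score` stands for the value of ① on one instance and class):

```python
# Adjusting the naive score for one arc A_b -> A_a, as described above.
# All numbers are hypothetical; they are not taken from the slides.
naive_score = 0.336        # P(C_i) * prod_k P(A_k = V_kj | C_i), i.e. ①

p_a_given_c = 0.7          # P(A_a = V_aj | C_i): factored OUT (denominator)
p_a_given_c_and_b = 0.9    # P(A_a = V_aj | C_i & A_b = V_bj): factored IN (numerator)

augmented_score = naive_score * p_a_given_c_and_b / p_a_given_c
```

The orphan term for $A_a$ is divided out and replaced by the arc-conditioned term, which is exactly the multiplication the slide describes.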

Slide 6

$\hat{I}_{D_p}(A_i, A_j \mid C)$

$I(X, Y \mid Z) = \sum_{x,y,z} P(x,y,z)\,\log \dfrac{P(x,y,z)\,P(z)}{P(x,z)\,P(y,z)}$

Let X, Y, Z be three variables; the conditional mutual information between X and Y given Z is determined by the equation above.
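The estimate $\hat{I}_{D_p}$ plugs empirical frequencies from the data into this formula. A self-contained sketch (the sample format, a list of (x, y, z) tuples, is an assumption):

```python
import math
from collections import Counter

def conditional_mutual_information(samples):
    """I(X; Y | Z) = sum_{x,y,z} P(x,y,z) log[ P(x,y,z) P(z) / (P(x,z) P(y,z)) ],
    with every probability estimated as an empirical frequency over `samples`,
    a list of (x, y, z) tuples."""
    n = len(samples)
    pxyz = Counter(samples)
    pxz = Counter((x, z) for x, _, z in samples)
    pyz = Counter((y, z) for _, y, z in samples)
    pz = Counter(z for _, _, z in samples)
    total = 0.0
    for (x, y, z), c in pxyz.items():
        p = c / n
        total += p * math.log(p * (pz[z] / n) / ((pxz[x, z] / n) * (pyz[y, z] / n)))
    return total

# Here X and Y are independent given Z, so the estimate is zero.
cmi = conditional_mutual_information([(0, 0, 0), (0, 1, 0), (1, 0, 0), (1, 1, 0)])
```

A large value of $\hat{I}_{D_p}(A_i, A_j \mid C)$ signals that the two attributes are far from conditionally independent, i.e. a candidate pair for an augmenting arc.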

Slide 7

Greedy Hill Climbing Search (HCS) (1/3)

• Step 1: Build a J×I matrix whose cell [j, i] is the probability that instance j belongs to class $C_i$ (there are J instances and I classes).

Slide 8

Greedy Hill Climbing Search (HCS) (2/3)

• Step 2: Try every possible arc. If some arc addition improves accuracy, add the arc that improves accuracy the most to the current network.

Slide 9

Greedy Hill Climbing Search (HCS) (3/3)

• Step 3: Update the J×I matrix.
• Go back to Step 2; iterate until no arc addition improves accuracy.
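Steps 1–3 can be sketched as a generic greedy loop. `evaluate` is a hypothetical callback (not from the slides) that scores an arc set, e.g. predictive accuracy computed from the J×I matrix; the one-parent constraint from Slide 3 is enforced when enumerating candidates:

```python
def hill_climb(attributes, evaluate):
    """Greedy hill-climbing search over augmenting arcs (a sketch, not the
    authors' exact procedure). evaluate(arcs) returns predictive accuracy."""
    arcs = set()
    best = evaluate(arcs)
    while True:
        # Step 2: try every arc (a, b) whose child b does not yet have a parent.
        candidates = [(a, b) for a in attributes for b in attributes
                      if a != b and not any(child == b for _, child in arcs)]
        improvements = [(evaluate(arcs | {arc}), arc) for arc in candidates]
        improvements = [(acc, arc) for acc, arc in improvements if acc > best]
        if not improvements:
            return arcs              # Step 3: no arc addition helps; stop.
        best, arc = max(improvements)
        arcs.add(arc)                # add the most-improving arc, then repeat

# Toy check: accuracy is 0.5 plus a made-up bonus for each "good" arc.
bonus = {("A", "B"): 0.1, ("B", "C"): 0.05}
found = hill_climb(["A", "B", "C"],
                   lambda arcs: 0.5 + sum(bonus.get(a, 0.0) for a in arcs))
```

In the toy run the search adds the 0.1-bonus arc first, then the 0.05-bonus arc, and stops because no remaining arc improves the score.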

Slide 10

Three definitions

• Orphan – a node with no parent other than the class node is called an orphan.
• SuperParent – if we extend arcs from node Ax to every orphan, node Ax is a SuperParent.
• FavoriteChild – if we extend an arc from node Ax to each orphan in turn and test the effect on predictive accuracy, the node pointed to by the best arc is said to be the FavoriteChild of Ax.

Slide 11

SuperParent (1/3)

• If an attribute is truly independent of all other attributes, the two sides of equation (2) are equal:

  $P(C_i) \prod_k P(A_k = V_{kj} \mid C_i) = P(C_i) \prod_k P(A_k = V_{kj} \mid C_i \,\&\, A_{SP})$ --- (2)

Slide 12

SuperParent (2/3)

• Step 1: Find the attribute $A_{SP}$ that makes the right-hand side of equation (2) (slide#11) differ from the left-hand side the most.

Slide 13

SuperParent (3/3)

• Step 2: Find $A_{SP}$'s FavoriteChild and add the corresponding arc.
• Iterate Steps 1–2.
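The two-step loop above might look like the following sketch. `deviation` (how much an attribute shifts equation (2)) and `accuracy_with_arc` are hypothetical callbacks, and for simplicity the SuperParent candidate is drawn from the current orphans:

```python
def superparent_search(attributes, deviation, accuracy_with_arc, base_accuracy):
    """SuperParent search sketch: pick the strongest SuperParent candidate,
    then add an arc to its FavoriteChild, while accuracy keeps improving."""
    arcs = set()
    best = base_accuracy
    while True:
        orphans = [a for a in attributes if not any(c == a for _, c in arcs)]
        if len(orphans) < 2:
            return arcs
        sp = max(orphans, key=deviation)   # Step 1: best SuperParent candidate
        children = [c for c in orphans if c != sp]
        # Step 2: FavoriteChild = the orphan whose arc helps accuracy most.
        acc, child = max((accuracy_with_arc(arcs, sp, c), c) for c in children)
        if acc <= best:                    # no improving FavoriteChild: stop
            return arcs
        best = acc
        arcs.add((sp, child))              # add the arc, then repeat Steps 1-2

# Toy run: "A" deviates most, and only the arc A -> B improves the made-up accuracy.
dev = {"A": 3.0, "B": 1.0, "C": 2.0}
found = superparent_search(
    ["A", "B", "C"],
    deviation=dev.get,
    accuracy_with_arc=lambda arcs, sp, c: 0.9 if (sp, c) == ("A", "B") else 0.5,
    base_accuracy=0.6,
)
```

Unlike HCS, each iteration here evaluates only one SuperParent's candidate arcs rather than every possible arc, which is the efficiency argument behind the algorithm.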

Slide 14

questions

• Does SuperParent produce the same set of augmented arcs as HCS?

• How many children attributes can an attribute have in each algorithm?

• Does the order of trying (possible arcs) affect the result?

Slide 15

Slide 16

references

• E. Keogh and M. J. Pazzani, "Learning the structure of augmented Bayesian classifiers," International Journal on Artificial Intelligence Tools, vol. 11, no. 4, pp. 587–601, 2002.