Iterative Dichotomiser 3 (ID3) Algorithm
Transcript
Page 1: Iterative Dichotomiser 3 (ID3) Algorithm

Iterative Dichotomiser 3 (ID3) Algorithm

Medha Pradhan
CS 157B, Spring 2007

Page 2: Iterative Dichotomiser 3 (ID3) Algorithm

Agenda

Basics of Decision Trees
Introduction to ID3
Entropy and Information Gain
Two Examples

Page 3: Iterative Dichotomiser 3 (ID3) Algorithm

Basics

What is a decision tree?

A tree in which each branching (decision) node represents a choice between two or more alternatives, and every branching node lies on a path to a leaf node.

Decision node: Specifies a test of some attribute

Leaf node: Indicates classification of an example

Page 4: Iterative Dichotomiser 3 (ID3) Algorithm

ID3

Invented by J. Ross Quinlan. ID3 performs a top-down, greedy search through the space of possible decision trees. The search is greedy because there is no backtracking: at each node it commits to the best-scoring attribute and never revisits that choice.

At each node, ID3 selects the attribute that is most useful for classifying the examples, i.e. the attribute with the highest information gain.

Page 5: Iterative Dichotomiser 3 (ID3) Algorithm

Entropy

Entropy measures the impurity of an arbitrary collection of examples. For a collection S containing positive and negative examples:

Entropy(S) = -p+ log2(p+) - p- log2(p-)

where p+ is the proportion of positive examples and p- is the proportion of negative examples.

In general, Entropy(S) = 0 if all members of S belong to the same class, and Entropy(S) = 1 (the maximum) when the members are split equally between the two classes.
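As a concrete check on the numbers used below, here is a minimal Python sketch of this binary entropy formula (the function name and counts-based interface are illustrative choices, not from the slides):

```python
from math import log2

def entropy(pos, neg):
    """Binary entropy of a collection with pos positive and neg negative examples."""
    total = pos + neg
    result = 0.0
    for count in (pos, neg):
        if count:  # treat 0 * log2(0) as 0
            p = count / total
            result -= p * log2(p)
    return result

print(entropy(4, 2))  # 0.9182958..., the 0.91829 used in Example 1
print(entropy(3, 3))  # 1.0: the maximum, for an equal split
print(entropy(6, 0))  # 0.0: all members in the same class
```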

Page 6: Iterative Dichotomiser 3 (ID3) Algorithm

Information Gain

Information gain measures the expected reduction in entropy from partitioning the examples on an attribute: the higher the gain, the greater the expected reduction in entropy.

Gain(S, A) = Entropy(S) - Σ_{v in Values(A)} (|Sv| / |S|) * Entropy(Sv)

where Values(A) is the set of all possible values for attribute A, and Sv is the subset of S for which attribute A has value v.
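The gain computation can be sketched on top of the entropy helper above. The representation here, with each example as a dict mapping attribute names to values and a Yes/No decision attribute, is an assumption for illustration, not something the slides specify:

```python
def entropy_of(examples, target):
    """Entropy of a list of example dicts with respect to a Yes/No target attribute."""
    pos = sum(1 for e in examples if e[target] == 'Yes')
    return entropy(pos, len(examples) - pos)

def gain(examples, attr, target):
    """Expected reduction in entropy from partitioning examples on attr."""
    remainder = 0.0
    for v in set(e[attr] for e in examples):            # Values(A)
        subset = [e for e in examples if e[attr] == v]  # Sv
        remainder += len(subset) / len(examples) * entropy_of(subset, target)
    return entropy_of(examples, target) - remainder
```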

Page 7: Iterative Dichotomiser 3 (ID3) Algorithm

Example 1

Sample training data to determine whether an animal lays eggs. Warm-blooded, Feathers, Fur, and Swims are the independent (condition) attributes; Lays Eggs is the dependent (decision) attribute.

Animal    | Warm-blooded | Feathers | Fur | Swims | Lays Eggs
Ostrich   | Yes          | Yes      | No  | No    | Yes
Crocodile | No           | No       | No  | Yes   | Yes
Raven     | Yes          | Yes      | No  | No    | Yes
Albatross | Yes          | Yes      | No  | No    | Yes
Dolphin   | Yes          | No       | No  | Yes   | No
Koala     | Yes          | No       | Yes | No    | No

Page 8: Iterative Dichotomiser 3 (ID3) Algorithm

For S = [4Y, 2N]:
Entropy(S) = -(4/6)log2(4/6) - (2/6)log2(2/6) = 0.91829

Now we find the information gain for each of the four attributes: Warm-blooded, Feathers, Fur, and Swims.

Page 9: Iterative Dichotomiser 3 (ID3) Algorithm

For attribute 'Warm-blooded':
Values(Warm-blooded) = [Yes, No], S = [4Y, 2N]
SYes = [3Y, 2N], E(SYes) = 0.97095
SNo = [1Y, 0N], E(SNo) = 0 (all members belong to the same class)
Gain(S, Warm-blooded) = 0.91829 - [(5/6)*0.97095 + (1/6)*0] = 0.10916

For attribute 'Feathers':
Values(Feathers) = [Yes, No], S = [4Y, 2N]
SYes = [3Y, 0N], E(SYes) = 0
SNo = [1Y, 2N], E(SNo) = 0.91829
Gain(S, Feathers) = 0.91829 - [(3/6)*0 + (3/6)*0.91829] = 0.45914

Page 10: Iterative Dichotomiser 3 (ID3) Algorithm

For attribute 'Fur':
Values(Fur) = [Yes, No], S = [4Y, 2N]
SYes = [0Y, 1N], E(SYes) = 0
SNo = [4Y, 1N], E(SNo) = 0.7219
Gain(S, Fur) = 0.91829 - [(1/6)*0 + (5/6)*0.7219] = 0.3167

For attribute 'Swims':
Values(Swims) = [Yes, No], S = [4Y, 2N]
SYes = [1Y, 1N], E(SYes) = 1 (equal members in both classes)
SNo = [3Y, 1N], E(SNo) = 0.81127
Gain(S, Swims) = 0.91829 - [(2/6)*1 + (4/6)*0.81127] = 0.04411

Page 11: Iterative Dichotomiser 3 (ID3) Algorithm

Gain(S, Warm-blooded) = 0.10916
Gain(S, Feathers) = 0.45914
Gain(S, Fur) = 0.31670
Gain(S, Swims) = 0.04411

Gain(S, Feathers) is the maximum, so Feathers becomes the root node.
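Encoding the animal table as a list of dicts (the same illustrative encoding assumed earlier) and running the gain helper reproduces these values up to rounding:

```python
animals = [
    {'Animal': 'Ostrich',   'Warm-blooded': 'Yes', 'Feathers': 'Yes', 'Fur': 'No',  'Swims': 'No',  'Lays Eggs': 'Yes'},
    {'Animal': 'Crocodile', 'Warm-blooded': 'No',  'Feathers': 'No',  'Fur': 'No',  'Swims': 'Yes', 'Lays Eggs': 'Yes'},
    {'Animal': 'Raven',     'Warm-blooded': 'Yes', 'Feathers': 'Yes', 'Fur': 'No',  'Swims': 'No',  'Lays Eggs': 'Yes'},
    {'Animal': 'Albatross', 'Warm-blooded': 'Yes', 'Feathers': 'Yes', 'Fur': 'No',  'Swims': 'No',  'Lays Eggs': 'Yes'},
    {'Animal': 'Dolphin',   'Warm-blooded': 'Yes', 'Feathers': 'No',  'Fur': 'No',  'Swims': 'Yes', 'Lays Eggs': 'No'},
    {'Animal': 'Koala',     'Warm-blooded': 'Yes', 'Feathers': 'No',  'Fur': 'Yes', 'Swims': 'No',  'Lays Eggs': 'No'},
]

for attr in ('Warm-blooded', 'Feathers', 'Fur', 'Swims'):
    print(attr, gain(animals, attr, 'Lays Eggs'))
# Gains of approximately 0.1092, 0.4591, 0.3167, and 0.0441, matching the slides.
```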

Feathers
  Y: [Ostrich, Raven, Albatross] -> Lays Eggs
  N: [Crocodile, Dolphin, Koala] -> ?


The 'Y' descendant contains only positive examples, so it becomes a leaf node with classification 'Lays Eggs'.

Page 12: Iterative Dichotomiser 3 (ID3) Algorithm

We now repeat the procedure on the 'N' branch:
S = [Crocodile, Dolphin, Koala] = [1+, 2-]
Entropy(S) = -(1/3)log2(1/3) - (2/3)log2(2/3) = 0.91829

Animal    | Warm-blooded | Feathers | Fur | Swims | Lays Eggs
Crocodile | No           | No       | No  | Yes   | Yes
Dolphin   | Yes          | No       | No  | Yes   | No
Koala     | Yes          | No       | Yes | No    | No

Page 13: Iterative Dichotomiser 3 (ID3) Algorithm

Feathers has already been used on this path, so only the remaining three attributes are considered.

For attribute 'Warm-blooded':
Values(Warm-blooded) = [Yes, No], S = [1Y, 2N]
SYes = [0Y, 2N], E(SYes) = 0
SNo = [1Y, 0N], E(SNo) = 0
Gain(S, Warm-blooded) = 0.91829 - [(2/3)*0 + (1/3)*0] = 0.91829

For attribute 'Fur':
Values(Fur) = [Yes, No], S = [1Y, 2N]
SYes = [0Y, 1N], E(SYes) = 0
SNo = [1Y, 1N], E(SNo) = 1
Gain(S, Fur) = 0.91829 - [(1/3)*0 + (2/3)*1] = 0.25162

For attribute 'Swims':
Values(Swims) = [Yes, No], S = [1Y, 2N]
SYes = [1Y, 1N], E(SYes) = 1
SNo = [0Y, 1N], E(SNo) = 0
Gain(S, Swims) = 0.91829 - [(2/3)*1 + (1/3)*0] = 0.25162

Gain(S, Warm-blooded) is the maximum, so Warm-blooded splits this branch.

Page 14: Iterative Dichotomiser 3 (ID3) Algorithm

The final decision tree will be:

Feathers
  Y: Lays Eggs
  N: Warm-blooded
       Y: Does Not Lay Eggs
       N: Lays Eggs

(Under Warm-blooded, the 'Y' subset [Dolphin, Koala] does not lay eggs, and the 'N' subset [Crocodile] does.)
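The recursion the slides just walked through can be written compactly. This is a minimal sketch under the same assumptions as before (binary Yes/No decision attribute, dict-encoded examples), reusing the gain helper from earlier; it omits refinements a production ID3 would need, such as handling unseen attribute values:

```python
def id3(examples, attributes, target):
    """Return a decision tree: a class label ('Yes'/'No') or (attribute, {value: subtree})."""
    labels = set(e[target] for e in examples)
    if len(labels) == 1:                  # pure node: make it a leaf
        return labels.pop()
    if not attributes:                    # no attributes left: majority label
        pos = sum(1 for e in examples if e[target] == 'Yes')
        return 'Yes' if pos >= len(examples) - pos else 'No'
    best = max(attributes, key=lambda a: gain(examples, a, target))
    branches = {}
    for v in set(e[best] for e in examples):
        subset = [e for e in examples if e[best] == v]
        branches[v] = id3(subset, [a for a in attributes if a != best], target)
    return (best, branches)

tree = id3(animals, ['Warm-blooded', 'Feathers', 'Fur', 'Swims'], 'Lays Eggs')
print(tree)
# ('Feathers', {'Yes': 'Yes', 'No': ('Warm-blooded', {'Yes': 'No', 'No': 'Yes'})})
# i.e. exactly the tree above (branch order within a dict may vary).
```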

Page 15: Iterative Dichotomiser 3 (ID3) Algorithm

Example 2

Factors affecting sunburn:

Name  | Hair   | Height  | Weight  | Lotion | Sunburned
Sarah | Blonde | Average | Light   | No     | Yes
Dana  | Blonde | Tall    | Average | Yes    | No
Alex  | Brown  | Short   | Average | Yes    | No
Annie | Blonde | Short   | Average | No     | Yes
Emily | Red    | Average | Heavy   | No     | Yes
Pete  | Brown  | Tall    | Heavy   | No     | No
John  | Brown  | Average | Heavy   | No     | No
Katie | Blonde | Short   | Light   | Yes    | No

Page 16: Iterative Dichotomiser 3 (ID3) Algorithm

S = [3+, 5-]
Entropy(S) = -(3/8)log2(3/8) - (5/8)log2(5/8) = 0.95443

Find the information gain for all four attributes: Hair, Height, Weight, and Lotion.

For attribute 'Hair':
Values(Hair) = [Blonde, Brown, Red], S = [3+, 5-]
SBlonde = [2+, 2-], E(SBlonde) = 1
SBrown = [0+, 3-], E(SBrown) = 0
SRed = [1+, 0-], E(SRed) = 0
Gain(S, Hair) = 0.95443 - [(4/8)*1 + (3/8)*0 + (1/8)*0] = 0.45443

Page 17: Iterative Dichotomiser 3 (ID3) Algorithm

For attribute 'Height':
Values(Height) = [Average, Tall, Short]
SAverage = [2+, 1-], E(SAverage) = 0.91829
STall = [0+, 2-], E(STall) = 0
SShort = [1+, 2-], E(SShort) = 0.91829
Gain(S, Height) = 0.95443 - [(3/8)*0.91829 + (2/8)*0 + (3/8)*0.91829] = 0.26571

For attribute 'Weight':
Values(Weight) = [Light, Average, Heavy]
SLight = [1+, 1-], E(SLight) = 1
SAverage = [1+, 2-], E(SAverage) = 0.91829
SHeavy = [1+, 2-], E(SHeavy) = 0.91829
Gain(S, Weight) = 0.95443 - [(2/8)*1 + (3/8)*0.91829 + (3/8)*0.91829] = 0.01571

For attribute 'Lotion':
Values(Lotion) = [Yes, No]
SYes = [0+, 3-], E(SYes) = 0
SNo = [3+, 2-], E(SNo) = 0.97095
Gain(S, Lotion) = 0.95443 - [(3/8)*0 + (5/8)*0.97095] = 0.34759

Page 18: Iterative Dichotomiser 3 (ID3) Algorithm

Gain(S, Hair) = 0.45443
Gain(S, Height) = 0.26571
Gain(S, Weight) = 0.01571
Gain(S, Lotion) = 0.34759

Gain(S, Hair) is the maximum, so Hair becomes the root node.
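Encoding the sunburn table the same way (again an illustrative dict encoding) lets the gain helper confirm these four values, including the corrected Gain(S, Lotion):

```python
people = [
    {'Name': 'Sarah', 'Hair': 'Blonde', 'Height': 'Average', 'Weight': 'Light',   'Lotion': 'No',  'Sunburned': 'Yes'},
    {'Name': 'Dana',  'Hair': 'Blonde', 'Height': 'Tall',    'Weight': 'Average', 'Lotion': 'Yes', 'Sunburned': 'No'},
    {'Name': 'Alex',  'Hair': 'Brown',  'Height': 'Short',   'Weight': 'Average', 'Lotion': 'Yes', 'Sunburned': 'No'},
    {'Name': 'Annie', 'Hair': 'Blonde', 'Height': 'Short',   'Weight': 'Average', 'Lotion': 'No',  'Sunburned': 'Yes'},
    {'Name': 'Emily', 'Hair': 'Red',    'Height': 'Average', 'Weight': 'Heavy',   'Lotion': 'No',  'Sunburned': 'Yes'},
    {'Name': 'Pete',  'Hair': 'Brown',  'Height': 'Tall',    'Weight': 'Heavy',   'Lotion': 'No',  'Sunburned': 'No'},
    {'Name': 'John',  'Hair': 'Brown',  'Height': 'Average', 'Weight': 'Heavy',   'Lotion': 'No',  'Sunburned': 'No'},
    {'Name': 'Katie', 'Hair': 'Blonde', 'Height': 'Short',   'Weight': 'Light',   'Lotion': 'Yes', 'Sunburned': 'No'},
]

for attr in ('Hair', 'Height', 'Weight', 'Lotion'):
    print(attr, gain(people, attr, 'Sunburned'))
# Gains of approximately 0.4544, 0.2657, 0.0157, and 0.3476.
```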


Hair
  Blonde: [Sarah, Dana, Annie, Katie] -> ?
  Red:    [Emily] -> Sunburned
  Brown:  [Alex, Pete, John] -> Not Sunburned

Page 19: Iterative Dichotomiser 3 (ID3) Algorithm

Repeating the procedure for the Blonde branch:
S = [Sarah, Dana, Annie, Katie] = [2+, 2-]
Entropy(S) = 1

Name  | Hair   | Height  | Weight  | Lotion | Sunburned
Sarah | Blonde | Average | Light   | No     | Yes
Dana  | Blonde | Tall    | Average | Yes    | No
Annie | Blonde | Short   | Average | No     | Yes
Katie | Blonde | Short   | Light   | Yes    | No

Find the information gain for the remaining three attributes: Height, Weight, and Lotion.

For attribute 'Height':
Values(Height) = [Average, Tall, Short], S = [2+, 2-]
SAverage = [1+, 0-], E(SAverage) = 0
STall = [0+, 1-], E(STall) = 0
SShort = [1+, 1-], E(SShort) = 1
Gain(S, Height) = 1 - [(1/4)*0 + (1/4)*0 + (2/4)*1] = 0.5

Page 20: Iterative Dichotomiser 3 (ID3) Algorithm

For attribute 'Weight':
Values(Weight) = [Average, Light], S = [2+, 2-]
SAverage = [1+, 1-], E(SAverage) = 1
SLight = [1+, 1-], E(SLight) = 1
Gain(S, Weight) = 1 - [(2/4)*1 + (2/4)*1] = 0

For attribute 'Lotion':
Values(Lotion) = [Yes, No], S = [2+, 2-]
SYes = [0+, 2-], E(SYes) = 0
SNo = [2+, 0-], E(SNo) = 0
Gain(S, Lotion) = 1 - [(2/4)*0 + (2/4)*0] = 1

Therefore, Gain(S, Lotion) is the maximum, and Lotion splits the Blonde branch.

Page 21: Iterative Dichotomiser 3 (ID3) Algorithm

In this case, the final decision tree will be:

Hair
  Blonde: Lotion
      Y: Not Sunburned
      N: Sunburned
  Red: Sunburned
  Brown: Not Sunburned

(Under Lotion, the 'Y' subset [Dana, Katie] is not sunburned, and the 'N' subset [Sarah, Annie] is.)
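For a cross-check, reusing the id3 sketch and the people list from above reproduces this tree:

```python
print(id3(people, ['Hair', 'Height', 'Weight', 'Lotion'], 'Sunburned'))
# ('Hair', {'Blonde': ('Lotion', {'Yes': 'No', 'No': 'Yes'}), 'Red': 'Yes', 'Brown': 'No'})
# Blonde -> test Lotion (Yes: not sunburned, No: sunburned); Red -> sunburned; Brown -> not sunburned.
```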

Page 22: Iterative Dichotomiser 3 (ID3) Algorithm

References

"Machine Learning", Tom Mitchell, McGraw-Hill, 1997.
"Building Decision Trees with the ID3 Algorithm", Andrew Colin, Dr. Dobb's Journal, June 1996.
http://www2.cs.uregina.ca/~dbd/cs831/notes/ml/dtrees/dt_prob1.html
Professor Sin-Min Lee, SJSU: http://cs.sjsu.edu/~lee/cs157b/cs157b.html

