Date post: 28-Dec-2015
Upload: eustace-daniel
Artificial Intelligence 7. Decision trees Japan Advanced Institute of Science and Technology (JAIST) Yoshimasa Tsuruoka
Page 1

Artificial Intelligence
7. Decision trees

Japan Advanced Institute of Science and Technology (JAIST)
Yoshimasa Tsuruoka

Page 2

Outline
• What is a decision tree?
• How to build a decision tree
• Entropy
• Information Gain
• Overfitting
• Generalization performance
• Pruning
• Lecture slides
• http://www.jaist.ac.jp/~tsuruoka/lectures/

Page 3

Decision trees
Chapter 3 of Mitchell, T., Machine Learning (1997)

• Decision Trees
  – Disjunction of conjunctions
  – Successfully applied to a broad range of tasks
    • Diagnosing medical cases
    • Assessing credit risk of loan applications
• Nice characteristics
  – Understandable to humans
  – Robust to noise

Page 4

A decision tree

• Concept: PlayTennis

Outlook
  Sunny: Humidity
    High: No
    Normal: Yes
  Overcast: Yes
  Rain: Wind
    Strong: No
    Weak: Yes

Page 5

Classification by a decision tree

• Instance <Outlook = Sunny, Temperature = Hot, Humidity = High, Wind = Strong>
• The instance follows the branch Outlook = Sunny, then Humidity = High, and is classified as No. Temperature and Wind are never tested on this path.

[Figure: the PlayTennis decision tree from Page 4]
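The classification walk on this slide can be sketched in a few lines of Python. The nested-tuple encoding of the tree and the names `TREE` and `classify` are illustrative assumptions, not something the slides define.

```python
# Hypothetical encoding of the PlayTennis tree: an internal node is an
# (attribute, {value: subtree}) pair and a leaf is the class label.
TREE = ("Outlook", {
    "Sunny": ("Humidity", {"High": "No", "Normal": "Yes"}),
    "Overcast": "Yes",
    "Rain": ("Wind", {"Strong": "No", "Weak": "Yes"}),
})


def classify(tree, instance):
    """Walk from the root to a leaf, following the instance's attribute values."""
    while not isinstance(tree, str):
        attribute, branches = tree
        tree = branches[instance[attribute]]
    return tree


instance = {"Outlook": "Sunny", "Temperature": "Hot",
            "Humidity": "High", "Wind": "Strong"}
print(classify(TREE, instance))  # No
```

Only the attributes on the path from the root to the leaf are consulted, which is why Temperature plays no role here.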

Page 6

Disjunction of conjunctions

(Outlook = Sunny ∧ Humidity = Normal)
∨ (Outlook = Overcast)
∨ (Outlook = Rain ∧ Wind = Weak)

[Figure: the PlayTennis decision tree from Page 4]
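As a quick check that the formula and the tree describe the same concept, the sketch below compares them on every combination of attribute values. The function names are hypothetical, and Temperature is omitted because the tree never tests it.

```python
from itertools import product


def play_tennis_dnf(outlook, humidity, wind):
    """The disjunction of conjunctions, one conjunction per 'Yes' leaf."""
    return ((outlook == "Sunny" and humidity == "Normal")
            or outlook == "Overcast"
            or (outlook == "Rain" and wind == "Weak"))


def play_tennis_tree(outlook, humidity, wind):
    """The same concept written as nested tests, one per tree node."""
    if outlook == "Sunny":
        return humidity == "Normal"
    if outlook == "Overcast":
        return True
    return wind == "Weak"  # remaining case: outlook == "Rain"


all_inputs = list(product(["Sunny", "Overcast", "Rain"],
                          ["High", "Normal"],
                          ["Strong", "Weak"]))
agree = all(play_tennis_dnf(*x) == play_tennis_tree(*x) for x in all_inputs)
print(agree)  # True
```

Each path from the root to a "Yes" leaf contributes one conjunction, and the tree as a whole is their disjunction.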

Page 7

Problems suited to decision trees

• Instances are represented by attribute-value pairs
• The target function has discrete target values
• Disjunctive descriptions may be required
• The training data may contain errors
• The training data may contain missing attribute values

Page 8

Training data

Day  Outlook   Temperature  Humidity  Wind    PlayTennis
D1   Sunny     Hot          High      Weak    No
D2   Sunny     Hot          High      Strong  No
D3   Overcast  Hot          High      Weak    Yes
D4   Rain      Mild         High      Weak    Yes
D5   Rain      Cool         Normal    Weak    Yes
D6   Rain      Cool         Normal    Strong  No
D7   Overcast  Cool         Normal    Strong  Yes
D8   Sunny     Mild         High      Weak    No
D9   Sunny     Cool         Normal    Weak    Yes
D10  Rain      Mild         Normal    Weak    Yes
D11  Sunny     Mild         Normal    Strong  Yes
D12  Overcast  Mild         High      Strong  Yes
D13  Overcast  Hot          Normal    Weak    Yes
D14  Rain      Mild         High      Strong  No

Page 9

Which attribute should be tested at each node?

• We want to build a small decision tree
• Information gain
  – How well a given attribute separates the training examples according to their target classification
  – Reduction in entropy
• Entropy
  – (im)purity of an arbitrary collection of examples

Page 10

Entropy

• If there are only two classes

$$\mathrm{Entropy}(S) = -p_{+} \log_2 p_{+} - p_{-} \log_2 p_{-}$$

$$\mathrm{Entropy}([9+, 5-]) = -(9/14) \log_2 (9/14) - (5/14) \log_2 (5/14) = 0.940$$

• In general,

$$\mathrm{Entropy}(S) = \sum_{i=1}^{c} -p_i \log_2 p_i$$
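The entropy definition on this slide can be checked numerically with a short Python sketch. The function name `entropy` and its argument (a list of per-class example counts) are assumptions made for illustration.

```python
from math import log2


def entropy(counts):
    """Entropy in bits of a collection, given its per-class example counts."""
    total = sum(counts)
    # A class with no examples contributes 0 (the limit of p * log2(p) as p -> 0).
    return -sum((c / total) * log2(c / total) for c in counts if c > 0)


print(round(entropy([9, 5]), 3))  # 0.94
```

Entropy is 0 for a pure collection such as [4+, 0-] and reaches its two-class maximum of 1 bit for an even split such as [7+, 7-].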

Page 11

Information Gain

$$\mathrm{Gain}(S, A) = \mathrm{Entropy}(S) - \sum_{v \in \mathrm{Values}(A)} \frac{|S_v|}{|S|} \mathrm{Entropy}(S_v)$$

• The expected reduction in entropy achieved by splitting the training examples
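A minimal Python sketch of the gain formula, working directly from class counts. The names `entropy` and `gain` and the count-based interface are illustrative assumptions. The example counts are those of the Wind attribute in the PlayTennis data: S = [9+, 5-], split into [6+, 2-] and [3+, 3-].

```python
from math import log2


def entropy(counts):
    """Entropy in bits of a collection, given its per-class example counts."""
    total = sum(counts)
    return -sum((c / total) * log2(c / total) for c in counts if c > 0)


def gain(parent_counts, subset_counts):
    """Information gain from the class counts of S and of each subset S_v."""
    total = sum(parent_counts)
    # Weighted average entropy of the subsets, with weights |S_v| / |S|.
    remainder = sum(sum(sv) / total * entropy(sv) for sv in subset_counts)
    return entropy(parent_counts) - remainder


print(round(gain([9, 5], [[6, 2], [3, 3]]), 3))  # 0.048
```

The subtracted term is the entropy expected after the split, so a larger gain means the attribute leaves purer subsets.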

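Page 12

Example

$$\mathrm{Values}(\mathrm{Wind}) = \{\mathrm{Weak}, \mathrm{Strong}\}$$

$$S = [9+, 5-], \quad S_{\mathrm{Weak}} = [6+, 2-], \quad S_{\mathrm{Strong}} = [3+, 3-]$$

$$\begin{aligned}
\mathrm{Gain}(S, \mathrm{Wind}) &= \mathrm{Entropy}(S) - \sum_{v \in \{\mathrm{Weak}, \mathrm{Strong}\}} \frac{|S_v|}{|S|} \mathrm{Entropy}(S_v) \\
&= \mathrm{Entropy}(S) - \frac{8}{14} \mathrm{Entropy}(S_{\mathrm{Weak}}) - \frac{6}{14} \mathrm{Entropy}(S_{\mathrm{Strong}}) \\
&= 0.940 - \frac{8}{14} \times 0.811 - \frac{6}{14} \times 1.00 \\
&= 0.048
\end{aligned}$$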

Page 13

Computing Information Gain

Humidity: S = [9+, 5-], E = 0.940
  High: S = [3+, 4-], E = 0.985
  Normal: S = [6+, 1-], E = 0.592

Wind: S = [9+, 5-], E = 0.940
  Weak: S = [6+, 2-], E = 0.811
  Strong: S = [3+, 3-], E = 1.00

$$\mathrm{Gain}(S, \mathrm{Humidity}) = 0.940 - \frac{7}{14} \times 0.985 - \frac{7}{14} \times 0.592 = 0.151$$

$$\mathrm{Gain}(S, \mathrm{Wind}) = 0.940 - \frac{8}{14} \times 0.811 - \frac{6}{14} \times 1.00 = 0.048$$

Page 14

Which attribute is the best classifier?

• Information gain

$$\mathrm{Gain}(S, \mathrm{Outlook}) = 0.246$$
$$\mathrm{Gain}(S, \mathrm{Humidity}) = 0.151$$
$$\mathrm{Gain}(S, \mathrm{Wind}) = 0.048$$
$$\mathrm{Gain}(S, \mathrm{Temperature}) = 0.029$$
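The four gains can be recomputed from the 14 training examples with the sketch below. The tuple layout of `DATA` and the helper names are assumptions made for illustration; the rows are copied from the training data on Page 8.

```python
from collections import Counter
from math import log2

ATTRIBUTES = ["Outlook", "Temperature", "Humidity", "Wind"]
# Rows D1..D14 of the PlayTennis training data; the last field is the class.
DATA = [
    ("Sunny", "Hot", "High", "Weak", "No"),
    ("Sunny", "Hot", "High", "Strong", "No"),
    ("Overcast", "Hot", "High", "Weak", "Yes"),
    ("Rain", "Mild", "High", "Weak", "Yes"),
    ("Rain", "Cool", "Normal", "Weak", "Yes"),
    ("Rain", "Cool", "Normal", "Strong", "No"),
    ("Overcast", "Cool", "Normal", "Strong", "Yes"),
    ("Sunny", "Mild", "High", "Weak", "No"),
    ("Sunny", "Cool", "Normal", "Weak", "Yes"),
    ("Rain", "Mild", "Normal", "Weak", "Yes"),
    ("Sunny", "Mild", "Normal", "Strong", "Yes"),
    ("Overcast", "Mild", "High", "Strong", "Yes"),
    ("Overcast", "Hot", "Normal", "Weak", "Yes"),
    ("Rain", "Mild", "High", "Strong", "No"),
]


def entropy(rows):
    """Entropy in bits of the class labels (last field) of the given rows."""
    counts = Counter(row[-1] for row in rows)
    return -sum(c / len(rows) * log2(c / len(rows)) for c in counts.values())


def gain(rows, column):
    """Information gain of splitting the rows on the attribute in this column."""
    remainder = 0.0
    for value in {row[column] for row in rows}:
        subset = [row for row in rows if row[column] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return entropy(rows) - remainder


gains = {name: gain(DATA, i) for i, name in enumerate(ATTRIBUTES)}
best = max(gains, key=gains.get)
print(best)  # Outlook
```

The computed values agree with the figures on the slide to within about 0.001; the small differences come from rounding of intermediate entropies. Outlook has the largest gain, so it is chosen as the root test.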

Page 15

Splitting training data with Outlook

Outlook: {D1,D2,…,D14} [9+,5-]
  Sunny: {D1,D2,D8,D9,D11} [2+,3-] → ?
  Overcast: {D3,D7,D12,D13} [4+,0-] → Yes
  Rain: {D4,D5,D6,D10,D14} [3+,2-] → ?
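The two "?" branches are filled in by repeating the same attribute selection on each subset. The sketch below is one ID3-style way to do this, under stated assumptions: the tuple layout of `DATA`, the helper names, and the majority-class fallback when no attributes remain are illustrative choices, not details given on the slides.

```python
from collections import Counter
from math import log2

ATTRIBUTES = ["Outlook", "Temperature", "Humidity", "Wind"]
# Rows D1..D14 of the PlayTennis training data; the last field is the class.
DATA = [
    ("Sunny", "Hot", "High", "Weak", "No"),
    ("Sunny", "Hot", "High", "Strong", "No"),
    ("Overcast", "Hot", "High", "Weak", "Yes"),
    ("Rain", "Mild", "High", "Weak", "Yes"),
    ("Rain", "Cool", "Normal", "Weak", "Yes"),
    ("Rain", "Cool", "Normal", "Strong", "No"),
    ("Overcast", "Cool", "Normal", "Strong", "Yes"),
    ("Sunny", "Mild", "High", "Weak", "No"),
    ("Sunny", "Cool", "Normal", "Weak", "Yes"),
    ("Rain", "Mild", "Normal", "Weak", "Yes"),
    ("Sunny", "Mild", "Normal", "Strong", "Yes"),
    ("Overcast", "Mild", "High", "Strong", "Yes"),
    ("Overcast", "Hot", "Normal", "Weak", "Yes"),
    ("Rain", "Mild", "High", "Strong", "No"),
]


def entropy(rows):
    """Entropy in bits of the class labels (last field) of the given rows."""
    counts = Counter(row[-1] for row in rows)
    return -sum(c / len(rows) * log2(c / len(rows)) for c in counts.values())


def gain(rows, column):
    """Information gain of splitting the rows on the attribute in this column."""
    remainder = 0.0
    for value in {row[column] for row in rows}:
        subset = [row for row in rows if row[column] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return entropy(rows) - remainder


def build_tree(rows, columns):
    """Recursively split on the highest-gain attribute (an ID3-style sketch)."""
    labels = [row[-1] for row in rows]
    if len(set(labels)) == 1 or not columns:
        # Pure node, or no attribute left to test: predict the majority class.
        return Counter(labels).most_common(1)[0][0]
    best = max(columns, key=lambda c: gain(rows, c))
    remaining = [c for c in columns if c != best]
    branches = {}
    for value in sorted({row[best] for row in rows}):
        subset = [row for row in rows if row[best] == value]
        branches[value] = build_tree(subset, remaining)
    return (ATTRIBUTES[best], branches)


tree = build_tree(DATA, [0, 1, 2, 3])
print(tree[0], sorted(tree[1]))  # Outlook ['Overcast', 'Rain', 'Sunny']
```

On this data the Sunny branch is split on Humidity and the Rain branch on Wind, which reproduces the tree shown on Page 4. Because the sketch grows every branch until it is pure, it has no protection against overfitting.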

Page 16

Overfitting

• Growing each branch of the tree deeply enough to perfectly classify the training examples is not a good strategy.
  – The resulting tree may overfit the training data
• Overfitting
  – The tree can explain the training data very well but performs poorly on new data

Page 17

Alleviating the overfitting problem

• Several approaches
  – Stop growing the tree earlier
  – Post-prune the tree
• How can we evaluate the classification performance of the tree for new data?
  – The available data are separated into two sets of examples: a training set and a validation (development) set

Page 18

Validation (development) set

• Use a portion of the original training data to estimate the generalization performance.

[Figure: the data are divided into the original training set and a test set; the original training set is then divided into a training set and a validation set, and the test set is left unchanged.]
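A hold-out split of this kind might be sketched as follows. The function name, the 30% validation fraction, and the fixed random seed are illustrative assumptions; the slides do not prescribe a split ratio.

```python
import random


def split_train_validation(examples, validation_fraction=0.3, seed=0):
    """Shuffle a copy of the examples and hold out a fraction for validation."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    n_validation = int(len(shuffled) * validation_fraction)
    # The held-out examples are never shown to the tree-building step.
    return shuffled[n_validation:], shuffled[:n_validation]


days = [f"D{i}" for i in range(1, 15)]
train, validation = split_train_validation(days)
print(len(train), len(validation))  # 10 4
```

The tree is grown on the training part only, and decisions such as when to stop growing or which subtrees to prune are judged by accuracy on the validation part. The test set stays untouched until the final evaluation.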

