Decision Trees

Page 1: Decision Trees

Decision Trees

ID     Hair    Height   Weight   Lotion  Result
Sarah  Blonde  Average  Light    No      Sunburn
Dana   Blonde  Tall     Average  Yes     None
Alex   Brown   Tall     Average  Yes     None
Annie  Blonde  Short    Average  No      Sunburn
Emily  Red     Average  Heavy    No      Sunburn
Pete   Brown   Tall     Heavy    No      None
John   Brown   Average  Heavy    No      None
Katie  Blonde  Short    Light    Yes     None
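As a minimal sketch (not part of the original slides), the table above can be built with pandas and fit with an entropy-based (ID3-style) decision tree in scikit-learn; the library choice and the one-hot encoding of the categorical attributes are assumptions, not something the slides prescribe.

import pandas as pd
from sklearn.tree import DecisionTreeClassifier, export_text

# The sunburn table above, keyed by name
data = pd.DataFrame({
    "Hair":   ["Blonde", "Blonde", "Brown", "Blonde", "Red", "Brown", "Brown", "Blonde"],
    "Height": ["Average", "Tall", "Tall", "Short", "Average", "Tall", "Average", "Short"],
    "Weight": ["Light", "Average", "Average", "Average", "Heavy", "Heavy", "Heavy", "Light"],
    "Lotion": ["No", "Yes", "Yes", "No", "No", "No", "No", "Yes"],
    "Result": ["Sunburn", "None", "None", "Sunburn", "Sunburn", "None", "None", "None"],
}, index=["Sarah", "Dana", "Alex", "Annie", "Emily", "Pete", "John", "Katie"])

X = pd.get_dummies(data.drop(columns="Result"))      # one-hot encode the categorical features
y = data["Result"]

tree = DecisionTreeClassifier(criterion="entropy")   # entropy-based splits, as in ID3
tree.fit(X, y)
print(export_text(tree, feature_names=list(X.columns)))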

Page 2: Decision Trees

Ensemble methods

Use multiple models to obtain better predictive performance

Ensembles combine multiple hypotheses to form a (hopefully) better hypothesis

Combine multiple weak learners to produce a strong learner

Typically requires much more computation, since multiple learners must be trained
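As a toy illustration of combining hypotheses by voting (not from the slides; the predictions below are hypothetical), a majority vote over several weak learners can be written as:

import numpy as np

def majority_vote(predictions):
    """predictions: (n_models, n_samples) array of 0/1 class labels.
    Returns the per-sample majority class."""
    votes = np.asarray(predictions)
    return (votes.mean(axis=0) >= 0.5).astype(int)

# Three hypothetical weak learners' predictions on four samples:
preds = [[1, 0, 1, 1],
         [1, 1, 0, 1],
         [0, 0, 1, 1]]
print(majority_vote(preds))              # -> [1 0 1 1]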

Page 3: Decision Trees

Ensemble learners

Typically combine multiple fast learners (like decision trees)

Tend to overfit

Tend to get better results, since significant diversity is deliberately introduced among the models

Diversity does not mean reduced performance

Note that empirical studies have shown that random forests do better than an ensemble of decision trees

A random forest is an ensemble of decision trees that do not simply minimize entropy to choose tree nodes; each tree considers a random subset of the features at each split.
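The empirical claim above can be checked with a quick experiment. This is a hedged sketch, not from the slides: it compares scikit-learn's RandomForestClassifier against a bagged ensemble of entropy-reducing decision trees on a synthetic data set (the data set and hyperparameters are arbitrary assumptions).

from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, BaggingClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=500, n_features=20, random_state=0)

# Bagged entropy-reducing decision trees vs. a random forest
bagged_trees = BaggingClassifier(DecisionTreeClassifier(criterion="entropy"),
                                 n_estimators=100, random_state=0)
forest = RandomForestClassifier(n_estimators=100, random_state=0)

print("bagged trees :", cross_val_score(bagged_trees, X, y, cv=5).mean())
print("random forest:", cross_val_score(forest, X, y, cv=5).mean())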

Page 4: Decision Trees

The Bayes optimal classifier is an ensemble learner: it combines every hypothesis in the hypothesis space, weighting each one's vote by its posterior probability given the training data

Page 5: Decision Trees

Bagging: Bootstrap aggregating

Each model in the ensemble votes with equal weight

Train each model with a random training set

Random forests do better than bagged entropy-reducing DTs

Page 6: Decision Trees

Bootstrap estimation

Repeatedly draw n samples from D

For each set of samples, estimate a statistic

The bootstrap estimate is the mean of the individual estimates

Used to estimate a statistic (parameter) and its variance
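A minimal sketch of bootstrap estimation (not from the slides), assuming a hypothetical data set D and using numpy only; the statistic estimated here is the mean.

import numpy as np

rng = np.random.default_rng(0)
D = rng.normal(loc=5.0, scale=2.0, size=100)             # hypothetical data set D

B = 1000                                                 # number of bootstrap resamples
estimates = np.empty(B)
for b in range(B):
    sample = rng.choice(D, size=len(D), replace=True)    # draw n samples from D with replacement
    estimates[b] = sample.mean()                         # the statistic on this resample

print("bootstrap estimate of the mean:", estimates.mean())
print("bootstrap variance of that estimate:", estimates.var())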

Page 7: Decision Trees

Bagging

For i = 1 .. M:
    Draw n* < n samples from D with replacement
    Learn classifier Ci

Final classifier is a vote of C1 .. CM

Increases classifier stability/reduces variance
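A rough sketch of the bagging loop just described (not from the slides); the data set, M, and n* values are arbitrary assumptions, and decision trees stand in as the base classifiers.

import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=300, n_features=10, random_state=0)
rng = np.random.default_rng(0)

M, n_star = 25, 200                      # M classifiers, each trained on n* < n samples
classifiers = []
for i in range(M):
    idx = rng.choice(len(X), size=n_star, replace=True)   # draw with replacement
    classifiers.append(DecisionTreeClassifier().fit(X[idx], y[idx]))

# Final classifier: majority vote of C1 .. CM
votes = np.array([clf.predict(X) for clf in classifiers])
bagged_pred = (votes.mean(axis=0) >= 0.5).astype(int)
print("training accuracy of the bagged vote:", (bagged_pred == y).mean())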

Page 8: Decision Trees

Boosting

Incremental

Build new models that try to do better on the previous model's misclassifications

Can get better accuracy

Tends to overfit

AdaBoost is the canonical boosting algorithm

Page 9: Decision Trees

Boosting (Schapire 1989)

Randomly select n1 < n samples from D without replacement to obtain D1; train weak learner C1

Select n2 < n samples from D, with half of the samples misclassified by C1, to obtain D2; train weak learner C2

Select all samples from D that C1 and C2 disagree on; train weak learner C3

Final classifier is a vote of the weak learners
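An illustrative sketch of this three-classifier scheme (not from the slides), with decision stumps as the weak learners and a synthetic data set; the sample sizes for D1 and D2 are arbitrary assumptions.

import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=600, n_features=10, random_state=0)
rng = np.random.default_rng(0)

# D1: n1 < n samples drawn without replacement -> train C1
idx1 = rng.choice(len(X), size=200, replace=False)
C1 = DecisionTreeClassifier(max_depth=1).fit(X[idx1], y[idx1])

# D2: half misclassified by C1, half classified correctly -> train C2
wrong = np.where(C1.predict(X) != y)[0]
right = np.where(C1.predict(X) == y)[0]
k = min(len(wrong), len(right), 100)
idx2 = np.concatenate([rng.choice(wrong, size=k, replace=False),
                       rng.choice(right, size=k, replace=False)])
C2 = DecisionTreeClassifier(max_depth=1).fit(X[idx2], y[idx2])

# D3: all samples on which C1 and C2 disagree -> train C3
idx3 = np.where(C1.predict(X) != C2.predict(X))[0]
if len(idx3) == 0:                       # degenerate case: C1 and C2 agree everywhere
    idx3 = np.arange(len(X))
C3 = DecisionTreeClassifier(max_depth=1).fit(X[idx3], y[idx3])

# Final classifier: majority vote of the three weak learners
votes = np.array([C.predict(X) for C in (C1, C2, C3)])
pred = (votes.mean(axis=0) >= 0.5).astype(int)
print("training accuracy of the vote:", (pred == y).mean())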

Page 10: Decision Trees

AdaBoost

Learner = Hypothesis = Classifier

Weak Learner: < 50% error over any distribution

Strong Classifier: thresholded linear combination of weak learner outputs

Page 11: Decision Trees

Discrete AdaBoost
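The original slide shows the algorithm as a figure; the following is a hedged from-scratch sketch of Discrete AdaBoost with decision stumps, using the -1/+1 label convention. The synthetic data set and number of rounds are assumptions.

import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

X, y01 = make_classification(n_samples=400, n_features=10, random_state=0)
y = 2 * y01 - 1                                      # relabel the classes as -1 / +1
n, M = len(X), 50

w = np.full(n, 1.0 / n)                              # start with uniform sample weights
alphas, stumps = [], []
for m in range(M):
    stump = DecisionTreeClassifier(max_depth=1).fit(X, y, sample_weight=w)
    pred = stump.predict(X)
    err = np.sum(w * (pred != y)) / np.sum(w)        # weighted training error
    err = np.clip(err, 1e-10, 1 - 1e-10)             # guard against division by zero
    alpha = 0.5 * np.log((1 - err) / err)            # weight given to this weak learner
    w *= np.exp(-alpha * y * pred)                   # up-weight misclassified samples
    w /= w.sum()
    alphas.append(alpha)
    stumps.append(stump)

# Strong classifier: thresholded (sign of a) linear combination of weak learner outputs
F = sum(a * s.predict(X) for a, s in zip(alphas, stumps))
print("training accuracy:", (np.sign(F) == y).mean())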

Page 12: Decision Trees

Real AdaBoost
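This slide is also a figure; below is a hedged sketch of Real AdaBoost, where each round contributes a real-valued score derived from the weak learner's class-probability estimate rather than a hard -1/+1 vote. Same assumptions as above (synthetic data, decision stumps).

import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

X, y01 = make_classification(n_samples=400, n_features=10, random_state=0)
y = 2 * y01 - 1                                  # labels in {-1, +1}
n, M = len(X), 50

w = np.full(n, 1.0 / n)
scores = np.zeros(n)
for m in range(M):
    stump = DecisionTreeClassifier(max_depth=1).fit(X, y, sample_weight=w)
    p = stump.predict_proba(X)[:, 1]             # estimated P(y = +1 | x)
    p = np.clip(p, 1e-10, 1 - 1e-10)
    f = 0.5 * np.log(p / (1 - p))                # real-valued contribution of this round
    w *= np.exp(-y * f)                          # reweight the samples
    w /= w.sum()
    scores += f

# Strong classifier: sign of the summed real-valued scores
print("training accuracy:", (np.sign(scores) == y).mean())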

Page 13: Decision Trees

Comparison

