BOF Trees Visualization, Zagreb, June 12, 2004

“BOF” Trees Diagram as a Visual Way to Improve Interpretability of Tree Ensembles

Vesna Luzar-Stiffler, Ph.D.
University Computing Centre, and CAIR Research Centre, Zagreb, Croatia

Charles Stiffler, Ph.D.
CAIR Research Centre, Zagreb, Croatia

[email protected], [email protected]

Outline

Introduction/Background
Trees
Ensemble Trees
Visualization Tools
Simulation Results
Web Survey Results
Conclusions/Recommendations

Introduction / Background

Classification / Decision Trees
  Data mining (statistical learning) method for classification
  Invented twice:
    Statistical community: Breiman, Friedman et al. (1984)
    Machine learning community: Quinlan (1986)
  Many positive features:
    Interpretability, ability to handle data of mixed type and missing values, robustness to outliers, etc.
  Disadvantages:
    Unstable vis-à-vis seemingly minor data perturbations
    Low predictive power

Introduction / Background

Possible improvements: Ensembles
  Bagging, i.e., bootstrapping trees (Breiman, 1996)
  Boosting, e.g., AdaBoost (Freund & Schapire, 1997)
  Random Forests (Breiman, 2001)
  Stacking, randomized trees, etc.

Advantage: improved prediction

Disadvantage: loss of interpretability (“black box”)

Classification Tree

Let $\hat{f}(x)$ be the classification tree prediction at input $x$ obtained from the full “training” data $Z = \{(x_1, y_1), (x_2, y_2), \ldots, (x_N, y_N)\}$.

Bagging Classification Tree

Let $\hat{f}^{*b}(x)$ be the classification tree prediction at input $x$ obtained from the bootstrap sample $Z^{*b}$, $b = 1, 2, \ldots, B$.

Bagging estimate:

$$\hat{f}_{\mathrm{bag}}(x) = \frac{1}{B} \sum_{b=1}^{B} \hat{f}^{*b}(x)$$
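As a minimal sketch of the bagging estimate, the following pure-Python code draws B bootstrap samples, fits a base classifier to each, and averages the predictions. The base classifier here is a one-split stump, used only as a stand-in for a full classification tree. The names `fit_stump` and `bagged_predict`, the default B = 25, and the 0/1 class coding are assumptions made for illustration and do not come from the slides.

```python
import random


def fit_stump(X, y):
    """Fit a one-split tree (stump): choose the (variable, threshold) pair
    with the fewest training misclassifications. Classes are coded 0/1."""
    n, p = len(X), len(X[0])
    best = None
    for j in range(p):
        for t in sorted({row[j] for row in X}):
            left = [y[i] for i in range(n) if X[i][j] <= t]
            right = [y[i] for i in range(n) if X[i][j] > t]
            # Majority class on each side (ties go to class 1).
            pl = int(2 * sum(left) >= len(left))
            pr = int(2 * sum(right) >= len(right)) if right else pl
            errors = sum(v != pl for v in left) + sum(v != pr for v in right)
            if best is None or errors < best[0]:
                best = (errors, j, t, pl, pr)
    _, j, t, pl, pr = best
    return lambda x: pl if x[j] <= t else pr


def bagged_predict(X, y, x_new, B=25, seed=0):
    """Average of the B bootstrap predictions at x_new (share of votes for class 1)."""
    rng = random.Random(seed)
    n = len(X)
    votes = []
    for _ in range(B):
        idx = [rng.randrange(n) for _ in range(n)]  # bootstrap sample Z*b
        f_b = fit_stump([X[i] for i in idx], [y[i] for i in idx])
        votes.append(f_b(x_new))
    return sum(votes) / B  # (1/B) * sum over b of f*b(x_new)
```

The returned value lies between 0 and 1; thresholding it at 0.5 gives a majority-vote class label.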

Visualization tools

Graphs based on predictor “importances”
  (B×p) matrix F (p = # of predictors)
  For bagged trees, we take the average:

  $$\hat{I}_k^2 = \frac{1}{B} \sum_{b=1}^{B} \hat{I}_k^2(T_b)$$

  Diagram 1 is the importance mean bar chart
  Diagram 2 (“BOF Clusters”) is the cluster means chart (NEW)
  Diagram 3 (“BOF MDPREF”) is the multidimensional preference bi-plot (NEW)
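The averaging step behind Diagram 1 can be sketched as a column mean over the B×p matrix F. The numbers below are hypothetical, and the assumption that each row of F holds one bootstrap tree's squared importances for the p predictors is an interpretation of the slide, not something it states explicitly.

```python
def mean_importance(F):
    """Column means of the B x p importance matrix F (one row per bootstrap tree)."""
    B = len(F)
    p = len(F[0])
    return [sum(F[b][k] for b in range(B)) / B for k in range(p)]


# Hypothetical importances for B = 3 bootstrap trees and p = 4 predictors.
F = [
    [0.50, 0.25, 0.00, 0.25],
    [0.75, 0.00, 0.25, 0.00],
    [0.25, 0.50, 0.00, 0.25],
]

I_bar = mean_importance(F)
# Predictor indices ordered from most to least important, as in a bar chart.
ranking = sorted(range(len(I_bar)), key=lambda k: I_bar[k], reverse=True)
```

The individual rows of F are kept because they carry the tree-to-tree variation that the mean hides.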

Visualization tools

Graphs based on proximity
  (n×n) matrix P (n = # of cases)
  Diagram 4 (“Proximity Clusters”) is the cluster means chart (Breiman, 2002)
  Diagram 5 (“Proximity MDS”) is the multidimensional scaling plot of “similar” cases (Breiman, 2002)
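One way to build the proximity matrix P is sketched below. It assumes the usual random-forest definition, in which the proximity of two cases is the share of trees that place them in the same terminal node; the slides do not spell the definition out. The input layout `leaves[b][i]` and the function name are hypothetical.

```python
def proximity_matrix(leaves):
    """leaves[b][i] is the terminal-node id of case i in tree b.
    Returns the n x n matrix P where P[i][j] is the share of trees
    in which cases i and j fall in the same terminal node."""
    B = len(leaves)
    n = len(leaves[0])
    counts = [[0] * n for _ in range(n)]
    for tree in leaves:
        for i in range(n):
            for j in range(n):
                if tree[i] == tree[j]:
                    counts[i][j] += 1
    return [[c / B for c in row] for row in counts]


# Hypothetical terminal-node assignments: B = 4 trees, n = 4 cases.
leaves = [
    [0, 0, 1, 1],
    [0, 0, 0, 1],
    [0, 1, 1, 1],
    [0, 0, 1, 1],
]
P = proximity_matrix(leaves)
# Dissimilarities 1 - P can be passed to a clustering or MDS routine.
D = [[1.0 - v for v in row] for row in P]
```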

Simulation experiments

S1: Generate a sample of size n = 30, two classes, and p = 5 variables (x1-x5), with a standard normal distribution and pair-wise correlation 0.95. The responses are generated according to Pr(Y=1 | x1 ≤ 0.5) = 0.2, Pr(Y=1 | x1 > 0.5) = 0.8.

S2: Generate a sample of size n = 30, two classes, and p = 5 variables (x1-x5), with a standard normal distribution and pair-wise correlation 0.95 between x1 and x2, and 0 among other predictors. The responses are generated according to Pr(Y=1 | x1 ≤ 0.5) = 0.2, Pr(Y=1 | x1 > 0.5) = 0.8.
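The two designs can be reproduced along the following lines. This is a sketch under stated assumptions: the correlation is induced with a shared standard-normal factor (one common construction, not necessarily the one the authors used), and the function name, the seed, and the 0/1 response coding are illustrative.

```python
import math
import random


def simulate(scenario, n=30, p=5, rho=0.95, seed=0):
    """Generate (X, y) for scenario "S1" or "S2".
    S1: all p predictors have pair-wise correlation rho.
    S2: only x1 and x2 are correlated (rho); the rest are independent."""
    if scenario not in ("S1", "S2"):
        raise ValueError("scenario must be 'S1' or 'S2'")
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        z0 = rng.gauss(0.0, 1.0)  # shared factor that induces the correlation
        row = []
        for j in range(p):
            zj = rng.gauss(0.0, 1.0)
            if scenario == "S1" or j < 2:
                # Unit variance, correlation rho with other "shared" variables.
                row.append(math.sqrt(rho) * z0 + math.sqrt(1.0 - rho) * zj)
            else:
                row.append(zj)
        prob = 0.2 if row[0] <= 0.5 else 0.8  # Pr(Y = 1 | x1)
        X.append(row)
        y.append(1 if rng.random() < prob else 0)
    return X, y
```

Calling `simulate("S1")` and `simulate("S2")` with the defaults gives samples of the sizes described above; a larger `n` is useful for checking that the empirical correlations match the design.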

Diagram 1, Mean importance (panels: S1, S2)

Diagram 2, “BOF Clusters” (panels: S1, S2)

Diagram 3, “BOF MDPREF” (panels: S1, S2)

Diagram 4, “Proximity Clusters” (panels: S1, S2)

Web Survey data

ICT infrastructure/usage in Croatian primary and secondary schools
  25,000+ teachers (cases)
  200+ variables
  Response: “classroom use of a computer by educators” (yes/no)
  Partition: 50% training, 25% validation, 25% test
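A 50% / 25% / 25% partition of this kind can be sketched as a single random shuffle of the case indices. The slides do not say how the split was drawn, so simple random assignment, the seed, and the function name are assumptions.

```python
import random


def partition(n_cases, seed=0):
    """Randomly split case indices into 50% training, 25% validation, 25% test."""
    idx = list(range(n_cases))
    random.Random(seed).shuffle(idx)
    n_train = n_cases // 2
    n_valid = n_cases // 4
    train = idx[:n_train]
    valid = idx[n_train:n_train + n_valid]
    test = idx[n_train + n_valid:]  # remainder, so no case is dropped
    return train, valid, test


train, valid, test = partition(25000)
```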

Initial tree (before bagging)

Diagram 1, “Mean importance”

Diagram 2, “BOF Clusters”

Diagram 3, “BOF MDPREF”

Bootstrap tree 11

Bootstrap tree 22

Bootstrap tree 12

Clustering trees

Diagram 5, “Proximity MDS”

Conclusions / Recommendations

There is software for trees

There is some software for tree ensembles

There are some visualization tools (old and new)

The problem is that they are not “interfaced” (integrated)

