
Selecting the Best Decision Tree Models with SPSS 16.0 and R

Date post: 03-Jan-2020

[Map: Critical Nodes for Ordinal Target ("High"). Legend: Outside Sample Area; Node 18, Rule 1; Node 20, Rule 2; Not Critical Node, No Rules. Scale bar: 0 to 120 kilometers.]

[Map: Critical Nodes for Continuous Target ("Maximum"). Legend: Outside Sample Area; Node 16, Rule 1; Node 4, Rule 3; Node 8, Rule 2; Node 9, Rule 4; Not Critical Node, No Rules. Scale bar: 0 to 120 kilometers.]

[Poster panels shown once for each target type: Table for Selecting Best Tree Model; Decision Tree with Critical Nodes (Red Circle); Rules and Predictors Table; Risk Table.]

[Other poster panels: Gain Table for Ordinal Target; Gain Table for Continuous Target; Summary Output; Summary Output 1; Summary Output 2; R Rules and Settings; R Trees and Settings; Data Dictionary.]

Purpose: To predict a target variable using decision trees for both ordinal and continuous data. The target variable in this project is average household spending on food per year. The critical nodes and the rules developed by each tree focus on targeting the highest possible spending. C&RT was chosen as the best modelling method for both continuous and ordinal fields.

C&RT was chosen based on the table at left. Virtually all algorithms showed similar cumulative gain (%), responses, and correctly classified records. Because it showed an outstanding improvement in index (%) values with a reasonable number of tree levels, C&RT was chosen as the algorithm for ordinal data.
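The gain, response, and index figures compared above can be reproduced from per-node counts. The sketch below assumes the usual gain-table definitions for a categorical target (gain is the node's share of all target-category records, response is the target rate inside the node, and index is the node's response rate relative to the overall rate); the poster does not state its formulas, so treat these as assumptions. The `Node` class, the `gain_table` function, and the node counts are hypothetical and are not taken from the poster.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """A terminal node of a fitted tree (hypothetical structure)."""
    name: str
    n_records: int  # records that fall into this node
    n_target: int   # of those, records in the target category (e.g. "High")


def gain_table(nodes):
    """Build a gain table, sorted by index (%) from highest to lowest."""
    total_n = sum(node.n_records for node in nodes)
    total_target = sum(node.n_target for node in nodes)
    overall_rate = total_target / total_n

    rows = []
    for node in nodes:
        response = node.n_target / node.n_records
        rows.append({
            "node": node.name,
            # Share of all target-category records captured by this node.
            "gain_pct": 100 * node.n_target / total_target,
            # Target-category rate inside the node.
            "response_pct": 100 * response,
            # Node response rate relative to the overall rate.
            "index_pct": 100 * response / overall_rate,
        })

    rows.sort(key=lambda row: row["index_pct"], reverse=True)

    # Cumulative gain over the nodes, best node first.
    cumulative = 0.0
    for row in rows:
        cumulative += row["gain_pct"]
        row["cum_gain_pct"] = cumulative
    return rows


# Illustrative counts only; these are not the poster's data.
example_nodes = [
    Node("A", n_records=100, n_target=60),
    Node("B", n_records=200, n_target=40),
    Node("C", n_records=700, n_target=100),
]

for row in gain_table(example_nodes):
    print(row["node"], round(row["index_pct"], 1), round(row["cum_gain_pct"], 1))
```

Nodes with an index well above 100% concentrate the target category, which is the sense in which a node would be treated as critical here.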

Continuous Chart

The choice of algorithm for continuous data was clear. C&RT has the highest maximum index (%), classified records, and mean of all the modelling methods compared. The only concern is that the final trees from all three models are not fully satisfactory, showing only a slight improvement over the node 0 predicted values.
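For a continuous target, the comparison with node 0 can be expressed as an index of each node's mean against the root mean. The sketch below assumes index (%) is the node mean divided by the node 0 mean, times 100; the poster does not give its formula, so this is an assumption. The function name `mean_index` and all spending values are hypothetical.

```python
def mean_index(node_means, root_mean):
    """Return index (%) per node: node mean relative to the node 0 mean."""
    return {name: 100 * mean / root_mean for name, mean in node_means.items()}


# Illustrative values only; these are not the poster's data.
root_mean = 8000.0                      # predicted value at node 0
node_means = {"X": 8400.0, "Y": 7900.0}  # predicted values at two terminal nodes

for name, index in mean_index(node_means, root_mean).items():
    # An index only slightly above 100% means the node's prediction
    # improves little on the node 0 prediction.
    print(f"{name}: index {index:.2f}%, change vs node 0 {index - 100:+.2f} points")
```

An index close to 100% for every terminal node corresponds to the concern raised above: the tree's predictions barely differ from the overall mean.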

Sources: Census Canada, Canadian expenditures and projections.
Software: SPSS Modeler 16.0, IBM SPSS, ArcMap 10.2, R
Prepared by: Chen Shi w027460
Instructor: Konrad Dramowicz
Date: February, 2015

Ordinal Charts
