Decision Tree

Machine learning - Decision tree learning

Decision tree learning uses a decision tree as a predictive model which maps observations about an item to conclusions about the item's target value.
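
As a minimal sketch of that idea (using scikit-learn; the toy weather data below is invented for illustration), a tree can be fit to observed items and then asked for the target value of a new item:

    # Decision tree as a predictive model (toy data; scikit-learn assumed).
    from sklearn.tree import DecisionTreeClassifier

    X = [[30, 85], [27, 90], [22, 70], [18, 65], [25, 80]]  # observations: [temp, humidity]
    y = [0, 0, 1, 1, 0]                                     # target value: 1 = play outside

    model = DecisionTreeClassifier(max_depth=2).fit(X, y)
    print(model.predict([[21, 68]]))  # maps a new observation to a target value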


Alternating decision tree - History

However, the algorithm as presented had several typographical errors. Clarifications and optimizations were later presented by Bernhard Pfahringer, Geoffrey Holmes and Richard Kirkby (Optimizing the Induction of Alternating Decision Trees, Proceedings of the Fifth Pacific-Asia Conference on Advances in Knowledge Discovery and Data Mining, 2001, pp. 477-487). Implementations are available in Weka and JBoost.


Alternating decision tree - Motivation

Boosting algorithms typically use either decision stumps or decision trees as weak hypotheses. As an example, boosting decision stumps creates a set of T weighted decision stumps, where T is the number of boosting iterations.


Boosting a simple learner results in an unstructured set of T hypotheses, making it difficult to infer correlations between attributes. Alternating decision trees introduce structure to the set of hypotheses by requiring that they build off a hypothesis that was produced in an earlier iteration. The resulting set of hypotheses can be visualized in a tree based on the relationship between a hypothesis and its parent.


Alternating decision tree - Alternating decision tree structure

An alternating decision tree consists of decision nodes and prediction nodes.
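
A rough sketch of those two node types and the usual sign-of-summed-contributions classification rule (illustrative names only, not the paper's implementation):

    # Sketch of alternating decision tree node types. Decision nodes hold a
    # predicate; prediction nodes hold a real-valued contribution. An instance
    # is classified by the sign of the contributions summed along all paths
    # whose predicates it satisfies.
    from dataclasses import dataclass, field
    from typing import Callable, List

    @dataclass
    class DecisionNode:
        predicate: Callable[[dict], bool]   # e.g. lambda x: x["a"] < 4.5
        yes: "PredictionNode"
        no: "PredictionNode"

    @dataclass
    class PredictionNode:
        value: float                        # real-valued contribution
        children: List[DecisionNode] = field(default_factory=list)

    def score(node: PredictionNode, x: dict) -> float:
        """Sum the contributions of every prediction node reached by x."""
        total = node.value
        for d in node.children:
            branch = d.yes if d.predicate(x) else d.no
            total += score(branch, x)
        return total

    root = PredictionNode(0.5, [DecisionNode(lambda x: x["a"] < 4.5,
                                             PredictionNode(-0.7),
                                             PredictionNode(+0.2))])
    print(1 if score(root, {"a": 3.0}) >= 0 else -1)  # classification = sign of score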


Alternating decision tree - Empirical results

Figure 6 in the original paper demonstrates that ADTrees are typically as robust as boosted decision trees and boosted decision stumps. Typically, equivalent accuracy can be achieved with a much simpler tree structure than with recursive partitioning algorithms.


Gene expression programming - Decision trees

Decision trees (DT) are classification models in which a series of questions and answers is mapped using nodes and directed edges.


Decision trees have three types of nodes: a root node, internal nodes, and leaf or terminal nodes. The root node and all internal nodes represent test conditions for different attributes or variables in a dataset. Leaf nodes specify the class label for all the different paths in the tree.
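
Sketched as data structures (hypothetical names, one possible encoding):

    # Sketch of the three node types: a test node serves as root or internal
    # node; a leaf carries the class label for its path.
    from dataclasses import dataclass, field
    from typing import Dict, Union

    @dataclass
    class Leaf:
        label: str                          # class label for this path

    @dataclass
    class TestNode:                         # root or internal node
        attribute: str                      # attribute under test
        children: Dict[str, Union["TestNode", Leaf]] = field(default_factory=dict)

    # Root node testing "outlook"; each edge value leads to a subtree or leaf.
    root = TestNode("outlook", {"sunny": Leaf("no"), "overcast": Leaf("yes")})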


Most decision tree induction algorithms involve selecting an attribute for the root node and then making the same kind of informed decision about all the other nodes in the tree.
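
A compressed sketch of that greedy, recursive scheme; the attribute-scoring function is deliberately left abstract, and all names are hypothetical:

    # Sketch of top-down induction: pick the best-scoring attribute for the
    # current node, split the rows on its values, and recurse.
    def induce(rows, attributes, score):
        labels = [r["label"] for r in rows]
        if len(set(labels)) == 1 or not attributes:
            return max(set(labels), key=labels.count)      # leaf: majority label
        best = max(attributes, key=lambda a: score(rows, a))
        tree = {best: {}}
        for value in {r[best] for r in rows}:
            subset = [r for r in rows if r[best] == value]
            rest = [a for a in attributes if a != best]
            tree[best][value] = induce(subset, rest, score)
        return tree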


Decision trees can also be created by gene expression programming, with the advantage that all the decisions concerning the growth of the tree are made by the algorithm itself without any kind of human input.


This aspect of decision tree induction also carries over to gene expression programming, and there are two GEP algorithms for decision tree induction: the evolvable decision trees (EDT) algorithm, for dealing exclusively with nominal attributes, and EDT-RNC (EDT with random numerical constants), for handling both nominal and numeric attributes.


In the decision trees induced by gene expression programming, the attributes behave as function nodes in the basic gene expression algorithm, whereas the class labels behave as terminals.


This again ensures that all decision trees designed by GEP are always valid programs.


For example, consider a decision tree that decides whether to play outside.


Then the chromosomes are expressed as decision trees and their fitness is evaluated against a training dataset.


Decision trees with both nominal and numeric attributes are also easily induced with gene expression programming, using the GEP-RNC framework described above for dealing with random numerical constants. The chromosomal architecture includes an extra domain (Dc) for encoding random numerical constants, which are used as thresholds for splitting the data at each branching node. For example, in a gene with a head size of 5, the Dc starts at position 16.


These random numerical constants are encoded in the Dc domain and their expression follows a very simple scheme: from top to bottom and from left to right, the elements in Dc are assigned one by one to the elements in the decision tree.


The resulting expression can also be represented as a conventional decision tree.


Decision tree learning

'Decision tree learning' uses a decision tree as a predictive model which maps observations about an item to conclusions about the item's target value.


In decision analysis, a decision tree can be used to visually and explicitly represent decisions and decision making. In data mining, a decision tree describes data but not decisions; rather, the resulting classification tree can be an input for decision making. This page deals with decision trees in data mining.


Decision tree learning - General

Decision tree learning is a method commonly used in data mining. The goal is to create a model that predicts the value of a target variable based on several input variables. Each interior node corresponds to one of the input variables; there are edges to children for each of the possible values of that input variable. Each leaf represents a value of the target variable given the values of the input variables represented by the path from the root to the leaf.
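
Prediction under this representation is just a walk from the root to a leaf; a sketch with a hypothetical nested-dict encoding:

    # Sketch: classify by following, at each interior node, the edge that
    # matches the item's value for that input variable until a leaf is reached.
    def predict(node, item):
        while isinstance(node, dict):       # interior node: {variable: {value: child}}
            variable, edges = next(iter(node.items()))
            node = edges[item[variable]]
        return node                         # leaf holds the target value

    tree = {"outlook": {"sunny": {"humidity": {"high": "no", "normal": "yes"}},
                        "overcast": "yes"}}
    print(predict(tree, {"outlook": "sunny", "humidity": "normal"}))  # -> "yes"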


A decision tree is a simple representation for classifying examples. Decision tree learning is one of the most successful techniques for supervised classification learning. For this section, assume that all of the features have finite discrete domains, and there is a single target feature called the classification. Each element of the domain of the classification is called a class.


A decision tree or a classification tree is a tree in which each internal (non-leaf) node is labeled with an input feature. The arcs coming from a node labeled with a feature are labeled with each of the possible values of the feature. Each leaf of the tree is labeled with a class or a probability distribution over the classes.


This process of top-down induction of decision trees (Quinlan, J. R. Induction of Decision Trees. Machine Learning 1: 81-106, Kluwer Academic Publishers) is an example of a greedy algorithm, and it is by far the most common strategy for learning decision trees from data.


In data mining, decision trees can also be described as the combination of mathematical and computational techniques to aid the description, categorisation and generalisation of a given set of data.


Decision tree learning - Types

Decision trees used in data mining are of two main types: classification trees, in which the predicted outcome is the class to which the data belongs, and regression trees, in which the predicted outcome can be considered a real number.


Some techniques, often called ensemble methods, construct more than one decision tree:


* 'Bagging' (bootstrap aggregating) decision trees, an early ensemble method, builds multiple decision trees by repeatedly resampling training data with replacement, and voting the trees for a consensus prediction (Breiman, L. (1996). Bagging Predictors. Machine Learning, 24: pp. 123-140). A short sketch of this procedure follows the list.


* A 'Random Forest' classifier uses a number of decision trees, in order to improve the classification rate.


* 'Rotation forest', in which every decision tree is trained by first applying principal component analysis (PCA) on a random subset of the input features (Rodriguez, J.J., Kuncheva, L.I. and Alonso, C.J. (2006), Rotation forest: A new classifier ensemble method, IEEE Transactions on Pattern Analysis and Machine Intelligence, 28(10):1619-1630).
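
A small sketch of the bagging procedure referenced above (bootstrap resampling plus majority vote; scikit-learn trees assumed, helper names invented):

    # Sketch of bagging: train trees on bootstrap resamples (sampling the
    # training data with replacement), then vote for a consensus prediction.
    import random
    from collections import Counter
    from sklearn.tree import DecisionTreeClassifier

    def bagged_trees(X, y, n_trees=25):
        trees, n = [], len(X)
        for _ in range(n_trees):
            idx = [random.randrange(n) for _ in range(n)]   # resample with replacement
            trees.append(DecisionTreeClassifier().fit([X[i] for i in idx],
                                                      [y[i] for i in idx]))
        return trees

    def vote(trees, x):
        preds = [t.predict([x])[0] for t in trees]
        return Counter(preds).most_common(1)[0][0]          # consensus prediction

scikit-learn also ships a ready-made BaggingClassifier implementing this idea.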


'Decision tree learning' is the construction of a decision tree from class-labeled training tuples. A decision tree is a flow-chart-like structure, where each internal (non-leaf) node denotes a test on an attribute, each branch represents the outcome of a test, and each leaf (or terminal) node holds a class label. The topmost node in a tree is the root node.


* MARS (multivariate adaptive regression splines): extends decision trees to better handle numerical data.


ID3 and CART were invented independently at around the same time (between 1970 and 1980), yet follow a similar approach for learning a decision tree from training tuples.


Decision tree learning - Formulae

Algorithms for constructing decision trees usually work top-down, by choosing a variable at each step that best splits the set of items. Different algorithms use different metrics for measuring "best". These generally measure the homogeneity of the target variable within the subsets. Some examples are given below. These metrics are applied to each candidate subset, and the resulting values are combined (e.g., averaged) to provide a measure of the quality of the split.
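
That combine step is typically a size-weighted average of a per-subset impurity metric; a minimal sketch with the metric left abstract (names hypothetical):

    # Sketch: score a candidate split by applying an impurity metric to each
    # resulting subset and combining the values, weighted by subset size.
    def split_quality(subsets, impurity):
        total = sum(len(s) for s in subsets)
        return sum(len(s) / total * impurity(s) for s in subsets)  # lower is better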


Decision tree learning - Information gain

Used by the ID3, C4.5 and C5.0 tree-generation algorithms. Information gain is based on the concept of entropy from information theory.
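
As a standard textbook sketch (not any particular library's API), entropy and the information gain of a split can be computed as:

    # Information gain = entropy of the parent labels minus the size-weighted
    # entropy of the child subsets' labels.
    from collections import Counter
    from math import log2

    def entropy(labels):
        n = len(labels)
        return -sum(c / n * log2(c / n) for c in Counter(labels).values())

    def information_gain(parent, children):
        n = len(parent)
        return entropy(parent) - sum(len(c) / n * entropy(c) for c in children)

    print(information_gain(["y", "y", "n", "n"], [["y", "y"], ["n", "n"]]))  # 1.0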


Decision tree learning - Decision tree advantages

Amongst other data mining methods, decision trees have various advantages:


* 'Simple to understand and interpret.' People are able to understand decision tree models after a brief explanation.


Decision tree learning - Limitations

Such algorithms cannot guarantee to return the globally optimal decision tree.


* Decision-tree learners can create over-complex trees that do not generalise well from the training data. (This is known as overfitting.) Mechanisms such as pruning are necessary to avoid this problem.


* There are concepts that are hard to learn because decision trees do not express them easily, such as XOR, parity or multiplexer problems. In such cases, the decision tree becomes prohibitively large. Approaches to solve the problem involve either changing the representation of the problem domain (known as propositionalisation) or using learning algorithms based on more expressive representations (such as statistical relational learning or inductive logic programming).


* For data including categorical variables with different numbers of levels, information gain in decision trees is biased in favor of those attributes with more levels.


Decision tree learning - Decision graphs

In a decision tree, all paths from the root node to the leaf node proceed by way of conjunction, or AND.


In general, decision graphs infer models with fewer leaves than decision trees.


Decision tree learning - Alternative search methods

Evolutionary techniques have also been used to search for decision trees; see, e.g., Breeding Decision Trees Using Evolutionary Techniques, Proceedings of the Eighteenth International Conference on Machine Learning, pp. 393-400, June 28-July 1, 2001; and Barros, Rodrigo C., Basgalupp, M.


Decision trees

A 'decision tree' is a decision support tool that uses a tree-like graph or model of decisions and their possible consequences, including chance event outcomes, resource costs, and utility. It is one way to display an algorithm.


Decision trees are commonly used in operations research, specifically in decision analysis, to help identify a strategy most likely to reach a goal.


Decision trees - Overview

A decision tree is a flowchart-like structure in which each internal node represents a test on an attribute, each branch represents the outcome of the test, and each leaf node represents a class label (the decision taken after computing all attributes). A path from root to leaf represents classification rules.


In decision analysis, a decision tree and the closely related influence diagram are used as a visual and analytical decision support tool, where the expected values (or expected utility) of competing alternatives are calculated.
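
A toy sketch of that expected value calculation (all figures invented): each alternative's expected value is the probability-weighted sum over its chance outcomes.

    # Expected value of each competing alternative (illustrative figures only).
    alternatives = {
        "launch product": [(0.4, 120000), (0.6, -30000)],   # (probability, payoff)
        "do nothing":     [(1.0, 0)],
    }
    for name, outcomes in alternatives.items():
        ev = sum(p * payoff for p, payoff in outcomes)
        print(f"{name}: EV = {ev:.0f}")
    # Pick the alternative with the highest expected value.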


If, in practice, decisions have to be taken online with no recall under incomplete knowledge, a decision tree should be paralleled by a probability model as a best choice model or online selection model algorithm. Another use of decision trees is as a descriptive means for calculating conditional probabilities.


Decision trees, influence diagrams, utility functions, and other decision analysis tools and methods are taught to undergraduate students in schools of business, health economics, and public health, and are examples of operations research or management science methods.


Decision trees - Decision tree elements

Drawn from left to right, a decision tree has only burst nodes (splitting paths) but no sink nodes (converging paths). Therefore, used manually, they can grow very big and are then often hard to draw fully by hand. Traditionally, decision trees have been created manually, although increasingly, specialized software is employed.


Decision trees - Decision tree using flow chart symbols

Commonly a decision tree is drawn using flow chart symbols, as it is easier for many people to read and understand.


Decision trees - Another example

Shaw, Induction of fuzzy decision trees (available on ScienceDirect).


Decision trees - Influence diagram

A decision tree can be represented more compactly as an influence diagram, focusing attention on the issues and relationships between events.


Decision trees - Advantages and disadvantages

Amongst decision support tools, decision trees (and influence diagrams) have several advantages. Decision trees:


* Are simple to understand and interpret. People are able to understand decision tree models after a brief explanation.


* For data including categorical variables with different numbers of levels, information gain in decision trees is biased in favor of those attributes with more levels.


List of important publications in computer science - Induction of Decision Trees

Description: Decision trees are a common learning algorithm and a decision representation tool. Development of decision trees was done by many researchers in many areas, even before this paper, though this paper is one of the most influential in the field.


Game complexity - Decision trees

A decision tree is a subtree of the game tree, with each position labelled with 'player A wins', 'player B wins' or 'drawn', if that position can be proved to have that value (assuming best play by both sides) by examining only other positions in the graph.


Information gain in decision trees

In information theory and machine learning, 'information gain' is a synonym for Kullback–Leibler divergence. However, in the context of decision trees, the term is sometimes used synonymously with mutual information, which is the expected value of the Kullback–Leibler divergence of a conditional probability distribution.
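
Stated as a formula (standard information theory): the mutual information of X and Y is the expected KL divergence of the conditional distribution of X given Y from the marginal of X,

    I(X;Y) = \sum_{y} p(y) \, D_{\mathrm{KL}}\big( P(X \mid Y=y) \,\|\, P(X) \big)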


In machine learning, this concept can be used to define a preferred sequence of attributes to investigate to most rapidly narrow down the state of X. Such a sequence (which depends on the outcome of the investigation of previous attributes at each stage) is called a decision tree. Usually an attribute with high mutual information should be preferred to other attributes.


Decision tree model

In computational complexity and communication complexity theories, the 'decision tree model' is the model of computation or communication in which an algorithm or communication process is considered to be basically a decision tree, i.e., a sequence of branching operations based on comparisons of some quantities, the comparisons being assigned the unit computational cost.


Several variants of decision tree models may be considered, depending on the complexity of the operations allowed in the computation of a single comparison and the way of branching.


The computational complexity of a problem or an algorithm expressed in terms of the decision tree model is called 'decision tree complexity' or 'query complexity'.


Decision tree model - Simple decision tree

The model in which every decision is based on the comparison of two numbers within constant time is called simply a decision tree model. It was introduced to establish the computational complexity of sorting and searching (Data Structures and Algorithms, by Alfred V. Aho, John E. Hopcroft, Jeffrey D. Ullman).


In this case the decision tree model is a binary tree.


Decision tree model - Linear decision tree

Linear decision trees, just like the simple decision trees, make a branching decision based on a set of values as input. As opposed to binary decision trees, linear decision trees have three output branches. A linear function f(x_1, ..., x_n) is tested, and branching decisions are made based on the sign of the function (negative, positive, or zero).
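
A toy sketch of a single three-way branch on the sign of a linear function (coefficients and names invented):

    # Sketch: a linear decision tree node computes a linear function of the
    # inputs and branches three ways on its sign (negative, zero, positive).
    def linear_branch(x, w, negative, zero, positive):
        value = sum(wi * xi for wi, xi in zip(w, x))   # f(x_1, ..., x_n)
        if value < 0:
            return negative
        return zero if value == 0 else positive

    print(linear_branch([1.0, 2.0], [3.0, -1.0], "neg", "zero", "pos"))  # -> "pos"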


Decision tree model - Algebraic decision tree

Algebraic decision trees are a generalization of linear decision trees that allow the test functions to be polynomials of degree d. Geometrically, the space is divided into semi-algebraic sets (a generalization of the hyperplane). The evaluation of the complexity is more difficult.


Decision tree model - Deterministic decision tree

If the output of a decision tree is f(x) for all x ∈ {0,1}^n, the decision tree is said to compute f. The depth of a tree is the maximum number of queries that can happen before a leaf is reached and a result obtained. 'D(f)', the 'deterministic decision tree' complexity of f, is the smallest depth among all deterministic decision trees that compute f.


Decision tree model - Randomized decision tree

'R_2(f)' is defined as the complexity of the lowest-depth randomized decision tree whose result is f(x) with probability at least 2/3 for all x ∈ {0,1}^n (i.e., with bounded two-sided error).


'R_2(f)' is known as the Monte Carlo randomized decision-tree complexity, because the result is allowed to be incorrect with bounded two-sided error. The Las Vegas decision-tree complexity 'R_0(f)' measures the expected depth of a decision tree that must be correct (i.e., has zero error). There is also a one-sided bounded-error version known as 'R_1(f)'.


Decision tree model - Nondeterministic decision tree

The nondeterministic decision tree complexity of a function is known more commonly as the certificate complexity of that function. It measures the number of input bits that a nondeterministic algorithm would need to look at in order to evaluate the function with certainty.


Decision tree model - Quantum decision tree

Q_2(f) and Q_E(f) are more commonly known as 'quantum query complexities', because the direct definition of a quantum decision tree is more complicated than in the classical case.


Decision tree model - Relationship between different models

Noam Nisan found that the Monte Carlo randomized decision tree complexity is also polynomially related to deterministic decision tree complexity: D(f) = O(R_2(f)^3).


The quantum decision tree complexity Q_2(f) is also polynomially related to D(f). Midrijanis showed that D(f) = O(Q_E(f)^3), improving a quartic bound due to Beals et al. Beals et al. also showed that D(f) = O(Q_2(f)^6), and this is still the best known bound. However, the largest known gap between deterministic and quantum query complexities is only quadratic. A quadratic gap is achieved for the OR function (cf. Grover's algorithm): D(OR_n) = n while Q_2(OR_n) = Θ(√n).
