Decision Tree
Alternating decision tree - History
ADTrees were introduced by Yoav Freund and Llew Mason. However, the algorithm as presented had several typographical errors. Clarifications and optimizations were later presented by Bernhard Pfahringer, Geoffrey Holmes and Richard Kirkby (Optimizing the Induction of Alternating Decision Trees. Proceedings of the Fifth Pacific-Asia Conference on Advances in Knowledge Discovery and Data Mining, 2001, pp. 477-487). Implementations are available in Weka and JBoost.
Alternating decision tree - Motivation
Boosting algorithms typically use either decision stumps or decision trees as weak hypotheses. As an example, boosting decision stumps creates a set of T weighted decision stumps (where T is the number of boosting iterations), which then vote on the final classification according to their weights.
Boosting a simple learner results in an unstructured set of T hypotheses, making it difficult to infer correlations between attributes. Alternating decision trees introduce structure to the set of hypotheses by requiring that they build off a hypothesis that was produced in an earlier iteration. The resulting set of hypotheses can be visualized in a tree based on the relationship between a hypothesis and its parent.
Alternating decision tree - Alternating decision tree structure
An alternating decision tree consists of decision nodes and prediction nodes. Decision nodes specify a predicate condition, while prediction nodes contain a single number. ADTrees always have prediction nodes as both root and leaves; an instance is classified by following all paths for which all decision nodes are true and summing the prediction nodes that are traversed.
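As a rough illustration of these semantics, the sketch below evaluates an ADTree in which each node carries a prediction value and zero or more (predicate, child) pairs; the node encoding is an illustrative assumption of this sketch, not the paper's reference implementation. An instance's score is the sum of the prediction values on every path whose predicates all hold, and the sign of the score gives the classification.

# Hedged sketch of ADTree evaluation; the node encoding is an
# illustrative assumption. A node holds a prediction value plus
# (predicate, child) pairs; complementary predicates model one splitter.
def score(node, x):
    total = node["value"]                  # prediction node contribution
    for predicate, child in node.get("children", []):
        if predicate(x):                   # follow only branches whose test holds
            total += score(child, x)
    return total

# Tiny example with one splitter on "a" and a nested splitter on "b".
tree = {"value": 0.5, "children": [
    (lambda x: x["a"] < 4.5, {"value": -0.7}),
    (lambda x: x["a"] >= 4.5, {"value": 0.2, "children": [
        (lambda x: x["b"] > 1.0, {"value": 0.4}),
        (lambda x: x["b"] <= 1.0, {"value": -0.3})]}),
]}
print(score(tree, {"a": 5.0, "b": 2.0}))   # 0.5 + 0.2 + 0.4 = 1.1 -> positive class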
Alternating decision tree - Empirical results
Figure 6 in the original paper demonstrates that ADTrees are typically as robust as boosted decision trees and boosted decision stumps. Equivalent accuracy can typically be achieved with a much simpler tree structure than that produced by recursive partitioning algorithms.
Gene expression programming - Decision trees
Decision trees (DT) are classification models where a series of questions and answers are mapped using nodes and directed edges.
Decision trees have three types of nodes: a root node, internal nodes, and leaf or terminal nodes. The root node and all internal nodes represent test conditions for different attributes or variables in a dataset. Leaf nodes specify the class label for all the different paths in the tree.
Most decision tree induction algorithms involve selecting an attribute for the root node and then making the same kind of informed decision for all the other nodes in the tree.
Decision trees can also be created by gene expression programming, with the advantage that all the decisions concerning the growth of the tree are made by the algorithm itself without any kind of human input.
This aspect of decision tree induction also carries over to gene expression programming, and there are two GEP algorithms for decision tree induction: the evolvable decision trees (EDT) algorithm, for dealing exclusively with nominal attributes, and the EDT-RNC algorithm (EDT with random numerical constants), for handling both nominal and numeric attributes.
In the decision trees induced by gene expression programming, the attributes behave as function nodes in the basic gene expression algorithm, whereas the class labels behave as terminals.
This again ensures that all decision trees designed by GEP are always valid programs.
For example, consider the decision tree below for deciding whether to play outside:
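The original figure is not reproduced here; the nested structure below is a hedged reconstruction using the classic weather attributes (Outlook, Humidity, Windy), which are assumptions of this sketch.

# Hedged reconstruction of a "play outside" decision tree; attributes
# and thresholds are illustrative assumptions, not the original figure.
tree = {
    "Outlook": {
        "Sunny":    {"Humidity": {"High": "Don't play", "Normal": "Play"}},
        "Overcast": "Play",
        "Rainy":    {"Windy": {True: "Don't play", False: "Play"}},
    }
}

def classify(node, example):
    """Walk from the root, testing one attribute per internal node."""
    if not isinstance(node, dict):
        return node                      # leaf: class label
    attribute = next(iter(node))         # attribute tested at this node
    return classify(node[attribute][example[attribute]], example)

print(classify(tree, {"Outlook": "Sunny", "Humidity": "Normal", "Windy": False}))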
Then the chromosomes are expressed as decision trees and their fitness is evaluated against a training dataset.
Decision trees with both nominal and numeric attributes are also easily induced with gene expression programming, using the framework described above for the GEP-RNC algorithm for dealing with random numerical constants. The chromosomal architecture includes an extra domain, Dc, for encoding random numerical constants, which are used as thresholds for splitting the data at each branching node. For example, consider a gene with a head size of 5, where the Dc domain starts at position 16.
These random numerical constants are encoded in the Dc domain, and their expression follows a very simple scheme: from top to bottom and from left to right, the elements in Dc are assigned one by one to the elements in the decision tree.
The encoded structure can also be represented more colorfully as a conventional decision tree diagram.
Decision tree learning
'Decision tree learning' uses a decision tree as a predictive model which maps observations about an item to conclusions about the item's target value.
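As a minimal sketch of this idea (assuming scikit-learn; the document does not prescribe a library), a tree is fit to labeled observations and then maps new observations to predicted target values:

# Fit a decision tree as a predictive model and map observations
# to target values (scikit-learn is an assumed choice of library).
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)            # observations and known target values
model = DecisionTreeClassifier(max_depth=3)  # the tree is the predictive model
model.fit(X, y)
print(model.predict(X[:5]))                  # conclusions about the target value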
In decision analysis, a decision tree can be used to visually and explicitly represent decisions and decision making. In data mining, a decision tree describes data but not decisions; rather, the resulting classification tree can be an input for decision making. This page deals with decision trees in data mining.
Decision tree learning - General
Decision tree learning is a method commonly used in data mining. The goal is to create a model that predicts the value of a target variable based on several input variables. Each interior node corresponds to one of the input variables; there are edges to children for each of the possible values of that input variable. Each leaf represents a value of the target variable given the values of the input variables represented by the path from the root to the leaf.
A decision tree is a simple representation for classifying examples. Decision tree learning is one of the most successful techniques for supervised classification learning. For this section, assume that all of the features have finite discrete domains, and there is a single target feature called the classification. Each element of the domain of the classification is called a class.
A decision tree or a classification tree is a tree in which each internal (non-leaf) node is labeled with an input feature. The arcs coming from a node labeled with a feature are labeled with each of the possible values of the feature. Each leaf of the tree is labeled with a class or a probability distribution over the classes.
This process of top-down induction of decision trees (TDIDT) (Quinlan, J. R. (1986). Induction of Decision Trees. Machine Learning 1: 81-106, Kluwer Academic Publishers) is an example of a greedy algorithm, and it is by far the most common strategy for learning decision trees from data.
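A compact, self-contained sketch of this greedy, top-down strategy on nominal attributes, in the spirit of ID3; the toy dataset and helper names are illustrative assumptions, not a reference implementation:

# Greedy top-down induction (TDIDT): pick the locally best attribute,
# split, and recurse; leaves hold the majority class.
import math
from collections import Counter

def entropy(labels):
    counts = Counter(labels)
    return -sum((c / len(labels)) * math.log2(c / len(labels)) for c in counts.values())

def information_gain(rows, attr, target):
    total = entropy([r[target] for r in rows])
    for v in {r[attr] for r in rows}:
        subset = [r[target] for r in rows if r[attr] == v]
        total -= len(subset) / len(rows) * entropy(subset)   # weighted remainder
    return total

def tdidt(rows, attrs, target):
    labels = [r[target] for r in rows]
    if len(set(labels)) == 1 or not attrs:
        return Counter(labels).most_common(1)[0][0]          # leaf: majority class
    best = max(attrs, key=lambda a: information_gain(rows, a, target))  # greedy step
    return {best: {v: tdidt([r for r in rows if r[best] == v],
                            [a for a in attrs if a != best], target)
                   for v in {r[best] for r in rows}}}

data = [{"outlook": "sunny", "windy": False, "play": "no"},
        {"outlook": "sunny", "windy": True,  "play": "no"},
        {"outlook": "overcast", "windy": False, "play": "yes"},
        {"outlook": "rainy", "windy": False, "play": "yes"},
        {"outlook": "rainy", "windy": True,  "play": "no"}]
print(tdidt(data, ["outlook", "windy"], "play"))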
In data mining, decision trees can also be described as the combination of mathematical and computational techniques to aid the description, categorisation and generalisation of a given set of data.
Decision tree learning - Types
Decision trees used in data mining are of two main types:
* 'Classification tree' analysis, where the predicted outcome is the class to which the data belongs.
* 'Regression tree' analysis, where the predicted outcome can be considered a real number.
Some techniques, often called ensemble methods, construct more than one decision tree:
* 'Bagging' (bootstrap aggregating) decision trees, an early ensemble method, builds multiple decision trees by repeatedly resampling training data with replacement, and voting the trees for a consensus prediction (Breiman, L. (1996). Bagging Predictors. Machine Learning, 24: 123-140); see the sketch below.
* A 'Random forest' classifier uses a number of decision trees in order to improve the classification rate.
* 'Rotation forest', in which every decision tree is trained by first applying principal component analysis (PCA) to a random subset of the input features (Rodriguez, J.J., Kuncheva, L.I. and Alonso, C.J. (2006). Rotation forest: A new classifier ensemble method. IEEE Transactions on Pattern Analysis and Machine Intelligence, 28(10): 1619-1630).
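A hedged sketch contrasting the first two methods with scikit-learn (the library choice is an assumption of this example):

# Bagging: resample the training data with replacement, fit one tree
# per sample, and let the trees vote on a consensus prediction.
# Random forest: bagging plus a random subset of features at each split.
from sklearn.datasets import load_iris
from sklearn.ensemble import BaggingClassifier, RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
bagging = BaggingClassifier(DecisionTreeClassifier(), n_estimators=50).fit(X, y)
forest = RandomForestClassifier(n_estimators=50).fit(X, y)
print(bagging.predict(X[:3]), forest.predict(X[:3]))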
'Decision tree learning' is the construction of a decision tree from class-labeled training tuples. A decision tree is a flow-chart-like structure, where each internal (non-leaf) node denotes a test on an attribute, each branch represents the outcome of a test, and each leaf (or terminal) node holds a class label. The topmost node in a tree is the root node.
* 'MARS' (multivariate adaptive regression splines): extends decision trees to better handle numerical data.
ID3 and CART were invented independently at around the same time (between 1970 and 1980), yet they follow a similar approach for learning a decision tree from training tuples.
Decision tree learning - Formulae
Algorithms for constructing decision trees usually work top-down, by choosing at each step the variable that best splits the set of items. Different algorithms use different metrics for measuring 'best'. These generally measure the homogeneity of the target variable within the subsets. Some examples are given below. These metrics are applied to each candidate subset, and the resulting values are combined (e.g., averaged) to provide a measure of the quality of the split.
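As a hedged formalization of this combination step: if a candidate test splits a set S into subsets S_1, \dots, S_k, the split quality is typically the weighted impurity

I_{\text{split}} = \sum_{j=1}^{k} \frac{|S_j|}{|S|}\, I(S_j),

where I is the chosen homogeneity metric (entropy, Gini impurity, etc.); the candidate minimizing I_{\text{split}}, equivalently maximizing the impurity decrease relative to I(S), is selected.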
Decision tree learning - Information gain
Used by the ID3, C4.5 and C5.0 tree-generation algorithms. Information gain is based on the concept of entropy from information theory.
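In standard notation (a summary, not a formula from this document): for a set T whose classes occur in proportions p_1, \dots, p_c, the entropy is

H(T) = -\sum_{i=1}^{c} p_i \log_2 p_i,

and the information gain of testing an attribute a is

IG(T, a) = H(T) - \sum_{v \in \mathrm{vals}(a)} \frac{|T_v|}{|T|}\, H(T_v),

where T_v is the subset of T on which a takes the value v.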
Decision tree learning - Decision tree advantages
Amongst other data mining methods, decision trees have various advantages:
* 'Simple to understand and interpret.' People are able to understand decision tree models after a brief explanation.
Decision tree learning - Limitations
Practical decision-tree learning algorithms are based on heuristic, greedy searches; such algorithms cannot guarantee to return the globally optimal decision tree.
* Decision-tree learners can create over-complex trees that do not generalise well from the training data. (This is known as overfitting.) Mechanisms such as pruning are necessary to avoid this problem.
* There are concepts that are hard to learn because decision trees do not express them easily, such as XOR, parity or multiplexer problems; in such cases, the decision tree becomes prohibitively large (see the sketch after this list). Approaches to solve the problem involve either changing the representation of the problem domain (known as propositionalisation) or using learning algorithms based on more expressive representations (such as statistical relational learning or inductive logic programming).
* For data including categorical variables with different numbers of levels, information gain in decision trees is biased in favor of those attributes with more levels.
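A hedged demonstration of the parity limitation (assuming scikit-learn): a tree that fits n-bit parity exactly needs on the order of 2^n leaves, because every leaf must fix all n bits before either subset becomes pure.

# Parity forces exponentially large trees: the fitted tree ends up
# with one leaf per bit vector.
from itertools import product
from sklearn.tree import DecisionTreeClassifier

n = 4
X = list(product([0, 1], repeat=n))   # all 2^n bit vectors
y = [sum(bits) % 2 for bits in X]     # parity label of each vector

tree = DecisionTreeClassifier().fit(X, y)
print(tree.get_n_leaves())            # 16 = 2^n leaves for n = 4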
Decision tree learning - Decision graphs
In a decision tree, all paths from the root node to a leaf node proceed by way of conjunction, or AND.
In general, decision graphs infer models with fewer leaves than decision trees.
Decision tree learning - Alternative search methods
Evolutionary algorithms have been used to avoid locally optimal decisions and search the decision tree space with little a priori bias (Papagelis, A. and Kalles, D. (2001). Breeding Decision Trees Using Evolutionary Techniques. Proceedings of the Eighteenth International Conference on Machine Learning, pp. 393-400).
Decision trees
A 'decision tree' is a decision support tool that uses a tree-like graph or model of decisions and their possible consequences, including chance event outcomes, resource costs, and utility. It is one way to display an algorithm.
Decision trees - Overview
A decision tree is a flowchart-like structure in which each internal node represents a test on an attribute, each branch represents an outcome of the test, and each leaf node represents a class label (the decision taken after computing all attributes). A path from root to leaf represents a classification rule.
In decision analysis, a decision tree and the closely related influence diagram are used as visual and analytical decision support tools, where the expected values (or expected utility) of competing alternatives are calculated.
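A small illustration of that calculation; the alternatives and numbers are hypothetical:

# Expected value of competing alternatives, each a list of
# (probability, payoff) chance outcomes; all figures are made up.
alternatives = {
    "launch product": [(0.4, 120000), (0.6, -30000)],
    "do nothing":     [(1.0, 0)],
}
for name, outcomes in alternatives.items():
    ev = sum(p * v for p, v in outcomes)      # expected value of this branch
    print(name, "EV =", ev)
# A decision node would take the alternative with the highest EV:
# here "launch product" (EV = 30000.0) beats "do nothing" (EV = 0.0).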
Decision trees are commonly used in operations research, specifically in decision analysis, to help identify the strategy most likely to reach a goal. If, in practice, decisions have to be taken online with no recall under incomplete knowledge, a decision tree should be paralleled by a probability model as a best-choice model or online selection model algorithm. Another use of decision trees is as a descriptive means for calculating conditional probabilities.
Decision trees, influence diagrams, utility functions, and other decision analysis tools and methods are taught to undergraduate students in schools of business, health economics, and public health, and are examples of operations research or management science methods.
Decision trees - Decision tree elements
Drawn from left to right, a decision tree has only burst nodes (splitting paths) but no sink nodes (converging paths). Therefore, used manually, they can grow very big and are then often hard to draw fully by hand. Traditionally, decision trees have been created manually, although increasingly specialized software is employed.
Decision trees - Decision tree using flow chart symbols
Commonly, a decision tree is drawn using flowchart symbols, as these are easier for many people to read and understand.
Decision trees - Another example
Shaw, Induction of Fuzzy Decision Trees (1995).
Decision trees - Influence diagram
A decision tree can be represented more compactly as an influence diagram, focusing attention on the issues and relationships between events.
Decision trees - Advantages and disadvantages
Amongst decision support tools, decision trees (and influence diagrams) have several advantages. Decision trees:
* Are simple to understand and interpret. People are able to understand decision tree models after a brief explanation.
Decision trees also have disadvantages:
* For data including categorical variables with different numbers of levels, information gain in decision trees is biased in favor of those attributes with more levels.
List of important publications in computer science - Induction of Decision Trees
Description: Decision trees are a common learning algorithm and a decision representation tool. Development of decision trees was done by many researchers in many areas, even before this paper, though this paper is one of the most influential in the field.
Game complexity - Decision trees
A decision tree is a subtree of the game tree, with each position labelled with 'player A wins', 'player B wins' or 'drawn' if that position can be proved to have that value (assuming best play by both sides) by examining only other positions in the graph.
Information gain in decision trees
In information theory and machine learning, 'information gain' is a synonym for Kullback–Leibler divergence. However, in the context of decision trees, the term is sometimes used synonymously with mutual information, which is the expected value of the Kullback–Leibler divergence of a conditional probability distribution.
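In symbols (standard definitions, not formulas taken from this document): the information gain of an attribute a is IG(T, a) = H(T) - H(T \mid a), and the mutual information of X and Y can be written as the expected Kullback–Leibler divergence

I(X; Y) = \mathbb{E}_{y \sim P(Y)}\left[ D_{\mathrm{KL}}\big( P(X \mid y) \,\|\, P(X) \big) \right].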
In machine learning, this concept can be used to define a preferred sequence of attributes to investigate so as to most rapidly narrow down the state of a random variable X. Such a sequence (which depends on the outcome of the investigation of previous attributes at each stage) is called a decision tree. Usually an attribute with high mutual information should be preferred to other attributes.
Decision tree model
In computational complexity and communication complexity theories, the 'decision tree model' is the model of computation or communication in which an algorithm or communication process is considered to be, essentially, a decision tree, i.e., a sequence of branching operations based on comparisons of some quantities, the comparisons being assigned unit computational cost.
Several variants of decision tree models may be considered, depending on the complexity of the operations allowed in the computation of a single comparison and the way of branching.
The computational complexity of a problem or an algorithm expressed in terms of the decision tree model is called its 'decision tree complexity' or 'query complexity'.
Decision tree model - Simple decision tree
The model in which every decision is based on the comparison of two numbers within constant time is called simply a decision tree model. It was introduced to establish the computational complexity of sorting and searching (Data Structures and Algorithms, by Alfred V. Aho, John E. Hopcroft and Jeffrey D. Ullman).
In this case, the decision tree model is a binary tree.
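A brief sketch of the classic lower-bound argument in this model: a comparison sort of n items must distinguish all n! input orderings, so its binary decision tree needs at least n! leaves; a binary tree of depth d has at most 2^d leaves, hence

d \ge \log_2(n!) = \Theta(n \log n),

which yields the \Omega(n \log n) bound for comparison-based sorting.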
Decision tree model - Linear decision tree
Linear decision trees, just like simple decision trees, make a branching decision based on a set of values as input. As opposed to binary decision trees, linear decision trees have three output branches. A linear function f(x_1, \dots, x_n) is tested, and branching decisions are made based on the sign of the function (negative, positive, or zero).
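Written out, each internal node evaluates a linear form and branches three ways on its sign:

f(x_1, \dots, x_n) = a_1 x_1 + \cdots + a_n x_n + b, \qquad \text{branch on } f(x) < 0,\; f(x) = 0,\; \text{or } f(x) > 0.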
Decision tree model - Algebraic decision tree
Algebraic decision trees are a generalization of linear decision trees that allows the test functions to be polynomials of degree d. Geometrically, the space is divided into semi-algebraic sets (a generalization of hyperplanes). The evaluation of the complexity is more difficult.
Decision tree model - Deterministic decision tree
If the output of a decision tree is f(x) for all x \in \{0,1\}^n, the decision tree is said to compute f. The depth of a tree is the maximum number of queries that can happen before a leaf is reached and a result obtained. 'D(f)', the 'deterministic decision tree' complexity of f, is the smallest depth among all deterministic decision trees that compute f.
Decision tree model - Randomized decision tree
'R_2(f)' is defined as the complexity of the lowest-depth randomized decision tree whose result is f(x) with probability at least 2/3 for all x \in \{0,1\}^n (i.e., with bounded two-sided error).
'R_2(f)' is known as the Monte Carlo randomized decision-tree complexity, because the result is allowed to be incorrect with bounded two-sided error. The Las Vegas decision-tree complexity 'R_0(f)' measures the expected depth of a decision tree that must be correct (i.e., has zero error). There is also a one-sided bounded-error version, known as 'R_1(f)'.
Decision tree model - Nondeterministic decision tree
The nondeterministic decision tree complexity of a function is known more commonly as the certificate complexity of that function. It measures the number of input bits that a nondeterministic algorithm would need to look at in order to evaluate the function with certainty.
Decision tree model - Quantum decision tree
Q_2(f) and Q_E(f) are more commonly known as 'quantum query complexities', because the direct definition of a quantum decision tree is more complicated than in the classical case.
Decision tree model - Relationship between different models
Noam Nisan found that the Monte Carlo randomized decision tree complexity is also polynomially related to the deterministic decision tree complexity: D(f) = O(R_2(f)^3).
The quantum decision tree complexity Q_2(f) is also polynomially related to D(f). Midrijanis showed that D(f) = O(Q_E(f)^3), improving a quartic bound due to Beals et al. Beals et al. also showed that D(f) = O(Q_2(f)^6), and this is still the best known bound. However, the largest known gap between deterministic and quantum query complexities is only quadratic. A quadratic gap is achieved for the OR function (via Grover's algorithm): D(OR_n) = n while Q_2(OR_n) = \Theta(\sqrt{n}).
For More Information, Visit:
• https://store.theartofservice.com/the-decision-tree-toolkit.html
The Art of Service
https://store.theartofservice.com