Post on 31-Dec-2015
Perception-Based Classification (PBC) System
Salvador Ledezma
sledezma@uci.edu
April 25, 2002
Introduction
Concepts
Demo of PBC
References (Mihael Ankerst, author or coauthor):
“Towards an Effective Cooperation of the User and the Computer for Classification”
“Visual Data Mining with Pixel-oriented Visualization Techniques”
“Visual Classification: An Interactive Approach to Decision Tree Construction”
Data Mining
Exploration and Analysis, by automatic or semi-automatic means, of large quantities of data in order to discover meaningful patterns and rules
Part of Knowledge Discovery in Databases (KDD) process
Classification
Major task of Data Mining
Assign an object to one of a set of given classes based on the object's attributes
Classification Algorithms
Decision Tree Classifier
Training set: set of objects whose attributes and class are already known
Using the training set, the tree classifier determines a classification function represented by a decision tree
Model for the class attribute as a function of the values of the other attributes
Test set: validates the classification function
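A minimal sketch of the training-set / test-set idea above, assuming a toy one-attribute data set and a one-level tree (a single split). The names `fit_stump`, `predict`, and `accuracy`, and all the data, are hypothetical illustrations and are not taken from PBC.

```python
from collections import Counter

def majority(labels):
    """Return the most common class label in a list."""
    return Counter(labels).most_common(1)[0][0]

def fit_stump(rows, labels, attr):
    """Learn a one-level decision tree on attribute index `attr`.

    Tries the midpoint between each pair of consecutive distinct values
    and keeps the threshold with the fewest training errors.
    Assumes the attribute has at least two distinct values.
    """
    values = sorted(set(r[attr] for r in rows))
    best = None
    for lo, hi in zip(values, values[1:]):
        t = (lo + hi) / 2
        left = [l for r, l in zip(rows, labels) if r[attr] <= t]
        right = [l for r, l in zip(rows, labels) if r[attr] > t]
        l_cls, r_cls = majority(left), majority(right)
        errors = sum(l != l_cls for l in left) + sum(l != r_cls for l in right)
        if best is None or errors < best[0]:
            best = (errors, t, l_cls, r_cls)
    _, t, l_cls, r_cls = best
    return {"attr": attr, "threshold": t, "left": l_cls, "right": r_cls}

def predict(tree, row):
    """Apply the classification function represented by the tree."""
    return tree["left"] if row[tree["attr"]] <= tree["threshold"] else tree["right"]

def accuracy(tree, rows, labels):
    """Fraction of objects whose predicted class matches the known class."""
    return sum(predict(tree, r) == l for r, l in zip(rows, labels)) / len(rows)

# Training set: attributes and class already known (toy data).
train_rows = [(1.0,), (2.0,), (3.0,), (7.0,), (8.0,), (9.0,)]
train_labels = ["A", "A", "A", "B", "B", "B"]
# Test set: held back to validate the learned function.
test_rows = [(2.5,), (8.5,)]
test_labels = ["A", "B"]

tree = fit_stump(train_rows, train_labels, attr=0)
test_accuracy = accuracy(tree, test_rows, test_labels)
```

The test set plays no part in choosing the threshold; it is only used afterwards to check how well the learned function generalizes.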
Classification Example
Classification (cont)
Usually algorithms are black boxes with no user interaction or intervention
Reasons for user involvement in decision tree construction:
Use human pattern recognition capabilities
User will have better understanding of tree
User provides domain knowledge
Visual Data Mining
Tackle data mining tasks by enabling human involvement
Incorporating perceptivity of humans
Visual Classification
Construction of decision trees is decomposed into substeps
Enables human involvement
Example: PBC
Data visualization based on 2 concepts:
Each attribute of training data is visualized in a separate part of screen
Different class labels of training objects are represented by different colors
Pixel-Oriented Visualization Techniques
Represent each attribute value as a single colored pixel
Map the range of possible attribute values to a fixed color map
Maximizes the amount of information represented at one time without any overlap
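A minimal sketch of the mapping described above, assuming a linear grayscale color map and a simple row-by-row pixel layout. Both are illustrative assumptions (the slides do not specify the color map), and the function names are hypothetical.

```python
def value_to_gray(value, vmin, vmax, levels=256):
    """Map a value in [vmin, vmax] linearly to a gray level 0..levels-1.

    Assumption: low values are dark, high values are light.
    """
    if vmax == vmin:
        return 0
    frac = (value - vmin) / (vmax - vmin)
    return min(levels - 1, int(frac * levels))

def to_pixel_grid(values, width):
    """Turn one attribute's values into rows of pixels, one pixel per value."""
    vmin, vmax = min(values), max(values)
    pixels = [value_to_gray(v, vmin, vmax) for v in values]
    return [pixels[i:i + width] for i in range(0, len(pixels), width)]

prices = [10, 20, 30, 40, 50, 60]   # toy attribute values
grid = to_pixel_grid(prices, width=3)
```

Because every value occupies exactly one pixel, no two values overlap on screen.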
Circle Segments Technique
Data is represented as a circle divided into segments
Each segment represents an attribute
Each attribute value is mapped to a single colored pixel; the arrangement starts in the center and proceeds outward
Example
Represents 50 stocks. Each circle represents the prices of the different stocks at the same point in time
Light = high stock price
Dark = low stock price
Bar Visualization
For each attribute:
Attribute values are sorted into attribute lists
Classes are defined by colors
Within a bar, sorted attribute values are mapped to pixels, line by line
Each attribute is placed in a different bar
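A minimal sketch of building one bar as described above, assuming color names stand in for pixels and a fixed bar width. The function `build_bar` and the toy data are hypothetical, not PBC's actual code.

```python
def build_bar(values, classes, class_colors, width):
    """Build the pixel rows of one attribute's bar.

    Records are sorted by attribute value, each record is drawn in the
    color of its class, and the pixels are laid out line by line.
    """
    order = sorted(range(len(values)), key=lambda i: values[i])
    colors = [class_colors[classes[i]] for i in order]
    return [colors[i:i + width] for i in range(0, len(colors), width)]

values = [5, 1, 4, 2, 3, 6]                  # one attribute, toy data
classes = ["B", "A", "B", "A", "A", "B"]     # class label of each record
bar = build_bar(values, classes, {"A": "red", "B": "blue"}, width=3)
```

In this toy case the sorted bar falls into one red run followed by one blue run, which is the kind of pattern that would stand out to a viewer as a clean split.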
DNA Training Data
Visually, attribute 85 and attribute 90 are good candidates for splitting the tree
Algorithm picks 90 as the optimal split
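The slides do not say which criterion the algorithm uses to rank splits. As a sketch under that caveat, the example below assumes weighted Gini impurity, a common choice for decision trees; the attribute names and data are hypothetical.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels (0.0 means pure)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(values, labels):
    """Return (weighted_gini, threshold) for the best binary split."""
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best = (float("inf"), None)
    for i in range(1, n):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # cannot split between equal values
        left = [l for _, l in pairs[:i]]
        right = [l for _, l in pairs[i:]]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        if score < best[0]:
            best = (score, threshold)
    return best

def pick_attribute(columns, labels):
    """Score every attribute and return the one with the purest split."""
    scored = {name: best_split(vals, labels) for name, vals in columns.items()}
    winner = min(scored, key=lambda name: scored[name][0])
    return winner, scored

labels = ["A", "A", "A", "B", "B", "B"]
columns = {
    "attr_x": [1, 2, 3, 4, 5, 6],   # separates the classes cleanly
    "attr_y": [1, 4, 2, 5, 3, 6],   # classes interleave when sorted
}
winner, scores = pick_attribute(columns, labels)
```

An attribute that looks good in the bar visualization (long runs of one color) tends to score well under a purity measure like this one, which is why the visual and the automatic choice can agree.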
PBC
Uses pixel-oriented visualization
Visualizes training data in order to support interactive decision tree construction
Examples of use:
Automatic
Automatic-manual (top 2 levels)
Manual-automatic
Manual
Actual use lies somewhere in between on this spectrum
Additional Functionality
Propose split
Look-ahead
For a hypothetical split
Expand tree
Automatic expanding and construction
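The slides do not describe how look-ahead is computed. One plausible reading, sketched here purely as an assumption, is a preview of the class distribution in each partition that a hypothetical split would produce. The function `look_ahead` and the data are hypothetical.

```python
from collections import Counter

def look_ahead(values, labels, threshold):
    """Preview class counts on each side of a hypothetical split."""
    left = Counter(l for v, l in zip(values, labels) if v <= threshold)
    right = Counter(l for v, l in zip(values, labels) if v > threshold)
    return {"left": dict(left), "right": dict(right)}

values = [1, 2, 3, 4, 5, 6]                 # toy attribute values
labels = ["A", "A", "B", "A", "B", "B"]     # known classes
preview = look_ahead(values, labels, threshold=2.5)
```

A preview like this lets the user judge a split before committing to it: here the left partition would be pure, while the right one would still mix both classes.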
PBC demo