Intelligent Database Systems Lab Presenter : Chuang, Kai-Ting Authors : Rodrigo T. Peres, Claus...

Post on 20-Jan-2018

Intelligent Database Systems Lab

Presenter : Chuang, Kai-Ting

Authors : Rodrigo T. Peres, Claus Aranha, Carlos E. Pedreira

2013, InfSci

Optimized bi-dimensional data projection for clustering visualization

Outline
• Motivation
• Objectives
• Methodology
• Experiments
• Conclusions
• Comments

Motivation
• The problem of data visualization consists of generating a bi-dimensional projection of a high-dimensional data set.

Objectives
• We propose a new method to project n-dimensional data onto two dimensions, for visualization purposes.
• We apply Differential Evolution as a meta-heuristic to optimize a divergence measure of the projected data.
• This divergence measure is based on the Cauchy–Schwarz divergence, extended for multiple classes.
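The objectives above can be illustrated with a minimal two-class sketch. It is not the authors' implementation: the linear projection with orthonormalized columns, the fixed kernel width `sigma`, the synthetic data, and all function names are assumptions made for illustration, and the multi-class extension of the divergence is not reproduced. The sketch uses SciPy's `differential_evolution` to search for a 5-to-2 dimensional projection that maximizes a Parzen-window estimate of the Cauchy–Schwarz divergence between the two projected classes.

```python
import numpy as np
from scipy.optimize import differential_evolution


def mean_gaussian_kernel(a, b, sigma):
    """Mean of exp(-||a_i - b_j||^2 / (4 sigma^2)) over all pairs (a_i, b_j).

    Parzen windows of width sigma yield a pairwise kernel of variance
    2 * sigma**2. The normalizing constant is dropped because it cancels
    in the divergence ratio.
    """
    sq_dist = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-sq_dist / (4.0 * sigma ** 2)).mean()


def cs_divergence(a, b, sigma=1.0):
    """Two-class Cauchy-Schwarz divergence estimate (always >= 0)."""
    v_ab = max(mean_gaussian_kernel(a, b, sigma), 1e-300)  # guard log(0)
    v_aa = mean_gaussian_kernel(a, a, sigma)
    v_bb = mean_gaussian_kernel(b, b, sigma)
    return -2.0 * np.log(v_ab) + np.log(v_aa) + np.log(v_bb)


def project(w_flat, x):
    """Map x (N x n) to two dimensions with an orthonormalized n x 2 matrix.

    Orthonormalizing (an assumption of this sketch) stops the optimizer
    from inflating the divergence by simply scaling the projection.
    """
    q, _ = np.linalg.qr(w_flat.reshape(x.shape[1], 2))
    return x @ q


def negative_divergence(w_flat, xa, xb, sigma):
    """Objective to minimize: minus the divergence of the projected classes."""
    return -cs_divergence(project(w_flat, xa), project(w_flat, xb), sigma)


# Synthetic data: the classes differ only along dimension 0;
# the remaining four dimensions are pure noise.
rng = np.random.default_rng(0)
class_a = rng.normal(size=(40, 5))
class_b = rng.normal(size=(40, 5))
class_b[:, 0] += 6.0

result = differential_evolution(
    negative_divergence,
    bounds=[(-1.0, 1.0)] * (5 * 2),  # one bound per projection-matrix entry
    args=(class_a, class_b, 1.0),
    maxiter=30,
    popsize=10,
    seed=0,
    polish=False,
)
best_div = -result.fun

# Baseline: project onto two noise-only dimensions.
baseline_div = cs_divergence(class_a[:, 1:3], class_b[:, 1:3], 1.0)
print("optimized:", best_div, "noise-only baseline:", baseline_div)
```

The pairwise kernel sums cost time quadratic in the number of samples, so each objective evaluation grows more expensive as the data set grows.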

Methodology-Framework

Methodology

Methodology-Cauchy–Schwarz divergence measure
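For reference, the standard two-density form of the Cauchy–Schwarz divergence and its Parzen-window estimate can be written as below. This is the textbook definition, not a reproduction of the slide content. The symbols p, q, x_i, y_j, N_p, N_q and the kernel width σ are notation assumed here, and the authors' multi-class extension is not shown.

```latex
D_{CS}(p, q) = -\log \frac{\left( \int p(x)\, q(x)\, dx \right)^{2}}
                          {\int p(x)^{2}\, dx \, \int q(x)^{2}\, dx}

\int \hat{p}(x)\, \hat{q}(x)\, dx
  = \frac{1}{N_p N_q} \sum_{i=1}^{N_p} \sum_{j=1}^{N_q}
    G_{\sigma\sqrt{2}}(x_i - y_j)
```

Here G denotes a Gaussian kernel and the hats denote Gaussian Parzen-window density estimates of width σ. By the Cauchy–Schwarz inequality, D_CS is non-negative and equals zero only when p = q.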

Methodology-Information Theoretic Learning (ITL)

Methodology-Information Theoretic Learning (ITL)

Methodology-Information Theoretic Learning (ITL)

Methodology-Computational complexity of the Dcs

Methodology-Differential Evolution

Methodology-Data transformation

Experiment setup
• Synthetic data sets
– Initial conditions.
– Robustness of the method to very noisy dimensions.
• Real world data sets
– Pen Digits
– Lung Cancer
– Compares monocytes-related dendritic cells, plasmocytoid dendritic cells and B-lymphocytes.
– Compares monocytes and neutrophils.
– Compares plasmocytoid dendritic cells and neutrophils.

Experiment-Kernel width

Experiment-Synthetic data sets 1

Experiment-Synthetic data sets 2

Experiment-Real world data sets

Conclusions
• Using this method, we promote the bi-dimensional visualization of high-dimensional data sets with optimized cluster separation.

Comments
• Advantages
– The method performed well.
• Disadvantages
– It may be slower to train on data sets with a larger number of cases.
• Applications
– Visualization.