Uncovering Clusters in Crowded Parallel Coordinates Visualizations

Almir Olivette Artero, Maria Cristina Ferreira de Oliveira, Haim Levkowitz

Information Visualization 2004

Abstract

• The idea is inspired by traditional image processing techniques such as grayscale manipulation.

• The goal is to reduce visual clutter so that the analyst can observe relevant patterns in the parallel coordinates display.

Introduction

• The strong overlapping of graphical markers hampers the user’s ability to identify patterns in the data when the number of records and the dimensionality of the data set are high.

• It is important to avoid displaying irrelevant information and to enhance the presentation of the useful information.

• The authors tackle this problem with a strategy that computes frequency and density information and uses it in parallel coordinates visualizations to filter the information presented to the user.

Frequency Information

• The frequency function for an n-dimensional variable x is defined in terms of h, the bin size; σ, the number of records that fall in the same bin as x; and m, the total number of records.
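
The three symbols above match the standard histogram estimator. Assuming that is the intended definition, and assuming h is the bin width along each of the n axes (if h instead denotes the bin volume, the exponent is dropped), the frequency function would read:

```latex
\hat{f}(x) = \frac{\sigma}{m\,h^{n}}
```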

• A two-dimensional matrix stores the frequency of each pair of attribute values; it is then used to draw the polygonal lines for the records in the data set.

• For a data set with n attributes, n-1 frequency matrices are generated, one for each pair of adjacent attributes (neighboring axes).
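
The construction of the n-1 matrices can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `frequency_matrices`, the `bins` parameter, and the min-max binning of each attribute are all assumptions made for the example.

```python
import numpy as np

def frequency_matrices(data, bins=256):
    """Return n-1 integer count matrices, one per pair of adjacent attributes.

    `data` is an (m, n) array: m records, n attributes. The binning scheme
    (min-max scaling into `bins` equal-width bins) is an assumption.
    """
    data = np.asarray(data, dtype=float)
    m, n = data.shape
    lo = data.min(axis=0)
    span = data.max(axis=0) - lo
    span[span == 0] = 1.0  # constant column: avoid division by zero
    # Map every value to an integer bin index in [0, bins - 1].
    idx = np.minimum(((data - lo) / span * bins).astype(int), bins - 1)
    matrices = []
    for j in range(n - 1):
        f = np.zeros((bins, bins), dtype=np.int64)
        # np.add.at accumulates correctly when the same index pair repeats.
        np.add.at(f, (idx[:, j], idx[:, j + 1]), 1)
        matrices.append(f)
    return matrices
```

Each matrix entry `f[a, b]` counts the records whose value on axis j falls in bin a and whose value on axis j+1 falls in bin b, so every matrix sums to m.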

• Each non-zero matrix element generates a line segment in the visualization, and its value determines the pixel intensity used to draw that segment.

• Each line segment is drawn with the Bresenham algorithm.
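
For reference, a common integer-only formulation of Bresenham's line algorithm is sketched below. It is a generic textbook variant that handles all octants, not necessarily the exact version shown on the slide; the function name `bresenham` is chosen for this example.

```python
def bresenham(x0, y0, x1, y1):
    """Return the integer pixels on the segment from (x0, y0) to (x1, y1)."""
    dx = abs(x1 - x0)
    dy = -abs(y1 - y0)
    sx = 1 if x0 < x1 else -1  # step direction along x
    sy = 1 if y0 < y1 else -1  # step direction along y
    err = dx + dy              # combined error term, integers only
    pixels = []
    while True:
        pixels.append((x0, y0))
        if x0 == x1 and y0 == y1:
            break
        e2 = 2 * err
        if e2 >= dy:           # error allows a step along x
            err += dy
            x0 += sx
        if e2 <= dx:           # error allows a step along y
            err += dx
            y0 += sy
    return pixels
```

In a frequency plot, each pixel returned for a segment would be written with the intensity derived from the corresponding matrix element.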

Interactive Parallel Coordinates Frequency and Density plots

• The intensity of the pixel with coordinates (q,p) is given by:

• A square-wave smoothing filter is applied to each pixel:

• S is a scaling factor.
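
The smoothing step can be illustrated with a box (moving-average) filter, which is one reading of "square-wave filter" and is an assumption here. The function name `box_smooth`, the `radius` parameter, the zero padding at the borders, and the clipping to 0..255 are likewise illustrative choices; `scale` plays the role of the scaling factor S.

```python
import numpy as np

def box_smooth(image, radius=1, scale=1.0):
    """Average each pixel over a (2*radius+1)^2 window, then scale and clip."""
    image = np.asarray(image, dtype=float)
    size = 2 * radius + 1
    # Pixels outside the plot are treated as zero (assumption).
    padded = np.pad(image, radius, mode="constant")
    rows, cols = image.shape
    out = np.zeros_like(image)
    # Sum the shifted copies of the image that make up the window.
    for dr in range(size):
        for dc in range(size):
            out += padded[dr:dr + rows, dc:dc + cols]
    out /= size * size
    # Apply the scaling factor and keep intensities in the 8-bit range.
    return np.clip(scale * out, 0, 255)
```

A larger `radius` spreads isolated lines over more pixels, while `scale` compensates for the drop in peak intensity that averaging causes.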

Density Information

• The density function for an n-dimensional variable x is defined in terms of d_i, the i-th record of the data set; K, the kernel function; and a parameter that sets the smoothing factor, or bandwidth.
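
These ingredients correspond to the standard kernel density estimator. Assuming that is the intended definition, writing the bandwidth as h (a symbol chosen here, not taken from the slide) and taking m as the total number of records, it would read:

```latex
\hat{f}(x) = \frac{1}{m\,h^{n}} \sum_{i=1}^{m} K\!\left(\frac{x - d_i}{h}\right)
```

A larger bandwidth gives a smoother estimate; a smaller one follows the individual records more closely.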

Visualizations of the Pollen data

Figure: (a) frequency plot; (b) density plot.

Interactive high-dimensional clustering with IPC plot

Performance

• Running times, in seconds, of the proposed algorithm for different values of m (number of records) and n (number of attributes).

Conclusions

• The new plots support interactive exploration of large, high-dimensional data sets, allowing users to remove noise and highlight areas with a high concentration of data.

• The proposed algorithms use only integer arithmetic to compute the frequency matrices.
