
Visualization and Search Approaches for Time-Oriented Scientific Primary Data

A WGL-TIB Project

Jürgen Bernard

Technische Universität Darmstadt
Fachgebiet Graphisch-Interaktive Systeme (GRIS)
Visual Analysis Group
Fraunhoferstraße 5
64283 Darmstadt
Germany

Tel.: +49 (6151) 155 – 666
Fax: +49 (6151) 155 – 669

Email: [email protected]

http://www.gris.tu-darmstadt.de/home/members/bernard/index.de.htm


Outline

1. Motivation

2. Practical Approach

3. Outlook

10.04.23 | Interactive-Graphics Systems | Visual Search and Analysis Group | Jürgen Bernard | 2


1 Motivation

Trends
- Content-based search and presentation available for many document types
  - Text documents
  - Digital image, video, audio, etc.
- Repositories for new kinds of non-textual documents
  - Like scientific primary data
  - PANGAEA, PsychData, Dryad, ELIXIR, KoLaWiss
- Information overload: need for visual retrieval and data exploration


1 Motivation

Time-oriented scientific primary data
- Massive amounts of time-oriented scientific primary data that may be valuable for future research
- Heterogeneity of standards in different research disciplines
- Exploration of scientific primary data repositories currently restricted to “meta-search”

PANGAEA PanPlot Tool. (http://doi.pangaea.de/10.1594/PANGAEA.330147)


1 Motivation

Development of a Visual Catalog for time series data
- Interactive-graphic access to huge amounts of scientific primary data
- Combination of content-based and metadata retrieval
- Explorative data analysis by searching, browsing and zooming techniques
- User-adaptive search methods
- Establish higher data transparency and deeper user comprehension


1 Motivation

Possible Use Case Scenario

Natural scientist detects interesting curve progression

Hypothesis: curve progression indicates future event

Search for similar curve progressions in related data sets

Visual overview of the most similar data elements

Filter result set, adapt the reference example


Outline

1. Motivation

2. Practical Approach

3. Outlook


2.1 Overview


[Figure: system overview with Back End and Front End components]

Bernard, J. and Brase, J. and Fellner, D. and Koepler, O. and Kohlhammer, J. and Ruppert, T. and Schreck, T. and Sens, I.: A Visual Digital Library Approach for Time-Oriented Scientific Primary Data. Accepted at the 14th European Conference on Research and Advanced Technology for Digital Libraries (ECDL), 2010


2.2 Data Import

- Parse primary data
  - Initially: use PANGAEA data files
  - In principle: inclusion of additional repositories by customized data parsers
- Define a generic time series data structure
  - Feature-based approach
  - Store metadata as “bag of words”
  - Special attributes: time stamp, parameter/unit
- Defined database schema
  - MySQL database
  - Efficient data management
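A generic time series data structure of the kind listed above might look roughly like the following sketch. The class name `TimeSeriesRecord`, its field names, and the helper `bag_of_words` are hypothetical and chosen for illustration only; they are not taken from the project.

```python
from collections import Counter
from dataclasses import dataclass, field
from datetime import datetime
from typing import List


def bag_of_words(text):
    """Turn free-text metadata into a lowercase word-count bag."""
    words = (w.strip(".,;:()").lower() for w in text.split())
    return Counter(w for w in words if w)


@dataclass
class TimeSeriesRecord:
    """Hypothetical generic container for one imported time series."""
    timestamps: List[datetime]   # special attribute: time stamps
    values: List[float]          # measurements, aligned with timestamps
    parameter: str               # special attribute: measured parameter
    unit: str                    # special attribute: physical unit
    metadata: Counter = field(default_factory=Counter)  # "bag of words"

    def __post_init__(self):
        # Every value needs exactly one time stamp.
        if len(self.timestamps) != len(self.values):
            raise ValueError("timestamps and values must have equal length")


record = TimeSeriesRecord(
    timestamps=[datetime(2010, 1, 1), datetime(2010, 1, 2)],
    values=[3.5, 4.1],
    parameter="temperature",
    unit="deg C",
    metadata=bag_of_words("Sea surface temperature, North Atlantic"),
)
```

Keeping parameter, unit and time stamps as explicit fields, separate from the word bag, is what would allow the special-attribute searches to be answered without text matching.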


2.3 Preprocessing

- Problem: time series primary data may not be directly applicable for clustering; consider:
  - Outliers, noise
  - Missing values
  - Format of time stamps in data files
  - Inhomogeneous time quantization, timestamp compatibility
- Possible approaches
  - Aggregation (binning)
  - Transformation
  - Interpolation
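Two of the approaches named above, interpolation and aggregation by binning, can be sketched with NumPy as follows. This is only an illustration under stated assumptions: time stamps are already converted to ascending numeric values (for example seconds), and the function names `interpolate_to_grid` and `bin_means` are hypothetical.

```python
import numpy as np


def interpolate_to_grid(t, y, step):
    """Linearly interpolate an irregular series onto a uniform time grid.

    Samples whose value is NaN are dropped first, so missing values
    are filled from their neighbours.
    """
    t = np.asarray(t, dtype=float)
    y = np.asarray(y, dtype=float)
    keep = ~np.isnan(y)
    t, y = t[keep], y[keep]
    # Half a step of slack keeps the last time stamp inside the grid.
    grid = np.arange(t[0], t[-1] + step / 2, step)
    return grid, np.interp(grid, t, y)


def bin_means(t, y, bin_width):
    """Aggregate a series into fixed-width time bins (mean per bin).

    Bins that contain no sample are reported as NaN.
    """
    t = np.asarray(t, dtype=float)
    y = np.asarray(y, dtype=float)
    idx = np.floor((t - t[0]) / bin_width).astype(int)
    n_bins = idx.max() + 1
    sums = np.bincount(idx, weights=y, minlength=n_bins)
    counts = np.bincount(idx, minlength=n_bins)
    with np.errstate(invalid="ignore", divide="ignore"):
        return sums / counts


grid, filled = interpolate_to_grid([0, 1, 2.5, 4], [0, 1, np.nan, 4], step=1)
binned = bin_means([0, 1, 2, 3, 4, 5], [1, 3, 5, 7, 9, 11], bin_width=2)
```

Interpolation gives every series the same time quantization, while binning additionally damps noise and isolated outliers because each output value is an average.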


2.4 Feature Extraction

- Descriptors: representations of time series in a reduced dimensionality space
- Basis for similarity measures (clustering, indexing, search)
- Problem: full sequence vs. subsequence search
- Huge amount of descriptors published; suitable descriptor approaches:
  - Binning (current approach)
  - Discrete Fourier Transformation (DFT)
  - Discrete Wavelet Transformation (DWT)
  - SAX descriptor (symbolic representation)
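As an illustration of what such descriptors compute, the sketch below implements a binning descriptor and a simple DFT descriptor with NumPy. The function names are hypothetical, the number of bins is assumed not to exceed the series length, and the project's actual descriptor parameters are not given in these slides.

```python
import numpy as np


def binning_descriptor(values, n_bins):
    """Mean of each of n_bins roughly equal-length segments."""
    segments = np.array_split(np.asarray(values, dtype=float), n_bins)
    return np.array([seg.mean() for seg in segments])


def dft_descriptor(values, n_coeffs):
    """Magnitudes of the first n_coeffs Fourier coefficients."""
    spectrum = np.fft.rfft(np.asarray(values, dtype=float))
    return np.abs(spectrum[:n_coeffs])


def euclidean(a, b):
    """Distance between two descriptors of equal length."""
    return float(np.linalg.norm(np.asarray(a) - np.asarray(b)))


series_a = [1, 2, 3, 4, 5, 6]
series_b = [1, 2, 3, 4, 5, 8]
d_a = binning_descriptor(series_a, 3)
d_b = binning_descriptor(series_b, 3)
distance = euclidean(d_a, d_b)
```

Both descriptors map a series of arbitrary length to a short fixed-length vector, so similarity between series reduces to a distance between vectors.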


Lin, J. and Keogh, E. and Lonardi, S. and Chiu, B.: A symbolic representation of time series, with implications for streaming algorithms. Proceedings of the 8th ACM SIGMOD workshop on Research issues in data mining and knowledge discovery, 2003
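The symbolic representation from the reference above is commonly described in three steps: z-normalization, piecewise aggregate approximation, and discretization against breakpoints that split the standard normal distribution into equally probable regions. A minimal sketch under those assumptions follows; the function name `sax` and the default alphabet are illustrative choices, not the project's settings.

```python
from statistics import NormalDist

import numpy as np


def sax(values, n_segments, alphabet="abcd"):
    """Convert a numeric series into a short symbolic word."""
    values = np.asarray(values, dtype=float)
    std = values.std()
    # Step 1: z-normalize (a constant series becomes all zeros).
    z = (values - values.mean()) / std if std > 0 else np.zeros_like(values)
    # Step 2: piecewise aggregate approximation (mean per segment).
    paa = np.array([seg.mean() for seg in np.array_split(z, n_segments)])
    # Step 3: breakpoints that cut N(0, 1) into equally probable regions.
    k = len(alphabet)
    breakpoints = [NormalDist().inv_cdf(i / k) for i in range(1, k)]
    symbols = np.searchsorted(breakpoints, paa)
    return "".join(alphabet[i] for i in symbols)


rising = sax([1, 2, 3, 4, 5, 6, 7, 8], n_segments=4)
falling = sax([8, 7, 6, 5, 4, 3, 2, 1], n_segments=4)
```

Because the result is a string, text-oriented indexing and matching techniques become applicable to time series.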


2.5 Visual Catalog

- Visualization of clustering results (global and local)
- Self-organizing map as a basis for smart visualization of huge amounts of time series
- Provide a single grid view for details (zooming)
- We also explore other layout algorithms
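To make the role of the self-organizing map more concrete, here is a small online SOM trainer in NumPy. It is a generic textbook-style sketch, not the project's implementation: the grid size, the linear decay schedules, and the names `train_som` and `best_matching_unit` are all assumptions made for illustration.

```python
import numpy as np


def best_matching_unit(weights, x):
    """Return (row, col) of the cell whose weight vector is closest to x."""
    dist2 = ((weights - x) ** 2).sum(axis=-1)
    return np.unravel_index(np.argmin(dist2), dist2.shape)


def train_som(data, rows, cols, n_iter=1000, lr0=0.5, seed=0):
    """Train a rows x cols map; data has shape (n_samples, n_features)."""
    rng = np.random.default_rng(seed)
    data = np.asarray(data, dtype=float)
    sigma0 = max(rows, cols) / 2.0
    # Initialize every cell with a randomly chosen sample.
    picks = rng.integers(0, len(data), size=rows * cols)
    weights = data[picks].reshape(rows, cols, -1).copy()
    # Grid coordinates of all cells, needed for the neighbourhood function.
    grid = np.stack(
        np.meshgrid(np.arange(rows), np.arange(cols), indexing="ij"), axis=-1
    )
    for step in range(n_iter):
        frac = step / n_iter
        lr = lr0 * (1.0 - frac)                  # decaying learning rate
        sigma = max(sigma0 * (1.0 - frac), 0.5)  # shrinking neighbourhood
        x = data[rng.integers(len(data))]
        bmu = np.array(best_matching_unit(weights, x))
        grid_dist2 = ((grid - bmu) ** 2).sum(axis=-1)
        influence = np.exp(-grid_dist2 / (2.0 * sigma ** 2))
        # Pull each cell towards x, weighted by its grid distance to the BMU.
        weights += lr * influence[..., None] * (x - weights)
    return weights


# Two well-separated groups of (hypothetical) two-dimensional descriptors.
data = np.array([[0.0, 0.0]] * 10 + [[10.0, 10.0]] * 10)
som = train_som(data, rows=3, cols=3)
cell_low = best_matching_unit(som, data[0])
cell_high = best_matching_unit(som, data[-1])
```

Each grid cell ends up holding a prototype descriptor, so drawing the prototype of every cell yields the overview, and listing the series assigned to one cell yields the detail view.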


2.5 Visual Catalog

- Visual-interactive time series search
  - Query by example, query by sketch
  - Problem: full-sequence vs. subsequence search
  - Colormaps for the indication of similarity
  - List-based visualizations for result sets
  - Save session, export results
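The difference between full-sequence and subsequence search can be illustrated with a brute-force sketch. It assumes Euclidean distance on equally sampled series and NumPy 1.20 or later (for `sliding_window_view`); the function names are hypothetical.

```python
import numpy as np


def query_by_example(query, candidates, k=3):
    """Full-sequence search: rank equally long candidates by distance.

    Returns (candidate_index, distance) pairs for the k nearest candidates.
    """
    query = np.asarray(query, dtype=float)
    candidates = np.asarray(candidates, dtype=float)
    distances = np.linalg.norm(candidates - query, axis=1)
    order = np.argsort(distances)[:k]
    return [(int(i), float(distances[i])) for i in order]


def best_subsequence_match(query, series):
    """Subsequence search: slide the query over a longer series.

    Returns (offset, distance) of the best-matching window.
    """
    query = np.asarray(query, dtype=float)
    series = np.asarray(series, dtype=float)
    windows = np.lib.stride_tricks.sliding_window_view(series, len(query))
    distances = np.linalg.norm(windows - query, axis=1)
    best = int(np.argmin(distances))
    return best, float(distances[best])


hits = query_by_example([1, 2, 3], [[1, 2, 3], [3, 2, 1], [1, 2, 4]], k=2)
offset, dist = best_subsequence_match([2, 3, 4], [0, 1, 2, 3, 4, 5])
```

The returned distances are the kind of value a colormap could encode in the result view. Full-sequence search compares one vector per series, whereas subsequence search compares one vector per window, which is why it is the more expensive case.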


2.5 Visual Catalog

- Metadata search
  - Additional search functionality, combination with content-based search
  - Search by “bag of words”
  - Search by special attributes
    - Physical unit, example: temperature, humidity, etc.
    - Use time stamp: define time interval and quantization
- Combine multiple search operations: filtering
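Combining several search operations into one filter could be sketched as follows. The `Entry` class, its fields, and `filter_entries` are hypothetical names for illustration; the word bag is reduced to a set here because only word presence is tested.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import FrozenSet


@dataclass(frozen=True)
class Entry:
    """Hypothetical catalog entry holding only the searchable metadata."""
    name: str
    words: FrozenSet[str]   # lowercase metadata words
    unit: str               # physical unit of the measured parameter
    start: datetime         # first time stamp of the series
    end: datetime           # last time stamp of the series


def filter_entries(entries, keywords=(), unit=None, interval=None):
    """Keep entries that satisfy every given criterion (logical AND)."""
    result = []
    for e in entries:
        if any(k.lower() not in e.words for k in keywords):
            continue
        if unit is not None and e.unit != unit:
            continue
        if interval is not None:
            lo, hi = interval
            if e.end < lo or e.start > hi:  # no overlap with the interval
                continue
        result.append(e)
    return result


catalog = [
    Entry("A", frozenset({"temperature", "atlantic"}), "deg C",
          datetime(2005, 1, 1), datetime(2005, 12, 31)),
    Entry("B", frozenset({"humidity", "atlantic"}), "%",
          datetime(2005, 6, 1), datetime(2006, 6, 1)),
    Entry("C", frozenset({"temperature", "pacific"}), "deg C",
          datetime(2008, 1, 1), datetime(2008, 12, 31)),
]
hits = filter_entries(
    catalog,
    keywords=["Temperature"],
    unit="deg C",
    interval=(datetime(2005, 3, 1), datetime(2005, 4, 1)),
)
```

Since every criterion only narrows the result set, the same function can be applied to the output of a content-based search to combine both kinds of retrieval.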


Outline


1. Motivation

2. Practical Approach

3. Outlook


3 Outlook

Summary
- WGL-TIB project: visual access to scientific primary data
- Project start: 01/2010, project duration: 3 years
- Addressing time series data
- Feature-based descriptor approach
- Interfaces for visualization, browsing and search


3 Outlook

- Requirements, use cases and evaluation model (TIB Hannover)
  - Test visual catalog prototype with real-world problems
  - Collaboration with scientific users to capture user view
  - User-in-the-loop approach
- Technical future work (GRIS, FhG)
  - Establish an application prototype
  - Similarity search operations, based on evaluation results
  - Special user interface


Thank you for your attention

Comments very welcome

Do you have any questions?

Acknowledgements


Related Work

PANGAEA Publishing Network for Geoscientific & Environmental Data. (http://www.pangaea.de/)

PsychData National Repository for Psychological Research Data. (http://psychdata.zpid.de/ (in German))

Dryad Digital Repository for Data Underlying Published Works. (http://www.datadryad.org/)

ELIXIR European Life Sciences Infrastructure for Biological Information. (http://www.elixir-europe.org/)

KoLaWiss: Society for Scientific Data Processing Goettingen: Cooperative long-term preservation for research centers (in German). Project Report (2009)

PANGAEA PanPlot Tool. (http://doi.pangaea.de/10.1594/PANGAEA.330147)

Bernard, J. and Brase, J. and Fellner, D. and Koepler, O. and Kohlhammer, J. and Ruppert, T. and Schreck, T. and Sens, I.: A Visual Digital Library Approach for Time-Oriented Scientific Primary Data. Accepted at the 14th European Conference on Research and Advanced Technology for Digital Libraries (ECDL), 2010

Lin, J. and Keogh, E. and Lonardi, S. and Chiu, B.: A symbolic representation of time series, with implications for streaming algorithms. Proceedings of the 8th ACM SIGMOD workshop on Research issues in data mining and knowledge discovery, 2003
