On-the-fly Visualization of Scientific Geospatial Data Using Wavelets GeoDA Cyrus Shahabi, Farnoush...

Post on 26-Mar-2015

On-the-fly Visualization of Scientific Geospatial Data Using Wavelets

GeoDA

Cyrus Shahabi, Farnoush Banaei-Kashani, Kai Song

Outline

• Motivation and Problem Definition

• Our Solution: GeoDA
– Underlying Technology
• Background: Discrete Wavelet Transform
• WOLAP
– Prototype System Development

• Summary

• Future Work

USC-JPL SURP Project

Cyrus Shahabi and Farnoush Banaei-Kashani
Information Laboratory (InfoLab)
University of Southern California (USC)
Los Angeles, CA 90089
[shahabi,banaeika]@usc.edu
http://infolab.usc.edu

Yi Chao and Peggy Li
Climate, Oceans, and Solid Earth Science Section
Jet Propulsion Laboratory (JPL)
Pasadena, CA 91109
[yi.chao,peggy.li]@jpl.nasa.gov
http://science.jpl.nasa.gov/COSE/

Earth Science Data Visualization

Range Selection

Without Re-scaling

With Re-scaling

Aggregated query over latitude, longitude and/or time

[Figure panels: Range Selection; Range Re-scaling]
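The two operations named on this slide, range selection and re-scaling, can be sketched on a toy grid. This is only an illustration under stated assumptions: the function names `select_range` and `rescale`, the half-open index ranges, the use of `None` for null cells, and the sample values are all hypothetical and do not come from the GeoDA system.

```python
def select_range(grid, row_range, col_range):
    """Return the sub-grid for half-open index ranges [r0, r1) and [c0, c1)."""
    (r0, r1), (c0, c1) = row_range, col_range
    return [row[c0:c1] for row in grid[r0:r1]]


def rescale(grid, factor):
    """Average non-overlapping factor x factor blocks, ignoring None (null) cells."""
    rows, cols = len(grid), len(grid[0])
    coarse_grid = []
    for r in range(0, rows, factor):
        coarse_row = []
        for c in range(0, cols, factor):
            block = [
                grid[i][j]
                for i in range(r, min(r + factor, rows))
                for j in range(c, min(c + factor, cols))
                if grid[i][j] is not None
            ]
            # A block with no valid cells stays null in the coarse grid.
            coarse_row.append(sum(block) / len(block) if block else None)
        coarse_grid.append(coarse_row)
    return coarse_grid


# Hypothetical 4 x 4 grid of values; rows stand for latitude, columns for longitude.
sst = [
    [20, 22, 24, 26],
    [21, 23, None, 27],
    [18, 19, 20, 21],
    [17, 18, 19, 20],
]

window = select_range(sst, (0, 2), (0, 4))  # first two rows, all columns
coarse = rescale(window, 2)                 # one averaged value per 2 x 2 block
print(window)
print(coarse)
```

Each output cell of `rescale` is itself a range-aggregate (an average over a latitude/longitude block), which is why re-scaling can be treated as a batch of aggregated queries.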

Off-line vs. On-the-fly Visualization

• Off-line Visualization
– Pre-selected range (and resolution)
– Visualization by query pre-computation

• On-the-fly Visualization
– On-the-fly range (and resolution) selection
– Visualization by on-the-fly query computation to support dynamic data

Discrete Wavelet Transform

a = 80 70 60 90 37 67 50 50

Filters: {1/2, 1/2} (averages) and {1/2, -1/2} (details)

Level 1: averages 75 75 52 50, details 5 -15 -15 0
Level 2: averages 75 51, details 0 1
Level 3: average 63, detail 12

â = DWT(a) = Wa = 63 12 0 1 5 -15 -15 0 (coefficients listed coarsest first)

* For simplification, assume {1/2, 1/2} and {1/2, -1/2} as filters instead of the Haar filters {1/√2, 1/√2} and {1/√2, -1/√2}.

Multi-resolution view:

63 63 63 63 63 63 63 63
75 75 75 75 51 51 51 51
75 75 75 75 52 52 50 50
80 70 60 90 37 67 50 50

Compression!

Retained coefficients: 63 12 -15 -15
a′ = 75 75 60 90 36 66 50 50
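The averaging and differencing steps on this slide can be reproduced with a short sketch. It assumes the simplified filters {1/2, 1/2} and {1/2, -1/2} from the footnote, an input whose length is a power of two, and a coarsest-first coefficient order; the function names `haar_dwt` and `haar_idwt` are chosen here for illustration only.

```python
def haar_dwt(signal):
    """Full decomposition with the simplified filters {1/2, 1/2} and {1/2, -1/2}.

    Returns [overall average, coarsest detail, ..., finest details].
    Assumes len(signal) is a power of two.
    """
    averages = list(signal)
    details_by_level = []  # finest level is appended first
    while len(averages) > 1:
        pairs = list(zip(averages[0::2], averages[1::2]))
        details_by_level.append([(x - y) / 2 for x, y in pairs])
        averages = [(x + y) / 2 for x, y in pairs]
    coefficients = averages
    for details in reversed(details_by_level):
        coefficients = coefficients + details
    return coefficients


def haar_idwt(coefficients):
    """Invert haar_dwt: rebuild the signal from coarsest-first coefficients."""
    averages = list(coefficients[:1])
    position = 1
    while position < len(coefficients):
        details = coefficients[position:position + len(averages)]
        position += len(averages)
        rebuilt = []
        for average, detail in zip(averages, details):
            rebuilt += [average + detail, average - detail]
        averages = rebuilt
    return averages


a = [80, 70, 60, 90, 37, 67, 50, 50]
a_hat = haar_dwt(a)
print(a_hat)             # 63, 12, 0, 1, 5, -15, -15, 0 (as floats)
print(haar_idwt(a_hat))  # the original values of a (as floats)
```

Stopping the inverse loop early yields the coarser rows of the multi-resolution view, each value repeated over the cells it summarizes.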

Wavelets in Databases

Others' work¹:

Data Compression

– Reason: save space?

– Implicit reason: queries deal with smaller datasets and are hence faster

– Problems:

• Only approximate results!

• Very data-dependent

• Different error rates for different queries

Our work (WOLAP)²:

Query Compression

– Reason: fast response time

– Define a range-sum query as the dot product of a query vector and the data vector

– At query time, we know what is important to the pending query

– More opportunities:

• Progressive results

• Data-independent approximation

¹ See Vitter-CIKM'98, Vitter-SIGMOD'99, Agrawal-CIKM'00, Garofalakis-VLDB'00

² See Schmidt-PODS'02, Schmidt-EDBT'02, Jahangiri-SIGMOD'05

WOLAP Example

Original: a = 80 70 60 90 37 67 50 50
Wavelet*: â = 178.19 33.94 0 2 7.07 -21.21 -21.21 0 (coefficients listed coarsest first)

Query 1 (sum of all values):
– Original: q = 1 1 1 1 1 1 1 1, Result = 504
– Wavelet: q̂ = 2.83 0 0 0 0 0 0 0, Result = 178.19*2.83 = 504 (Parseval Theorem)

Query 2 (sum of the 3rd through 7th values):
– Original: q = 0 0 1 1 1 1 1 0, Result = 304
– Wavelet: q̂ = 1.77 -.35 -1 .5 0 0 0 .71, Result = 178.19*1.77 + 33.94*(-.35) + 2*.5 = 304 (Parseval Theorem)
– First two terms only: ~303 (99% accuracy!)

Cost: O(N) in the original domain (a), O(log N) in the wavelet domain (â); O(log N) << O(N)

* Here we assume the actual Haar filters: {1/√2, 1/√2} and {1/√2, -1/√2}
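The example above can be checked numerically with a small sketch. It assumes the orthonormal Haar filters {1/√2, 1/√2} and {1/√2, -1/√2} and a coarsest-first coefficient order. The progressive loop accumulates terms in that same order, which is a simplification made here; the cited WOLAP papers may order terms differently. The function names are illustrative.

```python
import math


def haar_orthonormal(signal):
    """Orthonormal Haar transform; coefficients are returned coarsest first."""
    scale = 1 / math.sqrt(2)
    averages = [float(value) for value in signal]
    details_by_level = []  # finest level is appended first
    while len(averages) > 1:
        pairs = list(zip(averages[0::2], averages[1::2]))
        details_by_level.append([(x - y) * scale for x, y in pairs])
        averages = [(x + y) * scale for x, y in pairs]
    coefficients = averages
    for details in reversed(details_by_level):
        coefficients = coefficients + details
    return coefficients


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


data = [80, 70, 60, 90, 37, 67, 50, 50]
query = [0, 0, 1, 1, 1, 1, 1, 0]  # sum of the 3rd through 7th values

data_hat = haar_orthonormal(data)
query_hat = haar_orthonormal(query)

exact = dot(query, data)                  # range-sum in the original domain
via_wavelets = dot(query_hat, data_hat)   # same value, up to rounding

# Progressive answer: add one non-zero query coefficient at a time.
partial_sums = []
running_total = 0.0
for q_coeff, d_coeff in zip(query_hat, data_hat):
    if q_coeff != 0:
        running_total += q_coeff * d_coeff
        partial_sums.append(running_total)

print(exact, round(via_wavelets, 6))
print([round(value, 2) for value in partial_sums])
```

Because the transform is orthonormal, the dot product is preserved, and only the non-zero query coefficients contribute to the sum. The second partial sum is the ~303 estimate quoted on the slide.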

WOLAP Query Complexity: O(log N)

Query (N = 16): 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 0

Level 1: averages 0 1.4 1.4 1.4 1.4 1.4 1.4 0.7, details 0 0 0 0 0 0 0 0.7
Level 2: averages 1.0 2.0 2.0 1.5, details -1 0 0 0.5
Level 3: averages 2.1 2.5, details -.7 0.4
Level 4: average 3.3, detail -.3

Query in wavelet domain: 3.3 -.3 -.7 0.4 -1 0 0 0.5 0 0 0 0 0 0 0 0.7

Assuming that the query is of size N:

Theorem 1: Using the "lazy wavelet transform" (computing only on the boundaries of the selected range), one can transform any polynomial range-aggregate query to the wavelet domain in O(log N).

Theorem 2: The query has O(log N) non-zero values in the wavelet domain.
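Theorem 2 can be illustrated empirically for the simplest case, a Haar transform of a 0/1 range indicator (a count or sum query). The sketch below computes the full transform in O(N), so it is not the lazy transform of Theorem 1; it only counts non-zero coefficients. For this case the count is at most 2·log2(N) + 1, because at each level only the two coefficients whose support straddles a range boundary can be non-zero. The helper names and the tolerance are assumptions made for the illustration.

```python
import math


def haar_orthonormal(signal):
    """Orthonormal Haar transform; coefficients are returned coarsest first."""
    scale = 1 / math.sqrt(2)
    averages = [float(value) for value in signal]
    details_by_level = []  # finest level is appended first
    while len(averages) > 1:
        pairs = list(zip(averages[0::2], averages[1::2]))
        details_by_level.append([(x - y) * scale for x, y in pairs])
        averages = [(x + y) * scale for x, y in pairs]
    coefficients = averages
    for details in reversed(details_by_level):
        coefficients = coefficients + details
    return coefficients


def range_indicator(size, low, high):
    """Query vector with 1 on indices low..high (inclusive) and 0 elsewhere."""
    return [1 if low <= index <= high else 0 for index in range(size)]


def count_nonzero(coefficients, tolerance=1e-12):
    return sum(1 for value in coefficients if abs(value) > tolerance)


counts = {}
for exponent in range(3, 13):
    size = 2 ** exponent
    query = range_indicator(size, 2, size - 2)  # same shape as the slide's query
    counts[size] = count_nonzero(haar_orthonormal(query))
    print(size, counts[size], 2 * exponent + 1)  # N, non-zeros, bound
```

For N = 16 this counts the seven non-zero values shown in the wavelet-domain query on the slide, while N itself grows much faster than the count.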

Related Work

Abbadi-ICDE'99

Agrawal-SIGMOD'97

Abbadi-Dawak'00

d = number of dimensions
N = domain size for each dimension

GeoDA Architecture

[Figure: three-tier architecture]
Tiers: Presentation Tier, Query Tier, Data Tier
Components: Google Map Mashup, Plotting Tools, WOLAP Query Engine (ProDA), Wavelet Datacubes, NC Files, Text Files

Helene Data

Helene Dataset
– 10+ dimensions (selected longitude and latitude)
– 100+ variables (selected SST)
– 1km by 1km resolution, daily samples, world-wide
– 36000 × 18000 data points per sample (~1/3 of which are null)

Helene Datacube
– Dimensions: Latitude, Longitude
– Variable: SST

Presentation Tier

Implementation
– Cross-language development: JavaScript, C#, ASP.NET AJAX
– Multi-threaded programming

Progressive Visualization

GeoDA

Summary

• We devised a framework for on-the-fly visualization of large-scale scientific datasets.

• We designed and exploited a fast range-aggregate query processing technique, WOLAP, that enables on-the-fly visualization. WOLAP supports the family of polynomial range-aggregate queries.

• We developed a prototype system, GeoDA, as a proof-of-concept based on the designed visualization framework and query processing technique.

Future Work

• Supporting dynamic datasets by extending WOLAP to handle appends from a data stream directly in the wavelet domain.

• Enhancing WOLAP via caching, to enable group/batch aggregate queries.

Q & A