
Desktop techniques for the exploration of terascale size, time-varying data sets

Page 1: Desktop techniques for the exploration of terascale size, time-varying data sets

SC05, November 2005 [email protected]

Supercomputing • Communications • Data

NCAR Scientific Computing Division

Desktop techniques for the exploration of terascale size, time-varying data sets

John Clyne & Alan Norton
Scientific Computing Division
National Center for Atmospheric Research
Boulder, CO USA

Page 2: Desktop techniques for the exploration of terascale size, time-varying data sets

National Center for Atmospheric Research

• Space Weather
• Turbulence
• Atmospheric Chemistry
• Climate
• Weather
• The Sun

More than just the atmosphere… from the earth’s oceans to the solar interior

Page 3: Desktop techniques for the exploration of terascale size, time-varying data sets

Goals

1. Improve scientists' ability to investigate and understand complex phenomena found in high-resolution fluid flow simulations
– Accelerate the analysis process and improve scientific productivity
– Enable exploration of data sets heretofore impractical due to unwieldy size
– Gain insight into physical processes governing fluid dynamics widely found in the natural world

2. Demonstrate visualization's ability to aid in the day-to-day scientific discovery process

Page 4: Desktop techniques for the exploration of terascale size, time-varying data sets

Problem motivation: Analysis of high-resolution numerical turbulence simulations

• Simulations are huge!!
– May require months of supercomputer time
– Multivariate (typically 5 to 8 variables)
– Time-varying data
– A single experiment may yield terabytes of numerical data

• Analysis requirements are formidable
– Numerical outputs simulate phenomena not easily observed!!!
– Interesting domain regions (ROIs) may not be known a priori

• Additionally…
– Historical focus of computing centers on batch processing
– Dichotomy of batch and interactive processing needs
– Currently available analysis tools inadequate for large data needs
  • Single-threaded, 32-bit, in-core algorithms
  • Lack advanced visualization capabilities
– Currently available visualization tools ill-suited for analysis

Page 5: Desktop techniques for the exploration of terascale size, time-varying data sets

[Numerical] models that can currently be run on typical supercomputing platforms produce data in amounts that make storage expensive, movement cumbersome, visualization difficult, and detailed analysis impossible.  The result is a significantly reduced scientific return from the nation's largest computational efforts.

And furthermore…

Page 6: Desktop techniques for the exploration of terascale size, time-varying data sets

A sampling of various technology performance curves

• Not all technologies advance at the same rate!!!

[Figure: Performance gains from 1980 to present. Vertical axis: improvement (log scale, 1 to 100,000). Series: disk drive internal data rate, disk drive interface data rate, Ethernet network bandwidth, Intel microprocessor clock speed, drive capacity.]

Page 7: Desktop techniques for the exploration of terascale size, time-varying data sets

Example: Compressible plume dynamics

• 504x504x2048
• 5 variables (u, v, w, rho, temp)
• ~500 time steps saved
• 9 TB of storage
• Six months of compute time required on 112 IBM SP RS/6000 processors
• Three months for post-processing
• Data may be analyzed for several years

M. Rast, 2004. Image courtesy of Joseph Mendoza, NCAR/SCD
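A rough size estimate for the figures above can be sketched in Python. The grid size, variable count, and step count are taken from the slide; the bytes-per-value choices are assumptions, since the slides do not state the floating-point precision, and the step count is only given as approximately 500.

```python
# Back-of-the-envelope storage estimate for the compressible plume run.
# Grid size, variable count, and step count come from the slide; the
# bytes-per-value choices below are assumptions for illustration only.

NX, NY, NZ = 504, 504, 2048
NUM_VARIABLES = 5       # u, v, w, rho, temp
NUM_TIME_STEPS = 500    # the slide says "~500"


def dataset_bytes(bytes_per_value):
    """Total raw size in bytes of all variables over all saved time steps."""
    points_per_volume = NX * NY * NZ
    return points_per_volume * NUM_VARIABLES * NUM_TIME_STEPS * bytes_per_value


if __name__ == "__main__":
    for label, width in (("32-bit floats", 4), ("64-bit floats", 8)):
        terabytes = dataset_bytes(width) / 1e12
        print(f"{label}: {terabytes:.1f} TB")
```

Under these assumptions the raw output is roughly 5 TB in single precision and roughly 10 TB in double precision, a range that contains the 9 TB quoted on the slide.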

Page 8: Desktop techniques for the exploration of terascale size, time-varying data sets

Visualization and Analysis Platform for oceanic, atmospheric, and solar Research (VAPoR)

Key components
• Domain specific: numerically simulated turbulence in the natural sciences
• Data processing language: data post-processing and quantitative analysis
• Advanced visualization: identify spatial/temporal ROIs
• Multiresolution: enable speed/quality tradeoffs

This work is funded in part through a U.S. National Science Foundation, Information Technology Research program grant

Combination of visualization with multiresolution data representation that provides sufficient data reduction to enable interactive work on time-varying data

Page 9: Desktop techniques for the exploration of terascale size, time-varying data sets

Page 10: Desktop techniques for the exploration of terascale size, time-varying data sets

Multiresolution Data Representation

• Geometry reduction (Schroeder et al., 1992; Lindstrom & Silva, 2001; Shaffer & Garland, 2001)

• Wavelet-based progressive data access
– Mathematical transforms similar to Fourier transformations
– Invertible and lossless
– Numerically efficient forward and inverse transform
– No additional storage costs
– Permit hierarchical representations of functions
– See Clyne, VIIP2003

[Figure: Visualization Pipeline. Stages: Data Source, Transform (e.g., iso, cut plane), Render, Analyze & Manipulate. Flows between stages: data, geometry, pixels, and text/2D graphics. Two points in the pipeline are marked "Reduce".]

• Data reduction (Cignoni et al., 1994; Wilhelms & Van Gelder, 1994; Pascucci & Frank, 2001; Clyne, 2003)
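The wavelet properties listed on this slide (invertible, no additional storage, hierarchical) can be illustrated with a minimal one-dimensional sketch. The slides do not say which wavelet is used, so the Haar averaging/differencing scheme below is an assumption chosen for brevity, and the function names `haar_forward` and `haar_reconstruct` are hypothetical.

```python
# Minimal 1D Haar-style multiresolution sketch (an assumed wavelet; the
# slides do not name the one actually used). Input length must be a power of two.

def haar_forward(signal):
    """Return coefficients laid out as [coarsest average | details, coarse to fine]."""
    n = len(signal)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    coeffs = [float(v) for v in signal]
    while n > 1:
        half = n // 2
        averages = [(coeffs[2 * i] + coeffs[2 * i + 1]) / 2 for i in range(half)]
        details = [(coeffs[2 * i] - coeffs[2 * i + 1]) / 2 for i in range(half)]
        coeffs[:n] = averages + details   # same length: no extra storage
        n = half
    return coeffs


def haar_reconstruct(coeffs, levels_dropped=0):
    """Rebuild the signal, ignoring the `levels_dropped` finest detail levels."""
    target = len(coeffs) >> levels_dropped
    if target < 1:
        raise ValueError("too many levels dropped")
    approx = coeffs[:1]
    n = 1
    while n < target:
        detail = coeffs[n:2 * n]
        finer = []
        for a, d in zip(approx, detail):
            finer.extend((a + d, a - d))  # inverse of average/difference
        approx = finer
        n *= 2
    return approx


if __name__ == "__main__":
    data = [4, 6, 10, 12, 8, 6, 5, 5]
    c = haar_forward(data)
    print(c)                          # same number of values as the input
    print(haar_reconstruct(c))        # full resolution
    print(haar_reconstruct(c, 1))     # half resolution: pairwise averages
```

Reading only a prefix of the coefficient array yields a coarser approximation, which is the progressive-access idea: a viewer can start from the coarse prefix and fetch detail levels only where needed. The reconstruction is exact for this small integer example; general floating-point data may see rounding in the last bits.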

Page 11: Desktop techniques for the exploration of terascale size, time-varying data sets

Putting it all together

• Visual data browsing permits rapid identification of features of interest, reducing data domain
• Multiresolution data representation affords a second level of data reduction by permitting speed/quality trade-offs, enabling rapid hypothesis testing
• Quantitative operators and data processing enable data analysis
• Result: Integrated environment for large-data exploration and discovery

Goal: Avoid unnecessary and expensive full-domain calculations
– Execute on human time scales!!!

[Figure: Workflow linking visual data browsing, data manipulation, and quantitative analysis, with "Refine" and "Coarsen" steps.]

Page 12: Desktop techniques for the exploration of terascale size, time-varying data sets

Compressible Convection

[Figure: Compressible convection at 128³ and 512³ resolution. M. Rast, 2002]

Page 13: Desktop techniques for the exploration of terascale size, time-varying data sets

Compressible plume

Resolution       Problem size
504x504x2048     Full
252x252x1024     1/8
126x126x512      1/64
63x63x256        1/512

Compressible plume data set shown at native and progressively coarser resolutions
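The relationship between resolution and problem size on this slide follows from halving each of the three axes per level. A small sketch, with the helper name `resolution_levels` being hypothetical:

```python
# Each coarsening level halves every axis, so the voxel count drops by 2**3 = 8.

def resolution_levels(shape, num_levels):
    """Return (level, coarsened shape, fraction of full voxel count) per level."""
    full_points = shape[0] * shape[1] * shape[2]
    levels = []
    for level in range(num_levels + 1):
        dims = tuple(n >> level for n in shape)   # integer halving per level
        points = dims[0] * dims[1] * dims[2]
        levels.append((level, dims, points / full_points))
    return levels


if __name__ == "__main__":
    for level, dims, fraction in resolution_levels((504, 504, 2048), 3):
        print(level, "x".join(str(d) for d in dims), f"1/{round(1 / fraction)}")
```

For the plume grid the dimensions divide evenly at all three levels; for sizes that are not multiples of a power of two, the integer halving used here is only an approximation of how a real multiresolution scheme would treat the boundary.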

Page 14: Desktop techniques for the exploration of terascale size, time-varying data sets

Rendering timings

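[Figure: Two log-scale plots of rendering time in seconds versus resolution (Full, 1/2, 1/4, 1/8). One plot shows series Mdb and Vtk on an axis from 0.1 to 1000 seconds; the other shows series Mdb on an axis from 0.01 to 10 seconds. Panel titles: 512³ Compressible Convection; 504²x2048 Compressible Plume.]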

Reduced resolution affords responsive interaction while preserving all but finest features

SGI Octane2, 1x600MHz R14k

SGI Origin, 10x600MHz R14k

Interactive!!

Page 15: Desktop techniques for the exploration of terascale size, time-varying data sets

Derived quantities

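$p$: pressure
$\rho$: density
$T$: temperature
$\chi$: ionization potential
$N$: Avogadro's number
$m_e$: electron mass
$k$: Boltzmann's constant
$h$: Planck's constant

(1) $p = \rho T$

(2) $\frac{y^2}{1-y} = \frac{1}{\rho N}\left(\frac{2\pi m_e k}{h^2}\right)^{3/2} T^{3/2} e^{-\chi/kT}$

(3) $\omega^2 = |\nabla \times \mathbf{u}|^2$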

Derived quantities produced from the simulation’s field variables as a post-process

Page 16: Desktop techniques for the exploration of terascale size, time-varying data sets

Calculation timings for derived quantities

[Figure: Calculation time in seconds (log scale, 0.01 to 10,000) versus resolution (Full, 1/2, 1/4, 1/8). Series: pressure (eq 1), ionization (eq 2), enstrophy (eq 3).]

Note: 1/2 resolution is 1/8 problem size, etc.

Deriving new quantities on interactive time scales only possible with data reduction

SGI Origin, 10x600MHz R14k

Page 17: Desktop techniques for the exploration of terascale size, time-varying data sets

Error in approximations

• Error is highly dependent on operation performed

• Algebraic operations tested introduced low error even after substantial coarsening

• Error grows rapidly for gradient calculation

• Point-wise error gives no indication of global (average) error

Point-wise, normalized, maximum, absolute error:

$\max_i \left( \frac{|s_i - \hat{s}_i|}{|s_i|} \right)$

Resolution   P (Eq 1)   Y (Eq 2)   ω² (Eq 3)
Full         0          0          0
1/2          1.09       0.03       85.57
1/4          2.53       0.14       97.3
1/8          3.79       0.65       99.8
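The error measure named on this slide, and the contrast with an average error, can be sketched as follows. The function names are hypothetical, and skipping points where the reference value is zero is an assumption made here to avoid division by zero; the slides do not say how such points were handled.

```python
# Point-wise, normalized, maximum absolute error between a full-resolution
# result s and its approximation s_hat, plus a mean for comparison.

def _normalized_errors(reference, approximation):
    """Per-point |s - s_hat| / |s|, skipping zero reference values (assumption)."""
    return [abs(s - s_hat) / abs(s)
            for s, s_hat in zip(reference, approximation) if s != 0]


def max_normalized_error(reference, approximation):
    """Worst-case point-wise normalized error."""
    return max(_normalized_errors(reference, approximation))


def mean_normalized_error(reference, approximation):
    """Average point-wise normalized error over the same points."""
    errors = _normalized_errors(reference, approximation)
    return sum(errors) / len(errors)


if __name__ == "__main__":
    exact = [2.0, 4.0, 5.0, 10.0]
    coarse = [2.0, 3.0, 5.0, 10.0]   # made-up values: one bad point, rest exact
    print(max_normalized_error(exact, coarse))
    print(mean_normalized_error(exact, coarse))
```

In this made-up example a single poorly approximated point dominates the maximum while the mean stays small, which is the sense in which a point-wise maximum says little about the global (average) error.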

Page 18: Desktop techniques for the exploration of terascale size, time-varying data sets

Integrated visualization and analysis on interactively selected subdomains:

[Figure: Vertical vorticity of the flow and Mach number of the vertical velocity, each shown for the full domain seen from above and for a subdomain seen from the side.]

Efficient analysis requires rapid calculation and visualization of unanticipated derived quantities. This can be facilitated by a combination of subdomain selection and resolution reduction.

Page 19: Desktop techniques for the exploration of terascale size, time-varying data sets

A test of multiresolution analysis: Force balance in supersonic downflows

Sites of supersonic downflow are also those of very high vertical vorticity. The cores of the vortex tubes are evacuated, with centripetal acceleration balancing that due to the inward-directed pressure gradient. Buoyancy forces are maximum on the tube periphery due to mass flux convergence.

The same interpretation results from analysis at half resolution.

[Figure: Force-balance terms for the selected subdomain, plotted at full and at half resolution.]

Subdomain selection and reduced resolution together yield data reduction by a factor of 128
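The factor of 128 can be read as a product of two reductions. Halving the resolution along each of three axes reduces the problem size by a factor of $2^3 = 8$, as noted on the timing slide. The remaining factor of 16 for subdomain selection is inferred here from the quoted total and is not stated on the slide:

$$
\underbrace{2^3}_{\text{half resolution}} \times \underbrace{16}_{\text{subdomain (inferred)}} = 128
$$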

Page 20: Desktop techniques for the exploration of terascale size, time-varying data sets

Summary

• Presented a prototype integrated analysis environment aimed at aiding investigation of high-resolution numerical fluid flow simulations

• Orders of magnitude data reduction achieved through:
1. Visualization: Reduce full domain to ROI
2. Multiresolution: Enable speed/quality trade-offs

• Coarsened data frequently suitable for rapid hypothesis testing that may later be verified at full resolution

Page 21: Desktop techniques for the exploration of terascale size, time-varying data sets

Future work

• Quantify and predict error in results obtained with various mathematical operations applied to coarsened data
• Investigate lossy and lossless data compression
• Add support for less regular meshes
• Explore other scientific domains
– Climate, weather, atmospheric chemistry, …

Page 22: Desktop techniques for the exploration of terascale size, time-varying data sets

Future???

[Figure: Original versus 20:1 lossy compression]

Page 23: Desktop techniques for the exploration of terascale size, time-varying data sets

Acknowledgements

• Steering Committee
– Nic Brummell - CU, JILA
– Aimé Fournier - NCAR, IMAGe
– Helene Politano - Observatoire de la Cote d'Azur
– Pablo Mininni - NCAR, IMAGe
– Yannick Ponty - Observatoire de la Cote d'Azur
– Annick Pouquet - NCAR, ESSL
– Mark Rast - NCAR, HAO
– Duane Rosenberg - NCAR, IMAGe
– Matthias Rempel - NCAR, HAO
– Yuhong Fan - NCAR, HAO

• Developers
– Alan Norton - NCAR, SCD
– John Clyne - NCAR, SCD

• Research Collaborators
– Kwan-Liu Ma - U.C. Davis
– Hiroshi Akiba - U.C. Davis
– Han-Wei Shen - Ohio State
– Liya Li - Ohio State

• Systems Support
– Joey Mendoza - NCAR, SCD

Page 24: Desktop techniques for the exploration of terascale size, time-varying data sets

Questions???

http://www.scd.ucar.edu/hss/dasg/software/vapor

