End to End Computing at ORNL
Presented by Scott A. Klasky
Computing and Computational Science Directorate, Center for Computational Sciences
Petascale data workspace
[Diagram: a massively parallel simulation feeding data into the petascale data workspace]
Impact areas of R&D
• Asynchronous I/O using data-in-transit (N×M) techniques (M. Parashar, Rutgers; M. Wolf, GT)
• Workflow automation (SDM Center; A. Shoshani, LBNL; N. Podhorszki, UC Davis)
• Dashboard front end (R. Barreto, ORNL; M. Vouk, NCSU)
• High-performance metadata-rich I/O (C. Jin, ORNL; M. Parashar, Rutgers)
• Logistical networking integrated into data analysis/visualization (M. Beck, UTK)
• CAFÉ common data model (L. Pouchard, ORNL; M. Vouk, NCSU)
• Real-time monitoring of simulations, with real-time visualization and data analysis in the monitor (S. Ethier, PPPL; J. Cummings, Caltech; Z. Lin, U.C. Irvine; S. Ku, NYU)
Hardware architecture
[Architecture diagram: Jaguar (Cray XT3) compute and I/O nodes write restart files to Lustre; login nodes handle job and simulation control; data flow to the Ewok end-to-end cluster (Ewok-web, Ewok-sql, NFS DB, Lustre2) and on to HPSS for archival, with a link to Seaborg; the Jaguar–Ewok connection runs at roughly 40 Gb/s, with sockets planned later]
Asynchronous petascale I/O for data in transit
• High-performance I/O: asynchronous, with managed buffers, respecting firewall constraints
• Enable dynamic control with flexible M×N operations
• Transform data using the shared-space framework (Seine); a minimal sketch of the buffered asynchronous pattern follows the layer diagram below
[Seine layer stack: user applications and other programming paradigms sit on the Seine coupling framework interface, which provides shared-space management with load balancing, a directory layer and a storage layer, and a communication layer (buffer management) on top of the operating system]
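This is not Seine's actual API; it is a minimal Python sketch, under assumed names, of the buffered asynchronous write pattern described above: the simulation hands each time step to a managed, bounded buffer and keeps computing while a background thread drains the buffer, standing in for the data-in-transit hop.

```python
# Minimal sketch (not the Seine API): asynchronous, buffer-managed output.
# The simulation enqueues data and keeps computing; a background thread
# drains the bounded buffer to disk, standing in for the data-in-transit hop.
import queue
import threading

import numpy as np


class AsyncWriter:
    def __init__(self, path, max_buffered_steps=4):
        self._queue = queue.Queue(maxsize=max_buffered_steps)  # managed buffer
        self._file = open(path, "wb")
        self._thread = threading.Thread(target=self._drain, daemon=True)
        self._thread.start()

    def write(self, step, data):
        # Returns quickly unless the buffer is full (backpressure).
        self._queue.put((step, np.ascontiguousarray(data)))

    def _drain(self):
        while True:
            item = self._queue.get()
            if item is None:
                break
            step, data = item
            data.tofile(self._file)  # stand-in for the transit/storage hop

    def close(self):
        self._queue.put(None)
        self._thread.join()
        self._file.close()


if __name__ == "__main__":
    writer = AsyncWriter("restart.bin")
    for step in range(10):
        field = np.random.rand(256, 256)   # stand-in for one time step of output
        writer.write(step, field)          # overlap I/O with computation
    writer.close()
```

The bounded queue supplies the backpressure a real managed-buffer scheme needs so that output bursts cannot exhaust memory on the compute nodes.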
Workflow automation
• Automate the data-processing pipeline with the Kepler workflow system: transfer of simulation output to the end-to-end system, execution of conversion routines, image creation, and archival (a minimal scripted stand-in is sketched after the list below)
• Requirements for petascale computing:
  – Easy to use
  – Dashboard front end
  – Autonomic
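The production pipeline is orchestrated by Kepler; as an illustration only, the Python sketch below strings the same stages together in order (transfer to the e2e system, conversion, image creation, archival). The host name, directory names, and the conversion/rendering commands are hypothetical placeholders, and `hsi put` stands in for the HPSS archival step.

```python
# Hypothetical stand-in for the Kepler pipeline: watch for finished output
# files, then transfer, convert, render, and archive each one in turn.
# Host names, paths, and commands below are illustrative placeholders.
import subprocess
import time
from pathlib import Path

RUN_DIR = Path("/lustre/scratch/run042")      # simulation output (placeholder)
E2E_HOST = "ewok.example.ornl.gov"            # end-to-end cluster (placeholder)

def run(*cmd):
    print("->", " ".join(cmd))
    subprocess.run(cmd, check=True)

def process(step_file: Path):
    remote = f"{E2E_HOST}:/work/incoming/{step_file.name}"
    run("scp", str(step_file), remote)                                        # transfer to e2e system
    run("ssh", E2E_HOST, f"convert_to_hdf5 /work/incoming/{step_file.name}")  # conversion (placeholder command)
    run("ssh", E2E_HOST, f"render_images /work/incoming/{step_file.name}")    # image creation (placeholder command)
    run("ssh", E2E_HOST, f"hsi put /work/incoming/{step_file.name}")          # archive to HPSS

seen = set()
while True:
    for f in sorted(RUN_DIR.glob("restart.*.bin")):
        if f not in seen:
            process(f)
            seen.add(f)
    time.sleep(30)   # poll for new time steps
```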
Dashboard front end
• The desktop-style interface uses Asynchronous JavaScript and XML (AJAX) and runs in a web browser
• AJAX is a combination of technologies coming together in powerful ways (XHTML, CSS, DOM, XML, XSLT, XMLHttpRequest, and JavaScript)
• The user's interaction with the application happens asynchronously, independent of communication with the server
• Users can rearrange the page without clearing it
High-performance metadata-rich I/O
Two-step process to produce files (sketched below):
• Step 1: Write out binary data plus tags using parallel I/O on the XT3 (this may or may not use files; it could use asynchronous I/O methods). The tags contain the metadata that is placed inside the files. The workflow then transfers this information to Ewok (an InfiniBand cluster with 160 processors).
• Step 2: A service on Ewok decodes the files into HDF5 files and places the metadata into an XML file (one XML file for all of the data).
• This cuts I/O overhead in GTC from 25% to <3%.
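The actual GTC/XGC output path and tag format are not shown here; the following Python sketch is an assumed illustration of the two steps, with step 1 dumping raw binary arrays plus a small tag record and step 2 decoding them into HDF5 while collecting the tags into a single XML metadata file. All file layouts and field names are hypothetical.

```python
# Hedged sketch of the two-step metadata-rich I/O (layout and names are assumed).
import json
import xml.etree.ElementTree as ET

import h5py
import numpy as np

# Step 1 (on the compute side): dump raw binary + a lightweight tag record.
def write_step1(prefix, name, data, units):
    data = np.asarray(data)
    data.tofile(f"{prefix}.bin")                       # fast binary dump
    tags = {"name": name, "dtype": str(data.dtype),
            "shape": list(data.shape), "units": units,
            "min": float(data.min()), "max": float(data.max())}
    with open(f"{prefix}.tag", "w") as f:
        json.dump(tags, f)                              # metadata travels with the data

# Step 2 (service on the e2e cluster): decode into HDF5, collect tags into XML.
def decode_step2(prefixes, h5_path, xml_path):
    root = ET.Element("run_metadata")
    with h5py.File(h5_path, "w") as h5:
        for prefix in prefixes:
            with open(f"{prefix}.tag") as f:
                tags = json.load(f)
            data = np.fromfile(f"{prefix}.bin", dtype=tags["dtype"]).reshape(tags["shape"])
            h5.create_dataset(tags["name"], data=data)
            var = ET.SubElement(root, "variable", name=tags["name"], units=tags["units"])
            ET.SubElement(var, "range", min=str(tags["min"]), max=str(tags["max"]))
    ET.ElementTree(root).write(xml_path)

write_step1("potential.0001", "potential", np.random.rand(64, 64), "V")
decode_step2(["potential.0001"], "run.h5", "run.xml")
```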
Logistical networking: High-performance, ubiquitous, and transparent data access over the WAN
[Diagram: data from Jaguar (Cray XT3, via Portals) and the Ewok cluster are staged to logistical-networking depots, tracked by a directory server, and accessed over the WAN by collaborating sites (NYU, PPPL, UCI, MIT)]
CAFÉ common data model
• Covers combustion (S3D), astrophysics (Chimera), and fusion (GTC/XGC) environments
• A scientist can seamlessly find the input and output variables of a given run, and the units, average, min, and max values for a variable
• Stores and organizes four types of information about a given run (a hypothetical record layout is sketched below): provenance, operational profile, hardware mapping, and analysis metadata
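The actual CAFÉ schema is not reproduced here; the Python sketch below only illustrates, with hypothetical field names, how a run record could group the four kinds of information while keeping per-variable units and min/max/average values directly queryable.

```python
# Hypothetical sketch of a CAFÉ-style run record (field names are assumed,
# not the actual CAFÉ schema): four groups of metadata per run, plus
# per-variable statistics a scientist can query directly.
from dataclasses import dataclass, field


@dataclass
class Variable:
    name: str
    role: str          # "input" or "output"
    units: str
    average: float
    minimum: float
    maximum: float


@dataclass
class RunRecord:
    run_id: str
    provenance: dict = field(default_factory=dict)           # code version, inputs, who/when
    operational_profile: dict = field(default_factory=dict)  # wall time, steps, I/O volume
    hardware_mapping: dict = field(default_factory=dict)     # machine, node/processor layout
    analysis_metadata: dict = field(default_factory=dict)    # derived quantities, images
    variables: list[Variable] = field(default_factory=list)

    def find(self, role):
        return [v for v in self.variables if v.role == role]


run = RunRecord(
    run_id="gtc_2006_06_11_042",
    provenance={"code": "GTC", "version": "assumed"},
    hardware_mapping={"machine": "Jaguar Cray XT3"},
)
run.variables.append(Variable("potential", "output", "V", 0.02, -1.3, 1.4))
print([v.name for v in run.find("output")])   # -> ['potential']
```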
Real-time monitoring of simulations
• Using end-to-end technology, we have created a monitoring technique for fusion codes (XGC, GTC); a minimal sketch of the monitoring loop follows this list
• Kepler is used to automate the steps; the data-movement task will use the Rutgers/GT data-in-transit routines
• Metadata are added during the movement from the XT3 to the IB cluster; data go from N to M processors using the Seine framework
• Data are archived in the High Performance Storage System (HPSS), and metadata are placed in a database
• Visualization and analysis services produce data/information that are placed on the AJAX front end
• Data are replicated from Ewok to other sites using logistical networking
• Users access the files from our metadata server at ORNL
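The real pipeline is driven by Kepler and the Seine data-in-transit routines; the Python sketch below is only an assumed, simplified version of the monitoring idea: poll for newly arrived time-step files on the e2e cluster, reduce each to a volume-averaged quantity, and refresh a plot that the dashboard front end can display. Paths and file names are placeholders.

```python
# Hedged sketch of the real-time monitor (file names and locations assumed):
# poll for newly transferred time steps, reduce each to a volume-averaged
# quantity, and refresh a plot that the dashboard front end can serve.
import time
from pathlib import Path

import matplotlib
matplotlib.use("Agg")                 # render without a display
import matplotlib.pyplot as plt
import numpy as np

INCOMING = Path("/work/incoming")     # placeholder: where the workflow drops data
PLOT = Path("/work/dashboard/volume_average.png")

history = {}
while True:
    for f in sorted(INCOMING.glob("potential.*.bin")):
        step = int(f.suffixes[0].lstrip("."))
        if step not in history:
            field = np.fromfile(f, dtype=np.float64)
            history[step] = field.mean()          # volume-averaged quantity
    if history:
        steps = sorted(history)
        plt.figure()
        plt.plot(steps, [history[s] for s in steps])
        plt.xlabel("time step")
        plt.ylabel("volume-averaged potential")
        plt.savefig(PLOT)
        plt.close()
    time.sleep(5)                                  # monitor cadence
```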
• Scientists need tools that let them see and manipulate their simulation data quickly
• Archiving data, staging it to secondary or tertiary computing resources for annotation and visualization, staging it yet again to a web portal…. These approaches work, but we can accelerate the rate of insight for some scientists by allowing them to observe data during a run
• The per-hour facilities cost of running a leadership-class machine is staggering
• Computational scientists should have tools that let them constantly observe and adjust their runs during their scheduled time slice, just as astronomers at an observatory or physicists at a beam source can
Real-time monitoring
Typical monitoring
• Look at volume-averaged quantities
• At four key times, this quantity looks good
• The code had one error that did not appear in the typical ASCII output used to generate this graph
• Typically, users run gnuplot/grace to monitor output

More advanced monitoring
• In 5 seconds, move 300 MB and process the data
• Need to use an FFT for the 3-D data, then process data + particles
  – 50 seconds (10 time steps) to move and process the data
  – 8 GB for 1/100 of the 30 billion particles
• Demands low overhead: <3%!
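As a rough, illustrative check of what those figures imply (assuming the quoted windows are sustained requirements), the implied transfer rates and the blocking-time budget behind the <3% target work out as follows.

```python
# Back-of-the-envelope rates implied by the monitoring figures above
# (illustrative only; assumes the quoted windows are sustained requirements).
fft_rate = 300 / 5                  # 300 MB moved in 5 s  -> 60 MB/s
particle_rate = 8 * 1024 / 50       # 8 GB moved in 50 s   -> ~164 MB/s

# If 10 time steps span roughly 50 s, a <3% overhead target leaves at most
# 0.03 * 50 = 1.5 s of simulation blocking per window, which is why the
# data movement has to overlap computation rather than run in-line.
blocking_budget = 0.03 * 50

print(f"{fft_rate:.0f} MB/s, {particle_rate:.0f} MB/s, {blocking_budget:.1f} s budget")
```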
Contact
Scott A. Klasky
Lead, End-to-End Solutions
Center for Computational Sciences
(865) [email protected]