Paje for Exascale Performance Visualization Analysis
Lucas Mello Schnorr (LIG – CNRS)
TGCC, CEA, Workshop on Tools for Exascale
Bruyères-le-Châtel, France
October 2nd, 2012
Introduction
Exascale computing → many challenges
  Hardware: pure or hybrid? specialized or generic?
  → ARM-based processors for Exascale?
  Languages, DAGs for Exascale?
Performance analysis
  Registering behavior
  → Low intrusion, necessary amount of details
  Understanding behavior
  → Large volumes, scalable, interactive
Trace visualization for exascale performance analysis
Outline
1 Challenges on Trace Visualization
2 PajéNG Visualization Tool
3 Alternative Visualization Techniques
4 Conclusion
Motivation and Challenges
Supercomputers today
Sequoia (IBM BlueGene/Q) with 1,572,864 cores (#1 - Top500 - June/2012)
Exascale: billions of execution flows
Space/Time trace size explosion
Many entities in space + Very detailed behavior in time
Performance Visualization
→ How to keep the representation useful at scale?
Example 1
BOINC availability trace file for one volunteer
→ Availability is either true or false
GNUPlot to a vector file: 8-month and 12-day zoom
Trace visualization (plots)
[Screenshots: the plot as rendered by Acroread and by Evince]
Example 2
MPI Sweep3D – 16 processes
Execution on the Griffon cluster of Grid’5000
→ traced with TAU
→ converted to Paje with tau2paje
→ converted to CSV with pj_dump
Execution time: about 2 seconds
Each MPI rank has about 20K states
Gantt-charts by R → vector file
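A rough sketch of why such a Gantt chart overloads the screen, using the figures quoted on the slides (16 ranks, about 20K states per rank, 1,572,864 cores on Sequoia). The 1920x1080 display size and the function names are assumptions made only for this illustration.

```python
# Back-of-the-envelope estimate of how much trace data competes for each pixel.
RANKS = 16                  # from the Sweep3D example
STATES_PER_RANK = 20_000    # "about 20K states" per MPI rank
SEQUOIA_CORES = 1_572_864   # core count quoted for Sequoia
SCREEN_WIDTH_PX = 1920      # assumption: full-HD display
SCREEN_HEIGHT_PX = 1080     # assumption: full-HD display


def states_per_pixel(states_per_rank: int, width_px: int) -> float:
    """Average number of states mapped to one horizontal pixel of one row."""
    return states_per_rank / width_px


def rows_per_pixel(processes: int, height_px: int) -> float:
    """Average number of process rows mapped to one vertical pixel."""
    return processes / height_px


total_states = RANKS * STATES_PER_RANK
print(total_states)                                                   # 320000
print(round(states_per_pixel(STATES_PER_RANK, SCREEN_WIDTH_PX), 1))   # 10.4
print(round(rows_per_pixel(SEQUOIA_CORES, SCREEN_HEIGHT_PX)))         # 1456
```

Even this 2-second run puts about ten states on every horizontal pixel when the whole execution is shown, and one row per core on a Sequoia-sized machine would need well over a thousand rows per vertical pixel.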
Trace visualization (space/time plots)
[Screenshots: the Gantt chart as rendered by Evince and by Acroread]
Space/Time views
Widespread, useful, intuitive, fast adoption
All trace events represented, causal order
Paje: http://paje.sf.net
Vite: http://vite.gforge.inria.fr
Vampir: http://vampir.eu
However...
Limited visualization scalability
ViTE – Visual Trace Explorer
Trust the OpenGL rendering
http://vite.gforge.inria.fr
Pajé
Slashed rectangles represent time-integrated states
http://paje.sourceforge.net
Pajé
Space dimension: one process per vertical pixel
http://paje.sourceforge.net
PajéNG – Pajé Next Generation (Early version)
Trust the rendering → without or with OpenGL
http://github.com/schnorr/pajeng/
Trace visualization
Data aggregation is always present
→ in different forms, with large traces
Three groups
  Forbidden Data Aggregation
  Implicit Data Aggregation
  Explicit Data Aggregation
Towards Exascale → Explicit Data Aggregation
So, why explicit data aggregation?
Technical need: too much data to fit on small screens
Semantic need: aggregated data → more meaningful
Objective towards Exascale
→ Visualization techniques for aggregated traces
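One possible form of explicit aggregation is temporal: for a user-chosen time slice, sum the time each container spends in each state. The sketch below illustrates that idea only; the function name, the tuple layout and the sample states are hypothetical and do not come from Pajé or PajéNG.

```python
from collections import defaultdict


def aggregate_time_slice(states, slice_start, slice_end):
    """Return {(container, value): seconds spent inside [slice_start, slice_end]}."""
    if slice_end <= slice_start:
        raise ValueError("empty time slice")
    totals = defaultdict(float)
    for container, value, start, end in states:
        # Keep only the part of the state that overlaps the time slice.
        overlap = min(end, slice_end) - max(start, slice_start)
        if overlap > 0:
            totals[(container, value)] += overlap
    return dict(totals)


# Hypothetical states: (container, value, start, end), times in seconds.
states = [
    ("rank0", "compute", 0.0, 3.0),
    ("rank0", "MPI_Recv()", 3.0, 4.0),
    ("rank1", "compute", 0.0, 1.0),
    ("rank1", "MPI_Send()", 1.0, 4.0),
]

summary = aggregate_time_slice(states, 2.0, 4.0)
for key, seconds in sorted(summary.items()):
    print(key, seconds)
```

Because the user picks the slice, the aggregation is controlled rather than a side effect of rendering: states outside the slice (here the first second of rank1) contribute nothing.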
PajéNG
PajéNG Visualization Tool
Pajé released around 1997
Motto: extensibility, interactivity and scalability
Independent of programming model
Semantic-free visualization tool
[Timeline with marks at 1996, 2000, 2004, 2008, 2011 and 2012: Pajé's Birth, Aftermath, ObjectWeb - OW2, SF.net, PajeNG]
PajeNG – Paje Next Generation, GPL3
http://github.com/schnorr/pajeng/
Space/Time Interactive Visualization Tool
Framework for new visualization techniques
→ Link with libpaje.so, then use Paje API to get data
Generic File Format
→ http://paje.sourceforge.net/download/publication/lang-paje.pdf
  Containers
  States, Events, Links, Variables
PajéNG – Associated Tools
Associated Tools (Unix-like philosophy)
Only pj_validate and pj_dump
  State, rank7, STATE, 0.193837, 0.193844, 7e-06, MPI_Recv()
  ...
  State, rank9, STATE, 4.19167, 4.19168, 1.2e-05, MPI_Send()
Data Liberation Front
  $ pj_dump input.paje > output.csv
Then use your preferred visualization tool (gnuplot, R, GNU Octave, ...)
Some tracers and converters: SimGrid, Akypuera, Score-P, otf22paje, otf2paje, tau2paje, Poti, (X)Kaapi, EZTrace, rastro2paje, libRastro, GTG, JRastro and counting...
  $ otf22paje ./scorep-20120827/traces.otf2 | pajeng
  $ otf22paje ./scorep-20120827/traces.otf2 | pj_dump > output.csv
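Once the trace is in CSV form, any scripting language can post-process it. The sketch below sums the time per container and state from the two sample lines shown above. The column meaning (record kind, container, type, start, end, duration, value) is an assumption inferred from those sample lines, not a documented specification, and time_per_state is a hypothetical helper name.

```python
import csv
import io
from collections import defaultdict

# The two pj_dump lines shown on the slide.
SAMPLE = """State, rank7, STATE, 0.193837, 0.193844, 7e-06, MPI_Recv()
State, rank9, STATE, 4.19167, 4.19168, 1.2e-05, MPI_Send()
"""


def time_per_state(lines):
    """Sum the duration column per (container, state value) pair."""
    totals = defaultdict(float)
    # skipinitialspace drops the blank that follows each comma.
    for row in csv.reader(lines, skipinitialspace=True):
        if len(row) < 7 or row[0] != "State":
            continue  # ignore anything that is not a state record
        container, duration, value = row[1], float(row[5]), row[6]
        totals[(container, value)] += duration
    return dict(totals)


totals = time_per_state(io.StringIO(SAMPLE))
for key, seconds in sorted(totals.items()):
    print(key, seconds)
```

For a real trace, the same function could be fed an open file handle on output.csv instead of the in-memory sample.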
(Alternative) Visualization Techniques
Visualization techniques
Squarified Treemap View
  Observe outliers, differences of behavior
  Hierarchical aggregation
  [Figure, panels B to D: Hierarchy: Site (10) - Cluster (10) - Machine (10) - Processor (100); panel E: Maximum Aggregation]
Hierarchical Graph View
  Correlate application behavior to network topology
  Pin-point resource contention
  [Figure labels: Grid, Sites, Clusters, Hosts]
Squarified Treemap View – KAAPI example
KAAPI (runs DAGs, work stealing for load balancing)
188 processes running on five clusters
[Figure: treemap with regions labeled Rennes, Toulouse, Porto Alegre, Bordeaux and Nancy, annotated with ~43 s, ~110 s, ~148 s, ~78 s, ~65 s, ~67 s and ~17 s]
Squarified Treemap View – an example
Synthetic trace with 100 thousand processes
Two states, four-level hierarchy
Visualization artifacts without spatial aggregation
[Figure, panels A to D: Hierarchy: Site (10) - Cluster (10) - Machine (10) - Processor (100); panel E: Maximum Aggregation]
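A minimal sketch of spatial aggregation over the four-level hierarchy of this example (10 sites x 10 clusters x 10 machines x 100 processors). The per-processor value of 1.0 is a made-up placeholder metric and the function names are hypothetical; the sketch only shows how the number of items to draw shrinks at each level, not how PajéNG computes or lays out its treemap.

```python
from collections import defaultdict
from itertools import product

LEVELS = ("site", "cluster", "machine", "processor")
FANOUT = (10, 10, 10, 100)  # children per node at each level, as on the slide


def synthetic_leaves():
    """Yield (path, value) per processor; the value 1.0 is a placeholder metric."""
    for path in product(*(range(n) for n in FANOUT)):
        yield path, 1.0


def aggregate(leaves, depth):
    """Sum leaf values per ancestor at the given depth (0 = whole platform)."""
    totals = defaultdict(float)
    for path, value in leaves:
        totals[path[:depth]] += value
    return dict(totals)


# Number of treemap nodes to draw at each aggregation level.
for depth in range(len(LEVELS), -1, -1):
    print(depth, len(aggregate(synthetic_leaves(), depth)))
```

Each step up the hierarchy replaces many small rectangles with one rectangle whose value is the sum of its children: 100,000 processors become 1,000 machines, 100 clusters, 10 sites, and finally a single node at maximum aggregation.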
Hierarchical Graph View - an example
Squares are hosts, diamonds are network links
Colors represent different applications or parts of them (task type, phase)
Two clusters interconnected by four network links
[Figure: graph view for one time slice]
Hierarchical Graph View - a larger example
French Grid5000 platform: 2170 nodes
[Figure: the platform shown at the Grid, Sites, Clusters and Hosts levels; time slices t0, t1, t2 and t3; regions labeled A, B and C]
Conclusion and Future Work
Exascale Performance Visualization Analysis
  Explicit Data aggregation
  → Controlled + visual feedback
  Alternative Visualization Techniques
PajéNG Visualization Tool
  Future-proof generic file format
Towards Exascale
  Incorporate trace aggregation
  Distribute trace handling (planned)
Future Work
  Aggregation operators that consider time uncertainty
  Revisit Space/Time representations
  Quantify loss of information when aggregating
Thank you for your attention
+ Questions?
Some references
  Visualizing More Performance Data Than What Fits on Your Screen. Lucas Mello Schnorr, Arnaud Legrand. The 6th International Parallel Tools Workshop. Springer. 2012. (Invited paper)
  A Hierarchical Aggregation Model to achieve Visualization Scalability in the analysis of Parallel Applications. Lucas Mello Schnorr, Guillaume Huard, Philippe Olivier Alexandre Navaux. Parallel Computing. Volume 38, Issue 3, March 2012, Pages 91-110.
INFRA-SONGS Project (WP-7)
http://infra-songs.gforge.inria.fr/
  Simulation of Next Generation Systems
  WP-7: Visualization and Analysis
More information
→ http://mescal.imag.fr/membres/lucas.schnorr/