Page 1: Profile Analysis with ParaProf, Sameer Shende, Performance Research Lab, University of Oregon

Profile Analysis with ParaProf

Sameer Shende
Performance Research Lab, University of Oregon



SC’13: Hands-on Practical Hybrid Parallel Application Performance Engineering

TAU Performance System® (http://tau.uoregon.edu)

• Parallel performance framework and toolkit
  – Supports all HPC platforms, compilers, runtime systems
  – Provides portable instrumentation, measurement, analysis


TAU Performance System®

• Instrumentation
  – Fortran, C++, C, UPC, Java, Python, Chapel
  – Automatic instrumentation

• Measurement and analysis support
  – MPI, OpenSHMEM, ARMCI, PGAS, DMAPP
  – pthreads, OpenMP, hybrid, other thread models
  – GPU, CUDA, OpenCL, OpenACC
  – Parallel profiling and tracing
  – Use of Score-P for native OTF2 and CUBEX generation
  – Efficient callpath profiles and trace generation using Score-P

• Analysis
  – Parallel profile analysis (ParaProf), data mining (PerfExplorer)
  – Performance database technology (PerfDMF, TAUdb)
  – 3D profile browser


TAU Analysis


ParaProf Profile Analysis Framework


Parallel Profile Visualization: ParaProf


Parallel Profile Visualization: ParaProf


ParaProf: 3D Communication Matrix


Hands-on: Profile report exploration

• The Live-DVD contains Score-P experiments of BT-MZ
  – class “B”, 4 processes with 4 OpenMP threads each
  – collected on a dedicated node of the SuperMUC HPC system at Leibniz Rechenzentrum (LRZ), Munich, Germany

• Start TAU's paraprof GUI with the default profile report


% cd
% cd workshop-vihps/supermuc_expts
% ls
periscope-1.5   scorep_bt-mz_B_4x4_sum
README          scorep_bt-mz_B_4x4_sum+mets
run.out         scorep_bt-mz_B_4x4_trace
scorep-20120913_1740_557443655223384

% paraprof scorep-20120913_1740_557443655223384/profile.cubex
OR
% paraprof scorep_bt-mz_B_4x4_trace/scout.cubex
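Finding the report files that paraprof can open in the experiment directories can be sketched in Python. This is a minimal sketch, not part of TAU or Score-P: the helper name `find_cube_reports` is hypothetical, and it assumes only that report files carry the `.cubex` extension, as in the commands above.

```python
from pathlib import Path


def find_cube_reports(root):
    """Return all .cubex report files below root, sorted by path.

    Hypothetical helper for illustration; it only scans the file
    system and does not call paraprof itself.
    """
    return sorted(Path(root).rglob("*.cubex"))


if __name__ == "__main__":
    # Directory name taken from the hands-on commands; adjust as needed.
    for report in find_cube_reports("workshop-vihps/supermuc_expts"):
        # Print a command line that could be pasted into the shell.
        print(f"paraprof {report}")
```

Each printed line corresponds to one of the alternatives above, for example the summary report (profile.cubex) or the trace analysis report (scout.cubex).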


ParaProf: Manager Window: scout.cubex

Metrics in the profile


ParaProf: Main window


ParaProf: Options

Unselect this to expand each routine in its own




Each color represents an event executing on one or more threads


ParaProf: Windows

Right click on a given node to choose other



ParaProf: Thread Statistics Table

Click to sort by a given metric, drag and move to rearrange columns


Example: Score-P with TAU (LU NPB)


ParaProf: Thread Callgraph Window

Click on options to choose a different color or to resize the box based on metrics


ParaProf: Callpath Thread Relations Window
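The relation between inclusive and exclusive values that a callpath view relies on can be sketched as follows. This is an illustration under stated assumptions, not ParaProf's implementation: callpath names are assumed to use " => " as the separator, as in TAU callpath profiles, and all routine names and timings are invented.

```python
SEP = " => "  # assumed callpath separator


def exclusive_times(inclusive):
    """Derive exclusive time per callpath from inclusive times.

    Exclusive time is the inclusive time minus the inclusive time of
    the direct children (callpaths that extend this one by one routine).
    """
    exclusive = {}
    for path, incl in inclusive.items():
        depth = path.count(SEP)
        children = [
            t for p, t in inclusive.items()
            if p.startswith(path + SEP) and p.count(SEP) == depth + 1
        ]
        exclusive[path] = incl - sum(children)
    return exclusive


# Made-up inclusive times in seconds, for illustration only.
inclusive = {
    "main": 10.0,
    "main => solve": 7.0,
    "main => solve => MPI_Waitall": 2.0,
    "main => init": 1.0,
}
excl = exclusive_times(inclusive)
for path, value in excl.items():
    print(f"{path:32s} {value:5.1f}")
```

In this toy data, `main` spends 2 s in its own code, because 8 of its 10 inclusive seconds belong to `solve` and `init`.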


ParaProf: Windows -> 3D Visualization -> Bar Plot


ParaProf: 3D Scatter Plot


ParaProf: Scatter Plot


ParaProf: 3D Topology View for a Routine


ParaProf: Topology View 3D Torus (IBM BG/P)


ParaProf: Topology View (6D Torus Coordinates BG/Q)


ParaProf: Node View


ParaProf: Add Thread to Comparison Window


ParaProf: Score-P Profile Files, Database


ParaProf: File -> Preferences


ParaProf: Group Changer Window


ParaProf: Options -> Derived Metric Panel


Sorting Derived Flops Metric by Exclusive Time
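A derived metric of this kind combines two measured metrics for each routine. The following is a minimal sketch, assuming a floating-point operation count (such as a PAPI_FP_OPS counter) and exclusive time in seconds per routine; the routine names and all values are invented, and the code does not reproduce ParaProf's own expression syntax.

```python
def derive_flops(fp_ops, time_s):
    """Return floating-point operations per second for each routine.

    Routines with zero or missing time are skipped to avoid a
    division by zero.
    """
    return {
        name: fp_ops[name] / time_s[name]
        for name in fp_ops
        if time_s.get(name, 0.0) > 0.0
    }


# Made-up exclusive values per routine, for illustration only.
fp_ops = {"x_solve": 4.0e9, "y_solve": 3.0e9, "rhs": 1.0e9}
time_s = {"x_solve": 2.0, "y_solve": 1.0, "rhs": 4.0}

flops = derive_flops(fp_ops, time_s)

# List routines by exclusive time (largest first) next to the derived metric.
by_time = sorted(flops, key=time_s.get, reverse=True)
for name in by_time:
    print(f"{name:10s} {time_s[name]:6.2f} s  {flops[name]:.3e} flop/s")
```

Sorting by exclusive time while displaying the derived rate puts the most expensive routines first, so a low rate in a top entry stands out.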


Support Acknowledgments

• U.S. Department of Energy (DOE)
  – Office of Science

• U.S. Department of Defense (DoD)
  – HPC Modernization Office (HPCMO)

• NSF Software Development for Cyberinfrastructure (SDCI)

• Juelich Supercomputing Center, NIC

• Argonne National Laboratory

• Technical University Dresden

• ParaTools, Inc.
