VIRTUAL INSTITUTE – HIGH PRODUCTIVITY SUPERCOMPUTING
Insightful Automatic Performance Modeling
Software
Alexandru Calotoiu^1, Torsten Hoefler^2, Martin Schulz^3, Sergei Shudler^1 and Felix Wolf^1
^1 TU Darmstadt, ^2 ETH Zürich, ^3 Lawrence Livermore National Laboratory
Installing Extra-P
! Download and install Qt 4
! Download and install Python 3+ and PyQt
! Download and install Cube
  ! http://www.scalasca.org/software/cube-4.x/download.html
! Download Extra-P
  ! http://www.scalasca.org/software/extra-p/download.html
! Unpack & install Extra-P
  ! ./configure --prefix=<extra-p-install-dir> --with-cube=<cube-dir> CPPFLAGS=-I<python.h path>; make; make install
INSIGHTFUL AUTOMATIC PERFORMANCE MODELING TUTORIAL 3
Automatic performance modeling with Extra-P
[Figure: Extra-P workflow, from source code to performance models]
! Input: Myapp.cpp
  main() { foo() bar() compute() }
! Instrumentation & Measurement: Score-P
! Performance measurements (profiles):
  Myapp.weak.p128.r1/profile.cubex
  Myapp.weak.p256.r1/profile.cubex
  Myapp.weak.p512.r1/profile.cubex
  Myapp.weak.p1024.r1/profile.cubex
  Myapp.weak.p2048.r1/profile.cubex
! Extra-P
! Output: Results.cubex, Results.xtrap
  Region 1: main Model: (3) + (3.14 * x^( 2 ))
  ...
Extra-P input in text form
! Useful to debug or when a small data set must be modeled
! Example provided in input.txt

POINTS 1000 2000 4000 8000 16000
EXPERIMENT ImportantData
DATA 1 1 1 1 1
DATA 4 4 4 3.99 4.01
DATA 16 15.999 16.01 16.01 15.99
DATA 64 64 64 64.01 63.99
DATA 256.01 255.99 256 256
! Measurement points (POINTS line): use at least 4, preferably 5; in general, the more the better.
! Experiment identifier (EXPERIMENT line): acts as both identifier and separator for experiments and their data.
! Data points (DATA lines): each row corresponds to one measurement point; all values in a row are treated as repeat measurements of the same experiment.
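To make the layout concrete, here is a minimal Python sketch that reads the POINTS / EXPERIMENT / DATA format and averages the repeat measurements per point. It is based only on the example in these slides; `parse_text_input` is a hypothetical helper, not Extra-P's own parser, and the real tool may accept keywords or variations not handled here.

```python
from statistics import mean


def parse_text_input(text):
    """Parse the POINTS / EXPERIMENT / DATA layout from the slides.

    Returns (points, experiments), where experiments maps each experiment
    name to a list of rows, one row of repeat measurements per point.
    """
    points = []
    experiments = {}
    current = None
    for raw_line in text.splitlines():
        tokens = raw_line.split()
        if not tokens:
            continue  # skip blank lines
        keyword, values = tokens[0], tokens[1:]
        if keyword == "POINTS":
            points = [float(v) for v in values]
        elif keyword == "EXPERIMENT":
            # A new EXPERIMENT line closes the previous experiment's data.
            current = " ".join(values)
            experiments[current] = []
        elif keyword == "DATA":
            if current is None:
                raise ValueError("DATA line before any EXPERIMENT line")
            experiments[current].append([float(v) for v in values])
        else:
            raise ValueError(f"unknown keyword: {keyword}")
    return points, experiments


# The example from the slides, one keyword per line.
sample = """\
POINTS 1000 2000 4000 8000 16000
EXPERIMENT ImportantData
DATA 1 1 1 1 1
DATA 4 4 4 3.99 4.01
DATA 16 15.999 16.01 16.01 15.99
DATA 64 64 64 64.01 63.99
DATA 256.01 255.99 256 256
"""

points, experiments = parse_text_input(sample)
for point, row in zip(points, experiments["ImportantData"]):
    # Print the point, the number of repetitions, and their mean.
    print(point, len(row), mean(row))
```

Rows do not need the same number of repetitions in this sketch; the last DATA row of the example has four values while the others have five.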
Extra-P output – Text form
Measurements and model data for each experiment and metric:
Callpath/Region: exp4 Metric: Test
Data:
( 1000, 1e+06) 95% CI [1.00001e+06, 999989]
( 2000, 4e+06) 95% CI [4.00003e+06, 3.99998e+06]
( 4000, 1.6e+07) 95% CI [1.6e+07, 1.6e+07]
( 8000, 6.4e+07) 95% CI [6.4e+07, 6.4e+07]
( 16000, 2.56e+08) 95% CI [2.56e+08, 2.56e+08]
Model: 0+1*(p^2)
RSS: 3.35017 Adjusted R^2: 1
! Metric: metric name; either Score-P metrics (time, bytes, etc.) or custom metrics
! Data: measurements for each input element (e.g., number of processes)
! Model: best-fit model
! RSS: residual sum of squares
! Adjusted R^2: explained previously
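As a reminder of what the two quality-of-fit numbers mean, the following Python sketch computes them from their standard textbook definitions. The measurement values are illustrative and are not the data behind the output above, and Extra-P's exact computation (for example, how repetitions are aggregated) may differ.

```python
def rss(xs, ys, model):
    """Residual sum of squares between measurements and model predictions."""
    return sum((y - model(x)) ** 2 for x, y in zip(xs, ys))


def adjusted_r2(xs, ys, model, num_predictors):
    """Adjusted R^2: R^2 penalized for the number of model terms."""
    n = len(ys)
    mean_y = sum(ys) / n
    total_ss = sum((y - mean_y) ** 2 for y in ys)
    r2 = 1.0 - rss(xs, ys, model) / total_ss
    return 1.0 - (1.0 - r2) * (n - 1) / (n - num_predictors - 1)


def model(p):
    # Same shape as the model line in the output: 0+1*(p^2)
    return 0 + 1 * p ** 2


xs = [1000, 2000, 4000, 8000, 16000]
# Illustrative measurements (assumed, not taken from the tutorial data).
ys = [1e6 + 1, 4e6 - 1, 1.6e7, 6.4e7, 2.56e8]

fit_rss = rss(xs, ys, model)
fit_adj_r2 = adjusted_r2(xs, ys, model, num_predictors=1)
print(fit_rss, fit_adj_r2)
```

A small RSS relative to the magnitude of the data, together with an adjusted R^2 close to 1, indicates that the model explains almost all of the variation in the measurements.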
Extra-P Cube input description
! Modeling tool expects Cube files in the following format:
  <DIR>/<PREFIX><X><POSTFIX>.r<{1,..,REPS}>/<FILENAME>
! DIR, PREFIX, X, POSTFIX, REPS and FILENAME must all be defined.
! X – value of varied parameter e.g. number of processes
! REPS – number of repeated experiments with same parameter value
! The tool makes a best-effort attempt to identify and populate these fields automatically, based on the contents of the selected directory.
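The naming scheme can be checked before loading an experiment. The Python sketch below builds the paths that the format implies and reports which ones are missing. `expected_cube_files` is a hypothetical helper and the `measurements` directory is an assumed location; the prefix, parameter values, and file name follow the Myapp.weak example from the workflow slide, with an empty POSTFIX.

```python
import os


def expected_cube_files(directory, prefix, xs, postfix, reps, filename):
    """Build <DIR>/<PREFIX><X><POSTFIX>.r<rep>/<FILENAME> for every X and rep."""
    paths = []
    for x in xs:
        for rep in range(1, reps + 1):
            # Repetitions are numbered from 1 to REPS.
            paths.append(f"{directory}/{prefix}{x}{postfix}.r{rep}/{filename}")
    return paths


paths = expected_cube_files(
    directory="measurements",      # assumed location of the profiles
    prefix="Myapp.weak.p",
    xs=[128, 256, 512, 1024, 2048],
    postfix="",
    reps=1,
    filename="profile.cubex",
)

# List the profiles that the naming scheme expects but that are not on disk.
missing = [path for path in paths if not os.path.exists(path)]
for path in missing:
    print("missing:", path)
```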
Extra-P user interface
[Screenshot of the Extra-P main window: call tree exploration, plot of the model, selected kernel(s)]
Extra-P call tree view
! Call tree exploration
! Model
! Quality of fit metrics: residual sum of squares and adjusted R^2
! Asymptotic view of model functions vs. their value at a given parameter value
! Impact of each kernel on the metric at the selected process count compared to the other kernels
! Metric selection
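One way to read the "impact" column is as each kernel's share of the metric at the selected process count. The Python sketch below illustrates that reading. The kernel names are borrowed from the Myapp.cpp example, the model functions are made up for illustration, and the share-of-total definition is an assumption; Extra-P may compute the comparison differently.

```python
import math

# Hypothetical per-kernel models of a metric as a function of process count p.
models = {
    "foo": lambda p: 10.0 + 0.5 * p,
    "bar": lambda p: 2.0 * math.log2(p),
    "compute": lambda p: 3.0 + 3.14 * p ** 2,
}


def relative_impact(models, p):
    """Return each kernel's share of the summed metric at process count p."""
    values = {name: f(p) for name, f in models.items()}
    total = sum(values.values())
    return {name: value / total for name, value in values.items()}


shares = relative_impact(models, p=1024)
# Print kernels from largest to smallest share.
for name, share in sorted(shares.items(), key=lambda item: -item[1]):
    print(f"{name}: {share:.4%}")
```

Evaluating the shares at several process counts shows how a kernel that is negligible at small scale can come to dominate at large scale.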
Extra-P model view
! Measurement values
! X-axis scale control for predicting behavior at other process counts
! Models selected in the call tree view
Extra-P model generation tool
Search space definition
Note: When a new experiment is loaded, the values set here are used for the initial modeling.
Number of terms: the recommended value is 1, as it captures the dominant behavior.
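The idea of a search space with a single term can be sketched in Python as follows. The sketch assumes candidate models of the form c0 + c1 * p^i * log2(p)^j, fits each candidate by ordinary least squares, and keeps the one with the smallest RSS. The candidate form, the exponent lists, and the selection rule are assumptions made for illustration; Extra-P's actual search space and model selection may differ.

```python
import math


def fit_single_term(xs, ys, i, j):
    """Least-squares fit of y = c0 + c1 * p^i * log2(p)^j; returns (c0, c1, rss)."""
    ts = [x ** i * math.log2(x) ** j for x in xs]
    n = len(xs)
    mean_t = sum(ts) / n
    mean_y = sum(ys) / n
    var_t = sum((t - mean_t) ** 2 for t in ts)
    cov_ty = sum((t - mean_t) * (y - mean_y) for t, y in zip(ts, ys))
    c1 = cov_ty / var_t
    c0 = mean_y - c1 * mean_t
    rss = sum((y - (c0 + c1 * t)) ** 2 for t, y in zip(ts, ys))
    return c0, c1, rss


def best_single_term_model(xs, ys, exponents, log_exponents):
    """Try every (i, j) pair in the search space and keep the lowest RSS."""
    best = None
    for i in exponents:
        for j in log_exponents:
            if i == 0 and j == 0:
                continue  # p^0 * log2(p)^0 is constant, not a scaling term
            c0, c1, rss = fit_single_term(xs, ys, i, j)
            if best is None or rss < best["rss"]:
                best = {"i": i, "j": j, "c0": c0, "c1": c1, "rss": rss}
    return best


xs = [1000, 2000, 4000, 8000, 16000]
ys = [float(x ** 2) for x in xs]  # illustrative data that grows as p^2

best = best_single_term_model(
    xs, ys,
    exponents=[0, 0.5, 1, 1.5, 2, 2.5, 3],  # assumed search space
    log_exponents=[0, 1, 2],
)
print(best)
```

With one term, the fitted model exposes the dominant scaling behavior directly; adding more terms enlarges the search space and makes overfitting to measurement noise more likely.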
Feedback
! What additional features would you like to see?
! What additional capabilities would you like to see?