Automatic Detection of Performance Anomalies in Task-Parallel Programs
Work in progress on the Aftermath trace analysis tool
Andi Drebes
Université Pierre et Marie Curie, Laboratoire d’Informatique de Paris VI
Joint work with: Antoniu Pop, Karine Heydemann, Albert Cohen, Nathalie Drach
RACING’14, May 30th, 2014
OpenStream
Context
Hardware and software environment
I Multi-core / many-core systems
I How to exploit the hardware efficiently?
I Task-parallel languages based on fine-grained tasks
Performance debugging
I Requires analysis of complex interactions at execution time: Application / Run-time / Machine
I Possible solution: Record dynamic events to a trace file
I Post-mortem (offline) analysis
Aftermath
I Trace visualization and support for manual analysis
I Originally developed for OpenStream language & run-time
I Work in progress: Automate repetitive tasks
Andi Drebes – Automatic Detection of Performance Anomalies in Task-Parallel Programs 1 / 16
Outline
1. Aftermath
2. Insufficient parallelism and its causes
3. Performance anomalies during task execution
4. Work in progress: Status
5. Summary & Questions
Aftermath
[Screenshot: Aftermath main window with the timeline, filters, detailed text view, statistics, and the menu bar for derived metrics]
[Screenshot: timeline showing activity during execution; horizontal axis: time, vertical axis: processors]
[Screenshot: timeline highlighting task execution (dark blue), with individual task instances labeled]
[Screenshot: timeline highlighting task creation (white)]
[Screenshot: timeline highlighting time spent searching for work (light blue)]
[Screenshot: basic statistics for run-time states]
[Screenshot: heatmap indicating task duration (white: fast, red: slow)]
[Screenshot: NUMA heatmap indicating locality of memory accesses (blue: local, pink: remote)]
Navigating through a trace
I Huge amounts of high-dimensional data
I Lots of features → lots of possibilities for analysis
I Where to look? What to look for?
I Expertise & lots of time required
Guide the user through performance analysis
I Refine analysis in several steps
I Start by analyzing parallelism
I Analyze what happens inside tasks
I Automate repetitive tasks
Detecting insufficient parallelism
[Figure: timelines for processors P0 to Pn-1 over time; legend: task execution, other states]
Ideal situation: all CPUs are in the task execution state without any interruption
[Figure: the same timelines, with task execution interrupted by other states]
Realistic scenario: task creation, memory allocation, idle time, over-synchronization, ...
Detect insufficient parallelism / high overhead automatically
Threshold-based analysis of parallelism
[Figure: timelines for processors P0 to Pn-1 over an interval of duration d; legend: task execution, other states]
$d$: duration of the interval
$d_{e,i}$: time that processor $i$ spends in the task execution state
$t_e$: threshold for task execution, e.g. $t_e = 0.95$
Consider that there is sufficient parallelism if the following inequality holds:
$$\sum_{i=0}^{n-1} d_{e,i} > t_e \cdot n \cdot d$$
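The threshold test above can be sketched in a few lines of Python. This is only an illustration, not Aftermath's implementation: the function name `sufficient_parallelism` and the sample numbers are assumptions, and the per-processor execution times are taken to have been extracted from the trace already.

```python
def sufficient_parallelism(exec_times, d, t_e=0.95):
    """Return True if the summed task execution time exceeds t_e * n * d.

    exec_times[i] is the time processor i spent in the task execution
    state during an interval of duration d (hypothetical input format).
    """
    n = len(exec_times)
    return sum(exec_times) > t_e * n * d


# Four processors observed over an interval of 100 time units.
print(sufficient_parallelism([99, 98, 97, 96], d=100))
print(sufficient_parallelism([99, 98, 60, 96], d=100))
```

In the second call a single processor that spends 40% of the interval outside task execution is enough to bring the sum below the threshold.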
Detecting the cause of insufficient parallelism
Multiple stages during analysis
I If the inequality does not hold, find out why
I Possible causes: task creation overhead, memory allocation, not enough tasks available for execution, ...
I Use thresholds for associated states: $t_c$ (task creation), $t_i$ (idle time)
[Figure: execution divided into intervals 1 to 10]
Interval selection
I Multiple intervals: initialization, termination, etc.
I Repeat analysis for different intervals
Per-interval analysis of parallelism & overhead
[Flowchart]
1. Choose interval (stop when no interval is left)
2. Analyze task execution time; above threshold: sufficient parallelism, choose the next interval; below threshold: continue with the analyses below, each of which branches on above / below its own threshold
3. Analyze idle time; above: not enough parallelism exposed
4. Analyze task synchronization; above: inefficient synchronization
5. Analyze task creation time; above: high task creation overhead
6. ...
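A minimal sketch of the per-interval decision chain, under stated assumptions: the state names, the threshold values other than 0.95, and the order in which the overhead states are checked are all hypothetical, and the pairing of each state with a diagnosis is one reading of the flowchart rather than Aftermath's actual logic.

```python
def diagnose_interval(state_times, n, d, thresholds):
    """Classify one interval from the time spent in each run-time state.

    state_times maps a state name to the time summed over all n processors;
    thresholds maps a state name to a fraction of the total time n * d.
    """
    total = n * d
    if state_times.get("task_execution", 0) > thresholds["task_execution"] * total:
        return "sufficient parallelism"
    # Overhead states are checked in a fixed, assumed order.
    causes = {
        "idle": "not enough parallelism exposed",
        "synchronization": "inefficient synchronization",
        "task_creation": "high task creation overhead",
    }
    for state, cause in causes.items():
        if state_times.get(state, 0) > thresholds[state] * total:
            return cause
    return "undetermined"


thresholds = {"task_execution": 0.95, "idle": 0.1,
              "synchronization": 0.1, "task_creation": 0.1}
intervals = [
    {"task_execution": 390, "idle": 5},
    {"task_execution": 300, "idle": 80, "task_creation": 10},
    {"task_execution": 300, "idle": 20, "task_creation": 70},
]
# Repeat the analysis for every interval (4 processors, 100 time units each).
for states in intervals:
    print(diagnose_interval(states, n=4, d=100, thresholds=thresholds))
```

Returning only the first matching cause keeps the sketch short; a real tool would probably report every state that exceeds its threshold.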
Detecting performance anomalies during task execution
During task execution
I Performance anomaly possible even at 100% task execution (ineffective use of caches, remote memory accesses, branch misprediction)
[Figure: two histograms of the number of tasks by duration]
Impact on the distribution of task duration
I Slowdown of all tasks
I Different groups / peaks
Using performance counters
Hardware performance counters
I Implemented in hardware, no slowdown of the application
I Low tracing overhead if sampled at beginning / end of a task
I Dozens of hardware events can be monitored
Automatic analysis of performance counters
I Which hardware events are relevant?
I Manual testing tedious & time consuming
Analyzing performance counters
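[Figure: value $v(c, i, t)$ of a counter on processor $P_i$ over time; a task $T$ starts at time $s$ and ends at time $e$, and $N_{c,T}$ is the increase of the counter in between]
Per-CPU performance counter
I Absolute value $v(c, i, t)$, monotonically increasing
I $c$: counter (e.g. cache misses)
I $i$: processor identifier
I $t$: timestamp
I Sampled at the beginning and end of a task
Break down counter evolution to task instances
I Increase of $c$ by task $T$, executing on processor $i$ from time $s$ to time $e$: $N_{c,T} = v(c, i, e) - v(c, i, s)$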
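As a sketch of this breakdown, assume the samples of one counter on one processor are available as a mapping from timestamp to absolute value, and each task instance is a `(name, s, e)` tuple. Both representations and the numbers are hypothetical and chosen only for illustration.

```python
def counter_increase(samples, s, e):
    """Return v(e) - v(s) for one counter on one processor.

    samples maps a timestamp to the absolute counter value sampled at that
    time; a sample must exist at both the start s and the end e of the task.
    """
    return samples[e] - samples[s]


# Absolute values of one counter (e.g. cache misses) on one processor.
samples = {0: 1000, 40: 1250, 45: 1260, 90: 1900}
tasks = [("T0", 0, 40), ("T1", 45, 90)]

per_task = {name: counter_increase(samples, s, e) for name, s, e in tasks}
print(per_task)
```

Events counted between the two tasks (here between timestamps 40 and 45) are not attributed to any task instance.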
Linear regression
[Scatter plot: task duration $d_{T_j}$ against performance indicator value $N_{c,T_j}$]
Perform linear regression
I Assume linear model: $d_{T_j} = \alpha \cdot N_{c,T_j} + \beta$ ($\alpha$ and $\beta$ constant)
I Compare coefficient of determination with threshold
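The regression step can be illustrated with an ordinary least-squares fit in plain Python. The slides do not state which fitting method or threshold is used, so the closed-form fit, the helper names and the threshold of 0.8 are assumptions.

```python
def linear_fit(xs, ys):
    """Least-squares fit ys ~ alpha * xs + beta; returns (alpha, beta, r2)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        # No variation in the counter or in the duration: nothing to correlate.
        return 0.0, mean_y, 0.0
    alpha = sxy / sxx
    beta = mean_y - alpha * mean_x
    r2 = (sxy * sxy) / (sxx * syy)  # coefficient of determination
    return alpha, beta, r2


def is_relevant(counter_increases, durations, r2_threshold=0.8):
    """Treat the counter as relevant if R^2 exceeds the (assumed) threshold."""
    return linear_fit(counter_increases, durations)[2] > r2_threshold


# Per-task counter increases and the corresponding task durations.
print(linear_fit([1, 2, 3, 4], [3, 5, 7, 9]))
print(is_relevant([1, 2, 3, 4], [5, 1, 1, 5]))
```

For a model with a single explanatory variable, the coefficient of determination equals the squared Pearson correlation, which is what the last line of `linear_fit` computes.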
Shortcuts & refinement
Variation of task duration
I Determine coefficient of variation for task duration
I Only perform analysis if significant
Task types
I Different task types in an application
I Auxiliary tasks: initialization, termination
I Work tasks: matrix multiplication, decomposition, etc.
I Performance anomaly not necessarily present in all types
Topology of the machine
I Anomaly only present on subset of processors
I Example: Memory accesses local for one NUMA node, remote on another
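The first shortcut, skipping the analysis when task durations barely vary, might look as follows. The cut-off of 0.1 and the function names are assumptions; the slides only say that the variation must be significant.

```python
from statistics import mean, pstdev


def coefficient_of_variation(durations):
    """Population standard deviation divided by the mean (0.0 if mean is 0)."""
    m = mean(durations)
    return pstdev(durations) / m if m else 0.0


def worth_analyzing(durations, cv_threshold=0.1):
    """Only correlate counters with duration if durations vary enough."""
    return coefficient_of_variation(durations) > cv_threshold


print(worth_analyzing([100, 100, 100, 100]))
print(worth_analyzing([100, 200, 100, 200]))
```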
Per-interval analysis of parallelism & overhead
[Flowchart]
1. Choose task type
2. Choose set of processors (until no set is left)
3. Check variation of task duration; low: skip this selection; high: continue
4. Choose set of performance counters (until no set is left)
5. Break down values to task instances
6. Correlate per-task instance values and duration; low: event set irrelevant; high: event set relevant
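The first two selection steps amount to grouping task instances before any statistics are computed. A possible sketch, assuming each instance is a `(task_type, cpu, duration)` tuple and that a processor set is a NUMA node given by a hypothetical `node_of_cpu` mapping:

```python
from collections import defaultdict


def group_durations(instances, node_of_cpu):
    """Group task durations by (task type, NUMA node).

    instances is an iterable of (task_type, cpu, duration) tuples and
    node_of_cpu maps a processor identifier to its NUMA node.
    """
    groups = defaultdict(list)
    for task_type, cpu, duration in instances:
        groups[(task_type, node_of_cpu[cpu])].append(duration)
    return dict(groups)


instances = [("work", 0, 100), ("work", 1, 110), ("work", 2, 300), ("init", 0, 10)]
node_of_cpu = {0: 0, 1: 0, 2: 1}
print(group_durations(instances, node_of_cpu))
```

Each resulting group can then be checked for variation and correlated with counters separately, so that an anomaly confined to one task type or one NUMA node is not averaged away.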
Example: K-means branch misprediction
[Histogram: number of tasks (0 to 2500) by task duration in cycles (0 to 2e+07)]
[Scatter plot: task duration in cycles (0 to 2.0×10^7) against branch mispredictions (0 to 100000); annotated regions: low misprediction rate, high misprediction rate]
Work in progress: Status
Analysis of parallelism
I Per-interval analysis of time spent on task execution
I Per-interval analysis of time spent in run-time states
I Support for thresholds
I Loop performing analysis on set of intervals
Correlation of performance indicators
I Support for performance counters
I Task duration histogram
I Analysis of the variation of task durations
I Breaking down performance counter values to task instances
I Linear regression
Summary
Aftermath
I Tool for trace-based analysis of task-parallel programs
I Currently provides only support for manual analysis
I Available at http://openstream.info/aftermath
Automatic analysis of parallelism based on thresholds
I Amount of time spent on task execution sufficiently high?
I If not, perform subsequent threshold-based analysis for states associated with overhead of the run-time system
Automatic correlation of performance indicators
I Indicate which events are relevant
I Break down counter evolution to task instances
I Correlate with task duration