Electrical and Computer Engineering
Fun Size Your Data: Using Statistical Techniques to
Efficiently Compress and Exploit Benchmarking Results
David J. Lilja
Electrical and Computer Engineering
University of Minnesota
The Problem
We can generate heaps of data
But it’s noisy
Too much to understand or use efficiently
[Figure: benchmark programs producing heaps of raw numeric data]
A Solution
Statistical design of experiments techniques
• Compress complex benchmark results
• Exploit the results in interesting ways
• Extract new insights
Demonstrate using
• Microarchitecture-aware floorplanning
• Benchmark classification
Why Do We Need Statistics?
Draw meaningful conclusions in the presence of noisy measurements
• Noise filtering
Aggregate data into meaningful information
• Data compression
[Figure: heaps of raw data reduced to a single summary value x]
Design of Experiments for Data Compression
[Figure: heaps of raw benchmark measurements]
A B C
V1 √ √
V2 √ √ √
V3 √ √
V4 √ √
Effects of each input A, B, C
Effects of interactions AB, AC, BC, ABC
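As a minimal sketch of how such effects can be extracted from a table of measurements, assume each input is restricted to two levels (coded -1 and +1) rather than the four values shown, and that every combination is measured once. The function name `factorial_effects` and the synthetic response are illustrative assumptions, not taken from the talk.

```python
from itertools import combinations, product


def factorial_effects(responses, names=("A", "B", "C")):
    """Estimate main and interaction effects from a 2-level full factorial.

    `responses` maps each tuple of -1/+1 factor levels to a measured value.
    """
    runs = list(product((-1, 1), repeat=len(names)))
    effects = {}
    for order in range(1, len(names) + 1):
        for idx in combinations(range(len(names)), order):
            label = "".join(names[i] for i in idx)
            total = 0.0
            for run in runs:
                # Contrast sign is the product of the chosen factor levels.
                sign = 1
                for i in idx:
                    sign *= run[i]
                total += sign * responses[run]
            # Effect = mean response at +1 minus mean response at -1.
            effects[label] = total / (len(runs) / 2)
    return effects


# Hypothetical response: depends on A, B, and the AB interaction, not on C.
responses = {
    r: 10 + 3 * r[0] + 1 * r[1] + 0.5 * r[0] * r[1]
    for r in product((-1, 1), repeat=3)
}
effects = factorial_effects(responses)
```

With this synthetic response, the A, B, and AB effects come out as twice their coefficients, and every effect involving C is zero. That is the sense in which eight measurements are compressed into a handful of weights.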
Types of Designs of Experiments
Full factorial design with replication
• O(v^m) experiments = O(4^3)
Fractional factorial designs
• O(2^m) experiments = O(2^3)
Multifactorial design (P&B)
• O(m) experiments = O(3)
• Main effects only, no interactions
m-factor resolution x designs
• k × O(2^m) experiments = k × O(2^3)
• Selected interactions
A B C
V1 √ √
V2 √ √ √
V3 √ √
V4 √ √
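To make the cost differences concrete, the sketch below counts runs for the example of m = 3 factors with v = 4 values each. The helper names are assumptions for illustration. The Plackett and Burman count uses the standard rule that a design with N runs (N a multiple of 4) handles up to N - 1 factors.

```python
def full_factorial_runs(v, m):
    """Every combination of v values for each of m factors (one replicate)."""
    return v ** m


def two_level_runs(m):
    """Every combination when each factor is limited to two levels."""
    return 2 ** m


def plackett_burman_runs(m):
    """Smallest multiple of 4 that is strictly greater than m."""
    return 4 * (m // 4 + 1)


counts = {
    "full factorial (v=4, m=3)": full_factorial_runs(4, 3),
    "two-level (m=3)": two_level_runs(3),
    "Plackett and Burman (m=3)": plackett_burman_runs(3),
}
```

The gap widens quickly with more factors: the two-level count doubles with each added factor, while the Plackett and Burman count grows by at most four.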
Example: Architecture-Aware Floor-Planner
V. Nookala, S. Sapatnekar, D. Lilja, DAC’05.
Motivation
Imbalance between device and wire delays
Global wire delays > system clock cycle in nanometer technology
[Figure: a global wire spanning the chip layout]
Solution
Wire-pipelining
• If delay > a clock cycle → insert flip-flops along a wire
Several methods for optimal FF insertion on a wire
• Li et al. [DATE 02]
• Cocchini et al. [ICCAD 02]
• Hassoun et al. [ICCAD 02]
[Figure: a wire across the layout with a flip-flop (FF) inserted]
But what about the performance impact of the pipeline delays?
Impact on Performance
Execution time = num-instr * cycles/instr (CPI) * cycle-time
Wire-pipelining
Impact on Performance
Key idea
• Some buses are critical
• Some can be freely pipelined without (much) penalty
Execution time = num-instr * cycles/instr (CPI) * cycle-time
Wire-pipelining
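A small worked example of the execution-time equation may help show the trade-off: pipelining wires can shorten the cycle time but adds latency that raises CPI. All numbers below are hypothetical and chosen only to illustrate the arithmetic.

```python
def execution_time(num_instr, cpi, cycle_time_ns):
    """Execution time = num-instr * cycles/instr (CPI) * cycle-time."""
    return num_instr * cpi * cycle_time_ns


# Hypothetical baseline: slow clock, no extra pipeline stages on wires.
base = execution_time(1_000_000, cpi=1.0, cycle_time_ns=1.0)

# Hypothetical wire-pipelined design: clock twice as fast, CPI 10% worse.
piped = execution_time(1_000_000, cpi=1.1, cycle_time_ns=0.5)

speedup = base / piped
```

In this made-up case the faster clock outweighs the CPI penalty. If the pipelined buses were critical ones, the CPI increase could be large enough to erase the gain.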
Change Objective Function
Traditional physical design objectives
• Minimize area, total wire length, etc.
New objective
• Optimize only throughput-critical wires to maximize overall performance
Execution time = num-instr * cycles/instr (CPI) * cycle-time
Wire-pipelining
Conventional Microarchitecture Interaction with Floor Planner
[Figure: block diagram with Simulation Methodology and Physical Design blocks; labels: µ-arch, Benchmarks, CPI info, Frequency]
Microarchitecture-aware Physical Design
Incorporate wire-pipelining models into the simulator
• Extra pipeline stages in processor
• Simulator needs to adjust operation latencies
[Figure: block diagram with Simulation Methodology and Physical Design blocks; labels: µ-arch, Benchmarks, CPI info, Frequency, Layout]
But There are Problems
Simulation is too slow
• 2000-3000 host instructions per simulated instruction
• Numerous benchmark programs to consider
Exponential search space
• Thousands of combinations tried in physical design step
[Figure: block diagram with Simulation Methodology and Physical Design blocks; labels: µ-arch, Benchmarks, CPI info, Frequency, Layout]
Design of Experiments Methodology
[Figure: flow diagram with blocks Design of Experiments based Simulation Methodology, Floorplanning, and Validation; labels: µ-arch, benchmarks, MinneSPEC reduced input sets, bus and interaction weights, Layout, Frequency]
Number of simulations is linear in the number of buses (if no interactions)
Related Floorplanning Work
Simulated Annealing (SA)
• CPI look up table [Liao et al, DAC 04]
• Bus access ratios from simulation profiles; minimize the weighted sum of bus latencies [Ekpanyapong et al, DAC 04]
• Throughput sensitivity models for a selected few critical paths; limited sampling for a large solution space [Jagannathan et al, ASPDAC 05]
Our approach
• Design of experiments to identify criticality of each bus
Microarchitecture and factors
22 buses → 19 factors in experimental design
• Some factors model multiple buses
[Figure: processor block diagram with blocks Fetch, Decode, RUU, REG, BPRED, IL1, DL1, L2, ITLB, LSQ, DTLB, IADD1, IADD2, IADD3, IMULT, FMULT, FADD]
2-level Resolution III Design
2 levels for each factor
• Lowest and highest possible values (range)
Latency range of buses
• Min = 0
• Max = chip corner-to-corner wire latency
19 factors → 32 simulations (next power of 2 above 19)
Captured by a design matrix (32x19)
• 32 rows - 32 simulations
• 19 columns - Factor values
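One way to build a 32x19 two-level matrix of this kind is sketched below. It is an assumption for illustration, not necessarily the matrix used in the study: five base factors take all 32 combinations of -1/+1, and the remaining 14 columns are products of base columns. Because some factor columns are two-way products of others, main effects are aliased with two-factor interactions, which is what resolution III permits. The function name `two_level_design` is hypothetical.

```python
from itertools import combinations, product


def two_level_design(num_factors=19, num_base=5):
    """Build a 2**num_base-run, two-level design for num_factors factors."""
    # Generator words: single base factors first, then 2-way, 3-way products.
    words = [
        w
        for order in range(1, num_base + 1)
        for w in combinations(range(num_base), order)
    ]
    if num_factors > len(words):
        raise ValueError("too many factors for this number of runs")
    words = words[:num_factors]

    rows = []
    for base in product((-1, 1), repeat=num_base):
        row = []
        for w in words:
            # Column level is the product of the base levels named in the word.
            level = 1
            for i in w:
                level *= base[i]
            row.append(level)
        rows.append(row)
    return rows


design = two_level_design()  # 32 rows (simulations) x 19 columns (factors)
```

Each row says which buses run at minimum latency (-1) and which at maximum latency (+1) for one simulation. The columns are balanced and mutually orthogonal, so each factor's main effect can be estimated from the same 32 runs.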
Experimental setup
Nine SPEC 2000 benchmarks
• MinneSPEC reduced input sets
SimpleScalar simulator
Floorplanner: PARQUET
• Simulated annealing based
Objective function
• Minimize the weighted sum of bus latencies
• Secondarily minimize aspect ratio and area
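The primary objective can be sketched as a cost function that a simulated-annealing floorplanner would evaluate for each candidate layout. This is a simplified illustration under stated assumptions: wire delay is taken to be linear in length, latency is the number of flip-flops needed so each segment fits in one cycle, and the bus names, weights, and delay constants are hypothetical. It does not reproduce PARQUET's interface.

```python
import math


def extra_cycles(wire_delay_ps, clock_ps):
    """Flip-flops needed so that every wire segment fits in one clock cycle."""
    return max(0, math.ceil(wire_delay_ps / clock_ps) - 1)


def floorplan_cost(bus_lengths_mm, bus_weights, delay_ps_per_mm, clock_ps):
    """Weighted sum of bus latencies for one candidate floorplan."""
    cost = 0.0
    for bus, length in bus_lengths_mm.items():
        # Assumed linear delay model: delay grows in proportion to length.
        latency = extra_cycles(length * delay_ps_per_mm, clock_ps)
        cost += bus_weights[bus] * latency
    return cost


# Hypothetical layout: two buses with weights from the experimental design.
lengths = {"fetch_il1": 3, "ruu_dl1": 10}
weights = {"fetch_il1": 2.0, "ruu_dl1": 0.5}
cost = floorplan_cost(lengths, weights, delay_ps_per_mm=100, clock_ps=250)
```

A heavily weighted bus contributes more to the cost per cycle of latency, so the annealer is pushed to keep the blocks it connects close together.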
Comparisons
Case Description
SFP Our “statistical floorplanner”
acc Access ratios from [Ekpanyapong et al, DAC 04]
minWL Traditional floorplanning
Averaged Over All Benchmarks
Compared to acc
• 3-7 percentage point improvement
• Better improvements over acc at higher frequencies
SFP-comb ≈ SFP (within about 1-3 percentage points)
Summary
Use statistical design of experiments
• Compress benchmark data into critical bus weights
Used by microarchitecture-aware floorplanner
• Optimizes insertion of pipeline delays on wires to maximize performance
Extend methodology for other critical objectives
• Power consumption
• Heat distribution