Post on 22-Mar-2016
Program Phase Directed Dynamic Cache Way Reconfiguration
Subhasis Banerjee, Surendra G, S.K. Nandy
Presented by: Xin Guan, Mar. 29, 2010
OUTLINE
Introduction to Program Phase
Hardware Phase Detector
Cache Reconfiguration
Experiment Results
What is Program Phase?
Informally, a phase is a period of execution whose characteristics are qualitatively different from those of the neighboring periods.
How do we detect phases? Phase boundary
Instruction Stream (e.g., certain code sections)
Data Stream (e.g., data access pattern)
Asynchronous external events (e.g., incoming message)
Program Phase
Figure: Mpeg2decode phase profile in terms of IPC (Instructions Per Clock), ROB (Reorder Buffer) occupancy, and issue rate.
Program Phase Conflict Miss
Insufficient number of cache ways
Program locality: generates conflict miss patterns
In some cases, increasing cache associativity yields little performance improvement, so the cache can be reconfigured with fewer ways to save power.
Hardware Phase Detector Counter Array
Figure: a counter array alongside the cache sets; example entry: 01101000 23
The tag is used to identify the cache block during a conflict miss.
The counter records the number of conflict misses.
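The counter array can be sketched in software as follows. This is only an illustration: the slide does not say how the tag is used to classify a miss, so the sketch assumes that a miss on the tag most recently evicted from the same set counts as a conflict miss. The names `CounterEntry`, `ConflictMissCounters`, `on_miss`, and `snapshot_and_reset` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class CounterEntry:
    """One counter-array entry, paired with one cache set."""
    tag: Optional[int] = None  # tag of the block most recently evicted from the set (assumed policy)
    count: int = 0             # conflict misses observed in the current interval


class ConflictMissCounters:
    def __init__(self, num_sets: int) -> None:
        self.entries: List[CounterEntry] = [CounterEntry() for _ in range(num_sets)]

    def on_miss(self, set_index: int, missed_tag: int, evicted_tag: Optional[int]) -> None:
        entry = self.entries[set_index]
        # Assumption: missing on the block that was just evicted is a conflict miss.
        if entry.tag is not None and entry.tag == missed_tag:
            entry.count += 1
        # Remember the block pushed out by this miss so its quick return can be detected.
        if evicted_tag is not None:
            entry.tag = evicted_tag

    def snapshot_and_reset(self) -> List[int]:
        """Return the per-set counts and clear them for the next interval."""
        counts = [e.count for e in self.entries]
        for e in self.entries:
            e.count = 0
        return counts
```

Two blocks that keep evicting each other from the same set would raise that set's counter on every miss after the first, which is the ping-pong pattern that extra ways would remove.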
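Hardware Phase Detector Interval Vector
Figure: the conflict-miss counts of the cache sets (#1, #2, #3, #4, #5, #6, #7, ……) form the interval vector, which is then normalized. Example counts shown: 100, 150, 23, 140.
Counting starts at the beginning of every interval.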
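A minimal sketch of the normalization step is shown below. The slides do not state which normalization is used, so dividing each count by the total number of conflict misses in the interval is an assumption, and `normalize_interval_vector` is a hypothetical name.

```python
from typing import List, Sequence


def normalize_interval_vector(counts: Sequence[int]) -> List[float]:
    """Scale per-set conflict-miss counts so that they sum to 1 (assumed scheme)."""
    total = sum(counts)
    if total == 0:
        # An interval without conflict misses maps to the all-zero vector.
        return [0.0] * len(counts)
    return [c / total for c in counts]
```

Normalizing this way makes intervals comparable by the distribution of conflict misses across sets rather than by their absolute number.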
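Hardware Phase Detector Clustering
Figure: an interval vector plotted on x and y axes, with distances d1, d2, d3 to the existing clusters.
If the minimum distance (d3 in the figure) is below the threshold, the vector belongs to the same cluster.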
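The threshold test can be sketched as a nearest-centroid search. The distance metric is not given on the slide, so Manhattan distance is an assumption here, as are the function names and the choice to open a new cluster when no centroid is close enough. The default threshold of 1.1 is the value quoted elsewhere in the deck.

```python
from typing import List, Sequence


def manhattan(a: Sequence[float], b: Sequence[float]) -> float:
    """Manhattan (L1) distance; the metric itself is an assumption."""
    return sum(abs(x - y) for x, y in zip(a, b))


def assign_phase(vector: Sequence[float],
                 centroids: List[List[float]],
                 threshold: float = 1.1) -> int:
    """Return the index of the cluster for `vector`, creating one if needed."""
    best_index, best_dist = -1, float("inf")
    for i, centroid in enumerate(centroids):
        d = manhattan(vector, centroid)
        if d < best_dist:
            best_index, best_dist = i, d
    # Minimum distance below the threshold: same cluster.
    if best_index >= 0 and best_dist < threshold:
        return best_index
    # Otherwise the vector starts a new cluster (assumed behavior).
    centroids.append(list(vector))
    return len(centroids) - 1
```

With this rule a lower threshold produces more clusters and a higher one merges more intervals into the same phase.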
Hardware Phase Detector Phase History Table
Every cluster corresponds to a phase.
Phase ID   Phase vector                      Way Configuration
#1         Geometric centroid of cluster 1   2 way set associative
#2         Geometric centroid of cluster 2   4 way set associative
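As a rough software model, the table can be held as a list of entries, one per phase. The field names, the 1-based phase IDs, and the running-mean centroid update are assumptions made for illustration; the slides only state that each entry holds a cluster centroid and a way configuration.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class PhaseEntry:
    phase_id: int
    centroid: List[float]  # geometric centroid of the cluster
    ways: int              # way configuration recorded for this phase
    members: int = 1       # interval vectors folded into the centroid so far


class PhaseHistoryTable:
    def __init__(self) -> None:
        self.entries: List[PhaseEntry] = []

    def add_phase(self, vector: Sequence[float], ways: int) -> int:
        """Register a new phase and return its ID (IDs start at 1, as on the slide)."""
        phase_id = len(self.entries) + 1
        self.entries.append(PhaseEntry(phase_id, list(vector), ways))
        return phase_id

    def update_centroid(self, phase_id: int, vector: Sequence[float]) -> None:
        """Fold one more interval vector into the centroid as a running mean."""
        entry = self.entries[phase_id - 1]
        entry.members += 1
        entry.centroid = [c + (v - c) / entry.members
                          for c, v in zip(entry.centroid, vector)]

    def ways_for(self, phase_id: int) -> int:
        return self.entries[phase_id - 1].ways
```

When a phase recurs, `ways_for` returns the configuration stored for it, so the cache can be set up without searching again.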
Hardware Phase Detector # of phases VS threshold
According to the experiments, with a threshold of 1.1 most benchmarks exhibit 8 phases in total.
Phase Directed Reconfiguration Architecture
Every 2 million instructions, the interval vector is calculated and the phase is identified.
The phase is fed into the cache controller, which decides the way configuration.
The way select signal enables or disables the pre-charge and sense amps.
Phase Directed Reconfiguration Algorithm
If the miss rate is too high, enable one more way.
If the miss rate is low enough, shut down one way to save power.
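The two rules above can be sketched as a single decision function. The slides give no numeric thresholds or way limits, so `high_threshold`, `low_threshold`, `min_ways`, and `max_ways` are placeholder values, and `next_way_count` is a hypothetical name.

```python
def next_way_count(miss_rate: float,
                   active_ways: int,
                   high_threshold: float = 0.05,  # placeholder, not from the slides
                   low_threshold: float = 0.01,   # placeholder, not from the slides
                   min_ways: int = 1,
                   max_ways: int = 4) -> int:
    """Return the number of ways to keep enabled for the next interval."""
    # Miss rate too high: enable one more way, if one is available.
    if miss_rate > high_threshold and active_ways < max_ways:
        return active_ways + 1
    # Miss rate low enough: shut down one way to save power.
    if miss_rate < low_threshold and active_ways > min_ways:
        return active_ways - 1
    return active_ways
```

Keeping a gap between the two thresholds avoids toggling a way on and off when the miss rate hovers around a single cutoff.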
Disabled Cache Ways Coherency
Valid cache blocks should remain accessible for future references.
Data residing in a disabled cache way should be coherent when that way is enabled again.
3 approaches:
1. Flush the disabled way.
2. Fill buffer.
3. Victim buffer.
Disabled Cache Ways Fill Buffer
A fill buffer can move the data (blocks x, y, z in the figure) from a disabled way to an enabled way, at the cost of several penalty cycles.
Disabled Cache Ways Victim Buffer
Instead of moving data (blocks x, y, z in the figure) from the disabled way to an enabled way, the data can be stored in a victim buffer. This approach is adopted in this implementation.
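A simplified software model of the victim-buffer approach is sketched below. The buffer capacity, the FIFO replacement, and the idea of handing displaced blocks back to the caller for write-back are all assumptions; `VictimBuffer` and `disable_way` are hypothetical names, not part of the presented design.

```python
from collections import OrderedDict
from typing import Dict, List, Optional, Tuple

BlockKey = Tuple[int, int]  # (set index, tag)


class VictimBuffer:
    def __init__(self, capacity: int = 8) -> None:  # capacity is a placeholder
        self.capacity = capacity
        self.blocks: "OrderedDict[BlockKey, bytes]" = OrderedDict()

    def insert(self, set_index: int, tag: int, data: bytes) -> Optional[Tuple[BlockKey, bytes]]:
        """Store a block; return the oldest block if one had to be displaced."""
        key = (set_index, tag)
        displaced = None
        if key not in self.blocks and len(self.blocks) >= self.capacity:
            displaced = self.blocks.popitem(last=False)  # FIFO replacement (assumed)
        self.blocks[key] = data
        return displaced

    def lookup(self, set_index: int, tag: int) -> Optional[bytes]:
        """Return and remove a block on a hit, or None on a miss."""
        return self.blocks.pop((set_index, tag), None)


def disable_way(way: Dict[int, Tuple[int, bytes]],
                buffer: VictimBuffer) -> List[Tuple[BlockKey, bytes]]:
    """Move every valid block of a way (set index -> (tag, data)) into the buffer.

    Returns the blocks displaced from the buffer, which the caller would write back.
    """
    written_back = []
    for set_index, (tag, data) in way.items():
        displaced = buffer.insert(set_index, tag, data)
        if displaced is not None:
            written_back.append(displaced)
    way.clear()  # the way now holds no valid blocks
    return written_back
```

A later access that misses in the enabled ways would call `lookup` first, so blocks from the disabled way stay reachable without being copied into another way.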
Experiment Results
Average saving of 32% of L1 data cache power with almost negligible loss of performance.
Conclusion
Hardware Program Phase Detector
Dynamic Cache Reconfiguration
Average saving of 32% of L1 data cache power with almost negligible loss of performance
Questions?