Page 1: Mosaic: Exploiting the Spatial Locality of Process Variation to …iacoma.cs.uiuc.edu/iacoma-papers/PRES/present_hpca14_1.pdf · 2014-02-19 · Mosaic Refresh Hardware Agrawal, Ansari

Mosaic: Exploiting the Spatial Locality of Process Variation to Reduce Refresh Energy in On-Chip

eDRAM Modules Aditya Agrawal, Amin Ansari and Josep Torrellas

http://iacoma.cs.uiuc.edu

MOTIVATION

• eDRAM

• Periodic Refresh Requirement

• Refresh Reduction Techniques

Agrawal, Ansari and Torrellas, HPCA 2014 2

eDRAM

• A 1T1C dynamic memory technology.

• The bit is stored as charge on the capacitor.

• Area and leakage energy savings.

• Increasing adoption in commercial processors: IBM POWER 7, POWER 8, Intel Haswell.

• Constraint: The charge on the capacitor has to be refreshed periodically.

Periodic Refresh Requirement

• Blocks normal accesses.

• Temperature dependent: the required refresh rate doubles for every 10 °C increase.

• Susceptible to device variations.

• Refresh rate in DRAM ~ once every 64 msec (at 85 °C).

• Refresh rate in eDRAM ~ once every 100 μsec (at 95 °C).

• Impacts energy and performance.
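
To make the temperature rule of thumb concrete, the sketch below scales a refresh interval, reading the slide's "2x every 10 °C" as "the interval halves for every 10 °C rise". The function name and the use of the eDRAM data point (100 μsec at 95 °C) as the reference are illustrative choices, and extrapolating far from that point is an assumption rather than something the slides claim.

```python
def refresh_interval_us(temp_c, ref_interval_us=100.0, ref_temp_c=95.0):
    """Refresh interval at temp_c, assuming it halves for every +10 C.

    The defaults are the eDRAM data point from the slide (100 usec at 95 C).
    Using them as a reference for other temperatures is an assumption.
    """
    return ref_interval_us * 2.0 ** ((ref_temp_c - temp_c) / 10.0)


if __name__ == "__main__":
    for t in (75.0, 85.0, 95.0, 105.0):
        print(f"{t:5.1f} C -> refresh every {refresh_interval_us(t):7.1f} usec")
```

Under this assumption a part that needs a refresh every 100 μsec at 95 °C could wait 200 μsec at 85 °C, but only 50 μsec at 105 °C.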

Refresh Reduction Techniques

• Access Patterns to Memory

– Smart Refresh (MICRO 2007): DRAM

– Refrint (HPCA 2013): eDRAM

• Variation in Retention Times

– RAPID (HPCA 2006): DRAM

– Hi-ECC (ISCA 2010): eDRAM

– RAIDR (ISCA 2012): DRAM

– Mosaic (HPCA 2014): eDRAM

Contribution

• Expose the on-chip spatial locality in retention times.

– A mathematical model accessible to architects.

• Exploit the spatial locality for refresh reduction.

– A hardware-only solution.

– Low area overhead (2%).

– Significant refresh reduction (20x).

BACKGROUND

• eDRAM Cell Retention Time

• Retention Time Distribution

• Bulk Distribution, Tail Distribution

• Main Idea

eDRAM Cell Retention Time

Tret = A * 10^(Vt/B) sec

Using published data from IBM at 65 nm, Tret ~ 25 msec.

However, in practice eDRAMs are refreshed every ~ 50-100 μsec.
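
The retention model can be evaluated directly. The sketch below uses the form log10(Tret) = Vt/B + log10(A). The slides give no values for A, B or Vt, so the numbers here (a nominal Vt of 0.65 V and B = 0.1 V per decade) are hypothetical placeholders, and A is simply calibrated so that the nominal cell lands on the ~25 msec figure quoted above.

```python
def retention_time_s(vt, a, b):
    """Tret = A * 10^(Vt / B), in seconds."""
    return a * 10.0 ** (vt / b)


def calibrate_a(tret_ref_s, vt_ref, b):
    """Choose A so that a cell with threshold vt_ref retains data for tret_ref_s."""
    return tret_ref_s / 10.0 ** (vt_ref / b)


# Hypothetical values for illustration only; they are not taken from the slides.
B_VOLTS = 0.1        # assumed: 100 mV of Vt per decade of retention time
VT_NOMINAL = 0.65    # assumed nominal threshold voltage, in volts
A = calibrate_a(25e-3, VT_NOMINAL, B_VOLTS)   # anchored to the ~25 msec figure

if __name__ == "__main__":
    for dvt in (-0.10, -0.05, 0.0, 0.05):
        vt = VT_NOMINAL + dvt
        print(f"Vt = {vt:.2f} V -> Tret = {retention_time_s(vt, A, B_VOLTS) * 1e3:.3f} msec")
```

With these assumed constants, a cell whose Vt is 100 mV below nominal retains its data for a tenth as long, which illustrates why Vt variation forces a conservative refresh rate.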

[Figure: 1T1C eDRAM cell, showing the storage capacitor, the access transistor, the leakage current Ioff, the bit lines and the word lines.]

Retention Time Distribution

[Figure: retention time distribution. Source: Kong et al., ITC, Oct. 2008.]

Bulk Distribution

• Area under the curve from (-4 σ, ∞).

– 99.9968% of the cells.

• Follows a log-normal distribution.

• Caused by process variation in Vt of the access transistor.

– Includes systematic and random components.

We also know,

– Vt variation has a normal distribution.

– log10(Tret) = Vt/B + log10(A)

Therefore,

– Normal distribution in Vt => log-normal distribution in Tret.
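
The step from a normal Vt to a log-normal Tret can be checked numerically. Because log10(Tret) is a linear function of Vt, a normal Vt with mean m and standard deviation s should give a normal log10(Tret) with mean m/B + log10(A) and standard deviation s/B. The sketch below samples Vt and compares the two. All parameter values are hypothetical.

```python
import math
import random
import statistics


def log10_tret(vt, a, b):
    """log10(Tret) = Vt / B + log10(A)."""
    return vt / b + math.log10(a)


def sample_log10_tret(n, vt_mean, vt_sigma, a, b, seed=0):
    """Draw n normally distributed Vt values and map each to log10(Tret)."""
    rng = random.Random(seed)
    return [log10_tret(rng.gauss(vt_mean, vt_sigma), a, b) for _ in range(n)]


if __name__ == "__main__":
    # Hypothetical parameters; the slides do not give A, B or the Vt statistics.
    a, b = 1e-8, 0.1
    vt_mean, vt_sigma = 0.65, 0.03
    logs = sample_log10_tret(100_000, vt_mean, vt_sigma, a, b)
    print("mean :", statistics.fmean(logs), "expected", vt_mean / b + math.log10(a))
    print("sigma:", statistics.pstdev(logs), "expected", vt_sigma / b)
```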

Tail Distribution

• Area under the curve from (-∞, -4 σ).

– 0.0031% of the cells (31 ppm).

• Follows a log-normal distribution.

• Caused by random manufacturing defects.

• Only a small fraction (3 ppm) is considered defective.
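
The bulk and tail fractions quoted on these two slides follow from cutting a normal distribution at -4 σ. The short check below computes the area below that cut with the standard normal CDF. It assumes the cut is applied to a standard normal variable, which is what the quoted percentages suggest.

```python
import math


def normal_cdf(z):
    """Standard normal CDF, written with the complementary error function."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


TAIL_FRACTION = normal_cdf(-4.0)       # area under the curve in (-inf, -4 sigma)
BULK_FRACTION = 1.0 - TAIL_FRACTION    # area under the curve in (-4 sigma, inf)

if __name__ == "__main__":
    print(f"tail: {TAIL_FRACTION * 1e6:.1f} ppm")
    print(f"bulk: {BULK_FRACTION * 100:.4f} %")
```

The tail comes out at about 31.7 ppm and the bulk at about 99.9968%, which agrees with the slide figures to within rounding.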

Main Idea

• Tret is a function of Vt.

• Vt variation has spatial locality (systematic component).

Therefore,

• Tret will have spatial locality.

• Exploiting this spatial locality can reduce refresh energy at low area and energy overheads.

EXPLOITING SPATIAL LOCALITY

• Spatial Map of Retention Times

• Opportunity & Tradeoffs

Step 1

• Obtain a spatial map of Vt using VARIUS.

• Includes the systematic and random components of Vt variation.
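
As a rough illustration of what such a map contains, the sketch below builds a small Vt grid from a spatially smooth systematic component plus an independent per-cell random component. It is not VARIUS and does not use its correlation model: the smooth field is obtained by box-blurring white noise, and every name and parameter value is a stand-in chosen for illustration.

```python
import random


def smooth(grid, radius):
    """Box-blur a 2D grid (window clipped at the edges) to add spatial correlation."""
    n_rows, n_cols = len(grid), len(grid[0])
    out = [[0.0] * n_cols for _ in range(n_rows)]
    for r in range(n_rows):
        for c in range(n_cols):
            vals = [grid[rr][cc]
                    for rr in range(max(0, r - radius), min(n_rows, r + radius + 1))
                    for cc in range(max(0, c - radius), min(n_cols, c + radius + 1))]
            out[r][c] = sum(vals) / len(vals)
    return out


def vt_map(n_rows, n_cols, vt_mean, sigma_sys, sigma_rand, radius, seed=0):
    """Vt = mean + systematic (spatially smooth) + random (per-cell) component."""
    rng = random.Random(seed)
    noise = [[rng.gauss(0.0, 1.0) for _ in range(n_cols)] for _ in range(n_rows)]
    systematic = smooth(noise, radius)
    # Rescale the smooth field to zero mean and standard deviation sigma_sys.
    flat = [v for row in systematic for v in row]
    mean = sum(flat) / len(flat)
    std = (sum((v - mean) ** 2 for v in flat) / len(flat)) ** 0.5
    if std == 0.0:
        raise ValueError("radius is too large for this grid")
    return [[vt_mean + sigma_sys * (systematic[r][c] - mean) / std
             + rng.gauss(0.0, sigma_rand)
             for c in range(n_cols)] for r in range(n_rows)]


if __name__ == "__main__":
    grid = vt_map(8, 8, vt_mean=0.65, sigma_sys=0.03, sigma_rand=0.01, radius=2)
    for row in grid:
        print(" ".join(f"{v:.3f}" for v in row))
```

Neighbouring cells share most of their blur window, so their systematic components are close. That shared component is the spatial locality the rest of the talk relies on.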

Step 2

• Cell-by-cell translation from Vt values to Tret for the bulk distribution.

• The spatial map remains the same; only the scale changes from linear to log10.

Step 3

• From IBM data: 20 ppm of the cells follow the tail distribution.

• Superimposing the tail distribution on the bulk distribution gives the total per-cell Tret distribution.
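
One way to picture this superposition is as a mixture: each cell keeps its bulk value except with a probability of 20 ppm, in which case its retention time is redrawn from the tail distribution. The sketch below works on log10(Tret) values. Interpreting "superimposing" as replacement, choosing the affected cells independently at random, and the tail mean and sigma in the example are all assumptions.

```python
import random


def overlay_tail(bulk_log10_tret, tail_mu, tail_sigma, tail_fraction=20e-6, seed=0):
    """Return (values, n_tail): a copy of the per-cell log10(Tret) list in which
    each cell is replaced, with probability tail_fraction, by a tail draw."""
    rng = random.Random(seed)
    out = list(bulk_log10_tret)
    n_tail = 0
    for i in range(len(out)):
        if rng.random() < tail_fraction:
            out[i] = rng.gauss(tail_mu, tail_sigma)   # hypothetical tail parameters
            n_tail += 1
    return out, n_tail


if __name__ == "__main__":
    bulk = [-1.6] * 1_000_000            # placeholder bulk values, log10(seconds)
    cells, n_tail = overlay_tail(bulk, tail_mu=-3.5, tail_sigma=0.3)
    print(f"{n_tail} of {len(cells)} cells drawn from the tail")
```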

Step 4

• Memory is accessed at a line granularity.

• We obtain a per-line Tret distribution by taking the minimum Tret over the cells in each line.
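
The per-line reduction is a minimum over fixed-size groups of cells. A minimal sketch, assuming the cells of a line are stored contiguously in a flat list (the layout and the function name are illustrative):

```python
def line_retention(cell_tret, cells_per_line):
    """Per-line Tret: the minimum Tret over the cells that make up each line."""
    if cells_per_line <= 0 or len(cell_tret) % cells_per_line != 0:
        raise ValueError("cell count must be a positive multiple of cells_per_line")
    return [min(cell_tret[i:i + cells_per_line])
            for i in range(0, len(cell_tret), cells_per_line)]


if __name__ == "__main__":
    cells = [5.0, 3.0, 9.0, 7.0, 8.0, 2.0]     # arbitrary units
    print(line_retention(cells, cells_per_line=3))
```

A line is only as good as its weakest cell, so a single tail cell is enough to pull the whole line down to a short retention time.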

Opportunity

• Lower bound on the number of refreshes

– Profile, track and refresh each line at its own rate.

– Huge area and energy overheads.

• A better solution (Mosaic): Exploit spatial locality of Tret

– Logically group co-located lines into tiles.

– Profile each tile and save the information (in an SRAM).

– Track (using counters) and refresh each tile at its own rate.

– Potentially with small area and energy overheads.
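
The effect of tiling on refresh count can be estimated with a few lines of arithmetic: every line in a tile is refreshed at the rate demanded by the tile's weakest line. The sketch below assumes co-located lines are adjacent in a flat list and ignores counter quantization. The function name and the example retention times are illustrative.

```python
def refreshes_per_second(line_tret_s, lines_per_tile):
    """Line refreshes per second when each tile is refreshed once per
    retention period of its weakest line."""
    total = 0.0
    for i in range(0, len(line_tret_s), lines_per_tile):
        tile = line_tret_s[i:i + lines_per_tile]
        total += len(tile) / min(tile)
    return total


if __name__ == "__main__":
    lines = [100e-6, 400e-6, 400e-6, 800e-6]   # hypothetical per-line Tret, seconds
    for size in (1, 2, 4):
        print(f"tile size {size}: {refreshes_per_second(lines, size):.0f} refreshes/s")
```

In this example a tile size of 1 (per-line tracking) needs 16250 refreshes per second, a tile size of 2 needs 25000, and one chip-wide tile needs 40000. Larger tiles need less tracking state but inherit the weakest line of a larger neighbourhood.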

Mosaic of Tiles

[Figure: two panels, Mosaic with Tile Size = 16 and Mosaic with Tile Size = 64.]

Tradeoffs

Three-way tradeoff: refresh energy savings vs. counter size vs. tile size.

• Small tiles => high refresh savings, high area overheads.

• Small counters => low refresh savings, low area overheads.

Next,

• A simple HW solution to track and refresh each tile.

• Best combination of tile size and counter size (Mosaic).

• Compare Mosaic against baseline and lower bound.
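
The counter-size side of the tradeoff can be sketched as a quantization step: a tile's retention time is rounded down to a whole number of 50 μsec steps and then clamped to the largest value the counter can hold. The clamp at 2^bits - 1 and the function names are assumptions made for illustration. The slides do not spell out the encoding.

```python
def counter_reload_value(tile_tret_us, step_us=50.0, counter_bits=6):
    """Largest whole number of steps that fits in the tile's retention time,
    clamped to the range of a counter_bits-wide down counter.

    Assumes every tile retains data for at least one step, as the 50 usec
    baseline implies; shorter retention times are clamped up to one step.
    """
    max_count = (1 << counter_bits) - 1
    steps = int(tile_tret_us // step_us)
    return max(1, min(steps, max_count))


def effective_period_us(tile_tret_us, step_us=50.0, counter_bits=6):
    """Refresh period actually achieved after quantization and clamping."""
    return counter_reload_value(tile_tret_us, step_us, counter_bits) * step_us


if __name__ == "__main__":
    for bits in (2, 4, 6):
        print(f"{bits}-bit counter: a 10 msec tile is refreshed every "
              f"{effective_period_us(10_000.0, counter_bits=bits):.0f} usec")
```

Under these assumptions a 2-bit counter caps the refresh period at 150 μsec regardless of how good the tile is, while a 6-bit counter allows up to 3150 μsec. This is the sense in which small counters give low refresh savings.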

ARCHITECTURE

• Mosaic Hardware

• Mosaic Operation

Mosaic Refresh Hardware

Augment the cache controller

• SRAM with a profile of tile retention times.

• Logic to track and trigger per tile refresh.

[Figure: controller for a cache bank. Labels: Chip's Reference Clock, Programmable Clock Divider, step, Sequencer, LUT, Adder, Multiplier, Retention Profile SRAM, per-tile down counters (Mosaic HW), and a Cache Bank divided into Tile 1, Tile 2, ..., Tile n.]

Mosaic Operation

At every step (50 μsec):

    for (all tiles in the cache) {
        Decrement counter
        if (counter == 0) {
            Schedule tile refresh
            Read retention profile SRAM
            Reload counter
        }
    }
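
The loop above can be modelled in software to see which tiles are refreshed at which step. This is a behavioural sketch only, not the hardware design: the function name is illustrative, the profile SRAM is represented by a plain list of reload values in units of steps, and the counters are assumed to start fully loaded.

```python
def simulate_mosaic(reload_values, n_steps):
    """Software model of the per-step loop, with one down counter per tile.

    reload_values[i] stands in for tile i's entry in the retention profile
    SRAM, in steps. Returns the scheduled refreshes as (step, tile) pairs.
    """
    if any(v < 1 for v in reload_values):
        raise ValueError("every reload value must be at least 1 step")
    counters = list(reload_values)            # assumption: start fully loaded
    refreshes = []
    for step in range(1, n_steps + 1):
        for tile in range(len(counters)):
            counters[tile] -= 1                          # decrement counter
            if counters[tile] == 0:
                refreshes.append((step, tile))           # schedule tile refresh
                counters[tile] = reload_values[tile]     # read profile, reload
    return refreshes


if __name__ == "__main__":
    for step, tile in simulate_mosaic([1, 2, 4], n_steps=4):
        print(f"step {step}: refresh tile {tile}")
```

With reload values of 1, 2 and 4 steps, four steps produce 7 tile refreshes, against the 12 that refreshing every tile at every step would need.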

EVALUATION SETUP

• Architectural Parameters

• Tools & Applications

• Design Comparison

Evaluation Setup

Architectural parameters

Chip          CMP with 16 2-issue cores
IL1/DL1       32 KB, private
L2            256 KB, private
L3 (eDRAM)    16 MB, 16 banks, shared
L3 bank       1 MB
Network       4 x 4 torus
Coherence     MESI directory at L3

Evaluation Setup

Tools & Applications

Architectural Simulator    SESC
Timing & Power             McPAT & CACTI
Synthesis                  Design Compiler
Statistics                 R
Variation                  VARIUS
Applications               SPLASH-2, PARSEC

Design Comparison

• Baseline:

– All lines refreshed at 50 μsec.

• RAIDR:

– Applied to eDRAMs.

– Lines refreshed at 50, 100 or 200 μsec.

• Mosaic:

– Tile size of 32 lines, 6 bit counter per tile.

– L3 area overhead of 2%.

• Ideal (lower bound):

– Tile size of 1.
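
The RAIDR-style configuration can be sketched as a binning rule: each line is refreshed at the longest of the three periods (50, 100 or 200 μsec) that does not exceed its retention time. Only the binning arithmetic is modelled here, not how RAIDR stores the bins, and the function names are illustrative.

```python
import bisect

RAIDR_BINS_US = (50.0, 100.0, 200.0)   # refresh periods listed on the slide


def raidr_period_us(line_tret_us, bins=RAIDR_BINS_US):
    """Longest bin period that does not exceed the line's retention time."""
    i = bisect.bisect_right(bins, line_tret_us)
    if i == 0:
        raise ValueError("retention time is shorter than the fastest bin")
    return bins[i - 1]


def refresh_reduction(line_tret_us, baseline_us=50.0):
    """Baseline refresh count divided by the binned refresh count
    over the same time interval."""
    baseline = len(line_tret_us) / baseline_us
    binned = sum(1.0 / raidr_period_us(t) for t in line_tret_us)
    return baseline / binned


if __name__ == "__main__":
    lines = [60.0, 150.0, 900.0, 5000.0]    # hypothetical per-line Tret, usec
    print([raidr_period_us(t) for t in lines])
    print(f"reduction vs. baseline: {refresh_reduction(lines):.2f}x")
```

With these three bins the reduction over the 50 μsec baseline cannot exceed 200/50 = 4x, however long the lines actually retain their data.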

EVALUATION

• Refresh Count

• Execution Time

• L3 Energy

Refresh Count

• RAIDR reduces the number of L3 refreshes by 4x.

• Mosaic reduces the number of L3 refreshes by 20x.

• Mosaic is within 2.5x of the lower bound (ideal).

Execution Time

• Performance improves because of reduced cache blocking.

• Mosaic reduces execution time by 9%.

• Ideal reduces execution time by 10%.

L3 Energy

• L3 energy reduction comes from savings in refresh energy and leakage energy.

• Mosaic saves 43% of L3 energy.

Conclusion

• Exposed the on-chip spatial locality of retention times.

– A mathematical model accessible to architects.

• Exploited the spatial locality for refresh reduction.

– A hardware-only solution.

– Low L3 area overhead (2%).

– Significant refresh reduction (20x).

– Saves 43% energy in L3.
