Date post: 25-Dec-2015

UNDERSTANDING THE ROLE OF THE POWER DELIVERY NETWORK IN 3-D STACKED MEMORY DEVICES

Manjunath Shevgoor, Niladrish Chatterjee, Rajeev Balasubramonian, Al Davis
University of Utah

Aniruddha N. Udipi
ARM R&D

Jung-Sik Kim
DRAM Design Team, Samsung Electronics

Background

[Figure: a 1.5V supply and GND feeding a circuit element through wire A-B with wire resistance; the voltage along wire A-B is labeled 1.5V and 1.2V]

• Only part of the supply voltage reaches the circuit elements
• This loss of voltage over the Power Delivery Network (PDN) is called IR Drop
• 3D stacking increases current density
• Current to the top layer in a 9-high stack needs to go through 8 TSV layers
• IR Drop violations can lead to correctness issues
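The cumulative effect of stacking can be sketched with Ohm's law. The snippet below is a minimal illustration, not the authors' model: it treats the TSV path as a chain of equal resistors, and the supply voltage, per-layer TSV resistance, and per-layer current are hypothetical values chosen only to show the trend.

```python
def layer_voltages(vdd, tsv_resistance, layer_currents):
    """Return the supply voltage seen at each stacked layer.

    layer_currents[0] belongs to the layer nearest the supply; each layer
    is reached through one more TSV segment than the layer below it.
    """
    voltages = []
    v = vdd
    for k in range(len(layer_currents)):
        # The current in segment k feeds layer k and every layer above it.
        segment_current = sum(layer_currents[k:])
        v -= segment_current * tsv_resistance  # V = I * R lost in this segment
        voltages.append(v)
    return voltages


# Hypothetical values: 1.5 V supply, 10 mOhm per TSV layer, 1 A per layer.
for layer, v in enumerate(layer_voltages(1.5, 0.010, [1.0] * 8), start=1):
    print(f"after TSV layer {layer}: {v:.3f} V")
```

Because every segment carries the current of all layers above it, the voltage falls monotonically with height, and the top layer sees the largest drop.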

[Figure: 3D stack of dies labeled DIE 1, DIE 2, DIE 3]

Addressing IR Drop

• Reduce resistance
  • Make wires wider
  • Add more VDD/VSS bumps
  • Increases cost

• Reduce current
  • Control activity on chip
  • Decreases performance

This paper tries to reduce the performance impact without increasing cost

[Figure: Relationship between pin count and package cost. Source: Dong et al., Fabrication Cost Analysis and Cost-Aware Design Space Exploration for 3-D ICs]

A big shift going forward

Current-limiting constraints already exist: DDR3 uses tFAW and tRRD

Recent work on PCM (Hay et al.) uses Power Tokens to limit PCM current draw

These solutions use Temporal Constraints, which are not optimal for addressing IR Drop in 3D DRAM

The quality of power delivery also depends on the location on the die

We propose Spatial Constraints to leverage this disparity
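For contrast with the spatial approach, here is a minimal sketch of how a purely temporal limit such as tRRD and tFAW can be checked. It relies on the standard meaning of the two parameters (a minimum spacing between ACTs, and at most four ACTs in any tFAW window). The class name and the cycle values in the usage note are hypothetical.

```python
from collections import deque


class ActivateWindow:
    """Temporal ACT limiter: tRRD spacing plus at most four ACTs per tFAW."""

    def __init__(self, t_rrd, t_faw):
        self.t_rrd = t_rrd
        self.t_faw = t_faw
        self.recent = deque(maxlen=4)  # issue times of the last four ACTs

    def can_activate(self, now):
        if self.recent and now - self.recent[-1] < self.t_rrd:
            return False  # too close to the previous ACT
        if len(self.recent) == 4 and now - self.recent[0] < self.t_faw:
            return False  # a fifth ACT would fall inside the tFAW window
        return True

    def activate(self, now):
        if not self.can_activate(now):
            raise ValueError("ACT violates tRRD or tFAW")
        self.recent.append(now)
```

With hypothetical values `t_rrd=4` and `t_faw=20`, four ACTs at cycles 0, 4, 8, and 12 are accepted, and a fifth is held back until cycle 20. Note that the check never asks where on the die the activity occurs, which is the gap that spatial constraints target.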


DRAM Layout – Spatial Dependence

• We use an HMC-like architecture for our evaluation
• IR Drop worsens as distance from TSVs increases

[Figure: VDD on M1 on Layer 9; axes: X Coordinate, Y Coordinate, VDD]

IR Drop Profile

• Figures illustrate IR Drop when all banks in the 3D stack are executing ACT

• IR Drop worsens as the distance from the source increases

[Figures: IR Drop maps for Layer 2 through Layer 9]

Iso-IR Drop Regions

• IR Drop worsens as distance from TSVs increases

• IR Drop also worsens as height above the C4 bumps increases

• We define activity constraints on a region-by-region basis


DRAM Currents

| Symbol | Value (mA) | Description | Consumed By |
|---|---|---|---|
| IDD0 | 66 | One bank Activate to Precharge | Local Sense Amps, Row Decoders, and I/O Sense Amps |
| IDD4R | 235 | Burst Read Current | Peripherals, Local Sense Amps, I/O Sense Amps, Column Decoders |
| IDD4W | 171 | Burst Write Current | Peripherals, I/O Sense Amps, Column Decoders |

Source: Micron Data Sheet for 4Gb x16 part

• Read consumes the highest current of any DRAM command
• To keep design complexity down, we define all other currents in terms of Reads

Region based Read constraints

Different regions have very different IR Drop characteristics

To avoid being constrained by the worst region, we determine the maximum number of Reads supported by each region

IR Drop in any region is not determined solely by the activity in that region

Memory controller complexity increases with the number of regions

Region based Read constraints

To keep the memory controller simple, four kinds of constraints are created for Reads:

• Single-region constraints: assume all Reads happen in only one region
• Two-region constraints: assume all Reads happen in only two adjacent regions
• Four-region constraints: assume all Reads happen in either the top four or the bottom four dies
• Die-stack-wide constraint: Reads can be happening anywhere in the die stack
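One way a controller might apply the four kinds of constraints is to find the smallest constraint set that covers all regions with Reads in flight and enforce that set's limit on the total. The sketch below assumes this interpretation. The bottom-region names, the adjacency of `A_BOT` and `B_BOT`, and every numeric limit are placeholders, not values from the paper.

```python
TOP = ("A_TOP", "B_TOP", "C_TOP", "D_TOP")
BOTTOM = ("A_BOT", "B_BOT", "C_BOT", "D_BOT")  # hypothetical names

# (regions covered, max Reads in flight). Every limit is a placeholder.
CONSTRAINTS = [
    (frozenset({"A_BOT"}), 20),            # single region
    (frozenset({"A_BOT", "B_BOT"}), 18),   # two adjacent regions
    (frozenset(BOTTOM), 16),               # four regions: bottom dies
    (frozenset(TOP), 10),                  # four regions: top dies
    (frozenset(TOP + BOTTOM), 8),          # die-stack wide
]


def read_limit(active_regions, constraints=CONSTRAINTS):
    """Return the limit of the smallest constraint set covering the activity."""
    covering = [(len(regions), limit)
                for regions, limit in constraints
                if active_regions <= regions]
    return min(covering)[1]


def can_issue_read(in_flight, region, constraints=CONSTRAINTS):
    """in_flight maps region name -> Reads currently being serviced."""
    active = {r for r, n in in_flight.items() if n > 0} | {region}
    total_after = sum(in_flight.values()) + 1
    return total_after <= read_limit(active, constraints)
```

With these placeholder limits, `can_issue_read({"A_BOT": 7}, "A_TOP")` is allowed because only the die-stack-wide set covers both regions and the total stays at 8, while one more in-flight Read in `A_BOT` would block the same request. A real table would list a constraint for every region and every adjacent pair.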


Read Based constraints

To limit controller complexity, we define ACT, PRE and Write constraints in terms of Read

The Read-Equivalent is the minimum number of ACT, PRE, or WR commands that cause the same IR Drop as the Read with the least IR Drop in that region

| Command | Read Equivalent |
|---|---|
| ACT | 2 |
| PRE | 6 |
| WR | 1 |
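The Read-Equivalent values let a controller fold a mixed set of in-flight commands into a single Read count. The sketch below assumes the table means that two ACTs, six PREs, or one WR each weigh the same as one Read. The function names are illustrative, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction

# Commands per Read, taken from the Read-Equivalent table.
COMMANDS_PER_READ = {"RD": 1, "WR": 1, "ACT": 2, "PRE": 6}


def read_equivalent_load(counts):
    """Express a mix of in-flight commands as an equivalent number of Reads."""
    return sum(Fraction(n, COMMANDS_PER_READ[cmd]) for cmd, n in counts.items())


def fits(counts, max_reads):
    """True if the command mix stays within a region's Read limit."""
    return read_equivalent_load(counts) <= max_reads


mix = {"RD": 3, "ACT": 4, "PRE": 6, "WR": 1}
print(read_equivalent_load(mix))  # 3 + 4/2 + 6/6 + 1
```

Under this reading, the mix above counts as 7 Reads, so it fits under a limit of 8 but not under a limit of 6.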


Proposal 1: Controlling Starvation

As long as Bottom Regions are serving more than 8 Reads, Top Regions can never service a Read

Requests mapped to Top regions suffer

Prioritize requests that are older than N × average Read latency (N is empirically determined to be 1.2 in our simulations)

Die-stack-wide constraint:
• At least one Read in Top regions: 8 Reads allowed
• No Top region Reads: 16 Reads allowed
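The starvation rule can be sketched as a small scheduling function. The 8 and 16 Read limits and N = 1.2 come from the slide. Everything else is an assumption made for illustration: the request fields, the oldest-first order, and the choice to hold back younger requests while a starving one waits for in-flight Reads to drain.

```python
from dataclasses import dataclass

N = 1.2  # from the slide: empirically determined in the authors' simulations


@dataclass
class Request:
    arrival: int      # cycle at which the Read entered the queue
    top_region: bool  # True if the Read maps to a Top region


def stack_wide_limit(top_reads):
    """Die-stack-wide limit: 8 Reads if any Top Read is active, else 16."""
    return 8 if top_reads > 0 else 16


def may_issue(req, top_in_flight, bottom_in_flight):
    top_after = top_in_flight + (1 if req.top_region else 0)
    total_after = top_in_flight + bottom_in_flight + 1
    return total_after <= stack_wide_limit(top_after)


def pick_next(queue, now, avg_read_latency, top_in_flight, bottom_in_flight):
    """Return the next Read to issue, or None if nothing may issue now."""
    by_age = sorted(queue, key=lambda r: r.arrival)
    starving = [r for r in by_age if now - r.arrival > N * avg_read_latency]
    # Assumption: a starving request blocks younger ones so Reads can drain.
    candidates = starving[:1] if starving else by_age
    for req in candidates:
        if may_issue(req, top_in_flight, bottom_in_flight):
            return req
    return None
```

With ten Bottom Reads in flight, a Top request cannot issue and younger Bottom requests keep passing it. Once its age exceeds 1.2 times the average Read latency, this sketch stops issuing other Reads until the in-flight count drains enough for the Top Read to fit under the limit of 8.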


Case Study: Page Placement

Profile Applications to find out the most accessed pages

Map most accessed pages to the most IR Drop resistant regions (Bottom Regions)

The profile is divided into 8 sections. The 4 most accessed sections are mapped to Bottom regions

The rest are mapped to C_TOP, B_TOP, D_TOP, A_TOP, in that order
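The placement policy can be sketched as a ranking followed by a section lookup. This is a minimal illustration that assumes the 8 sections are equal-sized groups of pages ranked by profiled access count. `BOTTOM` is a placeholder label, since the slide does not say which Bottom region each hot section goes to.

```python
TOP_ORDER = ["C_TOP", "B_TOP", "D_TOP", "A_TOP"]  # order given on the slide


def place_pages(access_counts):
    """Map each page to a region label from its profiled access count."""
    pages = sorted(access_counts, key=access_counts.get, reverse=True)
    section_size = -(-len(pages) // 8)  # ceiling division into 8 sections
    placement = {}
    for rank, page in enumerate(pages):
        section = rank // section_size
        # The 4 hottest sections go to Bottom regions, the rest to Top regions.
        placement[page] = "BOTTOM" if section < 4 else TOP_ORDER[section - 4]
    return placement


# Hypothetical profile: 16 pages, with page "p0" the most accessed.
profile = {f"p{i}": 100 - i for i in range(16)}
for page, region in place_pages(profile).items():
    print(page, region)
```

With 16 pages, each section holds two pages: the eight hottest land in Bottom regions and the coldest two land in `A_TOP`.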


Modeling IR Drop

• Power assigned to each block is assumed to be distributed evenly over the block

• Current sources are used to model the power consumption
• More details in the paper

Source: Sani R. Nassif, Power Grid Analysis Benchmarks
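The first modeling step, spreading block power evenly and turning it into current sources, can be sketched as follows. The grid coordinates, the block tuple layout, and the use of the nominal supply voltage to convert power to current are assumptions made for illustration. Solving the resistive grid for node voltages is not shown.

```python
def block_current_sources(blocks, vdd):
    """Spread each block's power evenly over its grid nodes as current sources.

    blocks: iterable of (power_watts, x0, y0, x1, y1) with inclusive node
    coordinates. Returns {(x, y): current_amps}.
    """
    sources = {}
    for power, x0, y0, x1, y1 in blocks:
        nodes = [(x, y) for x in range(x0, x1 + 1) for y in range(y0, y1 + 1)]
        per_node = power / vdd / len(nodes)  # I = P / V, shared equally
        for node in nodes:
            # Overlapping blocks add their currents at shared nodes.
            sources[node] = sources.get(node, 0.0) + per_node
    return sources


# Hypothetical block: 0.3 W spread over a 2 x 2 patch of nodes at 1.5 V.
print(block_current_sources([(0.3, 0, 0, 1, 1)], vdd=1.5))
```

Each node's current source would then be attached to the power grid model, so a block that draws more power pulls more current through the wires that reach it.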


Methodology

• HMC based memory system
• Simics coupled with augmented USIMM
• SPEC CPU 2006 mp

CPU Configuration

| Parameter | Value |
|---|---|
| CPU | 8-core Out-of-Order CMP, 3.2 GHz |
| L2 Unified Last Level Cache | 8MB/8-way, 10-cycle access |

Memory Configuration

| Parameter | Value |
|---|---|
| Total DRAM Capacity | 8 GB in 1 3D stack |
| DRAM Configuration | 2 16-bit uplinks, 1 16-bit downlink @ 6.4 Gbps; 32 banks/DRAM die, 16 vaults; 8 DRAM dies/3D-stack; tFAW honored on each die |

Results

With all constraints (Real PDN), performance falls by 4.6X

With starvation management, the gap is reduced to 1.47X

Profiled Page Placement with Starvation Control is within 15.35% of unrealistic Ideal PDN


Future Work

We present a case study, which explores the performance improvement of IR Drop aware page placement

A realistic page placement/migration scheme is required to leverage the disparity in IR Drop tolerance of different regions

Metrics other than number of page accesses might be more appropriate when identifying critical pages

Prioritizing requests to Bottom regions could help overcome the detrimental effects of the Top regions

Exploring the complexity – performance tradeoff for memory controllers


Conclusions

This paper presents the problem of IR Drop in 3D DRAM

We reduce the impact of worst case IR Drop by introducing Region based constraints

Memory controller complexity is limited by simplifying IR Drop constraints

By addressing both the spatial and the temporal aspects of the problem, we achieve performance that is very close to that of the Ideal PDN


Thank You

