Date post: 05-Jan-2020

Understanding DRAM Architecture

R. Govindarajan Computer Science & Automation

Supercomputer Edn. & Res. Centre

Indian Institute of Science, Bangalore

[email protected]

Why Study Memory System?

• Memory Wall [McKee’94]

– CPU-Memory speed disparity

– 100’s of cycles for off-chip access

[Figure: Performance versus year. Processor performance follows Moore's Law (2X/1.5 yr) while DRAM improves 2X/10 yrs, so the processor-memory performance gap grows 50% / year.]

Memory Hierarchy: Recap

[Figure: CPU and MMU, L1 I-Cache and L1 D-Cache, L2 Unified Cache, Memory.]

Memory Hierarchy in Multicore

[Figure: Four cores (C0 to C3), each with its own L1 cache (L1$), share an L2 cache backed by a memory controller and memory. Latency labels on the figure: 1 – 2 cycles, 10 – 15 cycles, 100 – 300 cycles.]

Memory Hierarchy in Multicore

[Figure: Multi-core processor with Core 0, Core 1, ..., Core 14, Core 15, each with L1D and L1I caches, plus L2 caches and an L3 cache (LLSC). Hit and Miss paths are marked, with misses going through the memory controller to off-chip main memory.]

Memory Bandwidth Demand for Multicores

• Memory Wall [McKee’94]

– CPU-Memory speed disparity

– 100’s of cycles for off-chip access

• Bandwidth Wall [ISCA’09]

– More cores and limited off-chip bandwidth

– Cores double every 18 months

– Pin count grows only by 10%

Off-chip accesses are expensive! Memory system performance is critical.

Big Picture of Memory

Overview of a DRAM Memory

[Figure: The memory controller connects over control, address, and data lines (data read and write operations) to a DIMM. A DIMM is made up of ranks, a rank of devices, and a device of banks. Each DRAM bank is an array of rows and columns with bank logic and a row buffer.]

Basic DRAM Operations

• ACTIVATE: Bring data from the DRAM core into the row buffer

• READ/WRITE: Perform read/write operations on the contents of the row buffer

• PRECHARGE: Store data back to the DRAM core (ACTIVATE discharges the capacitors) and put the cells back at neutral voltage

[Figure: Timeline of three load (Ld) memory requests. The first is a row buffer miss served by PRE, ACT, RD; the second is a row buffer hit served by RD alone; the third is again a row buffer miss served by PRE, ACT, RD.]

Row buffer hits are faster and consume less power

DRAM Bank Operation

[Figure, shown as an animation over slides 12 to 32: a DRAM bank drawn as an array of rows and columns, with a row decoder, a column decoder, and a row buffer.]

• Access address (Row 0, Column 0): the row buffer is empty. Row address 0 goes to the row decoder and Row 0 is brought into the row buffer; column address 0 then goes to the column decoder and the data is returned.

• Access address (Row 0, Column 1): HIT. Row 0 is already in the row buffer, so only column address 1 is sent to the column decoder and the data is returned.

• Access address (Row 1, Column 0): CONFLICT! The row buffer holds Row 0, so it is emptied first; row address 1 then brings Row 1 into the row buffer, and column address 0 selects the data.

Slide Source: Onur Mutlu, CMU
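
The three cases in the bank operation walkthrough (empty row buffer, hit, conflict) can be sketched as a tiny state machine. This is a minimal illustration, not a model of any real device: the `Bank` class, the outcome names, and the choice to issue no PRE when the row buffer is already empty are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Bank:
    # None models an empty (already precharged) row buffer.
    open_row: Optional[int] = None

    def access(self, row: int) -> Tuple[str, List[str]]:
        """Return (outcome, DRAM commands) for a read to `row`."""
        if self.open_row == row:
            # Row already in the row buffer: only a column read is needed.
            return "hit", ["RD"]
        if self.open_row is None:
            outcome, commands = "empty", ["ACT", "RD"]
        else:
            # A different row is open: close it, then open the new one.
            outcome, commands = "conflict", ["PRE", "ACT", "RD"]
        self.open_row = row
        return outcome, commands


def trace(accesses):
    """Replay (row, column) accesses on a fresh bank."""
    bank = Bank()
    return [bank.access(row) for row, _column in accesses]


# The access sequence used on the slides.
results = trace([(0, 0), (0, 1), (1, 0)])
for outcome, commands in results:
    print(outcome, commands)
```

The column index is ignored because, in this simplified view, only the row decides whether the row buffer can serve the request.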

DRAM Command Summary

Slide Source: S. Rixner

DRAM Memory Controller

• Frontend

– Request/Response Buffers

– Memory mapping

– Arbiter

• Controller Backend

– Command Generator

• Timing to be obeyed

Slide Source: S. Rixner
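
The slide lists memory mapping as a frontend task without giving a layout. As a rough sketch, one possible mapping splits a physical address into row, bank, and column fields. The field order (row | bank | column) and the widths below are assumptions chosen to match the 4GB, 2KB-page, 16-bank example from the refresh slide; real controllers use many different interleavings.

```python
# Hypothetical layout for a 32-bit physical address (4 GB):
#   [ row: 17 bits | bank: 4 bits | column: 11 bits ]
COLUMN_BITS = 11  # 2 KB per row (page)
BANK_BITS = 4     # 16 banks
ROW_BITS = 17     # 128K rows per bank


def map_address(physical_address: int):
    """Split a physical address into (row, bank, column)."""
    column = physical_address & ((1 << COLUMN_BITS) - 1)
    bank = (physical_address >> COLUMN_BITS) & ((1 << BANK_BITS) - 1)
    row = (physical_address >> (COLUMN_BITS + BANK_BITS)) & ((1 << ROW_BITS) - 1)
    return row, bank, column


# Consecutive 2 KB pages land in consecutive banks under this layout.
print(map_address(0))
print(map_address(2048))
print(map_address(2048 * 16))
```

Placing the bank bits just above the column bits spreads consecutive pages over different banks; putting them elsewhere would trade that spreading for other properties.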

Bank Level Parallelism in DRAM

[Figure, built up over several slides: The memory controller drives control, address, and data lines to multiple banks. Memory requests Ld A0, Ld B1, and Ld B2 are each served with PRE, ACT, RD; Ld C1 is served with RD alone.]

Bank Level Parallelism
• Improves perf. with parallelism and row buffer hits
• Hurts perf. due to bank-to-bank switch delay
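
To get a feel for the first bullet, the sketch below compares requests that all hit one bank with requests spread over several banks. It is a deliberately crude model under stated assumptions: the three timing constants are hypothetical round numbers, all requests are available at time 0, every row miss costs PRE + ACT + RD, and command/data bus contention and the bank-to-bank switch delay from the second bullet are ignored.

```python
# Hypothetical command latencies in nanoseconds (not from any datasheet).
T_PRE, T_ACT, T_RD = 15, 15, 15


def finish_time(requests):
    """requests: list of (bank, row) in arrival order.

    Each bank serves its own requests one after another; different
    banks work independently. Returns when the last request completes.
    """
    bank_free_at = {}  # bank -> time at which it can start the next request
    open_row = {}      # bank -> row currently in its row buffer
    finish = 0
    for bank, row in requests:
        if open_row.get(bank) == row:
            cost = T_RD                  # row buffer hit
        else:
            cost = T_PRE + T_ACT + T_RD  # row buffer miss
        done = bank_free_at.get(bank, 0) + cost
        bank_free_at[bank] = done
        open_row[bank] = row
        finish = max(finish, done)
    return finish


one_bank = finish_time([(0, 0), (0, 1), (0, 2)])     # three misses, serialized
three_banks = finish_time([(0, 0), (1, 1), (2, 2)])  # three misses, overlapped
with_hit = finish_time([(0, 0), (1, 1), (2, 2), (1, 1)])  # last one is a hit
print(one_bank, three_banks, with_hit)
```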

DRAM Refresh

• Capacitors leak and lose charge, so the charge must be restored periodically

• JEDEC Spec: At normal temp, cell retention time limit is 64ms. At high (extended) temp, retention time halves to 32ms.

• The memory controller issues refresh operations periodically.

[Figure: Timeline of normal accesses, interrupted periodically by a refresh.]

• Assume 4GB DRAM with 2KB pages, organized as 16 banks

• 2M pages total, 128K pages per bank

• Refreshing a page takes 20ns (ACTIVATE+PRECHARGE)

• Refreshing all pages in a bank takes about 2.6 ms!

• 2.6/64 = 4% overhead!
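
The arithmetic behind the refresh overhead can be checked in a few lines. The inputs are the ones stated on the slide; the variable names are only for this illustration, and the calculation assumes, as the slide does, that a bank refreshes its pages one at a time within each 64 ms window.

```python
capacity_bytes = 4 * 2**30   # 4 GB DRAM
page_bytes = 2 * 2**10       # 2 KB pages (rows)
banks = 16
refresh_per_page_ns = 20     # ACTIVATE + PRECHARGE
retention_ms = 64            # retention limit at normal temperature

pages_total = capacity_bytes // page_bytes   # 2M pages
pages_per_bank = pages_total // banks        # 128K pages per bank

# Time to refresh every page of one bank, converted from ns to ms.
bank_refresh_ms = pages_per_bank * refresh_per_page_ns / 1e6

# Fraction of each retention window the bank spends refreshing.
overhead = bank_refresh_ms / retention_ms

print(pages_total, pages_per_bank)
print(f"{bank_refresh_ms:.2f} ms per bank, {overhead:.1%} overhead")
```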

Memory Access Scheduling: FCFS

[Figure, shown as an animation: a DRAM bank with row decoder, column decoder, and row buffer, next to a request buffer. Row 0 is open in the row buffer.]

• Request buffer, oldest first: (Row 0; Col 0), (Row 0; Col 1), (Row 1; Col 0), (Row 0; Col 4). (Row 0; Col 5) arrives after (Row 0; Col 0) has been served.

• FCFS serves requests strictly in arrival order: (Row 0; Col 0) and (Row 0; Col 1) are served from the open Row 0.

• (Row 1; Col 0) is next, so Row 0 is closed and Row 1 is brought into the row buffer, even though (Row 0; Col 4) and (Row 0; Col 5) are still waiting for Row 0.

Slide Source: Onur Mutlu, CMU

Memory Access Scheduling: FR-FCFS

• A row-conflict memory access takes significantly longer than a row-hit access

• Current controllers take advantage of the row buffer

• Commonly used scheduling policy (FR-FCFS) [Rixner, ISCA’00]

(1) Row-hit (column) first: Service row-hit memory accesses first

(2) Oldest-first: Then service older accesses first

• This scheduling policy aims to maximize DRAM throughput

Slide Source: Onur Mutlu, CMU
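
The two rules can be sketched for a single bank as follows. This is a simplified illustration rather than the scheduler from the cited paper: the `schedule` function and its `row_hit_first` flag are hypothetical names, all requests are assumed to be in the buffer from the start, and timing is reduced to counting row buffer hits. With `row_hit_first=False` the same function behaves as plain FCFS.

```python
def schedule(requests, open_row, row_hit_first):
    """Serve (row, column) requests for one bank.

    requests: list ordered oldest first.
    open_row: row currently held in the row buffer.
    Returns (service order, number of row buffer hits).
    """
    pending = list(requests)
    order, hits = [], 0
    while pending:
        pick = pending[0]  # rule (2): oldest first
        if row_hit_first:
            # Rule (1): prefer the oldest request that hits the open row.
            for request in pending:
                if request[0] == open_row:
                    pick = request
                    break
        pending.remove(pick)
        if pick[0] == open_row:
            hits += 1
        else:
            open_row = pick[0]  # row conflict: the new row replaces the old one
        order.append(pick)
    return order, hits


# Request buffer contents from the slides, with Row 0 open.
buffer_contents = [(0, 0), (0, 1), (1, 0), (0, 4), (0, 5)]
fcfs_order, fcfs_hits = schedule(buffer_contents, open_row=0, row_hit_first=False)
frfcfs_order, frfcfs_hits = schedule(buffer_contents, open_row=0, row_hit_first=True)
print("FCFS   ", fcfs_order, fcfs_hits)
print("FR-FCFS", frfcfs_order, frfcfs_hits)
```

In this example FCFS switches rows twice, while FR-FCFS drains all Row 0 requests before opening Row 1.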

Memory Access Scheduling: FR-FCFS

[Figure, shown as an animation: the same DRAM bank and request buffer as in the FCFS example, with Row 0 open in the row buffer.]

• Request buffer, oldest first: (Row 0; Col 0), (Row 0; Col 1), (Row 1; Col 0), (Row 0; Col 4). (Row 0; Col 5) arrives after (Row 0; Col 0) has been served.

• FR-FCFS serves the row hits first: (Row 0; Col 0), (Row 0; Col 1), (Row 0; Col 4), and (Row 0; Col 5) are all served from the open Row 0, ahead of the older (Row 1; Col 0).

• Only then is Row 0 closed and Row 1 brought into the row buffer to serve (Row 1; Col 0).

Slide Source: Onur Mutlu, CMU

Emerging Memory Technology

• Non-Volatile Memory technology

– Phase Change Memory (PCM), Magnetic RAM (MRAM), Resistive RAM (RRAM), Spin Torque Transfer RAM (STT-RAM), …

Slide Source: Moin Qureshi, Georgia Tech.

Emerging Memory Technology

• Phase Change Memory

– Data stored by changing phase of special material

– Data read by detecting material’s resistance

– Phase change material (chalcogenide glass) exists in two states:

1. Amorphous: high resistivity – reset state or 0

2. Crystalline: low resistivity – set state or 1

– Non-volatility and low idle power (no refresh)

– Expected to scale (to 9nm), denser than DRAM, and can store multiple bits/cell

– Higher write latency and write energy

– Endurance issues (cell dies after 10^8 writes)

Slide Source: Onur Mutlu, CMU
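
To see why the endurance limit matters, a back-of-the-envelope estimate helps. The sketch below is only illustrative: the 10^8 figure comes from the slide, but the capacity, the write rates, and the two idealized scenarios (perfectly even wear versus one cell written repeatedly) are assumptions for the example, not measurements.

```python
ENDURANCE_WRITES = 10**8  # writes a cell survives, as stated on the slide


def ideal_lifetime_seconds(capacity_bytes, write_bytes_per_second):
    """Lifetime if writes are spread perfectly evenly over all cells."""
    return capacity_bytes * ENDURANCE_WRITES / write_bytes_per_second


def hot_cell_lifetime_seconds(writes_per_second):
    """Lifetime of a single cell that absorbs every write."""
    return ENDURANCE_WRITES / writes_per_second


SECONDS_PER_YEAR = 365 * 24 * 3600

# Hypothetical system: 16 GB of PCM written at 1 GB/s.
even = ideal_lifetime_seconds(16 * 2**30, 2**30)
# Hypothetical worst case: the same cell written one million times per second.
hot = hot_cell_lifetime_seconds(1e6)

print(f"even wear: about {even / SECONDS_PER_YEAR:.0f} years")
print(f"hot cell:  {hot:.0f} seconds")
```

The gap between the two numbers is the reason wear has to be spread across cells in a PCM-based memory.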

DRAM – PCM Hybrid Memory

• How should PCM-based (main) memory be organized?

• Hybrid PCM+DRAM

– How to partition/migrate data between PCM and DRAM

– Is DRAM a cache for PCM or part of main memory?

– How to design the hardware and software

Slide Source: Onur Mutlu, CMU

PCM-based Main Memory

• How should PCM-based (main) memory be organized?

• Pure PCM main memory [Lee et al., ISCA’09, Top Picks’10]:

– How to redesign entire hierarchy (and cores) to overcome PCM shortcomings

Slide Source: Onur Mutlu, CMU

Expanding the Multicore Memory Hierarchy

[Figure: Four cores (C0 to C3), each with its own L1 cache (L1$), share an L2 cache; a DRAM cache is added to the hierarchy alongside the memory controller and memory.]

Stacked DRAM

• DRAM vertically stacked over the processor die.

• Stacked DRAMs offer

– High bandwidth

– Large capacity

– Same or slightly lower latency.

[Figure: 3-D stacked DRAM and 2.5-D stacked DRAM]

Can be used as a cache or as part of memory

Multicore With DRAM Cache

[Figure: Processor with stacked DRAM. Core 0, Core 1, ..., Core N, each with L1D and L1I caches, share an L2 cache (LLSC) backed by a vertically stacked DRAM cache. Hit and Miss paths are marked, with misses going through the memory controller to off-chip main memory.]

Overview of Different Designs

PA_M PA_DRAM $ = Hash (PA_M) + Offset

Problems in Architecting Large Caches

• Organizing at cache line granularity (64 B) reduces wasted space and wasted bandwidth

• Problem: Cache of hundreds of MB needs tag-store of tens of MB

• E.g. 256MB DRAM cache needs ~20MB tag store (5 bytes/line)

• But big blocks have their own issues

– Wasted off-chip bandwidth

– Wasted cache space

Option 1: SRAM Tags
Fast, but impractical (not enough transistors)

Option 2: Tags in DRAM
Naïve design has 2x latency (two accesses: tag and data)
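
The ~20MB figure follows directly from the number of lines in the cache. The short calculation below reproduces it and, as an additional assumption for comparison, applies the same 5 bytes of tag per block to a 1 KB block size to show how larger blocks shrink the tag store.

```python
MB = 2**20


def tag_store_bytes(cache_bytes, block_bytes, tag_bytes_per_block=5):
    """Tag storage needed when every block carries its own tag entry."""
    return (cache_bytes // block_bytes) * tag_bytes_per_block


# 256 MB cache with 64 B lines: 4M lines x 5 bytes.
small_blocks = tag_store_bytes(256 * MB, 64)
# Same cache with 1 KB blocks (5 bytes per block is an assumption here).
large_blocks = tag_store_bytes(256 * MB, 1024)

print(small_blocks / MB, "MB of tags with 64 B lines")
print(large_blocks / MB, "MB of tags with 1 KB blocks")
```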

Stacked DRAM Caches

Tags-On-DRAM

• Cache tags on DRAM itself

• Typically 64B blocks

• Due to overhead of tag access from DRAM, requires some form of predictor/cache in SRAM

• Several recent proposals (Loh-Hill, AlloyCache, ATCache, Bi-Modal)

Tags-On-SRAM

• Cache tags on SRAM

• Expensive SRAM

• Large storage overhead

• So typically uses larger block sizes to reduce overhead (~ 1KB)

• Off-chip bandwidth and cache utilization are concerns

• Several recent proposals (FootPrintCache, CHOP)

Stacked DRAM Cache Orgn.

Core

0

Core

1

Core

N

.

.

.

L1D

L1I

L1D

L1I

L1D

L1I

L2

(LLSC)

DRAM

Cache

(Vertically

Stacked)

(Off

Chip)

Main

Memory

Hit

Memory

Controller

Miss

Processor with Stacked DRAM

Page 81: Understanding DRAM ArchitectureSlide Source: Onur Mutlu, CMU . 12 DRAM Bank Operation Row Buffer Access Address (Row 0, Column 0) er Row address 0 Empty Columns s Slide Source: Onur

Stacked DRAM Cache Organization

[Figure: Processor with Stacked DRAM, MetaData on SRAM. Core 0, Core 1, ..., Core N, each with L1D and L1I, share an L2 (LLSC). Below the L2 is the DRAM Cache (Vertically Stacked), with Hit and Miss paths; the Memory Controller connects to the (Off Chip) Main Memory.]

Page 82:

Stacked DRAM Cache Organization

[Figure: Processor with Stacked DRAM, MetaData on DRAM. Core 0, Core 1, ..., Core N, each with L1D and L1I, share an L2 (LLSC). A Tag-Pred (tag predictor) sits ahead of the DRAM Cache (Vertically Stacked), with Hit and Miss paths; the Memory Controller connects to the (Off Chip) Main Memory.]

Page 85:

Overview of Different Designs

• For DRAM caches, critical to optimize first for latency, then hit-rate

PA_M -> PA_DRAM$ = Hash(PA_M) + Offset

Hash(PA_M) + Offset

Hash(PA_M)
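
As a rough illustration of the mapping PA_DRAM$ = Hash(PA_M) + Offset, the sketch below picks one possible hash. The slide does not say what Hash is; the modulo set index, the 64 B block size, the direct-mapped layout, and all names here are assumptions made for the example.

```python
BLOCK_BYTES = 64        # assumed block size
NUM_SETS = 1 << 22      # assumed: 256 MiB cache / 64 B blocks, direct-mapped

def hash_pa(pa_m: int) -> int:
    """Hypothetical Hash(PA_M): base address of the DRAM-cache block slot."""
    block_number = pa_m // BLOCK_BYTES
    set_index = block_number % NUM_SETS   # simple modulo hash (assumption)
    return set_index * BLOCK_BYTES

def dram_cache_address(pa_m: int) -> int:
    """PA_DRAM$ = Hash(PA_M) + Offset, with Offset the byte offset in the block."""
    offset = pa_m % BLOCK_BYTES
    return hash_pa(pa_m) + offset

# Two main-memory addresses 256 MiB apart land on the same DRAM-cache location,
# so a tag is still needed to tell them apart.
print(hex(dram_cache_address(0x1234)))              # 0x1234
print(hex(dram_cache_address((1 << 28) + 0x1234)))  # 0x1234
```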

Page 86:

Overview of Bi-Modal Cache

• Tags-In-DRAM organization

• With 3 new organizational features:

1) Cache Sets are Bi-Modal – they can hold a combination of big (512B) and small (64B) blocks

2) Parallel Tag and Data Accesses

3) Eliminating Most Tag Accesses via a small SRAM based Way Locator

Callouts on the slide: Reduce Hit Latency; Improves Hit Rate and Reduces Off-Chip Bandwidth

Page 87:

Supporting Bi-Modal Block Sizes

• Each Set can hold some big (512B) and some small (64B) blocks.

• Block Size Predictor

– Blocks with high spatial reuse fetch 512B

– Blocks with little spatial reuse fetch 64B

[Figure: a DRAM Cache Set holding 512B, 512B, 512B, 64, 64, 64, … blocks. On a DRAM Cache Miss, Predict Block Size: Big leads to Fetch 512B Block, Small leads to Fetch 64B Block.]
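
The slide does not describe how the Block Size Predictor works internally. The sketch below is one hypothetical design, not the one from the Bi-Modal proposal: a table of 2-bit saturating counters indexed by 512B region, trained with the number of 64B sub-blocks that were actually touched. The table size, the threshold of 4 sub-blocks, and all names are assumptions.

```python
BIG, SMALL = 512, 64

class BlockSizePredictor:
    """Hypothetical predictor: one 2-bit saturating counter per table entry."""

    def __init__(self, entries: int = 1024) -> None:
        self.entries = entries
        self.counters = [1] * entries   # start weakly biased toward small blocks

    def _index(self, addr: int) -> int:
        return (addr // BIG) % self.entries

    def predict(self, addr: int) -> int:
        """Return the block size to fetch on a DRAM cache miss."""
        return BIG if self.counters[self._index(addr)] >= 2 else SMALL

    def train(self, addr: int, sub_blocks_used: int) -> None:
        """Update with how many of the region's eight 64B sub-blocks were used."""
        i = self._index(addr)
        if sub_blocks_used >= 4:        # assumed cut-off for "high spatial reuse"
            self.counters[i] = min(3, self.counters[i] + 1)
        else:
            self.counters[i] = max(0, self.counters[i] - 1)

predictor = BlockSizePredictor()
print(predictor.predict(0x1000))   # 64
predictor.train(0x1000, sub_blocks_used=8)
print(predictor.predict(0x1000))   # 512
```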

Page 88:

Parallel Tag and Data Accesses

High Row Buffer Hit Rate in the Metadata Bank!

[Figure: Channel 0 and Channel 1, each with banks labelled D M D D; a Tag Access and a Data Access are shown side by side. A Page in DRAM Cache is drawn as a run of tag entries (T) alongside Data.]

Page 89:

Eliminating a Majority of Tag Accesses using the Way Locator

Set      MRU: Tag and Way     MRU-1: Tag and Way

Set 0    Tag a1, Way 3        Tag a2, Way 1

Set 1    Tag a3, Way 2        Tag a4, Way 0

Set 2    Tag a5, Way 0        Tag a6, Way 3

Set N    Tag am, Way x        Tag an, Way y

[Figure: the Addr selects a Set in the table.]

2-way Set Associative Cache

Each entry specifies tag and associated way (DRAM column) where data is stored
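
The table above can be modelled as a small structure that keeps, per set, the two most recently used (tag, way) pairs. This is a minimal sketch under assumptions: the class and method names are hypothetical, and the replacement policy (evict the MRU-1 entry on insert) is inferred from the MRU / MRU-1 column headings rather than stated on the slide.

```python
from typing import List, Optional, Tuple

class WayLocator:
    """Per set, remember the MRU and MRU-1 (tag, way) pairs."""

    def __init__(self, num_sets: int) -> None:
        # Each set holds at most two (tag, way) pairs, most recent first.
        self.table: List[List[Tuple[str, int]]] = [[] for _ in range(num_sets)]

    def lookup(self, set_index: int, tag: str) -> Optional[int]:
        """Return the DRAM-cache way for this tag, or None on a locator miss."""
        entries = self.table[set_index]
        for i, (entry_tag, way) in enumerate(entries):
            if entry_tag == tag:
                entries.insert(0, entries.pop(i))   # promote to MRU
                return way
        return None

    def update(self, set_index: int, tag: str, way: int) -> None:
        """Record a (tag, way) pair as MRU, dropping the oldest if needed."""
        entries = self.table[set_index]
        entries[:] = [(t, w) for (t, w) in entries if t != tag]
        entries.insert(0, (tag, way))
        del entries[2:]                             # keep only MRU and MRU-1

locator = WayLocator(num_sets=4)
locator.update(0, "a2", 1)
locator.update(0, "a1", 3)
print(locator.lookup(0, "a1"))   # 3
print(locator.lookup(0, "zz"))   # None
```

On a locator hit the controller already knows which way (DRAM column) holds the data, so the tag read from DRAM can be skipped.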

Page 90:

Putting them together

[Figure: flowchart starting at a Last Level Cache Miss. Way Locator Hit? On a Hit: access DRAM Cache (Data) and Return Data (Lowest Latency). On a Miss: access DRAM Cache (Tag) and DRAM Cache (Data) (Parallel Access), then Tag Match? A match is a Cache Hit: Return Data. No match is a Cache Miss: Predict Block Size, then Big: Fetch 512B Block, or Small: Fetch 64B Block.]
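
The decision flow on this slide can be written out as a short function. This is only a sketch of the control flow as read from the flowchart labels; the function name, the boolean inputs, and the step strings are hypothetical.

```python
from typing import List

def dram_cache_access(way_locator_hit: bool, tag_match: bool, predicted_big: bool) -> List[str]:
    """Return the steps taken for one last-level cache miss."""
    if way_locator_hit:
        # Lowest-latency path: the way is already known, no tag read needed.
        return ["read DRAM cache data", "return data"]
    # Way locator miss: read tag and data from the DRAM cache in parallel.
    steps = ["read DRAM cache tag and data in parallel"]
    if tag_match:
        steps.append("return data")
    else:
        # DRAM cache miss: the block size predictor picks the fetch size.
        steps.append("fetch 512B block" if predicted_big else "fetch 64B block")
    return steps

print(dram_cache_access(True, False, False))
print(dram_cache_access(False, False, True))
```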

Page 91:

Hit Latency Improvement

Common Case: Over 85% of accesses

