Post on 12-Nov-2014
DRAM Memory System: Lecture 2
Spring 2003
Bruce Jacob, David Wang
University of Maryland
DRAM Circuit and Architecture Basics
• Overview
• Terminology
• Access Protocol
• Architecture
[Figure: DRAM cell. Storage element (capacitor), switching element, bit line, word line.]
DRAM Circuit Basics
DRAM Cell
[Figure: DRAM organization. Memory array crossed by word lines and bit lines, with row decoder, sense amps, column decoder and data in/out buffers. Inset: DRAM cell with storage element (capacitor), switching element, bit line and word line.]
DRAM Circuit Basics
“Row” Defined
Row Size: 8 Kb @ 256 Mb SDRAM, 4 Kb @ 256 Mb RDRAM
[Figure: one “row” of DRAM: the cells on a single word line, crossed by the bit lines.]
Note: Row, bitlines and wordlines.
DRAM Circuit Basics
Sense Amplifier I
[Figure: sense-and-amplify circuit attached to the bit lines; 6 rows shown, numbered 1 to 6.]
DRAM Circuit Basics
Sense Amplifier II: Precharged
Vcc (logic 1), Gnd (logic 0)
Bit lines precharged to Vcc/2
[Figure: sense-and-amplify circuit with the bit lines at Vcc/2; rows numbered 1 to 6.]
DRAM Circuit Basics
Sense Amplifier III: Destructive Read
Vcc (logic 1), Gnd (logic 0)
[Figure: sense-and-amplify circuit with one wordline driven; bit lines start at Vcc/2; rows numbered 1 to 6.]
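The precharge-then-sense sequence on the sense amplifier slides can be illustrated with a charge-sharing calculation. This is a minimal sketch, not taken from the lecture: the supply voltage and the two capacitance values are assumptions chosen only to show the size of the effect, and the function names are hypothetical. It shows why the read is destructive: after the word line is driven, the cell and bit line settle to a voltage close to Vcc/2, so the sense amp must drive the line to a full rail and restore the cell.

```python
def bitline_voltage_after_share(v_cell, vcc, c_cell, c_bitline):
    """Voltage after a cell shares charge with a bit line precharged to Vcc/2."""
    v_pre = vcc / 2.0
    # Charge conservation across the two capacitors.
    return (c_cell * v_cell + c_bitline * v_pre) / (c_cell + c_bitline)


def sense(v_bitline, vcc):
    """Sense amp decision: resolve to logic 1 above Vcc/2, logic 0 otherwise."""
    return 1 if v_bitline > vcc / 2.0 else 0


VCC = 2.5        # volts (assumed)
C_CELL = 30e-15  # storage capacitor, 30 fF (assumed)
C_BL = 300e-15   # bit line capacitance, 300 fF (assumed)

for stored_bit in (0, 1):
    v_cell = VCC if stored_bit else 0.0
    v_bl = bitline_voltage_after_share(v_cell, VCC, C_CELL, C_BL)
    swing_mv = (v_bl - VCC / 2.0) * 1000.0
    print(f"stored={stored_bit} swing={swing_mv:+.1f} mV sensed={sense(v_bl, VCC)}")
```

With these assumed values the bit line moves by only about a tenth of a volt, which is the small signal the sense amplifier has to resolve.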
DRAM Access Protocol
ROW ACCESS
AKA: OPEN a DRAM Page/Row
or RAS (Row Address Strobe)
or ACT (Activate a DRAM Page/Row)
[Figure: CPU, bus, memory controller and DRAM (row decoder, word lines, memory array, bit lines, sense amps, column decoder, data in/out buffers).]
DRAM Circuit Basics
“Column” Defined
Column: Smallest addressable quantity of DRAM on chip
SDRAM*: column size == chip data bus width (4, 8, 16, 32)
RDRAM: column size != chip data bus width (128 bit fixed)
SDRAM*: get “n” columns per access. n = (1, 2, 4, 8)
RDRAM: get 1 column per access.
* SDRAM means SDRAM and variants, i.e. DDR SDRAM
[Figure: “one row” of DRAM divided into 4 bit wide columns, numbered #0 to #5.]
Note: once the data is valid on ALL of the bit lines, you can select a subset of the bits and send them to the output buffers ... CAS picks one of the bits.
Big point: cannot do another RAS or precharge of the lines until finished reading the column data ... can’t change the values on the bit lines or the output of the sense amps until it has been read by the memory controller.
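The row and column definitions can be turned into a small geometry calculation. The sketch below is an illustration under stated assumptions: it treats the device as one flat array (real parts split rows across banks), it uses the 8 Kb row size quoted for a 256 Mb SDRAM, and the function names are hypothetical.

```python
def geometry(device_bits, row_bits, column_width_bits):
    """Return (number of rows, columns per row) for a flat, single-array model."""
    rows = device_bits // row_bits
    columns_per_row = row_bits // column_width_bits
    return rows, columns_per_row


def split_column_index(flat_column, columns_per_row):
    """Map a flat column number to (row index, column index within the row)."""
    return divmod(flat_column, columns_per_row)


DEVICE_BITS = 256 * 2**20  # 256 Mb device
ROW_BITS = 8 * 2**10       # 8 Kb row

for width in (4, 8, 16):   # column size == chip data bus width for SDRAM
    rows, cols = geometry(DEVICE_BITS, ROW_BITS, width)
    print(f"x{width}: {rows} rows, {cols} columns per row")

print(split_column_index(1000, 512))
```

A wider data bus means fewer, wider columns per row; the row count does not change because the row size is fixed.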
DRAM Access Protocol
COLUMN ACCESS I
READ Command
or CAS: Column Address Strobe
[Figure: CPU, bus, memory controller and DRAM (row decoder, word lines, memory array, bit lines, sense amps, column decoder, data in/out buffers).]
DRAM Access Protocol
Column Access II
Data Out
... with optional additional CAS: Column Address Strobe
note: page mode enables overlap with CAS
[Figure: CPU, bus, memory controller and DRAM (row decoder, word lines, memory array, bit lines, sense amps, column decoder, data in/out buffers).]
Note: then the data is valid on the data bus ... depending on what you are using for in/out buffers, you might be able to overlap a little or a lot of the data transfer with the next CAS to the same page (this is PAGE MODE).
DRAM “Speed” Part I
How fast can I move data from DRAM cell to sense amp?
RCD (Row Command Delay)
tRCD
[Figure: CPU, bus, memory controller and DRAM block diagram with tRCD marked.]
DRAM “Speed” Part II
How fast can I get data out of sense amps back into memory controller?
CAS: Column Address Strobe
CASL: Column Address Strobe Latency
CL: Column Address Strobe Latency
tCAS aka tCASL aka tCL
[Figure: CPU, bus, memory controller and DRAM block diagram.]
DRAM “Speed” Part III
How fast can I move data from DRAM cell into memory controller?
RAC (Random Access Delay)
tRAC = tRCD + tCAS
[Figure: CPU, bus, memory controller and DRAM block diagram.]
DRAM “Speed” Part IV
How fast can I precharge DRAM array so I can engage another RAS?
RP (Row Precharge Delay)
tRP
[Figure: CPU, bus, memory controller and DRAM block diagram.]
DRAM “Speed” Part V
How fast can I read from different rows?
RC (Row Cycle Time)
tRC = tRAS + tRP
[Figure: CPU, bus, memory controller and DRAM block diagram.]
DRAM “Speed” Summary I
What do I care about?
RCD: Row Command Delay. tRCD
RAS: Row Address Strobe
CAS: Column Address Strobe. tCAS
RAC: Random Access Delay. tRAC = tRCD + tCAS
RP: Row Precharge Delay. tRP
RC: Row Cycle Time. tRC = tRAS + tRP
Callouts:
• Seen in ads. Easy to explain. Easy to sell.
• Computer Architect: latency bound code, i.e. linked list traversal
• Embedded systems designers, DRAM manufacturers
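The two derived parameters in this summary, tRAC = tRCD + tCAS and tRC = tRAS + tRP, can be captured in a small record type. This is a sketch only: the class name, field names and the nanosecond values are assumptions made for illustration and do not describe any particular part.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DramTiming:
    """Basic timing parameters in nanoseconds (example values are assumed)."""
    t_rcd: float  # row command delay: cell to sense amp
    t_cas: float  # column access: sense amp to memory controller
    t_ras: float  # time the row must stay active
    t_rp: float   # row precharge delay

    @property
    def t_rac(self):
        # Random access delay: tRAC = tRCD + tCAS
        return self.t_rcd + self.t_cas

    @property
    def t_rc(self):
        # Row cycle time: tRC = tRAS + tRP
        return self.t_ras + self.t_rp


example = DramTiming(t_rcd=20.0, t_cas=20.0, t_ras=45.0, t_rp=20.0)
print(f"tRAC = {example.t_rac} ns, tRC = {example.t_rc} ns")
```

tRAC bounds how soon the first data arrives after opening a row, while tRC bounds how often different rows in the same bank can be opened.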
DRAM “Speed” Summary II

DRAM Type     Frequency (MHz)  Data Bus Width (per chip)  Peak Data Bandwidth (per chip)  Random Access Time (tRAC)  Row Cycle Time (tRC)
PC133 SDRAM   133              16                         200 MB/s                        45 ns                      60 ns
DDR 266       133 * 2          16                         532 MB/s                        45 ns                      60 ns
PC800 RDRAM   400 * 2          16                         1.6 GB/s                        60 ns                      70 ns
FCRAM         200 * 2          16                         0.8 GB/s                        25 ns                      25 ns
RLDRAM        300 * 2          32                         2.4 GB/s                        25 ns                      25 ns

DRAM is “slow”
But doesn’t have to be
tRC < 10ns achievable
Higher die cost
Not commodity
Expensive
Not adopted in standard
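The peak bandwidth column follows from clock rate, transfers per clock and bus width. The sketch below recomputes it; the function name and the tuple layout are hypothetical, and the inputs are read off the table. For DDR 266, PC800 RDRAM, FCRAM and RLDRAM the results match the listed figures. For PC133 SDRAM the same formula gives 266 MB/s, which differs from the 200 MB/s printed on the slide.

```python
def peak_bandwidth_mb_s(clock_mhz, transfers_per_clock, bus_width_bits):
    """Peak per-chip bandwidth in MB/s: clock * transfers per clock * bytes per transfer."""
    return clock_mhz * transfers_per_clock * bus_width_bits / 8


# (clock in MHz, transfers per clock, data bus width in bits), taken from the table
PARTS = {
    "PC133 SDRAM": (133, 1, 16),
    "DDR 266": (133, 2, 16),
    "PC800 RDRAM": (400, 2, 16),
    "FCRAM": (200, 2, 16),
    "RLDRAM": (300, 2, 32),
}

for name, params in PARTS.items():
    print(f"{name:12s} {peak_bandwidth_mb_s(*params):7.0f} MB/s")
```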
“DRAM latency”
A: Transaction request may be delayed in Queue
B: Transaction request sent to Memory Controller
C: Transaction converted to Command Sequences (may be queued)
D: Command/s Sent to DRAM
E1: Requires only a CAS or
E2: Requires RAS + CAS or
E3: Requires PRE + RAS + CAS
F: Transaction sent back to CPU
“DRAM Latency” = A + B + C + D + E + F
[Figure: CPU, memory controller and DRAM with the path segments labelled A to F.]
Note: DRAM “latency” isn’t deterministic because of CAS or RAS+CAS, and there may be significant queuing delays within the CPU and the memory controller.
Each transaction has some overhead. Some types of overhead cannot be pipelined. This means that in general, longer bursts are more efficient.
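The three cases for component E can be written as a function of the bank state at the time of the request. This is a simplified sketch: the state names (HIT, CLOSED, CONFLICT) and the timing values are assumptions introduced here, not terms from the slide, and queuing inside the DRAM is ignored.

```python
from enum import Enum


class RowState(Enum):
    HIT = "E1"       # target row already open: CAS only
    CLOSED = "E2"    # bank precharged, no row open: RAS + CAS
    CONFLICT = "E3"  # a different row is open: PRE + RAS + CAS


def dram_access_ns(state, t_rp, t_rcd, t_cas):
    """Component E of the latency sum for the given bank state."""
    if state is RowState.HIT:
        return t_cas
    if state is RowState.CLOSED:
        return t_rcd + t_cas
    return t_rp + t_rcd + t_cas


def total_latency_ns(a, b, c, d, e, f):
    """'DRAM Latency' = A + B + C + D + E + F."""
    return a + b + c + d + e + f


for state in RowState:
    e = dram_access_ns(state, t_rp=20, t_rcd=20, t_cas=20)  # assumed timings
    print(state.value, e, total_latency_ns(5, 5, 10, 5, e, 5))
```

The spread between the E1 and E3 cases is one reason the latency seen by the CPU is not a single fixed number.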
DRAM Architecture Basics
PHYSICAL ORGANIZATION
x2 DRAM, x4 DRAM, x8 DRAM
This is per bank … Typical DRAMs have 2+ banks
[Figure: three DRAM organizations (x2, x4, x8), each with row decoder, memory array, bit lines, sense amps, column decoder and data buffers.]
DRAM Architecture Basics
Read Timing for Conventional DRAM
[Figure: timing diagram. Signals: RAS, CAS, Address, DQ. The address bus carries a row address, then a column address; DQ carries valid data out; two accesses are shown. Phases: Row Access, Column Access, Data Transfer.]
Note: let’s look at the interface another way .. the way the data sheets portray it.
[explain]
Main point: the RAS\ and CAS\ signals directly control the latches that hold the row and column addresses ...
DRAM Evolutionary Tree
[Figure: evolutionary tree. Nodes: Conventional DRAM, FPM, EDO, P/BEDO, SDRAM, ESDRAM, Rambus, DDR/2, MOSYS, FCRAM, VCDRAM, Future Trends. Branch labels: (Mostly) Structural Modifications Targeting Throughput; Interface Modifications Targeting Throughput; Structural Modifications Targeting Latency.]
Note: since DRAM’s inception, there has been a stream of changes to the design, from FPM to EDO to Burst EDO to SDRAM. The changes are largely structural modifications -- minor -- that target THROUGHPUT.
[discuss FPM up to SDRAM]
Everything up to and including SDRAM has been relatively inexpensive, especially when considering the pay-off (FPM was essentially free, EDO cost a latch, PBEDO cost a counter, SDRAM cost a slight re-design). However, we’ve run out of “free” ideas, and now all changes are considered expensive ... thus there is no consensus on new directions and a myriad of choices has appeared.
[ do LATENCY mods starting with ESDRAM ... and then the INTERFACE mods ]
DRAM Evolution
Read Timing for Conventional DRAM
[Figure: timing diagram. Signals: RAS, CAS, Address, DQ. The address bus carries a row address, then a column address; DQ carries valid data out; two accesses are shown. Phases: Row Access, Column Access, Transfer Overlap, Data Transfer.]
DRAM Evolution
Read Timing for Fast Page Mode
[Figure: timing diagram. Signals: RAS, CAS, Address, DQ. One row address is followed by three column addresses, each producing valid data out. Phases: Row Access, Column Access, Transfer Overlap, Data Transfer.]
Note: FPM allows you to keep the sense amps active for multiple CAS commands ... much better throughput.
Problem: cannot latch a new value in the column address buffer until the read-out of the data is complete.
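The throughput gain from keeping the sense amps active across several CAS commands can be estimated with a rough timing model. The sketch below is an assumption-laden simplification: it charges one tRCD, one tCAS and one tRP per conventional access, ignores any transfer overlap and minimum tRAS, and uses made-up timing values; the function names are hypothetical.

```python
def conventional_read_ns(n_columns, t_rcd, t_cas, t_rp):
    """Every column read pays a full row access, column access and precharge."""
    return n_columns * (t_rcd + t_cas + t_rp)


def fast_page_mode_read_ns(n_columns, t_rcd, t_cas, t_rp):
    """One row access, then one CAS per column from the open row, then one precharge."""
    return t_rcd + n_columns * t_cas + t_rp


T_RCD, T_CAS, T_RP = 20, 20, 20  # nanoseconds (assumed)

for n in (1, 4, 8):
    conv = conventional_read_ns(n, T_RCD, T_CAS, T_RP)
    fpm = fast_page_mode_read_ns(n, T_RCD, T_CAS, T_RP)
    print(f"{n} columns: conventional {conv} ns, FPM {fpm} ns")
```

In this model the two are equal for a single column, and the row access and precharge cost is amortized as more columns are read from the same row.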
DRAM Evolution
Read Timing for Extended Data Out
[Figure: timing diagram. Signals: RAS, CAS, Address, DQ. One row address is followed by three column addresses, each producing valid data out. Phases: Row Access, Column Access, Transfer Overlap, Data Transfer.]
Note: solution to that problem -- instead of simple tri-state buffers, use a latch as well.
By putting a latch after the column mux, the next column address command can begin sooner.
DRAM Evolution
Read Timing for Burst EDO
[Figure: timing diagram. Signals: RAS, CAS, Address, DQ. One row address and one column address are followed by four valid data beats. Phases: Row Access, Column Access, Transfer Overlap, Data Transfer.]
Note: by driving the col-addr latch from an internal counter rather than an external signal, the minimum cycle time for driving the output bus was reduced by roughly 30%.
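The internal counter that replaces externally supplied column addresses can be sketched as a simple address generator. This is an illustration only: the linear increment and the wrap at the end of the row are assumptions made here, since the slide does not describe the burst ordering, and the function name is hypothetical.

```python
def burst_addresses(start_column, burst_length, columns_per_row):
    """Column addresses produced by an internal counter after one external column address.

    Assumes a linear count that wraps at the end of the row (not specified in the source).
    """
    return [(start_column + i) % columns_per_row for i in range(burst_length)]


# One external column address, then the counter supplies the next three.
print(burst_addresses(start_column=8, burst_length=4, columns_per_row=512))
print(burst_addresses(start_column=510, burst_length=4, columns_per_row=512))
```

Only the first address has to cross the external address bus; the remaining beats are paced by the device itself, which is where the cycle-time saving comes from.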
DRAM Evolution
Read Timing for Pipeline Burst EDO
[Figure: timing diagram. Signals: RAS, CAS, Address, DQ. One row address and one column address are followed by four valid data beats. Phases: Row Access, Column Access, Transfer Overlap, Data Transfer.]
Note: “pipeline” refers to the setting up of the read pipeline ... first CAS\ toggle latches the column address, all following CAS\ toggles drive data out onto the bus. Therefore data stops coming when the memory controller stops toggling CAS\.
DRAM Evolution
Read Timing for Synchronous DRAM
(RAS + CAS + OE ... == Command Bus)
[Figure: timing diagram. Signals: Clock, Command, Address, DQ. ACT with a row address, then READ with a column address, followed by four valid data beats; RAS and CAS phases are marked. Phases: Row Access, Column Access, Transfer Overlap, Data Transfer.]
Note: main benefit: frees up the CPU or memory controller from having to control the DRAM’s internal latches directly ... the controller/CPU can go off and do other things during the idle cycles instead of “wait” ... even though the time-to-first-word latency actually gets worse, the scheme increases system throughput.
DRAM Evolution
Inter-Row Read Timing for ESDRAM
[Figure: two timing diagrams (Clock, Command, Address, DQ), each showing a read followed by a read to a different row of the same bank, with the commands ACT, READ, PRE, ACT, READ and four valid data beats per read. Top: “Regular” CAS-2 SDRAM, R/R to same bank. Bottom: ESDRAM, R/R to same bank.]
Note: output latch on EDO allowed you to start CAS sooner for next access (to same row).
Latch whole row in ESDRAM -- allows you to start precharge & RAS sooner for the next page access -- HIDE THE PRECHARGE OVERHEAD.
DRAM Evolution
Write-Around in ESDRAM
(can second READ be this aggressive?)
[Figure: two timing diagrams (Clock, Command, Address, DQ) for a read, write, read sequence. Commands shown: ACT, READ, PRE, WRITE. Top: “Regular” CAS-2 SDRAM, R/W/R to same bank, rows 0/1/0. Bottom: ESDRAM, R/W/R to same bank, rows 0/1/0.]
Note: neat feature of this type of buffering: write-around.
DRAM Evolution
Internal Structure of Virtual Channel
Segment cache is software-managed, reduces energy
[Figure: Bank A and Bank B with row decoder and sense amps; a 2Kbit path from the sense amps to 16 “channels” (segments), each a 2Kb segment; Sel/Dec; a path # DQs wide to the input/output buffer and the DQs. Operations: Activate, Prefetch, Restore, Read, Write.]
Note: main thing ... it is like having a bunch of open row buffers (a la rambus), but the problem is that you must deal with the cache directly (move into and out of it), not the DRAM banks ... adds an extra couple of cycles of latency ... however, you get good bandwidth if the data you want is in cache, and you can “prefetch” into cache ahead of when you want it ... originally targeted at reducing latency, now that SDRAM is CAS-2 and RCD-2, this makes sense only in a throughput way.
DRAM Evolution
Internal Structure of Fast Cycle RAM
Reduces access time and energy/access
SDRAM: 8M array (8Kr x 1Kb), 13 bits into the row decoder, tRCD = 15ns (two clocks)
FCRAM: 8M array (?), 15 bits into the row decoder, tRCD = 5ns (one clock)
[Figure: side-by-side SDRAM and FCRAM arrays, each with row decoder and sense amps.]
Note: FCRAM opts to break up the data array .. only activate a portion of the word line.
8K rows requires 13 bits to select ... FCRAM uses 15 (assuming the array is 8k x 1k ... the data sheet does not specify).
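The 13 versus 15 bit comparison is a base-2 logarithm. The sketch below works it through under the same assumption the notes make, an 8K x 1K array, which the data sheet does not confirm. The reading that the two extra bits pick one of four word line segments is likewise an inference, and the names are hypothetical.

```python
def select_bits(n_choices):
    """Number of address bits needed to pick one of n_choices (n_choices >= 1)."""
    return (n_choices - 1).bit_length()


ROWS = 8 * 1024        # 8K rows
ROW_WIDTH_BITS = 1024  # 1K bits per row (assumed, as in the notes)

sdram_bits = select_bits(ROWS)                      # bits to select a whole row
fcram_bits = 15                                     # from the slide
segments_per_row = 2 ** (fcram_bits - sdram_bits)   # extra bits subdivide the word line
bits_activated = ROW_WIDTH_BITS // segments_per_row

print(sdram_bits, segments_per_row, bits_activated)
```

Under these assumptions each FCRAM activation drives a quarter of the word line, which is consistent with the claimed reduction in access time and energy per access.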
DRAM Evolution
Internal Structure of MoSys 1T-SRAM
[Figure: array of many small banks with addr and bank select inputs, a cache ($), auto refresh logic, and DQs.]
Note: MoSys takes this one step further ... DRAM with an SRAM interface & speed but DRAM energy.
[physical partitioning: 72 banks]
Auto refresh -- how to do this transparently? The logic moves through the arrays, refreshing them when not active.
But what if one bank gets repeated access for a long duration? All other banks will be refreshed, but that one will not.
Solution: they have a bank-sized CACHE of lines ... in theory, should never have a problem (magic).