Memory Interfaces & Controllers - Sandeep Kulkarni, Lattice

Post on 22-Apr-2015


Memories; interfaces & controllers

Sandeep Kulkarni, Area Technical Manager

Memory Types

Digital Memory
• Volatile
– SRAM: Does not require refresh, access is easier. Special types based on access methods. Used for faster access and low power.
– DRAM: Dynamic RAM, requires periodic refreshing. Uses a transistor and a capacitor to store charge. Compact and denser.
• Non-volatile
– EEPROM: Byte erasable, limited write cycles, faster read, serial/parallel.
– FLASH: NOR & NAND types, block erase, lower cost, denser, serial/parallel.

SRAM sub-types & applications

• Async: up to 32 Mb, fast (8 ns)
• QDRII: up to 333 MHz, concurrent R/W, burst support, DDR data
• FIFO: sync/async, 250 MHz
• DPRAM/MPM: random access, up to 200 MHz
• CAM: associative, returns address based on data search
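
The CAM entry can be illustrated with a small behavioural model. This is a minimal sketch in Python, not a hardware description; the class name SimpleCAM and its methods are hypothetical and chosen only for illustration. It shows the inversion relative to ordinary RAM: the lookup key is the data word and the result is the address where that word is stored.

```python
class SimpleCAM:
    """Behavioural model of a content-addressable memory (hypothetical)."""

    def __init__(self, depth):
        self.cells = [None] * depth  # one stored word per address

    def write(self, address, data):
        self.cells[address] = data

    def search(self, data):
        """Return the lowest address holding `data`, or None on a miss."""
        for address, stored in enumerate(self.cells):
            if stored is not None and stored == data:
                return address
        return None


cam = SimpleCAM(depth=8)
cam.write(3, 0xBEEF)
cam.write(5, 0xCAFE)
hit = cam.search(0xCAFE)    # search by content, get an address back
miss = cam.search(0x1234)   # content not stored anywhere
```

A hardware CAM compares all entries in parallel; the sequential loop here only models the result, not the speed.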

SDRAM memory subtypes

• SDR: up to 133 MHz, LVCMOS, used in embedded systems
• DDR: up to 200 MHz, SSTL2, source synchronous
• DDR2: up to 400 MHz, SSTL18, differential strobe
• DDR3: up to 800 MHz, SSTL15, fly-by architecture
• RLDRAM/2: reduced latency, 533 MHz, high bandwidth, high density, SRAM-like random access
• LPDDR/2: low power, up to 400 MHz

FPGA On-Chip RAM

• FPGA has primarily 2 types of on-chip RAM
– Block RAM
» SRAM memory block of size 9K/18K/36K
» Supports multiple modes of operation: ROM/RAM/DPRAM/FIFO etc.
» Parameterisable aspect ratios, cascadable
» Fast, up to 600 MHz
– Distributed RAM
» LUT configured as memory: 4-input LUT = 16x1
» Localized, very fast & efficient
» Supports multiple modes of operation: ROM/RAM/DPRAM/FIFO
» Cascadable, used for shallow/small memory requirements
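
The "4-input LUT = 16x1" point can be made concrete with a short model. This is a sketch under simple assumptions, not vendor code: the class LutRam16x1 and the helper read_word are hypothetical names. The four LUT inputs act as a 4-bit address into 16 configuration bits, and several LUTs placed side by side (cascaded in width) form a wider shallow memory.

```python
class LutRam16x1:
    """A 4-input LUT viewed as a 16-entry, 1-bit-wide memory (illustrative)."""

    def __init__(self, init=0):
        self.bits = init & 0xFFFF  # the 16 configuration bits

    def read(self, addr):
        return (self.bits >> (addr & 0xF)) & 1

    def write(self, addr, bit):
        mask = 1 << (addr & 0xF)
        self.bits = (self.bits | mask) if bit else (self.bits & ~mask)


def read_word(luts, addr):
    """Width cascade: LUT i supplies bit i of the word at `addr`."""
    return sum(lut.read(addr) << i for i, lut in enumerate(luts))


# Four LUTs form a 16x4 memory.
ram16x4 = [LutRam16x1() for _ in range(4)]
value = 0b1010
for i, lut in enumerate(ram16x4):
    lut.write(7, (value >> i) & 1)
word = read_word(ram16x4, 7)
```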

On chip flash - FlashBAK Technology

[Diagram: Flash, FPGA EBR, FPGA Logic and JTAG/SPI port]

• Make infinite reads & writes to EBR at speeds of up to 350 MHz
• Write to Flash during programming (JTAG/SPI port)
• Write from Flash to EBRs during configuration
• Write from EBRs to Flash on user command
• Use FlashBAK to store:
– Error codes, POST results, serial numbers and uP code
• Erase and reprogram Flash in <3 seconds
• sysMEM EBR 166 to 885 Kbits
• Unlimited random read and write capability through EBR
• Other types are SerialTag, UFM etc.

Memory in Typical Networking Application

Memory Organization – DDR2

Source: Micron

Read Cycle – DDR2

DDR2 Access

[Diagram panels: Read from memory; Write to memory]

• Source-synchronous data (DQ) from memory is edge aligned w.r.t. the strobe (DQS).
• Data writes to memory have to be centre aligned.
• Tight timing budget: the data valid window at 266 MHz is ~1 ns. Precise timing control is crucial.
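
To put the 266 MHz figure in context, the basic intervals can be worked out directly. The sketch below is plain arithmetic with a hypothetical helper name (ddr_timing_ns). It computes the clock period, the DDR bit time (half a period) and the quarter-period delay that moves an edge-aligned strobe to the centre of the data bit.

```python
def ddr_timing_ns(clock_mhz):
    """Return (clock period, bit time, quarter-period shift) in ns."""
    period = 1000.0 / clock_mhz   # one full clock cycle
    bit_time = period / 2.0       # DDR transfers on both clock edges
    quarter = period / 4.0        # 90-degree strobe shift
    return period, bit_time, quarter


period, bit_time, quarter = ddr_timing_ns(266.0)
print(f"period   = {period:.2f} ns")    # 3.76 ns
print(f"bit time = {bit_time:.2f} ns")  # 1.88 ns
print(f"90 deg   = {quarter:.2f} ns")   # 0.94 ns
```

The ideal bit time is about 1.88 ns; the ~1 ns valid window quoted above is presumably what remains once skew, jitter and setup/hold margins are subtracted.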

DDR2 IO implementation

• To capture read data properly, data strobe alignment has to be performed in the FPGA I/Os. It should be compensated for PVT and work over a wide range of frequencies. Multiple techniques exist to accomplish this.

DQSDLL+DQSBUF Method

• Dedicated circuitry in the IOB takes care of the data strobe alignment

[Diagram: READ path with DQSI and SCLK]

• DQSDLL provides a digital delay code for a PVT-compensated 90-degree shift
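
One way to picture a PVT-compensated delay code is sketched below. This models a generic delay-locked-loop idea, not the actual DQSDLL circuit: the tap delays, function names and lock procedure are all assumptions made for illustration. The loop counts how many delay taps fit in one clock period, and a quarter of that count becomes the code for the 90-degree DQS shift. Because the count is measured against the clock itself, the resulting delay stays close to a quarter period even when the tap delay drifts with process, voltage and temperature.

```python
def dll_lock(period_ps, tap_delay_ps):
    """Number of whole delay taps that fit in one clock period."""
    return period_ps // tap_delay_ps


def quarter_shift_code(period_ps, tap_delay_ps):
    """Delay code for a 90-degree shift: a quarter of the locked tap count."""
    return dll_lock(period_ps, tap_delay_ps) // 4


PERIOD_PS = 3760  # roughly a 266 MHz clock

# Hypothetical tap delays at fast, typical and slow PVT corners.
for corner, tap_ps in [("fast", 20), ("typical", 25), ("slow", 32)]:
    code = quarter_shift_code(PERIOD_PS, tap_ps)
    # The code changes per corner, but code * tap_ps stays within
    # one tap of the ideal 940 ps quarter period.
    print(corner, code, code * tap_ps)
```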

DDR Registers in IOB

• The IOB contains DDR registers to perform
– DDR to SDR
– Half clock transfer
– Synchronization & clock transfer
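
The DDR-to-SDR step can be sketched as a simple demultiplexer. This is a behavioural illustration only; the function name ddr_to_sdr is hypothetical and the model ignores the half-clock transfer and clock-domain synchronization that the IOB registers also perform. Bits captured on rising and falling edges are separated so that the fabric sees two bits per clock cycle at single data rate.

```python
def ddr_to_sdr(ddr_bits):
    """Split a DDR bit stream into rising-edge and falling-edge SDR streams.

    ddr_bits holds one bit per clock edge, starting with a rising edge.
    Returns one (rising, falling) pair per full clock cycle.
    """
    if len(ddr_bits) % 2:
        raise ValueError("expected an even number of edge samples")
    rising = ddr_bits[0::2]    # samples taken on rising edges
    falling = ddr_bits[1::2]   # samples taken on falling edges
    return list(zip(rising, falling))


pairs = ddr_to_sdr([1, 0, 0, 1, 1, 1])
```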

IOB DDR Data Transfer timing diagram

Abstraction

• Memory controllers offer abstraction and ease of use to the designer
• Can be parameterized to support many types of memories, data widths, speeds etc.
• Takes care of initializing the memory
• Tracks the reads/writes and controls refresh
• Takes care of the memory timing requirements
• Offers a complete data/command/address interface to the user for integration in the design
• Command queuing and command bursts improve bus utilization and throughput
• Intelligent bank management to optimize performance
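
The bank management bullet can be illustrated with a small open-page model. This is a sketch under stated assumptions rather than how any particular controller IP works: the class BankTracker and its method are hypothetical, and timing, refresh and command queuing are left out. The tracker remembers which row is open in each bank and issues only the commands an access actually needs.

```python
class BankTracker:
    """Tracks the open row in each bank (open-page policy, illustrative)."""

    def __init__(self, num_banks):
        self.open_row = [None] * num_banks

    def commands_for(self, bank, row, op="READ"):
        """Return the DRAM command sequence needed for one access."""
        cmds = []
        current = self.open_row[bank]
        if current != row:
            if current is not None:
                cmds.append("PRECHARGE")  # close the wrong row first
            cmds.append("ACTIVATE")       # open the requested row
            self.open_row[bank] = row
        cmds.append(op)
        return cmds


tracker = BankTracker(num_banks=8)
first = tracker.commands_for(bank=0, row=10)   # bank idle: ACTIVATE, READ
second = tracker.commands_for(bank=0, row=10)  # row hit: READ only
third = tracker.commands_for(bank=0, row=42)   # row miss: PRECHARGE, ACTIVATE, READ
```

Accesses that hit an already open row skip PRECHARGE and ACTIVATE, which is where the performance gain comes from.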

Typical DDR Memory Controller Block Diagram

Memory Controller User Interface

• Local interface signal groups simplify operation
– Initialization & Auto Refresh
– Command & Addr
– Data Write
– Data Read
• Example command interface

USER Commands & Data R/W

Data Write on User Interface

USER Commands

READ Data on User Interface

DDR Memory controller implementation

1. Core generation (using IPexpress)
2. Simulation (eval scripts)
3. Implementation (synthesis & PAR)
4. Result evaluation (utilization, static timing)
5. Pinout validation (PCB layout)
6. Backend design

Comparison of DDR Memory Standards

DDR3 Advantages

• Lower Power
– 1.5V
• Higher Speed
– 400MHz ~ 800MHz
• Master Reset
– Initialization
• More Performance
– 2x DDR2
• Larger Densities
– 8Gb/32GB

DDR3 Power Advantage

• Supply voltage reduced from 1.8V to 1.5V
– More than 15% power saving
• Slower core speed
– DDR2-800: DDR2 (400MHz) / Core (200MHz)
– DDR3-800: DDR3 (400MHz) / Core (100MHz)
• Lower I/O buffer power
– 34 ohm driver vs. 18 ohm driver (DDR2)
• ~25 to 30% lower power than the same performance DDR2

DDR3 8n-Prefetch Architecture
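
The core speeds quoted for DDR2-800 and DDR3-800 follow from the prefetch depth. The sketch below is simple arithmetic with a hypothetical helper name (core_clock_mhz): with an n-bit prefetch, the memory array delivers n bits per pin on each core access, so the core only has to run at the data rate divided by n.

```python
def core_clock_mhz(data_rate_mtps, prefetch):
    """Array (core) clock when `prefetch` bits per pin are fetched per access."""
    return data_rate_mtps / prefetch


ddr2_core = core_clock_mhz(800, prefetch=4)  # DDR2 uses a 4n prefetch
ddr3_core = core_clock_mhz(800, prefetch=8)  # DDR3 uses an 8n prefetch
print(ddr2_core, ddr3_core)  # 200.0 100.0
```

At the same 800 MT/s data rate, the DDR3 core runs at half the DDR2 core frequency.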

DDR3 High Speed Signaling

• Fly-by routing
• Write and Read Leveling
• ZQ Calibration through ZQ resistor
• Dynamic ODT for improved WRITE signaling
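
Write leveling can be sketched as a delay sweep. With fly-by routing the clock reaches each DRAM at a different time, so the controller adjusts the DQS delay per byte lane until DQS and CK line up at the device. In the simplified model below the DRAM samples CK on the rising edge of DQS and reports the level back, and the controller steps the delay until the feedback changes from 0 to 1. The function names, step size and skew value are hypothetical, and jitter and filtering are ignored.

```python
def dram_sample_ck(dqs_delay_ps, ck_skew_ps, period_ps):
    """What the DRAM feeds back: CK level seen at the rising edge of DQS."""
    phase = (dqs_delay_ps - ck_skew_ps) % period_ps
    return 1 if phase < period_ps / 2 else 0


def write_level(ck_skew_ps, period_ps, step_ps=10):
    """Increase the DQS delay until the feedback changes from 0 to 1."""
    previous = dram_sample_ck(0, ck_skew_ps, period_ps)
    for delay in range(step_ps, 2 * period_ps, step_ps):
        sample = dram_sample_ck(delay, ck_skew_ps, period_ps)
        if previous == 0 and sample == 1:
            return delay  # DQS rising edge now lines up with CK rising edge
        previous = sample
    return None


# Hypothetical numbers: 1250 ps period (800 MHz clock), 430 ps fly-by skew.
aligned = write_level(ck_skew_ps=430, period_ps=1250)
```

In this model the sweep settles on a delay equal to the assumed fly-by skew, to within one step.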

APPENDIX

Market Trends-Technology transition

Source: iSuppli

Market trends-Price per bit

Source: Microsoft

Key Memory Timing parameters

• CAS Latency: CL
– The time between sending a column address to the memory and the beginning of the data in response. This is the time it takes to read the first bit of memory from a DRAM with the correct row already open.
• ACTIVATE-to-READ or WRITE delay: tRCD
– The number of clock cycles required between opening a row of memory and accessing columns within it. The time to read the first bit of memory from a DRAM without an active row is tRCD + CL.
• PRECHARGE period: tRP
– The number of clock cycles required between issuing the precharge command and opening the next row. The time to read the first bit of memory from a DRAM with the wrong row open is tRP + tRCD + CL.
• ACTIVATE-to-PRECHARGE delay: tRAS
– The number of clock cycles required between a bank active command and issuing the precharge command. This is the time needed to internally refresh the row, and it overlaps with tRCD. Typically approximately equal to the sum of the previous three numbers.
• Others: tRC, tRRD, tRFC, tRTP, tWTR etc.
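
The three read-latency cases described above (CL, tRCD + CL, tRP + tRCD + CL) can be captured in a few lines. This is a minimal sketch: the function name, the state labels and the 5-5-5 timing values are illustrative assumptions, and real values come from the device data sheet.

```python
def read_latency_cycles(state, cl, trcd, trp):
    """Clock cycles to the first read data for a given bank state."""
    if state == "row_hit":       # correct row already open
        return cl
    if state == "row_closed":    # no active row in the bank
        return trcd + cl
    if state == "row_miss":      # a different row is open
        return trp + trcd + cl
    raise ValueError(f"unknown bank state: {state}")


# Illustrative 5-5-5 timings (CL-tRCD-tRP).
CL, TRCD, TRP = 5, 5, 5
for state in ("row_hit", "row_closed", "row_miss"):
    print(state, read_latency_cycles(state, CL, TRCD, TRP))
```

With these example values a row miss costs three times the latency of a row hit.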