Chung Huang, Amjad Qureshi, and Kishore Kasamsetty

Complexities in Developing a High-Performance DDR Subsystem at 3200Mbps on 16FF+ and 10FF

2 © 2015 Cadence Design Systems, Inc. All rights reserved.

Table of contents

•  Challenge of timing budget
•  Challenge of system uncertainty vs. training/calibration
•  Challenge of signal and power integrity
•  T16FF+ LP4-PoP 3200 test chip data
•  T16FF+ LP4-DSC 3200 test chip data
•  10nm vs. 16FF data

Challenge of high-speed DDR timing budget

[Figure: DDR4 SDRAM and Cadence PHY. Panels: Critical Timing – DDR4 Timing Budget Breakdown; Shrinking Trend of Read Eye Window]

DDR subsystem timing budget trend (read timing)

[Figure: bar chart "DRAM spec trend (read timing)" for LPDDR2-1066, LPDDR3-1866, DDR3-2133, DDR4-2400 and LPDDR4-3200, with legend entries DRAM spec, PHY and channel; axis ticks from 0 to 600]

•  The DRAM spec consumes the largest percentage of the timing budget
•  The channel budget gets worse in absolute picoseconds as frequency scales
•  The SoC needs to pick up the slack in DRAM timing
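To put the shrinking read window in numbers, the unit interval (UI, one bit time) for each speed grade in the chart follows directly from the nominal data rate. This is a minimal sketch: the function name is illustrative, and the split between DRAM spec, PHY and channel is not modeled because the transcript does not give those values.

```python
# Unit interval (one bit time) for each speed grade named on the slide.
# UI [ps] = 1e6 / data rate [Mbps]. All budget items (DRAM spec, PHY,
# channel) have to fit inside this window.

SPEED_GRADES_MBPS = {
    "LPDDR2-1066": 1066,
    "LPDDR3-1866": 1866,
    "DDR3-2133": 2133,
    "DDR4-2400": 2400,
    "LPDDR4-3200": 3200,
}


def unit_interval_ps(data_rate_mbps: float) -> float:
    """Return the bit time in picoseconds for a data rate in Mbps."""
    if data_rate_mbps <= 0:
        raise ValueError("data rate must be positive")
    return 1e6 / data_rate_mbps


if __name__ == "__main__":
    for name, rate in SPEED_GRADES_MBPS.items():
        print(f"{name:12s} UI = {unit_interval_ps(rate):6.1f} ps")
```

At 3200 Mbps the whole UI is 312.5 ps, so any fixed amount of channel or DRAM uncertainty takes a larger fraction of it than at the lower speed grades.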

VT drift implications on training and leveling

DDR training offsetting system uncertainty

Item  Training                    Uncertainty in question
1     CA (Vref)                   DRAM Vref variation
2     CA (data eye)               Delay variation across CA bit lines against CLK
3     Read gate                   Read DQS preamble placement
4     Read data eye               Delay variation across DQ bit lines against DQS
5     Write leveling              WR DQS-CLK timing variation
6     Write DQ                    Delay variation across DQ bit lines against DQS
7     DQ VREF                     DRAM Vref variation
8     Per-bit deskew              Delay variation across bit lines
9     IO PVT calibration          Process variation, VT drift of IO impedance/Ibias
10    Delay line PVT calibration  Process variation, VT drift of delay line against CLK

DDR training supported by JEDEC spec
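Several of the trainings listed (CA data eye, read data eye, write DQ) come down to the same idea: sweep a delay setting, record pass or fail at each setting, and park the delay at the center of the passing window. The sketch below assumes the sweep results are already available as a list of booleans. The function names are hypothetical and no real PHY register interface is modeled.

```python
from typing import Optional, Sequence, Tuple


def longest_passing_window(results: Sequence[bool]) -> Optional[Tuple[int, int]]:
    """Return (first_tap, last_tap) of the longest run of passing taps.

    Returns None when no tap passes. On a tie the earliest window wins.
    """
    best: Optional[Tuple[int, int]] = None
    start: Optional[int] = None
    last_index = len(results) - 1
    for tap, ok in enumerate(results):
        if ok and start is None:
            start = tap  # a new passing run begins here
        if start is not None and (not ok or tap == last_index):
            end = tap if ok else tap - 1  # the run ends here
            if best is None or (end - start) > (best[1] - best[0]):
                best = (start, end)
            start = None
    return best


def trained_tap(results: Sequence[bool]) -> Optional[int]:
    """Return the delay tap at the center of the widest passing window."""
    window = longest_passing_window(results)
    if window is None:
        return None
    first, last = window
    return (first + last) // 2


if __name__ == "__main__":
    # Hypothetical sweep: taps 2..6 pass, tap 9 is an isolated pass.
    sweep = [False, False, True, True, True, True, True,
             False, False, True, False]
    print(longest_passing_window(sweep), trained_tap(sweep))
```

Centering maximizes the margin on both sides of the sampling point, which is what lets a later VT drift eat into the window without causing an immediate failure.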

Minimizing supply noise improves clock and data jitter

•  Supply noise induces jitter along the clock or data path across multiple voltage domains

•  Clock or data jitter consumes timing budget, which forces the bus to run at a lower speed

[Figure: DDR4-3200 with non-ideal VDDQ vs. DDR4-3200 with ideal VDDQ = 1.2 V; supply noise annotated]

DDR supply noise reduction technique

•  Proper die/package/board decap per speed bin

•  PLL wake-up power sequencing

•  Die-level voltage domain isolation: VDDPLL, VDDCLK, VDDQ, VDD

•  SPG (signal:power:ground) ratio recommendation (after trade-off between pin count and SI/PI)

•  DBI mode support per JEDEC standards (DDR4/LPDDR3)
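DBI helps with supply noise because it bounds how many data pins draw current at the same time. The sketch below assumes the DDR4 convention: when more than four bits of a byte lane are low, the byte is inverted and DBI_n is driven low to flag the inversion. The function names are illustrative and this is not taken from the slides.

```python
from typing import Tuple


def dbi_encode(byte: int) -> Tuple[int, int]:
    """Encode one byte lane, DDR4-style.

    Returns (dbi_n, data_out). dbi_n == 0 means data_out is inverted.
    """
    if not 0 <= byte <= 0xFF:
        raise ValueError("byte must be in 0..255")
    zeros = 8 - bin(byte).count("1")
    if zeros > 4:
        return 0, byte ^ 0xFF  # invert so that fewer pins are driven low
    return 1, byte


def dbi_decode(dbi_n: int, data: int) -> int:
    """Recover the original byte from (dbi_n, data)."""
    return data ^ 0xFF if dbi_n == 0 else data


if __name__ == "__main__":
    for value in (0x00, 0x07, 0x0F, 0xFF):
        dbi_n, out = dbi_encode(value)
        print(f"in=0x{value:02X} dbi_n={dbi_n} out=0x{out:02X}")
```

With this rule no encoded byte carries more than four low data bits, which caps the worst-case simultaneous switching current per byte lane on a VDDQ-terminated bus.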

[Figure: DDR supply domain isolation. Labels: PKG, PCB, DDR PHY, Vreg1, Vreg2, VDDA (PLL), VDD (PHY/IO), VDDQ (IO)]

DDR bump SPG 2:1:1 ratio

[Figure: bump map. Signal bumps PAD_MEM_DATA[0] to PAD_MEM_DATA[15], PAD_MEM_DM[0], PAD_MEM_DM[1], PAD_MEM_DQS_P[0], PAD_MEM_DQS_M[0], PAD_MEM_DQS_P[1] and PAD_MEM_DQS_M[1], interleaved with VDDQ, VDD, VDDPLL1, VDDPLL2 and VSS bumps]

DDR BGA SPG 4:1:1 ratio

DQ[3]  VDDQ  DQ[2]  DQ[1]  VDDQ  DQ[0]
DQ[7]  VSS   DQ[6]  DQ[5]  VSS   DQ[4]
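The 4:1:1 figure can be checked by counting the two BGA rows shown on this slide. A small sketch, assuming the naming convention visible here (VSS for ground, anything starting with VDD for power, everything else a signal). The helper names are illustrative.

```python
from collections import Counter
from functools import reduce
from math import gcd
from typing import Iterable, Tuple

# The two BGA rows as they appear on the slide.
BGA_ROWS = [
    ["DQ[3]", "VDDQ", "DQ[2]", "DQ[1]", "VDDQ", "DQ[0]"],
    ["DQ[7]", "VSS", "DQ[6]", "DQ[5]", "VSS", "DQ[4]"],
]


def classify(ball: str) -> str:
    """Return 'G' for ground, 'P' for power, 'S' for signal."""
    if ball.startswith("VSS"):
        return "G"
    if ball.startswith("VDD"):
        return "P"
    return "S"


def spg_ratio(balls: Iterable[str]) -> Tuple[int, int, int]:
    """Return the signal:power:ground ratio reduced to lowest terms."""
    counts = Counter(classify(b) for b in balls)
    s, p, g = counts["S"], counts["P"], counts["G"]
    divisor = reduce(gcd, (s, p, g))
    if divisor == 0:  # empty input
        return (0, 0, 0)
    return (s // divisor, p // divisor, g // divisor)


if __name__ == "__main__":
    balls = [ball for row in BGA_ROWS for ball in row]
    print("S:P:G =", ":".join(str(n) for n in spg_ratio(balls)))
```

Eight DQ balls share two VDDQ and two VSS balls, which reduces to 4:1:1.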

[Figure labels: VDDQCK (CK), LPF, LPF]

Importance of on-die decoupling

On-die decap sources:
•  Device cap
•  MIM cap
•  Metal cap

Die decap:
•  21 pF per IO
•  50 pF per IO
•  80 pF per IO
•  107 pF per IO
•  135 pF per IO

DDR4 PHY

[Figure: ZPDN_VDDQ (Die+PKG+PCB) vs. on-die decap density; PDN input impedance]

•  Die decap affects the Q of the high-frequency PDN resonance (typically ~100-800 MHz)

•  Adding on-die decap suppresses the peaking amplitude and lowers the peaking frequency
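Both effects fall out of a first-order lumped model in which the on-die capacitance C resonates with the series loop inductance L and resistance R of the package and board: f_res = 1/(2π√(LC)) and Q = √(L/C)/R. The sketch below sweeps the decap densities listed on the slide. The values of L, R and the IO count are hypothetical placeholders, not data from the presentation, and a real PDN has several resonances that this model ignores.

```python
import math
from typing import List, Tuple

# Hypothetical values chosen only for illustration (not from the slides).
L_LOOP_H = 100e-12     # effective package + board loop inductance
R_SERIES_OHM = 0.05    # lumped series resistance of that loop
N_IO = 16              # number of IOs sharing the VDDQ rail

# On-die decap densities listed on the slide, in pF per IO.
DECAP_PF_PER_IO = [21, 50, 80, 107, 135]


def pdn_resonance(l_henry: float, c_farad: float, r_ohm: float) -> Tuple[float, float]:
    """Return (resonant frequency in Hz, Q) of the die-package resonance."""
    f_res_hz = 1.0 / (2.0 * math.pi * math.sqrt(l_henry * c_farad))
    q = math.sqrt(l_henry / c_farad) / r_ohm
    return f_res_hz, q


def sweep_decap() -> List[Tuple[int, float, float]]:
    """Return (pF per IO, f_res in Hz, Q) for each listed decap density."""
    rows = []
    for pf_per_io in DECAP_PF_PER_IO:
        c_total = pf_per_io * 1e-12 * N_IO
        f_res, q = pdn_resonance(L_LOOP_H, c_total, R_SERIES_OHM)
        rows.append((pf_per_io, f_res, q))
    return rows


if __name__ == "__main__":
    for pf_per_io, f_res, q in sweep_decap():
        print(f"{pf_per_io:4d} pF/IO  f_res = {f_res / 1e6:7.1f} MHz  Q = {q:5.2f}")
```

In this model both the resonant frequency and Q scale as 1/√C, so every increase in on-die decap moves the peak lower in frequency and flattens it.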

[Figure: block diagram. Labels: data slice (×4), PLL (×5), memclk, adrslice (×2), adrctlslice (×2), IO calibration, DDR4 IO pads (data, AC, Clk), DECAP, Digital PHY]

Benefits of DECAP on data eye on SI/PI

[Figure panels: DQS with non-ideal PDN; DQ with non-ideal PDN; VDDQ with non-ideal PDN; DQS with ideal PDN; DQ with ideal PDN; VDDQ with ideal PDN]

•  Increase decap density → reduce supply noise → reduce jitter

•  Some jitter is correlated between DQ and DQS → differential DQ-to-DQS jitter < single-ended DQ jitter
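The effect of correlation can be made concrete with a simple RMS model. Treating DQ and DQS jitter as random variables with standard deviations σ_DQ and σ_DQS and correlation coefficient ρ, the jitter of DQ relative to DQS is √(σ_DQ² + σ_DQS² − 2ρ·σ_DQ·σ_DQS). This statistical model is an assumption used for illustration and the numbers in the example are made up.

```python
import math


def dq_to_dqs_jitter_rms(sigma_dq: float, sigma_dqs: float, rho: float) -> float:
    """RMS jitter of DQ measured against DQS, for correlation rho in [-1, 1]."""
    if not -1.0 <= rho <= 1.0:
        raise ValueError("rho must be between -1 and 1")
    variance = sigma_dq ** 2 + sigma_dqs ** 2 - 2.0 * rho * sigma_dq * sigma_dqs
    return math.sqrt(max(variance, 0.0))  # guard against rounding below zero


if __name__ == "__main__":
    sigma = 5.0  # hypothetical single-ended RMS jitter in ps
    for rho in (0.0, 0.5, 0.8, 1.0):
        relative = dq_to_dqs_jitter_rms(sigma, sigma, rho)
        print(f"rho = {rho:.1f}  DQ-to-DQS jitter = {relative:.2f} ps")
```

With equal jitter on both signals, the relative jitter drops below the single-ended value once ρ exceeds 0.5, and vanishes when the two are fully correlated. Supply-induced jitter that hits DQ and DQS alike is the kind that cancels.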

[Figure panels, SSO with DBI disabled: decap = 50 pF/IO; decap = 160 pF/IO; ideal PDN]

DDR4 3200 signal integrity challenges

[Figure: DDR 1-DIMM topology. DDR4 3200 Mb/s, 1DPC, 1R, DQ write eye]

[Figure: DDR 2-DIMM topology. DDR4 3200 Mb/s, 2DPC, 1R, DQ write eye at the near DIMM (DIMM0) and at the far DIMM (DIMM1)]

Channel impairment due to ISI and crosstalk at DDR4 3200

At 3200 Mbps, crosstalk consumes a substantial share of the link budget in DIMM topologies.

At 3200 Mbps, the DDR4 channel dominates the link budget. SI becomes critical, and the I/O must meet a very tight budget.

T16FF+ LPDDR4 test chip

Item       Value
Process    TSMC 16FF LL+
Protocol   LPDDR4 3200
Bus width  One x16 LPDDR4 channel, 2 ranks
Package    2-2-2 FC-BGA

[Figures: LPDDR4 die floor plans; CDNS T16FF+ LP4-PoP test board]

T16FF+ LP4-PoP 3.2Gbps silicon correlation

[Figure panels: measured DQ/DQS eye vs. simulated DQ eye; measured DQ (WR burst) vs. simulated DQ (WR burst); measured VDDQ vs. simulated VDDQ]

T16FF+ LP4-Discrete 3.2Gbps silicon correlation

[Figure panels: CDNS T16FF+ LP4-DSC test board; measured DQ/DQS eye vs. simulated DQ eye]

Challenges in measuring and de-embedding DDR signals

[Figure panels: DRAM eye at the pin (large reflection on Rx pin); DRAM eye at the pad (little reflection on Rx pad)]

DDR IO & PHY comparison

                                           IO                              PHY
                                           28HPM     16FF      10FF        28HPM     16FF      10FF
Performance (PHY STA margin or IO jitter)  2667Mbps  3200Mbps  4267Mbps    2667Mbps  3200Mbps  4267Mbps
Area                                       100%      88%       65%         100%      40%       20%

Achieving 3200 performance on 16FF/10FF

Simulation conditions
•  DDR4 memory-down channel
•  34 ohm PU/PD driver strength
•  60 ohm termination
•  Nominal PVT conditions
•  PRBS data pattern
•  3200 Mbps data rate
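The driver and termination values above fix the DC signal levels before any channel effects are added. The sketch below assumes DDR4-style termination to VDDQ, where a driven-low pin forms a resistor divider between the termination and the pull-down, and takes VDDQ = 1.2 V from the earlier supply noise slide. Channel loss, reflections and package parasitics are ignored, and the function name is illustrative.

```python
from typing import Tuple


def pod_dc_levels(vddq: float, r_on: float, r_tt: float) -> Tuple[float, float, float]:
    """Return (v_high, v_low, i_low) for a VDDQ-terminated line.

    v_high: level when the driver pulls up (no DC current flows)
    v_low:  level when the driver pulls down through r_on against r_tt
    i_low:  DC current drawn from VDDQ while the line is held low
    """
    if r_on <= 0 or r_tt <= 0:
        raise ValueError("resistances must be positive")
    v_high = vddq
    v_low = vddq * r_on / (r_on + r_tt)
    i_low = vddq / (r_on + r_tt)
    return v_high, v_low, i_low


if __name__ == "__main__":
    v_high, v_low, i_low = pod_dc_levels(vddq=1.2, r_on=34.0, r_tt=60.0)
    print(f"V_high = {v_high:.3f} V, V_low = {v_low:.3f} V")
    print(f"swing  = {v_high - v_low:.3f} V, I_low = {i_low * 1e3:.2f} mA")
```

Under these assumptions the low level sits at about 0.43 V, the DC swing is about 0.77 V, and each pin held low draws roughly 12.8 mA from VDDQ, which is the per-pin contribution to the transient supply current when many DQ pins switch together.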

[Figure panels: VDDIO transient current; receiving data eye]

Signal integrity modeling and simulation flow

© 2015 Cadence Design Systems, Inc. All rights reserved worldwide. Cadence and the Cadence logo are trademarks of Cadence Design Systems, Inc. in the United States and other countries.