HBM (High Bandwidth Memory) for 2.5D
Dr. Hongshin Jun, SK hynix Inc.
ICT Industry Outlook – Changes & Preparations
ICT Trend & Value-chain
• Big Data, IoT/IoE, Data in Motion, Analytics
• Data center traffic: 3.1 ZB in 2013 → 8.6 ZB in 2018 (23% CAGR)
• Vertical integration toward a service-centric industry
DRAM Scaling Limits
• Process node: ~µm class in 2006 → 1x nm in 2016
• ASP (1Gb equivalent): ~$9 → ~$0.x (CAGR about -30%)
• Number of process steps: CAGR rising from below 5% to above 5%
Evolutionary: follow the scalability requirements of the system with the existing infrastructure (DDR4E, P-DDR4, P-LPDDR4)
Revolutionary: architecture breakthrough to meet future system performance requirements (total optimization, memory hierarchy tiering, stacking technology)
System Architecture Changes
• Shift from computing-intensive to data-intensive workloads (database servers, in-memory computing)
• HPC memory bandwidth per node: 100~500 GB/s in 2015 → 2~4 TB/s in 2019
• Simultaneous changes across data, memory hierarchy, and networks
Role Sharing of Memory Solutions: B/W vs. Capacity
• HBM* covers the bandwidth role, while conventional memory covers the capacity role (* HBM: High Bandwidth Memory)
System & Memory Architecture Projection
• HPC & Server (B/W & Capacity): bandwidth solution + capacity solution
• Network & Graphics (B/W): bandwidth solution
• Client-DT & NB (B/W & Cost): bandwidth solution + cost solution
• Mobile & Wearable (LP, small form factor, B/W & Cost): low-power & bandwidth solution
HBM is positioned as the bandwidth solution across these segments.
General Memory Requirements
Each application has different memory requirements, but the most common are high bandwidth and power based on real-time random operation.
• Graphics, Networking, HPC, and Datacenter each prioritize differently among bandwidth, density, power, and latency, with real-time random operation in common.
Technology Challenges: Bandwidth
HBM can overcome all of DRAM's challenges for high bandwidth.
• Bandwidth challenge: achieving high B/W with many I/O
[Chart: max bandwidth (GB/s) vs. B/W per pin (Gbps) for the GDDRx, LPDDRx, and DDRx roadmaps (GDDR6?, LPDDR5?, DDR5?) compared with HBM1 and HBM2. Source: SK hynix]
Technology Challenges: Power Efficiency
Low speed per pin and low Cio of HBM reduce power consumption and increase power efficiency (a first-order sketch follows below).
[Chart: power consumption (W) for GDDR5 (7.0 Gbps x32), DDR3 (2.0 Gbps x16), DDR4 (2.0 Gbps x16), and HBM1 (1.0 Gbps x1024).]
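As a rough illustration of that claim, the sketch below compares I/O switching power for a few-fast-pins interface and a many-slow-pins interface at the same 128 GB/s. The capacitance, voltage, and activity values are illustrative assumptions (not SK hynix data), and termination power, which HBM's unterminated interface also avoids, is not modeled.

```python
# First-order C*V^2*f sketch of I/O switching power at a fixed 128 GB/s.
# All component values below are illustrative assumptions, not measured data.

def io_power_w(pins: int, gbps_per_pin: float, c_io_pf: float,
               vddq: float, activity: float = 0.5) -> float:
    """P = N * alpha * C * V^2 * f, with toggle frequency f = data rate / 2."""
    f_hz = gbps_per_pin * 1e9 / 2.0
    return pins * activity * (c_io_pf * 1e-12) * vddq ** 2 * f_hz

# Both configurations deliver 1024 Gb/s = 128 GB/s of raw data bandwidth.
gddr5_like = io_power_w(pins=32 * 4, gbps_per_pin=8.0, c_io_pf=4.0, vddq=1.5)  # 4 x32 devices, long PCB traces
hbm_like   = io_power_w(pins=1024,  gbps_per_pin=1.0, c_io_pf=1.0, vddq=1.2)  # 1 stack, short interposer routes
print(f"GDDR5-like I/O switching power: {gddr5_like:.2f} W")
print(f"HBM-like   I/O switching power: {hbm_like:.2f} W")
```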
Power Efficiency: Power Consumption @128 GB/s
[Chart: power at 128 GB/s, normalized to DDR 256Mb = 100%, for DDR 256Mb (2002, 0.4 Gbps), DDR2 1Gb (2003, 0.8 Gbps), DDR3 4Gb (2012, 1.6 Gbps), and DDR4 4Gb (2014, 3.2 Gbps), with generation-over-generation reductions of -57% to -79%. Source: SK hynix]
SK hynix TSV Development History (TSV chronicle and plan for year 2015)
• '08: 4Gb Flash
• '10: 4Gb DRAM DDP-WLP
• '11: 16Gb DRAM 9MCP; 16/32GB DIMM; 4-hi KGSD WIO
• '13: 5mKGSD HBM — volume production of HBM1
• '15 (plan): 9mKGSD HBM2 development; HBM2 universal daisy chain (D/C)
Various TSV development experiences have resulted in real production of HBM1.
Mass Production of the World's 1st HBM
• SK hynix completed qualification for mass production (March '15)
• Worldwide first HBM provider; mass production started in April '15
• HBM2 design wins in progress with major SoCs in multiple markets
(Package photos: bottom view, top view, section view)

SK hynix World-First HBM Products: Public Announcement of HBM1
"It is shipping mass production volumes of 1st generation High Bandwidth Memory (HBM1)."
Read more at http://www.legitreviews.com/sk-hynix-ramps-production-of-high-bandwidth-memory_166147#bVuxuc4p95KCkvb4.99
HBM2 Product Configuration (8Gb based)
• 9mKGSD (8 cores + base die): 8GB per cube, 1024 I/O, 1.0/1.6/2.0 Gbps per pin → 128/204/256 GB/s; usage: HPC, Server; 8/6/4 cubes per system
• 5mKGSD (4 cores + base die): 4GB per cube, 1024 I/O, 1.0/1.6/2.0 Gbps per pin → 128/204/256 GB/s; usage: HPC, Server, Graphics, Network; 4/2/1 cubes per system
• 3mKGSD (2 cores + base die): 2GB per cube, 1024 I/O, 1.0/1.6/2.0 Gbps per pin → 128/204/256 GB/s; usage: Graphics, Cache; 2/1 cubes per system
(The bandwidth figures follow directly from the I/O count and per-pin speed; see the arithmetic sketch below.)
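The bandwidth column can be checked with simple arithmetic: with 1024 I/O, each Gbps of per-pin speed adds 128 GB/s per cube.

```python
# Bandwidth per HBM2 cube: I/O count x per-pin speed (Gbps) / 8 bits per byte.

def cube_bandwidth_gbs(io_count: int, gbps_per_pin: float) -> float:
    return io_count * gbps_per_pin / 8.0

for speed in (1.0, 1.6, 2.0):
    print(f"{speed} Gbps x 1024 I/O -> {cube_bandwidth_gbs(1024, speed):.0f} GB/s")
# 1.0 -> 128, 1.6 -> 205 (listed as 204 in the table), 2.0 -> 256
```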
Industry is moving to the HBM solution as TSV memory
• SoC design wins: 10+ companies across x86, FPGA, and ASIC/ASSP, with further wins (+6, +5, +2) in markets including graphics, network, HPC, and datacenter
• Ecosystem engagement with OEMs, IP providers, EDA vendors, and foundries
The number of design wins counts companies; the number of projects is larger.
HBM Architecture Overview
• 4 Core DRAM dies + 1 Base logic die (Chip on Wafer)
• Base logic die areas: DA, POWER/TSV, PHY; microbump array with a probe pad area (microbumps depopulated)
• 128 I/O per channel, 1024 TSV I/O in total; banks B0~B7 per channel
Items / Target:
• # of stack: 4 (Core) + 1 (Base)
• Ch./slice: 2
• Total Ch. per KGSD: 8
• IO/Ch.: 128
• Total IO/KGSD: 1024 (= 128 x 8)
• Address/CMD: Dual CMD
• Data rate: DDR
Source: D.U Lee, SK hynix, ISSCC 2014
HBM1 Base Die Architecture
• Base die consists of 3 Areas – PHY, TSV, Test Port Area
HBM ballout area: 6,050 x 3,264 μm
Source: D.U Lee, SK hynix, ISSCC 2014
HBM1 Core Design Architecture
• Each core die has 2 channels
• 1 channel consists of 128 TSV I/O with 2n pre-fetch
[Die floorplan of the 2Gb core die: CH-Left and CH-Right channels, each with banks B0~B7 and per-bank YCTRL/XCTRL blocks, a central AWORD, and DWORD 0~3 with 32 I/O each. 1 bank = 2 sub-banks (64Mb) with non-shared I/O between sub-banks.]
Source: D.U Lee, SK hynix, ISSCC 2014
HBM2 Core Architecture: Pseudo Channel
• HBM2 core die supports 4 pseudo channels or 2 channels
• Each channel consists of 2 Pseudo Channels. Only BL4 is supported
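A quick arithmetic check shows why BL4 on a 64-I/O pseudo channel keeps the same access granularity as the 128-I/O legacy channel, assuming the legacy channel's 2n prefetch corresponds to BL2: both return 32 bytes per access.

```python
# Access granularity: I/O width x burst length / 8 bits per byte.

def burst_bytes(io_width_bits: int, burst_length: int) -> int:
    return io_width_bits * burst_length // 8

print(burst_bytes(io_width_bits=128, burst_length=2))  # legacy channel (2n prefetch) -> 32 bytes
print(burst_bytes(io_width_bits=64,  burst_length=4))  # pseudo channel (BL4)         -> 32 bytes
```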
[Block diagram of the 8Gb core die: 8 channels (CH-0 ~ CH-7); each channel is split into two pseudo channels (PS-CH0, PS-CH1) with their own banks and 64 I/O each, sharing ADD/CMD.]
Pseudo Channel Mode
• Pseudo channels share the AWORD (CMD), but have separate banks and independent 64 I/Os
[Timing diagram: in legacy mode, a channel issues ACT/RD on one CMD bus and transfers D0~D3 over 128 DQ (16 banks per channel); in pseudo channel mode, PC0 and PC1 each issue their own ACT/RD and transfer D0~D3 over separate 64-DQ halves (16 banks per pseudo channel).]
Restriction of tFAW in Legacy Mode
• In legacy mode, each channel has a 2KB page size
• Gapless bank activation is restricted by tFAW (four-activate window)
  – Suppose tCK = 2ns, tFAW = 30ns, tRRD = 4ns
  – tFAW = 30ns > 4 banks x tRRD = 16ns → gaps appear between activations, lowering bandwidth efficiency
Benefit of Pseudo Channel
• Pseudo channel has a reduced page size of 1KB (2KB in legacy mode)
• Lower active power (IDD0) due to the 1KB page size
• tEAW (1KB x 8 ACT) is defined instead of tFAW (2KB x 4 ACT)
• Bandwidth improves because more activations fit within the same window (see the worked example below)
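The example below works through the slide's numbers (tCK = 2 ns, tRRD = 4 ns, tFAW = 30 ns). The tEAW window length is an assumption for illustration: the slides only state that tEAW permits 8 activates of 1 KB pages instead of 4 activates of 2 KB pages.

```python
# Activation rate under tFAW (legacy) vs. tEAW (pseudo channel), slide timings.

tRRD_ns = 4.0
tFAW_ns = 30.0   # legacy mode: at most 4 ACTs per rolling 30 ns window
tEAW_ns = 30.0   # assumed equal window length, but 8 ACTs allowed (1 KB pages)

# Legacy mode: 4 back-to-back ACTs only need 4 * tRRD = 16 ns,
# so tFAW forces a 30 - 16 = 14 ns idle gap before the next ACT.
gap_ns = tFAW_ns - 4 * tRRD_ns
print(f"idle gap per tFAW window: {gap_ns:.0f} ns")            # 14 ns

legacy_acts_per_us = min(4 / tFAW_ns, 1 / tRRD_ns) * 1000      # ~133 ACT/us, tFAW-limited
pseudo_acts_per_us = min(8 / tEAW_ns, 1 / tRRD_ns) * 1000      # 250 ACT/us: with 8 ACTs allowed,
                                                               # tRRD becomes the limiter (gapless)
print(f"legacy: {legacy_acts_per_us:.0f} ACT/us, pseudo channel: {pseudo_acts_per_us:.0f} ACT/us")
```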
Mechanical Outline: molded KGSD
• mKGSD (1 base die + 2/4/8 DRAM core dies; molded Known Good Stacked Die)
• (a) Package dimension (X, Y): Gen1 5.48 mm x 7.29 mm; Gen2 7.75 mm x 11.87 mm
• (b) Package body height (Z): Gen1 0.49 mm; Gen2 0.72 mm
• Bump: 25 um CD (as reflow), 55 um pitch
• Micro bump array (MPGA): JEDEC JC11-2.883, JC11-4.884
[Cross-section: silicon stack with side mold on both sides, base die, and micro bumps; (a) and (b) mark the package dimensions and body height.]
HBM 2.5D SiP Structure
• System-in-Package implementation with KGSD
[Cross-section: the 3D memory (HBM) KGSD — four DRAM slices on a base die with PHY, TSVs, DA balls, and side molding — sits next to the SoC on a silicon interposer over the package substrate, with the SoC PHY connected to the HBM PHY through the interposer.]
HBM KGSD Test
[Flow comparison. DRAM component test flow: Wafer → Hot/Cold test, Repair → WFBI → Package process → TDBI → Hot/Cold test → Speed test → PKG. HBM test flow: Wafer → Hot/Cold test, Repair → WFBI → Stack process (KGSD: 4 DRAM dies + 1 logic base die; bump & stack process; JEDEC IEEE1500) → B/I stress (BISS DFT) → Hot/Cold test, Repair → KGSD speed test, Logic test → KGSD.]
• Dynamic stress (BISS)
• KGSD core test (productivity)
HBM KGSD Test
• DA pads for productivity
• KGSD test covers TSV, DRAM cell, PHY, and IEEE1500, and repairs TSVs and DRAM cells
• TSV open/short (OS) test and repair
• uBump test
  – Screen leakage failures from uBumps (DC)
  – At-speed loopback test (AC)
• At-speed test on PHY and DRAM cells

KGSD (Base/Core) test coverage:
• PHY – Function test (RD/WT, CL, BL): 100%; Margin test (Speed, VDD, Setup/Hold timing): 100%
• TSV – Function test (RD/WT, CL, BL, TSV interface): 100%; OS check (TSV open/short): 100%
• Logic – Function test (IEEE1500, Function, BIST, Repair): 100%; Margin test (VDD, Speed, Setup/Hold): 100%
• Core – Function test (RD/WT, Self Ref, Power Down): 100%; Margin test (Speed, VDD, Async, Refresh): 100%; Repair (cell repair): 100%
HBM Test Features for 2.5D
• HBM has test/repair features for 2.5D, accessible from the host ASIC through IEEE1500
• Available in the 2.5D SiP after assembly
IEEE1500 instructions:
• BYPASS – All-channel bypass
• EXTEST RX/TX – uBump boundary scan Rx/Tx test (open/short)
• MBIST – Memory built-in self test
• SOFT_REPAIR – Soft repair of failing DRAM bit cells
• HARD_REPAIR – Hard repair of failing DRAM bit cells
• DWORD MISR – Read back signature in DWORD MISR
• AWORD MISR – Read back signature in AWORD MISR
• SOFT/HARD_LANE_REPAIR – Perform lane remapping
• DEVICE ID – Read JTAG device ID
• TEMPERATURE – Read 8-bit binary temperature code
• MODE REGISTER DUMP – Read/write the DRAM's mode registers
PPR: Soft/Hard Repair
• DRAM cell test and repair through IEEE1500, after SiP assembly
• 4 row repairs per 2 banks (repair DRAM), i.e. 64 rows per channel @ 8-Hi
[Cross-section: the controller PHY on the SoC connects through micro bumps, the interposer, and package ball bumps to the base die PHY; channels 0~7 are distributed across DRAM core slices 0~3.]
Procedure (a sketch of this flow follows below):
1) Run MBIST: pass/fail, report failing row addresses
2) Run Soft Repair: soft-repair the failing row addresses by writing a register
3) Run Hard Repair: hard-repair the failing row addresses by cutting e-fuses
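The sketch below mirrors that three-step procedure for one channel. The Ieee1500Port class is a hypothetical stand-in for the host ASIC's IEEE1500 access, not an SK hynix or JEDEC API; only the MBIST → soft repair → hard repair sequence and the 64-rows-per-channel budget come from the slide.

```python
# Minimal, self-contained sketch of the post-assembly PPR flow (illustrative only).
from dataclasses import dataclass, field

@dataclass
class Ieee1500Port:
    """Hypothetical model of one HBM channel's IEEE1500 test access."""
    fail_rows: set = field(default_factory=set)        # rows MBIST would report as failing
    soft_repaired: set = field(default_factory=set)
    hard_repaired: set = field(default_factory=set)

    def mbist(self):
        # Step 1: run MBIST and report failing row addresses not yet repaired.
        return sorted(self.fail_rows - self.soft_repaired - self.hard_repaired)

    def soft_repair(self, row):
        # Step 2: remap a failing row by writing a repair register (volatile).
        self.soft_repaired.add(row)

    def hard_repair(self, row):
        # Step 3: make the remap permanent by cutting an e-fuse.
        self.hard_repaired.add(row)

def ppr_flow(port: Ieee1500Port, max_repairs: int = 64) -> bool:
    """Repair one channel; 64 rows/channel is the budget at 8-Hi per the slide above."""
    fails = port.mbist()
    if len(fails) > max_repairs:
        return False                  # not enough repair resources
    for row in fails:
        port.soft_repair(row)
    if port.mbist():                  # re-run MBIST to verify before committing e-fuses
        return False
    for row in fails:
        port.hard_repair(row)
    return True

# Example: a channel with two failing rows is repaired and passes.
print(ppr_flow(Ieee1500Port(fail_rows={0x12, 0x5A})))   # True
```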
Lane Repair
• HBM supports interconnect lane remapping through IEEE1500 instructions, after SiP assembly
• Lane remapping is independent for each channel
Procedure (see the sketch below):
1) Test lanes between HOST and HBM using the EXTEST and MISR instructions
2) Run SOFT_LANE_REPAIR: perform lane remapping by writing a register
3) Run HARD_LANE_REPAIR: perform lane remapping by cutting e-fuses
Remapping schemes: simple (DBI lost) vs. complex (DBI maintained)
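For intuition, the sketch below remaps one 32-lane DWORD around a single failed lane. The shift-toward-a-spare scheme and the single spare lane are assumptions for illustration only; the actual HBM remapping rules, including whether DBI is kept or lost (the "simple" vs. "complex" schemes above), are defined by the JEDEC spec.

```python
# Illustrative lane remapping: lanes at/after the failure shift onto a spare lane.

def remap_lanes(failed_lane: int, num_lanes: int = 32) -> dict:
    """Logical-to-physical lane map for one DWORD, routing around one failed lane."""
    return {logical: (logical if logical < failed_lane else logical + 1)
            for logical in range(num_lanes)}

# Example: EXTEST/MISR flag lane 5 as failed; logical lane 5 now uses physical lane 6,
# and logical lane 31 moves onto the spare lane (index 32).
mapping = remap_lanes(failed_lane=5)
assert mapping[4] == 4 and mapping[5] == 6 and mapping[31] == 32
```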
Challenges in 2.5D Failure Analysis
Traditional FA for a DRAM component: Failure → RMA request → Duplication @ system level → DRAM component ATE test → Customer-system-based application test → Further analysis & root cause
New FA for HBM @ SiP level: Failure → RMA request → Duplication @ system level → SiP-level ATE test → SiP-level application test → Further analysis & root cause
There is no good solution for evaluating the interface status between the ASIC and the HBM inside the SiP.
Conclusion
• HBM provides breakthrough memory solutions for next-generation high performance systems, with higher bandwidth, lower power, and smaller form factor.
• SK hynix set up HBM 3D Stacking and Test flow to deliver KGSD.
• SK hynix is shipping mass production volumes of HBM.
• 2.5D Ecosystem requires close collaboration among all stakeholders.
[Ecosystem diagram: HBM vendor (KGSD/KGD), foundries (SoC & interposer), OSATs (SiP assembly & test), IP enablers (PHY & MC IP), and SiP/set makers (SiP design & business model).]
Thank You
If you have further questions about SK hynix memory information, please contact the SK hynix DRAM Product Planning & Enabling Office ([email protected]) or SK hynix Japan. Thank you.