© T. Kuroda (1/48)
Lecture 6: CMOS Proximity Wireless Communications for 3D Integration (1)
Tadahiro Kuroda
Visiting MacKay Professor
Department of EECS
University of California, Berkeley
[email protected], [email protected]
http://bwrc.eecs.berkeley.edu/Classes/ee290c_s07
http://www.kuroda.elec.keio.ac.jp/
EE290c Spring 2007, Tues & Thurs 9:30-11:00, 212 Cory UCB
© T. Kuroda (2/48)
Trend of Chip Performance and Pin Bandwidth
[Figure: chip performance and pin bandwidth vs. year ('70 to '10), log scale.
 Chip performance, MIPS [instruction/s] (0.01 to 1,000,000): 4004, 8086, 286, Intel386, Intel486, Pentium, Pentium4; growth x1.70 / year.
 Pin bandwidth, data rate [MB/s] (1 to 10,000): growth x1.44 / year.
 Labels: Memory Bus, Bus on Board, ATA (HDD), Ethernet, ISA, PCI, PCI-EX, Serial ATA, Fast Ethernet.]
Courtesy: Intel, Fujitsu Lab
© T. Kuroda (3/48)
The Gap Is Caused by Topological Difference

TRANSISTOR                     Scaling per year
Gate length [x]                0.87
Voltage [V]                    0.87
Capacitance [c]~[x^2/x]        0.87
Current [i]~[v^2/x]            0.87
Speed [i/cv]                   1.15

WIRE                           Scaling per year
Wire pitch [x]                 0.87
Chip size [s]                  1.06
Tracks [t]~[s/x]               1.22
Grids [g]~[t^2]                1.49

Chip performance (Area):   1.15 (Speed) x 1.49 (Grids) = x1.71/year
Pin bandwidth (Periphery): 1.15 (Speed) x 1.11 (Pin #) = x1.28/year

Rent’s rule: Bandwidth demand from a module with capacity C (grids*speed) grows as C^0.7.
Required pin bandwidth: x1.45/year

[Figure: data rate vs. year. Chip performance (Moore’s Law): 1.71/year. Pin bandwidth: 1.45/year. Scaling: 1.28/year. The region between scaling and pin bandwidth is labeled "circuit innovation".]
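The yearly factors above follow from three inputs: the 0.87 shrink, the 1.06 chip-size growth, and the Rent exponent 0.7. A minimal Python sketch that re-derives them; the variable names are illustrative, and the 1.11/year pin-count growth is taken from the slide as given rather than derived.

```python
# Re-derive the per-year growth factors quoted on the slide.
# Variable names are illustrative; the numeric inputs come from the slide.
shrink = 0.87          # gate length, wire pitch, voltage, capacitance, current
chip_size = 1.06       # chip edge growth per year
pin_count = 1.11       # pin-count growth per year (taken as given)
rent_exponent = 0.7    # Rent's rule: bandwidth demand grows as C**0.7

speed = shrink / (shrink * shrink)     # [i/cv]
tracks = chip_size / shrink            # [s/x]
grids = tracks ** 2                    # [t^2]

chip_performance = speed * grids                       # area-limited
required_pin_bw = chip_performance ** rent_exponent    # demand from Rent's rule
scaled_pin_bw = speed * pin_count                      # periphery-limited supply

print(f"speed            x{speed:.2f}/year")
print(f"tracks           x{tracks:.2f}/year")
print(f"grids            x{grids:.2f}/year")
print(f"chip performance x{chip_performance:.2f}/year")
print(f"required pin BW  x{required_pin_bw:.2f}/year")
print(f"scaled pin BW    x{scaled_pin_bw:.2f}/year")
```

The difference between the required and the scaled pin bandwidth is the part the slide attributes to circuit innovation.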
© T. Kuroda (4/48)
Challenges in Wireline Link
[Figure: data rate [b/s] (1G to 1T, log scale) vs. year (’96 to ’06).
 Speed-wall (single channel): Stanford (1ch), Stanford (1ch), NEC (1ch).
 Power/Area-wall (multi-channel): Toshiba (4ch, 4W), Rambus (26ch), NTT (16ch, 8W), Hotrail (32ch), NEC (4ch, 5W), NEC (21ch, 3W), NEC (20ch, 8W), TI (20ch, 6W), Intel (32ch, 15W), IBM, Sony, Toshiba (48ch, 6W), Hitachi (1024ch, 12W).]
© T. Kuroda (5/48)
150nm and 65nm Single-chip Supercomputer at Max Performance
Leakage and IO Power Rapidly Increasing
[Figure: power breakdown at 150nm and at 65nm (estimated). Components: Vector FPU, Scalar FPU, Logic, IO, Clock, SRAM, leakage. Total power: x1.3.]
Courtesy: NEC
© T. Kuroda (6/48)
I/O Power Recently Increasing
[Figure: power [mW] (1 to 10,000, log scale) vs. technology [µm] (0.01 to 10, log scale). Trend: fCV^2 ∝ 1/κ^1.3.]
© T. Kuroda (7/48)
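I/O Power Details
[Figure: stacked bars of I/O power [mW] (0 to 250). Groups: Other, Tx, Rx, Clock. Blocks: CDR, PLL, CLK, EQ, DMX, DR, MUX.
 90nm, 10Gb/s, total 245mW. Segment labels as extracted: 25mW (10%), 70mW (30%), 25mW (10%), 30mW (12%), 30mW (12%), 25mW (10%), 25mW (10%), 15mW (6%).
 180nm, 2.5Gb/s, total 120mW. Segment labels as extracted: 20mW (17%), 25mW (21%), 20mW (17%), 5mW (4%), 10mW (8%), 25mW (21%), 5mW (4%), 10mW (8%).]
Power in red is increasing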
© T. Kuroda (8/48)
+5% Speed Requires +20% Power in Clock
[Figure: normalized power (0.7 to 1) vs. normalized speed (0.9 to 1), varied by changing the number & size of repeaters. The curve is labeled "exponential".]
Ref. [11]
© T. Kuroda (9/48)
SoC Improves I/O Performance
[Figure: power of an MPEG4 codec built two ways.
 Separate chips (logic & memory plus 16Mbit DRAM, with a DRAM - logic interface): 891mW.
 Embedded DRAM: 240mW.
 Chip blocks: DRAM, Speech codec, Multiplexer, MPEG-4 Video Codec, Host I/F, DRAM I/F, PLL, Cam I/F, Display I/F, Pre-filter, VT.]
70% power reduction by DRAM embedding technology
Courtesy: Toshiba
© T. Kuroda (10/48)
From SoC to SiP
[Figure: three integration styles, each combining CPU, SRAM, DRAM, and Analog.
 System-on-a-Board: packaged chips connected through the PC board (PCB).
 System-on-a-Chip (SoC): low power, high speed.
 System-in-a-Package (SiP): lower cost, QTAT.]
[Figure: mask set cost [M$] (0 to 2.5) vs. generation [µm] (0.4, 0.25, 0.2, 0.18, 0.15, 0.13, 90n, 65n). Source: Intel]
[Figure: design starts (0 to 14,000) for ASIC and ASSP, 2000 to 2002. Source: EE Times]
The number of design starts has been declining since 1997.
© T. Kuroda (11/48)
Chip Stacking and Wire Bonding in SiP
Courtesy: Toshiba
© T. Kuroda (12/48)
From Periphery to Area
[Figure: stacked chips: MPU, Memories, Sensor / RF / Analog.]

Bonding (Conv.)
(+) low cost, practical
(-) peripheral contact:
    small # of connections (~100)
    long distance (~10mm)

Through Si Via (Future)
(+) area contact:
    large # of connections (~10000)
    short distance (~0.1mm)
(-) expensive process / reliability issue
(-) low yield due to Known Good Die issue: difficult to test in fine pitch
(-) scaling limit due to mechanical contacts (~10µm pitch)
© T. Kuroda (13/48)
From Mechanical to Electrical

TSV
(-) process
(-) KGD
(-) scaling limit

Wireless Interface
Proposal: wireless transceiver arrays
(+) no addition in process, no reliability issue
(+) KGD solvable: easy to attach and remove
(+) high density channels (below 10µm pitch)
(+) 3D scaling scenario (thinning a chip)
(+) channels through active devices
(+) low power: no ESD protection required
© T. Kuroda (14/48)
Communication Bottleneck Resolved in SiP
[Figure: packaged CPU, SRAM, DRAM, and Analog chips connected through a PC board (PCB), compared with the same chips stacked in one package.]
[Figure, first panel: data rate vs. year. Chip performance: 1.71/year. Pin bandwidth: 1.45/year. Scaling: 1.28/year, with the gap labeled "circuit innovation".]
[Figure, second panel: data rate vs. year. Chip performance: 1.71/year. Pin bandwidth: 1.71/year.]
Electrical Area Interface for 3D Integration
© T. Kuroda (15/48)
Area Interface for 3D Integration

                                 Wired                   Wireless
2 Chips (Face-to-Face)           Micro-Bump [1]          Capacitive Coupling [3]
Over 3 Chips (Face up/dn)        Through-Si Via [2]      Inductive Coupling [4]

[1] ISSCC’04, Sony
[2] ISSCC’01, MIT
[3] ISSCC’03, Univ. Tokyo, Keio Univ.
[4] ISSCC’04, Keio Univ.
© T. Kuroda (16/48)
Capacitive-Coupling Link

Chip to Chip
[Figure: Chip 1, Chip 2, Chip 3 with Txdata and Rxdata; dimensions 1µm and 2µm marked.]
[3] ISSCC’03, Univ. Tokyo and Keio Univ.
[7] CICC’05, Univ. Bologna
[Figure: base chip with face-down chips; 30um pitch mini-pads (top metal layer).]
[8] CICC’03, Sun Microsystems
[9] ISSCC’04, Sun Microsystems

Chip to Interposer
[Figure: IC #1 and IC #2 on a substrate; trench, interconnection layer, AC connections, DC connection (solder bump).]
[5] CICC’02, NC State Univ.
[6] ISSCC’05, NC State Univ.
© T. Kuroda (17/48)
Inductive-Coupling Link
[Figure: Tx/Rx pair built from digital CMOS circuits and multi-layer wires. Signals: Txdata, IT, VR, Vbias, Rxclk, Rxdata, Rxdatab, Dl, Dlb.]
[Chip photos:
 ISSCC 2004 (1Gb/s).
 ISSCC 2005 (200Gb/s): 195 transceiver channel array, 50µm channel (Tx, Rx), BIST clock control for top chip, DC and AC probes for top and bottom chips, bonding wires for top and bottom chips.
 ISSCC 2006 (1Tb/s): transmitter chip (top) and receiver chip (bottom), 1024ch data transceivers, clock transceivers; data transceiver 30µm, clock transceiver 200µm; SEM photo with 15µm mark.
 VLSI 2005, VLSI 2006: Tx array, Rx array, top chip, bottom chip, inductor TEG.]
[4] ISSCC’04, Keio Univ.
[10] ISSCC’05, Keio Univ.
[11] ISSCC’05, Hiroshima Univ.
[12] VLSI’05, Hiroshima Univ.
[13] VLSI’05, NC State Univ.
[14] ISSCC’06, Keio Univ.
[15] VLSI’06, Keio Univ.
© T. Kuroda (18/48)
Inductive vs. Capacitive: Loss by Body
[Figure: normalized S21 (0 to 1) at 10GHz vs. substrate resistivity ρ [Ωcm] (10^-6 to 10^2, and ∞), for an inductor (D=60µm) and a capacitor (D=60µm) coupling through Si of thickness T=60µm. Loss mechanisms marked: Eddy, Charge. The typical resistivity range is marked.]
$$\mathrm{Loss} = \exp\left(-T\sqrt{\omega\mu/2\rho}\right) \quad [12]$$
Capacitive: only for 2 chips, placed face-to-face
Inductive: for 2 chips (face-to-face) and >3 chips (face up/down)
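The eddy-current loss expression exp(-T·sqrt(ωµ/2ρ)) is the substrate thickness divided by the skin depth, inside an exponential. A minimal Python sketch that evaluates it at the slide's T=60µm and 10GHz; the function name, the use of the vacuum permeability for µ, and the two example resistivities (10 Ωcm and 10^-3 Ωcm) are assumptions for illustration, not values read from the plot.

```python
import math

MU0 = 4 * math.pi * 1e-7  # vacuum permeability [H/m], assumed for silicon


def eddy_attenuation(thickness_m, resistivity_ohm_cm, freq_hz):
    """Return exp(-T * sqrt(omega * mu / (2 * rho))) for a conductive substrate."""
    rho = resistivity_ohm_cm * 1e-2          # ohm*cm -> ohm*m
    omega = 2 * math.pi * freq_hz
    inverse_skin_depth = math.sqrt(omega * MU0 / (2 * rho))
    return math.exp(-thickness_m * inverse_skin_depth)


if __name__ == "__main__":
    for rho_ohm_cm in (10.0, 1e-3):          # hypothetical example resistivities
        factor = eddy_attenuation(60e-6, rho_ohm_cm, 10e9)
        print(f"rho = {rho_ohm_cm:g} ohm*cm -> attenuation factor {factor:.3f}")
```

With the higher resistivity the skin depth is far larger than the 60µm substrate, so the factor stays close to 1; with the lower one the skin depth falls below the substrate thickness and most of the signal is lost.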
© T. Kuroda (19/48)
Inductive vs. Capacitive: Package Flexibility
[Figure: stacking configurations on a bed, in panels labeled Inductive Coupling and Capacitive Coupling [7]:
 four face-up chips;
 face-up (logic) with face-down (memory);
 face-down (logic) with face-up (memory), with a cavity.]
Inductive: compatible with conventional wire/area bonding
Capacitive: need new technology for power delivery
© T. Kuroda (20/48)
Inductive vs. Capacitive: Scalability
Coupling coefficient is enlarged by increasing # of metal layers.
Transmission power can be secured even at low VDD’s.
[Figure: normalized S21 (1 to 15) vs. number of metal layers (3 to 10), inductive and capacitive. Inset: Tx and Rx, dimensions 10µm and 2µm, CRX.]
[Figure: normalized transmission power (0.2 to 1) vs. supply voltage VDD [V] (0.2 to 1.2), inductive and capacitive. Insets: transmitter circuits with IT, VT, VR, VDD.]
© T. Kuroda (21/48)
Channel Design
Modeling
Design knowledge
Optimization
[Figure: equivalent circuit of the channel. Transmitter: VT, Rout, CT, LT, RT/2 + RT/2, current IT. Receiver: induced voltage jωk√(LT·LR)·IT, RR/2 + RR/2, CR, output VR. Coupling coefficient k. Inductor layout parameters: w, n, d.]
$$\frac{V_R}{V_T} = \frac{1}{1 + j\omega C_R R_R} \times j\omega k\sqrt{L_T L_R} \times \frac{1}{R_{out}\left(1 - \omega^2 L_T C_T\right) + R_T + j\omega\left(C_T R_T R_{out} + L_T\right)}$$
[Figure: |VR / VT| vs. frequency [GHz] (0 to 8), and VR [V] (-0.2 to 0.2) vs. time [ns] (0 to 16), for CR=0.1pF and CR=1pF.]
N. Miura, et al., “Analysis and Design of Transceiver Circuit and Inductor Layout for Inductive Inter-chip Wireless Superconnect,” 2004 Symposium on VLSI Circuits Digest of Technical Papers, pp. 246-249, June 2004.
© T. Kuroda (22/48)
NRZ vs. BPM
[Figure: link from Tx (Txdata, Txclk, IT) to Rx (VR, Rxdata), with crosstalk.]
[Figure: waveforms over 0 to 4ns for the bit sequence 0 0 1 1 0: Txdata and Txclk (0 to 1.8V); VR [mV] (-50 to 50) for the NRZ signal and for BPM. The Rx dead zone is marked.]
[Figure: BER (10^-3 to 10^-15) vs. pulse energy [pJ/b] (0.3 to 10) for NRZ and BPM. Annotations: x1/3, x1/6.]
© T. Kuroda (23/48)
Transceiver Circuit for Data
[Figure: transmitter (Txdata, Txclk, pulse generator, IT) and receiver (VR, Vb, Rxclk, Rxdata).]
[Figure: waveforms over 0 to 6ns: Txclk (0 to 1.8V), Txdata (0 to 1.8V), IT [mA] (-5 to 5), VR [mV] (-50 to 50), Rxclk (0 to 1.8V), Rxdata (0 to 1.8V).]
[14] ISSCC’06, Keio Univ.
© T. Kuroda (24/48)
Transceiver Circuit for Clock
[Figure: clock transmitter (Txclk, ITC, IDC, VDD) and clock receiver (VRC, Vbias, Vbrx, VSA, Rxclk).]
[Figure: waveforms over 0 to 5ns: Txclk [V] (0 to 1.8), ITC [mA] (-1 to 1), VRC [V] (-0.1 to 0.1), VSA [V] (0.7 to 1.1), Rxclk [V] (0 to 1.8).]
[16] ISSCC’07, [20.2], Keio Univ.
© T. Kuroda (25/48)
Clock and Data Link
・Both clock and data are linked by inductive coupling
[Figure: clock link (Txclk, Clk Tx, ITC, VRC, Clk Rx, φ) and 64ch data link (Txdata0 to Txdata63, Data Tx, IT, VR, Data Rx, Rxdata0 to Rxdata63). The data link has one clock latency.]
[Figure: measured waveforms of Txdata, Rxdata, and the 1GHz Rxclk; 7.4ps-rms jitter in Rxclk; scale marks 20ps and 100ps.]
1Gb/s, 2^23-1 PRBS data, BER<10^-13
[14] ISSCC’06, Keio Univ.
© T. Kuroda (26/48)
LAN Wire vs. Wireless

               Wired LAN (Ethernet)          Wireless LAN (WiFi)
               802.3u (100BASE-T)            802.11b
               twisted pair                  2.4GHz
               <100m                         <100m
Data rate      High speed (100Mbps)          Low speed (11Mbps)
Reliability    High (BER<10^-14)             Low (BER~10^-4)
Cost           Inexpensive (~$15)            Expensive (~$100)
Power          Low (~100mA)                  High (~400mA)
Size           Small                         Large (w/ antenna)
Connection     Easy (plug and play)          Complex (authentication)
Usability      Messy/Difficult (e.g. walls)  Neat, Simple, Easy
Mobility       Low/Immobile                  Movable

Wireless issues: path loss; multi-path fading; multiple access in free space (cell, TDMA, FDMA, CDMA)
© T. Kuroda (27/48)
Inter-Chip 3D Link Wire vs. Wireless

               Wired Inter-Chip Link         Wireless Inter-Chip Link
               Micro-bump (2 chips)          Capacitive coupling (2 chips)
               TSV (>3 chips)                Inductive coupling (>3 chips)
               <100µm                        <100µm
Data rate      High speed                    Low speed ?
Reliability    Highly reliable               Low reliability ?
Cost           Inexpensive                   Expensive ?
Power          Low                           High ?
Size           Small                         Large (w/ antenna) ?
Connection     Easy (plug and play)          Complex ?
Usability      Messy/Difficult (e.g. walls)  Neat, Simple, Easy ?
Mobility       Low/Immobile                  Movable ?

100µm: 0.001 wavelengths (proximity)
100m: 1000 wavelengths
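The two distance figures can be checked by expressing each link length in free-space wavelengths. A minimal Python sketch; the function name is illustrative, 2.4GHz comes from the 802.11b column of the LAN comparison, and the 3GHz signal frequency used for the chip link is an assumption chosen for the example.

```python
SPEED_OF_LIGHT = 299_792_458.0  # [m/s]


def distance_in_wavelengths(distance_m, freq_hz):
    """Express a link distance as a multiple of the free-space wavelength."""
    wavelength_m = SPEED_OF_LIGHT / freq_hz
    return distance_m / wavelength_m


if __name__ == "__main__":
    wlan = distance_in_wavelengths(100.0, 2.4e9)      # 100 m at 2.4 GHz
    chip = distance_in_wavelengths(100e-6, 3.0e9)     # 100 um at an assumed 3 GHz
    print(f"WLAN link:      {wlan:.0f} wavelengths")
    print(f"inter-chip link: {chip:.4f} wavelengths")
```

The wireless LAN operates in the far field, hundreds of wavelengths from the antenna, while the inter-chip link sits about a thousandth of a wavelength away, which is why the far-field problems of path loss and multi-path fading do not carry over directly.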
© T. Kuroda (28/48)
World Fastest (1Tb/s) Data Rate
[Figure: data rate [b/s] (1G to 1T, log scale) vs. year (’96 to ’06).
 Speed Wall (single channel): Stanford (1ch), Stanford (1ch), NEC (1ch).
 Power/Area Wall (multi-channel): Toshiba (4ch, 4W), Rambus (26ch), NTT (16ch, 8W), Hotrail (32ch), NEC (4ch, 5W), NEC (21ch, 5W), NEC (20ch, 8W), TI (20ch, 6W), Intel (32ch, 15W), Hitachi (1024ch, 12W), Sony (1300ch), TeraChip (16ch, 15W), Rambus FlexIO (48ch, 6W).
 Inductive coupling: Keio (1ch, 46mW), Keio (195ch, 1.2W), Keio (1Gb/s/ch, 1024ch, 3W).]
[14] ISSCC’06, Keio Univ.
© T. Kuroda (29/48)
Maximum Data Rate per Channel
[Figure: channel model. Transmitter: IT, CT, LT, RT/2 + RT/2. Receiver: LR, CR, RR/2 + RR/2, VR. Coupling coefficient k. Inductor diameter D equals communication distance X (D=X).]
[Figure: VR [V] (-0.2 to 0.2) vs. time [ns] (0 to 16), two cases.]
[Figure: maximum data rate [Gb/s] (0.1 to 1000, log scale) vs. communication distance X [µm] (0 to 120). Curves: $f_{SR} = 1/(2\pi\sqrt{LC})$ and $f_{SR}/3$. Data points: [14], [10], [10], [13].]
[17] Symp. on VLSI Circuits’06, Keio Univ.
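The plot relates the per-channel data rate to the inductor's self-resonant frequency f_SR = 1/(2π√(LC)). A minimal Python sketch of that formula; the function name and the example values (10nH, 50fF) are hypothetical, and treating f_SR/3 as a data-rate ceiling is one reading of the plotted curve, not a statement from the slide text.

```python
import math


def self_resonant_frequency(inductance_h, capacitance_f):
    """f_SR = 1 / (2 * pi * sqrt(L * C)) in Hz."""
    return 1.0 / (2.0 * math.pi * math.sqrt(inductance_h * capacitance_f))


if __name__ == "__main__":
    L = 10e-9    # hypothetical inductance [H]
    C = 50e-15   # hypothetical parasitic capacitance [F]
    f_sr = self_resonant_frequency(L, C)
    print(f"f_SR   = {f_sr / 1e9:.2f} GHz")
    print(f"f_SR/3 = {f_sr / 3e9:.2f} GHz")
```

A smaller inductor for a shorter communication distance lowers both L and C, which raises f_SR; this matches the direction of the curves, where the maximum data rate rises as X shrinks.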
© T. Kuroda (30/48)
Communication Distance vs. Inductor Size
[Figure: coupling coefficient k (10^-8 to 1, log scale) vs. communication distance X (10µm to 10mm) for inductor diameters D=10µm, D=100µm, D=1mm. Receiver lower limits are marked for a comparator and for an LNA + comparator (40dB).]
LNA noise floor … difficult to eavesdrop
© T. Kuroda (31/48)
Bus Probing for Debugging
[20.3] “An Attachable Wireless Chip-Access Interface for Arbitrary Data Rate Using Pulse-Based Inductive-Coupling through LSI Package”
[Figure: a probe IC (transceiver) on a flexible circuit board (FCB), with inductors in the FCB, is attached with glue over the target LSI (in SSOP pkg, with on-chip inductors) on the PCB. Channels: CLK, TX, RX. The FCB connects to/from the in-circuit emulator.]
[18] ISSCC’07, Keio Univ.
© T. Kuroda (32/48)
World Lowest Energy (0.14pJ/b)
1Tb/s, 0.14W (0.14pJ/b), 1mm^2
300Gb/s, 6W (20pJ/b), 3.8mm^2
[Figure: energy dissipation [pJ/b] (0.1 to 1000, log scale) vs. year (’96 to ’07).
 Labels: Toshiba (350nm), NTT (250nm), Hitachi (250nm), NEC (250nm), NEC (130nm), Intel (180nm), TI (180nm), NEC (130nm), TeraChip (130nm), Rambus (90nm), Sun (350nm), [1] SFT (180nm), Fujitsu (90nm), [2], Keio (350nm), Keio (250nm), Keio (180nm).
 Legend: Inductive Coupling (180nm), Inductive Coupling (90nm). Annotation: 1/150.]
[20.2] “A 0.14pJ/b Inductive-Coupling Inter-Chip Data Transceiver with Digitally-Controlled Precise Pulse Shaping”
[16] ISSCC’07, Keio Univ.
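The energy-per-bit figures on this slide are power divided by aggregate data rate. A minimal Python sketch of that arithmetic using the two operating points quoted above; the function name is illustrative.

```python
def energy_per_bit_pj(power_w, data_rate_bps):
    """Energy per bit in pJ/b: power [W] divided by data rate [b/s]."""
    return power_w / data_rate_bps * 1e12


if __name__ == "__main__":
    inductive = energy_per_bit_pj(0.14, 1e12)   # 1Tb/s at 0.14W
    reference = energy_per_bit_pj(6.0, 300e9)   # 300Gb/s at 6W
    print(f"1Tb/s, 0.14W  -> {inductive:.2f} pJ/b")
    print(f"300Gb/s, 6W   -> {reference:.1f} pJ/b")
    print(f"ratio         -> 1/{reference / inductive:.0f}")
```

The ratio of the two results is on the order of the 1/150 annotation in the figure.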
© T. Kuroda (33/48)
World Smallest (1mm^2/Tb/s)
L-coupling: 30µm pitch (incl. transceiver circuits); thinner packaging (no solder bump)
Micro-bump: 60µm pitch
TSV: 50µm pitch (excl. transceiver circuits); need additional area for circuits
[Figure: layout area / data rate [mm^2/Tb/s] (1 to 100k, log scale) vs. year (’96 to ’06). Labels: Sony µ-bump, TI, NEC, Intel, NEC, NEC, Hotrail, Hitachi, NTT, Toshiba, IBM, Sony, Toshiba, Keio (1024ch), Keio (195ch), Keio (1ch).]
[Figure: cross sections of stacked chips. TSV: circuits and TSV in each Si substrate. Inductive: inductor with Tx/Rx circuits on each Si substrate.]
[14] ISSCC’06, Keio Univ.
© T. Kuroda (34/48)
Channel Pitch vs. Crosstalk
[Figure: inductor array with inductor diameter D, communication distance X, and channel pitch Y.]
[Figure: interference-to-signal ratio (ISR) [dB] (-22.5 to -2.5) vs. Y/D (1 to 5), with X=D. The BER=10^-12 level is marked.]
[Figure: normalized crosstalk (10^-4 to 1, log scale) vs. channel pitch Y [µm] (120 to 1200), with X=D. Slope: 1/Y^3.]
[19] CICC’04, Keio Univ.
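The 1/Y^3 slope means crosstalk falls quickly with channel pitch. A minimal Python sketch of that scaling law; the function names are illustrative, and converting the amplitude ratio to dB with 20·log10 assumes crosstalk is measured as a voltage ratio.

```python
import math


def relative_crosstalk(pitch_ratio):
    """Crosstalk relative to a reference pitch, assuming the 1/Y^3 slope."""
    return pitch_ratio ** -3


def to_db(amplitude_ratio):
    """Convert a voltage (amplitude) ratio to decibels."""
    return 20.0 * math.log10(amplitude_ratio)


if __name__ == "__main__":
    for ratio in (1, 2, 3, 4):
        xt = relative_crosstalk(ratio)
        print(f"pitch x{ratio}: crosstalk x{xt:.4f} ({to_db(xt):.1f} dB)")
```

Under this law, doubling the pitch cuts crosstalk to one eighth, so a modest increase in pitch buys a large interference margin at the cost of channel density.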
© T. Kuroda (35/48)
Narrower Pitch by Time Interleaving
[Figure: received voltage [mV] (-50 to 50) vs. time [ns] (0 to 3), showing signal and crosstalk for three cases: w/o time interleaving, 2-phase time interleaving, 4-phase time interleaving. Amplitude marks: 50mV, 25mV, 10mV. Inset: channel array.]
[14] ISSCC’06, Keio Univ.
[10] ISSCC’05, Keio Univ.
© T. Kuroda (36/48)
As Reliable As Wireline (BER<10^-13)
[Figure: test setup. A 1GHz clock drives a 2^23-1 PRBS generator; Txdata goes through Data Tx and Data Rx to Rxdata and an error counter; Txclk goes through Clk Tx and Clk Rx to Rxclk, with timing offset ∆T.]
[Figure: bit error rate (10^-3 to 10^-13) vs. ∆T [ps] (250 to 400) at 1Gb/s. Timing margin = 150ps.]
・Easy to synchronize
[14] ISSCC’06, Keio Univ.
© T. Kuroda (37/48)
Interference Immunity
Interference to circuits: negligibly small for digital
Interference from environment: negligibly small for receiver, diminishing by scaling
[Figure: noise voltage [mV] (-30 to 30) vs. position Y [µm] (-60 to 60), induced on a 1mm signal line (M1, M2) by a TX inductor (D=60µm) driven with an IT pulse of 5mA, 125ps.]
[Figure: magnetic field integrity BEMI [mG] and noise VN [mV] vs. frequency [Hz] (10k to 100G) for an RX inductor (D=60µm); the two vertical axes span 1 to 100000 and 0.00001 to 10. Curves: BEMI (Regulation), VN.]
© T. Kuroda (38/48)
Misalignment Tolerance
・3µm alignment error can be compensated by 5% power increase.
[Figure: normalized (VR – Vcrosstalk) (0.25 to 1) vs. misalignment ∆Y, ∆Z [µm] (-15 to 15), for time interleaving and for a single channel (Vcrosstalk = 0). The +3µm point is marked. Geometry: Rx channel at X=15µm from a Tx array with P~30µm and D~30µm; axes Y, Z.]
[Photo: stacked chips with alignment mark. Misalignment<3µm.]
© T. Kuroda (39/48)
Cost Down
・ Circuit solution in standard CMOS:
    no need for new process development
    no additional cost in manufacturing
・ Reduce chip size:
    no peripheral circuits needed
    no ESD protection needed
© T. Kuroda (40/48)
AC Coupling
・No need for level shifters under different VDD’s
・No need for additional VDD’s or thick gate oxide transistors
・VDD’s can change: in burn-in, dynamic voltage scaling
[Figure: Tx on Chip1 (VDD=1V) to Rx on Chip2 (VDD=2V), with Txdata and Rxdata waveforms; voltage labels 1V and 2V.]
[20] ESSCIRC’06, Keio Univ.
© T. Kuroda (41/48)
Detachable
・ At-speed test possible if the same transceivers are arranged in the test head:
    solve KGD problem
    improve yield
    remove built-in test circuit
・ Whole-wafer test possible:
    reduce test time and cost (¢3 /min)
・ Avoid pad damage by probe:
    raise yield
・ Replace a high-speed connector:
    improve reliability
    reduce cost
© T. Kuroda (42/48)
Inter-Chip 3D Link Wire vs. Wireless

               Wired Inter-Chip Link         Wireless Inter-Chip Link
               Micro-bump (2 chips)          Capacitive coupling (2 chips)
               TSV (>3 chips)                Inductive coupling (>3 chips)
               <100µm                        <100µm
Data rate      High speed                    High speed
Reliability    Highly reliable               Highly reliable
Cost           Up (new technology)           Down (circuit solution)
Power          Low                           Low
Size           Small                         Small
Connection     Easy                          Easy (clock and data)
Usability      Good                          Better (AC coupling)
Mobility       One time attachment           Detachable
Scalability    Mechanical                    Electrical … Scalable
© T. Kuroda (43/48)
Constant Magnetic Field Scaling
[Figure: stacked coils. Distance ~ chip thickness (T), diameter (D), turns (n); the dimensions scale by 1/α.]

Received Signal Voltage
$$V_R = M\frac{\partial I_T}{\partial t} = k\sqrt{L_T L_R}\,\frac{\partial I_T}{\partial t} = kL\,\frac{I_D^{\max}}{t_{pd}} \propto k\,n^2 D^{1.6}\,\frac{I_D^{\max}}{t_{pd}}$$

Self Inductance
$$L \propto n^2 D^{1.6}$$

Parameter                                        Scaling
Transistor Size [x]                              1/α
Power Supply Voltage [V]                         1/α
Current [I]                                      1/α
Circuit Delay Time [t]~[CV/I]                    1/α
Chip Thickness [T]                               1/α
Coil Diameter [D]~[1/x]                          1/α
Coil Turn Number (Layer #) [n]                   α^0.8
Self Inductance [L]~[n^2 D^1.6]                  1
Magnetic Coupling Coefficient [k]                1
Receive Signal [vR]~[k n^2 D^1.6 (I/t)]          1
Crosstalk [vRS/vRN]                              1
Data Rate / Channel [1/t]                        α
Channel Number / Area [1/D^2]                    α^2
Aggregated Data Rate / Area [1/tD^2]             α^3
Energy / Bit [ItV]                               1/α^3
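The derived rows of the scaling table follow from the primary ones (size, voltage, current, delay, and diameter scale as 1/α; turns as α^0.8; k constant). A minimal Python sketch that recomputes them for a chosen α; the function and key names are illustrative, and the bracketed proxies from the slide ([n^2 D^1.6], [1/tD^2], [ItV], and so on) are used as written.

```python
def constant_magnetic_field_scaling(alpha):
    """Return the scaling factor of each derived quantity for a shrink factor alpha."""
    v = 1 / alpha        # power supply voltage [V]
    i = 1 / alpha        # current [I]
    t = 1 / alpha        # circuit delay time [t]~[CV/I]
    d = 1 / alpha        # coil diameter [D]; chip thickness scales the same way
    n = alpha ** 0.8     # coil turn number [n]
    k = 1.0              # coupling coefficient, constant per the slide

    inductance = n ** 2 * d ** 1.6            # [L]~[n^2 D^1.6]
    return {
        "self_inductance": inductance,
        "receive_signal": k * inductance * i / t,     # [k n^2 D^1.6 (I/t)]
        "data_rate_per_channel": 1 / t,               # [1/t]
        "channels_per_area": 1 / d ** 2,              # [1/D^2]
        "aggregate_rate_per_area": 1 / (t * d ** 2),  # [1/tD^2]
        "energy_per_bit": i * t * v,                  # [ItV]
    }


if __name__ == "__main__":
    for name, factor in constant_magnetic_field_scaling(2.0).items():
        print(f"{name:26s} x{factor:.3f}")
```

The α^0.8 turn scaling is exactly what keeps n^2·D^1.6 constant, so the self inductance and the received signal stay at 1 while density and data rate improve.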
© T. Kuroda (44/48)
3D Scaling Scenario
Cost/Performance will be improved by a 3D scaling scenario:

Inductive Coupling Link (Communication): Constant Magnetic Field
    Diameter: 1/α
    Chip Thickness: 1/α
    Turn: α^0.8
Field Effect Transistor (Computation): Constant Electric Field
    Voltage: 1/α
    Transistor size: 1/α

Parameter                                        Scaling
Transistor Size [x]                              1/α
Power Supply Voltage [V]                         1/α
Current [I]                                      1/α
Circuit Delay Time [t]~[CV/I]                    1/α
Chip Thickness [T]                               1/α
Coil Diameter [D]~[1/x]                          1/α
Coil Turn Number (Layer #) [n]                   α^0.8
Self Inductance [L]~[n^2 D^1.6]                  1
Magnetic Coupling Coefficient [k]                1
Receive Signal [vR]~[k n^2 D^1.6 (I/t)]          1
Crosstalk [vRS/vRN]                              1
Data Rate / Channel [1/t]                        α
Channel Number / Area [1/D^2]                    α^2
Aggregated Data Rate / Area [1/tD^2]             α^3
Energy / Bit [ItV]                               1/α^3

[20] ESSCIRC’06, Keio Univ.
© T. Kuroda (45/48)
Trends in Chip Performance and Pin Bandwidth
[Figure: chip performance and pin bandwidth vs. year (’70 to ’20), log scale.
 Chip performance, MIPS [instruction/s] (0.01 to 1,000,000): 4004, 8086, 286, Intel386, Intel486, Pentium, Pentium4; x1.70/year.
 Pin bandwidth, data rate [MB/s] (1 to 100,000): x1.44/year; Rambus FlexIO marked.
 Inductive-coupling link: points for chip thickness Tchip = 150µm, 80µm, 45µm, 25µm.]
[20] ESSCIRC’06, Keio Univ.
© T. Kuroda (46/48)
Which Technology, µ-bump, TSV, Inductive?
[Figure: positioning of µ-bump, TSV, and Inductive (also marked: TSV-light, Inductive) against these pairs of labels: Analog / Digital; 2 chips / >3 chips; Tr-link / Chip-link; High-end / Consumer; Homogeneous / Heterogeneous.]
© T. Kuroda (47/48)
Conclusions
・Inductive and capacitive coupling links are discussed.
・Inductive coupling has advantages over capacitive coupling in terms of coupling strength through the body, package flexibility, and scalability.
・Inductive coupling can link >2 chips (face up or down), and eliminates ESD protection to lower delay, area, and power.
・Inductive coupling bears comparison with TSV/µ-Bump in terms of data rate (1Tb/s), reliability (BER<10^-13), and energy dissipation (0.1pJ/b).
・Inductive coupling is applicable to standard CMOS, and is less expensive than TSV/µ-Bump.
・Inductive coupling exhibits high noise immunity and alignment tolerance.
・Inductive coupling provides an AC-coupled link and makes interface design easy under multiple/variable VDD’s.
・Inductive coupling may make non-contact testing possible.
・A constant magnetic field scaling scenario, based on thinning the chip, is proposed as a new guideline for 3D integration.
© T. Kuroda (48/48)
To Probe Further
[1] T. Ezaki, et al., “A 160Gb/s Interface Design Configuration for Multichip LSI,” ISSCC Dig. Tech. Papers, pp.140-141, Feb. 2004.
[2] J. Burns, et al., “Three-Dimensional Integrated Circuits for Low-Power, High-Bandwidth Systems on a Chip,” ISSCC, pp.268-269, Feb. 2001.
[3] K. Kanda, et al., “A 1.27Gb/s/ch 3mW/pin Wireless Superconnect (WSC) Interface Scheme,” ISSCC, pp.186-187, Feb. 2003.
[4] D. Mizoguchi, et al., “A 1.2Gb/s/pin Wireless Superconnect Based on Inductive Inter-chip Signaling (IIS),” ISSCC, pp.142-143, Feb. 2004.
[5] S. Mick, et al., “4Gbps High-Density AC Coupled Interconnection,” CICC, pp.133-140, May 2002.
[6] L. Luo, et al., “3Gb/s AC-Coupled Chip-to-Chip Communication using a Low-Swing Pulse Receiver,” ISSCC, pp.522-523, Feb. 2005.
[7] A. Fazzi, et al., “A 0.14mW/Gbps high-density capacitive interface for 3D system integration,” CICC, pp.101-104, Sep. 2005.
[8] R. Drost, et al., “Proximity Communication,” CICC, pp.469-472, Sep. 2003.
[9] R. Drost, et al., “Electronic Alignment for Proximity Communication,” ISSCC, pp.144-145, Feb. 2004.
[10] N. Miura, et al., “A 195Gb/s 1.2W 3D-Stacked Inductive Inter-Chip Wireless Superconnect with Transmit Power Control Scheme,” ISSCC, pp.264-265, Feb. 2005.
[11] A. Iwata, et al., “A 3D Integration Scheme utilizing Wireless Interconnections for Implementing Hyper Brains,” ISSCC, pp.262-263, Feb. 2005.
[12] M. Sasaki, et al., “A 0.95mW/1.0Gbps Spiral-Inductor Based Wireless Chip-Interconnect with Asynchronous Communication Scheme,” Symposium on VLSI Circuits, pp.348-351, Jun. 2005.
[13] Jian Xu, et al., “2.8 Gb/s inductively coupled interconnect for 3D ICs,” Symposium on VLSI Circuits, pp.352-355, Jun. 2005.
[14] N. Miura, et al., “A 1Tb/s 3W Inductive-Coupling Transceiver for Inter-Chip Clock and Data Link,” ISSCC, pp.424-425, Feb. 2006.
[15] M. Inoue, et al., “Daisy Chain for Power Reduction in Inductive-Coupling CMOS Link,” Symposium on VLSI Circuits, pp.80-81, Jun. 2006.
[16] N. Miura, et al., “A 0.14pJ/b Inductive-Coupling Inter-Chip Data Transceiver with Digitally-Controlled Precise Pulse Shaping,” ISSCC, [20.2], Feb. 2007.
[17] N. Miura, et al., “Analysis and Design of Inductive Coupling and Transceiver Circuit for Inductive Inter-Chip Wireless Superconnect,” Symposium on VLSI Circuits, pp.246-249, Jun. 2004.
[18] H. Ishikuro, et al., “An Attachable Wireless Chip-Access Interface for Arbitrary Data Rate Using Pulse-Based Inductive-Coupling through LSI Package,” ISSCC, [20.3], Feb. 2007.
[19] N. Miura, et al., “Cross Talk Countermeasures in Inductive Inter-Chip Wireless Superconnect,” CICC, pp.99-102, Oct. 2004.
[20] T. Kuroda, et al., “Perspective of Low-Power and High-Speed Wireless Inter-Chip Communications for SiP Integration,” ESSCIRC, pp.3-6, Sep. 2006.