11. Beyond CMOS Benchmarking
Dmitri Nikonov
Thanks to Ian Young
Beyond CMOS computing
Nikonov 11. Benchmarking 1
Acknowledgement
• Benchmarking thrust led by K. Bernstein for the last 3 years.
• It was a momentous step: from concepts of devices to envisioning practical circuits.
• We wanted to equalize the assumptions and approaches to benchmarking.
• All device PIs were very supportive of this work. Worked hard to understand and refresh their data. More than 3 meetings with each group.
• We take all the responsibility for possible errors in the new analysis.
• Hoping that this overview will stimulate discussions and improvements.
Insights of Benchmarking
Practical considerations
1. Area is determined by metal pitch (4F) to connect to the device [terminal connections]
2. Parasitics can overwhelm the intrinsic device attributes.
3. Majority gates permit more compact, faster circuits
4. Power delivery dominates for very low-voltage devices
Results
1. Spintronic devices are dominated by either
– switching energy (spin torque), or
– magnetization switching speed (magnetoelectric).
2. Charge-based devices are an attractive option: good E*d, compatible with CMOS circuits
3. Spintronic devices still competitive on throughput at low power
Reminder of Nomenclature
Adder: Transistor or Majority Gates
• Spintronic circuits can be more compact
Adder = 28 transistors (at least)
… or just 3 majority gates (Nanomagnetic Logic)
… or just 2 majority gates (All Spin Logic)
… or just 1 majority gate (Spin Wave Devices) !
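The three-majority-gate count can be illustrated with a small sketch. The slide does not give the gate-level schematic, so the construction below is an assumption: it is the standard textbook full adder built from three 3-input majority gates plus inverters (treated here as free), and the names `maj`, `inv`, `full_adder_maj` and `ripple_add` are hypothetical. The 2-gate and 1-gate variants are not modeled.

```python
from itertools import product


def maj(a: int, b: int, c: int) -> int:
    """3-input majority gate: 1 when at least two inputs are 1."""
    return 1 if (a + b + c) >= 2 else 0


def inv(a: int) -> int:
    """Inverter (assumed to be available at no extra gate cost)."""
    return 1 - a


def full_adder_maj(a: int, b: int, cin: int) -> tuple[int, int]:
    """1-bit full adder from three majority gates; returns (sum, carry_out)."""
    cout = maj(a, b, cin)              # gate 1: carry is the majority of inputs
    inner = maj(a, b, inv(cin))        # gate 2
    s = maj(inv(cout), inner, cin)     # gate 3: sum bit
    return s, cout


def ripple_add(x: int, y: int, width: int = 32) -> tuple[int, int]:
    """Ripple-carry adder built from 1-bit cells; returns (result, carry_out)."""
    carry = 0
    result = 0
    for i in range(width):
        s, carry = full_adder_maj((x >> i) & 1, (y >> i) & 1, carry)
        result |= s << i
    return result, carry


# Exhaustive check of the 1-bit cell against ordinary arithmetic.
for a, b, c in product((0, 1), repeat=3):
    s, cout = full_adder_maj(a, b, c)
    assert 2 * cout + s == a + b + c
```

Chaining 32 such cells gives a ripple-carry adder with 96 majority gates, which is only one possible organization of a 32-bit adder.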
Energy-Delay of Beyond CMOS Electronic Devices
Rely on device simulations done by the device research groups to obtain input parameters, but audit the numbers.
Add thorough accounting of driven capacitance, parasitics, and interconnect [from realistic layout].
Obtain switching time and energy of gates (inverter, NAND, XOR, 1-bit adder).
Values validated against Purdue’s electronic simulator, PETE.
Simple underlying equations for intrinsic switching time and energy:

$t_{int} = C_{dev} V_{dd} / I_{dev}$    (4)
$E_{int} = C_{dev} V_{dd}^2$    (5)

Device      Vdd, V   Ion, A/m
CMOS HP     0.73     1805
CMOS LP     0.3      2
IIIvTFET    0.2      25
HJTFET      0.3      112
gnrTFET     0.1      20
GpnJ        0.7      3932
BisFET      NA       NA
SpinFET     0.7      700
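As a rough illustration of the intrinsic relations on this slide, t_int = C_dev V_dd / I_dev and E_int = C_dev V_dd^2, the sketch below evaluates them for two rows of the table. The supply voltages and on-currents are the slide values; the device width `WIDTH_M` and capacitance `C_DEV_F` are hypothetical placeholders, not benchmark inputs, so the printed numbers only show how the formulas are applied.

```python
def intrinsic_delay_s(c_dev_f: float, v_dd_v: float, i_dev_a: float) -> float:
    """Intrinsic switching time t_int = C_dev * V_dd / I_dev, in seconds."""
    return c_dev_f * v_dd_v / i_dev_a


def intrinsic_energy_j(c_dev_f: float, v_dd_v: float) -> float:
    """Intrinsic switching energy E_int = C_dev * V_dd**2, in joules."""
    return c_dev_f * v_dd_v ** 2


# (Vdd in V, Ion in A/m) taken from the slide table.
DEVICES = {
    "CMOS HP": (0.73, 1805.0),
    "HJTFET": (0.3, 112.0),
}

WIDTH_M = 60e-9    # ASSUMPTION: device width, hypothetical value
C_DEV_F = 50e-18   # ASSUMPTION: device capacitance, hypothetical value

for name, (v_dd, i_on_per_m) in DEVICES.items():
    i_dev = i_on_per_m * WIDTH_M  # on-current of one device, in amperes
    t_int = intrinsic_delay_s(C_DEV_F, v_dd, i_dev)
    e_int = intrinsic_energy_j(C_DEV_F, v_dd)
    print(f"{name}: t_int = {t_int * 1e12:.3g} ps, E_int = {e_int * 1e18:.3g} aJ")
```

In the benchmarking itself these intrinsic values are only the starting point; driven capacitance, parasitics and interconnect are added on top.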
Energy-Delay of Beyond CMOS Spintronic Devices
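[Figure: device schematics. Layer labels: STO, SRO, PZT; CoFe, MgO, CoFe, IrMn, Cu. Signal labels: Iout, Vin, FE polarization.]

Current-induced spin torque (STT)
• Energy barrier: $U_b = K_u v_{nm}$
• Critical current: $I_c$, a function of $e$, $U_b$ and $P$
• Time to switch: $t_{stt}$, a function of $e$, $M_s$, $v_{nm}$, $g$, $P$, $I_c$, $U_b$ and $k_B T$
• Energy: $E_{stt} = I_{dev} V_{dd} t_{stt}$ (DOMINATES)

Voltage-driven magnetoelectric switching
• Polarization: $P_{ms} = \epsilon_0 \epsilon_{ms} E_{ms}$
• Charge: $Q_{ms}$, a function of $P_{ms}$, the width $w$ and $V_{dd}$
• Time to switch: $t_{mag}$, set by the field $B_{me}$ (DOMINATES)
• Energy: $E_{ms} = Q_{ms} V_{dd}$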
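The two energy expressions on this slide, E_stt = I_dev V_dd t_stt for spin torque and E_ms = Q_ms V_dd for magnetoelectric switching, can be compared with a minimal sketch. The supply voltages (10 mV and 100 mV) follow the typical values quoted elsewhere in this deck; the current, switching time and charge are hypothetical placeholders chosen only to show the arithmetic.

```python
def stt_energy_j(i_dev_a: float, v_dd_v: float, t_stt_s: float) -> float:
    """Spin torque switching energy E_stt = I_dev * V_dd * t_stt, in joules."""
    return i_dev_a * v_dd_v * t_stt_s


def me_energy_j(q_ms_c: float, v_dd_v: float) -> float:
    """Magnetoelectric switching energy E_ms = Q_ms * V_dd, in joules."""
    return q_ms_c * v_dd_v


# ASSUMPTION: illustrative inputs, not values from the benchmark.
I_DEV_A = 100e-6   # drive current for spin torque switching
T_STT_S = 1e-9     # spin torque switching time
Q_MS_C = 1e-16     # charge delivered to the magnetoelectric element

e_stt = stt_energy_j(I_DEV_A, 0.01, T_STT_S)   # V_dd = 10 mV
e_ms = me_energy_j(Q_MS_C, 0.1)                # V_dd = 100 mV
print(f"E_stt = {e_stt * 1e15:.3g} fJ, E_ms = {e_ms * 1e15:.3g} fJ")
```

The spin torque energy scales with how long the current has to flow, whereas the magnetoelectric energy depends only on the charge moved, which is why the two mechanisms are limited by different quantities.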
Switching Current vs. Supply Voltage
[Figure: log-log scatter plot. x-axis: Voltage, V (10^-2 to 10^0); y-axis: Current, A (10^-1 to 10^3). Devices: CMOS HP, CMOS LP, IIIvTFET, HJTFET, gnrTFET, GpnJ, SpinFET, STT/DW, SMG, STTtriad, STOlogic, ASLD, SWD, NML. Region labels: High resistance; Low resistance.]
Typical supply voltages: Spin torque 10 mV; Magnetoelectric 100 mV; Electronic 100-700 mV.
Benchmarks with spin torque
[Figure: log-log scatter plot for a 32-bit adder. x-axis: Delay, ps (10^1 to 10^6); y-axis: Energy, fJ (10^-2 to 10^3). Contours of constant Energy*Delay (E*d) from 10^-26 to 10^-21. Device groups: Spin torque; Electronics. Devices: CMOS HP, CMOS LP, IIIvTFET, HJTFET, gnrTFET, GpnJ, SpinFET, STT/DW, SMG, STTtriad, STOlogic, ASLD, SWD, NML.]
Benchmarks with magnetoelectric
[Figure: log-log scatter plot for a 32-bit adder. x-axis: Delay, ps (10^1 to 10^6); y-axis: Energy, fJ (10^-2 to 10^3). Contours of constant Energy*Delay (E*d) from 10^-26 to 10^-21. Device groups: Spin torque; Electronics; Magnetoelectric. Devices: CMOS HP, CMOS LP, IIIvTFET, HJTFET, gnrTFET, GpnJ, SpinFET, STT/DW, SMG, STTtriad, STOlogic, ASLD, SWD, NML.]
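The constant Energy*Delay contours on the energy-delay plots can be related to the axis units with a short sketch. It assumes the contour labels (10^-26 to 10^-21) are in J·s, which the slide does not state explicitly; the helper names are hypothetical. On log-log axes each contour is a straight line of slope -1.

```python
FJ = 1e-15  # joules per femtojoule
PS = 1e-12  # seconds per picosecond


def energy_delay_js(energy_fj: float, delay_ps: float) -> float:
    """Energy*delay product in J*s for a point given in plot units (fJ, ps)."""
    return (energy_fj * FJ) * (delay_ps * PS)


def energy_on_contour_fj(ed_js: float, delay_ps: float) -> float:
    """Energy in fJ at a given delay along a constant-E*d contour (ASSUMED J*s)."""
    return ed_js / (delay_ps * PS) / FJ


# A point at 100 ps and 1 fJ, and the 1e-24 J*s contour sampled at three delays.
print(energy_delay_js(1.0, 100.0))
for delay_ps in (10.0, 1e3, 1e5):
    print(delay_ps, energy_on_contour_fj(1e-24, delay_ps))
```

Points closer to the lower-left corner of the plot sit on contours with a smaller product, so a device can trade delay for energy along a contour without changing its E*d.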
Changes in Benchmarking (selected)
Devices | NRI Oct 2011 | Intel March 2012
CMOS HP | Purdue 15nm MOSFET simulated parameters, V=0.7V | Taken from ITRS, V=0.73V
STTtriad | Only spin torque switching | Possibility of magnetoelectric switching
STOlogic | In-plane magnetization | Perpendicular magnetization, optimistic inputs
STT/DW | Proponent’s old scheme, not well-founded calculations | New device scheme and architecture
ASLD | Smaller voltage. Volatile. Only intrinsic contribution. | Larger voltage. Non-volatile.
NML | Smaller size, favorable assumptions of clocking | Magnetoelectric switching, larger size, interconnect contribution
Before and After
[Figure: two log-log scatter plots for a NAND2 gate, labeled NRI Oct 2011 and Intel March 2012. x-axis: Delay, ps (10^-1 to 10^4); y-axis: Energy, fJ (10^-4 to 10^2). One legend lists CMOS HP, CMOS LP, IIIvTFET, HJTFET, gnrTFET, GpnJ, SpinFET, STT/DW, SMG, STTtriad, STOlogic, ASLD, SWD, NML; the other lists BisFET in place of SMG. Annotations on the changes: size; voltage; ITRS; magnetoelectric; perp M; scheme.]
Differences, CMOS
Values from ITRS 2011
[Figure: two panels, Before and Now.]
A. Khakifirooz et al., IEEE TED vol. 55, pp. 1391-1400, 2008 (MIT)
Differences, BisFET - TBD
UT Austin: Reddy et al., IEEE TED 57, 755 (2010)
• Vdd = 25mV
• Curve fitted to one back-of-the-envelope calculated point.
• Thermal distribution and SS not accounted for.
• Need a simulated I-V curve.
• Not ready to benchmark.
UIUC: Gilbert, IEEE TED 57, 3059 (2010)
• Vdd = 600mV
• Simulations of transport via Landauer transmission probability.
• BUT, according to the device group: the parameters may not be optimal, and the wiring is different.
Differences, ASLD
Purdue
• 10x10x1nm, 3000 spins
• Volatile, Δ=10kT
• Voltage supplied directly to the device, Vss=2mV
Intel
• 15x15x2nm, 14500 spins
• Non-volatile, Δ=65kT
• Considering the resistance of the hierarchical power and ground distribution networks, Vdd=10mV
[Figure: circuit schematics with Input and Output. Labels: VSS=2mV; Vdd=10mV; 2mV; Power and ground dist., 1000 devices.]
Charge vs. Voltage
[Figure: log-log scatter plot. Axis labels as extracted: Voltage, V (10^-2 to 10^0); Current, A (10^-1 to 10^3). Devices: CMOS HP, CMOS LP, IIIvTFET, HJTFET, gnrTFET, GpnJ, SpinFET, STT/DW, SMG, STTtriad, STOlogic, ASLD, SWD, NML. Annotations: High energy; Shot noise?; Low capacitance.]
Charge vs. Resistance
[Figure: log-log scatter plot. x-axis: Resistance, Ohm (10^2 to 10^7); y-axis: Charge, e (10^1 to 10^6). Devices: CMOS HP, CMOS LP, IIIvTFET, HJTFET, gnrTFET, GpnJ, SpinFET, STT/DW, SMG, STTtriad, STOlogic, ASLD, SWD, NML. Annotations: Larger energy*delay; Smaller energy*delay.]
Capacitance vs. Resistance
[Figure: log-log scatter plot. x-axis: Resistance, Ohm (10^2 to 10^7); y-axis: Capacitance, aF (10^0 to 10^8). Devices: CMOS HP, CMOS LP, IIIvTFET, HJTFET, gnrTFET, GpnJ, SpinFET, STT/DW, SMG, STTtriad, STOlogic, ASLD, SWD, NML. Annotations: Slower devices; Faster devices.]
Energy*Delay vs. Q^2*R
[Figure: log-log scatter plot. x-axis: Q^2 R, in units of h (10^2 to 10^10); y-axis: Energy*delay, in units of h (10^2 to 10^10). Devices: CMOS HP, CMOS LP, IIIvTFET, HJTFET, gnrTFET, GpnJ, SpinFET, STT/DW, SMG, STTtriad, STOlogic, ASLD, SWD, NML. Annotations: Switch with Beff, slower than electric; Too good to be true?]
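One way to see why Energy*Delay is plotted against Q^2*R: for a charge-based switch with E = C V^2 and t = C V / I, taking Q = C V and R = V / I gives E*t = Q^2 R. The sketch below checks this identity numerically and expresses the product in units of h, assuming that the axis unit "h" denotes Planck's constant (the slide does not spell this out). Function names and the sample inputs are hypothetical.

```python
E_CHARGE = 1.602176634e-19  # elementary charge in C (exact SI value)
H_PLANCK = 6.62607015e-34   # Planck constant in J*s (exact SI value)


def q2r_in_h(charge_e: float, resistance_ohm: float) -> float:
    """Q^2 * R in units of h, with the charge given in electron charges."""
    q = charge_e * E_CHARGE
    return q * q * resistance_ohm / H_PLANCK


def energy_delay_in_h(c_dev_f: float, v_dd_v: float, i_dev_a: float) -> float:
    """E*t in units of h for E = C*V^2 and t = C*V/I."""
    energy = c_dev_f * v_dd_v ** 2
    delay = c_dev_f * v_dd_v / i_dev_a
    return energy * delay / H_PLANCK


# ASSUMPTION: illustrative device, not a benchmark entry.
c_dev, v_dd, i_dev = 1e-16, 0.5, 1e-5
charge_e = c_dev * v_dd / E_CHARGE   # Q = C*V, in electrons
resistance = v_dd / i_dev            # R = V/I, in ohms
print(energy_delay_in_h(c_dev, v_dd, i_dev), q2r_in_h(charge_e, resistance))
```

A single electron switched through a resistance of h/e^2 (about 25.8 kOhm) gives a product of exactly one h, which sets the scale of the axes. Devices that do not switch by charging a capacitor need not fall on the E*t = Q^2 R line.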
Device vs. Circuit, Time
[Figure: log-log scatter plot. x-axis: Device Delay, ps (10^-1 to 10^4); y-axis: Adder/Device Delay (10^1 to 10^3). Devices: CMOS HP, CMOS LP, IIIvTFET, HJTFET, gnrTFET, GpnJ, SpinFET, STT/DW, SMG, STTtriad, STOlogic, ASLD, SWD, NML. Annotations: Majority gates; Fast devices; Fast circuits.]
• Majority gates => faster circuits
Device vs. Circuit, Energy
[Figure: log-log scatter plot. x-axis: Device Energy, fJ; y-axis: Adder/Device Energy. Devices: CMOS HP, CMOS LP, IIIvTFET, HJTFET, gnrTFET, GpnJ, SpinFET, STT/DW, SMG, STTtriad, STOlogic, ASLD, SWD, NML. Annotations: Low energy devices; Low energy circuits.]
• More devices (STTtriad, STT/DW) => less efficient
Device vs. Circuit, Energy*Delay
[Figure: log-log scatter plot. x-axis: Device E*d, fJ × ps (10^-4 to 10^4); y-axis: Adder/Device E*d (10^3 to 10^6). Devices: CMOS HP, CMOS LP, IIIvTFET, HJTFET, gnrTFET, GpnJ, SpinFET, STT/DW, SMG, STTtriad, STOlogic, ASLD, SWD, NML. Annotations: Efficient devices; Efficient circuits.]
• Fewer elements => efficient circuits
Switching time and energy, closer look
[Figure: log-log scatter plot for a 32-bit adder. x-axis: Delay, ps (10^1 to 10^6), from Fast to Slow; y-axis: Energy, fJ (10^-2 to 10^3), from Better to Worse. Devices: CMOS HP, CMOS LP, IIIvTFET, HJTFET, gnrTFET, GpnJ, SpinFET, STT/DW, SMG, STTtriad, STOlogic, ASLD, SWD, NML. Annotations: Limited by capacitor charging; Steep turn-on/off (TFETs); Limited by spin dynamics; Magnetoelectric; Potentially nonvolatile.]
Energy Aware Figure of Merit: Throughput with Capped Power
New FOM = Throughput with Capped Power
“Computational throughput with capped power measured as Operations per second per logic die area measures how useful a computer is, in a power constrained computing environment.”
Choose 10 W/cm2* as the cap.
If a circuit exceeds the cap, throughput is re-scaled by the same factor as the power, which corresponds to either
i. less dense circuits, or
ii. slower circuits.
* Clocking and long interconnect dissipation are not included.
Throughput @ Capped Power = Switching Operations/Area/Time
T@CP Units = [Operations/s/cm2]
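The capped-power figure of merit can be sketched as follows. This is a minimal reading of the slide, under the assumptions that each circuit completes one operation per delay period, that all circuits are active, and that exceeding the cap scales throughput and power down by the same factor; the function name and the sample numbers are hypothetical.

```python
POWER_CAP_W_PER_CM2 = 10.0  # cap chosen on the slide


def throughput_at_capped_power(
    area_cm2: float,
    delay_s: float,
    energy_j: float,
    cap_w_per_cm2: float = POWER_CAP_W_PER_CM2,
) -> tuple[float, float]:
    """Return (throughput in ops/s/cm^2, power density in W/cm^2)."""
    throughput = 1.0 / (area_cm2 * delay_s)   # operations / area / time
    power = energy_j * throughput             # energy per op times ops/s/cm^2
    if power > cap_w_per_cm2:
        # Rescale throughput by the same factor as power (sparser or slower).
        throughput *= cap_w_per_cm2 / power
        power = cap_w_per_cm2
    return throughput, power


# ASSUMPTION: a 1 um^2 circuit (1e-8 cm^2) with 1 ns delay and 1 fJ per operation.
print(throughput_at_capped_power(1e-8, 1e-9, 1e-15))
# Same circuit at 10 aJ per operation stays under the cap.
print(throughput_at_capped_power(1e-8, 1e-9, 1e-17))
```

Under these assumptions a low-energy circuit keeps its full throughput, while a high-energy one is throttled in proportion to how far it overshoots the cap.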
Throughput and Power Comparison
[Figure: log-log scatter plot for a 32-bit adder. x-axis: Throughput, PetaIntegerOps/s/cm^2 (10^-2 to 10^1), from worse to better; y-axis: Power, W/cm^2 (10^-2 to 10^2). Devices: CMOS HP, CMOS LP, IIIvTFET, HJTFET, gnrTFET, GpnJ, SpinFET, STT/DW, SMG, STTtriad, STOlogic, ASLD, SWD, NML. Annotations: Limited by power dissipation; Energy efficient, lower voltage.]
* Cap for power is 10 W/cm2; circuits above it are slowed down.
• SWD, HJTFET = high throughput, low power
Insights of Benchmarking
Practical considerations
1. Area is determined by metal pitch (4F) to connect to the device [terminal connections]
2. Parasitics can overwhelm the intrinsic device attributes.
3. Majority gates permit more compact, faster circuits
4. Power delivery dominates for very low-voltage devices
Results
1. Spintronic devices are dominated by either
– switching energy (spin torque), or
– magnetization switching speed (magnetoelectric).
2. Charge-based devices are an attractive option: good E*d, compatible with CMOS circuits
3. Spintronic devices still competitive on throughput at low power