ModSim 2021
Sandia National Laboratories is a multimission laboratory managed and operated by National Technology and Engineering Solutions of Sandia LLC, a wholly owned subsidiary of Honeywell International Inc., for the U.S. Department of Energy’s National Nuclear Security Administration under contract DE-NA0003525.
T. Patrick Xiao, Christopher H. Bennett, Ben Feinberg, Matthew Marinella, Sapan Agarwal
Sandia National Laboratories, Albuquerque, [email protected]
CrossSim: GPU-Accelerated Simulation of Analog Neural Networks
Deep learning inside memory arrays
[Figure: a 3×3 crossbar array performing two operations, each shown in electrical and mathematical form]
Matrix-vector multiplication: A^T x
• Electrical: voltages V_i applied to conductances G_i,j give column outputs Σ_i G_i,j V_i (j = 1, 2, 3)
• Mathematical: inputs x_i applied to weights A_i,j give Σ_i A_i,j x_i (j = 1, 2, 3)
Outer product update: x δ^T
• Electrical: signals V_i and T_j update cell (i, j) by V_i T_j
• Mathematical: inputs x_i and errors δ_j update element (i, j) by x_i δ_j
Highly energy-efficient, but is it accurate enough?
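The two crossbar operations on this slide can be sketched in a few lines of plain Python. This is an illustration of the math only, not CrossSim code: the function names `crossbar_mvm` and `outer_product_update` are hypothetical, and the example values are made up.

```python
def crossbar_mvm(G, V):
    """Column outputs of the array: out[j] = sum_i G[i][j] * V[i]."""
    n_rows, n_cols = len(G), len(G[0])
    return [sum(G[i][j] * V[i] for i in range(n_rows)) for j in range(n_cols)]


def outer_product_update(G, x, delta):
    """Return a new matrix whose cell (i, j) is G[i][j] + x[i] * delta[j]."""
    return [[G[i][j] + x[i] * delta[j] for j in range(len(delta))]
            for i in range(len(x))]


# Made-up 3x3 conductance matrix and input vector.
G = [[1.0, 2.0, 3.0],
     [4.0, 5.0, 6.0],
     [7.0, 8.0, 9.0]]
V = [1.0, 0.0, -1.0]
currents = crossbar_mvm(G, V)

# Made-up error vector for the rank-1 update x delta^T.
delta = [1.0, 2.0, 3.0]
G_updated = outer_product_update(G, V, delta)
```

Every cell contributes to the multiplication in parallel, and every cell is updated in parallel by the outer product, which is where the energy efficiency of the analog approach comes from.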
[Diagram: inference with CrossSim]
Neural network: dataset, neural network model, weight matrix mapping engine, layer inputs, layer outputs, bias addition, activation function, prediction
Analog MVM cores (Core 0, Core 1, Core 2): row drivers/DAC, integrator, ADC, partial result aggregation
Signals between them: weight data; partial sums, activations
MVM simulator
Array and device non-idealities
• Device program errors
• Array parasitic resistance with built-in fast circuit simulator
• ADC quantization
• Cycle-to-cycle read noise
• Conductance drift
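Two of the non-idealities listed above, cycle-to-cycle read noise and ADC quantization, can be sketched as follows. This is a minimal illustration and not CrossSim's implementation: the names `adc_quantize` and `noisy_mvm` are hypothetical, the read noise is assumed to be additive Gaussian noise on each conductance, and the ADC is assumed to be a uniform quantizer with clipping.

```python
import random


def adc_quantize(value, lo, hi, bits):
    """Clip value to [lo, hi] and round it to one of 2**bits evenly spaced levels."""
    n_steps = 2 ** bits - 1
    step = (hi - lo) / n_steps
    clipped = min(max(value, lo), hi)
    return lo + round((clipped - lo) / step) * step


def noisy_mvm(G, V, read_noise_sigma, adc_range, adc_bits, rng):
    """MVM with fresh Gaussian noise on every cell, then ADC quantization per column."""
    lo, hi = adc_range
    outputs = []
    for j in range(len(G[0])):
        # Assumption: noise is additive and independent of the conductance value.
        current = sum((G[i][j] + rng.gauss(0.0, read_noise_sigma)) * V[i]
                      for i in range(len(G)))
        outputs.append(adc_quantize(current, lo, hi, adc_bits))
    return outputs


# Made-up conductances (normalized units) and inputs.
G = [[0.2, 0.8],
     [0.4, 0.1]]
V = [1.0, 1.0]

ideal = noisy_mvm(G, V, 0.0, (0.0, 1.0), 2, random.Random(0))   # quantization only
noisy = noisy_mvm(G, V, 0.05, (0.0, 1.0), 2, random.Random(0))  # noise + quantization
```

With a 2-bit ADC the exact column sums 0.6 and 0.9 are rounded to the nearest multiple of 1/3, which shows how ADC resolution and range interact with the analog result.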
Characterized device data
• Program errors, process variation
• Retention loss
• Noise properties
• On/Off ratio
[Plot: number of devices vs. conductance G]
System parameters
• Weight bit slicing
• Input bit slicing
• Negative number handling
• ADC resolution & ranges
• Array size
• Array electrical topology
…
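As an illustration of the weight bit slicing parameter, the sketch below splits integer weights into low-precision slices, multiplies each slice separately, and recombines the partial results with shift-and-add. It assumes non-negative integer weights (negative number handling is a separate parameter) and ideal arithmetic; the function names are hypothetical and do not come from CrossSim.

```python
def slice_weights(W, total_bits, bits_per_cell):
    """Split each non-negative integer weight into slices, least significant first."""
    n_slices = -(-total_bits // bits_per_cell)  # ceiling division
    mask = (1 << bits_per_cell) - 1
    return [[[(w >> (s * bits_per_cell)) & mask for w in row] for row in W]
            for s in range(n_slices)]


def mvm(W, x):
    """out[j] = sum_i W[i][j] * x[i]."""
    return [sum(W[i][j] * x[i] for i in range(len(W))) for j in range(len(W[0]))]


def sliced_mvm(W, x, total_bits, bits_per_cell):
    """Run one MVM per slice and recombine the partial sums by shift-and-add."""
    result = [0] * len(W[0])
    for s, W_s in enumerate(slice_weights(W, total_bits, bits_per_cell)):
        weight_of_slice = 1 << (s * bits_per_cell)
        partial = mvm(W_s, x)
        result = [r + p * weight_of_slice for r, p in zip(result, partial)]
    return result


# Made-up 8-bit weights stored as four 2-bit slices.
W = [[200, 17],
     [3, 255]]
x = [2, -1]
direct = mvm(W, x)
sliced = sliced_mvm(W, x, total_bits=8, bits_per_cell=2)
```

In this ideal setting the sliced and direct results agree exactly; in an analog system each slice would pass through its own array and ADC, so the choice of bits per cell trades device precision against the number of arrays.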
Inputs to CrossSim
To be released soon! Check cross-sim.sandia.gov
Python with CUDA acceleration
Multi-scale modeling of inference accuracy
Device properties affect accuracy
• State-independent programming error: ΔG = α_ind G_max
• State-proportional programming error: ΔG = α_prop G
[Plots: number of devices vs. G for each error model; accuracy vs. error α_ind (%) and error α_prop (%) for ResNet50, MobileNet-v1, Inception-v3, VGG-19]
Array design affects accuracy
• Parasitic resistance R_p, modeled with CrossSim's fast built-in circuit simulator
[Plot: accuracy vs. parasitic resistance R_p / R_min from 10^-5 to 10^-2 for MNIST (CNN-6), CIFAR-10 (ResNet56), ImageNet (ResNet50)]
System architecture affects accuracy
• Offset subtraction: W_ij ~ G_ij - G_offset
• Differential cells: W_ij ~ G_ij^+ - G_ij^-
[Plots: ResNet50 accuracy vs. state-proportional error α_prop (%), with legends 8, 4, 2 bits/cell and 7, 4, 2 bits/cell]
Xiao et al, arXiv:2109.01262, 2021
Xiao et al, Semi Sci Tech, Accepted (in press), 2021
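The two programming error models and the two weight mappings on this slide can be sketched as below. Several choices here are assumptions made for illustration and are not taken from the slide: ΔG is treated as the standard deviation of a Gaussian error, G_offset is placed at mid-range, and the differential mapping leaves one cell of each pair at G_min. All function names are hypothetical.

```python
import random


def program_with_error(G_target, G_max, alpha, model, rng):
    """Return G_target plus a Gaussian programming error.

    model="independent":  sigma = alpha * G_max     (same for every state)
    model="proportional": sigma = alpha * G_target  (scales with the state)
    """
    if model == "independent":
        sigma = alpha * G_max
    elif model == "proportional":
        sigma = alpha * G_target
    else:
        raise ValueError("model must be 'independent' or 'proportional'")
    return G_target + rng.gauss(0.0, sigma)


def map_offset(w, w_max, G_min, G_max):
    """Offset subtraction: one cell per weight, W ~ G - G_offset."""
    G_offset = 0.5 * (G_min + G_max)  # assumption: offset at mid-range
    G = G_offset + 0.5 * (G_max - G_min) * (w / w_max)
    return G, G_offset


def map_differential(w, w_max, G_min, G_max):
    """Differential cells: two cells per weight, W ~ G_plus - G_minus."""
    magnitude = (G_max - G_min) * abs(w) / w_max
    if w >= 0:
        return G_min + magnitude, G_min
    return G_min, G_min + magnitude


rng = random.Random(0)
# A zero weight sits at mid-range under offset subtraction ...
G_zero, G_off = map_offset(0.0, 1.0, 0.0, 1.0)
# ... but at the lowest conductance in both cells of a differential pair.
G_plus, G_minus = map_differential(0.0, 1.0, 0.0, 1.0)
# Under the proportional model, a cell at zero conductance has zero error spread.
programmed = program_with_error(G_plus, 1.0, 0.05, "proportional", rng)
```

The sketch shows why the mapping and the error model interact: the same weight value lands at different conductances under the two mappings, so a state-dependent error affects them differently.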
[Diagram: training with CrossSim]
Neural network: activations x, loss function error, backpropagation, errors δ, …
Analog MVM cores (Core 0, Core 1, Core 2): row drivers/DAC, column drivers/DAC, integrator, ADC
Device lookup tables (LUT 0+, LUT 1, LUT 2)
• Device pulse data → probabilistic device lookup table (CDF)
• Initial conductance, desired update → realistic conductance update
[Plot: conductance change (μS) vs. initial conductance (μS)]
Lookup tables can model:
• Arbitrary device update nonlinearity and asymmetry properties, not describable by analytical equations
• Cycle-to-cycle write noise
• Device-to-device variation
Fuller et al, Science 2019
Bennett et al, IRPS 2019
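The idea of a probabilistic lookup table can be sketched as inverse-CDF sampling: for a given initial conductance, draw the conductance change from the distribution measured at that state. The table below is entirely made up, the function names are hypothetical, and the sketch handles only a single pulse type with nearest-neighbor lookup; it is not CrossSim's table format and does not model the desired-update input.

```python
import bisect
import random

# Hypothetical table: for each initial conductance (uS), the sorted conductance
# changes (uS) observed for one "increase" pulse. All values are invented.
G_GRID = [10.0, 20.0, 30.0]
DELTA_G = [
    [0.8, 1.0, 1.2, 1.5],
    [0.4, 0.6, 0.7, 0.9],
    [0.0, 0.1, 0.2, 0.4],
]


def nearest_row(g):
    """Index of the grid point closest to conductance g."""
    return min(range(len(G_GRID)), key=lambda k: abs(G_GRID[k] - g))


def sample_update(g, rng):
    """Draw one conductance change from the empirical CDF of the nearest row."""
    samples = DELTA_G[nearest_row(g)]
    cdf = [(k + 1) / len(samples) for k in range(len(samples))]
    u = rng.random()                     # uniform in [0, 1)
    idx = bisect.bisect_left(cdf, u)     # first entry with cdf >= u
    return samples[min(idx, len(samples) - 1)]


def apply_pulses(g, n_pulses, rng):
    """Apply n_pulses sampled updates in sequence, starting from conductance g."""
    for _ in range(n_pulses):
        g += sample_update(g, rng)
    return g


rng = random.Random(0)
one_step = sample_update(10.0, rng)
g_final = apply_pulses(10.0, 5, random.Random(1))
```

Because the update is drawn from measured data at each state, the same code path captures nonlinearity (rows differ), asymmetry (a second table for "decrease" pulses could be added), and write noise (spread within a row) without an analytical device model.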
Python with CUDA acceleration
From device measurements to accuracy
[Figure: for each device below, the device, its pulse data, and MNIST accuracy (2-layer MLP)]
• TaOx ReRAM (Marinella, Agarwal et al, JETCAS 2018)
• Electrochemical RAM (Van der Burgt et al, Nature Materials 2017)
• Domain wall magnetic tunnel junction (Liu et al, Appl Phys Lett, 2021)