Jan 5-9, 2009 VLSID'2009
Soft Error Rates with Inertial and Logical Masking
Fan Wang* and Vishwani D. Agrawal
[email protected]
Department of Electrical and Computer Engineering
Auburn University, AL 36849 USA
22nd IEEE International Conference on VLSI Design
*Presently with Juniper Networks, Inc., Sunnyvale, CA
Outline
Background
Problem Statement
Analysis
Results and Discussion
Conclusion
Motivation for This Work
With the continuous downscaling of CMOS technologies, device reliability has become a major bottleneck.
The sensitivity of electronic systems can potentially become a major cause of soft (non-permanent) failures.
The determination of soft error rate in logic circuits is a complex problem. It is necessary to analyze circuit reliability. However, there is no comprehensive work that considers all the factors that influence the soft error rate.
Strike Changes State of a Single Bit
[Figure: bit state 0 / 1]
Definition from NASA Thesaurus:
“Single Event Upset (SEU): Radiation-induced errors in microelectronic circuits caused when charged particles [also, high energy particles] (usually from the radiation belts or from cosmic rays) lose energy by ionizing the medium through which they pass, leaving behind a wake of electron-hole pairs.”
Cosmic Rays
[Figure: cosmic-ray cascade of protons (p) and neutrons (n) reaching Earth's surface]
Neutron flux is dependent on altitude, longitude, solar activity etc.
Source: Ziegler et al.
Problem Statement
Given background environment data:
  Neutron flux
  Background energy (LET*) distribution
  (These two factors are location dependent.)
Given circuit characteristics:
  Technology
  Circuit netlist
  Circuit node sensitive region data
  (These three factors depend on the circuit.)
Estimate the neutron-caused soft error rate in standard FIT** units.
*Linear Energy Transfer (LET) is a measure of the energy transferred to the device per unit length as an ionizing particle travels through material. Unit: MeV-cm²/mg.
**Failures In Time (FIT): Number of failures per 10^9 device hours.
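The FIT unit defined above can be illustrated with a short calculation. The following Python sketch is only an illustration of the unit conversion; the function names and the sample numbers are hypothetical and do not come from the presentation.

```python
def fit_rate(failures: float, devices: int, hours: float) -> float:
    """Return the failure rate in FIT (failures per 1e9 device-hours).

    Hypothetical helper: 'failures' is the number of observed soft errors,
    'devices' the number of devices observed, 'hours' the time per device.
    """
    device_hours = devices * hours
    if device_hours <= 0:
        raise ValueError("device-hours must be positive")
    return failures / device_hours * 1e9


def fit_to_mean_hours_between_failures(fit: float) -> float:
    """Convert a FIT value to the mean number of device-hours per failure."""
    if fit <= 0:
        raise ValueError("FIT must be positive")
    return 1e9 / fit


# Made-up example: 2 failures seen across 1,000 devices, each run 10,000 hours.
example_fit = fit_rate(failures=2, devices=1000, hours=10_000)
print(f"{example_fit:.1f} FIT")
```

A rate quoted in FIT is therefore independent of how many devices were observed, which is what makes estimates for different circuits comparable.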
Measured Environmental Data
Typical ground-level neutron flux: 56.5 cm^-2 s^-1.
  J. F. Ziegler, “Terrestrial cosmic rays,” IBM Journal of Research and Development, vol. 40, no. 1, pp. 19-39, 1996.
Particle energy distribution at ground level:
  “For both 0.5 μm and 0.35 μm CMOS technology at ground level, the largest population has an LET of 20 MeV-cm²/mg or less. Particles with energy greater than 30 MeV-cm²/mg are exceedingly rare.”
  K. J. Hass and J. W. Gambles, “Single Event Transients in Deep Submicron CMOS,” Proc. 42nd Midwest Symposium on Circuits and Systems, vol. 1, 1999.
[Figure: probability density versus linear energy transfer (LET), MeV-cm²/mg; LET axis marked at 0, 15, and 30]
Proposed Soft Error Model
[Figure: proposed soft error model; the only recoverable label is “Occurrence rate”]
Pulse Width Probability Density Propagation
[Figure: gate with delay τp, input pulse width X with density fX(x), and output pulse width Y with density fY(y); plot of Y versus X with breakpoints at τp and 2τp]
We use a “3-interval piecewise linear” propagation model:
1) Non-propagation, if X ≤ τp.
2) Propagation with attenuation, if τp < X < 2τp.
3) Propagation with no attenuation, if X ≥ 2τp.
Where
1) X: input pulse width
2) Y: output pulse width
3) τp: gate input-to-output delay
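The three-interval model can be sketched in Python as follows. The slides do not give the formula for the attenuation interval, so this sketch assumes a straight line from (τp, 0) to (2τp, 2τp), that is Y = 2(X − τp), which keeps the curve continuous at both breakpoints. The function names are hypothetical.

```python
from typing import Iterable


def propagate_width(x: float, tau_p: float) -> float:
    """Output pulse width Y for input pulse width X at a gate with delay tau_p."""
    if x <= tau_p:
        # Interval 1: the pulse is too narrow and does not propagate.
        return 0.0
    if x < 2.0 * tau_p:
        # Interval 2: attenuated propagation.
        # ASSUMPTION: linear ramp Y = 2(X - tau_p), not stated in the slides.
        return 2.0 * (x - tau_p)
    # Interval 3: the pulse propagates with its width unchanged.
    return x


def propagate_through_chain(x: float, delays: Iterable[float]) -> float:
    """Apply the model gate by gate along a path with the given gate delays."""
    width = x
    for tau_p in delays:
        width = propagate_width(width, tau_p)
    return width


# Hypothetical example: two gates with a 20 ps delay each.
for width_ps in (10.0, 30.0, 50.0):
    print(width_ps, "->", propagate_through_chain(width_ps, [20.0, 20.0]))
```

Under this assumption a 30 ps pulse shrinks to 20 ps at the first gate and is then blocked by the second, while a 50 ps pulse passes both gates unchanged.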
Probability Transformation
Consider random variables X and Y, and a function Y = F(X).
Given: the P.D.F. of X is p(x).
P.D.F. of Y: p(x)dx = p(y)dy, so p(y) = p(x)/(dy/dx).
[Figure: curve Y = F(X); the interval (x, x+dx) on the X axis maps to (y, y+dy) on the Y axis]
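As a worked illustration of the rule p(y) = p(x)/(dy/dx), the Python sketch below pushes an input pulse-width density through the 3-interval piecewise linear model. It assumes the attenuation interval is the line Y = 2(X − τp) (slope 2), which the slides do not state, and it uses a made-up uniform input density and a made-up delay of 20 ps. All names are hypothetical.

```python
from typing import Callable

Density = Callable[[float], float]


def output_density(f_x: Density, tau_p: float) -> Density:
    """Continuous part of the output-width density for y > 0.

    Pulses with X <= tau_p are masked, so they contribute a probability
    mass at Y = 0 that is not part of this continuous density.
    """
    def f_y(y: float) -> float:
        if y <= 0.0:
            return 0.0
        if y < 2.0 * tau_p:
            # ASSUMED attenuation branch: y = 2(x - tau_p), so
            # x = tau_p + y/2 and dy/dx = 2.
            return f_x(tau_p + y / 2.0) / 2.0
        # No-attenuation branch: y = x, so dy/dx = 1.
        return f_x(y)
    return f_y


def integrate(f: Density, a: float, b: float, steps: int = 3000) -> float:
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / steps
    return sum(f(a + (i + 0.5) * h) for i in range(steps)) * h


TAU_P = 20.0  # hypothetical gate delay in ps


def uniform_width_density(x: float) -> float:
    """Made-up input density: widths uniform on [0, 3 * TAU_P]."""
    return 1.0 / (3.0 * TAU_P) if 0.0 <= x <= 3.0 * TAU_P else 0.0


# Probability that the pulse is masked at the gate (X <= tau_p).
p_masked = integrate(uniform_width_density, 0.0, TAU_P)
# Probability that some pulse appears at the output (Y > 0).
p_survive = integrate(output_density(uniform_width_density, TAU_P), 0.0, 3.0 * TAU_P)
print(p_masked, p_survive, p_masked + p_survive)
```

The masked mass and the surviving density together should account for all of the input probability, which is a quick sanity check on any transformed density.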
Validation Using HSPICE Simulation
CMOS inverter in TSMC035 technology with load capacitance 10 fF
Comparing Methods
Factors considered      LET spec.  Reconv. fanout  Sensi. region  Occur. rate  Vectors?  Altitude  Ckt. tech.  SET degradation
Our work                Yes        No              Yes            Yes          No        Yes       Yes         Yes
Rao et al. [1]          Yes        No              No             No           Yes       Yes       Yes         Yes
Rajaraman et al. [2]    No         No              No             No           Yes       No        No          Yes
Asadi-Tahoori [3]       No         No              No             Yes          No        No        No          No
Zhang-Shanbhag [4]      Yes        No              Yes            Yes          Yes       Yes       Yes         No
Rejimon-Bhanja [5]      No         No              No             Yes          Yes       No        No          No
Experimental Results Comparison
                          Our approach           Rao et al. [1]          Rajaraman et al. [2]
Ckt    #PI  #PO  #Gates   CPU (s)  FIT           CPU (s)  FIT            CPU (min)  Error prob.
C432   36   7    160      0.04     1.18x10^3     <0.01    1.75x10^-5     108        0.0725
C499   41   32   202      0.14     1.41x10^3     0.01     6.26x10^-5     216        0.0041
C880   60   26   383      0.08     3.86x10^3     0.01     6.07x10^-5     102        0.0188
C1908  33   25   880      1.14     1.63x10^4     0.01     7.50x10^-5     1073       0.0011
Computing platform        Sun Fire 280R          Pentium 2.4 GHz         Sun Fire v210
Circuit technology        TSMC035                Std. 0.13 µm            70 nm BPTM*
Altitude                  Ground                 Ground                  N/A
*BPTM: Berkeley Predictive Technology Model
More Result Comparison
Measured data:
Devices                          SER* (FIT/Mbit)
0.13 µm SRAMs [6]                10,000 to 100,000
SRAMs, 0.25 µm and below [7]     10,000 to 100,000
1 Gbit memory in 0.25 µm [8]     4,200
Logic circuit SER estimation, ground level:
Our work                         1,000 to 20,000
Rao et al. [1]                   1x10^-5 to 8x10^-5
* The altitude is not mentioned for these data.
Circuit Topology and SER
Circuit topology influences the logic SER.
We have analyzed two types of circuits of different sizes: an inverter chain and a ripple carry adder.
For the inverter chain in TSMC035 technology, the critical width is between 25 ps and 50 ps.
For the ripple carry adder, the critical width may not exist.
Inverter Chain and SER
Ripple Carry Adder and SER
Conclusion
SER in logic and memory chips will continue to increase as devices become more sensitive to soft errors at sea level.
By modeling soft errors with two parameters, the occurrence rate and the single event transient pulse width density, we effectively account for electrical masking in the circuit.
Our research on the critical width of SER for different circuit topologies may provide better insights for soft error protection schemes.
References
[1] R. R. Rao, K. Chopra, D. Blaauw, and D. Sylvester, “An Efficient Static Algorithm for Computing the Soft Error Rates of Combinational Circuits,” Proc. Design Automation and Test in Europe, 2006, pp. 164-169.
[2] R. Rajaraman, J. S. Kim, N. Vijaykrishnan, Y. Xie, and M. J. Irwin, “SEAT-LA: A Soft Error Analysis Tool for Combinational Logic,” Proc. 19th International Conference on VLSI Design, 2006, pp. 499-502.
[3] G. Asadi and M. B. Tahoori, “An Accurate SER Estimation Method Based on Propagation Probability,” Proc. Design Automation and Test in Europe Conf., 2005, pp. 306-307.
[4] M. Zhang and N. R. Shanbhag, “A Soft Error Rate Analysis (SERA) Methodology,” Proc. IEEE/ACM International Conference on Computer Aided Design, 2004, pp. 111-118.
[5] T. Rejimon and S. Bhanja, “An Accurate Probabilistic Model for Error Detection,” Proc. 18th International Conference on VLSI Design, 2005, pp. 717-722.
[6] J. Graham, “Soft Errors a Problem as SRAM Geometries Shrink,” http://www.ebnews.com/story/OEG20020128S0079, ebn, 28 Jan 2002.
[7] W. Leung, F.-C. Hsu, and M. E. Jones, “The Ideal SoC Memory: 1T-SRAM™,” Proc. 13th Annual IEEE International ASIC/SOC Conference, 2000, pp. 32-36.
[8] “Soft Errors in Electronic Memory-A White Paper,” Technical report, Tezzaron Semiconductor, 2004.
[9] F. Wang, “Soft Error Rate Determination for Nanometer CMOS VLSI Circuits,” Master’s Thesis, Auburn University, Electrical and Computer Engineering, May 2008.
Thank You . . .