TMR Schemes
Melanie Berg
MEI Technologies/NASA GSFC
[email protected]@NASA.gov
Voting Matrix
European Space Agency FPGA Tool Workshop. Noordwijk, NL; Melanie Berg
Overview
Premise: Why do various FPGAs require separate mitigation strategies?
• Radiation Effects in FPGA devices
• Mitigation and Actel Anti-fuse Devices
• Mitigation and Xilinx Virtex Devices
• Tools
Radiation Effects in FPGA devices
• Single Event Transients (SETs)
• Single Event Upsets (SEUs)
• Single Event Functional Interrupts (SEFIs)
Single Event Effects (SEEs) and IC System Error
SEUs or SETs can occur in:
• Combinatorial Logic
• Sequential Logic
• Configuration Memory Cells
Depending on the device and the design, each fault type will:
• Have a probability of occurrence
• Make either a significant or an insignificant contribution to system error
Every device has different error responses. We must understand the differences and design appropriately.
Combinatorial Logic Blocks and Potential Upsets: SETs in Anti-fuse FPGAs
Basic Combinatorial Logic Blocks and Potential Upsets
[Figure: combinatorial logic block with two fault types. TRANSIENT: probability PSET. STUCK UNTIL OVERWRITTEN: probability of configuration fault, PConfiguration.]
DFFs: SEUs and SEFIs
[Figure: DFF with D, Q, CLK, and reset pins, showing a strike caught in the storage loop. Labels: probability of SEU, PDFFSEU; probability of SEFI, PSEFI.]
Transient Capture on a DFF Data Input Pin (SET→SEU)
[Figure: DFF (D, Q, SET, CLR) with a SET pulse of width Tpulse arriving at D against a clock of period tp = 1/fs; axis labels P(fs)SET→SEU and fs.]
$P(fs)_{SET \to SEU} \propto T(fs)_{pulse} \cdot P(fs)_{SETgen} \cdot P(fs)_{SETprop} \cdot P_{DFFEn}$
fs : System Frequency
T(fs)pulse : SET Pulse Width
P(fs)SETgen: Probability SET generated with sufficient amplitude
P(fs)SETprop : Probability SET can propagate with sufficient amplitude
PDFFEn : Probability DFF is enabled (active)
P(fs)SET→SEU : Probability SET can be caught by clock edge
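The frequency dependence of SET capture can be illustrated with a first-order sketch. It assumes a pulse that arrives at a uniformly random time within a clock period tp = 1/fs, so the chance that it overlaps the capturing edge is roughly Tpulse/tp = Tpulse × fs. That overlap model, the function names, and the example values are illustrative assumptions, not taken from the slides; setup/hold windows and any dependence of the pulse width on fs are ignored.

```python
def capture_fraction(t_pulse_s: float, fs_hz: float) -> float:
    """Fraction of the clock period tp = 1/fs covered by a pulse of width t_pulse_s."""
    if t_pulse_s < 0 or fs_hz <= 0:
        raise ValueError("pulse width must be >= 0 and frequency must be > 0")
    # Tpulse / tp == Tpulse * fs, capped at 1 (pulse longer than a period)
    return min(1.0, t_pulse_s * fs_hz)


def p_set_to_seu(t_pulse_s: float, fs_hz: float,
                 p_set_gen: float, p_set_prop: float, p_dff_en: float) -> float:
    """Relative SET->SEU probability: capture fraction times the generation,
    propagation, and DFF-enable probabilities (all assumed independent)."""
    return capture_fraction(t_pulse_s, fs_hz) * p_set_gen * p_set_prop * p_dff_en


if __name__ == "__main__":
    # Hypothetical 1 ns pulse evaluated at 15 MHz and 120 MHz
    for fs in (15e6, 120e6):
        print(f"{fs / 1e6:.0f} MHz: {p_set_to_seu(1e-9, fs, 1.0, 1.0, 1.0):.3f}")
```

Under these assumptions the capture probability grows linearly with fs until the pulse is as wide as the clock period.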
Frequency Effects and Conventional DFF Upset Theory
$P_{DFF}(fs)_{error} \propto P_{DFFSEU} + P(fs)_{SET \to SEU} + P_{DFFMBU}$
[Figure: DFF error (composite cross section) versus frequency, with curves labeled PDFFSEU & PDFFMBU, P(fs)SET→SEU, and PDFF(fs)error; one annotation reads ~0.]
Summary: Most Significant Factors of System Error Probability P(fs)error
$P(fs)_{error} \propto P_{Configuration} + P_{DFFSEU} + P(fs)_{SET \to SEU} + P_{SEFI}$
• PConfiguration: configuration (SRAM-based FPGAs)
• PDFFSEU: DFFs, static SEU
• P(fs)SET→SEU: DFFs, dynamic SET→SEU
• PSEFI: SEFIs (clocks & resets, inaccessible control circuitry)
Reducing System Error: Common Mitigation Techniques
Mitigation can be:
• Embedded: built into the device library cells. The user does not verify the mitigation; the manufacturer does.
• User inserted: part of the actual design process. The user must verify the mitigation. Complexity is a RISK!
Common Mitigation Types:
• Local Triple Modular Redundancy (LTMR)
• Global Triple Modular Redundancy (GTMR)
$P(fs)_{error} \propto P_{Configuration} + P_{DFFSEU} + P(fs)_{SET \to SEU} + P_{SEFI}$
Example Mitigation Schemes Will Use Majority Voting
I0 | I1 | I2 | Majority Voter
0 | 0 | 0 | 0
0 | 0 | 1 | 0
0 | 1 | 0 | 0
0 | 1 | 1 | 1
1 | 0 | 0 | 0
1 | 0 | 1 | 1
1 | 1 | 0 | 1
1 | 1 | 1 | 1
$MajorityVoter = (I_1 \land I_2) \lor (I_0 \land I_2) \lor (I_0 \land I_1)$
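The voter can be checked against its truth table with a few lines of Python. This is a minimal sketch; the function name `majority` is an illustrative choice, not something named in the slides.

```python
from itertools import product


def majority(i0: int, i1: int, i2: int) -> int:
    """Two-out-of-three vote on single-bit inputs (each 0 or 1)."""
    return (i1 & i2) | (i0 & i2) | (i0 & i1)


if __name__ == "__main__":
    # Enumerate all eight input combinations in truth-table order
    for bits in product((0, 1), repeat=3):
        print(*bits, majority(*bits))
```

Any single wrong input is outvoted by the other two; two wrong inputs are not.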
Mitigation and Actel Anti-fuse Devices
ACTEL RTAX-S Architecture Basics
Embedded RHBD:
• Hardened global clocks and resets
• Antifuse configuration is SEU immune
• Embedded Localized TMR (LTMR) at each DFF (R-CELL)
Super Cluster:
• Combinatorial cells: C-CELLs
• DFF cells: R-CELLs
Source: RTAX-S/SL RadTolerant FPGAs 2009 Actel.com
Local Triple Modular Redundancy (LTMR): Smallest Area & Power
• Triple each DFF + vote
• Data paths are not redundant; can only have one voter
Unprotected:
• Clocks and resets: SEFI
• Transients (SET→SEU)
• Internal/hidden device logic: SEFI
$P(fs)_{error} \propto P_{Configuration} + P_{DFFSEU} + P(fs)_{SET \to SEU} + P_{SEFI}$
[Slide annotations on the equation terms: Low, Non-Mitigated, Mitigated]
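A behavioral sketch can show why LTMR masks a flip in one flip-flop but not a transient on the shared data path. The class and method names below are hypothetical, and the model is cycle-level only; it is not the RTAX-S R-CELL implementation.

```python
def majority(a: int, b: int, c: int) -> int:
    """Two-out-of-three vote on single-bit values."""
    return (a & b) | (a & c) | (b & c)


class LtmrDff:
    """Three flip-flops fed by one shared D input, followed by a single voter."""

    def __init__(self, init: int = 0) -> None:
        self.ffs = [init, init, init]

    def clock(self, d: int) -> None:
        # The data path is not redundant: all three copies capture the same D.
        self.ffs = [d, d, d]

    def upset(self, index: int) -> None:
        # Model an SEU as a bit flip in one of the three flip-flops.
        self.ffs[index] ^= 1

    @property
    def q(self) -> int:
        return majority(*self.ffs)


reg = LtmrDff()
reg.clock(1)
reg.upset(0)          # SEU in one copy
masked = reg.q        # voter still returns the stored 1

reg.clock(0)          # SET on the shared D forces 0 where 1 was intended
corrupted = reg.q     # all three copies agree on the wrong value
print(masked, corrupted)
```

The second case is the SET→SEU path listed as unprotected: the voter cannot help when all three copies latch the same corrupted input.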
ACTEL RTAX-S Embedded Mitigation: LTMR and SETs
[Figure: RTAX-S Super Cluster layout showing combinatorial logic (C-CELL) and sequential logic (R-CELL) blocks alongside TX, RX, and B blocks.]
RTAX Example: Probability of Error Reduction
• Error probability is per DFF bit
• Error rate must reflect frequency of operation
$P(fs)_{error} \propto P_{Configuration} + P_{DFFSEU} + P(fs)_{SET \to SEU} + P_{SEFI}$
[Slide annotations on the equation terms: Low, ~0]
Upper-Bound Error Prediction: RHBD Anti-fuse FPGA
DFF (near) static error bit rate, no C-CELLs, PDFFSEU:
$\frac{dE_{bit}}{dt} < 1 \times 10^{-10} \; \frac{Errors}{bit \cdot day}$ (Source: Actel)
15 MHz to 120 MHz: dynamic error bit rate with 8 levels of C-CELLs, P(fs)SET→SEU:
$1 \times 10^{-9} < \frac{dE_{bit}(fs)}{dt} < 6 \times 10^{-8} \; \frac{Errors}{bit \cdot day}$ (Source: NASA Goddard)
Upper-Bound Error Prediction: Actel RHBD Anti-fuse FPGA
$\frac{dE}{dt} < \frac{dE_{bit}(fs)}{dt} \times \#UsedDFFs$
$\frac{dE}{dt} < 6 \times 10^{-8} \; \frac{Errors}{bit \cdot day} \times n \; \frac{bits}{design}$
$P(fs)_{error} \propto P(fs)_{SET \to SEU}$
With embedded LTMR mitigation + hardened clocks:
$\frac{dE}{dt} < 3 \times 10^{-3} \; \frac{Errors}{design \cdot day}$
Thousands of years in LEO!
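The scaling from a per-bit rate to a per-design rate is a single multiplication by the number of used DFFs. The sketch below uses the dynamic per-bit bound of 6×10^-8 errors/(bit·day) and a hypothetical design size of 50,000 used DFFs; that count and the function name are assumptions for illustration only.

```python
def design_error_rate(bit_rate_per_day: float, used_dffs: int) -> float:
    """Upper-bound errors per design-day: per-bit rate times number of used DFFs."""
    if bit_rate_per_day < 0 or used_dffs < 0:
        raise ValueError("rate and DFF count must be non-negative")
    return bit_rate_per_day * used_dffs


if __name__ == "__main__":
    per_bit = 6e-8        # errors per bit-day (dynamic upper bound)
    n_used = 50_000       # hypothetical number of used DFFs
    print(f"{design_error_rate(per_bit, n_used):.1e} errors per design-day")
```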
Mitigation and Xilinx Virtex Devices
Xilinx XQR4VSX55: Radiation Test Data
For non-mitigated designs the most significant upset factor is PConfiguration.
Probability | Error Rate | LEO | GEO
Pconfiguration (configuration memory, XQR4VSX55) | dEconfiguration/dt, upsets/(device·day) | 7.43 | 4.2
PSEFI (combined SEFIs per device) | dESEFI/dt, upsets/(device·day) | 7.5×10^-5 | 2.7×10^-5
Xilinx Consortium: VIRTEX-4VQ STATIC SEU CHARACTERIZATION SUMMARY: April/2008
M Berg, Trading ASIC and FPGA Considerations for System Insertion; IEEE Nuclear Science Radiation Effects Conference 2009
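To get a feel for these rates, they can be converted to a mean time between events. The sketch below assumes a constant event rate, so the mean interval is simply 1/rate; the dictionary layout and function name are illustrative.

```python
# Upsets per device-day, as listed for the XQR4VSX55
RATES = {
    "configuration": {"LEO": 7.43, "GEO": 4.2},
    "SEFI": {"LEO": 7.5e-5, "GEO": 2.7e-5},
}


def mean_days_between(rate_per_day: float) -> float:
    """Mean interval in days between events, assuming a constant rate."""
    if rate_per_day <= 0:
        raise ValueError("rate must be positive")
    return 1.0 / rate_per_day


if __name__ == "__main__":
    for orbit in ("LEO", "GEO"):
        cfg_hours = mean_days_between(RATES["configuration"][orbit]) * 24
        sefi_years = mean_days_between(RATES["SEFI"][orbit]) / 365.25
        print(f"{orbit}: configuration upset every {cfg_hours:.1f} h, "
              f"SEFI every {sefi_years:.0f} years")
```

Under this assumption, configuration upsets arrive hours apart while SEFIs arrive decades apart, which is why configuration dominates in a non-mitigated design.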
Global Triple Modular Redundancy (GTMR): Largest Area → Greatest Complexity
• Triple entire design
• Triple I/O and voters
• Unprotected: hidden device logic SEFIs
• Cannot be an embedded strategy: complex to verify
• Xilinx offers XTMR
$P(fs)_{error} \propto P_{Configuration} + P_{DFFSEU} + P(fs)_{SET \to SEU} + P_{SEFI}$
[Slide annotations on the equation terms: Low, Low, Low, Non-Mitigated, Mitigated]
XTMR: Capturing Asynchronous Input Data
[Figure: three redundant inputs (Async_data_tr0, Async_data_tr1, Async_data_tr2), each feeding a DFF chain that forms a metastability filter and an edge detect circuit, followed by a voter.]
[Figure: edge detect timing waveform over clock cycles n to n+5, showing input skew across INPUT: Async_DATA_tr0, Async_DATA_tr1, and Async_DATA_tr2, the signals Edge_detect_tr0, Edge_detect_tr1, and Edge_detect_tr2, and the voted rising edge detect.]
Dynamic Analysis:
• One domain leads the other two
Time Domain Considerations: XTMR Single Bit Failures Not Detected by Static Node Analysis
[Figure: timing waveform over clock cycles n to n+5 for INPUT: Async_DATA_tr0, Async_DATA_tr1, and Async_DATA_tr2 and for Edge_detect_tr0, Edge_detect_tr1, and Edge_detect_tr2. Annotations: CONFIGURATION BIT HIT; voted rising edge detect: NO EDGE DETECTION.]
Voters and Asynchronous Signal Capture
[Figure: triplicated asynchronous capture circuits, each domain with a metastability filter and an edge detect circuit built from DFFs, shown with voters.]
• Place the voter after the metastability filters
• This satisfies skew constraints because the voter is anchored at DFF control points
[Figure: timing waveform over clock cycles n+1 to n+5 for INPUT: Async_DATA_tr0, Async_DATA_tr1, and Async_DATA_tr2, with the voter output and the edge detect.]
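The interaction between input skew and a single fault can be reproduced with a small cycle-level model. The sketch below is a simplification under stated assumptions: each domain is represented by its already synchronized level per clock cycle, one domain leads by one cycle, and a configuration bit hit is modeled as one domain stuck low. All names are illustrative; this is not the XTMR netlist.

```python
def majority(a: int, b: int, c: int) -> int:
    """Two-out-of-three vote on single-bit values."""
    return (a & b) | (a & c) | (b & c)


def vote(d0, d1, d2):
    """Vote three per-cycle bit sequences cycle by cycle."""
    return [majority(a, b, c) for a, b, c in zip(d0, d1, d2)]


def rising_edges(level):
    """One-cycle pulse wherever the level goes 0 -> 1 (level before cycle 0 is 0)."""
    out, prev = [], 0
    for v in level:
        out.append(v & (1 - prev))
        prev = v
    return out


def step(rise_cycle: int, n_cycles: int = 8):
    """Level that is 0 before rise_cycle and 1 from rise_cycle onward."""
    return [1 if t >= rise_cycle else 0 for t in range(n_cycles)]


tr0 = step(2)       # this domain leads the others by one cycle
tr1 = [0] * 8       # configuration bit hit: domain stuck low
tr2 = step(3)

# Edge detect in each domain, then vote the one-cycle pulses
vote_after_edge = vote(rising_edges(tr0), rising_edges(tr1), rising_edges(tr2))

# Vote the synchronized levels first, then edge detect the voted level
edge_after_vote = rising_edges(vote(tr0, tr1, tr2))

print(vote_after_edge)
print(edge_after_vote)
```

In this model the two surviving pulses land in different cycles and never form a majority, so voting after the edge detectors misses the event. Voting the levels first works because levels persist across cycles, so the skewed domains still overlap.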
Upper-Bound Error Prediction: Xilinx FPGA XTMR
PConfiguration ???
• SEUs are insignificant
• MBUs may be insignificant (still under investigation)
• Assumes proper scrubbing
Assumes unmitigated SEFIs are the predominant source:
$P(fs)_{error} \propto P_{SEFI}$
$\frac{dE_{SEFI}}{dt} < 3 \times 10^{-5} \; \frac{Errors}{Device \cdot day}$
$\frac{dE}{dt} \approx \frac{dE_{SEFI}}{dt} < n \times 3 \times 10^{-5} \; \frac{Errors}{day}$ for n devices
Tools
Mitigation and Actel Tools
• Mentor Graphics has offered LTMR for anti-fuse devices
• There is a desire to apply LTMR to Actel Flash-based products
• DTMR is another approach (GTMR with no clock redundancy)
  • Flash
  • Assist with SETs in anti-fuse devices
Mitigation and Xilinx Tools
• Currently XTMR is commercially available from Xilinx
• NASA REAG has identified some issues:
  • Asynchronous domain crossings
  • Verification of XTMR insertion
• Mentor is now evaluating GTMR with formal checking
• NASA REAG is expecting to use Mentor GTMR (preliminary version) for V5 radiation testing