Page 1: 4_c144_swift_s

Swift and Roosta 1 144_C4 / MAPLD04

Tradeoffs in Flight-Design Upset Mitigation in State-of-the-Art FPGAs

Hardened By Design vs. Design-Level Hardening

Gary M. Swift and Ramin Roosta
Jet Propulsion Laboratory / California Institute of Technology

The research done in this paper was carried out at the Jet Propulsion Laboratory, California Institute of Technology, under contract with the National Aeronautics and Space Administration (NASA) and was partially sponsored by the NASA Electronic Parts and Packaging Program.

Reference herein to any specific commercial product, process, or service by trade name, trademark, manufacturer, or otherwise, does not constitute or imply its endorsement by the United States Government or the Jet Propulsion Laboratory, California Institute of Technology.

Page 2: 4_c144_swift_s

In the beginning was Actel …

• Leveraging from a commercial product line
  ▫ ONO anti-fuse based
  ▫ One-time programmable (OTP)

• “beginning” = 1993
  Reference: Katz, R.; Barto, R.; McKerracher, P.; Carkhuff, B.; Koga, R.; “SEU hardening of field programmable gate arrays (FPGAs) for space applications and device characterization,” IEEE Transactions on Nuclear Science, Dec. 1994

Page 3: 4_c144_swift_s

Later, Xilinx

• Leveraging from a commercial product line
  ▫ SRAM based
  ▫ Reconfigurable

• “later” = 1998
  Reference: Guertin, S.M.; Swift, G.M.; Nguyen, D.; “Single-event upset test results for the Xilinx XQ1701L PROM,” Radiation Effects Data Workshop Record, 1999

• Quote: (Xilinx SRAM-based FPGAs) … “do appear suited to a broad range of other (non-critical) applications, such as sensor and camera controllers.”

Page 4: 4_c144_swift_s

OUTLINE

• FPGAs: A key enabling technology for modern spacecraft

• Background in radiation testing of FPGAs
  ▫ Earlier, Katz/Swift collaboration
  ▫ Recently, Xilinx Consortium

• Feature Comparison

• Triple Modular Redundancy (TMR) - hardware approach vs. software approach

• Concluding Remarks

Page 5: 4_c144_swift_s

FPGAs: A key “enabling technology”

• Like custom ASICs, FPGAs can replace whole boards
  ▫ Saving mass, volume, power
  ▫ Achieving extra functionality

• FPGAs are much cheaper than ASICs
  ▫ Design efforts can be later in the schedule
  ▫ Design mistakes don’t require a re-spin through the foundry

Page 6: 4_c144_swift_s

MER Pyro-Controller

Used self-checking of configuration to initiate a re-configuration after spotting an upset
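
A minimal sketch of the self-check idea, written in Python rather than flight logic. It assumes the check is a CRC of the configuration memory compared against a golden value. The slide does not say which check the MER pyro-controller actually used, and every name here (`GOLDEN_CONFIG`, `self_check`, `reconfigure`, `check_and_repair`) is hypothetical.

```python
import zlib

# Hypothetical golden configuration image (stands in for the PROM contents).
GOLDEN_CONFIG = bytes(range(256)) * 4
GOLDEN_CRC = zlib.crc32(GOLDEN_CONFIG)


def self_check(config: bytearray) -> bool:
    """Return True if the live configuration still matches the golden CRC."""
    return zlib.crc32(bytes(config)) == GOLDEN_CRC


def reconfigure() -> bytearray:
    """Reload the full configuration from the golden image."""
    return bytearray(GOLDEN_CONFIG)


def flip_bit(config: bytearray, bit_index: int) -> None:
    """Model a single-event upset by flipping one configuration bit."""
    config[bit_index // 8] ^= 1 << (bit_index % 8)


def check_and_repair(config: bytearray):
    """Run one self-check pass; reconfigure if an upset is spotted.

    Returns the (possibly reloaded) configuration and a flag that is
    True when a reconfiguration was triggered.
    """
    if self_check(config):
        return config, False
    return reconfigure(), True
```

Calling `flip_bit` on a loaded configuration and then `check_and_repair` returns a fresh copy of the golden image with the flag set, which mirrors the "spot an upset, then re-configure" behavior described on the slide.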

Page 7: 4_c144_swift_s

MER Pyro-Controller

Nearing Mars: Xilinx XQR4062XL

[Figure: number of upsets (0 to 30) vs. days after launch (0 to 150), showing the predicted curve together with the MER-A and MER-B observations; points labeled Oct. 28 and Nov. 23 are marked for both MER-A and MER-B.]

Page 8: 4_c144_swift_s

My Background

• Actel experience is older
  ▫ No direct involvement in radiation tests since the ONO anti-fuse was replaced
  ▫ Results here are from others’ work

• Xilinx experience is recent
  ▫ Active participant in Xilinx Rad Test Consortium
  ▫ Currently finishing a two+ year test campaign targeting the Virtex II family

Page 9: 4_c144_swift_s

Currently Available Devices

Actel RT54SX-S family vs. Xilinx Virtex II family

(-SU)

Note: both are essentially immune to single-event latchup and have good total ionizing dose tolerance [ Actel > 135 krad(Si); Xilinx > 200 krad(Si) ]

Page 10: 4_c144_swift_s

Main Feature Comparison

                  Actel        Xilinx
                  RT54SX72S    XQR2V6000

Gates:            72,000       ~6M ( / ~3.2 )
Flip-flops:       2,012        67,584 / 3.2 = ~20k
I/O pins:         360          824 / 3 = 274
Speed, external:  230 MHz      622 Mb/s (I-mode LVDS)
Speed, internal:  310 MHz      360 MHz
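
The divisors in the Xilinx column can be checked with a few lines of arithmetic. This sketch assumes the divisors represent the resource cost of Xilinx-style TMR (about 3.2x for logic, 3x for I/O pins), which the slide implies but does not state outright; the variable names are illustrative only.

```python
# Raw XQR2V6000 resources and divisors as given on the slide.
XILINX_FLIP_FLOPS = 67_584
XILINX_IO_PINS = 824
LOGIC_DIVISOR = 3.2   # assumed: logic overhead of triplication plus voters
IO_DIVISOR = 3        # assumed: three pins per TMR'd I/O signal

effective_flip_flops = XILINX_FLIP_FLOPS / LOGIC_DIVISOR
effective_io = XILINX_IO_PINS // IO_DIVISOR

# RT54SX72S figures from the slide, for comparison.
ACTEL_FLIP_FLOPS = 2_012
ACTEL_IO_PINS = 360

print(f"Effective Xilinx flip-flops: {effective_flip_flops:,.0f}")
print(f"Effective Xilinx I/O signals: {effective_io}")
print(f"Flip-flop ratio, Xilinx/Actel: {effective_flip_flops / ACTEL_FLIP_FLOPS:.1f}")
```

The flip-flop quotient works out to about 21,120, which the slide rounds to ~20k. Even after this derating the Xilinx part offers roughly ten times the usable flip-flops, while its usable I/O count falls below the Actel part's 360 pins.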

Page 11: 4_c144_swift_s

Extra Features Comparison

                  Actel            Xilinx
                  RT54SX72S        XQR2V6000

Block RAM:        no               2.5 Mb
I/O standards:    many             many
Others:           hardwired TMR    Clock Manager, Multipliers

Page 12: 4_c144_swift_s

Actel: What bits can upset?

• User flip-flops only
  ▫ Direct hits of the same flip-flop in multiple domains (very unlikely due to layout)
  ▫ Clock domain hits

• SEFI modes essentially eliminated

Page 13: 4_c144_swift_s

Xilinx: What bits can upset?

• Configuration Bits
  ▫ Logical Function (NAND, Ex-OR, flip-flop type, etc.)
  ▫ Routing
  ▫ User Options (type of I/O, mode of Block RAM access, Clock Manager, etc.)

• Block RAM

• User Flip-flops

• Control Registers

Page 14: 4_c144_swift_s

Xilinx: Heavy Ion Test Results

[Figure: cross section per bit (cm²), from 1E-11 to 1E-07, vs. LET (MeV-cm²/mg), from 0 to 70, for X-2V1000 configuration bits with a Weibull curve fit. Annotations: low threshold (soft); low susceptibility (hard).]

Resulting in fairly low in-space rates: ~6 per day for 2V6000 in GCRmin.
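
The Weibull curve fit named in the figure is conventionally written in the four-parameter form sketched below. This is the standard form used for SEU cross-section fits, not necessarily the exact parameterization behind this plot, and the parameter values in the usage lines are placeholders, not the fitted values for the 2V1000.

```python
import math


def weibull_cross_section(let, sigma_sat, let_threshold, width, shape):
    """Per-bit cross section (cm^2) at a given LET (MeV-cm^2/mg).

    sigma_sat      saturated cross section (cm^2)
    let_threshold  onset LET; the cross section is zero at or below it
    width, shape   Weibull width and shape parameters
    """
    if let <= let_threshold:
        return 0.0
    x = (let - let_threshold) / width
    return sigma_sat * (1.0 - math.exp(-(x ** shape)))


# Placeholder parameters for illustration only (not taken from the slide).
params = dict(sigma_sat=4e-8, let_threshold=1.0, width=30.0, shape=1.5)
for let in (0.5, 5, 20, 60):
    print(let, weibull_cross_section(let, **params))
```

In this form, "low threshold" corresponds to a small `let_threshold` and "low susceptibility" to a small `sigma_sat`, which are the two annotations on the plot.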

Page 15: 4_c144_swift_s

Actel: Heavy Ion Test Results

Very low in-space rates (assume LETth > 40 achieved): ~1 per 6800 years for SX72-S in GCRmin.

[Figure: cross section (cm²), from 1E-10 to 1E-09, vs. LET (MeV-cm²/mg), from 0 to 120, with two data series labeled 307 and 315. Data for two RTAX2000S prototypes at 1 MHz using a checkerboard pattern, from Fig. 12, J.J. Wang et al., NSREC 2003 [Ref. 1]. Annotations: “Where’s Threshold???”; low susceptibility (~100x harder).]

Page 16: 4_c144_swift_s

Actel-style TMR

[Figure: the SX-A “R” cell triplicates to the RTSX-S “R” cell.]

Page 17: 4_c144_swift_s

Actel-style TMR

Actel-style TMR is fairly straightforward:

Each flip-flop is replaced by three flip-flops plus a feedback voter

Triplicated elements spread out physically

Uses one clock/inverse-clock domain

No external parts needed
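
The flip-flop-level scheme can be sketched in a few lines of Python. This is a behavioral model only, assuming a simple two-out-of-three majority voter whose output is fed back to refresh all three copies whenever the cell holds its value. The class and method names are hypothetical, and the real RTSX-S cell is a hardwired circuit, not software.

```python
def majority(a: int, b: int, c: int) -> int:
    """Two-out-of-three majority vote on single bits."""
    return (a & b) | (a & c) | (b & c)


class TmrFlipFlop:
    """Behavioral model of one flip-flop replaced by three plus a voter."""

    def __init__(self, value: int = 0):
        self.copies = [value, value, value]

    @property
    def q(self) -> int:
        """Voted output seen by downstream logic."""
        return majority(*self.copies)

    def upset(self, index: int) -> None:
        """Model an SEU flipping one of the three copies."""
        self.copies[index] ^= 1

    def clock(self, d=None) -> None:
        """Load d into all copies, or hold by feeding back the voted value."""
        value = self.q if d is None else d
        self.copies = [value, value, value]
```

A single upset leaves `q` unchanged and is erased on the next `clock()`. Two upsets before a clock edge defeat the voter, which is why the slide stresses that the triplicated elements are spread out physically.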

Page 18: 4_c144_swift_s

Xilinx-style TMR

Xilinx-style TMR is more complicated:

• First, it’s not too useful without configuration scrubbing
• Whole functional blocks are triplicated, not individual flip-flops
• Three voters are used
• Three clock domains
• Elimination of:
  ▫ Weak keepers (aka half latches)
  ▫ Use of configuration cells as part of the design (for example, SRL16)
• Needs some external circuitry (at least a watchdog timer + PROMs)
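
Why scrubbing matters for this scheme can be shown with a toy model. The sketch assumes each of the three domains is a block whose behavior is set by a configuration word, that a configuration upset persists until a scrub rewrites it, and that voting is a bitwise majority. All names and the 8-bit block are hypothetical, and a single `vote` function stands in for the three voters of the real scheme.

```python
GOLDEN_MASK = 0x5A  # hypothetical configuration word defining the block


def block(config: int, data: int) -> int:
    """A stand-in functional block: its behavior depends on its config."""
    return (data ^ config) & 0xFF


def vote(a: int, b: int, c: int) -> int:
    """Bitwise two-out-of-three majority."""
    return (a & b) | (a & c) | (b & c)


class XtmrDesign:
    """Three redundant copies of one block, each with its own config."""

    def __init__(self):
        self.configs = [GOLDEN_MASK] * 3  # one copy per redundant domain

    def upset(self, domain: int, bit: int) -> None:
        """Flip one configuration bit; it stays flipped until scrubbed."""
        self.configs[domain] ^= 1 << bit

    def scrub(self) -> None:
        """Rewrite every domain's configuration from the golden copy."""
        self.configs = [GOLDEN_MASK] * 3

    def output(self, data: int) -> int:
        """Voted output of the three domain blocks."""
        return vote(*(block(cfg, data) for cfg in self.configs))
```

One upset is outvoted, but a second upset hitting the same bit in another domain before a scrub corrupts the voted output. Calling `scrub()` between the two upsets keeps the output correct, which is the sense in which the technique is "not too useful" without scrubbing.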

Page 19: 4_c144_swift_s

Xilinx-style TMR

Page 20: 4_c144_swift_s

Xilinx-style TMR

In Xilinx-style TMR, I/Os use three pins tied externally:

[Figure: three output paths, D0, D1, and D2, each with a minority voter and a pin (P); board traces tie the three pins together into the single signal D.]
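
The output arrangement can be modeled behaviorally. This sketch assumes the usual meaning of a minority voter: each domain's output buffer is disabled (tri-stated) when that domain disagrees with both of the other two, so the tied board trace is driven only by pins that agree. The function names are hypothetical.

```python
def in_minority(mine: int, other1: int, other2: int) -> bool:
    """True if this domain's bit disagrees with both other domains."""
    return mine != other1 and mine != other2


def pin_drive(d0: int, d1: int, d2: int):
    """Return what each of the three pins drives: 0, 1, or None (tri-state)."""
    bits = (d0, d1, d2)
    drives = []
    for i, mine in enumerate(bits):
        others = bits[:i] + bits[i + 1:]
        drives.append(None if in_minority(mine, *others) else mine)
    return drives


def board_trace(drives):
    """Resolve the tied trace; raise if enabled pins drive opposite values."""
    active = {d for d in drives if d is not None}
    if len(active) != 1:
        raise ValueError("contention or floating trace")
    return active.pop()
```

For example, `pin_drive(1, 0, 1)` tri-states the middle pin, and `board_trace` then resolves to 1. With single-bit values and at most one faulty domain, the enabled pins never fight each other.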

Page 21: 4_c144_swift_s

Xilinx TMRtool

• Xilinx-style TMR done by hand is difficult and tedious

• An automated tool which integrates into the design flow has been developed (“now” available)

• In-beam testing shows tool is very effective

[Figure: TMR tool design flow. Blocks: Design Entry; XTMR; Xilinx Implementation (Translate, Map, Floorplan, PAR, BitGen); Xilinx Back-Annotation (Timing, ncd2edif, ncd2vhdl, ncd2verilog); Simulation; FPGA. File types passed between blocks: NGC, NGO, EDIF, EDIF TMR, NCD, BIT.]

Page 22: 4_c144_swift_s

Upset Comparison

• ATMR now has eliminated:
  ▫ Upsets of static storage elements, and
  ▫ SEFIs

• ATMR upsets from:
  ▫ Transients that are clocked into storage
  ▫ Clock tree hits

• Xilinx FPGAs have a small susceptibility to two types of SEFIs
  ▫ Reset (sometimes only partial)
  ▫ Disable scrub port

• XTMR in combination with scrubbing can lower system upset rates below the SEFI rate

Page 23: 4_c144_swift_s

Rate Comparison

GCR = Galactic Cosmic Ray background (interplanetary space), almost identical to geosynchronous orbit

• Actel
  ▫ Dominated by transients
  ▫ Roughly one system error per thousand years (GCRmin)

• Xilinx
  ▫ Dominated by SEFI rate
  ▫ Expect one SEFI per ~65 years in GCRmin
  ▫ Expect one system error ~5-20x less often
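
The Xilinx figures can be turned into intervals with simple arithmetic, and either vendor's rate into a mission-level probability. The sketch assumes events arrive as a Poisson process at the constant GCRmin rate; the 10-year mission length is an arbitrary illustration, not a figure from the slides.

```python
import math

SEFI_INTERVAL_YEARS = 65.0            # Xilinx: one SEFI per ~65 years
SYSTEM_ERROR_FACTOR = (5, 20)         # system errors ~5-20x less often
ACTEL_ERROR_INTERVAL_YEARS = 1000.0   # Actel: roughly one per thousand years


def p_at_least_one(interval_years: float, mission_years: float) -> float:
    """Poisson probability of one or more events during the mission."""
    return 1.0 - math.exp(-mission_years / interval_years)


xilinx_error_interval = tuple(SEFI_INTERVAL_YEARS * f for f in SYSTEM_ERROR_FACTOR)
print("Xilinx system error interval (years):", xilinx_error_interval)

MISSION_YEARS = 10.0  # hypothetical mission length
for label, interval in [
    ("Xilinx SEFI", SEFI_INTERVAL_YEARS),
    ("Xilinx system error (5x)", xilinx_error_interval[0]),
    ("Xilinx system error (20x)", xilinx_error_interval[1]),
    ("Actel system error", ACTEL_ERROR_INTERVAL_YEARS),
]:
    print(f"{label}: {p_at_least_one(interval, MISSION_YEARS):.3%}")
```

Taken at face value, the slide's numbers imply one Xilinx system error every 325 to 1,300 years, which brackets the Actel figure of roughly one per thousand years.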

Page 24: 4_c144_swift_s

CONCLUSIONS

For the present:

• Both can achieve very acceptable radiation tolerance

• Actel wins on:
  ▫ Less burden on the designer
  ▫ No auxiliary components
  ▫ Lower SEFI susceptibility

• Xilinx wins on:
  ▫ Designer control of the resources vs. hardness tradeoff
  ▫ On-chip feature set
  ▫ Re-configurability

Competition is good.

Page 25: 4_c144_swift_s

Acronyms

FPGA - Field Programmable Gate Array
ASIC - Application Specific Integrated Circuit
SEU - Single Event Upset
SEFI - Single Event Functionality Interrupt
TMR - Triple Modular Redundancy
ATMR - Actel-style TMR
XTMR - Xilinx-style TMR
LET - Linear Energy Transfer (proportional to deposited charge per micron for a heavy ion strike on an active node)
GCRmin - Galactic Cosmic Ray background (highest during “solar minimum” period of ~11-yr cycle of sunspots)
MER - Mars Exploration Rovers (i.e., Spirit and Opportunity)

Page 26: 4_c144_swift_s

Additional References

[1] J.J. Wang, W. Wong, S. Wolday, B. Cronquist, J. McCollum, R. Katz, I. Kleyner, “Single event upset and hardening in 0.15 µm antifuse-based field programmable gate array,” IEEE Transactions on Nuclear Science, Dec. 2003

[2] Jih-Jong Wang, R.B. Katz, F. Dhaoui, J.L. McCollum, W. Wong, B.E. Cronquist, R.T. Lambertson, E. Hamdy, I. Kleyner, W. Parker, “Clock buffer circuit soft errors in antifuse-based field programmable gate arrays,” IEEE Transactions on Nuclear Science, Dec. 2000

[3] R. Katz, J.J. Wang, R. Koga, K.A. LaBel, J. McCollum, R. Brown, R.A. Reed, B. Cronquist, S. Crain, T. Scott, W. Paolini, B. Sin, “Current radiation issues for programmable elements and devices,” IEEE Transactions on Nuclear Science, Dec. 1998