Date post: 28-Dec-2015
Summary of the Workshop on FPGAs for High-Energy Physics

PhD Student: Salvatore Danzeca
Supervisor: Giovanni Spiezia

Summary

• Introduction
• Altera Arria GX test for LHCb
• Microsemi Igloo 2 for CMS HCAL
• Kintex 7 mitigation techniques and general TMR
• Experience from using SRAM based FPGAs in the ALICE TPC Detector and Future Plans

The eternal fight

SRAM BASED FPGA
• Xilinx 28 nm process, high-k metal gate (HKMG) technology

FLASH BASED FPGA
• 130 nm dual poly CMOS process (ProASIC3)
• 65 nm process (IGLOO2)

RADWG Community

• Most of the projects in the RADWG community use flash-based FPGAs.
  – Better SEU immunity
  – Easy to harden against SEU by use of GLOBAL TMR
  – Resources available are comparable to an SRAM FPGA of 3 years ago
  – TID limit is not as high as for SRAM FPGAs
• In which cases should we use an SRAM FPGA?
  – TID is a concern
  – Performance is a concern
  – SEUs can be tolerated

Proton irradiation test of an Altera SRAM-based FPGA for the possible usage in the readout electronics of the LHCb experiment

Presenter: Christian FAERBER

• 2x Arria GX, EP1AGX35DF780I6 (90 nm)
• Application: LHCb Outer Tracker detector
• FPGA used as TDC and Gbit/s transceiver
• Tested with 22 MeV protons

Results
• Current
  – FPGA core current rises after 150 krad(Si) and reaches 107% after 7 Mrad(Si).
  – FPGA I/O current starts to drop after 400 krad(Si) and reaches 94% at 7 Mrad(Si).
  – All permanent current changes are between 5% and 20% and begin after 150 krad(Si).
• Stability of the implemented TDC
  – Wrong time measurement after a TID of 400 krad(Si)
  – Shifted time measurement after a TID of 4 Mrad
• Stability of PLL
  – 3 PLL clock signals monitored
  – The 3 frequencies did not change
  – The phase between clk1 and clk2 shifts from -150° to larger values after 3 Mrad(Si)
• FPGA Gbit/s transceiver tests
  – Loss of bit alignment: recovered by sending the next bit alignment word. Cross section: (1.3 ± 0.5) x 10^-10 cm² per Gbit transceiver
  – De-synchronization of transmitter and receiver: required reprogramming of the FPGA. Cross section: (8 ± 4) x 10^-11 cm² per Gbit transceiver
• FPGA configuration registers
  – Checked with the cyclic redundancy checker tool from Altera. Cross section: (1.6 ± 0.2) x 10^-9 cm² per FPGA
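To put the cross sections above in perspective, the expected number of events is simply the cross section multiplied by the fluence. The sketch below uses the cross sections quoted for the Arria GX test; the fluence value and the function name `expected_events` are assumptions chosen for illustration and do not come from the talk.

```python
# Expected number of single-event effects: N = sigma * fluence.
# Cross sections are the values quoted for the Arria GX proton test.
# The fluence used in the demo is a hypothetical example value.

CROSS_SECTIONS_CM2 = {
    "transceiver bit-alignment loss": (1.3e-10, 0.5e-10),
    "transceiver de-synchronization": (8e-11, 4e-11),
    "configuration upset (per FPGA)": (1.6e-9, 0.2e-9),
}


def expected_events(sigma_cm2, sigma_err_cm2, fluence_per_cm2):
    """Return (expected count, uncertainty due to the cross-section error only)."""
    return sigma_cm2 * fluence_per_cm2, sigma_err_cm2 * fluence_per_cm2


if __name__ == "__main__":
    fluence = 1e11  # protons/cm^2, assumed for illustration
    for name, (sigma, err) in CROSS_SECTIONS_CM2.items():
        n, dn = expected_events(sigma, err, fluence)
        print(f"{name}: {n:.1f} +/- {dn:.1f} events")
```

The same product gives an event rate if a flux (particles per cm² per second) is used instead of a fluence.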

FPGAs in the upgrade of CMS HCAL

Presenter: Tullio Grassi

                                                 | Existing                          | Upgrade
Expected TID on FEE                              | 2 Gy                              | 14 Gy (1.4 krad)
Expected 1 MeV-equivalent neutron fluence on FEE | 1 x 10^11 / cm2                   | 7 x 10^11 / cm2
Front-End                                        | Actel antifuse (for control only) | Microsemi flash-based (control and data)
Back-End                                         | Xilinx and Altera                 | Xilinx
Number of FPGA types                             | 2 (FEE) + ~8 (BEE)                | ~5 (FEE) + 5 (BEE)
Number of developers                             | 1 (FEE) + ~4 (BEE)                | ~6 (FEE) + 4 (BEE)

FPGAs are used both in the Front-End Electronics (FEE) mounted on the detector and in the Back-End Electronics (BEE) located in the counting rooms.

Solutions
• Microsemi ProASIC3L
  – Interfacing the ProASIC3L to the CERN GBTX requires the ability to receive SLVS (a differential signal similar to LVDS but with smaller amplitude).
  – The ProASIC3L can receive SLVS: A. Belloni et al., “Radiation tolerance of an SLVS receiver based on commercial components”, Journal of Instrumentation (JINST 2014)
• Microsemi Igloo 2
  – Ongoing tests by the Univ. of Minnesota with 230 MeV protons
    • failure after 2 x 10^12 protons
    • no SEU seen on a TMR-type shift register
    • no SET seen
    • PLL: 400 SEUs observed over a fluence of 10^11 protons/cm2
  – LATEST NEWS (2014): serializer running at 4.8 Gbps: loss-of-sync observed with cross section = 1.7 x 10^-10 cm2. A power cycle was issued after every loss-of-sync, after which the link worked again. No attempt was made to reset (part of) the serializer.
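The PLL result quoted above (400 SEUs over a fluence of 10^11 protons/cm2) can be turned into a cross section by dividing the event count by the fluence. The following is a minimal sketch, assuming that all 400 upsets belong to the same device under test and that the counts follow Poisson statistics; the function name `cross_section` is hypothetical and the resulting number is not stated in the talk.

```python
import math


def cross_section(n_events, fluence_per_cm2):
    """Return (sigma, statistical error) in cm^2, assuming Poisson counting."""
    if n_events <= 0 or fluence_per_cm2 <= 0:
        raise ValueError("need a positive event count and a positive fluence")
    sigma = n_events / fluence_per_cm2
    sigma_err = math.sqrt(n_events) / fluence_per_cm2
    return sigma, sigma_err


if __name__ == "__main__":
    # 400 PLL upsets over 1e11 protons/cm^2, as quoted for the Igloo 2 test
    sigma, err = cross_section(400, 1e11)
    print(f"PLL SEU cross section ~ {sigma:.1e} +/- {err:.1e} cm^2")
```

Under these assumptions the estimate is about 4 x 10^-9 cm2 with a 5% statistical uncertainty; systematic effects such as beam dosimetry are not covered by this sketch.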

Scrubbing Approaches for Kintex-7 FPGAs

Presenter: Michael Wirthlin

Xilinx Kintex 7
• Commercially available FPGA
  – 28 nm, low power programmable logic
  – High-speed serial transceivers (MGT)
  – High density (logic and memory)
• Built-in configuration scrubbing
  – Support for configuration readback and self-repair
  – Auto detect and repair single-bit upsets within a frame
  – SEU Mitigation IP for correcting multiple-bit upsets
• Proven mitigation techniques
  – Single-Event Upset Mitigation (SEM) IP
  – Configuration scrubbing
  – Triple Modular Redundancy (TMR)
  – Fault tolerant serial I/O state machines
  – BRAM ECC protection

Kintex 7 ARCHITECTURE and SCRUBBING

• Device configuration organized as “Frames”
  – Smallest unit of configuration and readback
    • Individual frames can be configured (partial reconfiguration)
    • Individual frames can be read (readback)
  – 101 words x 32 bits/word = 3232 bits/frame
• Frames organized into different “Blocks”
  – Block 0: Logic/Routing Configuration Data (22546 frames)
  – Block 1: BlockRAM configuration/contents (5774 frames)
• Frames can be “scrubbed” during device operation
  – Writing individual configuration frames overwrites previous data
    • Replaces “bad” data in the presence of upsets
    • Writes “same” data when no upsets are present
  – Scrubbing involves continuous reading/writing of configuration data
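The frame arithmetic and the idea of scrubbing can be sketched in a few lines. The frame and block sizes below are the ones quoted above; everything else (the in-memory model with frames stored as Python integers, the names `inject_upset` and `scrub`, and the small demo memory) is an assumption made for illustration. A real scrubber talks to the device through its configuration interface, which is not modeled here.

```python
import random

WORDS_PER_FRAME = 101
BITS_PER_WORD = 32
FRAME_BITS = WORDS_PER_FRAME * BITS_PER_WORD  # 3232 bits per frame
BLOCK0_FRAMES = 22546  # logic/routing configuration data
BLOCK1_FRAMES = 5774   # BlockRAM configuration/contents


def total_config_bits():
    """Total number of configuration bits covered by both blocks."""
    return (BLOCK0_FRAMES + BLOCK1_FRAMES) * FRAME_BITS


def inject_upset(memory, rng):
    """Flip one random bit in one random frame (a simulated SEU)."""
    frame = rng.randrange(len(memory))
    bit = rng.randrange(FRAME_BITS)
    memory[frame] ^= 1 << bit
    return frame, bit


def scrub(memory, golden):
    """Rewrite every frame from the golden copy; return how many frames differed."""
    repaired = 0
    for index, good_frame in enumerate(golden):
        if memory[index] != good_frame:
            repaired += 1
        memory[index] = good_frame  # writes the same data when there is no upset
    return repaired


if __name__ == "__main__":
    rng = random.Random(0)
    golden = [rng.getrandbits(FRAME_BITS) for _ in range(16)]  # tiny demo memory
    memory = list(golden)
    inject_upset(memory, rng)
    print("config bits:", total_config_bits())
    print("frames repaired:", scrub(memory, golden))
```

This is a blind scrub: it rewrites all frames without first locating the error. A readback-based scrubber would instead read each frame, check it, and rewrite only when needed.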

SCRUBBING CONFIGURATION DATA

• Each frame contains a SECDED ECC code
  – Provides single-bit correction and double-bit detection
    • Identifies the location of the single-bit upset
    • Identifies presence of double-bit upset
    • Double-error detection can be masked with >2 upsets in frame
• Entire bitstream checked with global CRC
  – Detects failure of individual ECC words (masked ECC)
  – Suggests full reconfiguration if global CRC error detected
• Internal FrameECC Block
  – Dedicated block for ECC computation and error correction
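The SECDED behaviour described above, including the masking of errors when more than two bits are upset, can be illustrated with a toy extended Hamming(8,4) code. This is only a sketch of the principle: the actual per-frame ECC in the device protects a 3232-bit frame and its exact code is not given in the talk. The function names `encode` and `decode` are hypothetical.

```python
DATA_POSITIONS = (3, 5, 6, 7)  # positions 1, 2, 4 hold Hamming parity, 0 holds overall parity


def encode(nibble):
    """Encode 4 data bits (0..15) into an 8-bit SECDED codeword (list of bits)."""
    code = [0] * 8
    for k, pos in enumerate(DATA_POSITIONS):
        code[pos] = (nibble >> k) & 1
    code[1] = code[3] ^ code[5] ^ code[7]
    code[2] = code[3] ^ code[6] ^ code[7]
    code[4] = code[5] ^ code[6] ^ code[7]
    code[0] = sum(code[1:]) & 1  # overall parity makes the total weight even
    return code


def decode(code):
    """Return (status, nibble); status is 'ok', 'corrected' or 'double-error'."""
    code = list(code)
    syndrome = 0
    for pos in range(1, 8):
        if code[pos]:
            syndrome ^= pos  # XOR of the indices of all set bits
    overall = sum(code) & 1
    if syndrome == 0 and overall == 0:
        status = "ok"
    elif overall == 1:
        code[syndrome] ^= 1  # syndrome 0 means the overall parity bit itself flipped
        status = "corrected"
    else:
        status = "double-error"  # detected but not correctable
    nibble = sum(code[pos] << k for k, pos in enumerate(DATA_POSITIONS))
    return status, nibble


if __name__ == "__main__":
    word = encode(0b1011)
    for flips in ((), (5,), (1, 6), (1, 2, 3)):
        damaged = list(word)
        for pos in flips:
            damaged[pos] ^= 1
        print(flips, decode(damaged))
```

With three flipped bits the decoder reports a successful correction but returns wrong data. This is the masking case that a check over the entire bitstream, such as the global CRC, is meant to catch.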

INTERNAL Scrubber

EXTERNAL Scrubber
• Must respond to >2 bit frame errors

Triple Modular Redundancy (TMR)

• TMR without repair has lower reliability than a non-redundant design for long mission times
• Effective TMR is almost always coupled with “repair”
• TMR + Repair = Very Reliable!
• Fault repair through scrubbing
  – Fixes the cause of the error
  – Does NOT fix the state of the circuit
• The state of the repaired circuit must be synchronized to the working circuits
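The statement that TMR can be less reliable than a non-redundant design for long mission times follows from the textbook reliability model. The sketch below assumes a constant failure rate, independent modules, an ideal voter and no repair; these assumptions and the function names are not taken from the talk. It also shows a bitwise majority voter.

```python
import math


def vote(a, b, c):
    """Bitwise majority of three redundant words."""
    return (a & b) | (a & c) | (b & c)


def reliability_simplex(failure_rate, t):
    """Reliability of a single module with constant failure rate."""
    return math.exp(-failure_rate * t)


def reliability_tmr(failure_rate, t):
    """TMR without repair: the system works while at least 2 of 3 modules work."""
    r = reliability_simplex(failure_rate, t)
    return 3 * r**2 - 2 * r**3


if __name__ == "__main__":
    lam = 1.0  # failures per unit time, arbitrary example value
    crossover = math.log(2) / lam  # both curves meet where the module reliability is 0.5
    for t in (0.1, crossover, 2.0):
        print(f"t={t:.3f} simplex={reliability_simplex(lam, t):.4f} "
              f"tmr={reliability_tmr(lam, t):.4f}")
```

In this model TMR is better than a single module only while the module reliability stays above 0.5, that is for t < ln(2) / failure rate. Repair through scrubbing keeps resetting the accumulated faults, which is why TMR combined with repair is so effective.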

BYU-LANL TMR Tool

• BYU-LANL Triple Modular Redundancy
• Developed at BYU with the support of Los Alamos National Laboratory (Cibola Flight Experiment)
• Used to test TMR on many designs
  – Fault injection, radiation testing, in orbit
• Testbed for experimenting with various TMR application techniques (used for research)

Experience from using SRAM based FPGAs in the ALICE TPC Detector and Future Plans

Presenter: Johan Alme

• 4356 * 128 channels
• 1000 samples/event (10 bit)
• 700 MByte/event
• 200 Hz / 1 kHz event rate
• 142-710 GByte/s (raw)
• Data compression: 5-20 GByte/s (~x30)
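The quoted event size can be cross-checked from the channel count and the sampling parameters. This is a rough sketch that assumes tightly packed 10-bit samples, no headers or overhead, and decimal units (1 GByte = 10^9 bytes); none of these assumptions is stated in the talk, and the function names are hypothetical.

```python
CHANNELS = 4356 * 128
SAMPLES_PER_EVENT = 1000
BITS_PER_SAMPLE = 10


def bytes_per_event():
    """Raw event size in bytes, assuming tightly packed samples and no overhead."""
    return CHANNELS * SAMPLES_PER_EVENT * BITS_PER_SAMPLE / 8


def raw_rate_gbyte_per_s(event_rate_hz):
    """Raw data rate in GByte/s (decimal units) for a given event rate."""
    return bytes_per_event() * event_rate_hz / 1e9


if __name__ == "__main__":
    print(f"channels: {CHANNELS}")
    print(f"event size: {bytes_per_event() / 1e6:.0f} MByte")
    for rate_hz in (200, 1000):
        print(f"{rate_hz} Hz -> {raw_rate_gbyte_per_s(rate_hz):.0f} GByte/s raw")
```

The estimate gives roughly 697 MByte per event, consistent with the quoted 700 MByte/event. The resulting raw rates come out about 2% below the quoted 142-710 GByte/s; the transcript does not say how the quoted figures were rounded or whether they include overhead.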

• The RCU main FPGA sits in the data path
• Data readout is handled by the Readout Node
• Resource usage: 92% of CLBs, 75% of BRAM blocks (the remaining 25% of BRAM cannot be used due to the Active Partial Reconfiguration)
• Result: TMR or any other mitigation technique is not applicable

Solution: Reconfiguration
• Consists of: a radiation tolerant flash memory, a radiation tolerant flash based FPGA and the DCS board (an embedded PC with Linux)
• Corrects SEUs in the configuration memory of the Xilinx Virtex-II Pro vp7
• Why it works: Active Partial Reconfiguration

Problems and solution

• 2011 Pb-Pb run:
  – 300 - 400 x 10^24 cm^-2 s^-1: ~5 SEUs/h for all 216 FPGAs
• Run 2 scenario:
  – Peak luminosity 1 - 4 x 10^27 cm^-2 s^-1: ~45 SEUs/h for all 216 FPGAs
• Solution: upgrade the RCU -> RCU2
  – New «state of the art» System on Chip FPGA: Microsemi SmartFusion2
  – Faster, bigger, better in radiation!
    • First flash-based FPGA with SERDES
• Test carried out at the end of April
  – Waiting for the results!!

Monitoring of Radiation Levels

• On the present RCU we have the Reconfiguration Network acting as a radiation monitor

• Additional SRAM memory and Microsemi ProASIC3 250 added to the RCU2

• Cypress SRAM: same as used for the latest LHC RadMon devices

Closer Look

• Workshop on FPGAs for High-Energy Physics, http://indico.cern.ch/event/300532/timetable/#20140321
• Fault-Tolerance Techniques for SRAM-Based FPGAs (Frontiers in Electronic Testing) by Fernanda Lima Kastensmidt and Ricardo Reis (May 3, 2006)
• Kintex 7 Article, http://www.eetimes.com/author.asp?section_id=36&doc_id=1287740&page_number=1
• Soft Error Rate Estimations of the Kintex-7 FPGA within the ATLAS Liquid Argon (LAr) Calorimeter, Takai Helio, http://indico.cern.ch/event/228972/session/19/contribution/49/material/slides/0.pdf
• What's Microsemi Done With Actel's IGLOO Product Range? http://www.eetimes.com/author.asp?section_id=36&doc_id=1319435

THANK YOU!

