maintaining data integrity in programmable logic in atmospheric environments through error detection...


Post on 17-Dec-2015






  • Slide 1
  • Maintaining Data Integrity in Programmable Logic in Atmospheric Environments through Error Detection
      • Joel Seely, Technical Marketing Manager, Military & Aerospace Business Unit
  • Slide 2
  • Single Event Upset (SEU) Overview for SRAM-Based FPGAs
  • Slide 3
  • Definitions (Copyright 2004 Altera Corporation)
      • SEU: Single Event Upset. Unwanted Change in State of a Latch or a Memory Cell
      • SER: Soft Error Rate. The SEU Rate
      • SEFI: Single Event Functional Interrupt. Functional Failure Caused by an SEU
      • Not All SEUs Are SEFIs; It Generally Takes 5-10 SEUs to Cause a SEFI
  • Slide 4
  • Circuit Components of SRAM-Based FPGAs
      • I/O Registers & I/O Configuration: No Issue, Very Robust Registers, < 1 FIT
      • Logic Registers (LEs): No Issues, Very Robust Registers, < Hard Error Rate
      • User Memory: On-Chip Memories Are Typically By 9 (x9) for Parity Checking; IP Available for ECC
      • Configuration RAM (CRAM) for LUTs & Routing: Area of Focus
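The "by 9" memory width mentioned above leaves one spare bit per byte for parity. A minimal sketch of how such a check could work, assuming even parity with the parity bit stored as bit 8 of each 9-bit word (the slide does not state the parity sense or bit position, and the function names are illustrative only):

```python
def parity_bit(byte: int) -> int:
    """Even-parity bit for an 8-bit value: 1 when the number of set bits is odd."""
    if not 0 <= byte <= 0xFF:
        raise ValueError("expected an 8-bit value")
    return bin(byte).count("1") & 1


def encode(byte: int) -> int:
    """Pack 8 data bits plus the parity bit (assumed to sit in bit 8) into a 9-bit word."""
    return (parity_bit(byte) << 8) | byte


def check(word9: int) -> bool:
    """Return True when the stored parity bit matches the parity of the data bits."""
    data = word9 & 0xFF
    stored = (word9 >> 8) & 1
    return stored == parity_bit(data)


word = encode(0x5A)
print(check(word))          # intact word passes
print(check(word ^ 0x10))   # a single flipped bit is caught
print(check(word ^ 0x11))   # two flipped bits cancel out and go unnoticed
```

Parity only detects an odd number of flipped bits and cannot correct anything, which is presumably why the slide also points to ECC IP for user memory.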
  • Slide 5
  • Upset of a CRAM Cell
      • [Figure: 6-transistor cell schematic (Data In, Add, Vcc, Vss, Clear, Data Out) with a voltage-vs-time waveform]
      • [Plot: noise current for 10 fC collected charge; current (µA) vs. time (ps), both axes 0 to 200]
  • Slide 6
  • SEU Induced Failure Rate*
      Device   LE Count   SEU Rate (FIT)   SEFI Rate (FIT)   MTBF** (Years)
      EP1C6    6K         250              60                1,900
      EP1C20   20K        730              180               634
      EP1S25   26K        1950             400               285
      EP1S80   79K        6000             1200              95
      * Data at Sea Level
      ** MTBF: Mean Time Between Functional Interrupt
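The MTBF column follows directly from the SEFI rate, since one FIT is one failure per 10^9 device-hours. A short sketch of that conversion, assuming a 365-day year (8,760 hours); the function name is illustrative:

```python
HOURS_PER_YEAR = 8760          # assumption: 365-day year
HOURS_PER_FIT_UNIT = 1e9       # 1 FIT = 1 failure per 10^9 device-hours


def mtbf_years(fit: float) -> float:
    """Convert a failure rate in FIT to a mean time between failures in years."""
    if fit <= 0:
        raise ValueError("FIT rate must be positive")
    return HOURS_PER_FIT_UNIT / fit / HOURS_PER_YEAR


# SEFI rates taken from the table above.
sefi_fit = {"EP1C6": 60, "EP1C20": 180, "EP1S25": 400, "EP1S80": 1200}
for device, fit in sefi_fit.items():
    print(f"{device}: {mtbf_years(fit):,.0f} years")
```

The results agree with the table to within rounding (the slide rounds the EP1C6 figure to 1,900 years).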
  • Slide 7
  • Number of CRAM Bit Upsets for Each Occurrence of Functional Upset
      • [Figure: two panels showing upsets per functional upset; one with Median ~6, the other with Median 5]
  • Slide 8
  • Addressing System-Level Issues
  • Slide 9
  • SER Improvements/Mitigation: Chip Design Enhancements
      • New Materials & Process Enhancements
      • Larger CRAM Structure: Increase in Capacitance on Critical Node
      • Smaller Process => Smaller Die => Lower SEU Probability
      • Built-In Error Detection/Correction Circuitry
  • Slide 10
  • SER Per SRAM Bit Trend
      • [Chart: SER per SRAM Mbit (scale marked at 100 FIT and 1,000 FIT) vs. process technology and year: 0.5 µm (1995), 0.13 µm (2002), 90 nm (projection)]
  • Slide 11
  • System Level Improvements/Mitigation
      • ECC for User Memory
      • Use Detection/Correction Feature
      • Triple Modular Redundancy (TMR): To Achieve Lower Error Rate & Less Downtime
      • Migrate to Structured ASIC
  • Slide 12
  • Soft Error Detection Methods: Configuration RAM Readout
      • Read Out Full Bitstream
      • Compare with Stored Bitstream
      • Can Determine Where in Configuration Error Occurred
      • Caveat: Security Issues with Reading Out Bitstream
      • [Diagram: FPGA CRAM data and stored CRAM data feed a microprocessor or CPLD, which decides "Same or Different?"]
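The readout method amounts to a bit-for-bit comparison between the bitstream read back from the device and a stored reference copy. A minimal sketch of that comparison, assuming both images are available as equal-length byte strings; how the readback is actually obtained is device-specific and not covered here, and `find_upsets` is a hypothetical helper name:

```python
def find_upsets(readback: bytes, golden: bytes) -> list:
    """Return (byte_offset, bit_index) pairs where readback differs from the golden image."""
    if len(readback) != len(golden):
        raise ValueError("bitstreams must be the same length")
    upsets = []
    for offset, (r, g) in enumerate(zip(readback, golden)):
        diff = r ^ g                      # set bits mark the flipped positions
        for bit in range(8):
            if (diff >> bit) & 1:
                upsets.append((offset, bit))
    return upsets


golden = bytes([0x00, 0xFF, 0xA5])
readback = bytes([0x00, 0xFD, 0xA5])      # bit 1 of byte 1 has flipped
print(find_upsets(readback, golden))
```

Because the comparison yields exact positions, this approach can report where in the configuration the error occurred, at the cost of exposing the bitstream outside the device.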
  • Slide 13
  • Soft Error Detection Methods: On-Chip SEU Detection
      • Dedicated Comparison Circuitry, e.g. CRC Engine Comparing Stored CRC with That Calculated from Configuration RAM
      • Detection Circuitry Running Continuously
      • Error Detection Rate Variable Based on Implementation of Hardware, Number of CRAM Bits & Input Clock Frequency
      • Error Signal Available Internally or Externally
      • Caveat: Cannot Determine Where in Configuration Error Occurred
      • [Diagram: inside the FPGA, the computed value is compared (=) with the stored value and the result goes to the core]
  • Slide 14
  • On-Chip Detection Example: Dedicated CRC Circuit
      • Configuration RAM Verification Capability
      • 32-Bit Cyclic Redundancy Code Check, Verified Against Internally Stored Value
      • Runs in the Background Without Impacting Device Performance
      • Close to Real-Time Detection; Variable Clock Frequency; Depends on Number of CRAM Bits
      • Multi-Event Detection: Up to 3-Bit for 32-Bit CRC
      • Result Output to Either Core or Pin
      • Use with Either Internal or External Hardware for Error Correction
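The CRC approach can be illustrated in software: compute a 32-bit CRC over the configuration data once, store it, then recompute and compare. This is only a sketch. It uses the CRC-32 from Python's `zlib` as a stand-in, since the slides do not say which polynomial the hardware uses, and the function names are illustrative:

```python
import zlib


def crc_of(config: bytes) -> int:
    """32-bit CRC of the configuration image (zlib's CRC-32, used here as a stand-in)."""
    return zlib.crc32(config) & 0xFFFFFFFF


def seu_detected(config: bytes, stored_crc: int) -> bool:
    """True when the recomputed CRC no longer matches the stored value."""
    return crc_of(config) != stored_crc


config = bytes(range(256))                # placeholder configuration data
stored_crc = crc_of(config)               # computed once, at configuration time

upset = bytearray(config)
upset[10] ^= 0x04                         # simulate one flipped CRAM bit

print(seu_detected(config, stored_crc))        # intact image: no error flagged
print(seu_detected(bytes(upset), stored_crc))  # flipped bit: error flagged
```

The check returns a single pass/fail result, which mirrors the caveat on the previous slide: a CRC mismatch says that something changed, not where.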
  • Slide 15
  • Correction Methods
      • FPGA Detection, System-Level Correction
          • Lower Total Cost
          • Downtime Is Limited & Manageable
          • Used in Non-Critical Applications
      • Triple Modular Redundancy
          • Two Flavors: All On-Chip in FPGA, or Separate Chips & Voter
          • Correction Can Be Real-Time
          • Used in Critical Applications
  • Slide 16
  • Single System Detection & Correction
      • Step One: Detect the Soft Error (75% of Reported Errors Are Don't-Care Errors)
      • Step Two: Alert the System
      • Step Three: Fix the Error
          • In Some Cases, Re-Program the FPGA
          • In Some Cases, Reboot the Sub-System
          • In Some Cases, Reboot the System
      • Need to Focus on System Downtime
          • Each System Has Unique Requirements
          • Re-Programming FPGA Takes < 250 ms
          • Rebooting Time Varies & Can Be Fast by Design
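To put the downtime point in perspective, an event rate in FIT can be turned into expected recovery time per year. The sketch below is a rough estimate under stated assumptions: every detected SEU triggers a re-program, each re-program costs the full 250 ms upper bound quoted above, and the 6000 FIT input is the EP1S80 SEU rate from the failure-rate table on slide 6. The function names are illustrative:

```python
HOURS_PER_YEAR = 8760   # assumption: 365-day year


def events_per_year(fit: float) -> float:
    """Expected number of events per year for a rate given in FIT (failures per 1e9 hours)."""
    return fit * 1e-9 * HOURS_PER_YEAR


def downtime_seconds_per_year(fit: float, recovery_s: float) -> float:
    """Expected yearly downtime if every event costs recovery_s seconds."""
    return events_per_year(fit) * recovery_s


seu_fit = 6000          # EP1S80 SEU rate from slide 6
reprogram_s = 0.25      # upper bound on re-programming time from this slide

print(f"{events_per_year(seu_fit):.4f} events per year")
print(f"{downtime_seconds_per_year(seu_fit, reprogram_s):.4f} s of downtime per year")
```

A system that has to reboot rather than re-program would substitute its own recovery time, which is why the slide stresses that each system has unique requirements.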
  • Slide 17
  • TMR Method 1
      • Identical Hardware in FPGAs
      • Use Voter Implemented in FPGA or CPLD
      • Utilize Either Hardware Output or CRC Error Pin
      • Voter Also Used to Signal Reconfiguration on Difference or Error
      • [Diagram: FPGA Hardware 1, FPGA Hardware 2 and FPGA Hardware 3 feed an FPGA or CPLD (Voting)]
  • Slide 18
  • TMR Method 2
      • Multiple Instantiations of Hardware in Single FPGA
      • For Low-Rate SEUs
      • SEU Events May Occur Much More Frequently than Functional Error (De-Rating)
      • Voter Signals Reconfiguration of FPGA; FPGA Must Be Reconfigured
      • [Diagram: Hardware 1, Hardware 2 and Hardware 3 inside one FPGA feed a Voting Circuit]
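Both TMR methods rely on a majority voter across three copies of the hardware. A minimal software sketch of the voting logic, assuming each copy's output is an integer bit vector; the disagreement flag stands in for the "signal reconfiguration on difference" role described on the slides, and the function names are illustrative:

```python
def vote(a: int, b: int, c: int) -> int:
    """Bitwise majority of three outputs: each result bit takes the value held by at least two inputs."""
    return (a & b) | (a & c) | (b & c)


def vote_and_flag(a: int, b: int, c: int):
    """Return the voted output and a flag that is True when the three copies disagree."""
    voted = vote(a, b, c)
    disagree = not (a == b == c)
    return voted, disagree


# Copy 3 has been corrupted; the voter masks the error and raises the flag.
out, needs_reconfig = vote_and_flag(0b1010, 0b1010, 0b0110)
print(bin(out), needs_reconfig)
```

The voter masks a fault in any single copy, but the flag still matters: until the faulty copy is reconfigured, a second upset in another copy could outvote the remaining good one.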
  • Slide 19
  • De-Rating Methodology
      • Only a Fraction of Configuration Bits Are Actually Programmed
          • e.g. Using Only Two Inputs of 4-Input LUT Leaves 75% of LUT as Don't Care
          • Only About 20% of Routing Is Used
          • Depends on Utilization & Application
      • Some Un-Programmed Bits Still Matter: Flipping Could Change Function of the Device
      • Extensive Experimentation Shows That 1/8 to 1/3 of the Bits Matter
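Two of the numbers above can be checked with simple arithmetic. The sketch below assumes a k-input LUT holds 2^k configuration bits and that unused inputs are held constant, so only 2^(used inputs) entries are ever addressed. It also computes the ratio of SEFI rate to SEU rate from the failure-rate table on slide 6 as an empirical de-rating factor. The function names are illustrative:

```python
def lut_dont_care_fraction(total_inputs: int, used_inputs: int) -> float:
    """Fraction of LUT bits never addressed when only some inputs are used."""
    if not 0 <= used_inputs <= total_inputs:
        raise ValueError("used_inputs must be between 0 and total_inputs")
    return 1 - (2 ** used_inputs) / (2 ** total_inputs)


def derating_factor(seu_fit: float, sefi_fit: float) -> float:
    """Share of upsets that become functional interrupts."""
    return sefi_fit / seu_fit


print(lut_dont_care_fraction(4, 2))   # two of four inputs used

# (SEU rate, SEFI rate) in FIT, from slide 6.
rates = {"EP1C6": (250, 60), "EP1C20": (730, 180),
         "EP1S25": (1950, 400), "EP1S80": (6000, 1200)}
for device, (seu, sefi) in rates.items():
    print(f"{device}: {derating_factor(seu, sefi):.2f}")
```

The four ratios fall between 0.20 and 0.25, inside the 1/8 to 1/3 range quoted on this slide.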
  • Slide 20
  • Structured ASIC: Ultimate SEU Protection
      • No Configuration Memory = Estimated SER Is Below Hard Failure Rate for the Device
      • [Diagram: FPGA to Structured ASIC; PLD Architecture with ASIC Routing]
  • Slide 21
  • Summary
      • SEU Is a Well-Understood Phenomenon
      • Many Chip-Level Enhancements Mitigate SEUs: Process, Design, Manufacturing Techniques
      • Easy Detection of SEU Events Is Key
      • After Detection, Other Methods Must Be Employed to Deal with the Event
      • Critical Nature of Application Determines Level of SEU Response
      • Structured ASICs from FPGA Designs Offer a Much More Robust Solution Due to Removal of All CRAM
