
SEU Detection and Recovery in Intel® Arria® 10 Devices

AN-737, 2017.03.15

This application note describes the implementation of Intel® Arria® 10 single event upset (SEU) detection and recovery features by presenting the following information:

• Error detection and correction feature architecture in Arria 10 devices.
• General implementation guidelines for error detection cyclic redundancy check (EDCRC) and error correction feature.
• General implementation guidelines for embedded memory error correction code (ECC) feature.
• Arria 10 EDCRC reference design with detailed development flow.

Related Information

• Test Methodology of Error Detection and Recovery using CRC in Intel FPGA Devices
  Provides more information about SEU detection and recovery in Arria II, Stratix III, Stratix IV, Arria V, Cyclone V, and Stratix V devices.
• Altera Advanced SEU Detection IP Core User Guide
  Provides more information about hierarchy tagging and sensitivity processing using Altera Advanced SEU Detection IP core.
• Altera Fault Injection IP Core User Guide
  Provides more information about injecting soft error to simulate SEU using Altera Fault Injection IP core.
• Altera Error Message Register Unloader IP Core User Guide
  Provides more information about retrieving and storing the error message register using Altera Error Message Register Unloader IP Core.
• SEU Mitigation for Arria 10 Devices
  Provides more information about Arria 10 SEU features.
• Arria 10 ROM with ECC Reference Design Files
• Arria 10 EDCRC Reference Design Files
  Reference design files that you need to apply steps and compilation described in Creating Arria 10 SEU Fault Injection and Hierarchy Tagging Design with Qsys.
• Complete Arria 10 EDCRC Reference Design Files
  Precompiled reference design files ready for design testing in Design Testing with Fault Injection Debugger.
• Arria 10 GX FPGA Development Kit

Intel Corporation. All rights reserved. Intel, the Intel logo, Altera, Arria, Cyclone, Enpirion, MAX, Nios, Quartus and Stratix words and logos are trademarks of Intel Corporation or its subsidiaries in the U.S. and/or other countries. Intel warrants performance of its FPGA and semiconductor products to current specifications in accordance with Intel's standard warranty, but reserves the right to make changes to any products and services at any time without notice. Intel assumes no responsibility or liability arising out of the application or use of any information, product, or service described herein except as expressly agreed to in writing by Intel. Intel customers are advised to obtain the latest version of device specifications before relying on any published information and before placing orders for products or services. *Other names and brands may be claimed as the property of others.

ISO 9001:2008 Registered

www.altera.com
101 Innovation Drive, San Jose, CA 95134


Arria 10 Error Detection and Correction Feature Architecture

Error Detection and Correction for CRAM

Error Detection Cyclic Redundancy Check

In user mode, the contents of the configured configuration RAM (CRAM) bits can be affected by soft errors. These soft errors, which are caused by an ionizing particle, are not common in Intel FPGA devices. However, high-reliability applications that require error-free device operation may require your design to consider these errors.

The hardened on-chip EDCRC circuitry allows you to perform the following operations without any impact on the fitting or performance of the device:

• Auto-detection of cyclic redundancy check (CRC) errors during configuration.
• Optional detection and identification of soft errors (SEU and multiple-bit upset) in user mode.
• Fast soft error detection. The error detection speed is improved.
• Two types of check-bits:
  • Frame-based check-bits: stored in CRAM and used to verify the integrity of the frame.
  • Column-based check-bits: stored in registers and used to protect the integrity of all frames.

During error detection in user mode, a number of EDCRC engines run in parallel for Arria 10 devices. The number of error detection CRC engines depends on the frame length, that is, the total number of bits in a frame.

Each column-based error detection CRC engine reads 128 bits from each frame and processes them within four cycles. To detect errors, the error detection CRC engine needs to read back all frames.

Figure 1: Block Diagram for Error Detection in User Mode

The block diagram shows the registers and data flow in user mode.

[Figure not reproduced. Blocks and signals shown: Readback Bitstream, CRC Calculation, Error Detection Search Engine, Syndrome, Correction Pattern Write Back to CRAM for Correction, Error Message Register, JTAG Update Register, JTAG Shift Register, User Update Register, User Shift Register, HPS Update Register, HPS Shift Register, CRC_ERROR, JTAG TDO, General Routing, HPS Output.]


Table 1: Error Detection Registers

| Name | Description |
|---|---|
| Error message register (EMR) | Contains error details for single-bit and double-adjacent errors. The error detection circuitry updates this register each time the circuitry detects an error. |
| User update register | This register is automatically updated with the contents of the EMR one clock cycle after the contents of this register are validated. The user update register includes a clock enable, which must be asserted before its contents are written to the user shift register. This requirement ensures that the user update register is not overwritten when its contents are being read by the user shift register. |
| User shift register | This register allows user logic to access the contents of the user update register via the core interface. You can use the Altera Error Message Register Unloader IP core to shift out the EMR information through the user shift register. For more information, refer to the related information. |
| JTAG update register | This register is automatically updated with the contents of the EMR one clock cycle after the content of this register is validated. The JTAG update register includes a clock enable, which must be asserted before its contents are written to the JTAG shift register. This requirement ensures that the JTAG update register is not overwritten when its contents are being read by the JTAG shift register. |
| JTAG shift register | This register allows you to access the contents of the JTAG update register via the JTAG interface using the SHIFT_EDERROR_REG JTAG instruction. |
| Hard Processor System (HPS) update register | This register is automatically updated with the contents of the EMR one clock cycle after the content of this register is validated. The HPS update register includes a clock enable, which must be asserted before its contents are written to the HPS shift register. This requirement ensures that the HPS update register is not overwritten when its contents are being read by the HPS shift register. |
| HPS shift register | This register allows you to access the contents of the HPS update register via the HPS interface. |

Related Information

• Altera Error Message Register Unloader IP Core User Guide
  Provides more information about using the Altera EMR Unloader IP core.


Column-Based and Frame-Based Check-Bits

Figure 2: Column-Based and Frame-Based Check-Bits

[Figure not reproduced. It shows CRAM frames (Frame 0, Frame 1, Frame 2, through the last frame) crossed by columns (Column 0, Column 1, through the last column). Each frame is drawn as 128-bit data words with 32-bit frame-based check-bits, and each column has 32-bit column-based check-bits.]

EDCRC Check-Bits Updates

Frame-based check-bits are calculated on-chip during configuration. Column-based check-bits are updated after configuration.

When you enable the EDCRC feature, after the device enters user mode, the EDCRC function starts reading CRAM frames. The data collected from the read-back frame is validated against the frame-based check-bits.

After the initial frame-based verification is completed, the column-based check-bits are calculated based on the respective column CRAM. The EDCRC hard block recalculates the column-based check-bits in one of the following scenarios:

• FPGA reconfiguration
• After a successful partial reconfiguration (PR) session
• After a configuration via protocol (CvP) session

Error Message Register

The EMR contains information on the error type, the location of the error, and the actual syndrome. This register is 78 bits wide in Arria 10 devices. The EMR does not identify the location bits for uncorrectable errors. The location of the error consists of the frame number, double word location, and bit location within the frame and column.

You can shift out the contents of the register through the following:

• EMR Unloader IP core (core interface)
• SHIFT_EDERROR_REG JTAG instruction (JTAG interface)
• HPS shift register (HPS interface)


Figure 3: Error Message Register Map

[Figure not reproduced. Fields from MSB to LSB: Frame Address (16 bits), Column-Based Double Word (2 bits), Column-Based Bit (5 bits), Column-Based Type (3 bits), Frame-Based Syndrome (32 bits), Frame-Based Double Word (10 bits), Frame-Based Bit (5 bits), Frame-Based Type (3 bits), Reserved (1 bit), Column-Check-Bit Update (1 bit). The figure groups the fields into column-based fields and frame-based fields.]

Table 2: Error Message Register Width and Description

| Name | Width (Bits) | Description |
|---|---|---|
| Frame Address | 16 | Frame number of the error location. |
| Column-Based Double Word | 2 | There are 4 double words per frame in a column. This field indicates the double word location of the error. |
| Column-Based Bit | 5 | Error location within the 32-bit double word. |
| Column-Based Type | 3 | Types of error shown in Table 3. |
| Frame-Based Syndrome Register | 32 | Contains the 32-bit CRC signature calculated for the current frame. If the CRC value is 0, the CRC_ERROR pin is driven low to indicate no error. Otherwise, the pin is pulled high. |
| Frame-Based Double Word | 10 | Double word location within the CRAM frame. |
| Frame-Based Bit | 5 | Error location within the 32-bit double word. |
| Frame-Based Type | 3 | Types of error shown in Table 3. |
| Reserved | 1 | Reserved bit. |
| Column-Based Check-Bits Update | 1 | Logic high if an error is encountered during the column check-bits update stage. The CRC_ERROR pin is asserted and stays high until the FPGA is reconfigured. |
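As an illustration of the register map, the following minimal Python sketch splits a 78-bit EMR value into the fields of Table 2. It assumes the value has already been unloaded into an integer with the Frame Address in the most significant bits; the names `EMR_FIELDS` and `decode_emr` are hypothetical and are not part of any Intel tool.

```python
from typing import Dict

# Field layout taken from Table 2, listed MSB first: (name, width in bits).
EMR_FIELDS = [
    ("frame_address", 16),
    ("column_double_word", 2),
    ("column_bit", 5),
    ("column_type", 3),
    ("frame_syndrome", 32),
    ("frame_double_word", 10),
    ("frame_bit", 5),
    ("frame_type", 3),
    ("reserved", 1),
    ("column_check_bits_update", 1),
]
EMR_WIDTH = sum(width for _, width in EMR_FIELDS)  # 78 bits in total


def decode_emr(value: int) -> Dict[str, int]:
    """Split a 78-bit EMR value (hypothetical helper) into named fields."""
    if not 0 <= value < (1 << EMR_WIDTH):
        raise ValueError("EMR value must fit in 78 bits")
    fields = {}
    shift = EMR_WIDTH
    for name, width in EMR_FIELDS:
        shift -= width  # position of this field's least significant bit
        fields[name] = (value >> shift) & ((1 << width) - 1)
    return fields


if __name__ == "__main__":
    # Example value: frame 0x1234, frame-based single-bit error at bit 7.
    sample = (0x1234 << 62) | (7 << 5) | (0b001 << 2)
    for name, field_value in decode_emr(sample).items():
        print(f"{name:26s} {field_value:#x}")
```

Keeping the layout in a single list means the widths can be checked against the 78-bit total in one place.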

Related Information

• Reading EMR using JTAG Interface
• Reading EMR using EMR Unloader IP Core
• Reading EMR using HPS

Error Type in EMR

Table 3: Error Type in EMR

The following table lists the possible error types reported in the error type field in the EMR.


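| Error Types | Bit 2 | Bit 1 | Bit 0 | Description |
|---|---|---|---|---|
| Frame-Based | 0 | 0 | 0 | No error |
| Frame-Based | 0 | 0 | 1 | Single-bit error |
| Frame-Based | 0 | 1 | X | Double-adjacent error |
| Frame-Based | 1 | 1 | 1 | Uncorrectable error |
| Column-Based | 0 | 0 | 0 | No error |
| Column-Based | 0 | 0 | 1 | Single-bit error |
| Column-Based | 0 | 1 | X | Double-adjacent error in a same frame |
| Column-Based | 1 | 0 | X | Double-adjacent error in a different frame |
| Column-Based | 1 | 1 | 0 | Double-adjacent error in a different frame |
| Column-Based | 1 | 1 | 1 | Uncorrectable error |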
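The type encodings in Table 3 can be looked up with a small pattern match in which X is a don't-care bit. The sketch below is illustrative only; `describe_error_type` and the two pattern lists are hypothetical names, and frame-based codes that Table 3 does not list are reported as such rather than guessed.

```python
# Patterns copied from Table 3; "X" marks a don't-care bit.
FRAME_BASED_TYPES = [
    ("000", "No error"),
    ("001", "Single-bit error"),
    ("01X", "Double-adjacent error"),
    ("111", "Uncorrectable error"),
]
COLUMN_BASED_TYPES = [
    ("000", "No error"),
    ("001", "Single-bit error"),
    ("01X", "Double-adjacent error in a same frame"),
    ("10X", "Double-adjacent error in a different frame"),
    ("110", "Double-adjacent error in a different frame"),
    ("111", "Uncorrectable error"),
]


def describe_error_type(type_bits: int, column_based: bool) -> str:
    """Return the Table 3 description for a 3-bit error type field."""
    if not 0 <= type_bits <= 0b111:
        raise ValueError("error type is a 3-bit field")
    bits = format(type_bits, "03b")  # bit 2 first, as in Table 3
    table = COLUMN_BASED_TYPES if column_based else FRAME_BASED_TYPES
    for pattern, description in table:
        if all(p in ("X", b) for p, b in zip(pattern, bits)):
            return description
    return "Not listed in Table 3"


if __name__ == "__main__":
    print(describe_error_type(0b011, column_based=False))
    print(describe_error_type(0b101, column_based=True))
```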

Related Information

• CRC_ERROR Pin Behavior
• SEU Mitigation for Arria 10 Devices
  Provides more information about Arria 10 SEU error detection frequency.

Recovering from CRC Errors

Arria 10 devices support internal scrubbing. The internal scrubbing feature automatically corrects correctable CRAM upsets when an upset is detected. However, internal scrubbing cannot restore the FPGA to a known good state. The time between the error and the completion of scrubbing can be tens of milliseconds. This duration represents thousands of clock cycles during which data was still being written to memories or status registers. It is good practice to always follow any SEU event with a soft reset to bring the FPGA operation to a known good state.

If a soft reset is unable to bring the FPGA to a known good state, you can reconfigure the device to rewrite the CRAM and reinitialize the design registers. The system that hosts the Arria 10 device must control the device reconfiguration. When reconfiguration completes successfully, the Arria 10 device operates as intended.

Memory Blocks Error Correction Code Support

ECC allows you to detect and correct data errors at the output of the memory. ECC can perform single-error correction, double-adjacent-error correction, and triple-adjacent-error detection in a 32-bit word. However, ECC cannot detect four or more errors.


The M20K blocks have built-in support for ECC when in x32-wide simple dual-port mode:

• The M20K runs slower than non-ECC simple dual-port mode when ECC is engaged. However, you can enable optional ECC pipeline registers before the output decoder to achieve higher performance compared to non-pipeline ECC mode at the expense of one cycle of latency.
• The M20K ECC status is communicated with two ECC status flag signals: e (error) and ue (uncorrectable error). The status flags are part of the regular output from the memory block. When ECC is engaged, you cannot access two of the parity bits because the ECC status flags replace them.

Guidelines for Error Detection CRC and Error Correction Feature

Error Detection

Enabling Error Detection

There are two methods to turn on the Arria 10 error detection CRC feature, depending on your application needs:

• If your design detects SEUs and reads the EMR using user logic, instantiate the EMR Unloader IP core, which automatically turns on the EDCRC feature.
• If you want to monitor SEUs with an external host and do not need to read the EMR from user logic, you can turn on the EDCRC feature by enabling the CRC_ERROR pin in your Quartus® Prime project.

Related Information

• Altera Error Message Register Unloader IP Core User Guide
  Provides more information about using the Altera EMR Unloader IP core.

Enabling the Error Detection CRC_ERROR Pin

To enable the CRC_ERROR pin for external host monitoring, perform the following steps:

1. On the Assignments menu, click Device.
2. Click Device and Pin Options and select Error Detection CRC in the left panel.
3. Turn on Enable Error Detection CRC_ERROR pin.
4. Select the EDCRC clock divisor from the Divide error check frequency by list.

   Note: This option gives you the flexibility to run the EDCRC at a slower speed. However, Intel recommends that you select the smallest EDCRC clock divisor. Setting a high divisor can increase the error detection time. Refer to the SEU Mitigation chapter of the Arria 10 handbook for the detection time specification.

5. Turn on Enable open drain on CRC_ERROR pin if you have an external pull-up resistor on your board.
6. Click OK.

Related Information

• SEU Mitigation for Arria 10 Devices
  Provides more information about Arria 10 SEU error detection time.

Reading EMR


Reading EMR using EMR Unloader IP Core

You can instantiate the EMR Unloader IP core to detect SEUs and unload the EMR content in user logic. The EDCRC feature is turned on automatically when the EMR Unloader IP core is instantiated. The EMR Unloader IP core helps you to read the EMR whenever there is an SEU event by:

• Unloading the EMR via core logic
• Accessing the hard CRC block
• Providing access to the user logic to read the EMR data

Figure 4: EMR Unloader IP Core Block Diagram

[Figure not reproduced. Blocks and signals shown: EMR Unloader IP Core, EMR Unloader, Hard CRC Block, EMR (Avalon-ST Source), CRC Error, EMR Read, End of Full-Chip Error Detection Cycle (Optional).]

Related Information

• Altera Error Message Register Unloader IP Core User Guide
  Provides more information about retrieving and storing the error message register using Altera Error Message Register Unloader IP Core.

Reading EMR using HPS

The FPGA Manager in the HPS can monitor the CRC_ERROR status pin and retrieve the error symptom, location, and type. You can choose to enable the CRC error interrupt from the FPGA Manager, followed by CRC error information extraction from the respective registers.

Related Information

• FPGA Manager of the Arria 10 Hard Processor System Technical Reference Manual

Reading EMR using JTAG Interface

To unload the contents of the EMR using a JTAG port, use the SHIFT_EDERROR_REG JTAG instruction. This JTAG instruction connects the EMR to the JTAG pin in the error detection block between the TDI and TDO pins. You can execute the instruction whenever the CRC_ERROR pin goes high. You must unload the contents of the EMR before the register is overwritten by the information of the next CRC error.


Table 4: SHIFT_EDERROR_REG JTAG Instruction

| JTAG Instruction | Instruction Code | Description |
|---|---|---|
| SHIFT_EDERROR_REG | 00 0001 0111 | The JTAG instruction connects the EMR to the JTAG pin in the error detection block between the TDI and TDO pins. |

The following shows the Jam™ Standard Test and Programming Language (STAPL) Format File (.jam) used to execute the SHIFT_EDERROR_REG JTAG instruction to unload the contents of the EMR.

Example 1: Example of .jam File to Unload the Contents of the EMR for Arria 10 Device

    ACTION UNLOAD_EMR = EXECUTE;
    DATA EMR_DATA;
        BOOLEAN out[78];
    ENDDATA;
    PROCEDURE EXECUTE USES EMR_DATA;
        DRSTOP IDLE;
        IRSTOP IDLE;
        STATE IDLE;
        ' Load the 10-bit SHIFT_EDERROR_REG instruction code.
        IRSCAN 10, $017;
        WAIT IDLE, 10 CYCLES, 1 USEC, IDLE;
        ' Shift out the 78-bit EMR and capture it in out[].
        DRSCAN 78, $0, CAPTURE out[77..0];
        WAIT IDLE, 10 CYCLES, 25 USEC, IDLE;
        PRINT " ";
        PRINT "Data read out from the ";
        ' Print the whole register, MSB first, with a space between fields.
        PRINT "EMR_Register :",
            out[77], out[76], out[75], out[74], out[73], out[72], out[71], out[70],
            out[69], out[68], out[67], out[66], out[65], out[64], out[63], out[62], " ",
            out[61], out[60], " ",
            out[59], out[58], out[57], out[56], out[55], " ",
            out[54], out[53], out[52], " ",
            out[51], out[50], out[49], out[48], out[47], out[46], out[45], out[44],
            out[43], out[42], out[41], out[40], out[39], out[38], out[37], out[36],
            out[35], out[34], out[33], out[32], out[31], out[30], out[29], out[28],
            out[27], out[26], out[25], out[24], out[23], out[22], out[21], out[20], " ",
            out[19], out[18], out[17], out[16], out[15], out[14], out[13], out[12],
            out[11], out[10], " ",
            out[9], out[8], out[7], out[6], out[5], " ",
            out[4], out[3], out[2], " ",
            out[1], " ",
            out[0];
        'PRINT " ";
        ' Print each field on its own line.
        PRINT "Frame Address :",
            out[77], out[76], out[75], out[74], out[73], out[72], out[71], out[70],
            out[69], out[68], out[67], out[66], out[65], out[64], out[63], out[62];
        PRINT "Column-Based Double Word Location :", out[61], out[60];
        PRINT "Column-Based Bit :", out[59], out[58], out[57], out[56], out[55];
        PRINT "Column-Based Type :", out[54], out[53], out[52];
        PRINT "Frame-Based Syndrome :",
            out[51], out[50], out[49], out[48], out[47], out[46], out[45], out[44],
            out[43], out[42], out[41], out[40], out[39], out[38], out[37], out[36],
            out[35], out[34], out[33], out[32], out[31], out[30], out[29], out[28],
            out[27], out[26], out[25], out[24], out[23], out[22], out[21], out[20];
        PRINT "Frame-Based Double Word Location :",
            out[19], out[18], out[17], out[16], out[15], out[14], out[13], out[12],
            out[11], out[10];
        PRINT "Frame-Based Bit :", out[9], out[8], out[7], out[6], out[5];
        PRINT "Frame-Based Type :", out[4], out[3], out[2];
        PRINT "Reserved bit :", out[1];
        PRINT "Column-based EDCRC Check Bits Update:", out[0];
        STATE IDLE;
        EXIT 0;
    ENDPROC;
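On the host side, the bit string printed after "EMR_Register :" can be turned back into a number for further processing. The Python sketch below assumes the Jam player renders each captured bit as a 0 or 1 character, MSB first, with optional spaces between fields as in the PRINT statement of the example; the exact player output format is an assumption, and `emr_bits_to_int` is a hypothetical helper name. The sketch also checks that the instruction code in Table 4 equals the $017 operand of IRSCAN.

```python
EMR_WIDTH = 78  # register width stated for Arria 10 devices

# The 10-bit instruction code from Table 4, written as an integer.
SHIFT_EDERROR_REG = int("00 0001 0111".replace(" ", ""), 2)


def emr_bits_to_int(printed: str) -> int:
    """Convert an MSB-first string of 78 binary digits to an integer.

    Whitespace between field groups is ignored. The input format is an
    assumption about how the Jam player prints the captured bits.
    """
    bits = "".join(printed.split())
    if len(bits) != EMR_WIDTH or set(bits) - {"0", "1"}:
        raise ValueError("expected exactly 78 binary digits")
    return int(bits, 2)


if __name__ == "__main__":
    # The Table 4 code and the IRSCAN operand are the same value.
    print(hex(SHIFT_EDERROR_REG))
    # Hypothetical printout: frame address 1, everything else zero.
    line = "0000000000000001 00 00000 000 " + "0" * 32 + " 0000000000 00000 000 0 0"
    print(hex(emr_bits_to_int(line)))
```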

Related Information

• SEU Mitigation for Arria 10 Devices
  Provides more information about Arria 10 SEU features.

Enabling Error Correction (Internal Scrubbing)

Arria 10 supports the internal scrubbing feature to automatically scrub away the flipped bit induced by the SEU. To enable the internal scrubbing feature, follow these steps:

1. On the Assignments menu, click Device.
2. Click Device and Pin Options and select the Error Detection CRC tab.
3. Turn on Enable internal scrubbing.
4. Click OK.

Interpreting CRC_ERROR

It is important to determine the error type when an SEU is detected. This section explains the CRC_ERROR pin behavior and how to interpret whether the error type is correctable or uncorrectable.

CRC_ERROR Pin Behavior

The Arria 10 fast EDCRC feature runs all the column-based check-bits engines in parallel. When an SEU is detected, the column-based check-bits assert CRC_ERROR, and the detected frame location is then passed to the frame-based check-bits to further localize the affected bit. This process causes the CRC_ERROR pin to assert twice. The column-based check-bits assert the first CRC_ERROR pulse, and the frame-based check-bits assert the second pulse.

In Arria 10, as soon as an SEU is detected, CRC_ERROR is asserted high and remains high until the EMR is ready to be read. You can unload the EMR data as soon as the CRC_ERROR pin goes low. Once the EMR data is unloaded, you can determine the error type and the affected location. With this information, you can decide how your system responds to the specific SEU event.


Figure 5: Fast EDCRC Process Flow Chart

[Flow chart not reproduced. Steps shown: EDCRC Running; Start EDCRC Column-Based Error Scan; Find Frame Location; Update EMR Column-Based Fields; Start EDCRC Frame-Based Error Scan; Find Frame Location(s); Update EMR Frame-Based Fields; Error Correction. Decision points shown: Error Detected?; Error Correctable? (once in the column-based scan and once in the frame-based scan). The chart marks where CRC_ERROR is asserted and deasserted in each scan.]


Figure 6: Timing Diagram for Column-Based Check-Bits

If the error is correctable, in most cases there is a second pulse in a single SEU event. There are cases where the error is uncorrectable even though the CRC_ERROR pin asserts two pulses; refer to Correctable and Uncorrectable Error for the complete list of correctable and uncorrectable error cases. The complete EMR is available only at the falling edge of the second pulse.

[Timing diagram not reproduced. It shows the CRC_ERROR pin over one SEU event with these annotations: Column-Based Error Detected; Column-Based Check-Bits Assertion Time; Column-Based EMR is Available; Duration to expect 2nd pulse triggered by Frame-Based Check-Bits; Frame-Based Check-Bits Assertion Time; Complete EMR is Available; Unload EMR Starts; Unload EMR Ends; EMR for the Second Frame (1).]

(1) In a rare event of correctable double-adjacent error located in different frames.

In the rare event of an uncorrectable and un-locatable error, the CRC_ERROR signal is asserted only once. The frame-based check-bits do not assert a second pulse because the location of the uncorrectable error cannot be determined. The statistical likelihood of an uncorrectable multi-bit SEU is less than one in 10,000 years for a device in typical environmental conditions.
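An external host that samples the CRC_ERROR pin can apply the pulse behavior described above by counting rising edges and noting the last falling edge. The following Python sketch is a simplified model, not a timing-accurate monitor: it assumes the pin is low before the capture starts, that the sample rate is high enough to see every pulse, and that the sample list covers one SEU event. The function names are hypothetical.

```python
from typing import List, Optional, Tuple


def count_pulses(samples: List[int]) -> Tuple[int, Optional[int]]:
    """Return (rising edges seen, index of the last falling edge or None)."""
    rising = 0
    last_falling = None
    previous = 0  # assumption: CRC_ERROR is low before the first sample
    for index, level in enumerate(samples):
        if level and not previous:
            rising += 1
        elif previous and not level:
            last_falling = index
        previous = level
    return rising, last_falling


def interpret_event(samples: List[int]) -> str:
    """Summarize one captured SEU event window in words."""
    pulses, last_falling = count_pulses(samples)
    if pulses == 0:
        return "no CRC_ERROR pulse observed"
    if samples[-1]:
        return "CRC_ERROR still high: EMR not ready, or the pin is held high"
    if pulses == 1:
        return f"single pulse: unload the EMR from sample {last_falling}"
    return f"{pulses} pulses: complete EMR available from sample {last_falling}"


if __name__ == "__main__":
    print(interpret_event([0, 1, 1, 0, 0, 1, 0, 0]))
    print(interpret_event([0, 1, 1, 0]))
```

The pulse count alone does not decide correctability; the error type fields in the unloaded EMR still have to be checked.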

Figure 7: Timing Diagram for Column-Based or Frame-Based Check-Bits

Example of CRC_ERROR pin behavior for column-based/frame-based check-bits with a single pulse observed in one SEU event.

[Timing diagram not reproduced. It shows the CRC_ERROR pin with these annotations: Column-Based/Frame-Based Error Detected; Column-Based/Frame-Based Check-Bits Assertion Time; Unload EMR Starts.]

Related Information

• Error Type in EMR
• SEU Mitigation for Arria 10 Devices
  Provides more information about Arria 10 SEU error detection frequency.

Correctable and Uncorrectable Error

When an SEU is detected, you can read the EMR data to determine whether the error is correctable or uncorrectable. Intel recommends that you use the Altera EMR Unloader IP core in your design. The Altera EMR Unloader IP core interprets the error and reports it at the output.

Table 5: Correctable and Uncorrectable Error Cases

The table summarizes the correctable and uncorrectable error cases. You do not need to determine whether the current EDCRC operation is in frame-based check-bits or column-based check-bits, but you need to know how to interpret the error type of the column-based or frame-based fields. If the EMR Unloader IP core reports an error type other than 3'b111, the error is correctable and is scrubbed if you turned on internal scrubbing.

Case EDCRCOperation

CRC_ERRORPulse

Column-Based Field Frame-Based Field Correct‐able

Remark

A(1) Frame-basedcheck-bits

1 All 0's Type = 3'b001 or

Type = 3'b010 & bit ≠5'h1F or

Type = 3'b011 & bit =5'h1F

Yes Error will becorrected ifinternalscrubbing isturned On

B(1) Frame-basedcheck-bits

1 All 0's Type = 3'b111 or

Type = 3'b010 & bit =5'h1F or

Type = 3'b011 & bit ≠5'h1F

EMR Unloader IP corewill set type = 3'b111 ifany of above conditionmet

No The frame-based check-bits will retryfor 2 timesand enterdead statewhere CRC_ERROR stayshigh untilFPGAreconfigura‐tion

(1) Case can occur only when the Frame-based CRC error is detected after the CRAM is configured, such asFPGA configuration, partial PR or CvP. The SEU event is statistically impossible to happen during CRAMconfiguration, such cases are to cover other problems such as corrupted configuration data or bad CRAMthat unable to hold the correct bit setting.

AN-7372017.03.15 Correctable and Uncorrectable Error 13

SEU Detection and Recovery in Intel Arria 10 Devices Altera Corporation

Send Feedback

Page 14: AN 737: SEU Detection and Recovery in Intel Arria 10 Devices · PDF fileSEU Detection and Recovery in Intel® Arria® 10 Devices 2017.03.15 AN-737 Subscribe Send Feedback This application

Case C
• EDCRC operation: Stuck in dead state
• CRC_ERROR pulse: 1 pulse, then stays high after the 2nd assertion
• Column-based field: The EMR Unloader IP core sets Type = 3'b111
• Frame-based field: The EMR Unloader IP core sets Type = 3'b111
• Correctable: No
• Remark: CRC_ERROR stays high until FPGA reconfiguration. Refer to Case B to understand how EDCRC can get stuck in the dead state.

Case D
• EDCRC operation: Column-based check-bits
• CRC_ERROR pulse: 1
• Column-based field: Type = 3'b111, or Type = 3'b010 and bit = 5'h1F, or Type = 3'b011 and bit ≠ 5'h00. The EMR Unloader IP core sets Type = 3'b111 if any of these conditions is met.
• Frame-based field: All 0's
• Correctable: No
• Remark: Uncorrectable error detected during column-based check-bits.

Case E
• EDCRC operation: Column-based check-bits and frame-based check-bits
• CRC_ERROR pulse: 2
• Column-based field: Any type except Type = 3'b111, or Type = 3'b010 and bit = 5'h1F, or Type = 3'b011 and bit ≠ 5'h00
• Frame-based field: Type = 3'b111, or Type = 3'b010 and bit = 5'h1F, or Type = 3'b011 and bit ≠ 5'h1F. The EMR Unloader IP core sets Type = 3'b111 if any of these conditions is met.
• Correctable: No
• Remark: Uncorrectable error detected.

Case F
• EDCRC operation: Column-based check-bits and frame-based check-bits
• CRC_ERROR pulse: 2
• Column-based field: Any type except Type = 3'b111, or Type = 3'b010 and bit = 5'h1F, or Type = 3'b011 and bit ≠ 5'h00
• Frame-based field: Any type except Type = 3'b111, or Type = 3'b010 and bit = 5'h1F, or Type = 3'b011 and bit ≠ 5'h1F
• Correctable: Yes
• Remark: The error is corrected if internal scrubbing is turned on.
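The correctability rules in the table can be sketched as a small decision function. This is only an illustration in Python, not part of the reference design: the function names are hypothetical, and it assumes you have already extracted the type and bit values of the column-based and frame-based fields from the EMR (that extraction is not shown).

```python
def frame_field_uncorrectable(err_type, bit):
    """Uncorrectable conditions for the frame-based field (Cases B and E)."""
    return (
        err_type == 0b111
        or (err_type == 0b010 and bit == 0x1F)
        or (err_type == 0b011 and bit != 0x1F)
    )


def column_field_uncorrectable(err_type, bit):
    """Uncorrectable conditions for the column-based field (Case D)."""
    return (
        err_type == 0b111
        or (err_type == 0b010 and bit == 0x1F)
        or (err_type == 0b011 and bit != 0x00)
    )


def is_correctable(column=None, frame=None):
    """Return True if the reported error is correctable.

    column and frame are (type, bit) tuples; pass None for a field that
    reads all 0's. At least one field must carry an error report.
    """
    if column is None and frame is None:
        raise ValueError("no error reported in either field")
    if column is not None and column_field_uncorrectable(*column):
        return False
    if frame is not None and frame_field_uncorrectable(*frame):
        return False
    return True
```

For example, `is_correctable(frame=(0b001, 0x03))` corresponds to Case A, and `is_correctable(column=(0b011, 0x01))` corresponds to Case D.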


Guidelines for Embedded Memory ECC Feature

The Arria 10 SCFIFO and DCFIFO support embedded memory ECC for M20K memory blocks. The built-in ECC feature in Arria 10 can perform:

• Single-error detection and correction
• Double-adjacent-error detection and correction
• Triple-adjacent-error detection

You can turn on the FIFO embedded ECC feature by enabling the enable_ecc parameter in the FIFO IP core GUI.

Note: The embedded ECC feature is available only for the M20K memory block type.

Note: The embedded memory ECC supports variable data width. When ECC is enabled, the RAM combines multiple M20K blocks in the 32 (width) x 512 (depth) configuration to fulfill your instantiation. The unused data width is tied to VCC internally.
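As a rough illustration of the note above, the following Python sketch estimates how many M20K blocks a FIFO of a given width and depth would occupy when ECC is enabled, and how many data bits are left unused. It assumes simple tiling of whole 32 x 512 blocks; the function name is hypothetical and the actual Quartus Prime fitter result may differ.

```python
ECC_BLOCK_WIDTH = 32   # data bits per M20K in ECC mode (from the note above)
ECC_BLOCK_DEPTH = 512  # words per M20K in ECC mode (from the note above)


def ecc_m20k_estimate(width, depth):
    """Return (estimated M20K block count, unused data bits per word).

    Assumption: the memory is tiled from whole 32 x 512 blocks.
    """
    if width <= 0 or depth <= 0:
        raise ValueError("width and depth must be positive")
    blocks_wide = -(-width // ECC_BLOCK_WIDTH)   # ceiling division
    blocks_deep = -(-depth // ECC_BLOCK_DEPTH)
    unused_bits = blocks_wide * ECC_BLOCK_WIDTH - width
    return blocks_wide * blocks_deep, unused_bits
```

Under this assumption, a 40-bit wide, 1024-word deep FIFO needs two blocks side by side and two blocks in depth, with 24 data bits per word unused.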

Figure 8: ECC Option in FIFO GUI

When you enable the ECC feature, a 2-bit wide error correction status port (eccstatus[1:0]) is created in the generated FIFO entity. These status bits indicate whether the data read from the memory has a single-bit error that was corrected, a fatal error that was not corrected, or no error.

• 00: No error
• 01: Illegal
• 10: A correctable error occurred and the error has been corrected at the outputs; however, the memory array has not been updated.
• 11: An uncorrectable error occurred and uncorrectable data appears at the output.
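The status encoding above can be captured in a small lookup, for example when post-processing captured eccstatus samples on a host. This is a minimal Python sketch; the names ECCSTATUS_MEANING, decode_eccstatus, and data_is_usable are hypothetical and not part of the FIFO IP core.

```python
ECCSTATUS_MEANING = {
    0b00: "no error",
    0b01: "illegal",
    0b10: "correctable error, corrected at the outputs (memory array not updated)",
    0b11: "uncorrectable error, uncorrectable data at the output",
}


def decode_eccstatus(value):
    """Translate a 2-bit eccstatus[1:0] sample into a description."""
    if value not in ECCSTATUS_MEANING:
        raise ValueError("eccstatus is a 2-bit value")
    return ECCSTATUS_MEANING[value]


def data_is_usable(value):
    """True when the read data is error-free or was corrected at the outputs."""
    return value in (0b00, 0b10)
```

Because a status of 10 means the memory array itself still holds the flipped bit, a design that must clear the error has to rewrite the affected location.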

Related Information

• DCFIFO and SCFIFO IP Cores User Guide


Arria 10 EDCRC Reference Design

The EDCRC reference design demonstrates the following main SEU detection and recovery tasks for Arria 10:

• Instantiating various SEU-related IP cores such as the EMR Unloader IP core, Advanced SEU Detection IP core, and Fault Injection IP core

• Demonstrating how the Advanced SEU Detection IP core retrieves the SMH information from the EPCQ-L with the Serial Flash Controller IP core(2)

• Integrating the reference design into your system and characterizing your system response to the SEU event with the Intel Fault Injection feature

Related Information

• Arria 10 EDCRC Reference Design Files
  Reference design files that you need to apply the steps and compilation described in Creating Arria 10 SEU Fault Injection and Hierarchy Tagging Design with Qsys.

• Complete Arria 10 EDCRC Reference Design Files
  Precompiled reference design files ready for design testing in Design Testing with Fault Injection Debugger.

• Arria 10 GX FPGA Development Kit

System Requirements

This reference design targets the following hardware and software:

• Arria 10 development kit that uses the 10AX115S2F45I2SG device
• Quartus Prime software version 16.0

Note: You can adjust some settings in this design if you want to test on other Arria 10 devices. For example, you can change the device to another Arria 10 part, set a different clock source frequency, and change the clock source pin assignment.

Creating Arria 10 SEU Fault Injection and Hierarchy Tagging Design with Qsys

The a10-seu.zip reference design consists of:

• a10_seu.qar: the project archive file
• top.v: the top-level module of the project
• top.sdc: the timing constraint file
• top.stp: the Signal Tap file

Note: The a10-seu-complete.zip file contains a fully compiled reference design with its output files ready to use. You can go directly to Design Testing with Fault Injection Debugger if you choose to use this complete design as a reference.

In this design, you use Qsys to connect the Intel SEU-related IP cores together. The IP cores to be connected are the EMR Unloader IP core, Fault Injection IP core, and Advanced SEU Detection IP core. Some other IP cores are also needed to complete the design: the Altera IOPLL IP core, Avalon-ST Splitter, and Serial Flash Controller IP core.

(2) You can use the EPCQ-L to store the SMH and access it with the Serial Flash Controller only when you set your MSEL pins to Active Serial configuration.

Figure 9: Arria 10 SEU Fault Injection and Hierarchy Tagging Design

[Block diagram. Labeled blocks: Command-Line Interface or Fault Injection Debugger User Interface; Sensitivity Map Header File (.smh); Arria 10; Fault Injection IP Core (2); EMR Unloader IP Core (1); Avalon-ST Splitter; Advanced SEU Detection IP Core (3); Serial Flash Controller; Signal Tap; Critical User Logic; Non-critical Logic; Unused Logic; Injected Error; EPCQ-L with Sensitivity Map Header File (.smh).]

Notes:
1. The Fault Injection IP core and Advanced SEU Detection IP core read the EMR from the EMR Unloader IP core.
2. The Fault Injection IP core flips the bits of the targeted logic.
3. The Advanced SEU Detection IP core flags the affected region by reading the .smh file stored in the EPCQ-L.

Related Information

• Configuration, Design Security, and Remote System Upgrades in Arria 10 Devices

Starting Quartus Prime Software and Opening the Reference Design Project

The Quartus Prime project serves as an easy starting point for this reference design development flow. The Quartus Prime project contains all settings and design files required to create the .sof file.

To open the Quartus Prime project, perform the following steps:

1. In the Quartus Prime software, click Open Existing Project on the splash screen, or on the File menu, click Open Project. The Open Project dialog box appears.
2. Browse to the <qar file directory> where you store your .qar file.
3. Select the file a10_seu.qar and click Open.
4. Change the destination folder name if required, or leave the default, <qar file directory>/a10_seu_restored. Click OK.

Creating New Qsys System

To create a new Qsys system, click Qsys on the Tools menu in the Quartus Prime software. Qsys starts and displays the System Contents tab.


Figure 10: Complete IP Core settings and Connections in Qsys

Specifying Target FPGA and Clock Settings

To specify target FPGA and clock settings in Qsys, perform the following steps:

1. On the View menu, click Device Family and select the device family that matches the Arria 10 device you are targeting. A warning appears if the selected device family does not match the Quartus Prime project settings; make sure that the device selected in the Quartus Prime project settings matches the Device Family selected in Qsys.
2. On the System Contents tab, double-click the clk_0 component. In the Parameters tab for clk_0, set the Clock frequency to 50 MHz.

Next, you begin to add other IP cores to the Qsys system.

Adding Altera IOPLL IP Core

You must instantiate the Altera IOPLL IP core in this reference design to generate three different clock sources: 10 MHz, 20 MHz, and 100 MHz. To add the Altera IOPLL IP core, perform the following steps:

1. On the IP Catalog tab, expand Basic Functions, expand Clock; PLLs and Resets, expand PLL, and then click Altera IOPLL.
2. Click Add. The Altera IOPLL parameter editor appears.
3. On the PLL tab, in the General section, set the Reference Clock Frequency to 50.
4. Uncheck Enable locked output port.
5. In the Output Clocks section, set Number Of Clocks to 3.


6. Set the clocks as follows:
   a. For outclk0, set the Clock Name to clk_100 and set the Desired Frequency to 100 MHz.
   b. For outclk1, set the Clock Name to clk_20 and set the Desired Frequency to 20 MHz.
   c. For outclk2, set the Clock Name to clk_10 and set the Desired Frequency to 10 MHz.
7. Click Finish to return to Qsys.
8. On the System Contents tab, an instance of iopll_0 appears in the system contents table.
9. Connect the clk port of the clk_0 clock source to the refclk port of iopll_0.
10. Connect the clk_reset port of the clk_0 clock source to the reset port of iopll_0.
11. Double-click outclk2 of iopll_0 in the Export column to export outclk2 as the clock source for other components outside of this Qsys system. Rename the exported signal as clk_10.
12. Double-click outclk0 of iopll_0 in the Export column to export outclk0 as the clock source for other components outside of this Qsys system. Rename the exported signal as clk_100.

Adding EMR Unloader IP Core

You must instantiate the EMR Unloader IP core to unload the EMR whenever an SEU event occurs. To add the EMR Unloader IP core, perform the following steps:

1. On the IP Catalog tab, expand Basic Functions, expand Configuration and Programming, and then click Altera Error Message Register Unloader.
2. Click Add. The Altera Error Message Register Unloader parameter editor appears.
3. In the CRC error check clock divisor list, select 2.
4. Check Input clock is driven from Internal Oscillator. This reference example uses the internal oscillator to drive the EMR Unloader IP core.
5. Click Finish to return to Qsys. On the System Contents tab, an instance of emr_unloader2_0 appears in the system contents table.
6. Connect the clk_reset port of the clk_0 clock source to the reset port of emr_unloader2_0.
7. Double-click crcerror and emr_read of emr_unloader2_0 in the Export column to export them for external access. Leave the names as default.

Adding Advanced SEU Detection IP Core

You must instantiate the Advanced SEU Detection (ASD) IP core for sensitivity processing and to validate the hierarchy tagging feature. To add the ASD IP core, perform the following steps:

1. On the IP Catalog tab, expand Basic Functions, expand Configuration and Programming, and then click Altera Advanced SEU Detection.
2. Click Add. The Altera Advanced SEU Detection parameter editor appears.
3. Leave the CRC error cache depth list at the default selection of 8.
4. Set Largest ASD region ID used to 3.
5. Check Use on-chip sensitivity processing.
6. Set Memory interface address width to 32.
7. Set Sensitivity Data start address to 0x02000000.


8. Click Finish to return to Qsys. On the System Contents tab, an instance of adv_seu_detection_0 appears in the system contents table.
9. Connect the clk_reset port of the clk_0 clock source to the reset port of adv_seu_detection_0.
10. Double-click the cache_comparison_off and errors ports of adv_seu_detection_0 in the Export column to export them for external access. Leave the names as default.

Adding Fault Injection IP Core

You must instantiate the Fault Injection IP core to inject faults into the CRAM. A fault can be a Single Bit Error (SBE), Double Adjacent Error (DAE), or Uncorrectable Multi Bit Error (UMBE). To add the Fault Injection IP core, perform the following steps:

1. On the IP Catalog tab, expand Basic Functions, expand Configuration and Programming, and then click Altera Fault Injection.
2. Click Add. The Altera Fault Injection parameter editor appears.
3. Click Finish to return to Qsys. On the System Contents tab, an instance of fault_injection_0 appears in the system contents table.
4. Connect the clk_reset port of the clk_0 clock source to the reset port of fault_injection_0.
5. Connect the intosc port of fault_injection_0 to the clock port of emr_unloader2_0.
6. Connect the intosc port of fault_injection_0 to the clock port of adv_seu_detection_0.
7. Connect the crcerror_pin port of emr_unloader2_0 to the crcerror_pin port of fault_injection_0.
8. Double-click error_injected and error_scrubbed of fault_injection_0 in the Export column to export them for external access. Leave the names as default.

Adding Avalon-ST Splitter

The EMR Unloader IP core sends the EMR data to the downstream IP cores using the Avalon-ST protocol. Both the ASD IP core and the Fault Injection IP core require EMR data from the EMR Unloader IP core. You need to instantiate the Avalon-ST Splitter to distribute the EMR data from the EMR Unloader IP core to the ASD IP core and the Fault Injection IP core. To add the Avalon-ST Splitter, perform the following steps:

1. On the IP Catalog tab, expand Basic Functions, expand Bridges and Adaptors, expand Streaming, and click Avalon-ST Splitter.
2. Click Add. The Avalon-ST Splitter parameter editor appears.
3. Set NUMBER_OF_OUTPUTS to 3.
4. Check only USE_VALID, USE_ERROR, and USE_DATA. Uncheck all other check boxes.
5. Set DATA_WIDTH to 119.
6. Set ERROR_WIDTH to 1.
7. Set BITS_PER_SYMBOL to 119.
8. Click Finish to return to Qsys. On the System Contents tab, an instance of st_splitter_0 appears in the system contents table.
9. Connect the clk_reset port of the clk_0 clock source to the reset port of st_splitter_0.
10. Connect the intosc port of fault_injection_0 to the clk port of st_splitter_0.
11. Connect the avst_emr_src port of emr_unloader2_0 to the in port of st_splitter_0.


12. Connect the out0 port of st_splitter_0 to the avst_emr_snk port of adv_seu_detection_0.
13. Connect the out1 port of st_splitter_0 to the avst_emr_snk port of fault_injection_0.
14. Double-click the out2 port of st_splitter_0 in the Export column to export it for external access. Leave the name as default. Signal Tap uses this port to read the EMR value after the fault injection.

Adding Serial Flash Controller

You must use the Serial Flash Controller IP core to access the EPCQ-L1024 that stores the SMH file in this reference design. The ASD IP core reads the SMH data from the EPCQ-L1024 through the Serial Flash Controller IP core. To add the Serial Flash Controller, perform the following steps:

1. On the IP Catalog tab, expand Basic Functions, expand Configuration and Programming, and then click Altera Serial Flash Controller.
2. Click Add. The Altera Serial Flash Controller parameter editor appears. Set the parameters as follows:
   a. In the Configuration device type list, select EPCQL1024.
   b. In Choose I/O mode, select QUAD.
   c. In the Number of Chip Selects used list, select 1.
3. Click Finish to return to Qsys. On the System Contents tab, an instance of epcq_controller_0 appears in the system contents table.
4. Connect the outclk1 port of iopll_0 to the clock_sink port of epcq_controller_0.
   Note: The Fmax for the Serial Flash Controller is 25 MHz.
5. Connect the clk_reset port of the clk_0 clock source to the reset port of epcq_controller_0.
6. Connect the asd_sp_master port of adv_seu_detection_0 to the avl_mem port of epcq_controller_0.
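With the last IP core added, the Qsys connections described in the preceding sections can be collected in one place and sanity-checked. The following Python sketch is only a bookkeeping aid, not a Qsys script: the instance and port names are the ones used in the steps, while the data structure and the helper functions are hypothetical.

```python
# (source port, destination port) pairs taken from the connection steps.
CONNECTIONS = [
    ("clk_0.clk", "iopll_0.refclk"),
    ("clk_0.clk_reset", "iopll_0.reset"),
    ("clk_0.clk_reset", "emr_unloader2_0.reset"),
    ("clk_0.clk_reset", "adv_seu_detection_0.reset"),
    ("clk_0.clk_reset", "fault_injection_0.reset"),
    ("clk_0.clk_reset", "st_splitter_0.reset"),
    ("clk_0.clk_reset", "epcq_controller_0.reset"),
    ("fault_injection_0.intosc", "emr_unloader2_0.clock"),
    ("fault_injection_0.intosc", "adv_seu_detection_0.clock"),
    ("fault_injection_0.intosc", "st_splitter_0.clk"),
    ("emr_unloader2_0.crcerror_pin", "fault_injection_0.crcerror_pin"),
    ("emr_unloader2_0.avst_emr_src", "st_splitter_0.in"),
    ("st_splitter_0.out0", "adv_seu_detection_0.avst_emr_snk"),
    ("st_splitter_0.out1", "fault_injection_0.avst_emr_snk"),
    ("iopll_0.outclk1", "epcq_controller_0.clock_sink"),
    ("adv_seu_detection_0.asd_sp_master", "epcq_controller_0.avl_mem"),
]


def instances_without_reset(connections):
    """List instances whose reset port is not driven by clk_0.clk_reset."""
    instances = {end.split(".")[0] for pair in connections for end in pair}
    instances.discard("clk_0")
    with_reset = {
        dst.split(".")[0]
        for src, dst in connections
        if src == "clk_0.clk_reset" and dst.endswith(".reset")
    }
    return sorted(instances - with_reset)


def sinks_of(connections, source):
    """List every destination port driven by the given source port."""
    return sorted(dst for src, dst in connections if src == source)
```

Calling `sinks_of(CONNECTIONS, "fault_injection_0.intosc")` shows that the internal oscillator output of the Fault Injection IP core clocks the EMR Unloader, the ASD IP core, and the splitter, while the Serial Flash Controller runs from the 20 MHz outclk1, which is below its 25 MHz Fmax.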

Generating Qsys System

To generate the Qsys system, perform the following steps:

1. Click Generate HDL from the Generate menu.
2. Click Generate. Click Yes when the Save Changes? dialog box appears.
3. Type asd_fi_system in the File name box and click Save. The Generate dialog box appears and the system generation process begins.
4. Click Close to close the dialog box.
5. On the File menu, click Exit to close Qsys and return to the Quartus Prime software.

You are ready to integrate the Qsys system into the Quartus Prime project.


Integrating Qsys System into Quartus Prime Project

To complete the reference design, you must perform the following tasks:

• Generate the In-System Sources and Probes (ISSP) IP core
• Configure the Quartus Prime project settings and add the following files (provided in the download package) to the project:
  • top.v: instantiates the Qsys system module and connects all other IP cores
  • top.stp: monitors some key signals with the Signal Tap tool
  • top.sdc: timing constraints
• Assign ASD regions to the up counter and down counter
• Assign the FPGA device and pin locations
• Compile the project

Generating In-System Source and Probe IP Core

To generate ISSP, perform the following steps:

1. In the IP Catalog, expand Basic Functions, expand Simulation; Debug and Verification, expand Debug and Performance, and double-click Altera In-System Sources and Probes.
2. The IP Parameter Editor appears. Type issp in Entity name and click OK.
3. Set Probe Port Width [0..511] to 0.
4. Set Source Port Width [0..511] to 4.
5. Leave all other settings at their defaults.
6. Click Generate HDL from the Generate menu, and then click Generate.
7. Click Close, and then click Exit from the File menu.
8. Click Yes if prompted to add the Quartus Prime IP File to the project.

Quartus Prime Project Settings

To configure the Quartus Prime project settings and add the top-level file, Signal Tap file, and SDC file to the project, perform the following steps:

1. On the Assignments menu, click Device, and then click Device and Pin Options in the Device dialog box.
2. Under the Configuration category, select Active Serial x4 for the Configuration scheme.
3. Under the Error Detection CRC category, check Enable Error Detection CRC_ERROR pin.
4. Leave Enable internal scrubbing unchecked.
   Note: You can turn on Enable internal scrubbing when you want to try out the internal scrubbing feature.
5. Set the Divide error check frequency by list to 2.
6. Check Generate SEU sensitivity map file (.smh).
7. Click OK to exit the Device and Pin Options dialog box.
8. Click OK again to exit the Device dialog box.
9. On the Assignments menu, click Settings, select the Files category in the left panel, and add top.v, top.stp, and top.sdc to the project.
10. Select the TimeQuest Timing Analyzer category in the left panel and add top.sdc to SDC files to include in the project.


11. Select the Signal Tap Logic Analyzer category in the left panel, check Enable Signal Tap Logic Analyzer, and select top.stp as the Signal Tap File name.
12. Click OK to close the Settings window.
13. On the Processing menu, click Start > Analysis and Synthesis.

Assigning ASD Regions

This reference design uses 3 ASD regions. To assign the ASD regions, perform the following steps:

1. In the Project Navigator window, select Hierarchy, expand top, right-click down_counter:down_counter_inst, select Design Partition, and then select Set as Design Partition.
2. Repeat step 1 for up_counter:up_counter_inst to set the Design Partition.
3. In the Design Partition Window, set the Netlist Type and ASD Region for the following partition names:

Partition Name(3)                 Netlist Type    ASD Region(4)
Top                               Source File     1
down_counter:down_counter_inst    Source File     2
up_counter:up_counter_inst        Source File     3
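The region numbers in the table have to stay within the Largest ASD region ID used value (3) entered earlier in the ASD IP parameter editor. The short Python sketch below illustrates that consistency check; the dictionary and function are hypothetical helpers, not something the Quartus Prime software provides.

```python
LARGEST_ASD_REGION_ID = 3  # "Largest ASD region ID used" in the ASD IP core

# ASD region assigned to each design partition, as listed in the table.
ASD_REGIONS = {
    "Top": 1,
    "down_counter:down_counter_inst": 2,
    "up_counter:up_counter_inst": 3,
}


def regions_above_limit(regions, largest_id):
    """Return the partitions whose ASD region exceeds the configured maximum."""
    return sorted(name for name, rid in regions.items() if rid > largest_id)
```

If you add a partition with a higher region number, the ASD IP core parameter needs to be raised to match.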

Assigning FPGA Pin Location

To assign the clock source pin to your design, perform the following steps:

1. Launch the Pin Planner from the Assignments menu.
2. Assign AU33 to the inclk input.
3. Close the Pin Planner.

Compiling the Project

You must compile the project to generate the .sof file and .smh file. To compile the project, perform the following steps:

1. Click Start Compilation in the Processing menu. The full compilation begins and may take a while to complete.
2. After the compilation completes, the .sof file and .smh file are in the output_files folder. You need these files for hardware verification later.

(3) You can toggle the Design Partition Window on or off from the Assignments menu or with the shortcut key Alt+D.

(4) To make the ASD Region column visible in the Design Partition Window, right-click the header of the table and check ASD Region.


Design Testing with Fault Injection Debugger

The following are the main steps to test your reference design:

1. Convert the .sof file and .smh file to a .jic file.
2. Program the .jic file into the EPCQ-L.
3. Launch the Signal Tap Logic Analyzer and Fault Injection Debugger.
4. Configure the Arria 10 device with the .sof file and read the .smh file with the Fault Injection Debugger.
5. Start Signal Tap to monitor the signals and inject an error with the Fault Injection Debugger.
6. Observe the Signal Tap output.

This section goes through some simple steps to inject faults into the CRAM. For more information about the Fault Injection Debugger, refer to the Fault Injection Debugger User Guide.

Related Information

• Debugging Single Event Upset Using the Fault Injection Debugger
  Provides more information about using the Fault Injection Debugger.

Converting .sof File and .smh File to .jic File

To program the .sof file and .smh file into the EPCQ-L, you must convert them to a .jic file. The converted .jic file consists of:

• The bitstream for Arria 10 FPGA configuration in Active Serial mode upon power-up
• The .smh file content at an offset that you define in the Convert Programming File tool

To convert, perform the following steps:

1. Go to your output_files folder, duplicate the top.smh file, and rename the copy to top.hex.
   Note: The .smh file is in Intel HEX standard format, that is, byte addressing and little endian. You may need to convert the .smh file to match the endianness of your system.
2. Launch the Convert Programming File tool from the File menu.
3. In the Output programming file section, select JTAG Indirect Configuration File (.jic) from the Programming file type list.
4. Select EPCQL1024 from the Configuration device list.
5. Select Active Serial x4 from the Mode list.
6. Set File name to output_files/top.jic. Optionally, check Create Memory Map File (Generate top.map) and Create config data RPD (Generate top_auto.rpd).
7. In the Input files to convert section, select Flash Loader in the File/Data area column.
8. Click Add Device, select Arria 10 and 10AX115S2, and click OK.
9. Select SOF Data in the File/Data area column, click Add File, and select top.sof in the output_files folder.
10. Select top.sof under SOF Data, click Properties, enable Compression, and click OK to close the SOF File Properties dialog.
11. Click Add Hex Data in the Input files to convert section.
12. Select Relative addressing and set the start address to 0x2000000. Leave Big endian as the default selection for Endianness. Select top.hex from your output_files folder, and click OK. The figure below shows the final setting for the .jic file generation. Verify the settings and click Generate.


13. Click Close to close the Convert Programming File tool after the .jic file is generated successfully.
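The hex data start address used in the conversion (0x2000000) has to agree with the Sensitivity Data start address entered in the ASD IP core (0x02000000) and must fall inside the flash device. The Python sketch below shows one way to cross-check these values before programming. It assumes that the EPCQL1024 part number denotes a 1024 Mbit device; the function name is hypothetical.

```python
EPCQL1024_BYTES = 1024 * 1024 * 1024 // 8  # assumed: 1024 Mbit = 128 MiB
ASD_SENSITIVITY_START = 0x02000000         # ASD IP "Sensitivity Data start address"
JIC_HEX_START = 0x2000000                  # Convert Programming File start address


def smh_placement_problems(asd_start, hex_start, flash_bytes, addr_width_bits=32):
    """Return a list of human-readable problems; an empty list means consistent."""
    problems = []
    if asd_start != hex_start:
        problems.append("ASD start address differs from .jic hex data start address")
    if not 0 <= hex_start < flash_bytes:
        problems.append("hex data start address is outside the flash device")
    if asd_start >= 1 << addr_width_bits:
        problems.append("ASD start address does not fit the memory interface width")
    return problems
```

An empty list from `smh_placement_problems(ASD_SENSITIVITY_START, JIC_HEX_START, EPCQL1024_BYTES)` confirms that the two tools point at the same location. The check does not verify that the .smh data and the configuration bitstream do not overlap, because their sizes depend on your compilation results.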

Programming .jic File into EPCQ-L

Before performing this task, ensure that your board configuration scheme is set to Active Serial by setting the MSEL[2:0] pins to 3'b010 or 3'b011. Refer to Configuration, Design Security, and Remote System Upgrades in Arria 10 Devices for more information.

To program the generated .jic file into the EPCQ-L, perform the following steps:

1. Launch the Programmer from the Tools menu.
2. Ensure that a valid programming cable is selected in Hardware Setup.
3. Click Auto Detect. The detected JTAG chain is displayed in the Programmer window.
4. Select the Arria 10 FPGA, click Change File, and select the top.jic file in your output_files folder.
5. Check Program/Configure for output_files/top.jic. Program/Configure for the Factory default SFL image is checked automatically. The diagram below shows the final setting of the Programmer.


6. Click Start to program the top.jic file. This operation may take several minutes to complete.

Related Information

• Configuration, Design Security, and Remote System Upgrades in Arria 10 Devices

Launching Signal Tap Logic Analyzer

To observe the signals monitored by Signal Tap, you must launch the Signal Tap Logic Analyzer and start the Signal Tap operation before the fault injection operation. To launch the Signal Tap Logic Analyzer, perform the following steps:

1. Launch the Signal Tap Logic Analyzer from the Tools menu.
2. Make sure that the Hardware and Device are selected.

You cannot start the Signal Tap operation until the FPGA is configured.

Configuring Arria 10 and Reading .smh File with Fault Injection Debugger

To configure the Arria 10 with Fault Injection Debugger, perform the following steps:

1. Launch the Fault Injection Debugger from the Tools menu.
2. Make sure that a valid programming cable is selected in Hardware Setup.
3. Click Auto Detect. The window should display the detected Arria 10 device in the JTAG chain.
4. Select the Arria 10 device, click Select File, select top.sof from the output_files folder, and click Open.
5. Check Program/Configure.
6. Click Start to start the configuration operation.
7. Right-click the Arria 10 device and click Select SMH file.
8. Select top.smh from the output_files folder and click Open.
9. Right-click the Arria 10 device and click Show Device Sensitivity Map.
10. Select ASD region(s) - 1 in the Sensitivity Map window as shown in the figure below.


11.Close the Sensitivity Map window.

Injecting Error with Fault Injection Debugger

You can now inject the error into the CRAM with the Fault Injection Debugger. Before injecting the error, you must start Signal Tap to monitor the targeted signals. Perform the following steps:

1. In the Signal Tap Logic Analyzer window, select the Signal Tap instance and click Run Analysis in the Processing menu, or press F5.
2. Return to the Fault Injection Debugger window, check Inject Fault, and click Start. The Quartus Prime System message shows Injects 1 error (s) into device(s).
3. Click Read EMR. The System message shows the injected error location as in the figure below.


The Signal Tap Logic Analyzer reads the error as a critical error and reports the affected region as 0x1. This should match the System message, which reports that the error is located in ASD region 1.

Arria 10 ROM with ECC Reference Design

The ROM IP core does not have an ECC selection in the user interface. This reference design demonstrates a method to enable the ECC feature for the ROM IP core.

The ROM with ECC feature design consists of the following modules:

• ROM with ECC
• Read address

Figure 11: ROM with ECC Reference Design Block Diagram

[Block diagram. Labeled blocks: Read Address Module; ROM with ECC Module. Labeled signals: dataout; eccstatus.]

In the design, the ROM content is initialized in the .mif file with the associated addresses shown in the following table.

Table 6: ROM Content Initialization

Address    ROM content
00h        32h
01h        33h
02h        34h
:          :
1Dh        4Fh
1Eh        50h
1Fh        51h
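A memory initialization file with this content could be generated with a short script such as the Python sketch below. It is an illustration only: it assumes that the elided rows follow the same pattern as the listed rows (content = address + 32h) and that the data width is 8 bits, neither of which the table states explicitly. The function name rom_mif is hypothetical.

```python
def rom_mif(depth=32, width=8, base=0x32):
    """Build .mif text where each word holds base + address.

    Assumptions: 8-bit words and a linear fill for the rows that the
    table elides.
    """
    lines = [
        f"DEPTH = {depth};",
        f"WIDTH = {width};",
        "ADDRESS_RADIX = HEX;",
        "DATA_RADIX = HEX;",
        "CONTENT",
        "BEGIN",
    ]
    for addr in range(depth):
        value = base + addr
        if value >= 1 << width:
            raise ValueError("value does not fit in the data width")
        lines.append(f"{addr:02X} : {value:02X};")
    lines.append("END;")
    return "\n".join(lines) + "\n"
```

Writing the returned text to a file, for example with `open("rom.mif", "w").write(rom_mif())`, gives a file that you can select when initializing the memory content. The file name here is an example only.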


Related Information

• Arria 10 ROM with ECC Reference Design Files

System Requirements

The design is implemented using the 10AX115S3F45E2SGE3 device on the Arria 10 ES Development Kit Board. You can implement this design using any Arria 10 device.

Design Walkthrough

To enable the ECC feature, you must instantiate a RAM:2-PORT IP core with the following settings:

• Set Operation mode to Simple Dual Port (with one read port and one write port)
• Set Ram block type to M20K
• Enable the Same-width port feature
• Disable the Byte-enable feature
• Enable the ECC Checking feature
• Initialize the RAM content with a .mif or a .hex file

Additionally, you must connect the wren, wraddress and datain signals of the design to GND.

Hardware Verification

For this test, the ROM content is purposely injected after device configuration with a single-bit error and with a 3-adjacent-bit error. If the ROM content is free from bit flips, the ECC status port (eccstatus[1:0]) shows 2'b00. The Signal Tap logic analyzer waveforms indicate that the output has a latency of two cycles with respect to the associated read address.

Results: Single Bit Error

After device configuration, the ROM content at address 1Fh, which is initialized with 51h, is injected with 50h. The ECC status signal shows 2'b10, which indicates that a single-bit error was detected in the ROM content at address 1Fh and that the error has been corrected at the output.

Figure 12: Single Bit Error Waveform

Results: 3 Adjacent Bits Error

After device configuration, the ROM content at address 1Fh, which is initialized with 51h, is injected with 56h. The ECC status signal shows 2'b11, which indicates that a 3-adjacent-bit error was detected in the ROM content at address 1Fh and that uncorrectable data appears at the output.
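The two injected values can be checked against the initialized value with a bitwise XOR, which exposes exactly which data bits differ. The following Python sketch is a minimal illustration with hypothetical helper names; it looks only at the 8-bit data values quoted in this section and says nothing about how the bits are arranged inside the ECC codeword.

```python
def flipped_bits(expected, observed):
    """Return the positions (bit 0 = LSB) where the two values differ."""
    diff = expected ^ observed
    return [i for i in range(diff.bit_length()) if (diff >> i) & 1]


def are_adjacent(bits):
    """True when the flipped positions form one contiguous run."""
    return bool(bits) and bits == list(range(bits[0], bits[0] + len(bits)))
```

Comparing 51h with 50h yields one flipped bit (bit 0), and comparing 51h with 56h yields three flipped bits (bits 0 to 2) that are adjacent, matching the single-bit and 3-adjacent-bit cases reported by eccstatus.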


Figure 13: 3 Adjacent Bits Error Waveform

Document Revision History

March 2017 (version 2017.03.15)
• Rebranded as Intel.

February 2017 (version 2017.02.13)
• Updated Timing Diagram for Column-Based Check-Bits diagram description.
• Added note to Case A and B in Correctable and Uncorrectable Error Cases table.
• Updated device development kit ordering part number.
• Added note to Creating Arria 10 SEU Fault Injection and Hierarchy Tagging Design with Qsys to state the availability of a10-seu-complete.zip design and skipping pregenerated steps.
• Updated device selection in Converting .sof File and .smh File to .jic File.

October 2016 (version 2016.10.31)
• Added ROM with ECC Reference Design.
• Updated EDCRC reference design target device and reference design file.

March 2016 (version 2016.03.03)
• Updated CRC_ERROR pin behavior when uncorrectable error cannot be located.

March 2016 (version 2016.03.02)
• Initial release.
