Single Event Effects (SEE) Test Results on the Virtex-II Digital Clock Manager (DCM)

Jason Moore¹, Carl Carmichael¹, Gary Swift² and Jeff George³

¹ Xilinx Corporation, San Jose CA 95124
² Jet Propulsion Laboratory / Caltech, Pasadena CA, 91109
³ The Aerospace Corporation, El Segundo CA, USA

Moore C142/MAPLD2004

"This work was carried out in part by the Jet Propulsion Laboratory, California Institute of Technology, under contract with the National Aeronautics and Space Administration."

"Reference herein to any specific commercial product, process, or service by trade name, trademark, manufacturer, or otherwise, does not constitute or imply its endorsement by the United States Government or the Jet Propulsion Laboratory, California Institute of Technology."



Page 2: DCM Functionality

• What is a DCM?
  – Digital Clock Manager
  – Embedded digital “dedicated” function in the FPGA that is configurable
• Delay-Locked Loop
  – Clock phase de-skew
    • Improves device Tcko
  – Duty cycle correction
  – Temperature compensation
• Frequency Synthesis
  – CLKFX = CLKIN * (M/D)
• Phase Shift
  – Fixed or Variable Mode
  – Resolution = 1/256 * period OR resolution of tap (~50ps)
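The frequency-synthesis and phase-shift relations above can be checked with a few lines of arithmetic. The following is a minimal Python sketch, not vendor code: the function names are hypothetical, the 50 ps tap delay is the approximate figure from the slide, and taking the coarser of the two resolution limits is an assumption about how the "OR" on the slide is meant.

```python
def clkfx_hz(clkin_hz: float, m: int, d: int) -> float:
    """Synthesized frequency from the slide's relation CLKFX = CLKIN * (M/D)."""
    if m <= 0 or d <= 0:
        raise ValueError("M and D must be positive integers")
    return clkin_hz * m / d


def phase_step_ps(clkin_hz: float, tap_ps: float = 50.0) -> float:
    """Phase-shift step in picoseconds.

    Assumption: the step is period/256 unless that is finer than one delay
    tap (~50 ps per the slide), in which case the tap sets the limit.
    """
    period_ps = 1e12 / clkin_hz
    return max(period_ps / 256, tap_ps)


if __name__ == "__main__":
    print(clkfx_hz(33e6, 3, 1))   # 33 MHz in, M=3, D=1
    print(phase_step_ps(100e6))   # 100 MHz input
    print(phase_step_ps(33e6))    # 33 MHz input
```

Under these assumptions a 100 MHz input has a period/256 of about 39 ps, so the sketch reports the 50 ps tap limit, while at 33 MHz the period/256 term is the coarser one.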

[Figure: DCM block diagram. Port labels: CLKIN, CLKFB, RST, DSSEN, PSINCDEC, PSEN, PSCLK, CLK0, CLK90, CLK180, CLK270, CLK2X, CLK2X180, CLKDV, CLKFX, CLKFX180, LOCKED, STATUS[7:0], PSDONE. Legend: clock signal, control signal.]

Page 3: DCM Functionality

• Why use a DCM?
  – Most common function is the removal of clock insertion delay
    • Tcko of FPGA without a DCM
      – Tcko(FPGA) = Insertion Delay + Tcko(FF) + Data Delay
    • Tcko of FPGA with a DCM
      – Tcko(FPGA) = Tcko(FF) + Data Delay
      – Clock at pad of FPGA and internal FFs are aligned
    • This configuration was chosen for testing

[Figure: clock path through DCM and BUFG]

Clock Path: clk to a_14(FF)
Location          Delay type       Delay(ns)  Logical Resource(s)
----------------  ---------------  ---------  -------------------
D12.I             Tiopi                0.653  clk (dcm_comp_CLKIN_IBUFG_INST)
DCM_X0Y1.CLKIN    net (fanout=1)       0.633  dcm_comp_CLKIN_IBUFG_OUT
DCM_X0Y1.CLK0     Tdcmino             -3.741  dcm_comp_DCM_INST
BUFGMUX7P.I0      net (fanout=1)       0.674  dcm_comp_CLK0_BUF
BUFGMUX7P.O       Tgi0o                0.465  dcm_comp_CLK0_BUFG_INST
U10.ICLK1         net (fanout=55)      0.742  clk_int

Removal of Clock Insertion Delay

Significant Impact on High-Speed Designs!
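To see how much the negative Tdcmino entry offsets the rest of the clock path, the delays in the timing report can simply be summed. The sketch below copies the six delay values from the report; the helper name `path_delay_ns` is hypothetical, and dropping the Tdcmino row is only a rough stand-in for a design without a DCM, since such a design would be routed differently.

```python
# Clock-path delays in ns, copied from the timing report on this slide.
CLOCK_PATH = [
    ("D12.I", "Tiopi", 0.653),
    ("DCM_X0Y1.CLKIN", "net", 0.633),
    ("DCM_X0Y1.CLK0", "Tdcmino", -3.741),
    ("BUFGMUX7P.I0", "net", 0.674),
    ("BUFGMUX7P.O", "Tgi0o", 0.465),
    ("U10.ICLK1", "net", 0.742),
]


def path_delay_ns(path, skip_types=()):
    """Sum the delays of all path entries whose delay type is not skipped."""
    return sum(delay for _, kind, delay in path if kind not in skip_types)


with_dcm = path_delay_ns(CLOCK_PATH)
# Rough illustration only: ignore the DCM compensation term.
without_compensation = path_delay_ns(CLOCK_PATH, skip_types=("Tdcmino",))

if __name__ == "__main__":
    print(f"clock path with DCM compensation:    {with_dcm:.3f} ns")
    print(f"clock path without the Tdcmino term: {without_compensation:.3f} ns")
```

With the values in the report, the compensated path comes out slightly negative (about -0.57 ns) against roughly 3.17 ns for the remaining entries, which is the insertion-delay removal the slide highlights.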

Page 4: DCM Radiation Test Strategy

• Elementary Tests
  – How hard is the DCM to upset?
    • SEUPI¹ Analysis = 166 Configuration Memory Cells / DCM
    • Empirical Analysis = 160 Configuration Memory Cells (2 are unused)
  – What common error modes are observed?
• Advanced Tests
  – Detailed analysis of error modes
  – Mitigation Evaluation
• DCM Configurations
  – Not every permutation and combination of DCM options tested
  – Focused on most common applications
    • CLK0 and CLKFX

¹ Single Event Upset Probability Indicator

Page 5: Virtex-II Radiation Test Platform

• Functional Monitor/Control: FUNCMON
  – User-Specific Functionality (e.g. DCM, I/O, Multipliers, etc.)
• Configuration Monitor/Control: CONFIGMON
  – Common Configuration Logic
  – Scrubbing; Readback for SEFI Detection

[Figure: test platform block diagram. Labels: Service FPGA, DUT FPGA, CONFIGMON SW, FUNCMON SW, CONFIGMON HW, FUNCMON HW, Virtex-II “SEAKR Board”, 2V3000, 2V6000.]

Page 6: Virtex-II Radiation Test Board

[Figure: test board with labeled parts: Service FPGA, DUT FPGA, Service PROMs, DUT PROMs (design and mask), JTAG I/F.]

Page 7: Elementary DCM Test

• November ’03: Texas A&M Cyclotron
  – Focus on Cross-Section Calculations
  – Focus on Common Error Modes
  – Device Under Test (DUT) FPGA Design
    • CLK0 de-skew: Most common application
    • Six DCMs tested simultaneously
    • Scrubbing configuration memory continuously
  – Service FPGA Design
    • Basic Concept: Compare the output of three counters
      – Counters operate at 100MHz
      – Two “Radiation” Counters, One “Golden” Counter
      – Error Modes Detected
        • “Stuck At”, Transients and Change in Frequency

[Figure: DCM and BUFG]
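The counter-comparison idea can be modeled in software. The sketch below is a hypothetical Python illustration, not the Service FPGA logic: it assumes the golden and radiation counters are sampled at the same instants, and it uses simple heuristics (chosen here for illustration) to map a mismatch onto the three error modes named on the slide.

```python
def classify_counter(golden, radiation):
    """Compare per-interval increments of a radiation counter with the golden one.

    Returns "ok", "stuck at", "change in frequency" or "transient".
    Heuristic assumptions: a counter that stops advancing is "stuck at";
    one that mismatches in every interval after the first mismatch has
    changed frequency; anything that recovers is a transient.
    """
    g_steps = [b - a for a, b in zip(golden, golden[1:])]
    r_steps = [b - a for a, b in zip(radiation, radiation[1:])]
    bad = [i for i, (g, r) in enumerate(zip(g_steps, r_steps)) if g != r]
    if not bad:
        return "ok"
    tail = range(bad[0], len(g_steps))
    if all(r_steps[i] == 0 for i in tail):
        return "stuck at"
    if all(r_steps[i] != g_steps[i] for i in tail):
        return "change in frequency"
    return "transient"


if __name__ == "__main__":
    golden = [0, 100, 200, 300, 400]
    print(classify_counter(golden, [0, 100, 100, 100, 100]))
    print(classify_counter(golden, [0, 100, 150, 200, 250]))
    print(classify_counter(golden, [0, 100, 203, 303, 403]))
```

A mismatch that appears only in the final sampled interval cannot be told apart from a frequency change by this heuristic; more samples would be needed.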

Page 8: Elementary DCM Test Design Details

[Figure: block diagram of the test design. SERVICE FPGA labels: OSC, IBUFG, DCM (CLKIN, CLKFX), OBUF, “To DUT DCMs (top)”, “To DUT DCMs (bot)”, DCM (CLKIN, CLKFB, CLK0, CLKFX), BUFG, “3x MULT = 100MHz, All clocks”, Golden Clock, DCMreset, Error Detection and Reporting, Status Registers (IO Fail Count, DCM Fail Count, DCM Data A, DCM Data B, DCM Data C). DUT FPGA labels (x6): IBUFG, DCM (CLKIN, CLKFB, CLK0, RST, STATUS, LOCK), BUFG, reset, Radiation Clocks.]

• Redundant outputs from the DUT DCM allow detection of I/O errors
  – Routing errors are included in the DCM error analysis (worst case)
• Error Detection and Reporting Logic
  – Asynchronous FIFOs used to transfer count values across clock domains
  – Dual-Port RAMs used to store status register values
  – Upon every error, a flag is set and the values of LOCKED and STATUS(1) are recorded
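One way to read the redundant-output scheme: if the two radiation counters agree with each other but not with the golden counter, the fault is upstream of both outputs (counted against the DCM, routing included); if they disagree with each other, only one output path is affected (an I/O error). The Python sketch below encodes that interpretation. It is an assumption about the decision rule, not the authors' logic, and the function names are hypothetical.

```python
from collections import Counter


def classify_sample(golden: int, rad_a: int, rad_b: int) -> str:
    """Classify one set of counter values as "ok", "dcm" or "io"."""
    if rad_a == rad_b:
        # Both redundant outputs agree: either all is well, or the shared
        # source (DCM plus routing) is wrong.
        return "ok" if rad_a == golden else "dcm"
    # The redundant outputs disagree, so only one path is affected.
    # Simplification: counted as I/O even if neither value matches golden.
    return "io"


def tally(samples):
    """Count classifications, loosely mirroring the IO/DCM fail counters."""
    return Counter(classify_sample(*s) for s in samples)


if __name__ == "__main__":
    print(tally([(5, 5, 5), (6, 9, 9), (7, 7, 8)]))
```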

Page 9: Elementary DCM Test Results

• Cross-Section Results
  – 40.0 MeV/u Ne, LET = 1.22 MeV·cm²/mg
  – Cross Section: 8e-9 cm²/DCM
• Need more visibility into Error Detection and Correction
  – Understanding of Mitigation Methods
    • Reset vs Scrub vs Reset and Scrub
  – Error Detection: Is LOCKED a reliable status?
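For readers unfamiliar with the figure of merit, a per-DCM cross section is conventionally the number of observed errors divided by the particle fluence and by the number of DCMs exposed. The sketch below shows that arithmetic in Python. The error count and fluence in the example are hypothetical values chosen only to land on the same order of magnitude as the slide; they are not the test data.

```python
def cross_section_cm2(errors: int, fluence_per_cm2: float, n_targets: int = 1) -> float:
    """Per-target cross section in cm^2: errors / (fluence * number of targets)."""
    if fluence_per_cm2 <= 0 or n_targets <= 0:
        raise ValueError("fluence and target count must be positive")
    return errors / (fluence_per_cm2 * n_targets)


if __name__ == "__main__":
    # Hypothetical run: 48 DCM errors, 1e9 ions/cm^2, six DCMs in the beam.
    sigma = cross_section_cm2(errors=48, fluence_per_cm2=1e9, n_targets=6)
    print(f"{sigma:.1e} cm^2/DCM")
```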

Page 10: Advanced DCM Tests

• June and August ’04: Texas A&M Cyclotron
  – Multiple LETs and Ions
    • Ne, Ar, Kr
    • 1.28, 4.28, 18.1 and 35.1 (MeV·cm²/mg)
  – Provided more User Control
    • Added Control FSM: User dictates Reset or Scrub command
  – Detailed Error Logging
    • Which errors are corrected by
      – Reset
      – Scrub
      – Scrub and Reset
  – 2nd and 3rd DUT Configurations Created
    • CLKFX output
    • CLK0 output at slower frequencies

Page 11: Device Under Test Configurations

• DUT DCM Configuration
  – CLK0 w/ feedback
  – 100MHz in and out (“CLK0FAST”)
  – 33MHz in and out (“CLK0SLOW”)

[Figure: CLK0 configuration. Labels: DCM, BUFG, 33MHz or 100MHz, 100MHz, 33MHz.]

• DUT DCM Configuration
  – CLKFX w/ CLK0 feedback
  – 33MHz in -> 100MHz out
  – “CLKFX”

[Figure: CLKFX configuration. Labels: DCM1 through DCM6, BUFG, CLKFX, CLK0, 33MHz or 100MHz.]

Page 12: DCM Control

• Test Operator has complete control of Correction Method
  – Reset, Scrub, Scrub and Reset
• Scrub
  – Refresh of configuration memory while operating
• Reset
  – DCM Reset Only

[Figure: correction flowchart. Wait for DCM Error -> Reset DCM -> Scrub DCM -> Reset DCM. Each step exits on “fixed” and moves to the next step on “Not fixed”. Annotations: “Reset Only”: Data Path Errors; Scrub required: Configuration Errors.]
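The escalation sequence in the flowchart (reset, then scrub, then reset again) can be written as a small routine that also records which step cleared the error. The Python sketch below is illustrative only: `correct_and_classify`, `Outcome` and `ToyDcm` are hypothetical names, and the toy model's assumption that a reset cannot recover a DCM whose configuration is still upset is made here purely to exercise all three branches.

```python
from enum import Enum
from typing import Callable


class Outcome(Enum):
    RESET_ONLY = "data path error, fixed by reset"
    SCRUB = "configuration error, fixed by scrub"
    SCRUB_AND_RESET = "configuration error, fixed by scrub then reset"
    NOT_FIXED = "not fixed"


def correct_and_classify(reset: Callable[[], None],
                         scrub: Callable[[], None],
                         is_fixed: Callable[[], bool]) -> Outcome:
    """Apply reset -> scrub -> reset, stopping at the first step that fixes the DCM."""
    reset()
    if is_fixed():
        return Outcome.RESET_ONLY
    scrub()
    if is_fixed():
        return Outcome.SCRUB
    reset()
    if is_fixed():
        return Outcome.SCRUB_AND_RESET
    return Outcome.NOT_FIXED


class ToyDcm:
    """Toy stand-in for a DCM under test (hypothetical, for illustration)."""

    def __init__(self, config_upset: bool, state_upset: bool) -> None:
        self.config_upset = config_upset
        self.state_upset = state_upset

    def reset(self) -> None:
        # Assumed behavior: a reset only recovers the DCM if its configuration is intact.
        if not self.config_upset:
            self.state_upset = False

    def scrub(self) -> None:
        # Scrubbing rewrites configuration memory but does not restart the DCM.
        self.config_upset = False

    def is_fixed(self) -> bool:
        return not (self.config_upset or self.state_upset)


if __name__ == "__main__":
    dcm = ToyDcm(config_upset=True, state_upset=True)
    print(correct_and_classify(dcm.reset, dcm.scrub, dcm.is_fixed))
```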

Page 13: DCM Functional Monitor SW

[Figure: screenshot of the DCM Functional Monitor software. Callouts: Run/Error Status; Logging; Per DCM Control and Status; User Controlled Reset; Real-time value of LOCKED output.]

Page 14: Cross-Section Results

• DCM CLK0 and CLKFX outputs have “statistically equivalent” susceptibility

[Figure: “DCM” plot of cross section (log scale, 1.E-09 to 1.E-04) versus LET (MeV·cm²/mg, 0 to 40). Series: CLKFX, CLK0FAST, CLK0SLOW.]

Page 15: Cross-Section Results

• No evidence of SET effects on the DCM between 33 and 100MHz.

[Figure: “CLK0” plot of cross section (log scale, 1.E-09 to 1.E-04) versus Frequency (MHz, 0 to 120). Series: Ne 40MeV, Kr 40MeV, Ar 40MeV.]

Page 16: Cross-Section Analysis

• Large % of errors at a high LET are in the datapath
  – DCM reset is all that is required to restore operation
  – 20-120us LOCK time (CLK0)
  – 10ms LOCK time (CLKFX)
• DCM Configuration Cell Usage is consistent
  – ~38% (60 of 160) of the DCM Configuration Cells are critical for the CLKFX and CLK0 designs.

[Figure: plot of cross section (cm²/DCM, log scale, 1.E-09 to 1.E-04) versus LET (MeV·cm²/mg, 0 to 50). Series: DCM Cross-Section, DCM Data Path Errors, DCM Configuration Errors, Configuration (single bit).]
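Two of the numbers on this slide are easy to sanity-check: the critical-cell fraction, and how many clock cycles a reset costs while the DCM re-locks. The Python sketch below does both with integer arithmetic. The function names are hypothetical, and expressing the lock time as cycles of a 100 MHz clock is an assumption made for illustration.

```python
def critical_fraction(critical_cells: int, total_cells: int) -> float:
    """Fraction of DCM configuration cells that are critical."""
    return critical_cells / total_cells


def cycles_during_lock(clock_hz: int, lock_time_us: int) -> int:
    """Whole clock cycles that elapse while the DCM re-acquires LOCK."""
    return clock_hz * lock_time_us // 1_000_000


if __name__ == "__main__":
    print(f"critical fraction: {critical_fraction(60, 160):.1%}")
    # CLK0: 20-120 us lock time; CLKFX: 10 ms lock time (from the slide).
    for label, lock_us in (("CLK0 min", 20), ("CLK0 max", 120), ("CLKFX", 10_000)):
        print(label, cycles_during_lock(100_000_000, lock_us), "cycles at 100 MHz")
```

The 60-of-160 ratio is exactly 37.5%, which the slide rounds to ~38%.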

Page 17: Lessons Learned and Future Work

• Lessons Learned
  – “Elementary Tests” are not efficient or cost effective
    • Development of Fault Injection capabilities will eliminate the need for beam time during the early stages of SEU evaluation
• Future Work
  – Analysis and test of self-EDAC logic
    • Autonomous Detection and Correction using XTMR

Page 18: Summary

• Results
  – The “saturated” DCM Cross-Section is ~1e-5 cm²/DCM
  – CLK0 and CLKFX Configuration Bit exposure is ~60 bits
  – Data Path Errors dominate at LET > 5 MeV·cm²/mg
    • Only a reset is required to resynchronize DCM
  – No apparent frequency dependence from 33MHz to 100MHz
  – DCM LOCKED output is not a reliable indicator of upsets
  – Scrubbing the DCM does not disturb the LOCK signal
  – An SEU in the DCM will not cause a functional failure if the Xilinx Triple Module Redundancy (XTMR) method is employed.
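The last point rests on majority voting: with three copies of a signal, a single upset copy is outvoted by the other two. The Python sketch below shows the generic bitwise majority function behind triple modular redundancy. It is not the XTMR tool's implementation, and the function name is hypothetical.

```python
def majority(a: int, b: int, c: int) -> int:
    """Bitwise 2-of-3 majority vote over three redundant values."""
    return (a & b) | (a & c) | (b & c)


if __name__ == "__main__":
    good = 0b1010
    upset = 0b0110  # one redundant copy corrupted by a single event upset
    print(bin(majority(good, good, upset)))
```

Any single corrupted copy is masked, whichever of the three positions it occupies; two simultaneous upsets in the same bit position are not.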