matrix tdc design overview
TRANSCRIPT
Topical Workshop on Electronics for Particle Physics 2016
Karlsruhe, September 28th 2016
MATRIX: a 15 ps resistive
interpolation TDC ASIC based
on a novel regular structure
J. Mauricio, D. Gascón, D. Ciaglia,
S. Gómez, G. Fernández, A. Sanuy
SiUB & ICC - Universitat de Barcelona
Outline
• Motivation
• MATRIX TDC Design Overview
• Measurement Results
• Conclusions & Future Work
Motivation
• SiUB PET ASIC ecosystem:
– Broad experience in fast-timing ASIC designs.
FlexToTv1, FlexToTv2.
< 10-ps jitter floor.
100-ps SPTR.
< 8% energy resolution.
Low power conspumption (~10 mW/ch).
Continuous Time Binary Valued outputs.
• Challenge:
– Time-of-Arrival (ToA) TDC for FlexToT timing signal output.
Resolution << FlexToT SPTR 10 ps < LSB < 20 ps.
Very low power consmption (~10 mW/ch, full chip).
Jitter & timing resolution ~ 1 LSB.
Technology: 180 nm.
3J.Mauricio - UB
FlexToTv2 ASIC (0.35um BiCMOS)
Motivation
• First TDC design temptative:
– 4-ch TDC implementation based on state of the art.
An adaptation of L. Perktold and J. Christiansen TDC [1].
2 levels of fine TDC (sub-clock):
o 1st level: tunable buffers ~ 120 ps delay (180 nm technology).
o 2nd level: resistive interpolation.
Number of resistive interpolation stages:
o 8 stages per buffer: LSB ~ 15 ps.
Power consumption: ~18 mW (schematic level), ~4.5 mW/ch.
o Almost half of the power budged spent on this block.
Resistive interpolation problems:
o Require iterative adjust.
o Mismatch issues.
o Difficult to scale.
4
L. Perktold and J. Christiansen, 2013
J.Mauricio - UB
Motivation
• Design potentially out of specifications:
– Simulation results at schematic level:
Post-layout ~ 2X power consumption.
1st level tunable buffers would increase with parasitics.
5J.Mauricio - UB
Outline
• Motivation
• MATRIX TDC Design Overview
• Measurement Results
• Conclusions & Future Work
MATRIX TDC Design Overview
• A novel alternative was proposed:
– Resistive Interpolation Mesh Circuit (RIMC).
Patent application EP16382039.2.
7
Dummy stages
RIMC schematic.
R0<0> R0<1> R0<2> R0<3> R0<4> R0<5> R0<6> R0<7>DIN0
R0<0> R0<1> R0<2> R0<3> R0<4> R0<5> R0<6> R0<7>DIN0
ROW0
ROW1
ROW2
ROW3
ROW4
ROW5
ROW6
R0<0> R0<1> R0<2> R0<3> R0<4> R0<5> R0<6> R0<7>DIN0
R0<0> R0<1> R0<2> R0<3> R0<4> R0<5> R0<6> R0<7>DIN0
ROW0
ROW1
ROW2
ROW3
ROW4
ROW5
ROW6
Ro
w w
idth
Column width
(sub-gate delay)
> Row width / 8
Delay Element
J.Mauricio - UB
MATRIX TDC Design Overview
• A novel alternative was proposed:
– Resistive Interpolation Mesh Circuit (RIMC).
Patent application EP16382039.2.
8
0.000
0.200
0.400
0.600
0.800
1.000
No
rma
lize
d T
DC
co
un
ts
TDC sub-delay normalized transfer function
ROW 6
ROW 5
ROW 1
ROW 3
ROW 4
ROW 2
ROW 0
TCycle / 2
Chronogram of the RIMC clock
nodes andnormalized ideal transfer function.
Sub-gate delay = 15 ps
TCycle= 1250 ps
TRow = 90 ps
TCol = 15 ps
Overlap = 33%
Gate delay = 90 ps
J.Mauricio - UB
MATRIX TDC Design Overview
• Benefits:
– Reduced power consumption and area:
Few components (-50%): sub-delays covers half clock period.
– Excellent process variability properties:
Very regular design (less neighboring effects).
Mesh structure averages variability.
– Scalability and reusability:
Few changes on the Delay Element to modify TDC resolution/performance.
– Very low design time:
Simple atomic structure: Delay Element (starved inverter + resistor).
The structure is repeated in 2D.
• Drawbacks:
– Layout issues related to the 2D structure:
Track congestion.
Track lengths.
9J.Mauricio - UB
MATRIX TDC Design Overview
10
FRONT-END READOUT
RIMC(VCO)
COARSE COUNTER
CP/M
clk200
PFDCLK_REF
clk800
···
TIME
clk800
SERIALIZER
BACK-END READOUT
VCTL
clk800
clk200
TIME_DIG
COARSE_OVF
Fine data
Coarse data
!GRSTSPI BUS
SPI SLAVE
PLL
clk200
J.Mauricio - UB
MATRIX TDC Design Overview
• PLL:– Supports M = 4 / 8 / 16.
– PLL Clock Frequency = 800 MHz.
– The RIMC is the VCO.
• Coarse Counter:
– 10-Bit natural counter at 800 MHz.
– Multilevel approach:
2-bit counter at 800 MHz.
8-bit counter at 200 MHz.
Clock domain synchronization to avoid metastability issues.
– Full custom design.
11J.Mauricio - UB
MATRIX TDC Design Overview
• Front-End Readout (Fine Interpolator):– 4 channels.
– Captures the polarity of the RIMC nodes.
– Encodes the column(s) where transition(s) occurs. It also gathers transition type (rising or falling).
• Serializer:
– Two output types: Single Ended and LVDS.
– Data rate:
10 MHz sustained rate per channel (200 Mbps).
12J.Mauricio - UB
D Q
x8
TIME
R0<7:0>T0<7:0>
T0<7:1>
T0<6:0>x7
XOR0<6:0>7:3 COL0<2:0>
D Q
x8TIMEDELAYED
E
!BUSY
HIT<0>
CLK_PHASE<0>T0<0>
IS_EVEN=1
Schematic of Row #0 of the Front-End Readout
MATRIX TDC Design Overview
• Back-End Readout:
– Syncronize Fine Interpolator and Coarse Counter data.
Coarse Counter provides a phase bit to align counters.
– 4-Word per-channel FIFO:
TDC dead time = 20 ns.
– Algorithm:
Row: 0~6 (3 bits).
Column: 0~7 (3 bits).
Fine stage comptuation (8 bits, 0~219 LSBs):
o Two hits (overlapping): Fine Stage ID = ID1 + ID2
o One hit (no overlapping): Fine Stage ID = 2 * ID1
o If phase bit is active, Fine Stage ID += 110
13
Full Scale = 1250 ps
1250 ps = 220 LSB
1 LSB = 5.68 ps (avg)
Stage ID = Row * 8 + Column
J.Mauricio - UB
MATRIX TDC Design Overview:
MATRIX TDC core layout.
• Front-End Readout size: 280x215 um2.
• Back-End Readout size: 630x215 um2.
BACK-END READOUT
& SPI INTERFACE
POWER RAILS
POWER RAILS
RIMCCOARSE
COUNT
FRONT-END READOUT
FRONT-END READOUT
PLL
910 µm
215 µ
m
630 µm280 µm
J.Mauricio - UB 14
Outline
• Motivation
• MATRIX TDC Design Overview
• Measurement Results
• Conclusions & Future Work
Measurement Results
• Chip calibration:
– Density code test:
100K random pulse shots following an uniform distribution.
The number of hits per bin will indicate the bin size.
Used to calibrate MATRIX channels.
– Test results:
Almost all the bins in the range between 0 and 25 ps.
Uncalibrated σDNL = 7 ps.
16
0
5
10
15
20
25
30
35
0 20 40 60 80 100 120 140 160 180 200 220
Bin
Wid
th (
ps)
Bin Number
Chip #1 - Ch0 density code test
J.Mauricio - UB
Outlier
Measurement Results
• Linearity & Jitter test:
– CLK REF TIMEi sweep:
Start: 0 ps, stop: 1245 ps.
Step: 5 ps. #steps = 250.
N acquisitions per step.
– Jitter measurement (single shot precision):
Stdev of the N ACQ of each step.
– Linearity measurement:
Average of the N ACQ of each step.
17
PPG 81110A
MATRIX PCB
5.0V 0.4A
E3631A
TIMEi
CLK REFLinearity & Jitter test setup.
0ps 5ps ... 1245ps
013 015 ... 011
014 016 ... 010
011 014 ... 012
···
014 017 ... 011
013 014 ... 011
012 015 ... 009
N A
CQ
250 steps
J.Mauricio - UB
Measurement Results
• Linearity - DNL:
18
– Uncalibrated:
DNL = ± 5.1 ps
RMS DNL < 1.4 ps
– Calibrated:
DNL = ± 4.7 ps
RMS DNL < 1.1 ps
J.Mauricio - UB
Measurement Results
• Linearity - INL:
19
– Uncalibrated:
INL = ± 21.9 ps
RMS INL < 13.2 ps
– Calibrated:
INL = ± 10.2 ps
RMS INL < 3.7 ps
J.Mauricio - UB
Measurement Results
• Jitter (pulse generator + MATRIX TDC):
20
0
5
10
15
20
25
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
#o
f s
tep
s
Jitter (ps)
Chip #1 - Ch0 step Jitter - PLL M = 16 (uncalibrated)
0
20
40
60
80
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
#o
f ste
ps
Jitter (ps)
Chip #1 - Ch0 step Jitter - PLL M = 16 (calibrated)
0
20
40
60
80
5 6 7 8 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
#o
f ste
ps
Jitter (ps)
Chip #1 - Ch0 step Jitter - PLL M = 8 (calibrated)
0
10
20
30
5 6 7 8 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
#o
f ste
ps
Jitter (ps)
Chip #1 - Ch0 step Jitter - PLL M = 8 (uncalibrated)
0
10
20
30
3 4 5 6 7 8 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
#o
f ste
ps
Jitter (ps)
Chip #1 - Ch0 step Jitter - PLL M = 4 (uncalibrated)
0
20
40
603 4 5 6 7 8 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
Fre
cu
en
cia
Clase
Chip #1 - Ch0 step Jitter - PLL M = 4 (calibrated)
J.Mauricio - UB
Measurement Results
• Jitter (pulse generator + MATRIX TDC):
– TDC jitter is dominated by PLL.
M = 4 has a natural frequency (𝝎𝒏) 2X M = 8 and thus jitter improves.
There is margin for the improvement in further MATRIX versions.
21
PLL MTDC Jitter (ps)
Uncalibrated Calibrated
4 9.7 9.3
8 12.3 11.7
16 21.2 20.6
𝝎𝒏 =𝑲𝑽𝑪𝑶·𝑰𝑪𝑷𝑴 · 𝑪𝟏
𝑰𝑪𝑷 = 𝑪𝑷 𝒄𝒖𝒓𝒓𝒆𝒏𝒕 [𝑨]
𝑴 = 𝒇𝒃 𝒅𝒊𝒗𝒊𝒔𝒐𝒓
𝑪𝟏 = 𝒍𝒂𝒓𝒈𝒆 𝒍𝒐𝒐𝒑 − 𝒇𝒊𝒍𝒕𝒆𝒓 𝒄𝒂𝒑 [𝑭]
𝑲𝑽𝑪𝑶 = 𝑽𝑪𝑶 𝒈𝒂𝒊𝒏𝑯𝒛
𝑽
J.Mauricio - UB
Measurement Results
• Power Consumption:
– Measurements are slightly better than simulations.
– Three operating modes:
Standby. Reference clock input disabled: 0.76 mW.
“Standby”. On-chip PLL still working: 30.1 mW.
Low Power. LVDS drivers with minimum differential swing: 45.2 mW.
Default: 56.3 mW.
22
Proposal Technology Bin size Time Res. Pow/ch
S. Russo et al. (2011) [2] 180 nm 41 ps 14 ps 25 mW
L. Perktold et al. (2014)[1] 130 nm 5 ps 3 ps 43 mW
P. Keränen et al. (2015) [3] 350 nm - 5 ps 80 mW
MATRIX (2016) 180 nm 15 ps 4 ps 11.3 mW
J.Mauricio - UB
Outline
• Motivation
• MATRIX TDC Design Overview
• Measurement Results
• Conclusions & Future Work
Conclusions & Future Work
• Conclusions:
– The new concept was successfully proved.
Design, prototype & test.
– Performance:
Similar to other proposals. 4 ps RMS time resolution.
Outstanding low power consumption. 11.3 mW/channel.
Jitter should be improved. 10 ps for M=4, 20 ps for M=16.
We should consider the impact of wide bins (>30 ps) for some applications.
• Future Work:
– MATRIX V2 with improved jitter.
Available in early 2017.
– SoC PET ASIC (HR-FlexToT):
Analog signal processing.
MATRIX TDC.
Power consumption and cost effectiveness.
24J.Mauricio - UB
References
1. L. Perktold and J. Christiansen, “A multichannel time-to-digital converter
ASIC with better than 3 ps RMS time resolution”; 2014 JINST 9 C01060
2. S. Russo et al., “A 41 ps ASIC time-to-digital converter for physics
experiments” in Nuclear Instruments and Methods in Physics Research,
Volume 659, Issue 1, p. 422-427.
3. P. Keränen and J. Kostamovaara, "A Wide Range, 4.2 ps(rms)
Precision CMOS TDC With Cyclic Interpolators Based on Switched-
Frequency Ring Oscillators," in IEEE Transactions on Circuits and
Systems I: Regular Papers, vol. 62, no. 12, pp. 2795-2805, Dec. 2015.
J.Mauricio - UB
THANK YOU FOR YOUR ATTENTION!
BACKUP SLIDES...
• Res. Interp. Mesh Delay Element:
– Full custom design.
– Design For Manufacturability (DFM) techniques to minimize systematic
process variability:
Dummy structures.
Symmetric buffer design.
1-D poly with constant pitch.
Continuous rectangular diffusions.
Dummy resistor.
– Dummy transistors (DPR, DNR…): More transistors allowed -if necessary-.
RIMC Layout
Delay Element layout (~10 x 12.5 um2).
Delay Element symbol
J.Mauricio - UB
• Res. Interp. Mesh Column design:
– DEs are abutted to improve DFM.
– DE stages are interleaved to equalize
interconnection lengths.
– The matrix building is as easy as
concatenate the column N times.
RIMC Layout
Fig 5: Schematic sample (left) , Layout sample (middle) and layout interconnection diagram (right).
3
0
1
6
2
5
4
DE Input
DE Output
J.Mauricio - UB
• Front-end Readout Improvements:
• Track lengths were optimized.
• TDC channels were stretched.
• Dedicated horizontal power rails were removed.
• Track capacitance decreased by factor 3 as a consequence (75 fF 25 fF).
RIMC + TCM Layouts
215 μ
m
70 μ
m70 μ
m70 μ
m70 μ
m90 μ
m
J.Mauricio - UB
• Front-End Readout (fine interpolation) transfer function:
– Phase between timing signal and clock is swept in steps of 1 ps along the
1.25 ns clock period.
– For each event, we have:
The row(s) number(s) where the transition occurred (NEW_EVENT<6:0>).
The transition type: rising or falling (CLK_PHASE<6:0>).
Encoded row data (DOUT0<2:0>, DOUT1<1:0>, … DOUT6<1:0>).
Simulation Results:
J.Mauricio - UB
• Front-End Readout (fine interpolation) transfer function:
Simulation Results:
Front-End Readout (fine interpolation) transfer function output.
J.Mauricio - UB
MATRIX TDC Detailed Block Diagram
33
FRONT-END READOUT
CLOCK SYNTHESIZER
COARSE COUNTER
CP
!scRstPll
/MscPllM<1:0>
clk2
00
PFDCLK_REF
!scVctlOen
clk800
···
x4
TIME<3:0>
TRIGGER<3:0>
clk800
COARSE_RST
scFiltFeRo<3:0>
scFiltBeRo<3:0>
SERIALIZERx4
scFiltSer<3:0>
BACK-END READOUT
x4
intVctl
clk8
00
clk200
TIME_DIG
<3:0>
TIME_DIG
_SE<3:0
>
DATA_STRB<3:0
>
PLL_VCTL
COARSE_OVF
Fine data
Coarse data
PLL_LOCKEDCLK_200
!GRST
!SPI_
SS
SPI_
MOSI
SPI_
SCLK
SPI_
MIS
O
scEnLvdsOut<3:0>
scLvdsOutCselx<3:0>
!scE
nSe
Ou
t<3
:0>
!scEnLvdsIn<3:0>
scEnLvdsOut<4>
scLvdsOutCselx<4>
SPI SLAVE
sc...
!scRstPll
J.Mauricio - UB
• 2-bit counter features:
– Maximum parallelism.
– The worst path is an XOR2.
– Overflow signal is used to increase the 8-bit counter.
Coarse Counter design
Schematic of the 800 MHz 2-bit counter
J.Mauricio - UB
• Coarse counter features:
– 2-level approach: 2-bit counter @ 800 MHz, 8-bit counter @ 200 MHz.
– Counter values are resynchronized to avoid metastability issues.
– Synchronous reset from an external signal.
Coarse Counter design
Schematic of the 800 MHz 10-bit counterJ.Mauricio - UB
• Coarse counter channel capture features:
– ToA signal (TIME) is discretized (timeResynch) at 800 MHz.
– Discretized signal (synchronous) captures coarse counter safely.
– LSB_CHANGE gives information about the phase (Coarse/Fine alignment).
Coarse Counter design
Schematic of the per-channel Coarse Counter value captureJ.Mauricio - UB
• Coarse counter. Conclusions:
• Power consumption: ~3 mW.
• Reliability problems for supply voltages < 1.7 V at high temperatures.
Coarse Counter design
Coarse counter layout (83 x 70 um2).
J.Mauricio - UB 37
MATRIX v1 Picture:
TDC V1 chip layout. Dimensions: X=2218 um, Y=2221 um.
This is thecore of thechip.
The rest are capacitorsdecouplingcapacitors.
J.Mauricio - UB 38
Code Density Test
• How it works?
– Two uncorrelated pulse generators:
One for the CLK reference (50 MHz).
The second for uncorrelated timing pulses (12.45 KHz).
In theory, all the TDC bins have the same probability to appear.
o Wider bins indicate linearity issues.
39
Code density test setup.
PPG 81110A
MATRIX PCB
5.0V 0.4A
E3631A
TIMEi
CLK REF
Generic PPG
J.Mauricio - UB
TDC INL Statistics
40
0
2
4
6
8
10
12
14
16
18
20
22
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
INL
(ps)
Channel sample
TDC INL of 24 channel samples (RMS and range)
Uncal INL +/- (ps) Uncal INL RMS (ps) Cal INL +/- (ps) Cal INL RMS (ps)
3.7 ps RMS
13.2 ps RMS
±10.2 ps
±21.9 ps
J.Mauricio - UB
-0.5
0.5
1.5
2.5
3.5
4.5
5.5
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
DN
L (p
s)
Channel sample
TDC DNL of 24 channel samples (RMS and range)
Uncal DNL +/- (ps) Uncal DNL RMS (ps) Cal DNL +/- (ps) Cal DNL RMS (ps)
TDC DNL Statistics
41
1.1 ps RMS
1.4 ps RMS
±4.7 ps
±5.1 ps
J.Mauricio - UB