OC-192 communications system block diagram
gram.eng.uci.edu/faculty/green/public/courses/270c/...
EECS 270C / Spring 2014 Prof. M. Green / U.C. Irvine 1
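OC-192 communications system block diagram
[Figure: two network processors linked through TX/RX blocks and E/O, O/E converters. Each processor exchanges 16-bit parallel data at 622 Mb/s with its TX and RX; the serial link runs at 10 Gb/s. Transmit side: 10 Gb/s modulator and laser, 10 GHz clock. Receive side: photodiode, TIA + preamp.]
• OC-192 (10 Gb/s) transceiver
• 0.18 µm CMOS process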
Transceiver block diagram:
[Figure: transceiver block diagram; labels: 10 GHz, 10 Gb/s.]
Transmitter Block Diagram
[Figure: transmitter block diagram. Blocks: input register, 16 x 10 FIFO with write pointer, read pointer and FIFO control, 16:1 MUX, divide-by-16, 10/10.7 GHz CMU, output retime. Interfaces: LVDS parallel input bus (DI0P/DI0N ... DI15P/DI15N, CLK16IP/CLK16IN), LVPECL reference clock (REFCLKP/REFCLKN), CML high-speed outputs (TSDP/TSDN, TSCKP/TSCKN), LVDS output clock (CLK16OP/CLK16ON). Control and status pins: RESET, REF155EN, SELFECB, IFSEL, VCP/VCN, LCKDET, RB_LD, OVF.]
Low-Frequency Input signals
Reference clock
Input clock
Input data
Input data aligned to input clock (usually jittery)
Very low jitter (~10 ppm) reference clock; used in CMU to generate 10 GHz internal clock
Reference clock and input clock are not synchronized.
[Figure: timing diagram marking the clock period T and the setup/hold window tsh.]
Maximum allowable variation between input clock & reference clock is T − tsh.
Illustration of Input Timing Regimes
[Figure: 16-bit input data and input clock (input clock timing domain) feed a 16:1 MUX driven by the reference clock and the high-frequency clock (reference clock timing domain).]
Connection between the two domains could exhibit varying delay.
Variable phasing between input & reference clock domains can cause bit errors in the MUX.
First-In/First-Out (FIFO) Circuit (1)
We require an intermediate block to resolve timing variations between input & reference clock.
[Figure: FIFO placed ahead of the 16:1 MUX.]
First-In/First-Out Circuit (2)
[Figure: FIFO with inputs Din_0 ... Din_n and outputs Dout_0 ... Dout_n; k-phase write clocks and k-phase read clocks; ref clock inputs.]
Inputs Din_0 ... Din_n: synchronized with input clock.
Outputs Dout_0 ... Dout_n: to serializer (signals synchronized with reference clock).
Since the read and write clock signals have period k times longer than the input period, the circuit can tolerate k times larger variation between input & reference clocks.
Write clock based on input clock; read clock based on reference clock.
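The two timing statements above (a maximum variation of T − tsh for a direct connection, and k times more with a FIFO) can be put into numbers with a small sketch. The 622.08 MHz parallel clock follows the OC-192 rates quoted in these slides; the setup/hold window and the value k = 10 are assumptions chosen only for illustration.

```python
def max_clock_variation(period_s: float, t_sh_s: float, k: int = 1) -> float:
    """Tolerable input-to-reference clock variation, k * (T - tsh).

    k = 1 models a direct connection to the MUX; k > 1 models a FIFO whose
    read/write clocks run k times slower than the input clock.
    """
    if t_sh_s >= period_s:
        raise ValueError("tsh must be smaller than the clock period")
    return k * (period_s - t_sh_s)

T = 1 / 622.08e6      # input clock period for a 622.08 MHz parallel clock
T_SH = 300e-12        # hypothetical setup/hold window (assumption)

direct = max_clock_variation(T, T_SH)
with_fifo = max_clock_variation(T, T_SH, k=10)   # k = 10 is an assumed value
print(f"direct: {direct * 1e9:.2f} ns, with FIFO: {with_fifo * 1e9:.2f} ns")
```

The function only restates the slide's proportionality; a real FIFO also needs margin for pointer initialization and overflow detection, which this sketch ignores.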
FIFO approach:
• Large amount of hardware (many latches)
• Significant power dissipation unless static CMOS is used
• Can handle arbitrarily large delay variations
DLL approach:
• Less hardware
• Can handle modest delay variations
• Better choice for BJT or GaAs processes
[Figure: DLL block; appropriate phase chosen.]
16:1 Multiplexer Tree Structure
[Figure: tree of 2:1 MUX stages with regions labeled static CMOS and CML; data rates 1.25 Gb/s, 2.5 Gb/s, 5 Gb/s, 10 Gb/s; clocks 1.25 GHz, 2.5 GHz, 5 GHz.]
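A behavioral sketch of the tree structure may help show why a binary tree of 2:1 MUXes produces the bits in the right serial order. This is a software model only, not the circuit from the slides; the function names `mux2` and `mux_tree` and the lane-to-input assignment are illustrative assumptions.

```python
from typing import List

def mux2(a: List[int], b: List[int]) -> List[int]:
    """2:1 MUX: interleave two equal-rate streams into one at twice the rate."""
    out = []
    for x, y in zip(a, b):
        out.extend((x, y))
    return out

def mux_tree(lanes: List[List[int]]) -> List[int]:
    """Serialize 2**m parallel lanes with a binary tree of 2:1 MUXes."""
    if len(lanes) == 1:
        return lanes[0]
    # Even-numbered lanes feed one half of the tree, odd-numbered the other,
    # so that the final interleave restores lane order 0, 1, 2, ...
    return mux2(mux_tree(lanes[0::2]), mux_tree(lanes[1::2]))

# Two 16-bit words; lane i carries bit i of each word (values label the bits).
lanes = [[i, 16 + i] for i in range(16)]
serial = mux_tree(lanes)
print(serial)
```

Each level of recursion corresponds to one rank of 2:1 cells, and each rank doubles the data rate, which matches the 1.25, 2.5, 5, 10 Gb/s progression in the figure labels.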
2:1 MUX cell details
D flip-flop with extra latch
Assume all blocks have:
• Tail current ISS
• Resistor R
• Diff pair transistor sizes W/L
[Figure: MUX tree annotated with operating frequencies $f$, $\tfrac{1}{2}f$, $\tfrac{1}{2}f$, $\tfrac{1}{4}f$ and block-group currents ISS, 5ISS, 2ISS, 10ISS.]
Total current: 18ISS
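We can take advantage of gain/bandwidth tradeoff by appropriate scaling:
Design parameters:
• ISS
• R
• W/L ⇒ CL
Idea:
[Figure: cascaded CML stages with transistor sizes W/L marked. Stages running at frequency $f$ use $I_{SS}, R, W, C_g$ and have $\tau = RC_g$; the stage running at $\tfrac{1}{2}f$ uses $\tfrac{1}{2}I_{SS}, 2R, \tfrac{1}{2}W, \tfrac{1}{2}C_g$ and has $\tau = 2RC_g$.]
Lower bit rate allows lower power!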
[Figure: MUX tree with each group of blocks scaled according to its operating frequency.]
Cp ≈ 10 fF, GSCALE = 3 → ISS = 1.2 mA
Frequency   MSCALE   Block parameters      τ        Group current
f           1        ISS, R, W             RCg      ISS
f/2         1/2      ISS/2, 2R, W/2        2RCg     (5/2)ISS
f/2         1/2      ISS/2, 2R, W/2        2RCg     ISS
f/4         1/8      ISS/8, 8R, W/8        4RCg     (10/8)ISS
Itotal = 5.75ISS = 6.9 mA
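The power saving from scaling can be checked with a few lines of arithmetic. The group currents (ISS, 5ISS, 2ISS, 10ISS), the MSCALE factors and ISS = 1.2 mA are the values quoted on the slides; the pairing of each MSCALE factor with a block group is an inference from the quoted totals, not something the slides state explicitly.

```python
from fractions import Fraction

# Relative current drawn by each block group before scaling (in units of ISS)
# and the MSCALE factor assumed to apply to that group.
unscaled = [1, 5, 2, 10]
mscale = [Fraction(1), Fraction(1, 2), Fraction(1, 2), Fraction(1, 8)]

I_SS_MA = 1.2  # mA, the value quoted for GSCALE = 3

total_unscaled = sum(unscaled)
total_scaled = sum(i * m for i, m in zip(unscaled, mscale))
total_scaled_ma = float(total_scaled) * I_SS_MA
print(total_unscaled, float(total_scaled), round(total_scaled_ma, 2))
```

Under this pairing the totals reproduce the 18ISS and 5.75ISS figures, roughly a threefold reduction in supply current.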
Clock Dividers
The operation of “real” high-speed clock dividers is more complex …
Clock divider based on CML D flip-flop:
Divider sensitivity curve:
Vmin = minimum input clock amplitude required for correct operation.
fso = self-oscillation frequency
Vmax = maximum dc differential voltage that can be applied to the input clock for which the circuit self-oscillates.
[Figure: CML D flip-flop divider schematic with transistor sizes labeled $(W/L)_D$, $(W/L)_L$ and $(W/L)_C$; sensitivity curve with Vmin, Vmax and fso marked.]
Sensitivity Curve Analysis
Region I: Desired frequency divider operation
Region II: Quasiperiodic operation
Region III: Slew-rate limited operation
Region II: Quasiperiodic behavior
[Figure: divider output for fin = 11 GHz, with intervals labeled self-oscillating and locked.]
Region III: Slew-rate limited Behavior
Sine-wave input
Square-wave input
Effect of Transistor Sizes on Sensitivity Curve
Latch transistors
Driver transistors
Clock transistors
Alternatives to DFF-Based Clock Dividers
• Latches present large capacitive load ⇒ slow
At very high frequencies, latch transistors are not necessary and only add capacitance to the circuit:
Ring-Oscillator-Based Divider
Behaves like a 4-stage ring oscillator with injection of full-rate frequency.
Comparison of Sensitivity Curves
Conventional divider: Wider frequency range; lower self-oscillation frequency
Dynamic divider: Narrow frequency range; higher self-oscillation frequency
Effect of Non-Ideal Clock Signals
[Figure: CML stage with branch currents I1(t) and I2(t) switching between 0 and ISS; outputs Vout+(t) and Vout−(t) with levels VDD, VDD − ISS·R and VDD − ISS·(R+ΔR); differential output Vout+(t) − Vout−(t) with levels ISS·R and ISS·(R+ΔR) marked around 0.]
Offset resistance causes deviation from 50% duty cycle in clock signal.
[Figure: half-rate clock and MUX output waveforms, each shown ideal and with offset.]
Result of nonideal half-rate clock is Periodic Jitter.
Retimer eliminates this problem:
[Figure: 5 Gb/s data multiplexed with the 5 GHz clock into 10 Gb/s data, then passed through a retimer clocked by the 10 GHz full-rate clock (could be non-50% duty cycle) to give the 10 Gb/s retimed output.]
Internal MUX Timing
[Figure: 2.5 Gb/s data input, 5 Gb/s intermediate data, 10 Gb/s data output; 10 GHz clock divided to 5 GHz and 2.5 GHz; delays tp1 and tp2 marked.]
tp1 & tp2 are “clock-to-Q” delays. Because the clock & data flow in opposite directions, alignment between 5 Gb/s data & 5 GHz clock is determined by the sum: tp1 + tp2 (high sensitivity to processing/temp. corners).
Serial Output 50Ω Line Driver
• 50Ω back termination used to reduce reflections.
• CML blocks scaled up so that last stage drives ac load of 25Ω.
• Shunt-peaking used in second stage.
Receiver Block Diagram
[Figure: receiver block diagram. Blocks: CDR, 1:16 DEMUX, output registers, divide-by-16, divide-by-4, LOS detect. Interfaces: CML serial input RDINP/RDINN at 9.953/10.3125/10.664/10.709 Gbps; LVDS parallel outputs RXDOUT0P/RXDOUT0N ... RXDOUT15P/RXDOUT15N with clock RXPOCLKP/RXPOCLKN at 622.08/644.54/666.51/669.31 Mbps; reference clock RXREFCLKP/RXREFCLKN; RXMCLKP/RXMCLKN; test-only clock RSCLKP/RSCLKN. Control and status pins: REF155ENB, RATESEL0/1, VCP/VCN, RESETB, LCKREFB, RXMCLKENB, LCKDET, LOSB, RB_LD.]
DMUX Architecture
[Figure: 1:16 DMUX built from 1:2 cells in four ranks (1:2, 2:4, 4:8, 8:16) with regions labeled CML and static CMOS; 10 Gb/s data in, 622 Mb/s data out; 10 GHz clock divided by 2 between ranks, with 622 MHz and 311 MHz marked at the low-speed end.]
1:4 DMUX Tree Structure
[Figure: 10 Gb/s data input split into 5 Gb/s streams and then into 2.5 Gb/s data outputs; 10 GHz clock divided to 5 GHz and 2.5 GHz.]
1:2 DMUX cell details:
Internal DMUX Timing
[Figure: 10 Gb/s data input, 5 Gb/s intermediate data, 2.5 Gb/s data output; 10 GHz clock divided to 5 GHz and 2.5 GHz; delays tp1 and tp2 marked.]
tp1 & tp2 are “clock-to-Q” delays.
Because the clock & data flow in the same direction, alignment between 5 Gb/s data & 2.5 GHz clock is determined by the difference: tp1 − tp2 (low sensitivity to processing/temp. corners).
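The contrast between the MUX (alignment set by tp1 + tp2) and the DMUX (alignment set by tp1 − tp2) can be illustrated numerically. The sketch below assumes both clock-to-Q delays scale by the same factor at each process/temperature corner; the delay values and corner multipliers are hypothetical and are not taken from the slides.

```python
def alignment_spread(tp1_nom, tp2_nom, corner_factors):
    """Return (spread of tp1 + tp2, spread of tp1 - tp2) across corners.

    Both delays are assumed to scale by the same factor at each corner.
    """
    sums = [c * (tp1_nom + tp2_nom) for c in corner_factors]
    diffs = [c * (tp1_nom - tp2_nom) for c in corner_factors]
    return max(sums) - min(sums), max(diffs) - min(diffs)

# Hypothetical clock-to-Q delays and fast/typical/slow corner multipliers.
TP1, TP2 = 40e-12, 35e-12
mux_spread, dmux_spread = alignment_spread(TP1, TP2, [0.7, 1.0, 1.4])
print(f"MUX: {mux_spread * 1e12:.1f} ps, DMUX: {dmux_spread * 1e12:.1f} ps")
```

Because the two delays track each other, their difference moves far less across corners than their sum, which is the qualitative point made on the two timing slides.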
Crosstalk in Transceivers
[Figure: two VCOs operating at frequencies f1 and f2.]
• Capacitive coupling between VCOs can cause “frequency pulling”.
• Momentary differences in frequencies between 2 VCOs can give rise to additional jitter.
Crosstalk Measurement
[Figure: transceiver under test. RX: serial input data at 10 Gb/s, recovered clock at 10 GHz. TX: output data at 10 Gb/s + 100 ppm, output clock at 10 GHz + 100 ppm. Low-frequency inputs/outputs on both sides.]
CMU reference clock: $f_{ref} = \frac{10\ \text{GHz}}{16} \cdot \left(1 + 10^{-4}\right)$
Jitter is measured at TX output clock (or data) and RX recovered clock.
Techniques for Reducing Transceiver Crosstalk
• Sufficient physical separation between VCOs
• Separate supply connections to package for each block (e.g., CMU, CDR, MUX, DMUX, FIFO, etc.)
• Ample guard rings to minimize substrate coupling
Very difficult to simulate & predict!
SONET Jitter Specifications
1. Jitter Generation (transmitters)
2. Jitter Tolerance (receivers)
3. Jitter Transfer (repeaters)
Jitter Generation (1)
• DJ always specified in peak-to-peak
• RJ rms jitter well-characterized
• RJ peak-to-peak jitter dependent on measurement time (increases without bound)
SONET: JPP usually measured over a specified frequency range.
Wideband jitter (p-p or rms) can be measured directly from serial output data signal
Gigabit Ethernet & Fiber Channel: Equivalent JPP determined by measured BER.
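To illustrate why random-jitter peak-to-peak values only make sense relative to a BER, the sketch below converts an rms value to an equivalent peak-to-peak value under a Gaussian assumption. The convention used here (J_pp = 2·Q·σ, with Q set by the one-sided Gaussian tail) is a common one but is an assumption; the slides do not state which convention the standards use.

```python
from statistics import NormalDist

def rj_peak_to_peak(sigma_rms: float, ber: float) -> float:
    """Equivalent peak-to-peak random jitter at a given BER, Gaussian RJ assumed.

    Uses J_pp = 2 * Q * sigma with Q chosen so the one-sided Gaussian tail
    probability equals the BER (a common convention; others exist).
    """
    q = -NormalDist().inv_cdf(ber)
    return 2.0 * q * sigma_rms

for ber in (1e-6, 1e-9, 1e-12):
    print(ber, round(rj_peak_to_peak(1.0, ber), 2))
```

The multiplier keeps growing as the BER target shrinks, which is the same effect as the peak-to-peak value increasing with measurement time.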
Jitter Generation (2)
SONET jitter generation is specified within a certain jitter frequency range. For OC-192: 50 kHz – 80 MHz
To measure narrowband jitter generation, we can:
A. Measure the recovered clock from a “golden” CDR:
[Figure: ref. clock → TX → output data → CDR (low jitter generation; should have jitter bandwidth > 80 MHz) → recovered clock (10 GHz) → SONET OC-192 bandpass filter → to jitter analyzer.]
Jitter Generation (3)
B. Measure the TX output clock directly (assuming its jitter is the same as the data):
[Figure: ref. clock → TX → output data; TX output clock (10 GHz) → to jitter analyzer.]
Note: ISI is usually measured separately (peak-to-peak only).
Jitter Generation (4)
Measured at output clock; 2^31−1 PRBS serial data applied to input
9.95328 GHz: Phase noise: −100 dBc/Hz @ 1 MHz offset; jitter generation (SONET filter): 5.6 mUI rms / 60 mUI p-p
10.6642 GHz: Phase noise: −100 dBc/Hz @ 1 MHz offset; jitter generation (SONET filter): 6.2 mUI rms / 65 mUI p-p
Jitter Generation (5)
Jitter measurements from clock:
Jitter generation (2^31−1 PRBS): 6.44 ps p-p (wide band); 0.38 ps rms (within SONET band)
Closed-loop VCO phase noise (2^31−1 PRBS): −107 dBc/Hz @ 1 MHz offset
Jitter Generation (6)
2^31−1 PRBS input data applied:
10.6642 GHz clock: Wideband jitter: 7.5 ps p-p / 1.2 ps rms
10.6642 Gb/s data: Wideband jitter: 10.7 ps p-p / 1.8 ps rms
Jitter Tolerance (1)
[Figure: CDR with serial data in, recovered clock, and a retimer producing retimed data out to the DMUX; timing diagram of data in and recovered clock with period T and window tsh marked.]
Experiment: Apply serial data to CDR with jitter at a certain frequency. Increase the jitter amplitude until a bit error occurs.
If data jitter & recovered clock jitter could perfectly track, then retiming would be error-free.
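Jitter Tolerance (2)
Given CDR open-loop characteristic
$G(j\omega) = \frac{K_{pd} \cdot F(j\omega) \cdot K_{vco}}{j\omega}$
[Figure: CDR loop with input phase $\phi_{data}$ and output phase $\phi_{clock}$.]
$\frac{\phi_{clock}}{\phi_{data}} = \frac{G}{1+G}$
$\frac{\phi_{data} - \phi_{clock}}{\phi_{data}} = \frac{1}{1+G}$
$\phi_{data(max)}(\omega) = \left|1 + G(j\omega)\right| \cdot \left|\phi_{data} - \phi_{clock}\right|_{max} = \left|1 + G(j\omega)\right| \cdot 2\pi\,\frac{T - t_{sh}}{T}$
$JTOL(\omega) = \left|1 + G(j\omega)\right| \cdot \left(1 - \frac{t_{sh}}{T}\right)$ (expressed in UI)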
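The jitter tolerance expression can be evaluated directly once a loop gain is chosen. The sketch below assumes the simplest case F(jω) = 1, so that G(jω) = K/(jω) with K = Kpd·Kvco; the loop gain and the ratio tsh/T are hypothetical values, not taken from the slides.

```python
import math

def jtol_ui(freq_hz: float, loop_gain_rad_s: float, t_sh_over_t: float) -> float:
    """JTOL(w) = |1 + G(jw)| * (1 - tsh/T), in UI.

    Assumes F(jw) = 1, so G(jw) = K / (jw) with K = Kpd * Kvco.
    """
    w = 2 * math.pi * freq_hz
    g = loop_gain_rad_s / complex(0.0, w)
    return abs(1 + g) * (1 - t_sh_over_t)

K = 2 * math.pi * 4e6   # hypothetical loop gain (unity-gain frequency of 4 MHz)
for f in (1e3, 1e5, 1e7, 1e9):
    print(f"{f:>8.0e} Hz: {jtol_ui(f, K, 0.3):.3g} UI")
```

At low jitter frequencies the tolerance rises at 20 dB/decade because the loop tracks the input; well above the loop bandwidth it flattens at 1 − tsh/T, the margin left by the retiming latch.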
Jitter Tolerance (3)
[Figure: measured jitter tolerance [UI p-p] versus jitter frequency [Hz], 100 Hz to 100 MHz, logarithmic axes. Bit rate: 10.7 Gb/s; pattern: 2^31−1 PRBS; data in: 50 mV p-p; BER threshold: 10^−12.]
Jitter tolerance > 40 ps p-p at high frequency
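Since the plot axis is in UI while the high-frequency result is quoted in picoseconds, a short conversion may be useful. The helper name `ps_to_ui` is introduced here only for illustration; the 40 ps and 10.7 Gb/s values are the ones quoted above.

```python
def ps_to_ui(jitter_ps: float, bit_rate_bps: float) -> float:
    """Convert jitter in picoseconds to unit intervals at the given bit rate."""
    ui_ps = 1e12 / bit_rate_bps
    return jitter_ps / ui_ps

high_freq_jtol_ui = ps_to_ui(40.0, 10.7e9)
print(f"{high_freq_jtol_ui:.3f} UI p-p")
```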
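Jitter Transfer
[Figure: repeater made of O ⇒ E, RX, TX and E ⇒ O blocks, with jitter transfer functions $H_{RX}(j\omega)$ and $H_{TX}(j\omega)$.]
n repeaters: $\left[H_{RX}(j\omega) \cdot H_{TX}(j\omega)\right]^n$
Jitter peaking should be minimized.
Jitter Transfer Mask:
[Figure: mask with 0.1 dB maximum peaking up to corner frequency f0 and −20 dB/decade rolloff above it.]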
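The reason jitter peaking must be kept small follows from the n-th power in the cascade expression: peaking in dB adds from repeater to repeater. The sketch below uses the 0.1 dB mask limit from the slide; the repeater counts are chosen only for illustration, and identical repeaters are assumed.

```python
def cascaded_peaking_db(peaking_db_per_repeater: float, n: int) -> float:
    """Peaking of n identical cascaded repeaters: |H|**n means dB values add."""
    return n * peaking_db_per_repeater

def cascaded_gain(peaking_db_per_repeater: float, n: int) -> float:
    """Linear jitter amplification at the peaking frequency after n repeaters."""
    return 10 ** (cascaded_peaking_db(peaking_db_per_repeater, n) / 20)

for n in (1, 10, 100):   # repeater counts chosen only for illustration
    print(n, round(cascaded_peaking_db(0.1, n), 1), round(cascaded_gain(0.1, n), 3))
```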
Electrical-to-Optical Interfaces (1)
Electrical to optical (TX):
[Figure: MUX → laser driver → laser diode or Vertical Cavity Surface Emitting Laser (VCSEL), driven by current IL; optical output power versus IL with threshold current $I_{th} \sim 10$ mA; axis label T.]
Electrical-to-Optical Interfaces (2)
Electroabsorption modulator
Operates by making optical material more or less absorptive.
[Figure: modulator with optical input Pin, optical output Pout and control voltage VM; plot of $P_{out}/P_{in}$ versus VM; Vswing ~ 3 V.]
Electrical-to-Optical Interfaces (3)
Mach-Zehnder interferometer:
• Invented in 1890s
• Used to precisely measure optical phase shift of materials.
• By using constructive/destructive interference, can be used as a laser modulator.
Mach-Zehnder modulator:
[Figure: plot of $P_{out}/P_{in}$ versus VM; Vswing ~ 6 V.]
Electrical-to-Optical Interfaces (4)
Optical pulsewidth distortion commonly occurs due to:
• Unequal turn-on/turn-off times of laser diode
• Non-ideal bias voltage in modulators.
[Figure: electrical signal (IL or VM) and the resulting optical output with distorted pulsewidth; results in DCD.]
Additional circuitry to correct pulsewidth is often added to system...
Electrical-to-Optical Interfaces (5)
Optical output control circuit:
[Figure: laser diode with modulation input VM and bias current IB; monitor diode, resistor R and reference Vref in a feedback loop.]
Feedback sets $I_B = \frac{V_{ref}}{R}$
Optical Receiver Block Diagram
[Figure: receiver chain with blocks O ⇒ E, TIA, LA, EQ, CDR, DMUX. Signal levels marked along the chain: ≈ −18 dBm (optical input), ≈ 10 µA (photodetector current), ≈ 10 mV p-p, ≈ 400 mV p-p.]
Optical-to-Electrical Interfaces (1)
p-i-n photodetector structure:
[Figure: p, i and n layers under reverse bias VR ~ 5 V; the applied optical signal produces the resulting electrical current.]
circuit model:
[Figure: current source $I_D$ in parallel with capacitance $C_D$.]
$I_D = \rho \cdot P_{opt}$
$\rho = 0.6 \sim 0.9$ A/W, $C_D \sim 400$ fF
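The photodetector relation I_D = ρ·P_opt can be checked against the signal levels quoted for the optical receiver (about −18 dBm at the input and about 10 µA of photocurrent). The helper names below are introduced only for illustration; the responsivity range is the one given above.

```python
def dbm_to_watts(p_dbm: float) -> float:
    """Convert optical power from dBm to watts."""
    return 1e-3 * 10 ** (p_dbm / 10)

def photocurrent(p_dbm: float, responsivity_a_per_w: float) -> float:
    """I_D = rho * P_opt."""
    return responsivity_a_per_w * dbm_to_watts(p_dbm)

for rho in (0.6, 0.9):
    print(rho, f"{photocurrent(-18.0, rho) * 1e6:.1f} uA")
```

With ρ between 0.6 and 0.9 A/W, an input of −18 dBm gives a photocurrent in the range of roughly 10 to 14 µA, in line with the quoted level.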
Optical-to-Electrical Interfaces (2)
Eye diagram of PRBS resulting from 96 km of single-mode fiber and photodetector.
• DCD & ISI are evident.
• Noise is higher at logic 1 than at logic 0.
Photodetector noise: $\overline{i_n^2} = 4q\rho P_{opt} \cdot \Delta f$
Transimpedance Amplifier (TIA)
Used to convert photodetector current into voltage.
[Figure: amplifier with gain A0 and feedback resistor R; input current Iin from photodetector with capacitance Cd; reference Vref; output Vout.]
Low-impedance input node maintains nearly constant detector voltage ⇒ good linearity.
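Transimpedance Amplifier (2)
[Figure: TIA with amplifier gain A0, feedback resistor R, photodetector capacitance Cd, amplifier input capacitance Cg, reference Vref, input current Iin and output Vout.]
Transimpedance: $Z_T \equiv \frac{V_{out}}{I_{in}} = -R_f \cdot \frac{1}{1 + \frac{1}{A_0}}$
Input impedance: $Z_{in} \equiv \frac{V_-}{I_{in}} = \frac{R_f}{1 + A_0}$
Loop gain: $A(s) \cdot f(s) = \frac{A_0}{(1 + s/p_1) \cdot (1 + s/p_2) \cdots} \cdot \frac{1}{1 + sR_f(C_d + C_g)}$
The additional pole from $R_f(C_d + C_g)$ limits closed-loop BW.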
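A quick evaluation of the transimpedance and input-impedance expressions shows how finite amplifier gain affects both. Rf, A0 and Cg below are hypothetical values; Cd = 400 fF is the photodetector capacitance quoted earlier. The function names are introduced only for illustration.

```python
import math

def tia_dc(r_f: float, a0: float):
    """Return (Z_T, Z_in) at dc for a shunt-feedback TIA with amplifier gain A0."""
    z_t = -r_f / (1 + 1 / a0)
    z_in = r_f / (1 + a0)
    return z_t, z_in

def feedback_pole_hz(r_f: float, c_d: float, c_g: float) -> float:
    """Pole frequency contributed to the loop gain by Rf * (Cd + Cg)."""
    return 1 / (2 * math.pi * r_f * (c_d + c_g))

R_F, A0 = 1e3, 10.0                     # hypothetical values
z_t, z_in = tia_dc(R_F, A0)
f_pole = feedback_pole_hz(R_F, 400e-15, 100e-15)   # Cg = 100 fF is assumed
print(round(z_t, 1), round(z_in, 1), f"{f_pole / 1e6:.0f} MHz")
```

The loop-gain pole sits well below 10 GHz for these values, so the closed-loop bandwidth depends on A0 and on the amplifier's own poles p1, p2, which this dc sketch does not model.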
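Transimpedance Amplifier (3)
Noise analysis:
[Figure: TIA with amplifier input-referred noise voltage $v_{ni}$, feedback-resistor noise current $i_{nR}$, feedback resistor R and output $v_{out}$.]
$\overline{v_{out}^2} = \overline{v_{ni}^2} + \overline{i_{nR}^2} \cdot R_f^2 = \overline{i_{eq}^2} \cdot R_f^2 \;\Rightarrow\; \overline{i_{eq}^2} = \frac{\overline{v_{ni}^2}}{R_f^2} + \overline{i_{nR}^2}$
$\overline{v_{ni}^2} = \frac{4kT\gamma}{g_m} \cdot \Delta f + \frac{K_f}{C_g} \cdot \frac{\Delta f}{f}$
$\overline{i_{nR}^2} = \frac{4kT}{R_f} \cdot \Delta f$
$\Rightarrow\; \overline{i_{eq}^2} = \frac{4kT}{R_f} \cdot \left(1 + \frac{\gamma}{g_m R_f} + \frac{K_f}{4kT\,C_g R_f f}\right) \cdot \Delta f$
Good sensitivity requires:
• Large Rf
• Large Cg
• Large gm
Tradeoff with BW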
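The dependence of sensitivity on Rf and gm can be seen by evaluating the thermal part of the input-referred noise expression. The sketch below omits the 1/f term, integrates over an idealized brick-wall bandwidth, and uses hypothetical values for Rf, gm, γ and the bandwidth.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant, J/K

def ieq_density(r_f: float, g_m: float, gamma: float = 1.0, temp_k: float = 300.0) -> float:
    """Thermal part of the input-referred noise current density, A^2/Hz.

    Implements (4kT/Rf) * (1 + gamma / (gm * Rf)); the 1/f term is omitted.
    """
    return 4 * K_B * temp_k / r_f * (1 + gamma / (g_m * r_f))

def rms_noise_current(r_f: float, g_m: float, bandwidth_hz: float, **kw) -> float:
    """Integrate the flat density over a brick-wall bandwidth (an idealization)."""
    return math.sqrt(ieq_density(r_f, g_m, **kw) * bandwidth_hz)

for r_f in (500.0, 1e3, 2e3):            # hypothetical Rf values
    i_n = rms_noise_current(r_f, g_m=20e-3, bandwidth_hz=7e9)
    print(r_f, f"{i_n * 1e6:.2f} uA rms")
```

Larger Rf lowers the noise current, but as noted above it also lowers the pole set by Rf(Cd + Cg), which is the tradeoff with bandwidth.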
Transimpedance Amplifier (4)
[Figure: TIA with a common-gate input device between Cd and Cg; feedback resistor R.]
• Cd decoupled from feedback network
• Common-gate device increases noise
[Figure: TIA with inductor LB between Cd and Cg; feedback resistor R.]
• LB provides decoupling (series peaking); could be realized by bondwire.
Limiting Amplifiers
Requirements:
• Amplify input signal with variable amplitude (~10-30 mV) to a fixed-amplitude (~450 mV) output.
• Sufficiently high bandwidth
• Sufficiently low noise
• Low offset voltage
[Figure: n cascaded stages, each with transfer function A(s), between Vin and Vout.]
Single stage: $A(s) = \frac{A_0}{1 + s/p}$
n-stage amplifier: $A_n(s) = \left(\frac{A_0}{1 + s/p}\right)^n$
Overall gain: $A_0^n$
Overall bandwidth: $p \cdot \sqrt{2^{1/n} - 1}$
7-Stage Limiting Amplifier Example (1)
Each stage uses shunt-peaked CML buffer with: A0 = 5.5 dB, BW = 10 GHz
[Figure: $|A(j\omega)|$ (dB) versus frequency, 100 MHz to 100 GHz, at the 1st stage output and the 7th stage output.]
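The overall gain and bandwidth expressions for n identical stages can be applied to the 7-stage example. Note the assumption: the bandwidth formula is for single-pole stages, whereas the example uses shunt-peaked buffers, so the result is only a rough estimate.

```python
import math

def cascade(a0_db: float, bw_hz: float, n: int):
    """Overall gain (dB) and -3 dB bandwidth of n identical single-pole stages."""
    gain_db = n * a0_db                          # A0**n expressed in dB
    bw = bw_hz * math.sqrt(2 ** (1 / n) - 1)     # p * sqrt(2^(1/n) - 1)
    return gain_db, bw

gain_db, bw = cascade(5.5, 10e9, 7)
print(f"{gain_db:.1f} dB, {bw / 1e9:.2f} GHz")
```

Under the single-pole assumption, seven stages give 38.5 dB of gain but shrink the bandwidth to roughly a third of the per-stage value, which is one motivation for bandwidth-extension techniques such as shunt peaking.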
7-Stage Limiting Amplifier Example (2)
[Figure: waveforms at the 1st, 6th and 7th stage outputs, shown for input amplitude = 20 mV p-p and for input amplitude = 40 mV p-p.]
7-Stage Limiting Amplifier Example (3)
Input amplitude = 20 mV p-p; input-referred offset of 5 mV applied
[Figure: waveforms at the 1st, 6th and 7th stage outputs.]
$\overline{V}_{out} \approx A_0^n \cdot V_{OS}$
Offset-cancellation circuitry required!
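Limiting Amplifier Offset Compensation
[Figure: n-stage amplifier core with input-referred offset VOS; the output Vout is fed back through a lowpass filter (RF, CF) to an offset compensation stage (transistors M1, loads RL) that combines it with vin to form V1.]
H.-Y. Huang et al., “A 10-Gb/s inductorless CMOS limiting amplifier with third-order interleaving active feedback,” JSSC, May 2007, pp. 1111-1120.
amplifier circuit:
$V_{out} = A_0^n \cdot V_1 \;\Rightarrow\; \overline{V}_{out} = A_0^n \cdot \overline{V}_1, \quad v_{out} = A_0^n \cdot v_1$
compensation circuit:
$V_1 = g_{m1}R \cdot \left[(v_{in} + V_{OS}) - \overline{V}_{out}\right] \;\Rightarrow\; \overline{V}_1 = g_{m1}R \cdot \left(V_{OS} - \overline{V}_{out}\right), \quad v_1 = g_{m1}R \cdot v_{in}$
$\Rightarrow\; \overline{V}_{out} = \frac{g_{m1}RA_0^n}{1 + g_{m1}RA_0^n} \cdot V_{OS} \approx V_{OS}, \quad v_{out} = g_{m1}RA_0^n \cdot v_{in}$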
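The benefit of the feedback loop can be seen by comparing the dc output offset with and without compensation. The sketch below is a linear model that ignores limiting; the 5 mV offset and 5.5 dB per-stage gain come from the 7-stage example, while gm1·R = 1 is an assumption.

```python
def output_offset(v_os: float, a0: float, n: int, gm1_r: float = 0.0) -> float:
    """DC output offset of an n-stage amplifier (linear model, no limiting).

    gm1_r = 0 models the uncompensated core (A0**n * V_OS); gm1_r > 0 applies
    the feedback result gm1*R*A0**n / (1 + gm1*R*A0**n) * V_OS.
    """
    core = a0 ** n
    if gm1_r == 0.0:
        return core * v_os
    loop = gm1_r * core
    return loop / (1 + loop) * v_os

A0 = 10 ** (5.5 / 20)     # 5.5 dB per stage, from the 7-stage example
uncomp = output_offset(5e-3, A0, 7)
comp = output_offset(5e-3, A0, 7, gm1_r=1.0)   # gm1*R = 1 is an assumption
print(f"uncompensated: {uncomp * 1e3:.0f} mV, compensated: {comp * 1e3:.2f} mV")
```

Without compensation the 5 mV offset is amplified to roughly the full output swing, which would saturate the final stages; with the loop closed the dc output offset stays just below the input-referred offset itself.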