IBM Research GmbH Zurich Research Laboratory Rüschlikon, Switzerland Thomas Toifl TWEPP 2012 Sept. 20 th , 2012 Low-power High-Speed CMOS I/Os: Design Challenges and Solutions


IBM Research GmbH, Zurich Research Laboratory, Rüschlikon, Switzerland

Thomas Toifl

TWEPP 2012, Sept. 20th, 2012

Low-power High-Speed CMOS I/Os: Design Challenges and Solutions


I/O Link Technology group @ IBM ZRL

What we do

- I/O PHY circuit design

- backplane and chip-to-chip (e.g. processor, memory)

[Figure: backplane link between two CPUs. Labels: Card, Package, CPU, Board, Socket, Connector, 6” trace, 24” trace.]

Technical goals to meet

- Distance: 10-100cm

- Attenuation: 10-40dB

- Data rate: 3-40 Gbit/s

- Energy/bit: <10pJ/bit

- Chip area: 0.1 mm²
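The energy target above is given in pJ/bit, while later slides quote mW/Gbps; the two units are numerically identical. A minimal sketch of the conversion follows (the function names are illustrative and not from the talk):

```python
def mw_per_gbps_to_pj_per_bit(mw_per_gbps: float) -> float:
    """1 mW/Gbps = 1e-3 J/s / 1e9 bit/s = 1e-12 J/bit = 1 pJ/bit."""
    return mw_per_gbps


def lane_power_mw(energy_pj_per_bit: float, data_rate_gbps: float) -> float:
    """Power of one lane in mW: (pJ/bit) * (Gbit/s) = mW."""
    return energy_pj_per_bit * data_rate_gbps


# Upper end of the stated goals: 10 pJ/bit at 40 Gbit/s.
print(lane_power_mw(10.0, 40.0), "mW")
```

At the upper end of the stated ranges a single lane would therefore dissipate a few hundred milliwatts, which is why the energy per bit is a primary design goal.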


General research directions

- Data rate (e.g. 28 Gb/s)
- Power efficiency (e.g. 5 pJ/bit)
- Distance (= allowable channel attenuation ~ equalizer complexity) (e.g. 28 Gbit/s @ 30 dB channel attenuation measured at fbaud/2)

Side conditions: chip area, testability, standard compliance, reliability etc.


Outline

▪ Introduction
  • High-Speed I/Os in the wireline link space
  • Equalization techniques
▪ Low-power Transmitter
  • SST vs. CML driver
  • Ultra-low power TX → Avoid TX-FFE
▪ Low-power Receiver
  • Power-optimal amplification
  • Incomplete Settling / Integrating Buffer
  • Low-power DFE Architecture
  • Low-power clock generation
▪ Digital approaches
  • ADC
  • DFE implementations
  • Analog vs. Digital


High-speed I/Os compared with other wireline links

                        | High-speed serial (backplane or chip-to-chip) | Gbit Ethernet 1000Base-T (802.3ab) | 10Gbit Ethernet 10GBase-T (802.3an)
Bidirectional           | -                                             | YES                                | YES
Data rate/lane [Gbps]   | 3-20                                          | 2 * 0.25                           | 2 * 2.5
Rate/channel BW (HSS=1) | 1                                             | 2                                  | 10
Coding                  | NRZ                                           | PAM-5 + Trellis                    | PAM-16 + LDPC
Echo canc               | -                                             | 60 taps                            | >200 taps
Equalization            | ≤5 tap FFE, ≤15 tap DFE                       | 12 tap FFE, 14 tap DFE             | 16 tap THP
Energy/bit              | x1                                            | x10                                | x50
X-talk canc             | -                                             | 75 tap NEXT                        | NEXT+FEXT

Case #1 High-speed I/Os (up to now): Simple, analog, rather low-power
Case #2 Ethernet over copper: Digital (ADC) + lots of signal processing

→ Goal: More complex equalization at low power


© 2012 IBM Corporation

I/O Standards Roadmaps versus Data Rate

▪ Current standards mostly in the 10-14 Gb/s range
  - OIF CEI pushing aggressively to 25G, still in “early adoption” phase
▪ Expect most “major” standards to move to the 25 Gb/s regime within the next 12-18 months
  - Ethernet, InfiniBand, FibreChannel all in development at those speeds

[Figure: bar chart of data rate in Gb/s (axis 0 to 60) for FibreChannel, Ethernet, InfiniBand, PCI-Express, OIF CEI and SAS; bars marked Available, In Development and Proposed.]


Background: Limited-Bandwidth Channels Create Inter-Symbol Interference; Equalization Required to Support High Data Rates

▪ Dispersive nature of channel causes significant spreading of data pulse
▪ Adjacent symbols smear into each other, resulting in inter-symbol interference (ISI) and potential data errors

[Figure: received waveform for the bit pattern 0 0 1 0 1 0 0 0 0 1 1 1 1 0, shown against the decision threshold.]
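The smearing described above can be reproduced with a toy model: each transmitted 1 contributes one copy of the channel pulse response, and the receiver compares the sum against a fixed threshold. This is only a sketch; the pulse response values are hypothetical and not taken from the talk, while the bit pattern is the one shown on the slide.

```python
def channel_output(bits, pulse):
    """Superpose one copy of the pulse response per transmitted 1 bit."""
    out = [0.0] * (len(bits) + len(pulse) - 1)
    for n, b in enumerate(bits):
        for k, p in enumerate(pulse):
            out[n + k] += b * p
    return out


# Hypothetical pulse response of a lossy channel, one sample per UI:
# main cursor 0.45 followed by three post-cursors (sum = 1.0).
PULSE = [0.45, 0.3, 0.15, 0.1]
BITS = [0, 0, 1, 0, 1, 0, 0, 0, 0, 1, 1, 1, 1, 0]
THRESHOLD = 0.5  # midway between the settled 0 and 1 levels

rx = channel_output(BITS, PULSE)[:len(BITS)]
decisions = [1 if v > THRESHOLD else 0 for v in rx]
errors = [n for n, (b, d) in enumerate(zip(BITS, decisions)) if b != d]
print("bit positions decided wrongly:", errors)
```

With this pulse, isolated 1s and the first 1 of a run do not reach the threshold, and the 0 after a long run of 1s is lifted above it, which is the failure mode that equalization has to remove.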


Architecture of current high-speed I/Os

[Figure: link block diagram. TX: clock generation, feed-forward equalizer (FFE) with delay elements T and taps c(-1), c(0), c(1), c(2) acting on d0(n), driver with output VTx. Channel. RX: input VRx, continuous-time linear equalizer (CTLE), decision-feedback equalizer (DFE) with delay elements T (x8) and taps h(1), h(2), h(3), ..., h(8) acting on d1(n-1) ... d1(n-8), clock tracking loop.]
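The TX feed-forward equalizer in this architecture is a short FIR filter with taps c(-1), c(0), c(1), c(2). A minimal behavioural sketch, assuming symbols of ±1 and hypothetical coefficient values (none are given on the slide):

```python
def tx_ffe(symbols, taps, precursors=1):
    """FIR filter: out[n] = sum over k of c(k) * symbols[n - k],
    where taps = [c(-precursors), ..., c(0), c(1), ...]."""
    out = []
    for n in range(len(symbols)):
        acc = 0.0
        for i, c in enumerate(taps):
            k = i - precursors          # tap index, starts at -precursors
            idx = n - k                 # symbol this tap multiplies
            if 0 <= idx < len(symbols):
                acc += c * symbols[idx]
        out.append(acc)
    return out


# Hypothetical c(-1), c(0), c(1), c(2); sum of |c| = 1 keeps the peak swing fixed.
TAPS = [-0.1, 0.6, -0.2, -0.1]
print(tx_ffe([+1, +1, +1, +1, -1, +1], TAPS))
```

A long run of identical symbols settles at sum(TAPS), which is smaller than the swing right after a transition: the filter de-emphasizes low frequencies relative to high ones.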


Low-power Design Areas

[Figure: link block diagram (TX latch, clock gen, DFE feedback, data path) marking the low-power design areas:]
- Transmitter
- Input data path: amplifier / CTLE / DFE summation
- Sampling & amplification
- DFE architecture
- Clock generation for CDR



CML vs. voltage-mode (SST) driver

CML (Current mode logic)
• 1 Vppd swing
• Load current: +/- 5 mA
• Total current: 20 mA

SST (Source-Series Terminated)
• 1 Vppd swing
• Load current: +/- 5 mA
• Total current: 5 mA

- Power is proportional to output swing in both cases
- SST driver allows different termination options (differential, to GND, to VDD)
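The 20 mA versus 5 mA figures follow from simple termination arithmetic. The sketch below assumes ideal 50 Ω single-ended impedances, a 100 Ω differential far-end termination and a doubly terminated CML stage; these assumptions are not spelled out on the slide.

```python
def load_current(vppd, z0=50.0):
    """Peak current through the 2*z0 differential far-end termination."""
    return (vppd / 2) / (2 * z0)


def cml_supply_current(vppd, z0=50.0):
    """Tail current of a doubly terminated CML driver.

    Each output sees z0 (on-chip load) in parallel with z0 (line), so the
    single-ended swing is I_tail * z0 / 2 and vppd is twice that swing.
    """
    return vppd / z0


def sst_supply_current(vppd, z0=50.0):
    """Supply current of an SST driver with differential termination.

    The current flows through z0 (pull-up branch), 2*z0 (load) and
    z0 (pull-down branch) in series from a supply equal to vppd.
    """
    return vppd / (4 * z0)


for name, fn in [("load", load_current), ("CML", cml_supply_current),
                 ("SST", sst_supply_current)]:
    print(f"{name}: {fn(1.0) * 1e3:.1f} mA")
```

Under these assumptions the SST driver delivers all of its supply current to the load, whereas the CML driver burns four times the load current.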


Half-rate SST Driver

C. Menolfi et al., "A 16Gb/s Source-Series Terminated Transmitter in 65nm SOI," ISSCC 2007.

- Transistors are small due to large gate overdrive

- Small input capacitance → low power in pre-driver


TX with 4-tap Feed-forward equalizer (FFE)

- 3.6 mW/Gbps @ 16 Gb/s and 1V swing

C. Menolfi et al., "A 16Gb/s Source-Series Terminated Transmitter in 65nm SOI," ISSCC 2007.

[Figure: FFE waveform example.]


28 Gb/s SST-TX

▪ Operating speed: up to 28 Gb/s at VDDmin = 0.95 V
▪ 4-tap bit-rate FIR feed-forward equalization (FFE)
▪ +/- 20 ps programmable True/Comp. output skew (enabled by SST concept)
▪ +/- 5% duty-cycle control
▪ Adjustable driver impedance
▪ Full ESD compliance: 2 kV HBM, 100 V MM, 250 V CDM
▪ High-frequency impedance matching with on-chip T-coils, < -10 dB return loss
▪ Power consumption = 7 mW/Gbps (175 mW)

[Figure: layout showing T-coils and ESD (dimension labels 125 µm, 180 µm, 290 µm) and measured eye diagram (720 mV, 35.7 ps).]

C. Menolfi et al., "A 28Gb/s Source-Series Terminated Transmitter in 32nm SOI", ISSCC 2012.


Ultra-Low-power Transmitter Concept

[Figure: link block diagram: TX with NO feed-forward equalizer (FFE), channel, continuous-time linear equalizer (CTLE), n-tap decision-feedback equalizer (DFE), received data.]

• Equalization done on RX side only
• Avoid TX-FFE
  → (1) Low power (1 mW/Gbps for TX for 1 V swing)
  → (2) Less amplification of TX clock jitter (see next slides)


Jitter Amplification in High-Loss Channels

Case I: Low-loss channel

Case II: High-loss channel
→ Jitter is distributed along multiple UIs → Jitter amplification

[Figure: waveforms for both cases; labels: jitter event, voltage error in sampling point.]

Case II can be converted to Case I by
(1) Continuous-time linear equalizer (CTLE) at RX side
(2) Continuous-time linear equalizer (CTLE) at TX side
(3) Discrete-time RX FFE

But not: TX FFE (since it is agnostic of the actual jitter value)
→ TX FFE is sub-optimal and should be replaced with (1)-(3) if possible


Jitter Amplification: Comparison TX FFE vs. RX FFE

[Figure: simulation results for TX-FFE versus RX-FFE; block diagrams of both links (TX, FFE, channel, RX) with clock jitter applied at the TX. Courtesy Troy Beukema.]

- 25% duty cycle variation applied = high-frequency TX jitter at fb/2
- Same equalizer coefficients for TX-FFE and RX-FFE

→ RX-FFE suffers much less from TX jitter amplification than TX-FFE
→ Should make use of this fact in future standard definitions



Power optimal Amplification

- Energy/bit = Power * Delay

- Regenerative amplification most power efficient

J. Wu and B. Wooley, "A 100-MHz Pipelined CMOS Comparator," JSSC 1988.

[Figure, from [14]: power × delay versus amplification for SPA = single-pole amp, OMA = optimized multi-pole amp, RSA = regenerative amp.]
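Why regeneration wins can be seen from two first-order models; this is a simplified sketch and not the analysis of the cited paper. A regenerative latch grows its output as exp(t/τ), whereas a single-pole stage with τ = C/gm can at best integrate, giving a gain of t/τ.

```python
import math


def regenerative_delay(gain, tau=1.0):
    """Latch output grows as exp(t / tau): gain A needs t = tau * ln(A)."""
    return tau * math.log(gain)


def single_pole_min_delay(gain, tau=1.0):
    """Best case for a single-pole stage (load resistance -> infinity):
    v_out / v_in = t / tau, so gain A needs at least t = A * tau."""
    return tau * gain


A = math.exp(5)  # amplification used in the latch comparison later in the talk
print("regenerative:", regenerative_delay(A), "tau")
print("single pole :", round(single_pole_min_delay(A), 1), "tau")
```

At equal gm and load (equal power to first order), the delay, and hence the power-delay product, differs by more than an order of magnitude for large amplification.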


CML vs. StrongARM (=DCVS) comparator latch

[Figure: schematics of the CML latch (vin, vinb, vout, voutb, clock φ, vbias) and the StrongARM latch (vin, vinb, vout, voutb, clock φ).]

CML latch: $\tau_i = \dfrac{C_n}{gm_n - gds_n - 1/R} \cong \dfrac{2\,C_n}{0.9\,gm_n}$

StrongARM latch: $\tau_i = \dfrac{C_n + C_p}{gm_n + gm_p - gds_n - gds_p} \cong \dfrac{1.25\,C_n}{0.9\,gm_n}$

▪ For Vdd = 1 V transistors are in similar operating point (Vgs = Vds = 0.5 V)
▪ CML latch achieves smallest τi when R = 2/gm_n
▪ StrongARM latch intrinsically faster
▪ P/N ratio was >2, now approaching 1 due to strain engineering
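The two regeneration time constants on this slide, read as τi = C_n / (gm_n - gds_n - 1/R) for the CML latch and τi = (C_n + C_p) / (gm_n + gm_p - gds_n - gds_p) for the StrongARM latch, can be compared numerically. The device values below are hypothetical, chosen only so that the slide's approximations 2·C_n/(0.9·gm_n) and 1.25·C_n/(0.9·gm_n) are reproduced: gds = 0.1·gm, C_p = 1.5·C_n, gm_p = gm_n and 1/R equal to half of (gm_n - gds_n).

```python
def tau_cml(c_n, gm_n, gds_n, r_load):
    """Regeneration time constant of the CML latch."""
    return c_n / (gm_n - gds_n - 1.0 / r_load)


def tau_strongarm(c_n, c_p, gm_n, gm_p, gds_n, gds_p):
    """Regeneration time constant of the StrongARM (DCVS) latch."""
    return (c_n + c_p) / (gm_n + gm_p - gds_n - gds_p)


# Hypothetical operating point (illustrative values, not from the talk).
GM_N = 1e-3            # S
GDS_N = 0.1 * GM_N     # S
C_N = 1e-15            # F
R = 2.0 / (GM_N - GDS_N)

t_cml = tau_cml(C_N, GM_N, GDS_N, R)
t_sa = tau_strongarm(C_N, 1.5 * C_N, GM_N, GM_N, GDS_N, GDS_N)
print(f"CML: {t_cml * 1e12:.2f} ps, StrongARM: {t_sa * 1e12:.2f} ps, "
      f"ratio {t_cml / t_sa:.2f}")
```

Under these assumptions the StrongARM latch regenerates about 1.6 times faster, consistent with the claim that it is intrinsically faster.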

CML vs. StrongARM Latch Energy Consumption

▪ CML latch optimized for speed or power
▪ Comparison for given Tcycle and amplification A = exp(5)
▪ StrongARM latch ≈2x as power efficient at Tcycle = 100 ps

[Figure: Energy / (Vdd² CL) (axis 0 to 30) versus Tcycle [ps] (80 to 200) for StrongARM, CMLspeed and CMLoptim; schematics of the CML latch and the StrongARM latch.]


Incomplete Settling → Integrating buffer

[Figure: resistively loaded buffer (vbias, CL, RL) and integrating buffer (vbias, CL; RL→∞, ts/τ→0); plot of power and noise (arbitrary units, 0 to 10) versus normalized settling time ts/τ (0 to 6), with τ = RL·CL and ts = Tcycle/2. From [9].]

▪ First, introduce sampling & reset in data path
▪ For given CL, τ = RL·CL is varied by varying load resistance RL
▪ Power and noise are functions of normalized settling time ts/τ
▪ Power can be reduced significantly for RL→∞ (→ integrator)
▪ But: noise rises significantly when ts/τ < 1.5
▪ Can drive higher load at same noise with lower power

E. Iroaga and B. Murmann, "A 12-Bit 75-MS/s Pipelined ADC Using Incomplete Settling", JSSC 2007.
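The power trend can be sketched with a first-order model, under the assumption (not stated on the slide) that bias power is proportional to gm. After a settling time ts the buffer gain is gm·RL·(1 - exp(-ts/τ)), which equals (gm·ts/CL)·(1 - exp(-x))/x with x = ts/τ. For a fixed gain, CL and ts, the required gm relative to the pure integrator is therefore x / (1 - exp(-x)). The model covers power only; the noise curve is not modelled.

```python
import math


def relative_gm(x):
    """gm needed for a fixed gain, CL and ts, normalized to the integrator
    limit (x = ts/tau -> 0, i.e. RL -> infinity)."""
    if x == 0:
        return 1.0
    return x / (1.0 - math.exp(-x))


for x in (0.0, 1.5, 3.0, 6.0):
    print(f"ts/tau = {x:3.1f}: relative gm = {relative_gm(x):.2f}")
```

In this model the required gm grows almost linearly with ts/τ once the stage is allowed to settle, which is the saving that the integrating buffer exploits.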


Low-Power Decision Feedback Equalizer (DFE)

▪ DFE principle of operation
  • Direct DFE
▪ Low-power DFE
  • Speculative DFE
  • Integrating DFE
  • Switched-cap feedback
▪ Low-power DFE at high speed (28 Gbps)
  • Integrating DFE with fast switched-cap path and 3-tap speculation


Principle of DFE

[Figure: direct DFE: input vin(t), summation node, slicer with output sn = +/-1 (decision dn), delay elements T, feedback taps h-1, h-2, h-3; critical path Δt = 1 UI.]

▪ Fundamental problem
  • Have to close the timing loop in the DFE in 1 UI
▪ Low-power solutions: relax timing → lower supply voltage
  • Use speculative DFE (see next slide)
  • Improve speed of DFE feedback → switched-cap feedback
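A behavioural sketch of the direct DFE loop may help: each decision is the sign of the input minus the post-cursor ISI predicted from the previous decisions, so the feedback has to be available within one UI. The channel taps and data below are hypothetical, and the model is ideal (no noise, perfectly known taps).

```python
def direct_dfe(samples, taps):
    """d[n] = sign(samples[n] - sum over k of taps[k-1] * d[n-k]), d = +/-1."""
    decisions = []
    for v in samples:
        isi = sum(h * decisions[-k]
                  for k, h in enumerate(taps, start=1)
                  if k <= len(decisions))
        decisions.append(1 if v - isi >= 0 else -1)
    return decisions


# Hypothetical channel: main cursor 1.0 and post-cursors h1..h3.
H = [0.6, 0.3, 0.2]
tx = [1, -1, -1, 1, 1, 1, -1, 1, -1, -1]
rx = [tx[n] + sum(h * tx[n - k] for k, h in enumerate(H, start=1) if n - k >= 0)
      for n in range(len(tx))]

plain_slicer = [1 if v >= 0 else -1 for v in rx]
print("DFE recovers data:         ", direct_dfe(rx, H) == tx)
print("plain slicer recovers data:", plain_slicer == tx)
```

Because the post-cursors here sum to more than the main cursor, a plain slicer makes an error after a run of identical symbols, while the DFE cancels that ISI.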


Speculation in DFE

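[Figure: direct DFE (input vin(t), slicer output sn = +/-1, delay elements T, taps h-1, h-2, h-3) with DFE feedback path Δt = 1 UI, compared with a DFE with one speculative tap: two slicers with offsets +h1 and -h1, a 2:1 MUX selected by dn-1, taps h-2, h-3, h-4 fed back directly, feedback path Δt = 2 UI.]

→ Timing relaxed by 1 UI - tMUX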
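Speculation (loop unrolling) of the first tap can be sketched as follows: both possible outcomes are computed in parallel with thresholds +h1 and -h1, and the previous decision only steers a 2:1 MUX. The sample values and the start-up state are hypothetical; the direct one-tap loop is included only to show that both produce the same decisions.

```python
def speculative_dfe(samples, h1):
    """One-tap loop-unrolled DFE: two slicers per sample, a 2:1 MUX picks one."""
    decisions = []
    prev = 1  # assumed start-up value for d[n-1]
    for v in samples:
        if_prev_plus = 1 if v - h1 >= 0 else -1    # slicer with threshold +h1
        if_prev_minus = 1 if v + h1 >= 0 else -1   # slicer with threshold -h1
        prev = if_prev_plus if prev == 1 else if_prev_minus  # 2:1 MUX
        decisions.append(prev)
    return decisions


def direct_one_tap(samples, h1):
    """Reference: feedback is subtracted before the single slicer."""
    decisions = []
    prev = 1
    for v in samples:
        prev = 1 if v - h1 * prev >= 0 else -1
        decisions.append(prev)
    return decisions


samples = [0.4, -1.6, -0.2, 1.1, 0.3, -0.5, 0.9]
print(speculative_dfe(samples, 0.6) == direct_one_tap(samples, 0.6))
```

The price is a second slicer per speculated tap; the gain is that the analog summation for that tap is removed from the critical path.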


CTLE + 1-tap speculative DFE results in power-efficient solution

J. Proesel et al., VLSI 2011

▪ Good solution for smooth channels
▪ Avoids analog DFE summation
▪ Power = 0.7 pJ/bit @ 20 Gb/s


Summation node: Current-Integrating DFE

▪ Integrating buffer instead of resistively loaded buffer
  - Resistors replaced with resettable capacitors
▪ Lower power
▪ 1.4 pJ/bit for 2 taps, 90nm CMOS

M. Park, et al., "A 7 Gb/s 9.3 mW 2-Tap Current Integrating DFE Receiver", ISSCC 2007.


Integrating DFE with I-feedback or switched-cap feedback

▪ I-feedback is replaced with charge feedback → SC-DFE
▪ Enables faster feedback loop (see next slide)

$V_{int} = A\,V_{in} + d_{n-4}\,h_4 + \dots + d_{n-15}\,h_{15}$

[Figure: integrator schematics (vin, clock φ, load capacitors CL) with I-feedback [Park and Bulzacchelli, ISSCC 2007] and with switched-cap feedback, SC-DFE (VLSI 2011), where taps h2 ... h8 are switched by d-2 ... d-8 from Vreg.]

T. Toifl et al., "A 2.6mW/Gbps 12.5Gbps RX with 8-tap Switched-Cap DFE in 32nm CMOS", VLSI 2012


DFE: Current feedback vs. switched-cap feedback

[Figure: timing diagrams for both schemes, showing the clock (INTEGRATE / RESET phases), the outputs vo, the tap feedback hn and the SAMPLE instants.]

Current feedback: previous symbol decision must arrive before integration

SC-feedback: previous symbol decision may arrive later → more margin


SC-DFE DAC Implementation and Measurement

• ±63 steps (6 bit + sign) for taps 2-8
• 1 LSB = 250 aF
• Excellent linearity
• Very small area (no I-DAC needed)
• Caps made of min finger M1+M2
• Size of entire SC-DFE tap < 70 µm² in 32nm (including DAC)

[Figure: measured tap voltage [mV] (-150 to 150) and DNL [LSB] (-1.5 to 1.5) versus code (-64 to 64); layout of one tap (17 µm) with cap cells 16, 16, 16, 8, 4, 1, 2 and SC-DFE sign logic + cap driver.]


Low-power DFE at 28 Gbit/s

▪ 28 Gbps: 1 UI = 35 ps
▪ Half-rate receiver reaches limit
  • Hard to achieve required amplification in integrator
  • 14 GHz CMOS clocking difficult
    - Duty cycle hard to control, need small FO
    - Electromigration requires adding lots of metal in the layout
  → Quarter-rate receiver more power-efficient
▪ Single-tap speculative DFE too slow
  • Feedback time only tfb = 70 ps
  → Three-tap speculation relaxes timing (tfb = 210 ps)
  → Can reduce supply voltage and use higher fan-out
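The timing numbers on this slide follow from the bit period. A small sketch of the arithmetic (function names are illustrative):

```python
def ui_ps(rate_gbps):
    """Unit interval in picoseconds for a given bit rate in Gbit/s."""
    return 1000.0 / rate_gbps


def clock_ghz(rate_gbps, bits_per_clock_cycle):
    """Clock frequency of a receiver that takes several bits per clock cycle."""
    return rate_gbps / bits_per_clock_cycle


print(f"1 UI at 28 Gbit/s  : {ui_ps(28):.1f} ps")
print(f"half-rate clock    : {clock_ghz(28, 2):.0f} GHz")
print(f"quarter-rate clock : {clock_ghz(28, 4):.0f} GHz")
```

Halving the clock frequency again is what makes CMOS clock distribution and duty-cycle control tractable at this data rate.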


3 Tap Speculation

▪ Now 8 latches
▪ Power ~ 150 fJ/bit per latch → moderate penalty
▪ Critical path now 4 UIs
▪ Timing relaxed by 3 UI - 3 tMUX

[Figure: 3-tap speculative DFE: eight slicers on vin(t) with offsets ±h1±h2±h3, a tree of 2:1 MUXes selected by dn-3, dn-2 and dn-1 producing dn = +/-1, delay elements T, remaining taps h-4, h-5, h-6 fed back directly; DFE feedback path Δt = 4 UI.]
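The eight latches correspond to the eight sign combinations of the three speculated taps. A sketch that enumerates them, with hypothetical tap values:

```python
from itertools import product


def speculative_thresholds(taps):
    """Map each sign combination (s1, s2, ...) to the slicer offset
    s1*h1 + s2*h2 + ...; there are 2**len(taps) of them."""
    return {signs: sum(s * h for s, h in zip(signs, taps))
            for signs in product((+1, -1), repeat=len(taps))}


# Hypothetical tap magnitudes h1, h2, h3.
thresholds = speculative_thresholds((0.5, 0.25, 0.125))
for signs, offset in thresholds.items():
    print(signs, offset)
```

The slicer count doubles with every speculated tap, which is why the per-latch energy matters and why speculation is limited to a few taps.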


DFE + Demux slice (1 slice of 4 total)

Actually 10 comparator latches per slice : 8 active + 2 used for calibration

[Figure: one DFE + demux slice. Block labels: T/H, ∫-Amp (clock φint, vin, vint), SC-DFE taps h4, h5, h6, ..., h15 (sgn(h4), PG, XOR), comparator latches L0-L3 with offsets h1±h2±h3 and L4-L7 with offsets -h1±h2±h3, calibration latches LA0 and LA1, latch outputs s0p/n ... s7p/n, sa0p/n, sa1p/n, two 5:1 MUXes (select dp/n-3, dp/n-2), 2:1 MUX (select dp/n-1) giving dp/n0, 10:1 spy MUX, FIFO (shared between slices 0/2 and 1/3), outputs d0, d-1.]


Layout and power breakout @ 30Gb/s

[Figure: layout (250 µm): T-coil & ESD, CTLE, slices 0, 2, 1, 3, SC-DFE (12 cells), digital, clk gen & bias.]

▪ Core DFE size = 200 x 90 µm
▪ Need to calibrate 40 comparator offsets and 48 DFE taps

Power breakdown:
DFE and Demux               | 2.5 mW/Gbps
CTLE                        | 0.2 mW/Gbps
I/Q gen (CML clock div)     | 0.27 mW/Gbps
Clkbuf & CML2CMOS           | 0.13 mW/Gbps
Total                       | 92 mW = 3.1 mW/Gbps

T. Toifl, et al., "A 3.1mW/Gbps 30Gbps Quarter-Rate Triple-Speculation 15-tap SC-DFE RX Data Path in 32nm CMOS", VLSI 2012.


[Figure: measured channel response DD21 [dB] (0 to -50) versus frequency [GHz] (0 to 20), with -30 dB @ 15 GHz; two plots at 30 Gb/s versus phase position [%UI] (0 to 100, vertical axis 0 to -12), one with TX FFE = [0 67 -29] and one with PRBS 31, no TX FFE.]

Power = 92 mW @ VDD = 1.15 V = 3.1 pJ/bit


Low-power RX clock generation

▪ Clock generation for quarter-rate CDR system
▪ Phase-programmable PLL (P-PLL)
▪ Design example:
  • 40 Gb/s RX using P-PLL


Quarter-rate Clock and Data Recovery

[Figure: data stream with bits D0, D1, D2, D3 sampled by clock phases φ0 ... φ7 from the multi-phase clock generator; the CDR loop adjusts the timing.]

4 received bits per clock cycle
2 samples per symbol (data, edge)
→ Need 4x2 clock phases
→ Need to shift 8 clocks simultaneously


Quarter-rate Dual-loop architecture (previous solution)

[Figure: block diagram. Block labels: PLL/DLL polyphase (φref, k phases), 8 phase rotators (8 phases), 8 sampling latches (data, 40 Gbps), 4 data bits, digital loop filter (digital DLL).]

Phase rotators:
- Area
- Power
- Mismatch
- Need at least 2 ref phases


Quarter-rate Dual-loop architecture ⇒ Phase-Programmable PLL (P-PLL)

[Figure: the dual-loop architecture (PLL/DLL polyphase, k phases, 8 phase rotators, 8 sampling latches, digital loop filter / digital DLL) next to the P-PLL architecture, in which a P-PLL (φref) provides the 8 phases to the 8 sampling latches (data, 40 Gbps, 4 data bits) under control of the digital loop filter (digital DLL).]

▪ Phases can be programmed by digital value → Enables digital CDR loop
▪ Provides tight lock to high reference frequency → low jitter


40 Gb/s CDR Architecture using P-PLL

[Figure: CDR block diagram with the P-PLL. Block labels: VCO (phases φ0, φ1, ..., φ14, φ15; 8 phases), φref, XOR phase detector (φ2n, φ2n+1), α-DAC x8 (α0, α1, α2, ..., α7), V/I conv, loop filter, Vc, 288 steps / 4 UI; sampling of data (40 Gbit/s) at 10 GHz into d0-3, e0-3; 4:8 demux giving d0-7, e0-7; early/late signal gen (E/L); digital loop filter (up/dn); α-DAC signal gen (18); PRBS15 checker (error_out).]

[37] T. Toifl et al., "A 72 mW 0.03 mm2 inductorless 40 Gb/s CDR in 65 nm SOI CMOS," ISSCC 2007.
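Reading "288 steps / 4 UI" as 288 programmable phase steps spanning one 10 GHz clock period (4 UI at 40 Gbit/s), which is an interpretation of the figure label rather than a statement from the talk, the phase resolution works out as follows:

```python
def phase_step_ps(rate_gbps, steps, span_ui):
    """Phase step in ps when `steps` codes cover `span_ui` unit intervals."""
    ui_ps = 1000.0 / rate_gbps
    return span_ui * ui_ps / steps


step = phase_step_ps(40, 288, 4)
print(f"{288 // 4} steps per UI, {step:.3f} ps per step")
```

Under that reading the digital CDR loop can move the sampling phases in steps well below one picosecond.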


Chip Photograph and Layout of 40Gb/s RX

▪ Power: 1.8 pJ/bit @ 40 Gbps
▪ Area: <0.03 mm² in 65nm CMOS, no inductors used

[Figure: chip photograph and layout (150 µm x 190 µm; blocks: samplers + 4:8 demux, VCO, phase detect., loop filter, CDR logic + PRBS15 checker (digital); inputs data and φref), and a comparison plot of Tbps/mm² (0 to 1.5) versus Tbps/W (0 to 0.6) for this work [37] (CMOS, 40 Gb/s) and references [23]-[26] (SiGe 45 Gb/s, CMOS 25 Gb/s, CMOS 40 Gb/s, CMOS 44 Gb/s).]



ADC-based I/Os

▪ New standards emerging operating with PAM-4
  • IEEE 802.3bj (100Gb Ethernet over backplane)
  • PAM-4 mode for high channel loss (>40 dB)
▪ Requirements for ADC
  • Rather low resolution (<5 bits)
  • Low latency (for closing CDR loop) → Flash ADC
  • High conversion rates → Interleave Flash ADC → Low input capacitance required

Example: Low-power flash ADC with small input cap


Low-power flash ADC with small input cap

[Figure: ADC block diagram. Block labels: inputs vin, vinb, term + ESD, T/H, integrator, comp slice 0, comp slice 1, ..., comp slice 29, 30x12 bit memory, bubble correction, thermometer-to-binary encoder (5-bit output), clock receiver (clk, clkb), clock generation.]

- Integrating buffer drives big load with small input capacitance
- Comparator slice consists of offset-adjustable SenseAmp latch (no reference ladder required)


Low-power flash ADC results

[Figure: layout, 120 µm x 65 µm: buf, 30 threshold-adjustable DCVS latches, latch threshold memory.]

Technology          | 32nm CMOS SOI
Sampling frequency  | 5 GHz
ENOB @ DC           | 4.4
ENOB @ fs/2         | 4.1
DNL, INL            | <0.3 LSB, <0.3 LSB
Power consumption   | 19 mW @ 1 V supply
FOM                 | 230 fJ/conversion step
Input cap           | 50 fF

▪ ADC cell can be time-interleaved to achieve higher conversion rates
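The quoted figure of merit can be cross-checked with the commonly used Walden definition, FOM = P / (2^ENOB · fs). Which ENOB value the slide uses is not stated; the sketch assumes the value at fs/2.

```python
def walden_fom(power_w, enob, fs_hz):
    """Energy per conversion step in joules: P / (2**ENOB * fs)."""
    return power_w / (2 ** enob * fs_hz)


fom = walden_fom(19e-3, 4.1, 5e9)  # 19 mW, ENOB @ fs/2, 5 GHz sampling
print(f"{fom * 1e15:.0f} fJ per conversion step")
```

This gives roughly 220 fJ per conversion step, in the neighbourhood of the 230 fJ stated on the slide; the small difference is consistent with rounding of the ENOB value.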


Digital DFE

▪ Relatively simple and low-power
• Case I: low speed (<5 Gbit/s)
  • DFE loop can be implemented by digital adder
  • Critical path is adder and comparator
  • Taps can also be combined in LUT

[Figure: ADC output sn, digital adder summing taps c1, c2, c3, c4, c5 weighted by delayed decisions (delay elements T), comparator (≥) producing dn.]


Digital DFE

▪ Relatively simple and low-power
• Case II: small number of DFE taps (n ≤ 5)
  • Digital loop unrolling requires 2^n comparators

[Figure: ADC output sn compared against 32 thresholds c00000, c00001, c00010, c00011, ..., c11111; a 32:1 MUX addressed by the previous decisions (delay elements T) selects dn.]

- Comparator power: <100 µW/Gbps
- Critical path: 2:1 MUX
- Hard problem is long DFE (e.g. 15 taps) at high speed (e.g. 28 Gb/s)


Analog vs. Digital: RX power comparison

▪ Power: Optimized analog RX with 15-tap DFE
  • Clock path: 2 mW/Gbps
  • Linear equalizer (CTLE): 1.5 mW/Gbps
  • DFE: 3 mW/Gbps for 15-tap DFE @ 25 Gb/s
  • Total: 6.5 mW/Gbps
▪ Power: ADC-based RX with 15-tap digital DFE
  • Clock path: 1 mW/Gbps (no edge samples required with Mueller-Müller CDR)
  • Linear equalizer (CTLE): 1.5 mW/Gbps
  • ADC: 4 mW/Gbps
  • DSP: FFE: 3 mW/Gbps + DFE: 3 mW/Gbps
  • Total: 12.5 mW/Gbps

→ ADC power alone is ≥ DFE power
→ Analog will stay the lowest-power solution for DFE equalization

Why? Analog is well suited for DFE: fast summation with moderate accuracy
BUT: gap will shrink as technology scales
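The two budgets can be tallied directly from the per-block numbers on this slide; the dictionary keys are just labels for the listed blocks.

```python
# Per-block power in mW/Gbps, as listed on the slide.
ANALOG_RX = {"clock path": 2.0, "CTLE": 1.5, "15-tap DFE": 3.0}
ADC_RX = {"clock path": 1.0, "CTLE": 1.5, "ADC": 4.0,
          "DSP FFE": 3.0, "DSP DFE": 3.0}

analog_total = sum(ANALOG_RX.values())
adc_total = sum(ADC_RX.values())
print(f"analog: {analog_total} mW/Gbps, ADC-based: {adc_total} mW/Gbps, "
      f"ratio {adc_total / analog_total:.2f}")
```

The ADC-based receiver comes out at close to twice the analog budget, and the ADC alone already exceeds the analog DFE.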


Summary

▪ Low-power TX techniques
  • SST drivers: low-power, multi-standard
  • Avoid TX-FFE: low power, less jitter amplification
▪ Low-power RX techniques
  • Regenerative amplification, StrongARM (DCVS) latch
  • Integrating DFE using switched-cap feedback
  • P-PLL phase generation
▪ Digital (ADC-based) I/O approaches
  • ADC: Flash ADC <4 mW/Gbps
  • Analog is lower-power solution for long DFE at high speeds
  • High-speed I/Os will follow route of Gb Ethernet over UTP


Acknowledgements

IBM
▪ Christian Menolfi
▪ Peter Buchmann
▪ Marcel Kossel
▪ Matthias Brändli
▪ Thomas Morf
▪ Pier Andrea Francese
▪ John Bulzacchelli
▪ Troy Beukema
▪ Tod Dickson
▪ Dan Dreps
▪ Steve Baumgartner
▪ Fran Keyser
▪ John Berqkvist
▪ Martin Schmatz

Miromico
▪ Robert Reutemann
▪ Michael Ruegg
▪ Daniele Gardellini
▪ Andrea Prati
▪ Giovanni Cervelli


Thank you for your attention!


Additional Slides


Summary of Equalization Techniques

▪ Feed-forward Equalization (FFE)
  - FIR filter, usually implemented at TX side
▪ Continuous-Time Linear Equalizers (CTLE)
  - Increase high-frequency gain and/or decrease low-frequency gain to compensate for low-pass channel characteristics.
▪ Decision Feedback Equalization (DFE)
  - Feed back previous bit decisions to cancel postcursor ISI caused by those bits.


Source-synchronous Links Application Area

Important parameters:
• throughput (Gb/s per pin)
• power (mW per Gb/s or, equivalently, pJ per bit)
• limited die area (µm² per Gb/s, fit beneath C4 balls)
• latency (memory links, SMP links)
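The equivalence between mW per Gb/s and pJ per bit follows directly from the units. A minimal sketch is shown below; the function name is hypothetical, and the example figures are taken from the title of reference [29].

```python
def energy_per_bit_pj(power_mw, data_rate_gbps):
    """Energy per bit in pJ.

    1 mW / (1 Gb/s) = 1e-3 J/s / 1e9 bit/s = 1e-12 J/bit = 1 pJ/bit,
    so the ratio of the two numbers is already in pJ/bit.
    """
    return power_mw / data_rate_gbps


# 14 mW at 6.25 Gb/s, the figures quoted in the title of reference [29].
example = energy_per_bit_pj(14.0, 6.25)
```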


Switched-cap DFE principle

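[Figure: single SC-cap DFE tap. A capacitive DAC (binary caps 1C, 2C, 4C, 8C and thermometer caps 16C, 16C, 16C) is switched between Vreg and 0 onto the output nodes Vo/Vob with load CL. Tap capacitors C2, C3, C4, … C8 are driven by the delayed decisions d-2, d-3, d-4, … d-8 through sign logic sgn(c2), sgn(c3), sgn(c4), … sgn(c8), fed from FFs and a ½-rate FIFO.]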

- Current feedback replaced by charge feedback

- Capacitive DAC (6 bits + sign, +/-0..63): 4 binary-weighted caps (1, 2, 4, 8) + 3 thermometer-coded caps (16, 16, 16)

- Unused caps disconnected by PFET pass transistors (reduced cap loading)

- Transistors used as switches (digital design style, no need to care about 'analog' parameters like gm, gds)

- DFE timing margin is improved with respect to an integrating DFE with current feedback

- Addition of charges is highly linear: needed for a large number (e.g. 20) of DFE taps or for crosstalk cancellation
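The capacitive DAC coding described above (4 binary caps plus 3 thermometer-coded caps of 16 units each) can be sketched as follows. The function names and the sign convention are assumptions for illustration only; the slide does not specify how the control word is encoded.

```python
BINARY_CAPS = (1, 2, 4, 8)    # binary-weighted unit caps
THERMO_CAPS = (16, 16, 16)    # thermometer-coded caps


def encode_tap(coeff):
    """Map a tap coefficient in -63..63 to (sign, binary switches, thermo switches).

    The lower 4 magnitude bits drive the binary caps; the upper 2 bits are
    thermometer-coded onto the three 16-unit caps.
    """
    if not -63 <= coeff <= 63:
        raise ValueError("coefficient must be in -63..63")
    sign = 1 if coeff >= 0 else -1          # assumed convention: 0 counts as positive
    magnitude = abs(coeff)
    n_thermo = magnitude // 16              # 0..3 thermometer caps connected
    low = magnitude % 16                    # remainder handled by binary caps
    binary_on = tuple(bool(low & c) for c in BINARY_CAPS)
    thermo_on = tuple(i < n_thermo for i in range(len(THERMO_CAPS)))
    return sign, binary_on, thermo_on


def connected_caps(binary_on, thermo_on):
    """Total connected capacitance in unit caps; unused caps stay disconnected."""
    total = sum(c for c, on in zip(BINARY_CAPS, binary_on) if on)
    total += sum(c for c, on in zip(THERMO_CAPS, thermo_on) if on)
    return total


sign, binary_on, thermo_on = encode_tap(40)   # 40 = 8 (binary) + 2 x 16 (thermo)
```

The full-scale value checks out: 1 + 2 + 4 + 8 + 3 × 16 = 63 unit caps, matching the +/-0..63 range quoted on the slide.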


TX Architecture

[Figure: TX block diagram. Data in (d0, d1, d2, d3) passes through a 4:2 mux into a 4-tap FIR shift register (even/odd tap0 to tap3), which feeds 32 SST units, each with a 2:1 mux, driving out+/out-. Clock in drives a CML-to-CMOS clock generator (ck2cml, ck2/ck2b, ck4/ck4b). Configuration register signals: sign[0:3], data/term[0:3], tapsel[0:31], enable, tunen, tunep.]

▪ Half-rate architecture
▪ 4-tap feed-forward equalizer (FFE)
▪ Programmable output impedance

courtesy Christian Menolfi

[1],[5]


Jitter Amplification in High-Loss Channels

- Jitter can be approximated by a Dirac impulse at the edge position [1]

- Jitter is then converted to voltage noise at the sampling point

[1] V. Stojanović et al., "Optimal Linear Precoding with Theoretical and Practical Data Rates in High-Speed Serial-Link Backplane Communication", ICC 2004

TX jitter convolved with channel response
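The jitter-to-voltage conversion described above can be sketched numerically. A displaced edge is treated as the ideal edge plus a narrow pulse whose area is the edge step times the timing error; that pulse is approximated by a Dirac impulse, so the voltage error at the sampling point is the impulse area times the channel impulse response. The first-order low-pass channel, its time constant and all function names below are assumptions for illustration, not the channel model used in the talk.

```python
import math


def channel_impulse_response(t, tau_c):
    """Impulse response (in 1/s) of a hypothetical first-order low-pass channel."""
    return math.exp(-t / tau_c) / tau_c if t >= 0 else 0.0


def jitter_to_voltage(edge_step_v, jitter_s, t_after_edge_s, tau_c):
    """Voltage error at the sampling point caused by one displaced TX edge.

    The edge displacement is modelled as a Dirac impulse of area
    edge_step_v * jitter_s located at the edge position.
    """
    impulse_area = edge_step_v * jitter_s  # volt-seconds
    return impulse_area * channel_impulse_response(t_after_edge_s, tau_c)


# Assumed numbers: 100 ps unit interval, channel time constant of 2 UI,
# 1 V edge step, 1 ps timing error, sampling half a UI after the edge.
ui = 100e-12
error_v = jitter_to_voltage(edge_step_v=1.0, jitter_s=1e-12,
                            t_after_edge_s=0.5 * ui, tau_c=2 * ui)
```

In this linear model the voltage error scales in proportion to the timing error, and summing the contributions of all edges corresponds to convolving the jitter sequence with the channel response.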


Sampling in Data Path vs. Continuous-Time Data Path

▪ Sampling is done by T/H at the input
▪ Total input capacitance is N·Cs/2
▪ Advantages
  – Signal processing now in the discrete-time domain
  – Reduced bandwidth requirements due to:
    • Sub-rate operation
    • Can use reset to erase history
    • Buffers can use incomplete settling or integration

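[Figure: data path with a track-and-hold (T/H) and sampling capacitor Cs at the input, compared with the continuous-time data path]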


P-PLL – Key points:

▪ Advantages
  • Clocks from VCO go directly into the latches
  • No need for phase rotators
  • Clock path is very short
  • Phase rotation with XOR phase detectors is inherently linear
  • High-frequency noise on input clock signal is filtered out
▪ Disadvantages
  • Phase noise is accumulated due to PLL operation
    But: PLL bandwidth is extremely high (>1 GHz for 10 GHz clock)
    → Noise is attenuated to a large extent
  • Phase rotation now in feedback path
    But: No influence on CDR due to high PLL bandwidth


References

[1] C. Menolfi, T. Toifl, P. Buchmann, M. Kossel, T. Morf, J. Weiss, M. Schmatz, "A 16Gb/s Source-Series Terminated Transmitter in 65nm CMOS SOI," ISSCC Dig. Tech Papers, pp. 446-447, Feb. 2007.

[2] M. Kossel, C. Menolfi, J. Weiss, P. Buchmann, G. von Bueren, L. Rodoni, T. Morf, T. Toifl, M. Schmatz, "A T-Coil-Enhanced 8.5Gb/s High-Swing Source-Series-Terminated Transmitter in 65nm Bulk CMOS", IEEE Journal of Solid-State Circuits, Vol. 43, pp. 2905-2920, Dec. 2008.

[3] R. Philpott, J. Humble, R. Kertis, K. Fritz, B. Gilbert, E. Daniel, "A 20Gb/s SerDes transmitter with adjustable source impedance and 4-tap feed-forward equalization in 65nm bulk CMOS," IEEE Custom Integrated Circuits Conference (CICC), pp.623-626, 21-24, 2008.

[4] W. Dettloff, J. Eble, Lei Luo, P. Kumar, F. Heaton, T. Stone, B. Daly, "A 32mW 7.4Gb/s protocol-agile source-series-terminated transmitter in 45nm CMOS SOI," ISSCC Dig. Tech Papers, pp.370-371, 7-11 Feb. 2010.

[5] C. Menolfi, T. Toifl, M. Rueegg, M. Braendli, P. Buchmann, M. Kossel, T. Morf, "A 14Gb/s high-Swing Thin-oxide device SST TX in 45nm CMOS SOI", ISSCC 2011.

SST Transmitter


References

[6] B. Wicht, T. Nirschl, D. Schmitt-Landsiedel, "Yield and Speed Optimization of a Latch-Type Voltage Sense Amplifier", IEEE Journal of Solid-State Circuits, vol. 39, pp. 1148 - 1158, July 2004.

[7] T. Toifl, C. Menolfi, M. Ruegg, R. Reutemann, P. Buchmann, M. Kossel, T. Morf, J. Weiss, M. Schmatz, "A 22-Gb/s PAM-4 receiver in 90-nm CMOS SOI technology," IEEE Journal of Solid-State Circuits, Vol. 41, pp. 954 - 965, April 2006.

[8] P. Heydari, R. Mohanavelu, "Design of Ultrahigh-Speed Low-Voltage CMOS CML Buffers and Latches", IEEE Transactions on Very Large Scale Integration (VLSI) Systems, Vol. 12, No. 10, October 2004.

[9] T. Chalvatzis, K.H.K. Yau, P. Schvan, M.T. Yang, S.P. Voinigescu, "A 40-Gb/s Decision Circuit in 90-nm CMOS," Proceedings of the 32nd European Solid-State Circuits Conference, Vol. 32, pp. 512 - 515, September 2006.

[10] Y. Okaniwa, H. Tamura, M. Kibune, D. Yamazaki, T. Cheung, J. Ogawa, N. Tzartzanis, W. W. Walker, and T. Kuroda, "A 40-Gb/s CMOS clocked comparator with bandwidth modulation technique," IEEE Journal of Solid-State Circuits, vol. 40, pp. 1680 - 1687, August 2005.

[11] P. Nuzzo, F. De Bernardinis, P. Terreni, G. Van der Plas, "Noise Analysis of Regenerative Comparators for Reconfigurable ADC Architectures", IEEE Trans. on Circuits and Systems I, Vol. 55, No 6, pp 1441-1454, July 2008.

[12] J. Kim, B. Leibowitz, J. Ren, C. Madden, "Simulation and Analysis of Random Decision Errors in Clocked Comparators", IEEE Trans. on Circuits and Systems I, Vol. 56, pp. 1844-1857, August 2009.

[13] M. Jeeradit, J. Kim, B. Leibowitz, P. Nikaeen, V. Wang, B. Garlepp, C. Werner, "Characterizing Sampling Aperture of Clocked Comparators", pp 68-69, IEEE Symposium on VLSI Circuits, June 2008.

Latch Modeling and Optimization


References

[14] J. Wu and B. Wooley, "A 100-MHz Pipelined CMOS Comparator," IEEE Journal of Solid-State Circuits, Vol. 23, pp. 1379 - 1385, December 1988.

[15] M. Choi, A. Abidi, "A 6-b 1.3-Gsample/s A/D converter in 0.35-um CMOS," IEEE Journal of Solid-State Circuits, Vol. 36, pp. 1847 - 1858, Dec 2001.

[16] E. Iroaga and B. Murmann, "A 12-Bit 75-MS/s Pipelined ADC Using Incomplete Settling", IEEE Journal of Solid-State Circuits, Vol. 42, pp. 748 - 756, April 2007.

[17] M. Park, J. Bulzacchelli, M. Beakes, D. Friedman, "A 7Gb/s 9.3mW 2-Tap Current-Integrating DFE Receiver", ISSCC Dig. Tech Papers, pp. 230-231, Feb. 2007.

[18] T. Dickson, J. Bulzacchelli, D. Friedman, "A 12-Gb/s 11-mW Half-Rate Sampled 5-Tap Decision Feedback Equalizer With Current-Integrating Summers in 45-nm SOI CMOS Technology", IEEE Journal of Solid-State Circuits, pp 1298-1305, Vol. 44, April 2009

[19] K. Fukuda et al., "A 12.3-mW 12.5Gb/s Complete Transceiver in 65-nm CMOS Process", IEEE Journal of Solid-State Circuits, Vol. 45, Dec. 2010.

Low-power techniques


References

[20] R. Payne, B. Bhakta, S. Ramaswamy, S. Wu, J. Powers, P. Landman, U. Erdogan, A. Yee, R. Gu, L. Wu, B. Parthasarathy, K. Brouse, W. Mohammed, K. Heragu, V. Gupta, L. Dyson, W. Lee "A 6.25Gb/s Binary Adaptive DFE with First Post-Cursor Tap Cancellation for Serial Backplane Communications", ISSCC Dig. Tech Papers, pp. 68-69, Feb. 2005.

[21] B. Leibowitz, J. Kizer, H. Lee, F. Chen, A. Ho, M. Jeeradit, A. Bansal, T. Greer, S. Li, R. Farjad-Rad, W. Stonecypher, Y. Frans, B. Daly, F. Heaton, B. W. Garlepp, C. W. Werner, N. Nguyen, V. Stojanović, J L. Zerbe, "A 7.5 Gb/s 10-Tap DFE Receiver with First Tap Partial Response, Spectrally Gated Adaptation, and 2nd-Order Data Filtered CDR," ISSCC Dig. Tech Papers, pp. 228-229, Feb. 2007.

[22] T. Beukema, M. Sorna, K. Selander, S. Zier, B. L. Ji, P. Murfet, J. Mason, W. Rhee, H. Ainspan, B. Parker, and M. Beakes, "A 6.4-Gb/s CMOS SerDes core with feed-forward and decision-feedback equalization," IEEE Journal of Solid-State Circuits, vol. 40, pp. 2633 - 2645, December 2005.

[23] J. F. Bulzacchelli, M. Meghelli, S. V. Rylov, W. Rhee, A. V. Rylyakov, H. A. Ainspan, B. D. Parker, M. P. Beakes, A. Chung, T. J. Beukema, P. K. Pepeljugoski, L. Shan, Y. H. Kwark, S. Gowda, and D. J. Friedman, "A 10-Gb/s 5-tap DFE/4-tap FFE transceiver in 90-nm CMOS technology," IEEE Journal of Solid-State Circuits, vol. 41, pp. 2885 - 2900, December 2006.

[24] A. Emami-Neyestanak, A. Varzaghani, J. Bulzacchelli, A., C. K. Ken Yang and D. Friedman, "A Low-Power Receiver with Switched-Capacitor Summation DFE", Symp.VLSI Circuits Dig. Tech. Papers, June 2006.

[25] K. J. Wong, A. Rylyakov, and C. K. Ken Yang, "A 5-mW 6-Gb/s quarter-rate sampling receiver with a 2-tap DFE using soft decisions," Symp. VLSI Circuits Dig. , pp. 190 - 191, June 2006.

[26] A. Garg, A. C. Carusone, and S. P. Voinigescu, "A 1-tap 40-Gb/s look-ahead decision feedback equalizer in 0.18-µm SiGe BiCMOS technology," IEEE Journal of Solid-State Circuits, vol. 41, pp. 2224 - 2232, October 2006.

DFE Implementations


References

[27] T. Toifl, C. Menolfi, M. Ruegg, R. Reutemann, P. Buchmann, M. Kossel, T. Morf, J. Weiss, M. Schmatz, "A 0.94-ps-RMS-jitter 0.016-mm2 2.5-GHz multiphase generator PLL with 360º digitally programmable phase shift for 10-Gb/s serial links", IEEE Journal of Solid-State Circuits, Vol. 40, pp. 2700 - 2712, Dec 2005.

[28] E. Prete, D. Scheideler, A. Sanders, “A 100mW 9.6Gb/s Transceiver in 90nm CMOS for Next-Generation Memory Interfaces,” ISSCC Dig. Tech. Papers, vol. 49, pp. 88-89, Feb. 2006.

[29] R. Palmer, J. Poulton, W. J. Dally, J. Eyles, A. M. Fuller, T. Greer, M. Horowitz, M. Kellam, F. Quan, F. Zarkeshvari, “A 14mW 6.25Gb/s Transceiver in 90nm CMOS for Serial Chip-to-Chip Communications”, ISSCC Dig. Tech Papers, pp. 440-441, Feb. 2007.

[30] J. Poulton, R. Palmer, A.M. Fuller, T. Greer, J. Eyles, W.J. Dally, M. Horowitz, "A 14-mW 6.25-Gb/s Transceiver in 90-nm CMOS", IEEE Journal of Solid-State Circuits, vol. 42, pp. 2745 - 2755, December 2007.

[31] R. Reutemann, M. Ruegg, F. Keyser, J. Bergkvist, D. Dreps, T. Toifl, M. Schmatz, "4.5 mW/Gb/s 6.4 Gb/s 22+1-Lane Source Synchronous Receiver Core With Optional Cleanup PLL in 65 nm CMOS", IEEE Journal of Solid-State Circuits, Vol. 45, pp. 2850 - 2860, Dec 2010.

[32] A. Agrawal, A. Lie, P. Kumar Hanumolu, G. Wei, "An 8 x 5 Gb/s Parallel Receiver With Collaborative Timing Recovery", IEEE Journal of Solid-State Circuits, vol. 44, pp. 3120 - 3130, Nov. 2009.

[33] B. Leibowitz, R. Palmer, J. Poulton, Y. Frans, S. Li, J. Wilson, M. Bucher, A. Fuller, J. Eyles, M. Aleksic, T. Greer, N. Nguyen, "A 4.3 GB/s Mobile Memory Interface With Power-Efficient Bandwidth Scaling", IEEE Journal of Solid-State Circuits, vol. 45, pp. 889 - 898, April 2010

[34] N. Kurd, S. Bhamidipati, C. Mozak, J. Miller, P. Mosalikanti, et al., "A Family of 32 nm IA Processors," IEEE Journal of Solid-State Circuits, vol. 46, no. 1, pp. 119-130, Jan. 2011.

Source-Synchronous RX /Clean up PLL/ P-PLL


References

[35] J. Lee, B. Razavi, “A 40-Gb/s clock and data recovery circuit in 0.18-µm CMOS technology,” IEEE Journal of Solid-State Circuits, vol. 38, pp. 2181 - 2190, Dec. 2003.

[36] C. Kromer, G. Sialm, C. Menolfi, M. Schmatz, F. Ellinger, H. Jackel, “A 25Gb/s CDR in 90nm CMOS for High-Density Interconnects,” ISSCC Dig. Tech Papers, pp. 326-327, Feb. 2006.

[37] T. Toifl, C. Menolfi, P. Buchmann, C. Hagleitner, M. Kossel, T. Morf, J. Weiss, M. Schmatz, “A 72mW 0.03mm² Inductorless 40Gb/s CDR in 65nm SOI CMOS”, ISSCC Dig. Tech Papers, pp. 410-411, Feb. 2007.

[38] D. Kucharski, K. Kornegay, “2.5 V 43-45 Gb/s CDR Circuit and 55 Gb/s PRBS Generator in SiGe Using a Low-Voltage Logic Family,” IEEE Journal of Solid-State Circuits, vol. 41, pp. 2154 - 2165, Sept. 2006.

[39] N. Nedovic, N. Tzartzanis, H. Tamura, F. Rotella, M. Wiklund, J. Ogawa, W. Walker, “A 40-44Gb/s 3x Oversampling CMOS CDR/1:16 DEMUX,” ISSCC Dig. Tech Papers, pp. 224-225, Feb. 2007.

[40] C. Liao, S. Liu, "A 40 Gb/s CMOS Serial-Link Receiver With Adaptive Equalization and Clock/Data Recovery," IEEE Journal of Solid-State Circuits, vol. 43, no. 11, pp. 2492-2502, Nov. 2008.

[41] S. Kaeriyama, Y. Amamiya, H. Noguchi, Z. Yamazaki, et al. "A 40 Gb/s Multi-Data-Rate CMOS Transmitter and Receiver Chipset With SFI-5 Interface for Optical Transmission Systems," IEEE Journal of Solid-State Circuits, vol.44, no.12, pp.3568-3579, Dec. 2009.

[42] N. Nedovic, A. Kristensson, S. Parikh, S. Reddy et al., "A 3 Watt 39.8–44.6 Gb/s Dual-Mode SFI5.2 SerDes Chip Set in 65 nm CMOS," IEEE Journal of Solid-State Circuits, vol. 45, no. 10, pp. 2016-2029, Oct. 2010.

25-40 Gbps CDR Circuits


References

[43] H. Chung, A. Rylyakov, Z. T. Deniz, J. Bulzacchelli, G.-Y. Wei, and D. Friedman, “A 7.5-GS/s 3.8-ENOB 52-mW flash ADC with clock duty cycle control in 65 nm CMOS,” in Proc. Symp. VLSI Circuits, 2009, pp. 268–269.

[44] K. Deguchi, N. Suwa, M. Ito, T. Kumamoto, and T. Miki, “A 6-bit 3.5-GS/s 0.9-V 98-mW flash ADC in 90-nm CMOS,” IEEE J. Solid-State Circuits, vol. 43, no. 10, pp. 2303–2310, Oct. 2008.

[45] S. Sarvari, T. Tahmoureszadeh, A. Sheikholeslami, H. Tamura, M. Kibune, "A 5Gb/s Speculative DFE for 2x Blind ADC-based Receivers in 65-nm CMOS", Symp.VLSI Circuits Dig. Tech. Papers, June 2010.

[46] C. Huang, C. Wang, J. Wu, "A CMOS 6-Bit 16-GS/s Time-Interleaved ADC with Digital Background Calibration", Symp.VLSI Circuits Dig. Tech. Papers, June 2010.

[47] M. El-Chammas, B. Murmann, "A 12-GS/s 81-mW 5-bit Time-Interleaved Flash ADC with Background Timing Skew Calibration", Symp.VLSI Circuits Dig. Tech. Papers, June 2010.

[48] E. Chen, C. Ken Yang, "ADC-Based Serial I/O Receivers", IEEE Transactions on Circuits and Systems I, Vol. 57, No. 9, Sept. 2010.

[49] M. Harwood, N. Warke, R. Simpson, T. Leslie et al., “A 12.5 Gb/s SerDes in 65nm CMOS using a baud-rate ADC with digital equalization and clock recovery,” ISSCC Dig. Tech Papers, pp. 436-591, Feb. 2007.

[50] H. Yamaguchi, H. Tamura, et al, “A 5 Gb/s transceiver with an ADC-based feedforward CDR and CMA adaptive equalizer in 65 nm CMOS,” ISSCC Dig. Tech Papers, pp. 168–169, Feb. 2010.

[51] O. Tyshchenko, A. Sheikholeslami, H. Tamura, M. Kibune, H. Yamaguchi, J. Ogawa, "A 5-Gb/s ADC-Based Feed-Forward CDR in 65 nm CMOS", IEEE J. Solid-State Circuits, vol. 45, no. 6, pp. 1091–1098, June 2010.

High-Speed ADC and Digital I/O Implementations


References

[52] V. Stojanović, A. Amirkhany, M. Horowitz, "Optimal Linear Precoding with Theoretical and Practical Data Rates in High-Speed Serial-Link Backplane Communication", International Conference on Communications (ICC), 2004.

[53] G. Balamurugan, N. Shanbhag, "Modeling and Mitigation of Jitter in Multi-Gbps Source-Synchronous I/O Links", Proceedings of the 21st International Conference on Computer Design (ICCD), 2003.

[54] G. Balamurugan, B. Casper, J. Jaussi, M. Mansuri, F. O'Mahony, "Modeling and Analysis of High-Speed I/O Links", IEEE Transactions on Advanced Packaging, Vol. 32, Feb. 2009.

[55] J. Ren, D. Oh, S. Chang, "Hybrid Statistical and Time-Domain Simulation Methodology for High-Speed Links", DesignCon 2010.

[56] S. Chun, G. Peterson, R. Mandrekar, D. Dreps, M. Sorna, T. Beukema, "Method to Determine Optimum Equalization for Maximum Eye in High-Speed Computer System", IEEE 19th Conference on Electrical Performance of Electronic Packaging and Systems (EPEPS), pp. 293-296, Oct. 2010.

Link modelling


References

[57] M. Lee, W. Dally, P. Chiang, "Low-power area-efficient high-speed I/O circuit techniques," IEEE Journal of Solid-State Circuits, vol. 35, pp. 1591 - 1599, November 2000.

[58] J. Kim, M. Horowitz, "Adaptive supply serial links with sub-1-V operation and per-pin clock recovery," IEEE Journal of Solid-State Circuits, vol. 37, pp. 1403 - 1413, November 2002.

[59] K. J. Wong, H. Hatamkhani, M. Mansuri, and C. K. Ken Yang, "A 27-mW 3.6-Gb/s I/O transceiver," IEEE Journal of Solid-State Circuits, vol. 39, pp. 602 - 612, April 2004.

[60] B. Casper, J. Jaussi, F. O'Mahony, M. Mansuri, K. Canagasaby, J. Kennedy, E. Yeung, and R. Mooney, "A 20Gb/s forwarded clock transceiver in 90nm CMOS," IEEE International Solid-State Circuits Conference, vol. XLIX, pp. 90 - 91, February 2006.

[61] R. Gonzalez, B. Gordon, and M. A. Horowitz, "Supply and threshold voltage scaling for low power CMOS," IEEE Journal of Solid-State Circuits, vol. 32, pp. 1210 - 1216, August 1997.

Additional/General Papers


Acronyms

ASST: At-Speed Structural Test
BER: Bit Error Rate
CDR: Clock and Data Recovery
CML: Current Mode Logic
CTLE: Continuous Time Linear Equalizer
DCVS: Differential Cascode Voltage Switch (circuit)
DFE: Decision Feedback Equalizer
DLL: Delay Locked Loop
FFE: Feed-forward Equalizer
FFT: Fast Fourier Transform
FS-CMOS: Full-swing CMOS (= inverter-based clocking)
INL: Integral Non-Linearity
ISI: Intersymbol Interference
LSSD: Level Sensitive Scan Design
PLL: Phase-Locked Loop
P-PLL: Phase Programmable PLL
PPF: Poly-phase filter
RX: Receiver
SC-DFE: Switched-Cap DFE
S.E.: Single-Ended
SOI: Silicon On Insulator
SST: Source-Series Terminated
VCO: Voltage Controlled Oscillator
T/H: Track and Hold
TX: Transmitter