
Post on 27-Mar-2021


Looking Beyond CMOS: Integrated Circuit Design with Nano Electro-Mechanical Switches

Vladimir Stojanović (MIT)

in collaboration with

Tsu-Jae King Liu, Elad Alon (UC Berkeley)

Dejan Marković (UCLA)

MIEL 2010

Acknowledgements

Circuit design

Fred Chen, Hossein Fariborzi

Matthew Spencer, Abhinav Gupta

Cheng Wang, Kevin Dwan

Device design

Hei Kam, Rhesa Nathanael, Vincent Pott, Jaeseok Jeon

Sponsors

DARPA NEMS program

FCRP (C2S2, MSD)

MIT CICS

Berkeley Wireless Research Center

NSF

CMOS is Scaling, Power Density is Not

Since ~2000 supply voltage (Vdd) stuck at ~1V

Leakage stops you from lowering threshold (Vth)

Vdd and Vth are not scaling well, so power/area is not scaling

[Figure: Power Density Prediction circa 2000 (S. Borkar, Intel). Power density (W/cm2), log scale from 1 to 10000, vs. year, 1970 to 2010. Processors plotted: 4004, 8008, 8080, 8085, 8086, 286, 386, 486, Pentium® proc, P6, Core 2. Reference levels: Hot Plate, Nuclear Reactor, Rocket Nozzle, Sun’s Surface.]

Performance limited by power, not number of transistors

Parallelism to the Rescue

Parallelism allows slower, more efficient units

Helps scale performance for fixed power

Will this last forever?

Lower supply, balanced Vth

Parallelize to recover performance

IBM CELL processor

Subthreshold leakage: Game Over for CMOS

Leakage and sub-threshold slope define minimum energy/op for CMOS

Parallelism cannot reduce power if already operating at minimum energy
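The minimum-energy argument above can be sketched numerically. This is an illustrative model, not the one behind the slides: it assumes purely subthreshold conduction, so that Ion/Ioff = 10^(Vdd/S), and charges each operation with dynamic energy C·Vdd² plus leakage integrated over the operation's delay. The parameter names and values (`leak_factor`, `cap`, the 0.1 V to 1.0 V sweep) are assumptions chosen only to show the shape of the trade-off.

```python
import math

def energy_per_op(vdd, swing_mv_per_dec, leak_factor=200.0, cap=1e-15):
    """Energy per operation in joules for a toy subthreshold CMOS model.

    leak_factor (assumed) lumps logic depth and inactive gates: it is how many
    gate-delays' worth of leakage are charged to each useful switching event.
    """
    n = swing_mv_per_dec * 1e-3 / math.log(10)  # swing expressed in volts per e-fold
    dynamic = cap * vdd ** 2
    # Ioff * Vdd * delay, with delay ~ C*Vdd/Ion and Ion/Ioff = exp(Vdd/n)
    leakage = leak_factor * cap * vdd ** 2 * math.exp(-vdd / n)
    return dynamic + leakage

def min_energy_point(swing_mv_per_dec, v_lo=0.1, v_hi=1.0, steps=901):
    """Return (minimum energy, Vdd at the minimum) over a simple voltage sweep."""
    grid = [v_lo + i * (v_hi - v_lo) / (steps - 1) for i in range(steps)]
    return min((energy_per_op(v, swing_mv_per_dec), v) for v in grid)

e80, v80 = min_energy_point(80.0)   # 80 mV/decade device
e60, v60 = min_energy_point(60.0)   # steeper 60 mV/decade device
e_zero_leak = energy_per_op(0.2, 80.0, leak_factor=0.0)  # idealized zero-leakage switch
```

In this sketch the optimum Vdd sits well inside the sweep rather than at its lower edge, a steeper swing lowers the energy floor, and setting `leak_factor` to zero removes the floor altogether, which is the property the following slides attribute to NEM switches.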

NEM Switches to the Rescue

NEM switches show zero leakage & sharp sub-threshold slope

Could potentially enable reduced E/op with scaling

[Figures: Measured MEM Switch I-V Curve; MEM Switch Energy vs. VDD, annotated Etotal = Edynamic]

R. Nathanael et al., “4-Terminal Relay Technology for Complementary Logic,” IEDM 2009

NEM Switch Structure and Operation

ON Switch:

|Vgb| > Vpi (pull-in voltage)

OFF Switch:

|Vgb| < Vpo (pull-out voltage)
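The two conditions above describe a switch with hysteresis, which a small state-update function makes explicit. This is a behavioral sketch only: it assumes Vpo < Vpi and that the switch holds its previous state for voltages between the two, and the 6 V and 4 V thresholds are hypothetical values, not measurements from the talk.

```python
def relay_next_state(is_on, v_gb, v_pi, v_po):
    """Return the new ON/OFF state of an idealized relay given gate-body voltage."""
    magnitude = abs(v_gb)          # only |Vgb| matters for actuation
    if magnitude > v_pi:
        return True                # pulled in: channel contacts source/drain
    if magnitude < v_po:
        return False               # released: spring force wins
    return is_on                   # inside the hysteresis window: hold state

V_PI, V_PO = 6.0, 4.0              # hypothetical thresholds in volts
state = False
trace = []
for v in [0.0, 5.0, 7.0, 5.0, 3.0, -7.0]:
    state = relay_next_state(state, v, V_PI, V_PO)
    trace.append(state)
```

The same 5 V input leaves the switch OFF on the way up and ON on the way down, and the final -7 V step turns it ON again because the condition depends on the magnitude of Vgb.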

[Figure labels: Poly-SiGe Anchor; Poly-SiGe Beam/Flexure; Poly-SiGe Gate; Tungsten Body; Tungsten Source/Drain; Tungsten Channel]

NEM Switch as a Logic Element

[Figure labels: Gate, Source, Drain, Body]

4-terminal design mimics MOSFET operation

Electrostatic actuation is ambipolar

Non-inverting logic is possible

Actuation independent of source/drain voltages

NEM Switch Model

[Figure: mechanical model (anchor, spring k, damper b, mass m) and electrical model with terminals G, S, D, B; capacitances Cgb, Cgc, Csb, Cdb; resistances Rcs, Rcd, Rs, Rd]

Lumped Verilog-A model for circuit design and simulation:

Mechanical dynamics: spring (k), damper (b), mass (m)

Electrical parasitics: non-linear gate-body (Cgb), gate-channel (Cgc), and source/drain-body (Cs,db) capacitances, contact resistance (Rcs,d)
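The mechanical half of such a lumped model can be sketched in a few lines of Python. This is not the Verilog-A model from the talk: it assumes a parallel-plate electrostatic force, integrates m·x'' + b·x' + k·x = F_elec with a fixed time step, and omits the electrical parasitics entirely. Every parameter value below is hypothetical and chosen only so the numbers are easy to follow.

```python
EPS0 = 8.854e-12  # vacuum permittivity, F/m

def pull_in_time(v_gb, k=50.0, b=2.2e-5, m=1e-11, area=1e-9,
                 gap=200e-9, contact_gap=100e-9, dt=1e-9, t_max=50e-6):
    """Time in seconds for the plate to travel contact_gap, or None if it never does.

    k [N/m], b [N*s/m], m [kg], area [m^2], gap and contact_gap [m] are assumed values.
    """
    x, velocity, t = 0.0, 0.0, 0.0
    while t < t_max:
        # Parallel-plate attraction grows as the remaining gap (gap - x) shrinks
        f_elec = EPS0 * area * v_gb ** 2 / (2.0 * (gap - x) ** 2)
        accel = (f_elec - k * x - b * velocity) / m
        velocity += accel * dt      # semi-implicit Euler: update velocity first
        x += velocity * dt
        t += dt
        if x >= contact_gap:
            return t                # channel has reached the source/drain contacts
    return None

t_5v = pull_in_time(5.0)
t_8v = pull_in_time(8.0)
t_2v = pull_in_time(2.0)   # well below the roughly 3.7 V static pull-in of these values
```

With these assumed values the 8 V step closes the switch sooner than the 5 V step, and the 2 V step settles at a small deflection without ever making contact.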

NEM Switch Characteristics

[Figure: pull-in delay tPI [s] (log scale, 1e-7 to 1e-6) vs. VDD [V] (0 to 15), model and experiment, for L = 50 µm, 40 µm, 14 µm]

Simple electro-mechanical model matches experiment well

Mechanical “pull-in” delay scales with geometry and overdrive voltage

“Pull-in” voltage (threshold) scales with switch geometry

Constant E-field scaling: mechanical delay and VPI scale linearly
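The linear-scaling claim can be checked with textbook parallel-plate relations. The sketch below assumes Vpi = sqrt(8·k·g³ / (27·ε0·A)), uses the natural period 2π·sqrt(m/k) as a stand-in for mechanical delay, and assumes that shrinking every dimension by a factor s gives k ∝ s, m ∝ s³, g ∝ s and A ∝ s². The baseline values are hypothetical.

```python
import math

EPS0 = 8.854e-12  # vacuum permittivity, F/m

def pull_in_voltage(k, gap, area):
    """Static pull-in voltage of an ideal parallel-plate actuator."""
    return math.sqrt(8.0 * k * gap ** 3 / (27.0 * EPS0 * area))

def natural_period(m, k):
    """Undamped natural period, used here as a proxy for mechanical delay."""
    return 2.0 * math.pi * math.sqrt(m / k)

def scale_relay(k, m, gap, area, s):
    """Scale all linear dimensions by s (assumed: k ~ s, m ~ s^3, gap ~ s, area ~ s^2)."""
    return k * s, m * s ** 3, gap * s, area * s ** 2

k0, m0, gap0, area0 = 50.0, 1e-11, 200e-9, 1e-9      # hypothetical baseline device
k1, m1, gap1, area1 = scale_relay(k0, m0, gap0, area0, s=0.5)

v_ratio = pull_in_voltage(k1, gap1, area1) / pull_in_voltage(k0, gap0, area0)
t_ratio = natural_period(m1, k1) / natural_period(m0, k0)
field_ratio = v_ratio / (gap1 / gap0)                 # Vpi/gap, i.e. the actuation field
```

Halving every dimension halves both the pull-in voltage and the delay proxy while leaving Vpi/gap unchanged, which is what constant E-field scaling means under these assumptions.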

*H. Kam et al., “Design and Reliability of a Micro-Relay Technology…,” IEDM 2009

Digital Circuit Design with NEMS

CMOS: delay set by electrical time constant

Quadratic delay penalty for stacking devices

Buffer & distribute logical/electrical effort over many stages

NEMS: delay dominated by mechanical movement

Can stack ~100-200 devices before td,elec ≈ td,mech

So, want all to switch simultaneously

Implement logic as a single complex gate
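A rough way to see where the stacking limit comes from is to compare an Elmore-style RC delay for a series chain with a fixed mechanical delay. The sketch assumes N identical relays in series, each with on-resistance `r_on` and node capacitance `c_node`, so the electrical delay grows as N(N+1)/2. The 10 kΩ, 0.1 fF and 10 ns figures are hypothetical, picked only so the answer is of the same order as the slide's ~100-200.

```python
def elmore_delay(n, r_on, c_node):
    """Elmore delay in seconds of n identical series RC sections (quadratic in n)."""
    return r_on * c_node * n * (n + 1) / 2

def max_stack(t_mech, r_on, c_node):
    """Largest stack depth whose electrical delay stays at or below t_mech."""
    n = 0
    while elmore_delay(n + 1, r_on, c_node) <= t_mech:
        n += 1
    return n

R_ON = 10e3       # ohms, hypothetical contact plus trace resistance
C_NODE = 0.1e-15  # farads, hypothetical per-node capacitance
T_MECH = 10e-9    # seconds, hypothetical mechanical delay

depth = max_stack(T_MECH, R_ON, C_NODE)
```

With these numbers `depth` comes out at 140. Because the electrical term is quadratic, doubling the stack roughly quadruples it, so the crossover moves slowly as the mechanical delay changes.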

NEMS: 12 switches

Need to Compare at Block Level

Delay Comparison vs. CMOS

Single mechanical delay vs. several electrical gate delays

For reasonable load, NEMS delay unaffected by fan-out/fan-in

Area Comparison vs. CMOS

Larger individual devices

But often need fewer devices to implement same function

4 gate delays vs. 1 mechanical delay

F. Chen et al., “Integrated Circuit Design with NEM Relays,” ICCAD 2008

NEMS: 12 switches

Example: NEMS Adder

Full adder cell: 12 NEMS vs. 24 transistors

XOR “free”

Complementary signals avoid extra mechanical delay (to invert)

NEMS all sized minimally

N-bit NEMS Adder

Ripple carry configuration

Cascade full adder cells to create larger complex gate

Stack N NEMS, but still single mechanical delay

Performance gap to CMOS shrinks to ~10x for 32-bit add (10ns delay at 90nm)
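A functional model helps show why the cascade still costs one mechanical delay: every switch position is fixed by the operand bits alone, and the carry then travels electrically through whichever pass switches are closed. The sketch below is a generic propagate/generate/kill chain, not the 12-relay cell from the slides, and the function names are hypothetical.

```python
def relay_ripple_add(a, b, n_bits, cin=0):
    """Add two n_bits-wide integers with a switch-chain carry model; returns (sum, cout)."""
    a_bits = [(a >> i) & 1 for i in range(n_bits)]
    b_bits = [(b >> i) & 1 for i in range(n_bits)]
    # All switch states depend only on a and b, so they settle simultaneously
    propagate = [x ^ y for x, y in zip(a_bits, b_bits)]   # pass switch closed
    generate = [x & y for x, y in zip(a_bits, b_bits)]    # node tied high
    carry = [cin]
    for i in range(n_bits):
        # Closed pass switch forwards the incoming carry; otherwise the node is
        # tied high (generate) or low (kill), which equals generate[i]
        carry.append(carry[i] if propagate[i] else generate[i])
    total = sum((propagate[i] ^ carry[i]) << i for i in range(n_bits))
    return total, carry[n_bits]

def longest_series_path(a, b, n_bits):
    """Longest run of closed pass switches, i.e. the deepest electrical stack."""
    longest = run = 0
    for i in range(n_bits):
        run = run + 1 if ((a >> i) ^ (b >> i)) & 1 else 0
        longest = max(longest, run)
    return longest

total, cout = relay_ripple_add(0b1011, 0b0110, 4)
```

The loop over bits models the electrical path through already-settled switches, so its length affects only the RC delay, while `longest_series_path` reports how many relays the carry may have to cross in series.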

Scaled NEMS vs. CMOS Adders

For similar area: >9x lower E/op, >10x greater delay

[Figure: Energy/op vs. Delay/op across Vdd, annotated with the 9x energy and 10x delay gaps]

30x less capacitance

Lower device Cg, Cd

Fewer devices

2.4x lower Vdd

No leakage energy

Compare vs. Sklansky CMOS adder*

*D. Patil et al., “Robust Energy-Efficient Adder Topologies,” in Proc. 18th IEEE Symp. on Computer Arithmetic (ARITH'07).

F. Chen et al., “Integrated Circuit Design with NEM Relays,” ICCAD 2008

Parallelism helps

[Figure: Energy/op vs. Delay/op across Vdd & CL]

Can extend energy benefit up to GOP/s throughput

As long as parallelism is available

Area overhead bounded

CMOS needs to be parallelized at some point too
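The throughput argument is simple arithmetic, sketched below. It assumes non-pipelined units that each finish one operation per delay and ignores the energy of distributing work across units. The 1 ns and 10 ns delays and the 9-to-1 energy ratio are hypothetical placeholders that only mirror the ">10x delay, >9x lower E/op" ratios quoted earlier, not measured absolute values.

```python
import math

def units_for_throughput(target_gops, delay_ns):
    """Parallel units needed when each unit completes one op every delay_ns."""
    # Gops * ns is dimensionless: ops per second times seconds per op
    return max(1, math.ceil(target_gops * delay_ns))

def power_mw(energy_pj_per_op, target_gops):
    """Average power in mW: pJ/op times Gop/s."""
    return energy_pj_per_op * target_gops

TARGET_GOPS = 1.0
cmos_units = units_for_throughput(TARGET_GOPS, delay_ns=1.0)    # hypothetical CMOS adder
nems_units = units_for_throughput(TARGET_GOPS, delay_ns=10.0)   # hypothetical NEMS adder

cmos_power = power_mw(9.0, TARGET_GOPS)   # arbitrary energy scale, 9:1 ratio assumed
nems_power = power_mw(1.0, TARGET_GOPS)
```

Under these assumptions ten relay adders match one CMOS adder's throughput while keeping the per-operation energy advantage, so the cost of parallelism shows up as area rather than power.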

Contact Resistance

Low contact R not critical

Good news for reliability…

[Figure: Energy/op vs. Delay/op across Vdd & CL]

NEMS Circuit Demonstration

Test Devices

Logic: Adders, Compressors

Timing Elements: Latches, FFs

Memory: SRAM, DRAM

I/O: ADC, DAC

F. Chen et al., “Demonstration of Integrated Micro-Electro-Mechanical Switch Circuits for VLSI Applications,” ISSCC 2010.

Things We Learned…

Layout Matters

Unbalanced current flow: flexures burn up!

Parasitic capacitors: can affect Vpi (DIBL-like effect)

Measured Inverter VTC

VTC looks digital, suggests composability

NEMS Latch Shows Composability

Designed as if we used MOSFETS, but that is not always optimal…

Switch-based Carry Generation

Demonstrates propagate-generate-kill logic as a single complex gate

Measured DRAM Results

Simultaneous read and write

Read latency = tmech,on (decoder) + tmech,off (WLRD switch)

Measured DAC Results

Resistive divider based DAC

2-bit thermometer coded output

Code = 0 0 “Vin”

Code = “Vin” 1 1
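An idealized thermometer-coded resistive divider can be sketched as follows. The mapping is an assumption rather than something the slide states: it takes equal resistors and has each asserted code bit move the output tap one step toward Vin, so a 2-bit code gives the levels 0, Vin/2 and Vin. The function name and the 1.0 V input are hypothetical.

```python
def thermometer_dac(code_bits, v_in):
    """Ideal output voltage of an equal-resistor divider driven by a thermometer code."""
    bits = list(code_bits)
    ones = sum(bits)
    # A valid thermometer code has all its ones grouped at the start
    if bits != [1] * ones + [0] * (len(bits) - ones):
        raise ValueError("not a thermometer code: %r" % (bits,))
    return v_in * ones / len(bits)

levels = [thermometer_dac(code, 1.0) for code in [(0, 0), (1, 0), (1, 1)]]
```

Here `levels` steps through 0 V, 0.5 V and 1.0 V, and a code such as (0, 1) is rejected because it is not a valid thermometer pattern.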

NEM switch scaling – next design

NEM switch size: 120um x 150um

Scaled NEMS size: 20um x 20um

1um litho

0.25um litho

Conclusions

NEMS unique features enable scaling post-CMOS

Nearly ideal Ion/Ioff

Switching delay largely independent of electrical time constant

Need to adapt circuit design style

Reliability improving

Circuit level insights critical (contact R)

Demonstrated simple circuits

Can start thinking about building more complex systems

Potentially order of magnitude lower E/op than CMOS

Next steps: scaling and improved device design, testing larger digital blocks
