
Signaling with conserved quantities: two realizations in CMOS and SFQ Logic

Jim Kajiya

Microsoft Research

Power dissipation governs computing performance

• Mobile

– Performance is determined by stringent total power dissipation requirements

– Ambient temperature

• Fixed

– Power dissipation limited by available cooling heat flux capacity

– Operating below ambient reduces dissipation of high performance devices.

Source: BeHardware.com

Talk outline

• Discuss power consumption in CMOS

– Dynamic

– Static

• Signaling with conserved charge to improve dynamic power of async logic.

• Causes of static power dissipation.

• Why cold computers can solve this problem.

• SFQ, the ultimate cold electronics

Where does the power go?

• $R_L = \mathrm{Re}(Z_L)$, the real part of the load impedance

• The on (and off) resistance of the switching devices

• The power dissipated in the line.

Where does the power go in CMOS?

• $R_L = \infty$ in CMOS

• Most of the power dissipation happens in the switching devices

• Line dissipation is becoming increasingly important as we scale down

CMOS power dissipation has two causes

• Switching power dissipation

– Energy is $U = C V_{dd}^2$ per cycle.

– Async Logic traditionally touted as good approach to this, but it can be much better.

• Static power dissipation

– Leakage is dependent on subthreshold swing $S = \partial V_{GS} / \partial \log(I_D)$

– Async logic is no better than any other logic with respect to leakage.
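As a rough illustration of the switching term, the sketch below evaluates the per-cycle energy U = C·Vdd² and the resulting average power. The load capacitance, supply voltage, toggle frequency, and the `activity` factor are hypothetical values chosen for illustration; none of them come from the slides.

```python
def switching_energy(c_load, vdd):
    """Energy per full charge/discharge cycle, U = C * Vdd^2 (joules)."""
    return c_load * vdd ** 2


def switching_power(c_load, vdd, freq, activity=1.0):
    """Average dynamic power: energy per cycle times cycles per second.

    `activity` (fraction of cycles in which the node toggles) is an
    illustrative parameter, not something the slides define.
    """
    return activity * switching_energy(c_load, vdd) * freq


# Hypothetical node: 1 fF load, 1.0 V supply, toggling at 1 GHz.
u = switching_energy(1e-15, 1.0)
p = switching_power(1e-15, 1.0, 1e9)
print(f"U = {u:.3e} J per cycle, P = {p:.3e} W")
```

Because the energy is quadratic in Vdd, halving the supply in this model cuts the switching energy by a factor of four.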

Signalling with conserved quantities

• The practice of “readback” in aviation radio communications

• Implement a bipolar version of van Berkel’s single rail handshake

• A conventional version would look like this

Adiabatic logic

• The conventional scheme does not conserve the charge but dissipates it across switches

• Switches avoid dissipation by closing only when ΔV=0 (and opening only when ΔI=0).

• Adiabatic logic conserves charge by powering from the clock line, recycling charge, and using an external inductor to store recovered current from the clock pin.

• Requirement for multiphase clocking.

• Is there an asynchronous version of Adiabatic Logic?
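One way to see why closing switches only near ΔV=0 saves energy is the textbook series RC model, which is an assumption here and not taken from the slides. Charging a capacitor abruptly dissipates ½CV² in the switch regardless of its resistance. Ramping the supply linearly over a time much longer than RC dissipates roughly (RC/T)·CV² instead. The component values below are hypothetical.

```python
import math


def step_charge_loss(c, v):
    """Energy dissipated when C is charged abruptly to V through a switch."""
    return 0.5 * c * v ** 2


def ramp_charge_loss(r, c, v, t_ramp):
    """Energy dissipated in R when the supply ramps linearly from 0 to V
    over t_ramp and then holds at V (ideal series RC circuit)."""
    x = r * c / t_ramp                     # RC time constant over ramp time
    one_minus_a = -math.expm1(-1.0 / x)    # 1 - exp(-t_ramp / RC)
    return x * c * v ** 2 * (1.0 - x * one_minus_a)


# Hypothetical values: 1 kOhm switch resistance, 1 fF load, 1 V swing.
R, C, V = 1e3, 1e-15, 1.0
tau = R * C
for multiple in (1, 10, 100):
    ratio = ramp_charge_loss(R, C, V, multiple * tau) / step_charge_loss(C, V)
    print(f"ramp time = {multiple:3d} RC: loss / step loss = {ratio:.4f}")
```

In this model the loss falls roughly in proportion to the ramp time, which is why a slowly varying supply waveform can trade speed for dynamic power.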

Asynchronous Adiabatic Logic

• Throw away the clock function of the power supply but keep its oscillatory behavior

• The power supply is a global AC signal π, locally halfwave rectified to π+ and π-

• π is not a clock

– The frequency of π determines slew rates of signals

– Hence it determines an upper bound on system timing but does not otherwise determine it.

– The period may be shorter than logic delays, or it may be longer for extremely low dynamic power.

– The phase of π need not be managed unlike clock skew.

– Only a single phase is needed.

Asynchronous Adiabatic Signaling

Static power dissipation

• The other cause of power dissipation is static power dissipation

• In ideal CMOS, Pstatic=0.

• But as everyone knows, modern CMOS has significant static dissipation because we can’t turn off the transistors.

• Subthreshold leakage has caused hundred million dollar projects to be canceled.

• Asynchronous logic has no power dissipation advantage for static power.

Leakage

Where does leakage come from?

MOS device physics in 3 slides

Fermi Function

• $P(E) = 1 / (e^{(E - E_f)/kT} + 1)$
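To make the temperature dependence of the Fermi function concrete, the sketch below evaluates the occupation probability at a fixed energy above the Fermi level for two temperatures. The 0.2 eV offset and the 300 K and 77 K operating points are illustrative choices, not values from the slides.

```python
import math

K_B_EV = 8.617333262e-5  # Boltzmann constant in eV/K


def fermi(e_ev, ef_ev, temp_k):
    """Occupation probability P(E) = 1 / (exp((E - Ef) / kT) + 1)."""
    return 1.0 / (math.exp((e_ev - ef_ev) / (K_B_EV * temp_k)) + 1.0)


# Illustrative comparison: a state 0.2 eV above the Fermi level.
for temp in (300.0, 77.0):
    p = fermi(0.2, 0.0, temp)
    print(f"T = {temp:5.1f} K: P(Ef + 0.2 eV) = {p:.3e}")
```

The high-energy tail of the distribution shrinks sharply as the temperature drops, which is the effect the following slides build on.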

MOS Device

• A MOS transistor works by manipulating Fermi levels via its terminals.

$I_D = -\mu (W/L) \int_0^{V_D} Q_N \, d\phi$

Subthreshold leakage is an exponential phenomenon

• $I_D = I_S \exp\left((V_G - V_T)\, q / kT\right)$

• So difference between gate and threshold voltage measured in (kT/q) units determines leakage
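The exponential above can be explored numerically. The sketch below uses the idealized form from the slide with I_S held fixed, which is a simplification, since real devices have a non-ideal slope factor and a temperature-dependent I_S. It also evaluates the ideal subthreshold swing that follows from the same expression, S = ln(10)·kT/q. The gate underdrive of 0.3 V and the two temperatures are hypothetical.

```python
import math

K_B = 1.380649e-23      # Boltzmann constant, J/K
Q_E = 1.602176634e-19   # elementary charge, C


def thermal_voltage(temp_k):
    """kT/q in volts."""
    return K_B * temp_k / Q_E


def leakage_ratio(vg_minus_vt, temp_k):
    """I_D / I_S = exp((VG - VT) q / kT), the idealized slide formula."""
    return math.exp(vg_minus_vt / thermal_voltage(temp_k))


def ideal_swing_mv_per_decade(temp_k):
    """S = dVG / dlog10(I_D) = ln(10) kT/q, in mV per decade."""
    return 1e3 * math.log(10.0) * thermal_voltage(temp_k)


# Hypothetical off state: gate held 0.3 V below threshold.
for temp in (300.0, 77.0):
    print(f"T = {temp:5.1f} K: swing = {ideal_swing_mv_per_decade(temp):5.1f} mV/dec, "
          f"I_D/I_S = {leakage_ratio(-0.3, temp):.3e}")
```

For a fixed gate underdrive, lowering the temperature increases the number of kT/q units between gate and threshold, so the off current in this model falls by many orders of magnitude.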

Source: D. Foty “Eval of deep-submicron CMOS design”

We need to adopt temperature scaling

• Voltage scaling was required at the half micron node for field strength limits

• Temp scaling is required for leakage now.

• Temp scaling, along with length and voltage scaling, travels toward MOS scaling paradise

• Mobility increases as temperature falls, as a bonus

• Short channel effects are still very significant, but dealing with them is better in the cold.

• Thermal scaling has a limit with freeze-out

• New non-MOS devices work better when cold.

Cold wires are better, too.

• Speed of wires in Elmore model is an RC phenomenon.

– Submicron ICs have crosstalk cap 20% of line cap yielding data dependent power and delay.

– Charge recycling/Adiabatic logic can mitigate cap, but not resistance.

– Resistance in pure metals is by phonon scattering: resistance scales linearly with T.

• Speed of wires in transmission lines

– In a properly terminated line, power is proportional to duty cycle.

• Narrow width RZ signalling has lowest power.

– Limited by dispersion

– Dispersion set by conductance of line which gets better as it gets colder.
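The RC argument for wires can be sketched with a uniform RC ladder. The code below computes the Elmore delay to the far end of a line split into identical sections. It then rescales the resistance linearly with temperature, following the phonon-scattering bullet. The purely linear R(T) ignores residual resistivity and is an idealization. The line resistance and capacitance are hypothetical.

```python
def elmore_delay(r_total, c_total, n_segments=1000):
    """Elmore delay to the far end of a uniform RC line modeled as
    n_segments identical series-R / shunt-C sections."""
    r_seg = r_total / n_segments
    c_seg = c_total / n_segments
    # Node k's capacitance is charged through k upstream resistors.
    return sum(c_seg * r_seg * k for k in range(1, n_segments + 1))


def resistance_at(r_ref, temp_k, temp_ref_k=300.0):
    """Assumed purely linear R(T), an idealization of the slide's bullet."""
    return r_ref * temp_k / temp_ref_k


# Hypothetical wire: 1 kOhm and 100 fF total at 300 K.
R_300, C_LINE = 1e3, 100e-15
for temp in (300.0, 77.0):
    delay = elmore_delay(resistance_at(R_300, temp), C_LINE)
    print(f"T = {temp:5.1f} K: Elmore delay = {delay:.3e} s")
```

With many sections the delay approaches RC/2, so under the linear assumption the wire delay falls in direct proportion to absolute temperature while the capacitance is unchanged.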

There are other ways of combatting leakage

• Simple but difficult

– Throw away silicon dioxide as insulator

– Use high-k insulators: Hafnium Dioxide

• Elaborate device structures:

– Double and triple gate structures

– 3D structures: FinFETs

• All of these are complementary to temperature scaling, and can enhance each other.

• $30 heatpipe/radiator structures suggest that refrigeration is not so scary in high volumes.

The ultimate in cold electronics

• Superconducting electronics is the ultimate in cold electronics

• In a superconductor all the electron pair wavefunctions collapse into a single order parameter $\psi$.

• Josephson junction: $\phi = \phi_1 - \phi_2$

– $I = I_c \sin\phi$

– $d\phi/dt = 2qV/\hbar$
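The second Josephson relation fixes the size of a single flux quantum pulse. Integrating the phase rate over one 2π phase slip gives a time-integrated voltage of h/2q, independent of the pulse shape. The sketch below computes that value and, for a hypothetical mean pulse voltage of 1 mV, the corresponding pulse duration.

```python
import math

H_PLANCK = 6.62607015e-34   # Planck constant, J*s
Q_E = 1.602176634e-19       # elementary charge, C
HBAR = H_PLANCK / (2.0 * math.pi)

# Integrating d(phi)/dt = 2 q V / hbar over a 2*pi phase slip gives the
# time-integrated voltage of one pulse: the flux quantum h / (2 q).
FLUX_QUANTUM = 2.0 * math.pi * HBAR / (2.0 * Q_E)


def pulse_width(mean_voltage):
    """Duration of a pulse carrying one flux quantum at a given mean voltage."""
    return FLUX_QUANTUM / mean_voltage


print(f"flux quantum = {FLUX_QUANTUM:.4e} V*s")
print(f"pulse width at 1 mV mean voltage = {pulse_width(1e-3):.3e} s")
```

A millivolt-scale pulse therefore lasts on the order of picoseconds, which is consistent with the mV-level, very high rate signals mentioned later in the talk.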

JJs and SQUIDS

• Rapid Single Flux Quantum Logic signals with single flux quantum voltage pulses.

Source: E. Tolkacheva, et al. Chalmers University

FLUX-1 Microprocessor

SUNY SB - TRW

Hypres offers multiproject chip foundry services

• Cell libraries (multipliers, c-elements, etc)

• 10K JJ sized chips.

• 6 week turn around time

• Deep academic discounts

• Superconducting logic is naturally asynchronous

• Integration is low enough that architecture involves basic ideas instead of gluing together microprocessors and caches as in CMOS.

Asynchronous SFQ logic?

• RSFQ is clock based

• Flux pulses can be both positive and negative

• Bipolar flux signals easily inverted with transformer coupling.

What are the problems with superconducting logic?

• Cryogenic operation

• Early in technology cycle

– Integration level is relatively low

– Hypres process is at 2 µm line sizes

– No good mass memory technology

• Flux shuttle shift registers

• Van Duzer’s hybrid CMOS memory

– Architectural concepts are not well developed

• Microprocessor designs using 30-100 clocks per instruction

• Interface to conventional logic is difficult

– mV level signals at 100GHz rates

– SERDES

• Flux trapping

What about refrigeration?

• Major advances in new refrigerators have been made in the last two decades: pulse tube cryocoolers vs. Stirling coolers.

• Power required is a function of heat flux and temperature difference.

• Power of superconducting circuits is so low, heat flux is dominated by heat leak from copper interconnect
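The dependence of refrigeration power on heat flux and temperature difference can be bounded with the ideal Carnot limit, which is a standard thermodynamic result and not something stated on the slides. The sketch below gives the minimum input power needed to lift a heat load from a cold stage to room temperature. The 1 W load, the stage temperatures, and the `fraction_of_carnot` value are hypothetical, and real cryocoolers fall well short of the ideal bound.

```python
def carnot_input_power(heat_load_w, t_hot_k, t_cold_k):
    """Minimum (ideal Carnot) input power to remove heat_load_w at t_cold_k
    and reject it at t_hot_k."""
    return heat_load_w * (t_hot_k - t_cold_k) / t_cold_k


def real_input_power(heat_load_w, t_hot_k, t_cold_k, fraction_of_carnot):
    """Input power for a cooler that reaches only a fraction of the Carnot
    efficiency. fraction_of_carnot is an illustrative parameter."""
    return carnot_input_power(heat_load_w, t_hot_k, t_cold_k) / fraction_of_carnot


# Hypothetical 1 W heat load, rejected at 300 K.
for t_cold in (77.0, 4.2):
    ideal = carnot_input_power(1.0, 300.0, t_cold)
    assumed = real_input_power(1.0, 300.0, t_cold, 0.05)
    print(f"T_cold = {t_cold:5.1f} K: ideal = {ideal:7.2f} W, "
          f"at 5% of Carnot = {assumed:8.1f} W")
```

Because the cost per watt removed grows steeply as the cold stage temperature falls, the heat leaking in through interconnect matters more than the power of the circuits themselves.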
