energy recovery and recycling in computation: reversible … · 2016-01-22 · there is no...
TRANSCRIPT
§ There is no fundamental lower limit of energy!
§ Reversible computation can recover signal energy and avoid dissipation to heat.
§ Design tools were developed to support the design of Bennett-clocked adiabatic circuits.
§ A reduced MIPS microprocessor was designed in adiabatic logic.
§ Paradigms such as QCA can avoid limits due to leakage.
Energy Recovery and Recycling in Computation: Reversible Adiabatic Logic
Gregory L. Snider¹, Ismo K. Hänninen¹, César O. Campos-Aguillón¹,², Rene Celis-Cordova², Alexei Orlov¹, and Craig S. Lent¹
¹ Department of Electrical Engineering, University of Notre Dame, Indiana 46556
² Tecnológico de Monterrey, Mexico
Motivation
§ Power dissipation is the limiting factor for CMOS ICs.
§ In standard CMOS the full energy of each bit is dissipated to heat at each logic transition.
§ Adiabatic CMOS using reversible logic can dramatically reduce the power dissipation in a digital system.
§ Landauer proposed that there is no lower limit on dissipation in computation. Dissipation must occur only when information is destroyed. This is known as the Landauer Principle. Is it correct?
§ If the Landauer principle is correct, when would energy recovery make sense?
§ An adiabatic microprocessor would make a significant circuit test-bed for reversible adiabatic logic.
§ Beyond-CMOS logic paradigms such as Quantum-dot Cellular Automata (QCA) map well onto reversible adiabatic logic.
References
1. G.P. Boechler, J.M. Whitney, C.S. Lent, A.O. Orlov, and G.L. Snider, “Fundamental limits of energy dissipation in charge-based computing,” APL, Vol. 97, pp. 103502, 2010.
2. A.O. Orlov, C.C. Thorpe, G.P. Boechler, C.S. Lent, and G.L. Snider, “Experimental Test of Landauer’s Principle at the Sub-kBT Level,” Jpn. J. Appl. Phys., 51, pp. 06FE10-1-5, 2012.
3. Ismo K. Hänninen, César O. Campos-Aguillón, Rene Celis-Cordova, and Gregory L. Snider, “Design and Fabrication of a Microprocessor using Adiabatic CMOS and Bennett Clocking,” 7th Conference on Reversible Computation (RC 2015), Grenoble, France, 2015.
CONCLUSIONS
Minimum Energy for Computation
• Maxwell’s demon (1875): by first measuring states, it could perform reversible processes to lower entropy.
• Szilard (1929), Brillouin (1962): measurement causes kBT ln(2) dissipation per bit.
• Landauer (1961, 1970): only erasure of information must cause dissipation of kBT ln(2) per bit (Landauer’s Principle).
• Bennett (1982): full computation can be done without erasure.
logical reversibility – physical reversibility
Still somewhat controversial.
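For scale, the kBT ln(2) figure quoted above can be evaluated numerically. The sketch below is only an illustration: the 300 K temperature and the names `K_B`, `T_ROOM`, and `landauer_limit_joules` are assumptions, not values or identifiers taken from the poster.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant in J/K (exact SI value)


def landauer_limit_joules(temperature_k: float) -> float:
    """Return kB * T * ln(2), the per-bit erasure bound, in joules."""
    return K_B * temperature_k * math.log(2)


T_ROOM = 300.0  # assumed room temperature in kelvin (not stated on the poster)
limit = landauer_limit_joules(T_ROOM)
thermal_energy = K_B * T_ROOM  # kBT itself, for comparison

print(f"kBT       at {T_ROOM:.0f} K = {thermal_energy:.3e} J")
print(f"kBT ln(2) at {T_ROOM:.0f} K = {limit:.3e} J")
```

At the assumed temperature the bound is a few zeptojoules per erased bit, several orders of magnitude below the 1 aJ switching energy used in the power-density example elsewhere on the poster.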
Test of the Landauer Principle
[Figure panels: Copy 1, Copy 0, Erase 1, Erase 0]
Room temperature operations on a 73 kBT bit of information
Measured dissipation was 0.005 kBT (15 yJ).
JJAP, 51, pp. 06FE10, 2012.
[Panel titles: Design Tools; Fabricated Chip; Cell Design; MIPS Microprocessor]
Adiabatic CMOS with Bennett Clocking
Reversible computation always has some associated overhead.
Bennett Clocking (retractile cascade) combined with split-level logic can implement it with minimum spatial overhead.
[Figure: circuit schematics labeled Vin, Vout, CLK+, CLK–, and clock pairs CLK1+/CLK1–, CLK2+/CLK2–, CLK3+/CLK3–]
When Does Energy Recovery Make Sense?
What if Power Density is Constrained?
Example: $E_0 = 1$ aJ, $D_0 = 10$ ps, $A_0 = 10^{-11}$ cm$^2$
Power density at full speed: $E_0 / (D_0 A_0) = 10^{4}$ W/cm$^2$
Constrained power density: $10^{2}$ W/cm$^2$
Performance must be sacrificed!
Multi-Core: $EDA = E_0 \cdot (B D_0) \cdot A_0 = 100\, EDA_0$
Dark Si: $EDA = E_0 \cdot D_0 \cdot (B A_0) = 100\, EDA_0$
Reversible: $EDA = (E_0 / C) \cdot (C D_0) \cdot A_0 = E_0 D_0 A_0 = 1\, EDA_0$
Reversible adiabatic computing makes the best use of limited resources!
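The power-density example can be checked with a few lines of arithmetic. This is a sketch under stated assumptions: the poster does not define B and C, so here B is read as the factor by which delay (multi-core) or area (dark silicon) grows, and C as the factor by which adiabatic operation both slows the circuit and lowers the energy per operation. All variable and function names are illustrative.

```python
import math

E0 = 1e-18     # energy per operation in J (1 aJ, from the example)
D0 = 10e-12    # delay in s (10 ps, from the example)
A0 = 1e-11     # area in cm^2 (from the example)
P_LIMIT = 1e2  # constrained power density in W/cm^2 (from the example)


def power_density(energy: float, delay: float, area: float) -> float:
    """Power density in W/cm^2 for one operation per delay per area."""
    return energy / (delay * area)


p0 = power_density(E0, D0, A0)  # unconstrained power density
reduction = p0 / P_LIMIT        # required reduction factor

# Assumed reading of the poster's symbols (not defined in the source):
B = reduction             # delay or area grows linearly with the reduction
C = math.sqrt(reduction)  # energy falls by C while delay grows by C

options = {
    "Multi-Core": (E0, B * D0, A0),
    "Dark Si": (E0, D0, B * A0),
    "Reversible": (E0 / C, C * D0, A0),
}

eda0 = E0 * D0 * A0
results = {}
for name, (e, d, a) in options.items():
    results[name] = (power_density(e, d, a), (e * d * a) / eda0)
    print(f"{name:10s} power density = {results[name][0]:.3g} W/cm^2, "
          f"EDA = {results[name][1]:.3g} EDA0")
```

Under these assumptions all three options meet the same power-density limit, but only the reversible option keeps the energy-delay-area product at its original value, which matches the 100 EDA0 and 1 EDA0 figures in the example.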
[Figure: MIPS datapath and control block diagram. Blocks: PC, Memory, Instruction register, Memory data register, Registers, ALU, ALU control, ALUOut, Shift left 2, multiplexers. Control outputs: IorD, MemRead, MemWrite, MemtoReg, PCWriteCond, PCWrite, IRWrite[3:0], ALUOp, ALUSrcA, ALUSrcB, RegDst, PCSource, RegWrite.]
From Weste and Harris (Addison-Wesley)
Three Bennett zones of 12 phases, no pipelining
Reduced instruction set, but still universal
32-bit instructions, 8-bit data path
Design tools based on SystemVerilog and Mentor Graphics ModelSim for Bennett-clocked adiabatic circuits:
Ramp Logic, Timing simulation, Standard Cell Library, Bennett Energization Sequence Checker, Standard Logic Synthesis
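To illustrate what checking a Bennett energization sequence might involve, here is a minimal sketch. It is not the authors' SystemVerilog tool; it is a hypothetical Python model that assumes the retractile-cascade rule that stages are energized in order 1, 2, ..., N and de-energized in the reverse order. The event format and the names `is_valid_bennett_sequence` and `retractile_cascade` are invented for this example.

```python
from typing import List, Tuple

Event = Tuple[str, int]  # ("up" | "down", stage index starting at 1)


def is_valid_bennett_sequence(events: List[Event], n_stages: int) -> bool:
    """Check that clock events follow a retractile (stack-like) order."""
    energized: List[int] = []  # stages currently energized, oldest first
    for action, stage in events:
        if action == "up":
            # A stage may ramp up only after all earlier stages are up.
            if stage != len(energized) + 1 or stage > n_stages:
                return False
            energized.append(stage)
        elif action == "down":
            # Only the most recently energized stage may ramp down.
            if not energized or energized[-1] != stage:
                return False
            energized.pop()
        else:
            return False
    # Every stage must be retracted by the end of the sequence.
    return not energized


def retractile_cascade(n_stages: int) -> List[Event]:
    """Generate the full up-then-down sequence for n_stages stages."""
    ups = [("up", k) for k in range(1, n_stages + 1)]
    downs = [("down", k) for k in range(n_stages, 0, -1)]
    return ups + downs


good = retractile_cascade(3)
bad = [("up", 1), ("up", 2), ("down", 1), ("down", 2)]
print(is_valid_bennett_sequence(good, 3))
print(is_valid_bennett_sequence(bad, 2))
```

The stack discipline captures why the cascade is reversible: a stage is de-energized only while the stage that drives its inputs still holds valid data.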
[Figure: Input, Output, CLK+, and CLK– waveforms versus Time (s)]
Initial Experimental results
Notre Dame 1 µm process; MOSIS fab in process
Approximately 5700 transistors, 45% in adiabatic circuits.