
Page 1: Low Power Architecture and Implementation of Multicore Design

Khushboo Sheth, Kyungseok Kim, Fan Wang, Siddharth Dantu

ELEC6270 Low Power Design of Electronic Circuits, Team Project

VLSI D&T Seminar, Nov. 8, 2006

Advisor: Dr. V. Agrawal

Page 2: Project Objectives

• Design and verify a 16-bit ALU with synchronous clocked inputs and outputs.

• Study the low-voltage power and delay characteristics of the design.

• Redesign the ALU for minimum power and highest speed.

Page 3: Components of Power Dissipation

Dynamic

• Power due to signal transitions:
  • Logic power (due to logic transitions).
  • Glitch power (due to glitches).

• Short-circuit power.

Static

• Leakage power (due to leakage currents).

Page 4: Power Components in CMOS Circuit

[Figure: CMOS circuit model between VDD and ground, with input vi(t), output vo(t), on-resistance Ron, off-resistance R = large, and load capacitance CL. The current paths are labeled dynamic power, short-circuit power, and leakage power.]

Power = C V_{DD}^2

Page 5: 1-bit ALU Design

[Figure: block diagram of the 1-bit ALU core with registers Reg A, Reg B, and Reg C.]

Page 6: 1-bit ALU Core Simulation Specification

Technology                        TSMC 0.25 um
Application voltage               2.5 V
NMOS Vth                          0.365 V
PMOS Vth                          -0.5625 V
Temperature                       90 °C
SPICE simulator                   Eldo ver. 6.3.1.1
Supply voltage sweep (6 points)   0, 0.5, 1.0, 1.5, 2.0, 2.5 V

Page 7: 1-bit ALU Core Timing (Vdd = 2.5 V)

[Figure: block diagram (combinational logic followed by DFF; nodes NX156, NX80, NX16, NX60) and timing waveforms for the signals A, B, CLK, opcode[3:0], CYIN, C, CY, COMPOUT, and Z.]

Longest path in combinational logic: c <= a + b (opcode 0000)

Opcode   Operation
0000     a + b
0001     a - b
0010     equal
0011     not equal
0100     xor
0101     nor
0110     or
0111     and
1000     c <= a
1001     c <= b
1010     nand
others   all-zeros output
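The opcode table above can be captured in a small behavioral model. This is only a sketch, not the synthesized netlist from the project: the function name `alu`, the unsigned wrap-around subtraction, the use of `cyin` only for addition, and the reporting of compare results on `compout` are all assumptions made for illustration. The Z output is left out because the slides do not define it.

```python
# Behavioral sketch of the ALU opcode table (not the authors' netlist).
# Assumptions: operands are unsigned, subtraction wraps modulo 2**width,
# cyin only affects add, and compare results are reported on compout.

def alu(opcode, a, b, cyin=0, width=16):
    """Return (c, cy, compout) for one ALU operation."""
    mask = (1 << width) - 1
    a &= mask
    b &= mask
    c, cy, compout = 0, 0, 0

    if opcode == 0b0000:            # a + b
        total = a + b + cyin
        c, cy = total & mask, total >> width
    elif opcode == 0b0001:          # a - b (wrap-around is an assumption)
        c = (a - b) & mask
    elif opcode == 0b0010:          # equal
        compout = int(a == b)
    elif opcode == 0b0011:          # not equal
        compout = int(a != b)
    elif opcode == 0b0100:          # xor
        c = a ^ b
    elif opcode == 0b0101:          # nor
        c = ~(a | b) & mask
    elif opcode == 0b0110:          # or
        c = a | b
    elif opcode == 0b0111:          # and
        c = a & b
    elif opcode == 0b1000:          # c <= a
        c = a
    elif opcode == 0b1001:          # c <= b
        c = b
    elif opcode == 0b1010:          # nand
        c = ~(a & b) & mask
    # any other opcode leaves all outputs at zero
    return c, cy, compout
```

For example, `alu(0b0000, 0xFFFF, 0x0001)` returns `(0, 1, 0)`: the sum wraps to zero and the carry-out is 1, which is the carry chain that the longest path (opcode 0000) exercises.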

Page 8: 1-bit ALU Core, Sweep Vdd from 2.5 V to 0 V

[Figure: analog-mode waveforms of output C (node NX156) for supply voltages 2.5 V, 2.0 V, 1.5 V, 1.0 V, 0.5 V, and 0.0 V, with the Vdd = 2.5 V and Vdd = 0.5 V traces marked.]

Page 9: 1-bit ALU Core Logic Operation Voltage @ 200 MHz

Supply voltage sweep near PMOS Vth = -0.5625 V (vs. NMOS Vth = 0.365 V)

Sweep from Vsupply = 0.50 V to 1.00 V (linear increment 0.05 V, 11 points), opcode 1000 (c <= a)

[Figure: input and output waveforms, including analog-domain views. At Vsupply = 0.85 V the operation is correct, with overshoot and ripples visible in the analog domain. At Vsupply = 0.80 V the operation is wrong.]

Page 10: 1-bit ALU Average Power vs. Delay @ 200 MHz

[Figure: two plots versus Vsupply (V), from 0 to 2.5 V. One shows average power (uW) for the total 1-bit ALU block vs. the 1-bit ALU core. The other shows the delay (ns) of the 1-bit ALU core.]

Data labels as extracted (adjacent labels are fused): 31.02830.5427, 82.8828, 354.563, 179.91532.2493, 1.4203, 0.49550.7204, 0.4123

Power = C V_{DD}^2

Page 11: 16-Bit ALU (Single Core) Design

[Figure: Input, register, combinational logic (16-bit ALU, capacitance Cref), register, Output. Both registers are clocked by CK.]

Supply voltage = Vref

Total capacitance switched per cycle = Cref

Clock frequency = f

Power consumption: Pref = Cref Vref^2 f
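The reference power expression Pref = Cref Vref^2 f is easy to evaluate numerically. The sketch below is illustrative only: the function names are made up here, and the 1 pF capacitance in the usage note is a hypothetical value, not a figure from the project.

```python
def dynamic_power(c_switched, vdd, freq):
    """First-order dynamic power in watts: P = C * Vdd^2 * f."""
    return c_switched * vdd ** 2 * freq


def scaled_power_ratio(vref, vnew):
    """P_new / P_ref when only the supply voltage changes (same C and f)."""
    return (vnew / vref) ** 2
```

With a hypothetical Cref of 1 pF, Vref = 2.5 V, and f = 10 MHz, `dynamic_power(1e-12, 2.5, 10e6)` evaluates to about 62.5 uW. Halving the supply gives `scaled_power_ratio(2.5, 1.25) == 0.25`, the quadratic saving that motivates running at a lower voltage.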

Page 12: 16-Bit ALU Vectors

Vector    a                 b                 Opcode        cyin
Vector1   1010101010101010  0001010101010101  0001 (sub)    0
Vector2   0101010101010101  1010101010101010  0011 (comp)   0
Vector3   0101010101010101  1010101010101010  0100 (xor)    0
Vector4   1111111111111111  0000000000000001  0000 (add)    0
Vector5   0110011001100110  0000000000000000  1010 (nand)   0
Vector6   0001011001101101  0101010010101010  0001 (sub)    0

* Vector4 activates the critical path, carry-out = 1

Page 13: 16-Bit ALU Simulation Result

Circuit information: 694 gates
Clock frequency applied: 10 MHz
Temperature: 27 °C
Vectors applied: 6 vectors
TSMC025 technology: Vthn = 0.365 V, Vthp = -0.562 V
Simulator: ELDO (SPICE)
Simulation time: 700 ns

Voltage (V)          2.5      1.25    0.85    0.625   0.45
Static power (nW)    24.55    6.02    3.05    1.84    1.71
Average power (uW)   391.16   62.62   26.66   14.57   3.56
Delay (ns)           2.83     7.14    18.88   73.21   Ckt failed

Page 14: 16-Bit ALU, Functionally Correct Operation at 2.5 V, 1.25 V, 0.85 V and 0.625 V for 6 Vectors

Page 15: Circuit Fails @ 0.45 V (< Vth)

Simulated single vector pair

Page 16: 16-Bit ALU Power Savings and Delay Increase with Reference @ 2.5 Volts

Voltage (V)          2.5 (reference)   1.25         0.85          0.625
Relative to VDD      VDD               VDD/2        VDD/3         VDD/4
Average power (uW)   391.16            62.22        26.22         14.67
Power ratio          -                 P2.5/6.24    P2.5/14.67    P2.5/26.66
Power saving         -                 84%          93%           96%
Delay (ns)           2.83              7.14         18.87         73.21
Delay ratio          -                 2.57*D2.5    6.67*D2.5     25.87*D2.5
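The savings and delay ratios in this table follow from simple arithmetic on the measured values. The sketch below recomputes them from the numbers on this slide. The helper name `saving_and_slowdown` and the dictionary layout are choices made for this illustration.

```python
# Values copied from the 2.5 V reference table on this slide.
P_REF_UW = 391.16   # average power at 2.5 V, in uW
D_REF_NS = 2.83     # delay at 2.5 V, in ns

# supply voltage (V) -> (average power in uW, delay in ns)
points = {
    1.25: (62.22, 7.14),
    0.85: (26.22, 18.87),
    0.625: (14.67, 73.21),
}


def saving_and_slowdown(power_uw, delay_ns, p_ref=P_REF_UW, d_ref=D_REF_NS):
    """Return (fractional power saving, delay multiple) against the reference."""
    saving = 1.0 - power_uw / p_ref
    slowdown = delay_ns / d_ref
    return saving, slowdown


results = {v: saving_and_slowdown(p, d) for v, (p, d) in points.items()}

for v, (saving, slowdown) in results.items():
    print(f"{v} V: saving {saving:.1%}, delay x{slowdown:.2f}")
```

The computed savings round to the 84%, 93%, and 96% listed in the table, at the price of the delay growing to roughly 26 times the reference at 0.625 V.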

Page 17: 16-Bit ALU Power Savings and Delay Increase with Reference @ 1.25 Volts

Voltage (V)          1.25 (reference)   0.85           0.625
Relative to VDD      VDD                VDD/1.5        VDD/2
Average power (uW)   62.22              26.66          14.67
Power ratio          -                  P1.25/2.35     P1.25/4.27
Power saving         -                  57%            77%
Delay (ns)           7.14               18.87          73.21
Delay ratio          -                  2.63*D1.25     10.25*D1.25

Page 18: Different Technology Impact on Power Saving, 16-Bit ALU

Simulation setup: supply voltage 2.5 V, transient simulation time 700 ns, 6 vectors, temperature 27 °C

Technology               TSMC035      TSMC025
Gates after synthesis    734          694
Voltage                  2.5 V        2.5 V
Static power             24.555 nW    24.550 nW
Average power            381.60 uW    391.16 uW
Delay                    3.12 ns      2.83 ns

Page 19: Temperature Influence on Power

Circuit information: 734 gates
Clock frequency applied: 10 MHz; Vdd = 2.5 V
Vectors applied: 6 vectors
Simulation time: 700 ns
TSMC035 technology

Temperature (°C)     0        27       60       90       120      900
Static power (nW)    12.7     24.5     75.51    357.36   4803.3   3.38 mW
Average power (uW)   404.23   381.60   378.15   367.48   363.15   70.43 W
Delay (ns)           2.58     3.12     3.18     3.53     3.91     Ckt fail!!
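One way to read the temperature table is to look at how large the static (leakage) power is relative to the average power. The sketch below uses the 0 to 120 °C columns from this slide and leaves out the 900 °C point, where the circuit fails. The variable names and the choice of this metric are illustrative assumptions, not part of the original analysis.

```python
# 0 to 120 degC columns from the temperature table on this slide.
temps_c = [0, 27, 60, 90, 120]
static_nw = [12.7, 24.5, 75.51, 357.36, 4803.3]
average_uw = [404.23, 381.60, 378.15, 367.48, 363.15]


def static_share(static_power_nw, average_power_uw):
    """Static power as a fraction of average power (nW converted to uW)."""
    return (static_power_nw * 1e-3) / average_power_uw


shares = [static_share(s, a) for s, a in zip(static_nw, average_uw)]

for t, share in zip(temps_c, shares):
    print(f"{t:>3} degC: static power is {share:.4%} of average power")
```

Under these numbers the static share stays below 0.01% at 27 °C but exceeds 1% at 120 °C, which shows how quickly leakage grows with temperature even while the average power falls slightly.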

Page 20: Multicore Design Methodology

• Lower the supply voltage.
  • This slows down circuit speed.
  • Use parallel computing to gain the speed back.

• Multi-core means placing two or more complete cores within a single module.

• This architecture is a "divide and conquer" strategy. By splitting the work between multiple execution cores, a multi-core design can perform more work within a given clock cycle.

• A power reduction of more than 60% is observed.

Source: http://www.eng.auburn.edu/~vagrawal/D&TSEMINAR_SPR06/SLIDES/Agrawal_DTSem06.ppt
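The trade-off described above can be estimated with the usual first-order model for a parallel datapath: N copies of the logic each run at f/N, so the capacitance switched per unit time stays roughly constant and the saving comes from the lower supply voltage. The sketch below is a textbook-style approximation under that assumption. The `overhead` parameter (extra capacitance from the multiplexer, registers, and routing) is hypothetical, and leakage is ignored.

```python
def parallel_power_ratio(n_cores, vref, vnew, overhead=0.0):
    """Estimate P_parallel / P_ref for n_cores copies clocked at f / n_cores.

    overhead is the extra switched capacitance as a fraction of
    n_cores * Cref (a hypothetical parameter, not a measured value).
    """
    cap_ratio = n_cores * (1.0 + overhead)   # total switched capacitance / Cref
    freq_ratio = 1.0 / n_cores               # each copy runs at f / n_cores
    return cap_ratio * freq_ratio * (vnew / vref) ** 2
```

For four cores at half the reference voltage and no overhead, `parallel_power_ratio(4, 2.5, 1.25)` gives 0.25. A 15% overhead would raise that estimate to about 0.29.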

Page 21: Parallel Architecture

[Figure: parallel architecture built from four copies of the combinational logic (16-bit ALU, Copy 1 to Copy 4). The input passes through a register clocked by CK at frequency f. Each copy has its own register clocked at f/4 (Ck0, Ck1, Ck2, Ck3). A 4-to-1 multiplexer driven by the mux control selects the output.]

Page 22: Control Signals, N = 4

[Figure: timing diagram of CK and the four clock phases (Phase 1 to Phase 4).]

Mux control: 00 01 10 11 00 01 10 11 ...
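The control scheme for N = 4 amounts to a round-robin schedule: each copy handles every fourth input, and the multiplexer select counts through 00, 01, 10, 11 so results leave in the original order. The sketch below models only that ordering, with no timing or latency. The function names are invented for this illustration.

```python
def mux_control_sequence(n_cores, n_cycles):
    """Mux select value (as a bit string) for each cycle of the fast clock CK."""
    bits = max(1, (n_cores - 1).bit_length())
    return [format(cycle % n_cores, f"0{bits}b") for cycle in range(n_cycles)]


def run_interleaved(inputs, func, n_cores=4):
    """Dispatch inputs round-robin to n_cores copies of func.

    Results are read back in mux order, so the output order matches
    the input order.
    """
    per_core = [[] for _ in range(n_cores)]
    for cycle, x in enumerate(inputs):
        per_core[cycle % n_cores].append(func(x))   # copy k sees every n-th input

    outputs = []
    for cycle in range(len(inputs)):
        core = cycle % n_cores                       # current mux control value
        outputs.append(per_core[core][cycle // n_cores])
    return outputs
```

`mux_control_sequence(4, 8)` returns `['00', '01', '10', '11', '00', '01', '10', '11']`, and `run_interleaved(range(10), lambda x: x * x)` returns the squares of 0 to 9 in order, which shows that interleaving does not reorder results.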

Page 23: 16-Bit ALU Multi-core Power Savings and Delay Increase with Reference @ 2.5 Volts

Circuit information: 2617 gates
Clock frequency applied: 10 MHz
Temperature: 27 °C
Vectors applied: 6 vectors
TSMC025 technology: Vthn = 0.365 V, Vthp = -0.562 V
Simulator: ELDO (SPICE)
Simulation time: 700 ns

Voltage (V)          2.5 (reference)   1.25         0.85         0.625         0.45
Relative to VDD      VDD               VDD/2        VDD/3        VDD/4         -
Static power (nW)    96.35             23.56        11.94        7.21          6.37
Average power (uW)   687.86            95.64        40.93        21.13         7.26
Power ratio          -                 P2.5/7.19    P2.5/16.8    P2.5/32.55    -
Power saving         -                 86%          94%          94.75%        -
Delay (ns)           0.11              0.57         1.52         30.70         Ckt failed
Delay ratio          -                 5.18*D2.5    13.8*D2.5    279.1*D2.5    -

Page 24: 16-Bit ALU Multicore Power Savings and Delay Increase with Reference @ 1.25 Volts

Voltage (V)          1.25 (reference)   0.85           0.625
Relative to VDD      VDD                VDD/1.5        VDD/2
Average power (uW)   95.64              40.93          21.13
Power ratio          -                  P1.25/2.33     P1.25/4.52
Power saving         -                  57%            78%
Delay (ns)           0.57               1.52           30.7
Delay ratio          -                  2.67*D1.25     53.86*D1.25

Page 25: Power and Delay Comparison, @ 2.5 V Reference Design with Multicore Design at Different Voltages

Voltage (V)          2.5         1.25        0.85        0.725       0.7         0.625
Design               Reference   Multicore   Multicore   Multicore   Multicore   Multicore
Relative to VDD      VDD         VDD/2       VDD/3       VDD/3.5     VDD/3.6     VDD/4
Average power (uW)   391.16      95.64       40.93       25.6        22.35       21.14
Power ratio          -           P2.5/4.09   P2.5/9.56   P2.5/15.23  P2.5/17.5   P2.5/18.5
Power saving         -           76%         89.5%       93.45%      94.3%       94.6%
Delay (ns)           2.83        0.57        1.52        2.61        3.04        30.7
Delay ratio          -           D2.5/4.96   D2.5/1.86   D2.5/1.08   D2.5/0.93   D2.5/0.09
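A natural question for this comparison is which multicore supply voltage gives the lowest power while still matching the speed of the 2.5 V reference design. The sketch below answers it with the numbers from this slide. The selection rule (delay no larger than the reference delay) and the helper name are assumptions made for illustration.

```python
# Reference single-core design at 2.5 V (values from this slide).
REF_POWER_UW = 391.16
REF_DELAY_NS = 2.83

# Multicore operating points: (supply voltage in V, average power in uW, delay in ns)
multicore = [
    (1.25, 95.64, 0.57),
    (0.85, 40.93, 1.52),
    (0.725, 25.6, 2.61),
    (0.7, 22.35, 3.04),
    (0.625, 21.14, 30.7),
]


def lowest_power_meeting_delay(points, max_delay_ns):
    """Return the lowest-power point whose delay does not exceed max_delay_ns."""
    feasible = [p for p in points if p[2] <= max_delay_ns]
    if not feasible:
        return None
    return min(feasible, key=lambda p: p[1])


best = lowest_power_meeting_delay(multicore, REF_DELAY_NS)
saving = 1.0 - best[1] / REF_POWER_UW
print(f"best supply: {best[0]} V, power {best[1]} uW, saving {saving:.2%}")
```

With these values the rule selects 0.725 V: the multicore design is still slightly faster than the reference (2.61 ns against 2.83 ns) while its power saving matches the 93.45% entry in the table. At 0.7 V the delay of 3.04 ns no longer meets the reference.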

Page 26: Summary

• For the single-core ALU design, we get more than 60% power savings at reduced voltage, but at the cost of performance.

• With a reference of 2.5 V, power drops faster than the V^2 scaling predicts.

• With a reference of 1.25 V, the power drop is almost equal to the V^2 scaling.

• The multi-core design helps to gain the speed back at reduced voltage and consumes less power.
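The two scaling observations in the summary can be checked against the single-core tables by comparing the measured power ratio Pref/P with the ideal quadratic prediction (Vref/V)^2. The sketch below uses the average power values from the 2.5 V and 1.25 V reference tables. The function name and the tuple layout are illustrative choices.

```python
def compare_with_v_squared(vref, pref, points):
    """For each (voltage, power) point return (voltage, predicted, measured).

    predicted is the ideal quadratic ratio (vref / v)^2 and
    measured is the observed power ratio pref / p.
    """
    rows = []
    for v, p in points:
        predicted = (vref / v) ** 2
        measured = pref / p
        rows.append((v, predicted, measured))
    return rows


# Single-core average power in uW, taken from the two reference tables.
ref_2v5 = compare_with_v_squared(2.5, 391.16, [(1.25, 62.22), (0.85, 26.22), (0.625, 14.67)])
ref_1v25 = compare_with_v_squared(1.25, 62.22, [(0.85, 26.66), (0.625, 14.67)])

for label, rows in (("2.5 V reference", ref_2v5), ("1.25 V reference", ref_1v25)):
    for v, predicted, measured in rows:
        print(f"{label}, {v} V: V^2 predicts x{predicted:.2f}, measured x{measured:.2f}")
```

Against the 2.5 V reference the measured reduction is more than 1.5 times the quadratic prediction at every point, while against the 1.25 V reference the two agree to within about 10%.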

Page 27: References

• ELEC6270 Low Power Design Electronics, class slides from Dr. Agrawal.

• Spring 06, Dr. Agrawal's presentation at the VLSI D&T Seminar, "Multi-Core Parallelism for Low-Power Design".

• www.tomshardware.com

• N. H. E. Weste and D. Harris, CMOS VLSI Design, Third Edition, Reading, Massachusetts: Addison-Wesley, 2005.

• L. Shang, R. P. Dick, "Thermal crisis: challenges and potential solutions," Potentials, IEEE, vol. 25, issue 5, 2006.

• International Technology Roadmap for Semiconductors. http://public.itrs.net

• Alokik Kanwal, "A review of Carbon Nanotube Field Effect Transistors," Version 2.0, 2003.

• K. K. Likharev, "Single Electron Devices and their applications," Proc. IEEE, vol. 87, no. 4, pp. 606-632, Apr. 1999.

• A. P. Chandrakasan and R. W. Brodersen, Low Power Digital CMOS Design, Boston: Kluwer Academic Publishers (now Springer), 1995.

• "Quad-core processor forecas", Alexander Wolfe @ TechWeb.

Page 28: Thank You!!!