On The Energy Efficiency of Computation

Mihai Budiu

CMU CS

CALCM Seminar

Feb 17, 2004

Note: this version fixes some errors in the ASH performance graphs shown

Presentation Setup

main()
{
    signal(SIGINT, welcome);
    while (slides() && time()) {
        talk();
    }
}

Why Do We Care?

Toasted CPU: about 2 sec after removing cooler. (Tom’s Hardware Guide)

Power and Power Density

[Chart: leakage power and active power (Watts, axis from 0 to 250) and power density (W/cm², axis from 0 to 100) at feature sizes 0.25 µm, 0.18 µm, 0.13 µm, and 0.1 µm]

Data from Fred Pollack, Intel, MICRO 32

Assuming constant die size, no power management

Power Density Distribution

Chip surface

Data from Fred Pollack, Intel, MICRO 32

Outline

• Introduction

• Power and Energy Efficiency

– data from Bob Brodersen, Berkeley wireless group

• Synchronous Hardware Efficiency

• Asynchronous Hardware Efficiency

• ASH Efficiency

• Conclusions

Energy Efficiency Metric

How much computing can we do with a finite energy source?

Some Arithmetic

Energy and Power Efficiency

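The energy efficiency metric for energy-constrained applications (OP/nJ) is the same quantity as the power efficiency metric that captures thermal (power) considerations when maximizing throughput (MOPS/mW).

OP/nJ (operations per unit of energy, Joule) = MOPS/mW (throughput per unit of power, Watt)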
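The equality of the two metrics follows from unit conversion alone. A short derivation, using only the definition 1 W = 1 J/s and the standard SI prefixes:

```latex
\[
1\ \frac{\mathrm{MOPS}}{\mathrm{mW}}
  = \frac{10^{6}\ \mathrm{op/s}}{10^{-3}\ \mathrm{J/s}}
  = 10^{9}\ \frac{\mathrm{op}}{\mathrm{J}}
  = 1\ \frac{\mathrm{op}}{\mathrm{nJ}}
\]
```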

ISSCC Chips (0.18 µm to 0.25 µm)

#   Year  Description
1   1997  S/390
2   2000  PPC (SOI)
3   1999  G5
4   2000  G6
5   2000  Alpha
6   1998  P6
7   1998  Alpha
8   1999  PPC
9   1998  StrongArm
10  2000  Comm
11  1998  Graphics
12  1998  Multimedia
13  2000  Multimedia
14  2002  Mpg decoder
15  1998  Multimedia
16  2001  Encryption Processor
17  2000  Hearing Aid Processor
18  2000  FIR for Disk Read Head
19  1998  MPEG Encoder
20  2002  802.11a Baseband

Chip categories: Microprocessors, DSPs, Dedicated

Energy Efficiency (MOPS/mW or OP/nJ)

[Chart: energy (power) efficiency in MOPS/mW, log scale from 0.01 to 1000, versus chip number 1 to 20]

3 orders of magnitude!

Outline

• Introduction

• Power and Energy Efficiency

• Synchronous Hardware Efficiency

• Asynchronous Hardware Efficiency

• ASH Efficiency

• Conclusions

Explaining the Difference

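Operations per second:

$\mathrm{MOPS} = f_{clk} \times N_{op}$, where $N_{op}$ is the number of operations per clock.

Power:

$P_{chip} = A_{chip} \times C_{sw} \times V_{dd}^2 \times f_{clk}$, where $C_{sw}$ is the normalized switched capacitance.

Efficiency:

$\mathrm{MOPS}/P_{chip} = (f_{clk} \times N_{op}) / (A_{chip} \times C_{sw} \times V_{dd}^2 \times f_{clk}) = 1/(A_{op} \times C_{sw} \times V_{dd}^2)$, where $A_{op} = A_{chip}/N_{op}$ is the chip area per operation.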
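As a rough illustration of the efficiency formula on this slide, the sketch below evaluates 1/(Aop × Csw × Vdd²) numerically. The function name, the unit choices (mm², pF/mm², volts), and the sample values are assumptions made for illustration; they are not figures from the talk.

```python
def efficiency_ops_per_nj(a_op_mm2: float, c_sw_pf_per_mm2: float, vdd_volts: float) -> float:
    """Evaluate 1 / (Aop * Csw * Vdd^2) and return operations per nanojoule.

    With Aop in mm^2, Csw in pF/mm^2 and Vdd in volts, the product
    Aop * Csw * Vdd^2 is the switching energy per operation in picojoules.
    """
    energy_per_op_pj = a_op_mm2 * c_sw_pf_per_mm2 * vdd_volts ** 2
    # 1 nJ = 1000 pJ, so ops/nJ = 1000 / (pJ per operation).
    return 1000.0 / energy_per_op_pj


if __name__ == "__main__":
    # Hypothetical values, chosen only to show the quadratic effect of Vdd.
    for vdd in (2.5, 1.8, 1.0):
        print(f"Vdd = {vdd} V -> {efficiency_ops_per_nj(1.0, 50.0, vdd):.2f} ops/nJ")
```

The clock frequency is not an argument because it cancels in the ratio of MOPS to power.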

Supply Voltage, Vdd

[Chart: supply voltage $V_{dd}$ (Volts), axis from 0 to 3, versus chip number 1 to 20]

$\mathrm{MOPS}/P_{chip} = 1/(A_{op} \times C_{sw} \times V_{dd}^2)$

Normalized Switched Capacitance, Csw

[Chart: $C_{sw}$ (pF/mm²), axis from 10 to 110, versus chip number 1 to 20]

$\mathrm{MOPS}/P_{chip} = 1/(A_{op} \times C_{sw} \times V_{dd}^2)$

3x

Area per operation, Aop

[Chart: $A_{op}$ (mm² per operation), log scale from 0.01 to 1000, versus chip number 1 to 20]

$A_{op} = A_{chip}/N_{op}$

$\mathrm{MOPS}/P_{chip} = 1/(A_{op} \times C_{sw} \times V_{dd}^2)$

AHA!

Focusing In

[Chart: energy (power) efficiency (MOPS/mW), log scale from 0.01 to 1000, versus chip number 1 to 20, with three chips labeled: PPC, NEC DSP, and 802.11a]

µP: MOPS/mW = 0.13

Useful arithmetic

$N_{op} = 2$ (two ways)

$f_{clock} = 450\ \mathrm{MHz} \Rightarrow 900\ \mathrm{MIPS}$

$A_{op} = A_{chip}/2 = 42\ \mathrm{mm}^2$

Power = 7 Watts

DSP: MOPS/mW = 7

4 processors × 4 ops each: $N_{op} = 16$

$f_{clock} = 50\ \mathrm{MHz} \Rightarrow 800\ \mathrm{MOPS}$

$A_{op} = A_{chip}/16 = 5.3\ \mathrm{mm}^2$

Power = 110 mW

Dedicated Design: MOPS/mW = 200

$N_{op} = 96$

$f_{clock} = 25\ \mathrm{MHz} \Rightarrow 2400\ \mathrm{MOPS}$

$A_{op} = 5.4\ \mathrm{mm}^2/96 = 0.15\ \mathrm{mm}^2$

Power = 12 mW

Complex MAC = 8 ops

Fully parallel mapping of adaptive correlator algorithm.
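The headline figures for the microprocessor, the DSP, and the dedicated design can be rechecked from the quoted parameters. The sketch below recomputes throughput as Nop × fclock and efficiency as throughput over power. The `Design` class and its field names are illustrative assumptions; the numbers are the ones quoted on the slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Design:
    name: str
    n_op: int            # operations per clock
    f_clock_mhz: float   # clock frequency in MHz
    power_mw: float      # chip power in mW

    @property
    def mops(self) -> float:
        # MOPS = fclock (in MHz) * Nop
        return self.f_clock_mhz * self.n_op

    @property
    def mops_per_mw(self) -> float:
        return self.mops / self.power_mw


DESIGNS = [
    Design("microprocessor", n_op=2, f_clock_mhz=450, power_mw=7000),
    Design("DSP", n_op=16, f_clock_mhz=50, power_mw=110),
    Design("dedicated design", n_op=96, f_clock_mhz=25, power_mw=12),
]

if __name__ == "__main__":
    for d in DESIGNS:
        print(f"{d.name}: {d.mops:.0f} MOPS, {d.mops_per_mw:.2f} MOPS/mW")
```

The three ratios come out close to the 0.13, 7, and 200 MOPS/mW quoted in the slide titles.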

Memory is More Power-Efficient

[Chart: power density (Watts/cm²), log scale from 1 to 100, for logic and memory at feature sizes 0.25 µm, 0.18 µm, 0.13 µm, and 0.1 µm]

Hint: use on-chip caches

Energy Distribution in µP

Integer execution: 19%
Reservation stations: 10%
Reorder buffer: 15%
Memory order buffer: 8%
Data cache: 14%
Branch target buffer: 6%
Floating point execution: 10%
Global clock: 10%
Register alias table: 8%

“useful” (includes local clock)

Efficiency and Performance

• Vdd ↓ → fclock ↓, MOPS ↓, Power ↓, MOPS/mW ↑

• Better metric: Energy × delay

– Roughly independent of Vdd

Efficiency and Technology

[Chart: MOPS/mW, log scale from 0.001 to 1000, versus feature size in µm (2, 1, 0.5, 0.25, 0.13, 0.1, 0.07), with curves for hardwired logic, DSP, and microprocessors]

[T. Claasen, ISSCC 1999]

How Low Can You Go?

• Energy required to compute is ZERO

• If computation is quasistatic...

• ...and no information is destroyed (reversible)

Ops/nJ → ∞

Rolf Landauer
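For contrast with the reversible limit, Landauer's principle puts a floor of kT ln 2 on the energy dissipated per irreversibly erased bit. The sketch below evaluates that bound. The 300 K temperature and the function name are assumptions chosen for illustration, and treating one erased bit as one "operation" is a simplification.

```python
import math

BOLTZMANN_J_PER_K = 1.380649e-23  # exact SI value of the Boltzmann constant


def landauer_limit_joules(temperature_kelvin: float) -> float:
    """Minimum energy dissipated when one bit is irreversibly erased."""
    return BOLTZMANN_J_PER_K * temperature_kelvin * math.log(2)


if __name__ == "__main__":
    e_bit = landauer_limit_joules(300.0)  # assumed room temperature
    print(f"kT ln 2 at 300 K: {e_bit:.3e} J per erased bit")
    print(f"Upper bound: {1e-9 / e_bit:.3e} bit erasures per nJ")
```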

Outline

• Introduction

• Power and Energy Efficiency

• Synchronous Hardware Efficiency

• Asynchronous Hardware Efficiency

• ASH Efficiency

• Conclusions

Lutonium Performance

• Asynchronous microcontroller

• Designed and implemented at Caltech

• 0.18 µm technology

• 1.8V supply, 0.4V/0.5V threshold voltages

• 200 MIPS

• 1.8 ops/nJ (DSP-like)

Alain Martin

Efficiency and Supply Voltage

Supply voltage   Performance (MIPS)   Efficiency (MIPS/mW)
1.8V             200                  1.8
1.1V             100                  4.83
0.9V             66                   7.2
0.8V             48                   10.9
0.5V             4                    23
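The chart's data labels also allow the "energy × delay" metric from earlier in the talk to be probed. The sketch below assumes the labels pair with the supply voltages so that performance falls and efficiency rises as the voltage drops. That pairing, and all names in the code, are assumptions of this example rather than something the slide states.

```python
# (supply voltage in V, performance in MIPS, efficiency in MIPS/mW)
# The pairing of data labels to voltages is an assumption (monotonic ordering).
LUTONIUM_POINTS = [
    (1.8, 200.0, 1.8),
    (1.1, 100.0, 4.83),
    (0.9, 66.0, 7.2),
    (0.8, 48.0, 10.9),
    (0.5, 4.0, 23.0),
]


def power_mw(mips: float, mips_per_mw: float) -> float:
    """Power implied by a performance/efficiency pair."""
    return mips / mips_per_mw


def energy_delay(mips: float, mips_per_mw: float) -> float:
    """Energy per instruction (nJ) times time per instruction (us)."""
    energy_nj = 1.0 / mips_per_mw   # 1 MIPS/mW = 1 instruction per nJ
    delay_us = 1.0 / mips           # 1 MIPS = 1 instruction per microsecond
    return energy_nj * delay_us


if __name__ == "__main__":
    for vdd, mips, eff in LUTONIUM_POINTS:
        print(f"{vdd:.1f} V: {power_mw(mips, eff):7.2f} mW, "
              f"E x D = {energy_delay(mips, eff):.5f} nJ*us")
```

Under this reading, the energy-delay product stays within a factor of about 1.5 between 1.8 V and 0.8 V and rises sharply only at 0.5 V, close to the threshold voltages.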

Async Processor Breakdown

ALU: 2%
Registers: 14%
Decode: 24%
I-Mem: 24%
I-Fetch: 24%
Slack: 6%
Buses: 2%
PSW: 4%

“useful”

Outline

• Introduction

• Power and Energy Efficiency

• Synchronous Hardware Efficiency

• Asynchronous Hardware Efficiency

• ASH Efficiency

• Conclusions

Application-Specific Hardware

C code

Compiler for Application-Specific Hardware

Asynchronous Circuits

Memory

Tool-Flow

C

CASH core

Verilog back-end

Synopsys, Cadence P/R

ASIC

180nm std. cell library, 2V

~1999 technology

Mediabench kernels (1 hot function/benchmark)

Memory

Caveat

Memory

we model this part accurately

optimistic speed model, no power accounting

ASH Performance

[Chart: megaoperations per second, axis from 0 to 3000, for the Mediabench kernels adpcm_d, adpcm_e, g721_d, g721_e, gsm_d, gsm_e, jpeg_e, mpeg2_d, mpeg2_e, and pegwit_d; series: MOPSall, MOPSspec, MOPS]

ASH vs 600MHz CPU

ASH Area

[Chart: area in square mm, axis from 0 to 9, for the kernels adpcm_d, adpcm_e, g721_d, g721_e, gsm_d, gsm_e, jpeg_e, mpeg2_d, mpeg2_e, and pegwit_d, with the annotation "minimal RISC core"]

Normalized Area

[Chart: source lines per square mm, axis from 0 to 100, for the kernels adpcm_d, adpcm_e, g721_d, g721_e, gsm_d, gsm_e, jpeg_e, mpeg2_d, mpeg2_e, and pegwit_d, with the annotation "many C macros"]

ASH Energy Efficiency

[Chart: useful operations per nJ, axis from 0 to 70, for the kernels adpcm_d, adpcm_e, g721_d, g721_e, gsm_d, gsm_e, jpeg_e, mpeg2_d, mpeg2_e, and pegwit_d]

All Together Now

[Chart: energy efficiency (MOPS/mW or OP/nJ), log scale from 0.01 to 1000, comparing microprocessors, general-purpose DSP, asynchronous microcontroller, ASH media kernels, and dedicated hardware]

Conclusions

• Performance comes at a price

• Energy efficiency is expressed in ops/nJ or MOPS/mW

• Dedicated hardware is more power-efficient than microprocessors

• ASH efficiency is competitive with dedicated hardware