energy efficient computing - 26mar13


Upload: ian-phillips

Post on 06-May-2015


DESCRIPTION

(See: http://youtu.be/9rP-5TSk_dA) Electronic Systems support every aspect of our lives today, both visibly and invisibly. Numbered in their tens of billions, these are the dominant form of computing we now experience. And whilst many dissipate just milliwatts, their sheer volume makes them a significant consumer of energy in their own right. Energy Efficiency in Computing has moved from the mainframe to become a consumer issue. By Ian Phillips, http://ianp24.blogspot.co.uk/. Opinions expressed are my own. (Moved from SlideShare 10mar14 with 1064 views.)

TRANSCRIPT

Page 1: Energy Efficient Computing - 26mar13

1

Energy Efficient Computing - Through a 21c Looking Glass.

Abstract: With the assistance of its global partners, ARM shipped 8.7 billion CPUs in 2012; a number which continues to grow at around 20% pa. The 40B we have shipped to date outnumber the total of PCs more than 50 times; and today more than 75% of the things connected to the Internet are ARM based. The dominant nature of Computing in the 21c is very different to that of the Mainframe era. It is sobering to think that if each of those 8.7B CPUs were to dissipate just 100 mW, then it would require the output of two modern power stations to drive them; with 2.4 next year, and 3 the year after that! So Electronic Systems are also defining where the real Energy Efficient Computing issue is! But with such a small footprint it must be easy to measure and manage power optimisation? An increasing percentage of these are immensely complex systems, running significant multi-tasking and multi-threaded operating systems on platforms which include multi-processor CPU/GPU configurations, and GB of memory. Whilst their minimum dissipations are a few µW, their peak power exceeds the silicon's ability to dissipate it; so the penalty for power-unaware software design is huge. What has been done to manage this in Electronic Systems design, and what lessons can be transferred to the Classic Computing domain?
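The power-station comparison in the abstract can be checked with a few lines of arithmetic. This is only a sketch: the output of one "modern power station" is not stated in the talk, so it is inferred here from the claim that two stations cover the 2012 volume (roughly 435 MW each), and the function and constant names are illustrative.

```python
# Figures taken from the abstract.
CPUS_SHIPPED_2012 = 8.7e9   # CPUs shipped in 2012
WATTS_PER_CPU = 0.1         # "just 100 mW" each
ANNUAL_GROWTH = 0.20        # "around 20% pa"


def fleet_power_watts(years_after_2012):
    """Total dissipation of one year's shipments, assuming 100 mW per CPU."""
    shipped = CPUS_SHIPPED_2012 * (1 + ANNUAL_GROWTH) ** years_after_2012
    return shipped * WATTS_PER_CPU


# Assumption: one station = half of the 2012 total, because the abstract
# says two stations are needed for the 2012 volume.
STATION_WATTS = fleet_power_watts(0) / 2


def stations_needed(years_after_2012):
    """Number of such stations needed for that year's shipments."""
    return fleet_power_watts(years_after_2012) / STATION_WATTS


if __name__ == "__main__":
    for k in range(3):
        print(2012 + k, round(fleet_power_watts(k) / 1e6), "MW,",
              round(stations_needed(k), 2), "stations")
```

Under these assumptions the 2012 volume comes to about 870 MW, and 20% annual growth gives about 2.4 and 2.9 stations in the following two years, in line with the "2.4" and "3" quoted.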

Context: 40min Keynote at the Energy Efficient Computing Workshop at University of Bristol, UK. 26mar13

By: The TSB’s Energy Efficient Computing SIG (EEC-SIG) and the UoBristol Energy Aware COmputing (EACO) initiative

https://connect.innovateuk.org/web/eec ..and.. http://www.Cs.bris.ac.uk/Research/Micro/eaco.jsp

SlideCast and pdf available via http://ianp24.blogspot.co.uk/

Page 2: Energy Efficient Computing - 26mar13

2

Prof. Ian Phillips Principal Staff Eng’r,

ARM Ltd

Visiting Prof. at ...

Contribution to Industry Award 2008

Energy Efficient Computing Workshop Uo.Bristol 26mar13

1v1

Page 3: Energy Efficient Computing - 26mar13

3

Our 21c World ...

Page 4: Energy Efficient Computing - 26mar13

4

Electronic Systems are Everywhere ...

Page 5: Energy Efficient Computing - 26mar13

5

Electronic Systems are Everywhere ...

Bringing Embedded Intelligence to the Consumer Market, has changed the face of Computing!

Page 6: Energy Efficient Computing - 26mar13

6

Source: Adapted from Morgan Stanley, Nov 2009

Electronic Systems Will Create Our Future

‘Old Drivers’ don’t go away ... they don’t dominate any longer.

Today: ~2% of our Energy Use goes on Computing and Electronics! [1] ... Tomorrow: It could easily be 20%!

[1] National Academy of Sciences

Page 7: Energy Efficient Computing - 26mar13

7

ARM in the Digital World

1998 2012 2020

40+ billion CPUs to date

150+ billion CPUs cumulative by 2020

http://www.arm.com/

8.7B CPUs shipped in 2012 (Growing ~20% pa)

75% of the things connected to the Internet today are ARM Powered! (Gartner)

Page 8: Energy Efficient Computing - 26mar13

8

Moore’s Law ...

[Chart: Approximate Process Geometry (10nm to 100um), Transistors/Chip (M) and Transistor/PM (K) over time. Source: ITRS’99]

... x More Functionality on a Si Chip in 20 yrs!

Gordon Moore, co-founder of Intel (1965)

http://en.wikipedia.org/wiki/Moore's_law

Page 9: Energy Efficient Computing - 26mar13

9

Is HPC The Pinnacle of Computing?

Page 10: Energy Efficient Computing - 26mar13

10

... Or the Cloud?

Page 11: Energy Efficient Computing - 26mar13

11

... Or the iGadget?

Page 12: Energy Efficient Computing - 26mar13

12

A Machine for Computing ... Computing: A general term for algebraic manipulation of data ...

... State and Time are normally factors in this.

It can include phenomena ranging from human thinking to calculations with a narrower meaning. Wikipedia

Usually used to exercise analogies (models) of real-world situations; frequently in real-time (fast enough to be a stabilising factor in a loop).

... So what part does Hardware and Software play? ... And what about Energy?

IN (x): Numerated Phenomena -> y=F(x,t,s) -> OUT (y): Processed Data/Information
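As one illustration of y=F(x,t,s), the sketch below treats computing as a function of the input x, the time t and some stored state s. The choice of a simple smoothing filter, and the names `tau`, `state["t"]` and `state["y"]`, are assumptions made for the example and are not taken from the slides.

```python
def F(x, t, s):
    """One computing step: the output depends on input x, time t and state s.

    s is a dict holding the previous time "t", the previous output "y" and
    a time constant "tau" (all hypothetical names for this sketch).
    """
    dt = t - s["t"]
    # Move the output towards the input; the longer since the last sample,
    # the larger the step (capped at a full step).
    s["y"] += (x - s["y"]) * min(1.0, dt / s["tau"])
    s["t"] = t
    return s["y"]


if __name__ == "__main__":
    state = {"t": 0.0, "y": 0.0, "tau": 2.0}
    for t in (1.0, 2.0, 3.0):
        print(t, F(10.0, t, state))
```

The same input (10.0) gives a different output at each call, because both State and Time are factors in the result.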

Page 13: Energy Efficient Computing - 26mar13

13

Antikythera c87BC ... Planet Motion Computer

See: http://www.youtube.com/watch?v=L1CuR29OajI

Mechanical Technology

• Inventor: Hipparchos (c.190 BC – c.120 BC). Ancient Greek Astronomer, Philosopher and Mathematician.

• Single-Task, Continuous Time, Analogue Mechanical Computing (With backlash!)

Page 14: Energy Efficient Computing - 26mar13

14

Orrery c1700 ... Planet Motion Computer

• Inventor: George Graham (1674-1751). English Clock-Maker.

• Single-Task, Continuous Time, Analogue Mechanical Computing (With backlash!)

Mechanical Technology

Page 15: Energy Efficient Computing - 26mar13

15

Babbage's Difference Engine 1837

The difference engine consists of a number of columns, numbered from 1 to N. Each column is able to store one decimal number. The only operation the engine can do is add the value of a column n + 1 to column n to produce the new value of n. Column N can only store a constant, column 1 displays (and possibly prints) the value of the calculation on the current iteration.
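The column-addition rule described above can be sketched in a few lines. This is a software illustration of the method of finite differences, not a model of Babbage's mechanism; the function name and the zero-based list (index 0 stands for column 1) are choices made for the example.

```python
def difference_engine(initial_columns, steps):
    """Tabulate a polynomial using only additions.

    initial_columns[0] is column 1 (the displayed value); the last entry
    is column N, which holds a constant difference and is never changed.
    """
    cols = list(initial_columns)
    results = [cols[0]]
    for _ in range(steps):
        # Add column n+1 into column n, working from column 1 upwards so
        # that each addition uses the neighbour's value from the previous
        # iteration.
        for n in range(len(cols) - 1):
            cols[n] += cols[n + 1]
        results.append(cols[0])
    return results


if __name__ == "__main__":
    # f(x) = x^2: f(0) = 0, first difference 1, constant second difference 2.
    print(difference_engine([0, 1, 2], 4))
```

Starting from the value 0 and the differences 1 and 2, the engine produces the table of squares without ever multiplying.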

Computer for Calculating Tables: A Basic ALU Engine

(Re)construction c2000

Mechanical Technology

Page 16: Energy Efficient Computing - 26mar13

16

“Enigma” c1940

Data Encryption/Decryption Computer

Mechanical Technology

Page 17: Energy Efficient Computing - 26mar13

17

“Colossus” 1944

Code-Breaking Computer: A Data Processor

Valve/Mechanical Technology

Page 18: Energy Efficient Computing - 26mar13

18

“Baby” 1948 (Reconstruction)

General Purpose, Quantised Time and Data, (Digital) Electronic Computing

Valve/Software Technology

Page 19: Energy Efficient Computing - 26mar13

19

The Analogue Computer

BTH Crystal Set, c1925: 1 Diode

Tele-Verta Radio, c1945: 4 Valves, 1 Rectifier Valve

Bush Radio, c1960: 7 Transistors, 1 Diode

Evoke DAB Radio, c2005: 100 M Transistors, 2-3 Embedded Processors

Page 20: Energy Efficient Computing - 26mar13

20

Radio as Computation ...

Vi -> Vrf=Vi*100 -> Vif=Vrf*Vlo -> Vro='Bandpass'(Vif*1000), where Vlo=Cos(t*1^6)

Single-Task (Embedded), Real-Time, Analogue (Close-Enough) Computing

Valve Technology, Transistor Technology, ‘Integrated Circuit’ Technology
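Why the multiplication Vif=Vrf*Vlo followed by a band-pass filter selects a station can be seen from a standard trigonometric identity. The derivation below is a sketch under assumptions not stated on the slide: the received signal is taken to be a single tone of amplitude A at angular frequency \omega_{rf}, and the local oscillator a unit cosine at \omega_{lo}.

```latex
\begin{aligned}
V_{if} &= V_{rf}\,V_{lo}
        = A\cos(\omega_{rf} t)\,\cos(\omega_{lo} t) \\
       &= \tfrac{A}{2}\Big[\cos\big((\omega_{rf}-\omega_{lo})\,t\big)
        + \cos\big((\omega_{rf}+\omega_{lo})\,t\big)\Big]
\end{aligned}
```

The product contains a difference-frequency term and a sum-frequency term; a band-pass filter centred on the difference frequency keeps the first and rejects the second, which is the 'Bandpass' step in the chain.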

Page 21: Energy Efficient Computing - 26mar13

21

The Pinnacle is Era and Application Related ...

Computing: is just Creating Output from Input ... Architecture: is the way this is done on the day.

It is the Most Important Product Decision! (HW, SW, Analogue, Optics, Graphene, Mechanics, Steam, etc)

Page 22: Energy Efficient Computing - 26mar13

22

Computation in a Cool iCon ...

Page 23: Energy Efficient Computing - 26mar13

23

A lot of Cool Stuff in a Smart Phone ...

... Computation in many forms

Page 24: Energy Efficient Computing - 26mar13

24

Take a Look Inside...

http://www.ifixit.com

The Control Board.

Level-1: Modules

Page 25: Energy Efficient Computing - 26mar13

25

Inside The Control Board (a-side)

http://www.ifixit.com

Level-2: Sub-Assemblies

Visible Computing Contributors ...
• Samsung: Flash Memory - NV-MOS (ARM Partner)
• Cirrus Logic: Audio Codec - Bi-CMOS (ARM Partner)
• AKM: Magnetic Sensor - MEM-CMOS
• Texas Instruments: Touch Screen Controller and mobile DDR - Analogue-CMOS (ARM Partner)
• RF Filters - SAW Filter Technology

Invisible Computing Contributors ...
• OS, Drivers, Stacks, Applications, GSM, Security, Graphics, Video, Sound, etc
• Software Tools, Debug Tools, etc

Page 26: Energy Efficient Computing - 26mar13

26

Inside The Control Board (b-side)

GPS Bluetooth, EDR &FM

http://www.ifixit.com

Level-2: Sub-Assemblies

More Visible Computing Contributors ...
• A4 Processor. Spec: Apple, Design & Mfr: Samsung, Digital-CMOS (nm) ... Provides the iPhone 4 with its GP computing power. (Said to contain ARM A8 600 MHz CPU and other ARM IP)
• ST-Micro: 3 axis Gyroscope - MEM-CMOS (ARM Partner)
• Broadcom: Wi-Fi, Bluetooth, and GPS - Analogue-CMOS (ARM Partner)
• Skyworks: GSM - Analogue-Bipolar
• Triquint: GSM PA - Analogue-GaAs
• Infineon: GSM Transceiver - Analogue/Digital-CMOS (ARM Partner)

Page 27: Energy Efficient Computing - 26mar13

27

Level-3: Processor (NVIDIA Tegra 3, around 1B transistors)

NB: The Tegra 3 is similar to the A4/5, but not used in the iPhone

Page 28: Energy Efficient Computing - 26mar13

28

Architecting your Product ...
• It is the cumulative non-functional choices made to support the functional need
• A Good Architecture is the one that ‘survives’
• History is written by the winners (2nd is for losers)
• Component Performance may be ‘poor’ as long as System Performance is ‘better’ for its use.

Architectural Options ...
• Business Model (Cost-of-Ownership, ROI), TTM (Productivity, History, IP-Availability, Know-How), Aesthetics (Power, Quality, Behaviour, Appearance)
• Analogue, Digital, Mechanical, Optical, RF, Software, Plastics, Metal-forming, Manufacturing, Glass, ...
• More than 99% of a Product is Reused from its Predecessor

... is assumed (working is expected!) ... It used to be the only consideration!

Page 29: Energy Efficient Computing - 26mar13

29

Power Philosophy

Hardware Dissipates The Power ...
• Choose the Underlying Technology for best power efficiency.
• One size does not fit all (Products, Applications or Instances)

... But Software Tells It To!
• Chips can melt-down under software ‘instruction’
• Make computing hardware power as ‘Activity’ dependent as possible: Zero Activity => Zero Power
• Make OS/Apps aware of the power/performance situation, and their options for controlling it (Indicators and levers)

Avoid Moving Data
• It is becoming the dominant energy consumption in a system
• Energy ∝ DataVolume x Speed x Distance^(>2 (3))
• Bring the processing to the data

... Think System: It’s how the ‘box’ performs, not the components
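The data-movement relation on this slide can be explored numerically. The sketch below reads the garbled exponent as "somewhere above 2, possibly 3", which is an interpretation rather than something the slide states clearly; the function name and the unit-free inputs are illustrative, so only ratios between results are meaningful.

```python
def relative_move_energy(volume, speed, distance, exponent=2.0):
    """Relative (unit-free) energy of moving data.

    Follows Energy ∝ DataVolume x Speed x Distance^n; the exponent n is an
    assumption (the slide suggests more than 2, perhaps 3).
    """
    return volume * speed * distance ** exponent


if __name__ == "__main__":
    near = relative_move_energy(1.0, 1.0, 1.0)
    for n in (2.0, 3.0):
        far = relative_move_energy(1.0, 1.0, 10.0, exponent=n)
        print("10x the distance, exponent", n, "->", far / near, "x the energy")
```

Under this reading, moving the same data ten times further costs between a hundred and a thousand times the energy, which is the argument for bringing the processing to the data.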

Page 30: Energy Efficient Computing - 26mar13

30

All ARM Processors are Power Efficient

Page 31: Energy Efficient Computing - 26mar13

31

Choose The Horses for The Course

... Delivering ~5x speed (Architecture + Process + Clock)

About 50MTr

About 50KTr

Page 32: Energy Efficient Computing - 26mar13

32

Parallel is More Efficient

One Processor at frequency f (Input -> Processor -> Output):
Capacitance = C, Voltage = V, Frequency = f, Power = CV²f

Two Processors, each at f/2 (Input -> 2 x Processor -> Output at f):
Capacitance = 2.2C, Voltage = 0.6V, Frequency = 0.5f, Power = 0.4CV²f

... The limit determined by Amdahl’s or Gustafson’s Law
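The 0.4CV²f figure follows directly from P = CV²f, and the two laws named above bound how far the trick can be pushed. The sketch below checks the arithmetic and states both laws in their usual textbook forms; the function names and the 90% parallel fraction in the demo are illustrative assumptions.

```python
def dynamic_power(capacitance, voltage, frequency):
    """Dynamic power P = C * V^2 * f, in units relative to C = V = f = 1."""
    return capacitance * voltage ** 2 * frequency


def amdahl_speedup(parallel_fraction, processors):
    """Amdahl's Law: speedup for a fixed-size problem."""
    return 1.0 / ((1.0 - parallel_fraction) + parallel_fraction / processors)


def gustafson_speedup(parallel_fraction, processors):
    """Gustafson's Law: scaled speedup when the problem grows with the machine."""
    return (1.0 - parallel_fraction) + parallel_fraction * processors


if __name__ == "__main__":
    single = dynamic_power(1.0, 1.0, 1.0)
    # Slide figures for the two-processor case: 2.2C, 0.6V, 0.5f.
    dual = dynamic_power(2.2, 0.6, 0.5)
    print("parallel / single power:", dual / single)
    print("Amdahl, 90% parallel, 2 processors:", amdahl_speedup(0.9, 2))
    print("Gustafson, 90% parallel, 2 processors:", gustafson_speedup(0.9, 2))
```

The product 2.2 x 0.36 x 0.5 is 0.396, which the slide rounds to 0.4. The saving assumes the work splits cleanly across both processors; any serial portion reduces the achievable throughput.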

Page 33: Energy Efficient Computing - 26mar13

33

Multicore ARM On-Chip ...

Heterogeneous Multicore Systems have been in ARM for a long time:

[Diagram: Cortex™-A8 (Application), Mali™-400 MP (UI & 3D graphics) and Cortex-M3 (Power Manager), joined by an Interconnect to Memory]

Page 34: Energy Efficient Computing - 26mar13

34

Coherent Multicore Cluster ...

Homogeneous Multicore cluster, as part of a heterogeneous system:

[Diagram: Cortex-A9 ... Cortex-A9 cluster with Coherency Logic; Mali-400 MP (User Interface and 3D graphics); Cortex-M3 (Power Manager); Interconnect]

Page 35: Energy Efficient Computing - 26mar13

35

Multiple Clusters ...

Multiple Homogeneous Coherent Clusters

[Diagram: two clusters of Cortex-A15 ... Cortex-A15, each with Coherency Logic in L2 Cache, joined by a Coherent Interconnect]

Page 36: Energy Efficient Computing - 26mar13

36

Today’s Consumers require a pocket ‘Super-Computer’ ...

Silicon Technology Provides a Billion transistors ...

It will be supported with a few GB of memory ...

Computer On a Chip c2010 ...

• Typically 10 Processors ...
  • 4 x A9 Processors (2x2)
  • 4 x MALI 400 Frag. Proc
  • 1 x MALI 400 Vertex Proc
  • 1 x MALI Video CoDec
• Software Stacks, OS’s and Design Tools

• ARM Technology gives chip/system designers ...
  • Improved Productivity
  • Improved TTM
  • Improved Quality/Certainty

http://www.arm.com/

Page 37: Energy Efficient Computing - 26mar13

37

CoreLink™ CCN-504 and DMC-520

[Diagram labels: Quad Cortex-A15 with L2 cache (x4), ACE, CoreLink™ CCN-504 Cache Coherent Network, Snoop Filter, 8-16MB L3 cache, CoreLink™ DMC-520 with x72 DDR4-3200 (x2), NIC-400 Network Interconnect, Flash, GPIO, USB, SATA, PCIe, 10-40GbE, DPI, Crypto, DSP (x3), AHB, PHY, Interrupt Control, IO Virtualisation with System MMU]

• Dual channel DDR3/4 x72
• Up to 4 cores per cluster
• Up to 4 coherent clusters
• Integrated L3 cache
• Up to 18 AMBA interfaces for I/O coherent accelerators and IO
• Peripheral address space
• Heterogeneous processors – CPU, GPU, DSP and accelerators
• Virtualized Interrupts
• Uniform System memory

Page 38: Energy Efficient Computing - 26mar13

38

C/C++ Development

Middleware

Debug & Trace

Methodology As Well As Hardware

Energy Trace Modules

Page 39: Energy Efficient Computing - 26mar13

39

Power Management For Single-Processor systems, and Peripheral Circuitry...

Variable/Gated clock domains

Variable/Switched power domains

Maximises power efficiency by ...
• Minimising voltage/frequency (P=CV²f) so that the processor has just enough performance for the current application need
• Controlled by the OS and the Application SW
• Maximises ‘Activity Power’ dependence
• Apply on/off-chip zones ...

Methodology
• Retention Flops/Latches, Level Shifters, Power-Switch Cells, PLLs
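The "just enough performance" idea can be sketched as a lookup over voltage/frequency operating points. The table values and function names below are hypothetical and do not describe any particular ARM device or operating-system governor; capacitance is treated as constant, so only the ratio between power figures is meaningful.

```python
# Hypothetical operating points: (frequency in Hz, supply voltage in V).
OPERATING_POINTS = [(200e6, 0.8), (600e6, 1.0), (1000e6, 1.2)]


def pick_operating_point(required_hz, points=OPERATING_POINTS):
    """Return the slowest (frequency, voltage) pair that meets the demand.

    Falls back to the fastest point if the demand cannot be met.
    """
    for freq, volt in sorted(points):
        if freq >= required_hz:
            return freq, volt
    return max(points)


def relative_power(freq, volt):
    """P = C * V^2 * f with C taken as 1 (relative units)."""
    return volt ** 2 * freq


if __name__ == "__main__":
    needed = 500e6  # the application currently needs about 500 MHz
    chosen = pick_operating_point(needed)
    fastest = max(OPERATING_POINTS)
    print("chosen:", chosen)
    print("power vs always-fastest:",
          relative_power(*chosen) / relative_power(*fastest))
```

Because voltage enters as a square, dropping to a lower operating point saves proportionally more power than the frequency reduction alone would suggest.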

Page 40: Energy Efficient Computing - 26mar13

40

big.LITTLE Processing For High-Performance systems...

Tightly coupled combination of two ARM CPU clusters:
• Cortex-A15 and Cortex-A7 - functionally identical
• Same programmer's view, looks the same to OS and applications

big.LITTLE combines high-performance and low power
• Automatically selects the right processor for the right job
• Redefines the efficiency/performance trade-off

big: “Demanding tasks”

LITTLE: “Always on, always connected tasks”

[Chart: big.LITTLE vs current smartphone: 30% of the Power (select use cases); >2x Performance]

Page 41: Energy Efficient Computing - 26mar13

41

Fine-Tuned to Different Performance Points

LITTLE: Cortex-A7, Cortex-A53
• Simple, in-order, 8 stage pipelines
• Performance better than mainstream, high-volume smartphones (Cortex-A8 and Cortex-A9)
• Most energy-efficient applications processor from ARM

big: Cortex-A15, Cortex-A57
• Complex, out-of-order, multi-issue pipelines
• Up to 2x the performance of today’s high-end smartphones
• Highest performance in mobile power envelope

[Pipeline diagram labels: Queue, Issue, Integer]

Page 42: Energy Efficient Computing - 26mar13

42

big.LITTLE Software

CPU Migration
• Migrate a single processor workload to the appropriate CPU
• Migration = save context then resume on another core
• Also known as Linaro “In Kernel Switcher”
• DVFS driver modifications and kernel modifications
• Based on standard power management routines
• Small modification to OS and DVFS, ~600 lines of code

big.LITTLE MP
• OS scheduler moves threads/tasks to appropriate CPU
• Based on CPU workload
• Based on dynamic thread performance requirements
• Enables highest peak performance by using all cores at once
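The CPU-migration idea can be illustrated with a toy decision rule that moves a workload between clusters according to measured load. This is not the Linaro In Kernel Switcher or any real scheduler; the thresholds, the hysteresis band and all names are assumptions chosen only to show the shape of the logic.

```python
# Hypothetical load thresholds (fraction of LITTLE-cluster capacity in use).
UP_THRESHOLD = 0.85     # above this, move the workload to the big cluster
DOWN_THRESHOLD = 0.30   # below this, move it back to the LITTLE cluster


def next_cluster(current, load):
    """Decide which cluster should run the workload next.

    The gap between the two thresholds provides hysteresis, so the workload
    does not bounce between clusters on small load changes.
    """
    if current == "LITTLE" and load > UP_THRESHOLD:
        return "big"
    if current == "big" and load < DOWN_THRESHOLD:
        return "LITTLE"
    return current


def run(loads, start="LITTLE"):
    """Apply the rule to a sequence of load samples and return the trace."""
    cluster = start
    trace = []
    for load in loads:
        cluster = next_cluster(cluster, load)
        trace.append(cluster)
    return trace


if __name__ == "__main__":
    print(run([0.2, 0.9, 0.5, 0.1]))
```

In the demo the workload stays on the big cluster at a load of 0.5 and only returns to LITTLE once the load falls below the lower threshold.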

Page 43: Energy Efficient Computing - 26mar13

43

Bringing the Processing to the Data …

Press Claims:
• 288 server nodes in a 4U rack space. Public Source: http://www.engadget.com/2011/11/02/hp-and-calxedas-moonshot-arm-servers-will-bring-all-the-boys-to/
• Dell + Marvell, Copper
• BaiDu + Marvell, Baserock

Page 44: Energy Efficient Computing - 26mar13

44

... Refining Data into Information

Page 45: Energy Efficient Computing - 26mar13

45

Transferable Lessons to GP Software

Moving data is Power Expensive ...
• Don’t move data; use it locally (Cache it)
• Refine it once, use it often (Pre-Process it)

Your CPU Power is work-load independent ...
• So, get in; get the work done; and get out.
• Maximise the workload of your code; terminate when complete.

Make your Processing work-load dependent
• Use a Hypervisor and turn off (at least free) processors not in use.
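"Refine it once, use it often" maps naturally onto memoisation in general-purpose software. The sketch below uses Python's standard `functools.lru_cache`; the `refine` function, the call counter and the sample data are hypothetical stand-ins for an expensive pre-processing step.

```python
from functools import lru_cache

# Counts how many times the expensive refinement really runs.
REFINE_CALLS = {"count": 0}


@lru_cache(maxsize=None)
def refine(raw):
    """Stand-in for an expensive step that turns raw data into information.

    raw must be hashable (a tuple here) so that results can be cached.
    """
    REFINE_CALLS["count"] += 1
    return sum(raw) / len(raw)


def report(raw, repeats):
    """Use the refined value many times; only the first use does the work."""
    return [refine(raw) for _ in range(repeats)]


if __name__ == "__main__":
    print(report((1, 2, 3), 5))
    print("refinements performed:", REFINE_CALLS["count"])
```

The refined result is computed once and then served from the cache, so repeated use costs neither recomputation nor a second pass over the raw data.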

Page 46: Energy Efficient Computing - 26mar13

46

Society's Challenges in the 21c

Urbanisation (Smart Cities), Health (eHealth), Transport, Energy (Smart Grid), Security, Environment, Food/Water, Ageing Society, Sustainability, Digital Inclusion, Economics

And whilst our technologies will be an essential part of all solutions, they cannot fix them without Society’s help and cooperation!

... Energy Efficient Computing will help but not avert the Energy (or other) challenges!

Having a great time!

Page 47: Energy Efficient Computing - 26mar13

47

Conclusions

Putting the power of Computation into the hands of the masses has changed the face of Computing (again)
• Electronic Systems will become Ubiquitous in our Lives and Economy

Power Efficient ES is a major issue to Society
• Which faces a future with it as a significant energy consumer

Power Efficiency must be architected into the System Hardware and Software from the beginning
• To realise the maximum potential out of your Silicon (Avoiding Dark Si)
• Architect & Design HW as efficiently as possible (reflecting the task)
• Strive for: No Work => No Power
• Equip HW with Indicators and Levers so the System/App can manage it

Bring Processing to the Data ... Don’t move Data; move Information
• Process data locally
• Energy ∝ DataVolume x Speed x Distance^(>2 (3))

Page 48: Energy Efficient Computing - 26mar13

48

Computing the 21c …

Enabling the Creation of High-Performance Electronic Systems

... Productively, Economically and Reliably through Hw & Sw Reuse Methodologies

based on a family of CPU/GPU cores