benchmarking of dynamic power management solutions · benchmarking of dynamic power management...

46
Benchmarking of Dynamic Power Management Solutions Frank Dols CELF Embedded Linux Conference Santa Clara, California (USA) April 19, 2007

Upload: vuhanh

Post on 11-May-2018

215 views

Category:

Documents


1 download

TRANSCRIPT

Page 1: Benchmarking of Dynamic Power Management Solutions · Benchmarking of Dynamic Power Management Solutions ... 50% CPU usage, no DVFS Power Time Optimal! ... Interval based DPM at CPU-level

Benchmarking of Dynamic Power Management Solutions

Frank Dols

CELF Embedded Linux Conference

Santa Clara, California (USA)

April 19, 2007

Page 2: Benchmarking of Dynamic Power Management Solutions · Benchmarking of Dynamic Power Management Solutions ... 50% CPU usage, no DVFS Power Time Optimal! ... Interval based DPM at CPU-level

2 / 43

Why Benchmarking?

VendorVendor NXPNXP

What do we get?What do we get?

From Here to There, 2000whatever From Here to There, 2000whatever

!!

Page 3: Benchmarking of Dynamic Power Management Solutions · Benchmarking of Dynamic Power Management Solutions ... 50% CPU usage, no DVFS Power Time Optimal! ... Interval based DPM at CPU-level

3 / 43

Presentation Outline:

The context;– Power management concepts.

– Hardware and software.

Benchmarks;– The process.

– Metrics.

– Findings.

Conclusions.– Relation to other work.

– What’s next.

Page 4: Benchmarking of Dynamic Power Management Solutions · Benchmarking of Dynamic Power Management Solutions ... 50% CPU usage, no DVFS Power Time Optimal! ... Interval based DPM at CPU-level

4 / 43

Area Of Interest

Mobile/portable devices mostly exist of:1. Battery.

2. Storage (flash disk, hard disk, …).

3. Display (LCD, …).

4. Speaker.

5. Broadband I/O (Bluetooth, UMTS, …).

6. Processing (CPU)

We are initially only focusing on the optimization of the “Processing” power

consumption.

– As next, we are also taking 1-5 into account.

Page 5: Benchmarking of Dynamic Power Management Solutions · Benchmarking of Dynamic Power Management Solutions ... 50% CPU usage, no DVFS Power Time Optimal! ... Interval based DPM at CPU-level

5 / 43

Power Measurements

A board under test

LabView

Mp3 playback – LabViewmeasurements

Measurements equipment: high precision DMM (digital Multi-Meters)

Page 6: Benchmarking of Dynamic Power Management Solutions · Benchmarking of Dynamic Power Management Solutions ... 50% CPU usage, no DVFS Power Time Optimal! ... Interval based DPM at CPU-level

6 / 43

Energy Consumers

Energy saving methods trade performance or functionality for energy:– Scaling performance of processors, memories and buses;

– Using various stand-by modes of peripherals.

Energy saving is about supplying

the right amount of performance

at the right time.

However, the future is unknown!

other

CPU

Memory

LCD

DC/DC

Energy consuming components in a typical

audio/video playing mobile device.

Page 7: Benchmarking of Dynamic Power Management Solutions · Benchmarking of Dynamic Power Management Solutions ... 50% CPU usage, no DVFS Power Time Optimal! ... Interval based DPM at CPU-level

7 / 43

Power Dissipation Basics

� Reduce capacitance switched.

� Reduce switching currents.

� Reduce operating voltage.

∫t

cDD fVC0

2)( Dynamic

Power Dissipation

Dynamic

Power Dissipation

dtIVfVCE QDD

t

cDD ))((

0

2+= ∫

Total Power

Dissipation

Total Power

Dissipation

� Reduce leakage current in active and standby modes of operation.

� Reduce operating voltage.

lkgDD

t

IV∫0

Leakage

Power Dissipation

Leakage

Power Dissipation+

Page 8: Benchmarking of Dynamic Power Management Solutions · Benchmarking of Dynamic Power Management Solutions ... 50% CPU usage, no DVFS Power Time Optimal! ... Interval based DPM at CPU-level

8 / 43

Dynamic Voltage Frequency Scaling (DVFS)

Scales performance according to demand (using an estimation of future

workload).

Based on the fact that energy per clock cycle rises with frequency:

Implemented by switching between operating points (voltage and frequency

pairs).

fVP ⋅=2

ˆ

Clock Generator

Voltage Controller

Applications, task scheduler,or other source of information

Workload estimation

Processor ormemory

Page 9: Benchmarking of Dynamic Power Management Solutions · Benchmarking of Dynamic Power Management Solutions ... 50% CPU usage, no DVFS Power Time Optimal! ... Interval based DPM at CPU-level

9 / 43

DVFS: How Does it Save Power

Power

Time

High f,V

Performance

Stand-by power

Idle

Full speed

50% CPU usage, no DVFS

Power

Time

Optimal!

Performance

Low f,V

50% speed

50% CPU usage, with DVFS

Performance

Power

TimeEnergy used

100% CPU usage

Page 10: Benchmarking of Dynamic Power Management Solutions · Benchmarking of Dynamic Power Management Solutions ... 50% CPU usage, no DVFS Power Time Optimal! ... Interval based DPM at CPU-level

10 / 43

Performance Prediction Methods

100%

0%

Interval-based: use CPU-usage of previous interval(s) to determine frequency for next interval.

operating frequency

task /workload

100%

0%

Problem: late reaction on changing workloads �missed deadlines

Page 11: Benchmarking of Dynamic Power Management Solutions · Benchmarking of Dynamic Power Management Solutions ... 50% CPU usage, no DVFS Power Time Optimal! ... Interval based DPM at CPU-level

11 / 43

Performance Prediction Methods

100%

0%

Application-directed: use information from the application to change frequencies.

Problems: requires changes in the application;not always possible (interactive applications).

task /workload

operating frequency

Page 12: Benchmarking of Dynamic Power Management Solutions · Benchmarking of Dynamic Power Management Solutions ... 50% CPU usage, no DVFS Power Time Optimal! ... Interval based DPM at CPU-level

12 / 43

Dynamic Power Management (DPM) Concept

DPM software connects to OS kernel and collects data,– with collected data, and usage of policies, try to predict future workload.

Multiple policies categorize software workload.

Single global prediction of future performance is made.

OSDPM

SoftwareRequired

Performance

Policy

PolicyPolicy

Policy

Policy Evaluation

Volts

MHz

Apps.

Apps.

Apps.

Apps.

% of max.

performance

Page 13: Benchmarking of Dynamic Power Management Solutions · Benchmarking of Dynamic Power Management Solutions ... 50% CPU usage, no DVFS Power Time Optimal! ... Interval based DPM at CPU-level

13 / 43

Application directed DVFS

Two main groups of mobile applications:

Internet browsing

Gaming

Interactive

Audio decoding

Video decoding

Streaming

Future workload unknown (depends on user input)

Future workload known to application

Page 14: Benchmarking of Dynamic Power Management Solutions · Benchmarking of Dynamic Power Management Solutions ... 50% CPU usage, no DVFS Power Time Optimal! ... Interval based DPM at CPU-level

14 / 43

Presentation Outline:

The context;– Power management concepts.

– Hardware and software.

Benchmarks;– The process.

– Metrics.

– Findings.

Conclusions.– Relation to other work.

– What’s next.

Page 15: Benchmarking of Dynamic Power Management Solutions · Benchmarking of Dynamic Power Management Solutions ... 50% CPU usage, no DVFS Power Time Optimal! ... Interval based DPM at CPU-level

15 / 43

Benchmark Process

Plan:– Define Key Performance Indicators (KPI).

– Define test to measure KPI.

Do:– Measurement on evaluation platform.

Check:– Analyze gained results and can they be explained.

– When necessary, cross verify with Vendor.

Act:– Take necessary actions on gained results.

Page 16: Benchmarking of Dynamic Power Management Solutions · Benchmarking of Dynamic Power Management Solutions ... 50% CPU usage, no DVFS Power Time Optimal! ... Interval based DPM at CPU-level

16 / 43

Key Performance Indicators (KPI)

Memory usage.– Footprint of DPM framework and administration.

CPU usage.– Cycles consumed by DPM framework.

– System behavior, predictability & reproducibility.

System idleness.– Prediction of optimum frequency (minimization of idleness).

Real time behavior.– Application deadlines missed.

– Responsiveness (latency on events).

– DPM Policy prediction accuracy.

Page 17: Benchmarking of Dynamic Power Management Solutions · Benchmarking of Dynamic Power Management Solutions ... 50% CPU usage, no DVFS Power Time Optimal! ... Interval based DPM at CPU-level

17 / 43

Test / Use Cases

Synthetic (fine-grained) benchmark,– Simulate different workload levels,

– Test corner cases.• LMbench as available from Open Source; lat_proc, lat_syscall, memory, clock, idle, …

Application level (coarse-grained) benchmark:– Whetstone & Dhrystone (artificial system load).

– Hartstone (real time behavior).

Video decoding,– ffplay mpeg video decoding use case.

• use ffplay, part of ffmpeg.

Audio decoding,– mp3 audio decoding use case.

• use ARM MP3 decoding library.

Page 18: Benchmarking of Dynamic Power Management Solutions · Benchmarking of Dynamic Power Management Solutions ... 50% CPU usage, no DVFS Power Time Optimal! ... Interval based DPM at CPU-level

18 / 43

Metrics for SW based Benchmarking

Missed deadlines;

Overdue time;

Idle time;

Jitter;

Responsiveness.

Page 19: Benchmarking of Dynamic Power Management Solutions · Benchmarking of Dynamic Power Management Solutions ... 50% CPU usage, no DVFS Power Time Optimal! ... Interval based DPM at CPU-level

19 / 43

Missed Deadlines

When operating frequency is too low, deadlines are missed:

Power

Time

Power

Time

Page 20: Benchmarking of Dynamic Power Management Solutions · Benchmarking of Dynamic Power Management Solutions ... 50% CPU usage, no DVFS Power Time Optimal! ... Interval based DPM at CPU-level

20 / 43

Overdue Time

Are missed deadlines caused by timing inaccuracy, or by performance

deficiency?

Power

Time

Power

Time

Overdue time

Page 21: Benchmarking of Dynamic Power Management Solutions · Benchmarking of Dynamic Power Management Solutions ... 50% CPU usage, no DVFS Power Time Optimal! ... Interval based DPM at CPU-level

21 / 43

Idle Time

When operating frequency is too high, extra slack time is introduced, and

power is wasted:

Power

Time

Power

Time

Power

Time

Optimal scenario

Page 22: Benchmarking of Dynamic Power Management Solutions · Benchmarking of Dynamic Power Management Solutions ... 50% CPU usage, no DVFS Power Time Optimal! ... Interval based DPM at CPU-level

22 / 43

Jitter

Execution times vary, so time of completion varies as well:

Power

Time

Arrival times are periodic, but completion times vary ���� Jitter

arrival

completion

Page 23: Benchmarking of Dynamic Power Management Solutions · Benchmarking of Dynamic Power Management Solutions ... 50% CPU usage, no DVFS Power Time Optimal! ... Interval based DPM at CPU-level

23 / 43

In Summary, DPM Metrics

Missed deadlines (OS schedule);

Overdue time;

Idle time;

Jitter (fluctuation in Completion time);

Responsiveness.

Power

Time

Power

Time

Missed Deadline

Overdue Time

Idle TimeCompletion

Time

Page 24: Benchmarking of Dynamic Power Management Solutions · Benchmarking of Dynamic Power Management Solutions ... 50% CPU usage, no DVFS Power Time Optimal! ... Interval based DPM at CPU-level

24 / 43

NXP’s DVFS Benchmark Platform

Energizer I SoC;– ARM1176JZF-s,

– DVFS-enabled (core also),

– CPU voltage can be set in

25 mV increments,

– CPU frequency can be set using

2 PLL’s (300MHz and 400MHz by default)

and a divider.

Energizer I Software;– Linux 2.6.15,

– CPUfreq and PowerOP.

– MV’s DPM framework ported but not used.

Page 25: Benchmarking of Dynamic Power Management Solutions · Benchmarking of Dynamic Power Management Solutions ... 50% CPU usage, no DVFS Power Time Optimal! ... Interval based DPM at CPU-level

25 / 43

CPU Usage CharacteristicsNo DPM

0

10

20

30

40

50

60

70

80

90

100

0

2,7

5,4

8,1

10,8

13,5

16,2

18,9

21,6

24,3 27

29,7

32,4

35,1

37,8

40,5

43,2

45,9

48,6

51,3 54

56,7

59,4

62,1

time (s)

perf

orm

an

ce (

%)

DPM Load CPU usage

Exponential Weighed Average Policy

0

20

40

60

80

100

120

0

1,9

3,8

5,7

7,6

9,5

11,4

13,3

15,2

17,1 19

20,9

22,8

24,7

26,6

28,5

30,4

32,3

34,2

36,1 38

39,9

41,8

43,7

45,6

47,5

49,4

51,3

53,2

55,1 57

58,9

60,8

62,7

time (s)

perf

orm

an

ce (%

)

DPM Load CPU usage

Moving Average Policy

0

20

40

60

80

100

120

0

1,9

3,8

5,7

7,6

9,5

11

,4

13

,3

15

,2

17

,1 19

20

,9

22

,8

24

,7

26

,6

28

,5

30

,4

32

,3

34

,2

36

,1 38

39

,9

41

,8

43

,7

45

,6

47

,5

49

,4

51

,3

53

,2

55

,1 57

58

,9

60

,8

62

,7

time (s)

pe

rfo

rma

nce

(%

)

DPM Load CPU usage

CPU usage,

applied policy reacts slow

on required performance.

Page 26: Benchmarking of Dynamic Power Management Solutions · Benchmarking of Dynamic Power Management Solutions ... 50% CPU usage, no DVFS Power Time Optimal! ... Interval based DPM at CPU-level

26 / 43

DPM; Synthetic Workload

0

20

40

60

80

100

120

time

pe

rfo

rma

nc

e

(1st) Synthetic Workload (2nd) DPM, Predicted Performance Requirement

(3rd) Actual CPU Usage

Headroom

Late Response

(Exponential Weighed Average Policy)

Page 27: Benchmarking of Dynamic Power Management Solutions · Benchmarking of Dynamic Power Management Solutions ... 50% CPU usage, no DVFS Power Time Optimal! ... Interval based DPM at CPU-level

27 / 43

Application Level Benchmark

Video playback (DivX).– Open-Source ffplay using frame-buffer device.

– Instrumented with time measurements.

Evaluation:– How many hard deadlines are missed with & without DPM.

Page 28: Benchmarking of Dynamic Power Management Solutions · Benchmarking of Dynamic Power Management Solutions ... 50% CPU usage, no DVFS Power Time Optimal! ... Interval based DPM at CPU-level

28 / 43

ffplay Deadlines

• ffplay deadline: time interval in between displaying of

successive (decoded) frames.

- For 10 fps movie -> the deadline is 100 ms.

- With DPM activated,

fluctuation increases.

- REMARK: video output

(display) on CompactPresen-

ter via Compact flash interface,

results in high fluctuation

oops !! (see next slide).

ffplay@10fps

0

20000

40000

60000

80000

100000

120000

140000

160000

180000

200000

1 28 55 82 109 136 163 190 217 244 271 298 325 352 379 406 433 460 487

frames

us

1

2

3

4

Page 29: Benchmarking of Dynamic Power Management Solutions · Benchmarking of Dynamic Power Management Solutions ... 50% CPU usage, no DVFS Power Time Optimal! ... Interval based DPM at CPU-level

29 / 43

Kernel & User Space Activity

While running ffplay, sample the CPU activity in both

kernel and user space (based on /proc/stat).

Kernel activity dominates during the video playback.

ffplay@10fps

0

10

20

30

40

50

60

70

80

90

100

0 10000 20000 30000 40000 50000

time, us

CPU u

sage, %

user

kernel

Page 30: Benchmarking of Dynamic Power Management Solutions · Benchmarking of Dynamic Power Management Solutions ... 50% CPU usage, no DVFS Power Time Optimal! ... Interval based DPM at CPU-level

30 / 43

DPM; Video Decoding

0

20

40

60

80

100

5 10 15 20 25

frames per second

dead

lin

es m

issed

/ i

dle

tim

e (

%)

idle idle DPM deadlines deadlines DPM

Power Savings

Performance Impact

PolicyOverhead

(Exponential Weighed Average Policy)

System Overload

Page 31: Benchmarking of Dynamic Power Management Solutions · Benchmarking of Dynamic Power Management Solutions ... 50% CPU usage, no DVFS Power Time Optimal! ... Interval based DPM at CPU-level

31 / 43

Video Decoding Jitter

1

10

100

1000

10000

0 5 10 15 20 25 30

frames per second

jitt

er

(milliseco

nd

s)

jitter jitter DPM

Clock scaling causes jitter increase

Jitter metric gets polluted by high number of deadline misses

Page 32: Benchmarking of Dynamic Power Management Solutions · Benchmarking of Dynamic Power Management Solutions ... 50% CPU usage, no DVFS Power Time Optimal! ... Interval based DPM at CPU-level

32 / 43

Video Decoding and Hints

CITY_5fps.avi

0

20

40

60

80

100

120

0 5 10 15 20 25 30 35 40 45 50

time (s)

fre

qu

en

cy

(M

Hz

)

No DPM DPM, No hints DPM, with Hints

Page 33: Benchmarking of Dynamic Power Management Solutions · Benchmarking of Dynamic Power Management Solutions ... 50% CPU usage, no DVFS Power Time Optimal! ... Interval based DPM at CPU-level

33 / 43

0

10

20

30

40

50

60

70

80

2,3 3,55 4,8 6,05 7,3

idle

tim

e (%

)

5 7,5 10 12,5 15

frames per second

Custom DPM disabled Custom DPM enabled, without hints Custom DPM enabled, with hints

DPM disabled DPM enabled

Comparison

DPM without hints is only effective at

low workloads

When application-directed hints are enabled;• idle time reduction increases,• effective in a wider frame rate window.

Page 34: Benchmarking of Dynamic Power Management Solutions · Benchmarking of Dynamic Power Management Solutions ... 50% CPU usage, no DVFS Power Time Optimal! ... Interval based DPM at CPU-level

34 / 43

Presentation Outline:

The context;– Power management concepts.

– Hardware and software.

– Relation to other work.

Benchmarks;– The process.

– Metrics.

– Findings.

Conclusions.– Relation to other work.

– What’s next.

Page 35: Benchmarking of Dynamic Power Management Solutions · Benchmarking of Dynamic Power Management Solutions ... 50% CPU usage, no DVFS Power Time Optimal! ... Interval based DPM at CPU-level

35 / 43

Technology Conclusions

Dynamic Power Management (DPM).– Added value is in policies.

– Performs best on low workloads.

– Adaptation to changing workload is heavily dependent on chosen policy.

– As application developer, you have to develop your own policies.

– Be aware, policy adds overhead.

Applications cannot feed data directly into DPM.– Performance prediction based on logging application history does not always

work.

– The best performance prediction may come from the application itself (feed

forward)!

– Hinting can give improvements over an interval based approach.

Page 36: Benchmarking of Dynamic Power Management Solutions · Benchmarking of Dynamic Power Management Solutions ... 50% CPU usage, no DVFS Power Time Optimal! ... Interval based DPM at CPU-level

36 / 43

Conclusions

Interval based DPM at CPU-level results in higher CPU utilisation, but seems

only really effective at low workloads;

Precise energy saving figures should be measured using HW measuring

tools, but SW benchmarks give good insight in saving potentials and

consequences on real-time behaviour.

SoC is not the biggest power consumer.– Backlight, DC/DC converters & power amplifiers are big consumers, by optimizing

these, most can be gained.

Page 37: Benchmarking of Dynamic Power Management Solutions · Benchmarking of Dynamic Power Management Solutions ... 50% CPU usage, no DVFS Power Time Optimal! ... Interval based DPM at CPU-level

37 / 43

What’s Next

1. CELF (you guys), MIPI PM working group

- Matt Locke and his team

2. Mode switching.

3. Correlation between hardware and software;

- What to solve at which level?

4. Deadline based scheduling, from priority based to time based.

5. Start contributing to the Community on this.

Regarding bullet 1, 2 & 3, see extra slides

Page 38: Benchmarking of Dynamic Power Management Solutions · Benchmarking of Dynamic Power Management Solutions ... 50% CPU usage, no DVFS Power Time Optimal! ... Interval based DPM at CPU-level

38 / 43

Overview of Open Source Power Mgt. Software

CPU power management

Standby (low power mode)when idle

DynamicVoltage/FrequencyScaling

Device power management

Suspend/Resumeof

devices

(memory)bus

frequencyscaling

Suspendcompletesystem

CPUfreqTask

schedulerAPM/

ACPI

Power

OP

Linux Device

Model

(also look at Mark Gross presentation)

Page 39: Benchmarking of Dynamic Power Management Solutions · Benchmarking of Dynamic Power Management Solutions ... 50% CPU usage, no DVFS Power Time Optimal! ... Interval based DPM at CPU-level

39 / 43

Mode Transitions

ACTIVE mode LP mode ACTIVE modetransition transition

app

policy

slp_sw

app

slp_hw wu_hw

wu_sw

PM policy (performance monitoring, decision taking)

Mode transition in sw domain

(state save, sequencing, PM infra control)

Mode transition in hw domain(sequencing, communication)

event

Mode transition in hw domain(communication, Voltage/

Frequency settling,

sequencing)

Mode transition in sw domain

(warm boot, state restore, sequencing)

Page 40: Benchmarking of Dynamic Power Management Solutions · Benchmarking of Dynamic Power Management Solutions ... 50% CPU usage, no DVFS Power Time Optimal! ... Interval based DPM at CPU-level

40 / 43

Typical

Hardware

aided

Improving power managementthrough co-design of hardware and software

Other ideas for improving power management– Performance counters for other peripherals:

• Cache miss rate is an indicator for memory bus usage of the CPU.

– Hardware aided workload prediction algorithms (in embedded FPGA):• DVFS algorithm in FPGA, instead of software;• Enables fast calculation of estimations and quick operating point changes;• Algorithm can be changed for every use case because of the use of an FPGA.

– More hardware acceleration for specific applications:• Needs discussion with SW developers in early stage about which parts can be

accelerated;• Enables running on a lower frequency, and as such saving energy.

Workload

prediction

algorithm

Policy

manager

DVFS Hardware

(clock generator,

voltage controller)Applications

Software

Hardware

OS

FPGA

Page 41: Benchmarking of Dynamic Power Management Solutions · Benchmarking of Dynamic Power Management Solutions ... 50% CPU usage, no DVFS Power Time Optimal! ... Interval based DPM at CPU-level

41 / 43

To come to the End.

Page 42: Benchmarking of Dynamic Power Management Solutions · Benchmarking of Dynamic Power Management Solutions ... 50% CPU usage, no DVFS Power Time Optimal! ... Interval based DPM at CPU-level

42 / 43

From Here to There, 2000whatever From Here to There, 2000whatever

We Started With the Question:“Why Benchmarking?”

Do your homework,

don’t bring in the Trojan Horse!

Page 43: Benchmarking of Dynamic Power Management Solutions · Benchmarking of Dynamic Power Management Solutions ... 50% CPU usage, no DVFS Power Time Optimal! ... Interval based DPM at CPU-level

43 / 43

Page 44: Benchmarking of Dynamic Power Management Solutions · Benchmarking of Dynamic Power Management Solutions ... 50% CPU usage, no DVFS Power Time Optimal! ... Interval based DPM at CPU-level

44 / 43

NXP Semiconductors

Established in 2006(formerly a division of Philips)

Builds on a heritage of 50+ years of experience in semiconductors

Provides engineers and designers with semiconductors and software that deliver better sensory experiences

Top-10 supplier with Sales of € 4.960 Bln (2006)

Sales: 35% Greater China, 31% Rest of Asia, 25% Europe, 9% North America

Headquarters: Eindhoven, The Netherlands

Key focus areas:

– Mobile & Personal, Home, Automotive & Identification, Multimarket Semiconductors

Owner of NXP Software: a fully independent software solutions company

Page 45: Benchmarking of Dynamic Power Management Solutions · Benchmarking of Dynamic Power Management Solutions ... 50% CPU usage, no DVFS Power Time Optimal! ... Interval based DPM at CPU-level

45 / 43

Where to Find Us

Operating systems Knowledge Centre (OKC)

phone: +31 40 - 27 45 131

fax: +31 40 - 27 43 630

email: [email protected]

Web: www.nxp.com

address: High Tech Campus 46

location 3.31

5656 AE Eindhoven (NL)

Page 46: Benchmarking of Dynamic Power Management Solutions · Benchmarking of Dynamic Power Management Solutions ... 50% CPU usage, no DVFS Power Time Optimal! ... Interval based DPM at CPU-level

46 / 43