Scheduling for Reduced CPU Energy

M. Weiser, B. Welch, A. Demers, and S. Shenker

Introduction

Energy saving in a typical laptop computer
  Backlight, display, and disk
    Turned off after a period of no use
  CPU
    Simple power-down-when-idle techniques
    Another way?

Opportunities
  Dynamically varying chip speed and energy consumption
  Cooperation of the operating system scheduler
    When to use full power and when not

CPU Energy

Charge = Capacitance * Volts: $Q = C V$
Energy = Charge * Volts: $E = Q V$
When charging a capacitor: $E = \frac{1}{2} C V^2$
=> Energy spent charging a gate is proportional to the square of the voltage
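The factor of 1/2 in the capacitor formula can be recovered with a short derivation. This is a sketch assuming an ideal capacitor charged from zero to a final voltage V; the symbols v and q (instantaneous voltage and charge) are introduced here for illustration and do not appear on the slides.

```latex
E = \int_0^{Q} v \, dq
  = \int_0^{Q} \frac{q}{C} \, dq
  = \frac{Q^2}{2C}
  = \frac{1}{2} C V^2
  \qquad \text{using } Q = C V
```

The product QV = CV^2 is the energy drawn from a constant-voltage supply during the same charge. Both expressions scale with V^2, which is the property the rest of the talk relies on.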

Reducing CPU Energy

Empirically, lower-voltage circuits have longer settling times
  SGS-Thomson 8051 CMOS microcontroller
    6 MHz @ 5 V
    4.5 MHz @ 3.3 V
    3 MHz @ 2.2 V
  => Maximum clock rate is roughly proportional to voltage
The relationship is close to linear over an interesting range of voltages: 5 V down to 2.2 V
Executing the same number of cycles at a lower voltage and a slower clock speed results in a net energy saving
Adjust the CPU speed and voltage in response to scheduling demands
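The three rated operating points quoted above can be turned into relative speed and relative energy per cycle. The following is a minimal sketch, assuming energy per cycle scales with the square of the voltage as stated on the CPU Energy slide; the function and variable names are hypothetical.

```python
# (voltage in volts, maximum clock in MHz) pairs quoted on the slide.
RATED_POINTS = [(5.0, 6.0), (3.3, 4.5), (2.2, 3.0)]

def relative_speed_and_energy(points):
    """Return (voltage, relative speed, relative energy per cycle) rows.

    Speed is relative to the fastest rated clock. Energy per cycle is
    assumed proportional to V^2 and is relative to the highest voltage.
    """
    v_max = max(v for v, _ in points)
    f_max = max(f for _, f in points)
    return [(v, f / f_max, (v / v_max) ** 2) for v, f in points]

rows = relative_speed_and_energy(RATED_POINTS)
for voltage, speed, energy in rows:
    print(f"{voltage:.1f} V: speed {speed:.2f}, energy/cycle {energy:.2f}")
```

Under these assumptions, halving the clock (3 MHz instead of 6 MHz) goes with an energy per cycle of roughly a fifth of the 5 V value.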

An Energy Metric for CPUs

MIPJ: millions of instructions per joule
  = MIPS / watts
  Unaffected by changes in clock speed alone
Opportunity for quadratic energy savings
  As the clock speed is reduced by a factor of n, energy per cycle can be reduced by a factor of $n^2$
Three methods to achieve this
  Voltage reduction
  Reversible logic
  Adiabatic switching

An Energy Metric for CPUs (cont'd)

Voltage reduction
  Energy per clock cycle is directly proportional to $V^2$
  Lower-voltage, slower-clock chip: less energy per cycle
Reducing the energy consumption
  The same number of cycles, but at a lower voltage
  Example: a task with a 100 ms deadline
    Method 1: 50 ms at full speed, then 50 ms idle
    Method 2: 100 ms at half speed and half voltage
    Energy consumption, Method 1 : Method 2 = 4 : 1
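The 4:1 ratio in the 100 ms deadline example can be checked with a few lines of arithmetic. This is a sketch under two assumptions taken from the slides: energy per cycle is proportional to V^2, and voltage scales linearly with clock speed. Idle time is assumed to cost nothing, and the function name is hypothetical.

```python
def relative_energy(busy_ms, speed):
    """Energy in arbitrary units for running busy_ms at a relative speed.

    speed = 1.0 means full clock and full voltage.
    """
    cycles = busy_ms * speed      # work done, in full-speed milliseconds
    voltage = speed               # assumption: voltage scales with speed
    return cycles * voltage ** 2  # energy per cycle is proportional to V^2

method_1 = relative_energy(busy_ms=50, speed=1.0)   # full speed, then idle
method_2 = relative_energy(busy_ms=100, speed=0.5)  # half speed, half voltage
ratio = method_1 / method_2
print(f"Method 1 : Method 2 = {ratio:.0f} : 1")
```

Both methods execute the same number of cycles; only the voltage at which each cycle runs differs.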

Approach of This Paper

Energy-saving technique
  Fine-grained control of CPU clock speed
  Running slower and at reduced voltage
Evaluation
  Using trace-driven simulation
Goals
  To evaluate the energy savings
  To measure the effect of running too slowly

Trace Data

From the UNIX scheduler
Workloads
  Software development, documentation, e-mail, simulation, ...
  Typing, scrolling
Time stamps: microsecond resolution
Sleep events: wait on hard or soft events
  Hard events: disk wait, page fault
  Soft events: keystroke, awaiting network packets
Soft idle can be eliminated by rescheduling
Hard idle is mandated by a wait on a device

Assumptions

Simulation
  Soft events belong to idle periods
  No reordering of trace data events
  No energy is used when idle
  Switching speeds takes no time
  Periods longer than 30 seconds with more than 90% idleness are not considered
Lower bound on practical speed (relative speed <=> minimum voltage)
  1.0  <=> 5 V
  0.66 <=> 3.3 V
  0.44 <=> 2.2 V
  0.2  <=> 1.0 V
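The speed floors listed above follow from dividing each minimum voltage by the 5 V full-speed voltage. The sketch below reproduces them and adds an upper bound on the saving each floor could allow, assuming energy per cycle is proportional to V^2 and every cycle could run at the floor. The function names are hypothetical.

```python
FULL_VOLTAGE = 5.0  # volts at relative speed 1.0, per the slide

def min_speed(min_voltage):
    """Relative speed floor, assuming speed scales linearly with voltage."""
    return min_voltage / FULL_VOLTAGE

def max_saving(min_voltage):
    """Upper bound on the energy saving if every cycle ran at the floor."""
    return 1.0 - min_speed(min_voltage) ** 2

for volts in (5.0, 3.3, 2.2, 1.0):
    print(f"{volts:.1f} V: min speed {min_speed(volts):.2f}, "
          f"saving bound {max_saving(volts):.0%}")
```

Under these assumptions the bounds are roughly 56% at 3.3 V and 81% at 2.2 V, which sit above the savings of up to 50% and 70% reported in the conclusions.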

Scheduling Algorithms

OPT (unbounded-delay perfect-future)
  Takes the entire trace
  Stretches all the run times to fill all the idle times
  An imaginary batch job with perfect knowledge
  Impractical and undesirable
  Bad response time
FUTURE (bounded-delay limited-future)
  Takes the future trace within a small window
  Window sizes: 1 ms to 400 sec
  Impractical but desirable
  Good response time with a window of 10 to 50 ms
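The difference between OPT and FUTURE can be illustrated with a small sketch. It assumes the trace has already been reduced to per-window pairs of run time and stretchable (soft) idle time, and that the speed floor is 0.44 (the 2.2 V limit). The data layout, the function names, and the sample values are hypothetical and not taken from the paper.

```python
def stretch_speed(run_ms, soft_idle_ms, min_speed=0.44):
    """Lowest relative speed that finishes run_ms of full-speed work
    within run_ms + soft_idle_ms, clamped to [min_speed, 1.0]."""
    available = run_ms + soft_idle_ms
    if available <= 0:
        return min_speed
    return min(1.0, max(min_speed, run_ms / available))

def opt_speed(windows, min_speed=0.44):
    """OPT-style: a single speed chosen from the whole trace."""
    total_run = sum(run for run, _ in windows)
    total_idle = sum(idle for _, idle in windows)
    return stretch_speed(total_run, total_idle, min_speed)

def future_speeds(windows, min_speed=0.44):
    """FUTURE-style: one speed per window, looking only inside that window."""
    return [stretch_speed(run, idle, min_speed) for run, idle in windows]

# Hypothetical (run_ms, soft_idle_ms) pairs for three 40 ms windows.
windows = [(10, 30), (40, 0), (5, 35)]
print("OPT speed:", round(opt_speed(windows), 3))
print("FUTURE speeds:", future_speeds(windows))
```

In this sample, OPT spreads the busy middle window over the idle time around it, which is why it delays work without bound. FUTURE must run that window at full speed because it cannot borrow idle time from its neighbours.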

Scheduling Algorithms (cont'd)

PAST (bounded-delay limited-past)
  Looks a fixed window into the past
  Assumes the next window will be like the previous one
  Examines the percentage of busy time during the previous interval and adjusts the speed for the next interval
  Excess cycles can build up if the speed (and voltage) is set too low => penalty metric
Excess cycle penalty
  At each interval, count the leftover cycles that accumulated because the CPU ran too slowly
  Switch to full speed if there were more excess cycles than idle time in the previous interval
  Hard idle (page fault, disk request) cannot be squeezed
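One PAST-style adjustment step might look like the sketch below. The slides give the rules only in words, so the busy-fraction thresholds (`busy_hi`, `busy_lo`), the fixed `step`, and the function signature are illustrative assumptions rather than values from the source. All quantities are measured in full-speed time units.

```python
def past_next_speed(speed, run, soft_idle, hard_idle, excess,
                    min_speed=0.44, busy_hi=0.7, busy_lo=0.5, step=0.2):
    """Return (next_speed, next_excess) after one interval.

    run, soft_idle and hard_idle describe the interval as traced at full
    speed; excess is unfinished work carried in from earlier intervals.
    """
    run += excess                        # carried-over work must be done too
    idle = soft_idle + hard_idle
    total = run + idle
    busy_fraction = run / total if total > 0 else 0.0

    # Work completed at the current speed. Hard idle cannot be squeezed,
    # so only run time and soft idle are available for execution.
    done = speed * (run + soft_idle)
    next_excess = max(0.0, run - done)

    if excess > idle:
        next_speed = 1.0                 # too far behind: go to full speed
    elif busy_fraction > busy_hi:
        next_speed = speed + step        # mostly busy: speed up
    elif busy_fraction < busy_lo:
        next_speed = speed - step        # mostly idle: slow down
    else:
        next_speed = speed
    return min(1.0, max(min_speed, next_speed)), next_excess

speed, excess = 0.44, 0.0
for run, soft, hard in [(40, 0, 0), (10, 10, 0), (5, 35, 0)]:
    speed, excess = past_next_speed(speed, run, soft, hard, excess)
    print(f"next speed {speed:.2f}, excess {excess:.2f}")
```

The second interval in the usage loop shows the penalty rule: the excess carried in exceeds the idle time, so the sketch jumps to full speed instead of stepping up gradually.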

Trace Driven Simulation

Trace points
  Sched: context switch away from a process
  Idle on: enter the idle loop
  Idle off: leave the idle loop to run a process
  Fork: create a new process
  Exec: overlay a process with another program
  Exit: process termination
  Sleep: wait on an event
  Wakeup: notify a sleeping process
Traces
  Short runs during specific tasks, such as editing
  Long runs of several hours
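To show how such trace points could feed a simulator, the sketch below splits a trace into busy and idle time using only the Idle on and Idle off records. The record layout (timestamp in microseconds plus a lowercase event name) and the sample trace are hypothetical. The real trace format is not given in the slides.

```python
def busy_and_idle_us(events):
    """Return (busy_us, idle_us) for a non-empty, time-sorted trace.

    events is a list of (timestamp_us, name) pairs. The CPU is assumed to
    be busy at the first record; only idle_on / idle_off change the state.
    """
    busy = idle = 0
    in_idle = False
    last_ts = events[0][0]
    for ts, name in events:
        # Charge the elapsed time to whichever state was active.
        if in_idle:
            idle += ts - last_ts
        else:
            busy += ts - last_ts
        last_ts = ts
        if name == "idle_on":
            in_idle = True
        elif name == "idle_off":
            in_idle = False
    return busy, idle

# Hypothetical trace: run 400 us, sleep 600 us, run another 300 us.
trace = [(0, "sched"), (400, "sleep"), (400, "idle_on"),
         (1000, "wakeup"), (1000, "idle_off"), (1300, "exit")]
print(busy_and_idle_us(trace))
```

A fuller version would also tag each idle period as hard or soft from the preceding Sleep record, since only soft idle is available for stretching.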

Evaluation: The Results of Three Algorithms

Evaluation: Minimal Voltage & The Excess Cycles

Frequency: number of excess-cycle values <= the x-value but > the previous x-value
Excess cycles: time needed to run the unfinished instructions at full speed
Lower minimum voltage => more cases where excess cycles build up
  => they accumulate over longer intervals => the peak extends to the right

Evaluation: Interval Length & Excess Cycles

Peak in excess cycles shifts right as the interval length increases
Longer scheduling interval => more excess cycles build up

Evaluation: Different Minimum Voltage Limits

Relative savings for different minimum voltages
2.2 V is almost as good as 1.0 V

Evaluation: Changing The Interval Length

A longer adjustment period results in more savings

Evaluation: Average Excess Cycles [1]

Lower minimum voltage => more excess cycles
Longer intervals => more excess cycles accumulate
Energy savings are a function of the interval size

Evaluation: Average Excess Cycles [2]

Discussion & Future Work

Feedback sources other than idle time
  Classify jobs into background, periodic, and foreground
  Scheduling order: periodic, foreground, background
No reordering vs. reordering
  Unless there is a large job mix, reordering is not significant
I/O wait model: hard/soft
  Thought to be valid, but good to verify

Conclusions

Preliminary results on CPU scheduling to reduce CPU energy usage
  Scheduling jobs at different clock rates
Trace-driven simulation
  OPT / FUTURE / PAST
  PAST with a 50 ms window
2.2 V to 5.0 V: this range provides good savings with moderate penalty
Power savings of up to 50% (3.3 V) and 70% (2.2 V)
The tortoise is more efficient than the hare.