Scheduling for Reduced CPU Energy
M. Weiser, B. Welch, A. Demers, and S. Shenker
TRANSCRIPT
Introduction
  Energy saving in a typical laptop computer
    Backlight, display, and disk: turned off after a period of no use
    CPU: simple power-down-when-idle techniques. Is there another way?
  Opportunities
    Dynamically varying chip speed and energy consumption
    Cooperation of the operating system scheduler
    Deciding when to use full power and when not to
CPU Energy
  Q = C * V (charge = capacitance * voltage)
  E = Q * V (energy = charge * voltage)
  When charging a capacitor: E = 1/2 * C * V^2
  => The energy spent charging a gate is proportional to the square of the voltage
Reducing CPU Energy
  Empirically, lower-voltage circuits have longer settling times
    SGS-Thomson 8051 CMOS microcontroller: 6 MHz @ 5 V, 4.5 MHz @ 3.3 V, 3 MHz @ 2.2 V
    => Max clock rate is roughly proportional to voltage
  The relationship is close to linear over an interesting range of voltages: 5 V down to 2.2 V
  Executing the same number of cycles at a lower voltage and a slower clock speed results in a net power savings
  Adjust the CPU speed and voltage in response to scheduling demands
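The 8051 figures above can be checked numerically. The following is a minimal sketch, assuming energy per cycle scales with V^2 (as on the CPU Energy slide) and treating the 5 V point as the full-speed reference. The names POINTS and relative_figures are illustrative only.

```python
# (clock rate in MHz, supply voltage in V), as quoted for the 8051 above
POINTS = [(6.0, 5.0), (4.5, 3.3), (3.0, 2.2)]

def relative_figures(points):
    """Return (voltage, relative speed, relative energy per cycle) per point.

    Assumption: energy per cycle is proportional to V^2; the first point
    is treated as the full-speed reference.
    """
    f_ref, v_ref = points[0]
    return [(v, f / f_ref, (v / v_ref) ** 2) for f, v in points]

for v, speed, energy in relative_figures(POINTS):
    print(f"{v:.1f} V: speed {speed:.2f}, energy/cycle {energy:.2f}")
```

Under these assumptions, the 2.2 V point runs at half the clock rate while each cycle costs roughly a fifth of the energy it costs at 5 V.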
An Energy Metric for CPUs
  MIPJ: millions of instructions per joule = MIPS / watts
    MIPJ is unaffected by changes in clock speed alone
  Opportunity for quadratic energy savings
    As the clock speed is reduced by a factor of n, energy per cycle can be reduced by a factor of n^2
  Three methods to achieve this: voltage reduction, reversible logic, adiabatic switching
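To see why MIPJ ignores clock changes alone but improves quadratically once voltage follows the clock, here is a small sketch. It assumes MIPS scales with clock rate and power scales with V^2 * f; the function name and the normalized base values are hypothetical.

```python
def mipj(clock_scale, voltage_scale, base_mips=1.0, base_watts=1.0):
    """MIPJ = MIPS / watts under a simple dynamic-power model.

    Assumptions: MIPS scales with clock rate, and power scales with
    V^2 * f (energy per cycle proportional to V^2).
    """
    mips = base_mips * clock_scale
    watts = base_watts * voltage_scale ** 2 * clock_scale
    return mips / watts

print(mipj(0.5, 1.0))  # clock halved, voltage unchanged -> 1.0
print(mipj(0.5, 0.5))  # clock and voltage both halved (n = 2) -> 4.0
```

Halving only the clock leaves MIPJ at its baseline, while halving clock and voltage together gives the n^2 = 4 improvement.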
An Energy Metric for CPUs -cont’d
  Voltage reduction
    E/clock is directly proportional to V^2
    Lower-voltage, slower-clock chip: less energy per cycle
  Reducing the energy consumption
    The same number of cycles, but at a lower voltage
    Example: a task with a 100 ms deadline
      Method 1: 50 ms at full speed, 50 ms idle
      Method 2: 100 ms at half speed and half voltage
      Energy consumption, Method 1 : Method 2 = 4:1
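The 4:1 figure in the example follows directly from the V^2 relationship. This is a sketch with the work normalized to 1.0, assuming an idle CPU uses no energy; task_energy and CYCLES are illustrative names.

```python
def task_energy(cycles, voltage):
    """Relative energy for a task, assuming energy per cycle ~ V^2."""
    return cycles * voltage ** 2

CYCLES = 1.0  # work needed before the 100 ms deadline, normalized

method_1 = task_energy(CYCLES, voltage=1.0)  # 50 ms at full speed, then idle
method_2 = task_energy(CYCLES, voltage=0.5)  # 100 ms at half speed, half voltage

print(method_1 / method_2)  # 4.0
```

Both methods execute the same cycles and meet the deadline; only the voltage at which the cycles run differs.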
Approach of This Paper
  Energy-saving technique
    Fine-grained control of CPU clock speed
    Running slower and at reduced voltage
  Evaluation
    Using trace-driven simulation
  Goal
    To evaluate the energy savings
    To measure the effect of running too slow
Trace Data
  From the UNIX scheduler
  Workloads
    Software development, documentation, e-mail, simulation, ...
    Typing, scrolling
  Timestamp resolution: microseconds
  Sleep events: wait on hard or soft events
    Hard events: disk wait, page fault
    Soft events: keystroke, awaiting network packets
  Soft idle can be eliminated by rescheduling
  Hard idle is mandated by a wait on a device
Assumptions
  Simulation
    Soft events belong to idle periods
    No reordering of trace data events
    No energy is used when idle
    Switching speeds takes no time
    Periods longer than 30 seconds with more than 90% idleness are not considered
  Lower bound on practical speed (relative speed <=> minimum voltage)
    1.0 <=> 5 V
    0.66 <=> 3.3 V
    0.44 <=> 2.2 V
    0.2 <=> 1.0 V
Scheduling Algorithms
  OPT (unbounded-delay, perfect-future)
    Takes the entire trace
    Stretches all the run times to fill all the idle times
    Equivalent to an imaginary batch job with perfect knowledge
    Impractical and undesirable: bad response time
  FUTURE (bounded-delay, limited-future)
    Takes the future trace within a small window
    Window sizes: 1 ms to 400 s
    Impractical but desirable: good response time with a window of 10 to 50 ms
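The difference between OPT and FUTURE can be sketched on a toy trace. This is not the authors' simulator. It assumes every window is a (run time, soft idle time) pair, that energy per cycle scales with the square of the relative speed, and that idle time is free. The trace values and all function names are hypothetical.

```python
def clamp(speed, min_speed):
    """Keep a relative speed within [min_speed, 1.0]."""
    return max(min_speed, min(1.0, speed))

def opt_speed(trace, min_speed=0.44):
    """OPT: one speed that stretches all run time over all idle time."""
    run = sum(r for r, _ in trace)
    idle = sum(i for _, i in trace)
    return clamp(run / (run + idle), min_speed)

def future_speeds(trace, min_speed=0.44):
    """FUTURE: pick a speed per window from that window's own run/idle time."""
    return [clamp(r / (r + i), min_speed) for r, i in trace]

def relative_energy(trace, speeds):
    """Energy relative to always running at full speed (idle costs nothing)."""
    used = sum(r * s ** 2 for (r, _), s in zip(trace, speeds))
    return used / sum(r for r, _ in trace)

# Hypothetical trace: (run ms at full speed, soft idle ms) per 20 ms window
TRACE = [(10, 10), (20, 0), (6, 14)]

opt = opt_speed(TRACE)
fut = future_speeds(TRACE)
print(opt, relative_energy(TRACE, [opt] * len(TRACE)))
print(fut, relative_energy(TRACE, fut))
```

On this trace OPT runs everything at 0.6 speed and uses about 0.36 of the full-speed energy, but it delays the busy second window well past its 20 ms boundary. FUTURE never delays work past a window and uses about 0.66 of the full-speed energy.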
Scheduling Algorithms -cont’d
  PAST (bounded-delay, limited-past)
    Looks a fixed window into the past
    Assumes the next window will be like the previous one
    Examines the percentage of busy time during the previous interval and adjusts the speed for the next interval
    Excess cycles can build up if the speed (and voltage) is set too low => penalty metric
  Excess cycle penalty
    At each interval, count the leftover cycles that accumulated because the CPU ran too slowly
    Switch to full speed if there were more excess cycles than idle time in the previous interval
    Hard idle (page fault, disk request) cannot be squeezed
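The PAST rules above can be sketched as a small interval-by-interval simulation. The busy-fraction thresholds (high, low) and the speed step are illustrative assumptions, not values taken from the slides, and all idle time is treated as soft idle. Excess is measured as the time needed to finish the leftover work at full speed.

```python
def past_next_speed(speed, busy_fraction, excess, idle,
                    min_speed=0.44, high=0.7, low=0.5, step=0.2):
    """Choose the next interval's speed from the previous interval.

    high, low, and step are illustrative thresholds, not values taken
    from the slides.
    """
    if excess > idle:            # fell too far behind: go to full speed
        new_speed = 1.0
    elif busy_fraction > high:   # mostly busy: speed up
        new_speed = speed + step
    elif busy_fraction < low:    # mostly idle: slow down
        new_speed = speed - step
    else:
        new_speed = speed
    return max(min_speed, min(1.0, new_speed))

def simulate_past(demands, interval=20.0, min_speed=0.44):
    """Run PAST over per-interval work demands (ms of work at full speed).

    Returns a list of (speed used, excess left over) per interval.
    All idle time is treated as soft idle in this sketch.
    """
    speed, excess, history = 1.0, 0.0, []
    for work in demands:
        pending = excess + work
        done = min(pending, speed * interval)  # work that fits at this speed
        busy = done / speed                    # wall-clock time spent running
        idle = interval - busy
        excess = pending - done
        history.append((speed, excess))
        speed = past_next_speed(speed, busy / interval, excess, idle,
                                min_speed=min_speed)
    return history

for speed, excess in simulate_past([5, 5, 18, 18]):
    print(f"speed {speed:.2f}, excess {excess:.1f} ms")
```

With this hypothetical demand pattern, the speed steps down during the two light intervals, excess builds up when the load jumps, and the excess rule then forces a return to full speed.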
Trace Driven Simulation
  Trace points
    Sched: context switch away from a process
    Idle on: enter the idle loop
    Idle off: leave the idle loop to run a process
    Fork: create a new process
    Exec: overlay a new process with another program
    Exit: process termination
    Sleep: wait on an event
    Wakeup: notify a sleeping process
  Traces
    Short runs during specific tasks (editing, etc.)
    Long runs of several hours
Evaluation: The Results of Three Algorithms
Evaluation: Minimal Voltage & The Excess Cycles
Frequency: count of all excess-cycle values <= the x-value but > the previous x-value
Excess cycles: time needed to run the unfinished instructions at full speed
Lower minimum voltage => more cases where excess cycles build up => they accumulate over longer intervals => the peak extends to the right
Evaluation: Interval Length & Excess Cycles
Peak in excess cycles shifts right as the interval length increases
Longer scheduling interval => more excess cycles build up
Evaluation: Different Minimum Voltage Limits
2.2 V is almost as good as 1.0 V
Relative savings for different minimum voltages
Evaluation: Changing The Interval Length
A longer adjustment period results in more savings
Evaluation: Average Excess Cycles [1]
Lower minimum voltage => more excess cycles
Longer intervals => more excess cycles accumulate
Energy savings are a function of the interval size
Discussion & Future Work
  Use a feedback source other than idle time
  Classify jobs into background, periodic, and foreground
    Scheduling order: periodic, foreground, background
  No reordering vs. reordering
    Unless there is a large job mix, reordering is not significant
  I/O wait model: hard/soft
    Thought to be valid, but good to verify
Conclusions
  Preliminary results on CPU scheduling to reduce CPU energy usage
    Scheduling jobs at different clock rates
    Trace-driven simulation
    OPT / FUTURE / PAST
  PAST with a 50 ms window
    2.2 V to 5.0 V: this range provides good savings with moderate penalty
    Power savings of up to 50% (3.3 V) and 70% (2.2 V)
  The tortoise is more efficient than the hare.