ARM Vision for Thermal Management and Energy Aware Scheduling on Linux
Charles Garcia-Tobin, Software Power Architect, ARM
Thomas Molgaard, Director of Product Management, ARM
ARM Tech Symposia China 2015
November 2015
© ARM 2015 2
Agenda
Intelligent Power Allocation (IPA) - update
Energy Aware Scheduling (EAS)
Future vision of IPA and EAS
Mobile Application Workloads
Mobile users spend time on a range of mobile applications*:
38% on web browsing and Facebook
32% on gaming
16% on audio, video and utility
3 main workload categories:
Short bursts of high intensity
Long periods of sustained high intensity
Low intensity
Goal: optimal performance and thermal control for each category
[Figure: three power vs. time plots: "Web browsing – Bursty workload", "Gaming – Sustained workload", "Audio Playback". Measured on a Quad Cortex-A7 Symmetric Multiprocessing platform.]
* Source: Flurry Analytics
ARM Intelligent Power Allocation (IPA)
[Diagram: IPA (PID control algorithm) takes the SoC temperatures (Tdie, Tskin) and real-time CPU & GPU performance requests (big, LITTLE, GPU) as inputs and returns allocated performance for big, LITTLE and GPU. Power consumed in the SoC turns into heat, which closes the loop.]
Dynamic Allocation by:
• Performance required
• Thermal headroom
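The control loop described on this slide can be sketched as a small PID controller that converts thermal headroom into a power budget. This is a minimal illustration only, not ARM's or the kernel's implementation: the class name `PidPowerBudget`, the gains, the target temperature and the power figures are all assumptions made up for the example.

```python
from dataclasses import dataclass


@dataclass
class PidPowerBudget:
    """Toy PID controller: temperature in, power budget (mW) out.

    Every constant below is an illustrative assumption, not a value
    taken from the slides or from any real platform.
    """
    control_temp_c: float = 75.0          # target temperature (assumed)
    sustainable_power_mw: float = 2500.0  # power sustainable at the target (assumed)
    k_p: float = 200.0                    # proportional gain, mW per deg C (assumed)
    k_i: float = 10.0                     # integral gain (assumed)
    k_d: float = 0.0                      # derivative gain (assumed)
    max_power_mw: float = 6000.0          # upper clamp for the budget (assumed)
    _integral: float = 0.0
    _prev_error: float = 0.0

    def update(self, temp_c: float) -> float:
        """Return the power budget for the next control period."""
        error = self.control_temp_c - temp_c  # positive means thermal headroom
        self._integral += error
        derivative = error - self._prev_error
        self._prev_error = error
        budget = (self.sustainable_power_mw
                  + self.k_p * error
                  + self.k_i * self._integral
                  + self.k_d * derivative)
        # Never grant a negative budget or exceed the platform maximum.
        return min(max(budget, 0.0), self.max_power_mw)


if __name__ == "__main__":
    pid = PidPowerBudget()
    for temp in (60.0, 70.0, 75.0, 80.0):
        print(f"{temp:5.1f} C -> budget {pid.update(temp):7.1f} mW")
```

With headroom the budget rises above the sustainable level, and once the temperature passes the target the budget shrinks, which is the proactive behaviour the slide contrasts with simple trip-point throttling.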
ARM IPA Advancements
Proactive vs. Reactive Thermal Management
Continuously adapting response based on power consumption and thermal headroom
Closed-loop control uses PID algorithm for accurate temperature control
Dynamic Partitioning vs. Fixed
Optimally allocates power to CPU or GPU depending on current workload
Merged in Linux-4.2
Will benefit all operating systems based on Linux, no patches required
[Figure: temperature vs. time with IPA]
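The "Dynamic Partitioning" point can be illustrated with a short sketch that splits a power budget between actors in proportion to what each one requests. The function name `allocate_power`, the actor names and the milliwatt figures are hypothetical, and the real allocator may weight or redistribute power differently.

```python
from typing import Dict


def allocate_power(budget_mw: float, requests_mw: Dict[str, float]) -> Dict[str, float]:
    """Split a power budget between actors in proportion to their requests.

    Simplified sketch: if everything fits in the budget, every request is
    granted in full; otherwise each actor gets the same fraction of what
    it asked for.
    """
    total = sum(requests_mw.values())
    if total == 0 or total <= budget_mw:
        return dict(requests_mw)
    return {name: budget_mw * req / total for name, req in requests_mw.items()}


if __name__ == "__main__":
    # A GPU-heavy workload: the GPU asks for most of the power and gets
    # most of the (smaller) budget, instead of a fixed per-device cap.
    grants = allocate_power(3000.0, {"big": 1000.0, "LITTLE": 500.0, "GPU": 4500.0})
    for name, mw in grants.items():
        print(f"{name:6s} granted {mw:7.1f} mW")
```

A fixed partition would cap each device at a constant share regardless of demand; here the split follows the current workload.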
Challenges in Scheduling, Performance and Power
SoC CPU topologies are becoming more varied, accommodating different power/performance budgets:
SMP, multi-cluster SMP, ARM® big.LITTLE™ technology.
Per core/per cluster DVFS (Dynamic Voltage & Frequency Scaling)
Linux power management frameworks have evolved separately so are uncoordinated and hard to tune
We are developing a flexible unified upstream solution.
_All_ policy, all metrics, all averaging should happen at the scheduler power saving level, in a single place, and then the scheduler should directly drive the new low level idle state driver mechanism.
Ingo Molnar (31 Mar 2013)
How are we Solving This?
Introduce generic energy awareness in upstream Linux:
1. Integrate Idle, DVFS, scheduler big.LITTLE support
2. Clean design rather than short-cuts
3. Based on measurable energy model data rather than magic tunables
4. Support future CPU topologies
5. Maintained in upstream Linux, reduced software maintenance costs.
[Diagram: EAS building blocks: energy model driven scheduling, scheduler driven DVFS, analysis & tuning flows, tools, performance enhancements, idle CPU improvements, simple tunability. Blocks are marked as Linaro development or ARM development.]
What is EAS – the Energy Model
[Figure: compute capacity (performance) and utilization for CPUs 0 to 3, showing the current capacity and max capacity of the little cluster (CPU 0, 1) and the big cluster (CPU 2, 3), with a waking task still to be placed; a second chart compares the power of a little core and a big core.]
Placing task on cpu1: P-state change for cpu0 and cpu1.
Placing task on cpu3: No P-state changes.
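The placement decision on this slide can be sketched as an energy estimate per candidate CPU. This is a toy model under stated assumptions, not the EAS implementation: the P-state table `OPPS`, its capacity and power numbers, and the helper names are all invented for illustration, and idle-state costs are ignored.

```python
from typing import Dict, List, Tuple

# (capacity, busy power in mW) per P-state, lowest first. Made-up numbers.
OPPS: Dict[str, List[Tuple[int, int]]] = {
    "little": [(200, 50), (400, 300)],
    "big": [(600, 400), (1024, 1200)],
}
CLUSTER_OF = {0: "little", 1: "little", 2: "big", 3: "big"}


def cluster_power(cluster: str, util: Dict[int, int]) -> float:
    """Estimate the power of one cluster from per-CPU utilization."""
    cpus = [c for c, name in CLUSTER_OF.items() if name == cluster]
    peak = max(util[c] for c in cpus)
    # Per-cluster DVFS: the busiest CPU sets the P-state for the whole
    # cluster. If no P-state is large enough, use the highest one.
    cap, power = next((o for o in OPPS[cluster] if o[0] >= peak), OPPS[cluster][-1])
    # Each CPU is busy for util/cap of the time at that P-state.
    return sum(power * util[c] / cap for c in cpus)


def system_power(util: Dict[int, int]) -> float:
    return sum(cluster_power(name, util) for name in OPPS)


def best_cpu(util: Dict[int, int], task_util: int):
    """Return (cpu, deltas): the CPU with the smallest estimated power increase."""
    base = system_power(util)
    deltas = {}
    for cpu in util:
        trial = dict(util)
        trial[cpu] += task_util
        deltas[cpu] = system_power(trial) - base
    return min(deltas, key=deltas.get), deltas


if __name__ == "__main__":
    util = {0: 150, 1: 100, 2: 0, 3: 300}
    cpu, deltas = best_cpu(util, task_util=150)
    for c, d in deltas.items():
        print(f"cpu{c}: +{d:.1f} mW")
    print("chosen:", cpu)
```

With these assumed numbers, adding the task to cpu1 pushes the little cluster to a higher P-state and raises the cost of the work already running on cpu0, so a big CPU that needs no P-state change comes out cheaper. CPUs in the same cluster tie in this simplified model.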
Scheduler-driven DVFS
With scheduler task utilization tracking, DVFS can be notified immediately when CPU utilization changes = improved responsiveness.
[Figure: capacity and utilization for CPUs 0 to 3 (little and big clusters, current and max capacity, a waking task to be placed), next to a plot of cpu 1 utilization over time showing current capacity and current utilization between "Now" and "Next sample".]
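The idea of reacting to utilization changes, rather than waiting for the next sample, can be sketched as follows. The frequency table, the 25% headroom and the names `pick_frequency` and `SchedDrivenDvfs` are assumptions for this example, not the interface of any real governor.

```python
from bisect import bisect_left

FREQS_KHZ = [500_000, 800_000, 1_100_000, 1_400_000]  # assumed frequency table
MAX_CAPACITY = 1024  # capacity at the highest frequency
HEADROOM = 1.25      # assumed margin so the CPU is not run at 100%


def capacity_at(freq_khz: int) -> int:
    """Capacity scales linearly with frequency in this simplified model."""
    return MAX_CAPACITY * freq_khz // FREQS_KHZ[-1]


def pick_frequency(util: int) -> int:
    """Lowest frequency whose capacity covers utilization plus headroom."""
    caps = [capacity_at(f) for f in FREQS_KHZ]
    idx = bisect_left(caps, util * HEADROOM)
    return FREQS_KHZ[min(idx, len(FREQS_KHZ) - 1)]


class SchedDrivenDvfs:
    """Frequency is re-evaluated on every utilization change, not on a timer."""

    def __init__(self) -> None:
        self.util = 0
        self.freq_khz = FREQS_KHZ[0]

    def on_util_change(self, new_util: int) -> int:
        # Hypothetical hook, called when a task wakes, sleeps or migrates.
        self.util = new_util
        self.freq_khz = pick_frequency(new_util)
        return self.freq_khz


if __name__ == "__main__":
    dvfs = SchedDrivenDvfs()
    for util in (100, 400, 700, 1000):
        print(f"util {util:4d} -> {dvfs.on_util_change(util)} kHz")
```

A sampling governor would only see the new load at the next sample point; here the frequency request follows the wake-up directly.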
SchedTune
Current:
A set of governor-specific tunables.
Goal:
Single tunable to bias the energy/performance trade-off.
Prototypes:
Global boost tunable:
/proc/sys/kernel/sched_cfs_boost
Task group (cgroup) based tuning:
/sys/fs/cgroup/stune/<group>/schedtune.boost
[Figure: cpu utilization vs. time, showing current utilization, the performance margin added on top of it, and capacity.]
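One way to picture the performance margin is as a share of the unused capacity that is added to the tracked utilization. The formula below is an assumption for illustration, with a percentage-style boost value; the slides do not specify how the prototypes compute the margin, and `boosted_utilization` is a hypothetical name.

```python
MAX_CAPACITY = 1024  # full capacity of the fastest CPU (assumed scale)


def boosted_utilization(util: int, boost_pct: int) -> int:
    """Inflate utilization by boost_pct percent of the remaining headroom.

    boost_pct = 0 leaves the signal untouched (energy-biased);
    boost_pct = 100 makes every task look like it needs full capacity
    (performance-biased).
    """
    if not 0 <= boost_pct <= 100:
        raise ValueError("boost_pct must be between 0 and 100")
    util = min(max(util, 0), MAX_CAPACITY)
    margin = (MAX_CAPACITY - util) * boost_pct // 100
    return util + margin


if __name__ == "__main__":
    for boost in (0, 10, 50, 100):
        print(f"boost {boost:3d}% -> utilization {boosted_utilization(200, boost)}")
```

Because CPU and frequency selection both consume the inflated value, a single number shifts the whole system toward performance or toward energy saving, which matches the stated goal of one tunable.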
Results – RFCv5 @ ARM TC2
[Figure: two bar charts comparing Mainline, +capacity awareness and +EAS, both on a 0 to 120 scale. Energy (lower is better): rt-app 6%, rt-app 13%, rt-app 19%, rt-app 25%, sysbench. Performance (higher is better): sysbench.]
Building a Platform for Power and Performance Management
Improving Power Efficiency, Performance, and Time to Market
2014: HMP
HMP big.LITTLE, big DVFS, LITTLE DVFS, CPU Thermal, GPU Thermal, GPU DVFS
2015: HMP, IPA
IPA Gen 1 System Thermal, GPU DVFS, HMP big.LITTLE, big DVFS, LITTLE DVFS
2016: EAS Gen 1, IPA
EAS Gen 1 big.LITTLE CPU DVFS, IPA Gen 1 System Thermal, GPU DVFS
Future: EAS+IPA Gen 2
EAS Gen 2 big.LITTLE CPU DVFS, Advanced Performance control, IPA Gen 2 System Thermal, GPU DVFS
EAS Generation 1, IPA – in Development
EAS brings together DVFS control and CPU selection
+ Simpler to tune
EAS adds power and performance models to the Scheduler
+ Supports wide range of topologies
+ Core and OPP selection based on thread load and models
+ Minimal power for required throughput
- Thread performance requirement is inferred only from its load
ARM Intelligent Power Allocation (IPA)
+ Maximizes performance within thermal envelope
+ IPA Generation 1 available in Linux 4.2 kernel
- Only loosely coupled with EAS
[Diagram: IPA Gen1: a Power Allocator with a Perf. Limiter and a Power Model for each of GPU, CPU and Other. EAS Gen1: DVFS and CPU Selection driven by a Power Model, a Perf Model and thread load.]
EAS + IPA generation 2 - Future
Tight integration with thermal management
IPA provides scheduler with aggregate CPU power budget
Advanced performance control
Aims to introduce quality of service per thread controls
Interface would be exposed to middleware
Guided by application performance hints
[Diagram: EAS + IPA Gen 2: a Power Allocator with Perf. Limiters and Power Models for GPU, CPU and Other, combined with DVFS and CPU Selection driven by a Perf Model, a Power Model and thread load, plus an Advanced Perf Control block.]
Summary
ARM has been driving innovation in Linux
‘Intelligent Power Allocation’. Upstreamed in Linux 4.2 (30-Aug-2015)
‘Energy Aware Scheduling’ for advanced SoCs
ARM Intelligent Power Allocation is designed to maximise performance:
Proactively adjusts available power budget, based on device temperature
Allocates power dynamically between CPU/GPU, based on workload
Energy Aware Scheduling
Integrates CPU capacity awareness, Energy model, DVFS & Idle into mainline Linux scheduler
Designed to support a wide range of topologies
Prototypes running today
Further integration and improvements being investigated