Recent Power Management Enhancements in DPDK
David Hunt, Chris MacNamara
2
Updates Since Last Time
• Quick reminder: feature review
• Updates & Discussions to follow
• How does an application let something know how busy it is?
• New telemetry mechanism targets polling workloads
• How do we get more performance for the same power?
• New hardware mechanisms for giving frequency where it’s needed
3
Existing DPDK Power Capabilities
Many use cases, support for direct control, virtualized architecture
[Diagram: a DPDK application in a virtual machine (VCPU0, VCPU1) sends policy and control messages over virtio-serial to the VM Power Manager on the host. Host-side DPDK sample applications (sample policy: time of day, packet arrival rate, using NIC PF / NIC VF VSI stats) call the librte_power APIs, which act through the Linux power governor on CPU0 to CPU7.]
4
Usage Options
• Bare Metal – librte_power APIs
• Turn up/down frequency to save energy, performance via
• Virtualized Applications
• App in VM sends change frequency command to host (vm_power_manager)
• App in VM sends policy (time of day) to host (vm_power_manager)
• Virtualized Environment with no Application Instrumentation
• Using CPU counters, can tell idle or busy, not how busy…
Docs on the way:
https://dpdk-power-docs.readthedocs.io/en/latest/sample_app_ug/vm_power_management.html
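The last option, detecting idle versus busy from CPU counters with no application instrumentation, can be sketched roughly as below. This is a minimal model, assuming the counters are cumulative branch and branch-miss counts (the deck names the branch prediction ratio as the trigger) and that an empty poll loop is highly predictable, so its miss ratio drops. The `CounterSample` type, the function names and the 0.01 threshold are hypothetical and not taken from the presentation.

```python
from dataclasses import dataclass

# Hypothetical threshold: below this miss ratio the core is treated as
# polling with no work. A real deployment would tune this per platform.
IDLE_BRANCH_RATIO = 0.01


@dataclass
class CounterSample:
    branches: int       # cumulative branch instructions
    branch_misses: int  # cumulative mispredicted branches


def branch_miss_ratio(prev: CounterSample, curr: CounterSample) -> float:
    """Miss ratio over the interval between two cumulative samples."""
    branches = curr.branches - prev.branches
    misses = curr.branch_misses - prev.branch_misses
    if branches <= 0:
        return 0.0
    return misses / branches


def classify_core(prev: CounterSample, curr: CounterSample,
                  threshold: float = IDLE_BRANCH_RATIO) -> str:
    """Return 'idle' or 'busy'. The ratio does not say how busy."""
    return "idle" if branch_miss_ratio(prev, curr) < threshold else "busy"
```

The result is binary by design, which matches the limitation noted on the slide: the counters can tell idle from busy, but not how busy.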
5
Existing DPDK Power Features
Challenge / Problem | DPDK Solution / Status
L3fwd power using C states (updates coming) | Sample app
Traffic always running, always on cores. Save power when low traffic, boost when busy. | Added core Frequency State APIs including Turbo Boost
Virtualized Software Architecture: High latency of direct requests to change frequency | Move to policy based control: Time of day / Packet Arrival Rate
App Agnostic mechanism to detect when DPDK is 100% polling and no packets or work | Sample code: Branch prediction ratio used as trigger to detect idle -> modify power
Pin DPDK threads/lcores to high priority cores | Pinning relevant workloads to Turbo Cores
Power Policies for Containers | FIFO interface to Power Manager that accepts policies via JSON
Librte_power APIs and Sample Apps
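As an illustration of the container path (a FIFO interface that accepts policies via JSON), the sketch below builds a time-of-day policy and writes it to a FIFO. The key names (`policy`, `command`, `policy_type`, `busy_hours`, `quiet_hours`, `core_list`) are modeled on the vm_power_manager sample guide as I recall it and should be treated as assumptions to check against the DPDK version in use. The helper names and the FIFO path argument are hypothetical.

```python
import json


def build_time_policy(name, core_list, busy_hours, quiet_hours):
    """Build a time-of-day policy dict. Key names are assumptions."""
    for hour in list(busy_hours) + list(quiet_hours):
        if not 0 <= hour <= 23:
            raise ValueError(f"hour out of range: {hour}")
    overlap = set(busy_hours) & set(quiet_hours)
    if overlap:
        raise ValueError(f"hours both busy and quiet: {sorted(overlap)}")
    return {
        "policy": {
            "name": name,
            "command": "create",
            "policy_type": "TIME",
            "busy_hours": sorted(busy_hours),
            "quiet_hours": sorted(quiet_hours),
            "core_list": sorted(core_list),
        }
    }


def send_policy(policy, fifo_path):
    """Write the policy as a single JSON document to the given FIFO path."""
    with open(fifo_path, "w") as fifo:
        fifo.write(json.dumps(policy))
```

A caller would build the policy once and pass the path of the power manager's FIFO, for example `send_policy(build_time_policy("app0", [11], [17, 18, 19], [2, 3, 4]), fifo_path)`, where `fifo_path` depends on how the power manager was started.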
6
New DPDK Features Since Last Time!
Challenge / Problem | DPDK Solution / Status
Exposing how busy an application is to external systems, no standard method | Introduce new telemetry for busyness metrics, patch to DPDK, collectd
Some workloads need more performance to help balance a multi-core workload | Add support for Intel® Speed Select Technology – Base Frequency – pinning to high priority cores
Enhanced security around Container communication to Power Manager | Updates completed
New triggers and capabilities enabling new use cases
Determining Busyness of Polling Applications
… And How to Use It
8
1. Busy Indication Used to Save Energy
[Diagram: DPDK applications polling at 100% in a VM, in a container and on bare metal report application load as busyness telemetry through the collectd telemetry agent (DPDK plugin) to a node agent. In a remote/slow reaction loop (s/m/h), the node agent uses CPU power controls to adjust frequency to match load.]
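The "adjust frequency to match load" step of this slow loop could look roughly like the sketch below. It is only an illustration: the two thresholds, the function name and the convention that a higher index means a higher frequency are all assumptions made here, not details from the slide or from librte_power.

```python
def next_freq_index(current, busyness_pct, n_levels,
                    low_pct=30.0, high_pct=80.0):
    """Pick the next frequency level from a busyness reading.

    Levels run from 0 (lowest frequency) to n_levels - 1 (highest);
    this ordering is an assumption of the sketch. The gap between
    low_pct and high_pct acts as hysteresis so the level does not
    flap on small changes in load.
    """
    if busyness_pct > high_pct:
        return min(current + 1, n_levels - 1)
    if busyness_pct < low_pct:
        return max(current - 1, 0)
    return current
```

Moving one level per reading suits a loop that reacts over seconds or longer; a faster loop would likely need a different policy.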
9
2. Busy Indication to Detect Overload
[Diagram: DPDK applications polling at 100% in a VM, in a container and on bare metal report busyness telemetry through the collectd telemetry agent (DPDK plugin, plus turbostat plugin for PkgWatts) to a node agent. In a remote/slow reaction loop (s/m/h), overload is acted on through CPU power controls, traffic steering and load balancing.]
10
Pushed Patches To Support This (telemetry)
@init: Application calculates what constitutes 100% busy
@development: Application implements algorithm for its own ‘busyness’
@run: Application populates relevant metrics
@run: Third party app pulls metrics (collectd, dpdk_telemetry.py)
• Released as part of DPDK 19.08
• In addition to per-port metrics, the telemetry lib now has global metrics per app (busyness, number of polls, etc.).
• L3fwd-power implements an example algorithm to demonstrate populating the new metrics.
• dpdk-telemetry.py now shows the new global stats
• New plugin for “collectd” being upstreamed to be able to view DPDK telemetry.
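The init / run split above can be modeled as follows. This is a hedged sketch, not the l3fwd-power algorithm: it assumes busyness is the number of non-empty polls in an interval relative to a count the application fixed at init as "100% busy". The class name, metric names and that definition are hypothetical.

```python
class BusynessMeter:
    """Toy per-application busyness metric for a polling loop."""

    def __init__(self, full_busy_polls):
        # @init: the application decides what constitutes 100% busy.
        if full_busy_polls <= 0:
            raise ValueError("full_busy_polls must be positive")
        self.full_busy_polls = full_busy_polls
        self.polls = 0
        self.non_empty_polls = 0

    def record_poll(self, packets):
        # @run: called once per poll with the number of packets received.
        self.polls += 1
        if packets > 0:
            self.non_empty_polls += 1

    def snapshot(self):
        # @run: metrics a third-party reader would pull; resets the interval.
        busy = min(100.0, 100.0 * self.non_empty_polls / self.full_busy_polls)
        metrics = {
            "busy_percent": busy,
            "poll_count": self.polls,
            "empty_poll": self.polls - self.non_empty_polls,
        }
        self.polls = 0
        self.non_empty_polls = 0
        return metrics
```

Each application would substitute its own notion of busyness here, which is the point of leaving the algorithm to the application.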
11
Use telemetry to make informed decisions
Allowing applications to publish how busy they are
• Patch to telemetry library to allow for global stats, not just port specific
• Patch to l3fwd-power to add telemetry mode to publish busyness
• New plugin for collectd to read this telemetry
• White paper in progress to demonstrate use case for putting it all together
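[Diagram, numbered steps 1 to 4: on the Platform, a 100% polling application (e.g. DPDK) in a VM publishes busy indication metrics, alongside PkgWatts, to collectd; metrics are displayed in Grafana; a node or application overload indication reaches the Network Load Controller; its steering decision drives the Load Balancer on the Network Platform, which directs traffic.]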
12
Use case for busyness telemetry
• Alert generated when busyness goes over 80%
• Network Load Controller splits traffic on each alert
• Each traffic stream goes to different destination
• Load balancing takes effect
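The alert-and-split behaviour in these bullets can be sketched as a small controller loop. The 80% threshold comes from the slide; the function name, the cap on the number of streams and the one-split-per-alert rule as coded here are assumptions for illustration.

```python
def run_controller(busyness_samples, max_streams=4, threshold_pct=80.0):
    """Return the number of traffic streams in use after each sample.

    Every sample above the threshold counts as one alert, and each alert
    adds one stream (one more destination) until max_streams is reached.
    """
    streams = 1
    history = []
    for busyness in busyness_samples:
        if busyness > threshold_pct and streams < max_streams:
            streams += 1
        history.append(streams)
    return history
```

For example, `run_controller([50, 85, 60, 90, 95, 99, 99])` yields `[1, 2, 2, 3, 4, 4, 4]`: three alerts split the traffic into four streams, after which further alerts have no effect in this model.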
[Charts: two panels, “Orchestration now” (Rx bytes across all VMs / Containers) and “DPDK Busyness visibility” (alerts, Rx bytes and ‘busyness’ across all VMs / Containers), as traffic goes from 1 stream to 4 streams.]
Maximising Performance within the same Power Envelope
14
Prioritization?
Performance Prioritization Opportunity
DPDK Based Networking Workloads
• Unbalanced workloads – NFV Data Plane or Control Plane, OVS
• Pipeline software architectures
• Frequency bound workloads
• Priority threads for run to completion
• Packet distribution / workload distribution in SW
• PMD consolidation
15
Unlock performance bottlenecks
Re-balancing power & frequency to enhance DPDK performance
• A CPU mode allows SW to configure an asymmetric core frequency.
• Placing key workloads on the cores enabled for higher frequency can increase overall system performance compared with deploying the CPU with symmetric core frequencies.
• 6-8 High Priority cores, depending on CPU.
• Intel® SST-BF allows this configuration
16
Application Interface
@init: Application pins critical workloads to high priority cores
@init: Application queries capabilities of each available core
@run: Application runs as normal
• Additional bit returned by rte_power_get_capabilities indicating High Priority cores:
Implemented in l3fwd-power sample
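The query-then-pin flow can be modeled without DPDK as below. This is a sketch under stated assumptions: the per-core capability word is represented as a plain integer, the bit positions `CAP_TURBO` and `CAP_PRIORITY` are hypothetical stand-ins for the flags that `rte_power_get_capabilities` reports, and the placement rule (critical workloads first, overflow onto normal cores) is one possible choice rather than what l3fwd-power does.

```python
# Hypothetical bit positions standing in for the capability flags.
CAP_TURBO = 1 << 0
CAP_PRIORITY = 1 << 1


def high_priority_cores(capabilities):
    """capabilities maps core id -> capability bits; return priority cores."""
    return sorted(c for c, caps in capabilities.items() if caps & CAP_PRIORITY)


def place_workloads(workloads, capabilities):
    """workloads is a list of (name, is_critical); returns name -> core id.

    Critical workloads take high priority cores first. Once those run out,
    the remaining critical workloads fall back to normal cores.
    """
    hp = high_priority_cores(capabilities)
    normal = sorted(set(capabilities) - set(hp))
    placement = {}
    # Stable sort: critical workloads first, original order otherwise kept.
    for name, critical in sorted(workloads, key=lambda w: not w[1]):
        pool = hp if (critical and hp) else (normal or hp)
        if not pool:
            raise ValueError("more workloads than cores")
        placement[name] = pool.pop(0)
    return placement
```

After placement, the application runs as normal, matching the @run step above.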
17
No DPDK Application or Workload Mods
• Just launch…using EAL launch to map lcores to physical cores
• It was hidden (to us anyway)
• https://doc.dpdk.org/guides/linux_gsg/linux_eal_parameters.html
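For an unmodified application, the mapping can be expressed with the EAL `--lcores` option, whose core map uses the `lcore@cpu` form. The helper below is a small sketch that builds that argument from a mapping; the helper name and the choice of physical cores 8 and 9 as the high priority cores are hypothetical.

```python
def lcores_args(mapping):
    """Build EAL arguments mapping each lcore id to one physical CPU id.

    mapping is a dict of lcore id -> physical CPU id. Returns an argv
    fragment such as ["--lcores", "0@8,1@9"].
    """
    if not mapping:
        raise ValueError("mapping must not be empty")
    if len(set(mapping.values())) != len(mapping):
        raise ValueError("each lcore should get its own physical CPU")
    core_map = ",".join(f"{lcore}@{cpu}" for lcore, cpu in sorted(mapping.items()))
    return ["--lcores", core_map]
```

The returned list can be prepended to the application's own arguments, so lcores 0 and 1 run on CPUs 8 and 9 with no change to the application code.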
Thank You
Chris MacNamara ([email protected])
David Hunt ([email protected])
Acknowledgements
Lee Daly, Reshma Pattan, Anatoly Burakov, Liang Ma
+ “Collectd” team https://github.com/collectd