Profiling Multicore Systems to Maximize Core Utilization
DESCRIPTION
Profiling Multicore Systems to Maximize Core Utilization – Colin Walls. Underutilization of cores in a multicore system can be considered a bug. As your system incorporates more cores, you need to make sure that all the cores are being utilized fully. Unexpected interactions between processes, the operating system, and resources can prevent cores from delivering peak performance. In this session, explore how to profile what each core is doing, see which processes are running on each core, and understand where core utilization falls below optimum values.
TRANSCRIPT
mentor.com/embedded
Colin Walls
Profiling Multicore Systems to Maximize Core Utilization
Multicore Drives Complexity
Almost two-thirds of all future projects plan to use multi-core or multi-processor devices!
Open Source OS Use
– Previous Project: 23.9%
– Current Project: 32.2%
– Next Two Years: 37.2%
Multicore & Multi-processor Use
– Current Project: 62.7%
– Next Two Years: 65.1%
*Source: VDC Research Group, STRATEGIC INSIGHTS 2012: EMBEDDED SOFTWARE & TOOLS MARKET, TRACK 2: Embedded Software Engineering Technologies, VOLUME 3: Software Development & Multicore Tool.
Complexity Stresses Timing
Single Processor, Single Core
– Bare Metal / RTOS
– Single Application
– Manual Debugging
Multi-Processor, Multi-Core
– Complex OS
– Multiple Applications
– Different Defects
– Complex Debugging
Debuggers: Stop and Stare
Debuggers are indispensable, but they only show a snapshot.
From this photo, can you tell if this building will be completed on schedule?
– How long does it usually take this worker?
– Would better tools help?
– Are other workers sitting idle?
Construction Worker by Rubber Dragon
Tracing, Instrumenting, Logging
Historically, tracing involved a hardware instrument
– Or on-chip logic
– Buffer size limited
– Completely non-intrusive
– Ideal in an ISS (instruction set simulator)
Instrumenting application code
– Adding custom code
– Maybe conditionally compiled
– Debugging with printf()
Logging option with many RTOSes
Photo by woodleywonderworks
Beyond Debuggers
Answering the higher-level questions requires information that traditional interactive debuggers lack:
– Tracing historical state
– Application awareness
Tracing can help find:
– race conditions
– latencies
– bugs that don't cause traps
– systems where stopping the world isn't feasible
... in both application and platform code
HRB, Analyzer, Sep 2012
Trace Data Sources – Linux Trace Toolkit
Sourcery Analyzer focuses on LTTng to record and collect trace data on Linux.
– Mature, high-performance tracing system for Linux
– Can record both kernel and userspace events
– Low overhead
Linux Trace Toolkit - next generation
Sourcery Analyzer with LTTng Architecture
[Architecture diagram. Linux target: C/C++ Application, Linux Kernel, LTTng Consumer Daemon, Storage (memory, flash, disk, network FS). Host: Sourcery Analyzer. Link between host and target: network.]
LTTng 2.0 Attributes
Tracepoints
• Low overhead
• No trap or system call required
• Suitable for use in realtime systems
• Inactive tracepoints have negligible overhead
Common Trace Format
• New compact binary format
• Flexible data layout
• Network streamable
• Size and seek optimized for very large trace files
Deployment
• Loadable kernel module (2.6.38+)
• Companion target side daemons and libraries
Linux Kernel Tracing - 3.6.6
250+ Tracepoints
Sourcery Analyzer - Not Just A Trace Viewer
Trace viewing tools depend on users to find the patterns.
Sourcery Analyzer focuses on analysis. Task-centric Analysis Agents calculate and display the higher-level patterns.
Analysis Agents
Event List
Viewing Trace Data
Sourcery Analyzer inherited its engine from Mentor's high-end hardware design tools.
– high-performance event database
– sophisticated measurement tools
– variety of visualization types
Visualize event payloads, not just events.
Lamborghini Engine by Dr. Warner
Customizability is Important
Most developers are working on the application, but most debugging tools provide only platform awareness.
To compensate, developers often cobble together in-house debugging tools.
Mentor Embedded Sourcery Analyzer provides platform visibility and a rich platform for user-developed analysis tools.
[Diagram labels: application (where most work occurs); platform (operating system, hardware); Stock 3rd‑party Tools; In-house Tools; Sourcery Analyzer (out-of-the-box Analysis Agents, customized Analysis Agents).]
Analysis Agents
• Out-of-box access to powerful analysis routines
• Ships with library of 15 popular agents
• One-click flow to automatically generate pre-processed analysis views
• Ability to also create and add customized agents to the library
Software thread state
Scheduling
CPU utilization
IRQ rate
Page fault rate
Function call flow
CPU state
Filesystem activity
Network activity
Thread migration rate
or add your own
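As a rough illustration of what agents such as "CPU utilization" and "Thread migration rate" compute, the sketch below derives both from a list of scheduler switch events. It is not Sourcery Analyzer's or LTTng's API: the `(timestamp, cpu, next_tid)` tuple layout, the function names, and the convention that thread id 0 means "idle" are all assumptions made for this example.

```python
from collections import defaultdict

IDLE_TID = 0  # assumption: thread id 0 stands for the idle task


def core_utilization(events, end_time):
    """Return {cpu: busy_fraction} for time-sorted (timestamp, cpu, next_tid) events.

    Each event means `next_tid` starts running on `cpu` at `timestamp`.
    The trace is taken to span from the first event to `end_time`; time on a
    CPU before its first event is counted as idle because nothing is known.
    """
    if not events:
        return {}
    span = end_time - events[0][0]
    if span <= 0:
        raise ValueError("end_time must be later than the first event")
    busy = defaultdict(float)
    current = {}  # cpu -> (time the current thread started, its tid)
    for ts, cpu, tid in events:
        if cpu in current:
            since, prev_tid = current[cpu]
            if prev_tid != IDLE_TID:
                busy[cpu] += ts - since
        current[cpu] = (ts, tid)
    # Close the interval that is still open on each CPU at the end of the trace.
    for cpu, (since, tid) in current.items():
        if tid != IDLE_TID:
            busy[cpu] += end_time - since
    return {cpu: busy[cpu] / span for cpu in sorted(current)}


def migration_counts(events):
    """Count how often each thread resumes on a different CPU than its last one."""
    last_cpu = {}
    migrations = defaultdict(int)
    for _ts, cpu, tid in events:
        if tid == IDLE_TID:
            continue
        if tid in last_cpu and last_cpu[tid] != cpu:
            migrations[tid] += 1
        last_cpu[tid] = cpu
    return dict(migrations)


# Hypothetical two-core trace: thread 101 starts on CPU 0 and later moves to CPU 1.
example_events = [
    (0.0, 0, 101),
    (0.0, 1, IDLE_TID),
    (2.0, 0, IDLE_TID),
    (4.0, 1, 101),
    (6.0, 0, 102),
    (8.0, 1, IDLE_TID),
]
print(core_utilization(example_events, end_time=10.0))
print(migration_counts(example_events))
```

A per-core fraction well below the others, or a thread with a high migration count, is the kind of pattern worth inspecting on the timeline.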
Sourcery Analyzer Graph Types
Multicore Utilization
Statistical
Step
State
Flow
Scatter
Histogram
Spectral
Tick
Digital
Floating
Real World Example
Old Design: RTOS, single-core
– maximum ~200 ms
– average ~150 ms
– minimum ~40 ms
New Design: Linux, multicore
– maximum 7000+ ms
– average ~150 ms
– minimum ~40 ms
Diagnosing Problems: Real-time Response
Common problem: a real-time deadline is occasionally, but rarely, missed.
Approach:
– Instrument the start/stop measurement points (e.g. IRQ and application's “read” function).
– Run the test workload.
– Use Sourcery Analyzer to highlight only the missed deadlines.
– Correlate those occurrences with other system activities on the timeline.
– If more detailed data is needed, add instrumentation and repeat.
[Figure labels: OK, Not OK, User-specified budget]
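The approach above can be sketched in a few lines of analysis code. This is a minimal illustration, not the tool's implementation: it assumes the trace has already been exported as time-sorted `(timestamp_ms, name)` tuples, and the event names (`irq`, `read`, `page_fault`, `thread_migration`), the function names, and the 200 ms budget are hypothetical.

```python
def response_intervals(events, start="irq", stop="read"):
    """Pair each start event with the next stop event.

    `events` is a time-sorted list of (timestamp_ms, name) tuples.
    Returns a list of (start_ts, stop_ts) intervals.
    """
    intervals = []
    pending = None  # timestamp of a start event still waiting for its stop
    for ts, name in events:
        if name == start and pending is None:
            pending = ts
        elif name == stop and pending is not None:
            intervals.append((pending, ts))
            pending = None
    return intervals


def missed_deadlines(intervals, budget_ms):
    """Keep only the intervals whose duration exceeds the user-specified budget."""
    return [(a, b) for a, b in intervals if b - a > budget_ms]


def activity_during(events, interval, ignore=("irq", "read")):
    """List other events that fall inside one interval, for correlation."""
    a, b = interval
    return [(ts, name) for ts, name in events
            if a <= ts <= b and name not in ignore]


# Hypothetical trace: three IRQ-to-read responses, one of them slow.
trace = [
    (0, "irq"), (40, "read"),
    (100, "irq"), (120, "page_fault"), (130, "thread_migration"), (400, "read"),
    (500, "irq"), (650, "read"),
]
intervals = response_intervals(trace)
durations = [b - a for a, b in intervals]
print("min/avg/max ms:", min(durations), sum(durations) / len(durations), max(durations))
for miss in missed_deadlines(intervals, budget_ms=200):
    print("missed:", miss, "concurrent activity:", activity_during(trace, miss))
```

Filtering to the misses first keeps the correlation step small; if the concurrent activity does not explain a miss, that is the cue to add instrumentation and repeat.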