
Tutorial: Finding Hotspots on a Remote Linux* System

Intel® VTune™ Amplifier for Systems

Linux* OS

C++ Sample Application Code

Document Number: 330219-001

Legal Information


Contents

Legal Information
Overview
Chapter 1: Navigation Quick Start
Chapter 2: Finding Hotspots
    Prepare Your Target Device
    Cross Build and Load the Sampling Drivers
    Prepare Your Sample Application
    Run Advanced Hotspot Analysis
    View Your Results
Chapter 3: Summary
Chapter 4: Key Terms


Legal Information

No license (express or implied, by estoppel or otherwise) to any intellectual property rights is granted by this document.

Intel disclaims all express and implied warranties, including without limitation, the implied warranties of merchantability, fitness for a particular purpose, and non-infringement, as well as any warranty arising from course of performance, course of dealing, or usage in trade.

This document contains information on products, services and/or processes in development. All information provided here is subject to change without notice. Contact your Intel representative to obtain the latest forecast, schedule, specifications and roadmaps.

The products and services described may contain defects or errors which may cause deviations from published specifications.


Intel processor numbers are not a measure of performance. Processor numbers differentiate features within each processor family, not across different processor families. Go to: Learn About Intel® Processor Numbers

Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products.

Cilk, Intel, the Intel logo, Intel Atom, Intel Core, Intel Inside, Intel NetBurst, Intel SpeedStep, Intel vPro, Intel Xeon Phi, Intel XScale, Itanium, MMX, Pentium, Thunderbolt, Ultrabook, VTune and Xeon are trademarks of Intel Corporation in the U.S. and/or other countries.

*Other names and brands may be claimed as the property of others.

Microsoft, Windows, and the Windows logo are trademarks, or registered trademarks of Microsoft Corporation in the United States and/or other countries.

© 2015 Intel Corporation.


Overview

Discover how to use Advanced Hotspots Analysis of the Intel® VTune™ Amplifier for Systems to understand where your embedded application is spending time by identifying hotspots - the most time-consuming program units. Advanced Hotspots Analysis is useful to analyze the performance of both serial and parallel applications. The Intel VTune Amplifier for Systems supports analysis of remote Linux* applications running on regular or embedded Linux systems, but this tutorial will focus on embedded platforms.

About This Tutorial

    This tutorial uses the sample tachyon and guides you through the basic steps required to use the GUI to analyze the code for hotspots by means of remote data collection.

Estimated Duration

    • 20 minutes: Preparing your host and target device for use
    • 15 minutes: Preparing your sample application and analyzing it

Learning Objectives

    After you complete this tutorial, you will be able to find hotspots by:

    • Preparing your target device
    • Cross building and loading the sampling drivers
    • Preparing your sample application, tachyon
    • Running an Advanced Hotspot Analysis
    • Viewing your results

More Resources

    • The Intel Developer Zone is a site devoted to software development tools, resources, forums, blogs, and knowledge bases, see http://software.intel.com

    • The Intel Software Documentation Library is part of the Intel Developer Zone and is an online collection of Release Notes, User and Reference Guides, White Papers, Help, and Tutorials for Intel software products, http://software.intel.com/en-us/intel-software-technical-documentation

    • For troubleshooting the creation and installation of the sep drivers, see http://software.intel.com/en-us/articles/troubleshooting-issues-with-sep-in-the-embedded-tool-suite-intel-system-studio

Start Here


Navigation Quick Start

The Intel® VTune™ Amplifier for Systems provides information on code performance for users developing serial and multithreaded applications on supported embedded platforms. VTune Amplifier helps you analyze algorithm choices and identify where and how your application can benefit from available hardware resources. It reports your most significant problems, thereby showing you the best ways to utilize your available optimization schedule and resources.

VTune Amplifier for Systems Graphical User Interface (GUI) Access

The VTune Amplifier installation includes shell scripts that you can run in your terminal window to set up required environment variables:

1. From the installation directory, enter source amplxe-vars.sh.

   This script sets the PATH environment variable that specifies locations of the product's graphical user interface and command line utilities.

   NOTE
   For the VTune Amplifier for Systems installed as part of Intel System Studio, the default <install_dir> is:

   For super-users: /opt/intel/system_studio_<version>/vtune_amplifier_<version>_for_systems
   For ordinary users: $HOME/intel/system_studio_<version>/vtune_amplifier_<version>_for_systems

   For the standalone VTune Amplifier for Systems installed without Intel System Studio, the default <install_dir> is:

   For super-users: /opt/intel/vtune_amplifier_for_systems_<version>
   For ordinary users: $HOME/intel/vtune_amplifier_for_systems_<version>

2. You can modify your login shell to include these important shell variables. For example, if you use the bash shell, you can add this line to your $HOME/.bashrc, replacing <install_dir> with the installation directory described in the note above:

   source <install_dir>/amplxe-vars.sh

3. Enter amplxe-gui to launch the product graphical interface.
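The environment setup above can be scripted. The following is a minimal sketch, assuming a POSIX shell; find_amplxe_vars is a hypothetical helper introduced here for illustration and is not part of the product. It prints the first amplxe-vars.sh found among the candidate installation directories you pass to it.

```shell
# find_amplxe_vars DIR...: print the path of the first amplxe-vars.sh found
# in the given candidate installation directories (hypothetical helper).
# Returns non-zero when none of the directories contains the script.
find_amplxe_vars() {
    for dir in "$@"; do
        if [ -f "$dir/amplxe-vars.sh" ]; then
            printf '%s\n' "$dir/amplxe-vars.sh"
            return 0
        fi
    done
    return 1
}

# Example use (not executed here): try the standalone default locations,
# source the script into the current shell, then start the GUI.
#   vars=$(find_amplxe_vars /opt/intel/vtune_amplifier_for_systems_* \
#          "$HOME"/intel/vtune_amplifier_for_systems_*) && . "$vars" && amplxe-gui
```

Keeping the lookup separate from the sourcing step means the same helper can be reused in a login script without launching the GUI.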


Configure and manage projects and results, and launch new analyses from the primary toolbar. Click the Project Properties button on this toolbar to manage result file locations. Newly completed and opened analysis results along with result comparisons appear in the results tab for easy navigation.

Use the VTune Amplifier menu to control result collection, define and view project properties, and set various options.

The Project Navigator provides an iconic representation of your projects and analysis results. Click the Project Navigator button on the toolbar to enable/disable the Project Navigator.

Click the (change) link to select a viewpoint, a preset configuration of windows/panes for an analysis result. For each analysis type, you can switch among several viewpoints to focus on particular performance metrics. Click the yellow question mark icon to read the viewpoint description.

Switch between window tabs to explore the analysis type configuration options and collected data provided by the selected viewpoint.

Use the Grouping drop-down menu to choose a granularity level for grouping data in the grid.

Use the filter toolbar to filter out the result data according to the selected categories.

Next step: Finding Hotspots


Finding Hotspots

Use the Intel® VTune™ Amplifier for Systems to identify and analyze hotspot functions in your serial or parallel embedded application by performing a series of steps in a workflow. This tutorial guides you through these workflow steps while using a sample ray-tracer application named tachyon that runs on your embedded device.

To optimize the performance of your embedded application, you must first understand its current performance qualities as it runs on the embedded device. You then modify the application based on that performance data and check the new performance metrics to compare the results. You can repeat this cycle until the results match your performance goals.

When you check the performance of your application, you run advanced sampling-based performance analysis on the application as it runs. These analyses help you identify performance hotspots and bottlenecks. If they are not where you expect them, then you can rewrite your code accordingly and test again. Running an analysis after each change allows you to verify that each change results in the desired improvement. Running multiple analysis checks also allows a comparison with the initial unchanged run to determine a point of diminishing returns.

To obtain this important sampling-based performance data, you compile and run your application in a supported, embedded development environment. Then you define and launch a profiling agent, called a remote data collector, which runs on the embedded device. This remote data collector then records specified performance data collected from your running application.

Then this performance information is automatically transferred to a host system where you can view and analyze it, and plan your optimization strategy and its implementation based on your available time and resources. Your embedded application must be cross compiled and present on this host system as well, so that your results will accurately reflect the function names and the line numbers in your code. While there are several supported embedded OS versions, this tutorial focuses on the Yocto Project* 1.* environment. The tachyon sample code has been optimized for the Yocto Project environment. Additional information can be found at https://www.yoctoproject.org/about. If you choose to run this tutorial on an embedded system with a different Linux* OS distribution, you will need to provide your own sample application, kernel version, and kernel source directory.

To summarize, for this tutorial you will collect data on your embedded system with the VTune Amplifier GUI amplxe-gui and SSH communication, started from the host system.

Copying the kernel and drivers from your host to your target system is a one-time setup procedure, after which you can run multiple data collection sessions and view and compare the results.


Once you have collected performance data, you make modifications to your code to improve its performance profile, and test again.

NOTE
This tutorial focuses on obtaining the baseline results for Advanced Hotspots Analysis and the tachyon sample application. For more information on the iterative process of testing, modifying, improving, and retesting your code for comparative analysis, see Tutorial: Finding Hotspots - C++ Sample Code

To find hotspots in your application, complete these activities:

Step 1: Prepare your target device

    • Build a Yocto* Project kernel
    • Install a target package including remote collectors
    • Configure ssh for a no-password connection

Step 2: Cross build and load sampling drivers

    • Cross build and load the sampling driver (sep)

Step 3: Prepare your sample application

    • Cross compile tachyon for use
    • Copy tachyon to your Yocto Project target

Step 4: Run Advanced Hotspot Analysis

    • Use the Intel VTune Amplifier for Systems GUI to set up your remote configuration
    • Run Advanced Hotspot Analysis

Step 5: View your results

    • View your results in Intel VTune Amplifier for Systems

Next step: Prepare Your Target Device


Prepare Your Target Device

Use the following steps to set up your target device after you have installed the VTune Amplifier for Systems on your host.

NOTE
You will not be able to identify time-consuming code in your application using Advanced Hotspots Analysis if the nmi_watchdog interrupt capability is enabled on your target system, because it prevents collecting accurate event-based sampling data. You must disable the nmi_watchdog interrupt before collecting data; see the "Troubleshooting" section of the product documentation for details.
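One way to check the watchdog state on the target is sketched below. It assumes the target kernel exposes the setting through /proc/sys/kernel/nmi_watchdog, which is the standard Linux sysctl location but may be absent on some embedded kernels; nmi_watchdog_enabled is a hypothetical helper name. Treat this as an illustration, and follow the product's Troubleshooting section for the supported procedure.

```shell
# nmi_watchdog_enabled [FILE]: succeed (exit status 0) when FILE holds a
# value other than 0. FILE defaults to the standard sysctl location.
# A missing or unreadable file is reported as "not enabled" (status 1).
nmi_watchdog_enabled() {
    file=${1:-/proc/sys/kernel/nmi_watchdog}
    [ -r "$file" ] || return 1
    [ "$(cat "$file")" != "0" ]
}

# Example use on the target, as root (not executed here):
#   if nmi_watchdog_enabled; then
#       echo 0 > /proc/sys/kernel/nmi_watchdog
#   fi
```

Writing 0 to the file lasts only until the next reboot, so the check is worth repeating at the start of each profiling session.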

1. If you have not yet done so, download the Yocto Project* version appropriate for your system.

   A list of supported host system distributions and required packages for each distribution is available here: http://www.yoctoproject.org/docs/current/ref-manual/ref-manual.html#intro-requirements. For a list of all Yocto Project versions, see https://www.yoctoproject.org/downloads/yocto-project. The Yocto Project Quick Start document for your selected version provides detailed installation and configuration steps. The Quick Start document and all other documentation are available from https://www.yoctoproject.org/documentation.

   In this tutorial, we are using the Yocto Project version 1.2.1. Kernel version and source directory information provided in the examples are specific to this version.

   NOTE
   The tachyon sample code used in this tutorial has been optimized for the Yocto Project environment. Other Linux distributions can be analyzed using VTune Amplifier, but you will need to provide your own application. To run the tutorial using a different Linux distribution, be sure to note the kernel version and kernel source directory for use in building the VTune Amplifier drivers.

2. Copy the required package archive located at /opt/intel/vtune_amplifier_for_systems/target on your host system to the /opt/intel directory on your target system and extract it.

   • linux32/vtune_amplifier_target_x86.tgz for x86 systems
   • linux64/vtune_amplifier_target_x86_64.tgz for 64-bit systems

   NOTE
   Extract both x86 and x86-64 packages if you plan to run and analyze 32-bit processes on 64-bit systems.

   a. Copy the file to the target system using the following command:

      scp -r <filename> root@<IP address>:/opt/intel/

   b. Extract the file on the target system using the following command:

      tar -xvzf <filename>

   NOTE
   You can find detailed instructions for setting up your target Linux system in the Preparing a Target Linux* System for Remote Analysis online help topic at https://software.intel.com/en-us/linux_target_setup.

3. Configure ssh to work in password-less mode so it does not prompt for a password on each invocation. To do this, use the key generation utility on the host system.

   a. Generate the key with an empty passphrase:

      host> ssh-keygen

   b. Copy the key to the target system:

      host> cat ~/.ssh/id_dsa.pub | ssh user@target "cat >> ~/.ssh/authorized_keys"

      If ssh-keygen created a public key file with a different name, substitute that file for id_dsa.pub. You will need the target user password to complete this operation. If this command completes successfully, you will not require it afterwards.

      Make sure that the $HOME/.ssh/ directory exists on the target and that only its owner (root) has read/write/execute permissions to it. In these examples, target can be a hostname or IP address.

   c. After you set the password-less mode, run a command to verify that a password is no longer required. For example:

      host> ssh user@target ls
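The permission requirement and the verification step can be sketched as two small shell functions. The function names are hypothetical; the only tools assumed are the standard mkdir, chmod, touch, and the OpenSSH client. BatchMode=yes is an OpenSSH client option that makes ssh fail instead of prompting, which turns the check into a plain pass/fail test.

```shell
# secure_ssh_dir [DIR]: create DIR (default: $HOME/.ssh) if it is missing and
# restrict it, and its authorized_keys file, to the owner. Run on the target.
secure_ssh_dir() {
    dir=${1:-$HOME/.ssh}
    mkdir -p "$dir" || return 1
    chmod 700 "$dir" || return 1
    touch "$dir/authorized_keys" || return 1
    chmod 600 "$dir/authorized_keys"
}

# check_passwordless USER@TARGET: succeed only if ssh can log in without
# asking for a password. Run on the host.
check_passwordless() {
    ssh -o BatchMode=yes "$1" true
}

# Example use (not executed here):
#   target> secure_ssh_dir
#   host>   check_passwordless user@target && echo "password-less login works"
```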

NOTE
An example of building a Yocto project and installing it is available at the Intel® Developer Zone: https://software.intel.com/en-us/forums/topic/507002.

Next step: Cross Build and Load Sampling Drivers

Cross Build and Load the Sampling Drivers

Build the sampling drivers for your target environment on your Linux* host and transfer them to the target, where you load them into the kernel you customized for this purpose. If you do not build the drivers for your specific device by version and build number, your driver will not load; or, if it loads, it will not work. To find the kernel version, see $KERNEL-SRC-DIR/include/generated/utsrelease.h. To find what version of the kernel is currently running, use the uname -a command on the target.
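Comparing the two version strings before building can save a failed driver load. The sketch below assumes that utsrelease.h contains a line of the form #define UTS_RELEASE "<version>", which is how the Linux kernel build generates it; the function names are hypothetical. It uses uname -r, which prints only the release string, rather than the longer uname -a output.

```shell
# kernel_release_from_src KERNEL_SRC_DIR: print the UTS_RELEASE string
# recorded in the generated header of a configured kernel source tree.
kernel_release_from_src() {
    sed -n 's/^#define UTS_RELEASE "\([^"]*\)".*/\1/p' \
        "$1/include/generated/utsrelease.h"
}

# releases_match KERNEL_SRC_DIR RUNNING_RELEASE: succeed when the source tree
# was configured for exactly the release string given as the second argument.
releases_match() {
    [ "$(kernel_release_from_src "$1")" = "$2" ]
}

# Example use on the host (not executed here):
#   releases_match <kernel source location> "$(ssh root@<IP address> uname -r)" \
#       || echo "kernel source and running kernel differ"
```

The printed release string is also the value to pass to the --kernel-version option of build-driver.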

1. Change into the source directory:

   cd /opt/intel/vtune_amplifier_for_systems/sepdk/src

2. Build the sampling driver using the following command:

   ./build-driver -ni --c-compiler=<compiler> \
   --kernel-src-dir=<kernel source location> \
   --kernel-version=<kernel version> \
   --make-args="PLATFORM=x32 ARITY=smp" \
   --install-dir=<install target location>

   For example:

   ./build-driver -ni --c-compiler=i586-poky-linux-gcc \
   --kernel-src-dir=~/yocto/poky-denzil-7.0/build/tmp/work/fri2_noemgd-poky-linux/linuxyocto3.2.11+git1+5b4c9dc78b5ae607173cc3ddab9bce1b5f78129b_1+76dc683eccc4680729a76b9d2fd425ba540a483-r1/linux-fri2-noemgd-standard-build \
   --kernel-version=3.0.24-yocto-standard \
   --make-args="PLATFORM=x32 ARITY=smp" --install-dir=../prebuilt

3. Once the driver files are built, copy them from your host to your target machine using the following commands:

   host> cd /opt/intel/vtune_amplifier_for_systems
   host> scp -r sepdk root@<IP address>:/home/root

4. Load the sampling drivers on your target machine using the following commands:

   target> cd /home/root/sepdk/src
   target> ./insmod-sep3 -re

   For example, the command output could look like the following:

   Checking for PMU arbitration service(PAX)...detected.
   PAX service is accessible to users in group "0"
   Executing: insmod ./sep3_15-x32-3.0.24-yocto-standardsmp.ko
   Creating /dev/sep3_15 base devices with major number 251...done.
   Creating /dev/sep3_15 percpu devices with major number 250 ... done.
   The sep3_15 drivers has been successfully loaded.
   Checking for vtsspp driver ... not detected.
   Executing: insmod ./vtsspp/vtsspp-x32-3.0.24-yocto-standardsmp.ko gid=0 mode=0666
   The vtsspp driver has been successfully loaded.

   For some embedded Linux systems the insmod-vtsspp command may not work. In that event, you can load the kernel module directly by using insmod:

   ./insmod-sep3 -re
   cd /home/root/sepdk/src/vtsspp
   insmod vtsspp.ko

5. Confirm that the driver has been installed:

   lsmod | grep sep
   sep3_10 80108 0
   lsmod | grep vtsspp
   vtsspp 295740 0
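The confirmation step can be turned into a pass/fail check for scripts. The following is a minimal sketch; module_loaded is a hypothetical helper that reads lsmod-style text on standard input and succeeds when some module name starts with the given prefix. A prefix is used because the numeric suffix of the sep module (for example sep3_10 or sep3_15) depends on the driver version.

```shell
# module_loaded PREFIX: read lsmod-style output on stdin and succeed when the
# first column of any line starts with PREFIX.
module_loaded() {
    awk -v p="$1" 'index($1, p) == 1 { found = 1 } END { exit !found }'
}

# Example use on the target (not executed here):
#   lsmod | module_loaded sep3   || echo "sep driver is not loaded"
#   lsmod | module_loaded vtsspp || echo "vtsspp driver is not loaded"
```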

NOTE
You can find detailed instructions for installing your Linux target drivers in the online documentation: Preparing a Target Linux* System for Remote Analysis at https://software.intel.com/en-us/linux_target_setup.

Next step: Prepare Your Sample Application

Prepare Your Sample Application

The Intel® VTune™ Amplifier for Systems release includes sample code called tachyon for you to compile and use on the target system. The tachyon sample code included with your distribution is modified for the Yocto* environment. The needed changes to the Makefiles listed in this section have been completed in the makefiles located in your distribution, which are included as examples. After compiling tachyon, copy the application to your target.

1. On the host Linux* system, change directories so you can untar the sample code:

   cd ~/yocto

2. Unarchive (untar) the tachyon sample application:

   tar xvzf /opt/intel/vtune_amplifier_for_systems/samples/en/C++/tachyon_vtune_amp_xe.tgz

3. Open the top-level Makefile.

   The line containing CXX has been commented out. In the lower level tachyon/common/gui/Makefile.gmake file, the following lines have been added:

4. If the host system is x86_64, you must comment out some lines in the Makefile:

   #ifeq ($(shell uname -m),x86_64)
   #Arch=intel64
   #CXXFLAGS+= -m64
   #else
   Arch=ia32
   CXXFLAGS+= -m32
   #endif


5. Source important environment variables:

   source /opt/poky/1.2/environment-setup-i586-poky-linux

6. Compile the tachyon code:

   make

7. Copy the tachyon binary, the dat folder, and the libtbb.so library to an appropriate location on your target system where the executable can find them. The -r option is needed because dat is a directory:

   scp -r tachyon_find_hotspots dat libtbb.so root@<IP address>:<target location>

   For example:

   scp -r tachyon_find_hotspots dat libtbb.so root@target_ip:/usr/local/sbin

Next step: Run Advanced Hotspot Analysis

Run Advanced Hotspot Analysis

The following steps show you how to launch the Intel® VTune™ Amplifier for Systems GUI and create a new project.

1. Run amplxe-gui. Refer to the steps in Navigation Quick Start to set the appropriate environment variables if you have not already done so.

2. Click New Project and enter an identifying project name such as tachyon1 so that you can distinguish this project from other projects. Keep or change the default project file Location: and click Create Project.

3. Set up the analysis target.


   a. Select remote Linux (SSH) for the target system.
   b. Specify the user name and the host name or IP address of the remote system you are profiling via SSH.
   c. Enter the full path for the target binary in the Application field. In this example the path is /home/root/tachyon_find_hotspots.
   d. Enter the path to the data file in the Application parameters field. In this example, the path is /home/root/dat/balls.dat.

   When collecting data remotely, the VTune Amplifier looks for the collectors on the target device in its default location: /opt/intel/vtune_amplifier_201x_for_systems.<package_num>. It also temporarily stores performance results on the target system in the /tmp directory. If you followed the steps detailed in Prepare Your Target Device, then the collectors were installed in the default location. If you installed the target package to a different location or need to specify another temporary directory, make sure to configure your settings from the Analysis Target tab for your project.

   • Use the VTune Amplifier installation directory on the remote system option to specify the path to the VTune Amplifier on the remote system. If the default location is used, the path is provided automatically.

   • Use the Temporary directory on the remote system option to specify a non-default temporary directory.

   • Alternatively, use the -target-install-dir and -target-temp-dir options from the command line.

4. Click Choose Analysis to switch to the Analysis Type tab.

5. Select the Advanced Hotspots analysis type. You will notice communication with the remote system before the Analysis Type screen appears.


6. Click the Start button to launch the Advanced Hotspots Analysis session.

The VTune Amplifier sets up the now passwordless SSH connection to your target device and launches the target application. It collects Advanced Hotspots data with default settings, and then copies those results back to the host.
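The tutorial mentions the amplxe-cl command-line interface as an alternative to the GUI. The sketch below only assembles and prints a candidate command line; it does not run a collection. The spellings -collect advanced-hotspots and -target-system=ssh:user@target are assumptions that do not appear in this tutorial, so confirm them against the command-line help of your installed version before use. build_collect_cmd is a hypothetical helper name.

```shell
# build_collect_cmd USER@TARGET APP [APP_ARGS...]: print an amplxe-cl command
# line for a remote Advanced Hotspots collection.
# ASSUMPTION: the option spellings below are not taken from this tutorial.
build_collect_cmd() {
    target=$1
    shift
    printf 'amplxe-cl -collect advanced-hotspots -target-system=ssh:%s -- %s\n' \
        "$target" "$*"
}

# Example use with the paths from this tutorial (prints the command only):
#   build_collect_cmd root@<IP address> /home/root/tachyon_find_hotspots \
#       /home/root/dat/balls.dat
```

Printing the command before running it makes it easy to review the target and application paths, and to add the -target-install-dir option named above when the collectors are not in the default location.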

Next step: View Your Results

View Your Results

After the target device sends the tachyon results - usually within a minute or two - the results appear on your display:

Next step: Prepare your own embedded applications for analysis using the VTune Amplifier to view hotspots.


Summary

You have completed the Finding Hotspots tutorial. Here are some important things to remember when using the Intel® VTune™ Amplifier for Systems to analyze your code for hotspots:

Step 1. Prepare your target device

    Tutorial Recap: You installed a stable Yocto Project kernel; copied the appropriate VTune Amplifier for Systems files to your target system; and set up a password-less connection.

    Key Tutorial Take-aways:
    • Download and extract an appropriate toolchain from the Yocto Project web site and create an installation area.
    • Build a Yocto Project kernel for your target.
    • Configure ssh so there is no password request for file transfers between your server and target.

Step 2. Build and load the sampling drivers

    Tutorial Recap: You compiled the sampling drivers on your host system and loaded them on your target system.

    Key Tutorial Take-aways:
    • Compile sampling drivers and transfer them to your target for use.

Step 3. Prepare your sample application

    Tutorial Recap: You extracted the tachyon code and, if necessary, modified it for use in your specific embedded environment.

    Key Tutorial Take-aways:
    • Unarchive tachyon in the ~/yocto directory.
    • View the necessary changes to the top level Makefile.
    • View the necessary changes to the lower level Makefile.gmake.

Step 4. Run Advanced Hotspot Analysis

    Tutorial Recap: You ran the VTune Amplifier GUI to configure and launch Advanced Hotspot Analysis on the tachyon code on your target device. It ran on your target and the results were sent via ssh back to your server.

    Key Tutorial Take-aways:
    • Launch the GUI using the amplxe-gui command.
    • Use the Analysis Target tab to choose and configure your analysis target.
    • Use the Analysis Type tab to choose, configure, and run the Advanced Hotspot Analysis.

Step 5. View your results

    Tutorial Recap: You viewed the Advanced Hotspots analysis on the tachyon application in the VTune Amplifier for Systems GUI.

    Key Tutorial Take-aways:
    • You can also use the VTune Amplifier command-line interface by running the amplxe-cl command to test your code for hotspots and regressions. For details see the Command-line Interface Support section in the VTune Amplifier online help.

Optimization Notice

Intel's compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel microprocessors. These optimizations include SSE2, SSE3, and SSSE3 instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel. Microprocessor-dependent optimizations in this product are intended for use with Intel microprocessors. Certain optimizations not specific to Intel microarchitecture are reserved for Intel microprocessors. Please refer to the applicable product User and Reference Guides for more information regarding the specific instruction sets covered by this notice.

Notice revision #20110804


Key Terms

baseline: A performance metric used as a basis for comparison of the application versions before and after optimization. Baseline should be measurable and reproducible.

CPU time: The amount of time a thread spends executing on a logical processor. For multiple threads, the CPU time of the threads is summed. The application CPU time is the sum of the CPU time of all the threads that run the application.

CPU usage: A performance metric for which the VTune Amplifier identifies a processor utilization scale, calculates the target CPU usage, and defines default utilization ranges depending on the number of processor cores.

    Utilization Type    Description

    Idle                All CPUs are waiting - no threads are running.

    Poor                Poor usage. By default, poor usage is when the number of simultaneously running CPUs is less than or equal to 50% of the target CPU usage.

    OK                  Acceptable (OK) usage. By default, OK usage is when the number of simultaneously running CPUs is between 51-85% of the target CPU usage.

    Ideal               Ideal usage. By default, Ideal usage is when the number of simultaneously running CPUs is between 86-100% of the target CPU usage.
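The default ranges in the table can be read as a simple threshold function. The sketch below is an illustration of those ranges only, not the product's implementation; classify_cpu_usage is a hypothetical name, and its argument is assumed to be a whole-number percentage (0 to 100) of the target CPU usage.

```shell
# classify_cpu_usage PERCENT: map the number of simultaneously running CPUs,
# expressed as an integer percentage of the target CPU usage, to the default
# utilization type from the table above.
classify_cpu_usage() {
    pct=$1
    if [ "$pct" -le 0 ]; then
        echo Idle      # no threads are running
    elif [ "$pct" -le 50 ]; then
        echo Poor      # up to and including 50%
    elif [ "$pct" -le 85 ]; then
        echo OK        # 51-85%
    else
        echo Ideal     # 86-100%
    fi
}

# Example use:
#   classify_cpu_usage 60    # prints OK
```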

Elapsed time: The total time your target ran, calculated as follows: Wall clock time at end of application – Wall clock time at start of application.

finalization: A process during which the Intel® VTune™ Amplifier converts the collected data to a database, resolves symbol information, and pre-computes data to make further analysis more efficient and responsive.

hotspot: A section of code that took a long time to execute. Some hotspots may indicate bottlenecks and can be removed, while other hotspots inevitably take a long time to execute due to their nature.

Advanced Hotspots Analysis: A non-default analysis type used to understand the application flow of control and to identify hotspots, that works directly with the CPU without the influence of the booted operating system. VTune Amplifier creates a list of functions in your application ordered by the amount of time spent in a function. It also detects the call stacks for each of these functions so you can see how the hot functions are called. VTune Amplifier uses a low overhead (about 5%) user-mode sampling and tracing collection that gets you the information you need without slowing down the application execution significantly.

target: An executable file you analyze using the Intel® VTune™ Amplifier.

host system: The Linux* server on which you install amplxe-gui and from which you launch your application analysis and view those results.

target system: The supported, embedded device on which you install sampling drivers and run the application you are running performance analysis on.


viewpoint: A preset result tab configuration that filters out the data collected during a performance analysis and enables you to focus on specific performance problems. When you select a viewpoint, you select a set of performance metrics the VTune Amplifier shows in the windows/panes of the result tab. To select the required viewpoint, click the (change) link and use the drop-down menu at the top of the result tab.
