Page 1: Intel® Cluster Studio XE Introduction to Intel® Cluster Studio XE …community.hartree.stfc.ac.uk/access/content/group/admin... · 2013-04-15 · This document contains information

Software & Services Group, Developer Products Division

Copyright© 2012, Intel Corporation. All rights reserved. *Other brands and names are the property of their respective owners.

Introduction to

Intel® Cluster Studio XE (ICS XE)

Intel® Cluster Studio XE

Notices

INFORMATION IN THIS DOCUMENT IS PROVIDED IN CONNECTION WITH INTEL PRODUCTS. NO LICENSE, EXPRESS OR IMPLIED, BY ESTOPPEL OR OTHERWISE, TO ANY INTELLECTUAL PROPERTY RIGHTS IS GRANTED BY THIS DOCUMENT. EXCEPT AS PROVIDED IN INTEL'S TERMS AND CONDITIONS OF SALE FOR SUCH PRODUCTS, INTEL ASSUMES NO LIABILITY WHATSOEVER AND INTEL DISCLAIMS ANY EXPRESS OR IMPLIED WARRANTY, RELATING TO SALE AND/OR USE OF INTEL PRODUCTS INCLUDING LIABILITY OR WARRANTIES RELATING TO FITNESS FOR A PARTICULAR PURPOSE, MERCHANTABILITY, OR INFRINGEMENT OF ANY PATENT, COPYRIGHT OR OTHER INTELLECTUAL PROPERTY RIGHT. UNLESS OTHERWISE AGREED IN WRITING BY INTEL, THE INTEL PRODUCTS ARE NOT DESIGNED NOR INTENDED FOR ANY APPLICATION IN WHICH THE FAILURE OF THE INTEL PRODUCT COULD CREATE A SITUATION WHERE PERSONAL INJURY OR DEATH MAY OCCUR. Intel may make changes to specifications and product descriptions at any time, without notice. Designers must not rely on the absence or characteristics of any features or instructions marked "reserved" or "undefined." Intel reserves these for future definition and shall have no responsibility whatsoever for conflicts or incompatibilities arising from future changes to them. The information here is subject to change without notice. Do not finalize a design with this information. The products described in this document may contain design defects or errors known as errata which may cause the product to deviate from published specifications. Current characterized errata are available on request. This document contains information on products in the design phase of development. All products, computer systems, dates, and figures specified are preliminary based on current expectations, and are subject to change without notice. Intel product plans in this presentation do not constitute Intel plan of record product roadmaps. Please contact your Intel representative to obtain Intel’s current plan of record product roadmaps.

Intel, VTune, Cilk, Xeon and the Intel logo are trademarks of Intel Corporation in the U.S. and other countries.

*Other names and brands may be claimed as the property of others

Copyright© 2012 Intel Corporation. All rights reserved.

Intention of this Lesson

• Learn how to use each tool easily and effectively

• This lesson is a kind of “Getting Started Guide” showing explicit command lines

• Provide self-learning and reference material that can be consulted when needed for user codes

• Many users will not need to know much more than what is covered here to work effectively with the tools

Agenda

• ICS XE Overview – description of each tool

• Intel® MPI

• Intel® Trace Analyzer and Collector – MPI visualization

• Intel® Inspector XE – threading and memory checking

• Intel® VTune™ Amplifier XE – performance and threading profiler

• Labs – practical exercises

• Summary

Intel® Cluster Studio XE Overview

• Intel® Cluster Studio XE is a collection of tools for developing, debugging, tuning and maintaining HPC applications

• Intel® Inspector XE and Intel® VTune™ Amplifier XE extend the Cluster Tool Kit. These tools are especially useful for hybrid applications combining shared memory threading and distributed memory message passing

Intel® Cluster Studio XE Overview

Intel® Composer XE

• Intel® Cluster Studio XE includes the Intel C, C++ and Fortran compilers

• Many additional features of Intel® Composer XE go beyond classical compilers, for example correctness checking and libraries like the Intel® Math Kernel Library (MKL)

• The Intel Fortran compiler implements key Fortran 2008 features like Coarray Fortran, and an almost complete implementation of Fortran 2003

Intel® Cluster Studio XE Overview

Intel® MPI

• Intel® MPI is derived from MPICH2

• Many additional features make Intel® MPI more user-friendly than MPICH2

• Intel® MPI Library provides best-in-class performance:

– Extensive performance tuning of key algorithms, for example collective operations

– The Intel® MPI package includes the MPITUNE tool for automatic selection of the best algorithms and settings

Intel® Cluster Studio XE Overview

Intel® Trace Analyzer and Collector (ITAC)

• ITAC is a tool for understanding Intel® MPI program behavior, finding bottlenecks and analyzing performance

• ITAC is more than a profiler because it visualizes the temporal behavior of MPI routines, showing dependencies and load imbalances

• A correctness checking library is also available

• ITAC is easy to use: simply invoke it by passing an extra flag to mpirun/mpiexec or by setting an environment variable, without changing your application or your run scripts

Intel® Cluster Studio XE Overview

Intel® Inspector XE

• provides thread checking plus additional memory checking

• finds race conditions among threads and potential deadlocks by evaluating all potential memory access patterns

• may be used for hybrid HPC applications combining threading and MPI

• finds the root causes of problems and does not just show symptoms!

Intel® Cluster Studio XE

Intel® VTune™ Amplifier XE

• combines classical program performance analysis and thread profiling

• For hybrid HPC applications the threaded part may be investigated by Intel® VTune™ Amplifier XE. Each MPI process may be profiled with some limitation on the event counters

• For each MPI rank a result directory for further analysis with the Intel® VTune™ Amplifier XE GUI is generated. The result generation can be limited to a subset of ranks

Cluster Studio XE – environment setting

• For most Intel tools and tool kits there is an environment script defining all necessary paths etc.

Cluster Studio XE has a similar script:

$ source /opt/intel/ics/<version>/bin/ictvars.sh

After sourcing this script, no additional paths to binaries and libraries have to be specified. It is recommended to add the above line to .bashrc
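When the same .bashrc is shared between machines, it can help to guard the line so that logins on systems without the tools do not fail. A minimal sketch, assuming a hypothetical variable ICS_VARS that holds the full path to ictvars.sh (the <version> part is site-specific and has to be filled in by hand):

```shell
# ICS_VARS is a hypothetical variable, not something the Intel tools define.
# Replace <version> with the directory name of the installed release.
ICS_VARS="${ICS_VARS:-/opt/intel/ics/<version>/bin/ictvars.sh}"

if [ -r "$ICS_VARS" ]; then
    # Load the tool paths into the current shell.
    . "$ICS_VARS"
    ics_env_status="loaded"
else
    # Keep the login shell usable and report the problem on stderr.
    ics_env_status="missing"
    echo "ictvars.sh not found at $ICS_VARS" >&2
fi
```

The status variable is only there to make the outcome easy to check; it can be dropped in a real .bashrc.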

Intro to Intel® MPI – Compilation (1/4)

• A simple test program is part of the Intel® MPI (IMPI) distribution. Versions for C, C++ and Fortran are available. We use the C version here:

$ cp $I_MPI_ROOT/test/test.c .

$ mpiicc -o test.x test.c

mpiicc is the wrapper script for icc; for gcc it is mpicc. Also available are mpiifort, mpiicpc and mpicxx.

Intro to Intel® MPI – Run test.x (2/4)

• Intel® MPI provides an mpirun script like MPICH1:

$ mpirun -n <nprocs> ./test.x

• This works on a single node and on systems with automatic node file settings. For more nodes we usually need to define a hostfile with a single node name per line:

$ mpirun -f <hostfile> -n <nprocs> ./test.x
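The hostfile can be written by hand or generated. A minimal sketch, assuming two placeholder node names (node01, node02) and a file name hosts.txt chosen only for illustration; the mpirun line is executed only if mpirun and ./test.x are actually present, otherwise the command is just printed:

```shell
hostfile="hosts.txt"
nprocs=4

# One node name per line; node01 and node02 are placeholders for real hosts.
printf '%s\n' node01 node02 > "$hostfile"

run_cmd="mpirun -f $hostfile -n $nprocs ./test.x"
if command -v mpirun >/dev/null 2>&1 && [ -x ./test.x ]; then
    $run_cmd
else
    echo "would run: $run_cmd"
fi
```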

Intro to Intel® MPI – Run test.x (3/4)

• test.x prints out rank and hostname for each MPI process

• More information about the basic settings for this run may be obtained by setting

$ export I_MPI_DEBUG=5

and running the program again. The environment variable is propagated to all ranks automatically.

Intel MPI – Output of test.x

Intel® MPI – simple process placement

• Intel® MPI by default pins processes to cores, sockets and nodes. The default strategy might change in the future, but the user may enforce different mappings

• The easiest way is to use the processes-per-node flag:

$ mpirun -ppn <nprocs-per-node> -n <nprocs> ./test.x

<nprocs-per-node> == 1 is round robin, with each next process placed on the next node. The effect of other settings may be explored with test.x
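The effect of -ppn can be reasoned about before running anything. The sketch below assumes block-wise placement, in which ranks are assigned to nodes in hostfile order in groups of <nprocs-per-node> and wrap around after the last node. The variable names and the node0/node1 labels are illustrative, and the real placement should still be confirmed with test.x:

```shell
nnodes=2   # number of nodes in the hostfile (assumed)
ppn=1      # value passed to -ppn
nprocs=4   # value passed to -n

rank=0
mapping=""
while [ "$rank" -lt "$nprocs" ]; do
    # Ranks fill a node in groups of ppn, then move on; wrap after the last node.
    node=$(( (rank / ppn) % nnodes ))
    mapping="${mapping:+$mapping }$rank:node$node"
    rank=$((rank + 1))
done

echo "$mapping"   # prints: 0:node0 1:node1 2:node0 3:node1
```

With ppn=2 the same loop gives 0:node0 1:node0 2:node1 3:node1.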

Intel® MPI – ppn = 1

Intel® Trace Analyzer and Collector (ITAC)

• ITAC may be applied without touching the program or environment. One way to get a first trace is:

$ mpirun -trace -n <nprocs> ./test.x

• Alternatively, just set the preload library and run without the -trace flag:

$ export LD_PRELOAD=libVT.so

$ mpirun -f <hostfile> -n <nprocs> ./test.x

This is actually what the flag does internally. The same approach may be applied in situations with complex run scripts where it is not known where mpirun is actually executed. Note: this does not work for statically linked Intel® MPI (not recommended).
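For the run-script case, the preload can be switched on from outside the script. A minimal sketch, assuming a hypothetical switch variable TRACE and a hypothetical existing script run_job.sh that eventually calls mpirun; neither name comes from the Intel tools:

```shell
# TRACE is a hypothetical switch: TRACE=1 enables tracing for this run.
TRACE="${TRACE:-0}"

if [ "$TRACE" = "1" ]; then
    # Exported so that the mpirun called somewhere inside the script inherits it.
    export LD_PRELOAD=libVT.so
    trace_state="on"
else
    trace_state="off"
fi
echo "ITAC tracing: $trace_state"

# run_job.sh stands for the unmodified, site-specific run script.
if [ -x ./run_job.sh ]; then
    ./run_job.sh
fi
```

Because LD_PRELOAD is exported here, it also affects every other dynamically linked command started from the same shell, so it is best set in a dedicated wrapper rather than in .bashrc.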

Introduction to ITAC – view trace

• ITAC will generate several files inside the directory where you started mpirun. Just start traceanalyzer in this directory:

$ traceanalyzer test.x.stf

• Alternatively there is a Windows version of traceanalyzer contained in the Linux ICS package.

ITAC: Function profile

• After starting ITAC, a window showing a basic timing profile for MPI and Application is displayed. Right-click the red MPI bar and choose ungroup MPI to show the profile for each MPI routine used

ITAC: Event Timeline

• The most important view in ITAC is the Event Timeline. It shows the temporal development of MPI routines and messages. Open it via Charts -> Event Timeline or simply press Ctrl+Alt+E

ITAC: Event Timeline

ITAC: Correctness Checker

• The Correctness Checker validates MPI correctness. It uses another library but may be started like the ordinary ITAC:

$ mpirun -check -n <nprocs> ./test.x

or

$ export LD_PRELOAD=libVTmc.so

$ mpirun -n <nprocs> ./test.x

Intel® Inspector XE

• Intel® Inspector XE offers memory checking and correctness checking for threaded applications. For MPI applications we may use it in the following way:

$ source /opt/intel/inspector_xe/inspxe-vars.sh intel64

$ mpirun -n <N> inspxe-cl --result-dir <result dir> --collect <mode> -- <MPI executable>

The command line version inspxe-cl is used as the MPI executable. Lab example:

$ mpirun -n 4 inspxe-cl --result-dir insp_mi3 --collect mi3 -- ./poisson.x

$ mpirun -n 4 inspxe-cl --result-dir insp_ti3 --collect ti3 -- ./poisson.x

mi3 and ti3 are the most demanding memory and threading modes.

Intel® Inspector XE -- results

• After running the MPI program, result directories should appear with the previously defined base name, indexed by MPI rank.

• Results may be viewed as ASCII output:

$ inspxe-cl --report problems -r insp_mi3.0

or by using the Intel® Inspector XE GUI:

$ inspxe-gui insp_mi3.0

Results may also be transferred to a Windows* computer and viewed with the Windows* version of Intel® Inspector XE
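With one result directory per rank, the ASCII reports can be produced in a loop. A minimal sketch, assuming the base name insp_mi3 from the lab example and the <base>.<rank> directory naming described above; the function names and the report file names are invented for illustration, and -r is used as the short form of the result directory option:

```shell
# Print the rank suffix of a per-rank result directory, e.g. insp_mi3.2 -> 2.
rank_of_dir() {
    printf '%s\n' "${1##*.}"
}

# Write one problem report per rank for all directories named <base>.<rank>.
report_all() {
    base="$1"
    for dir in "$base".*; do
        # Skip the unexpanded pattern when no result directory exists.
        [ -d "$dir" ] || continue
        rank=$(rank_of_dir "$dir")
        if command -v inspxe-cl >/dev/null 2>&1; then
            inspxe-cl --report problems -r "$dir" > "${base}_rank${rank}.txt"
        else
            echo "would run: inspxe-cl --report problems -r $dir"
        fi
    done
}

report_all insp_mi3
```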

Intel® Inspector XE GUI – (memory) -- Summary

• Identified problems: double-click to open the source view

• Filter by different criteria

• Source code locations where the problem occurs

Intel® Inspector XE GUI – (memory) -- Source

• Call stack for both source locations

• Code location, source file, function

• More detailed source

Intel® Inspector XE (threading) -- Summary

• Identified problems: double-click to open the source view

• Filter by different criteria

• Source code locations where the problem occurs

Intel® VTune™ Amplifier XE

• Intel® VTune™ Amplifier XE provides detailed information on timings and core events. It can also provide insight into the behavior of threaded applications:

$ source /opt/intel/vtune_amplifier_xe/amplxe-vars.sh

$ mpirun -n <N> amplxe-cl --result-dir <result dir> --collect <mode> -- <MPI executable>

The command line version amplxe-cl is used as the MPI executable. Detailed lab example:

$ mpirun -n 4 amplxe-cl --result-dir axe_ho -collect hotspots -- ./poisson.x

$ mpirun -n 4 amplxe-cl --result-dir axe_co -c concurrency -- ./poisson.x

hotspots and concurrency are predefined analysis types. Concurrency only makes sense with additional threading
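Both lab collections follow the same pattern, so they can be driven from one loop. A minimal sketch, assuming the lab executable ./poisson.x and the result base names axe_ho and axe_co used above; the variable names are illustrative, the --collect spelling is taken from the general command template, and the command is only printed when amplxe-cl or the executable is not available:

```shell
nprocs=4
exe=./poisson.x

for mode in hotspots concurrency; do
    # Base name as in the lab: axe_ plus the first two letters of the analysis type.
    result_dir="axe_$(printf '%s' "$mode" | cut -c1-2)"
    run_cmd="mpirun -n $nprocs amplxe-cl --result-dir $result_dir --collect $mode -- $exe"
    if command -v amplxe-cl >/dev/null 2>&1 && [ -x "$exe" ]; then
        $run_cmd
    else
        echo "would run: $run_cmd"
    fi
done
```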

Intel® VTune™ Amplifier XE -- results

• After running the MPI program, result directories should appear with the previously defined base name, indexed by MPI rank.

• Results may be viewed as ASCII output:

$ amplxe-cl --report hotspots -r axe_ho.0

or by using the Intel® VTune™ Amplifier XE GUI:

$ amplxe-gui axe_ho.0

Results may also be transferred to a Windows* laptop and viewed with the Windows* version of Intel® VTune™ Amplifier XE

Intel® VTune™ Amplifier XE – Hotspots

Timing and #Threads

Hotspots ordered by CPU time

Compute node info

Intel® VTune™ Amplifier XE – Bottom up

Function, Stack, Timing – double click for source view

OpenMP Threads

Function stack details

Intel® VTune™ Amplifier GUI – Source

Source with timing per source line

OpenMP Threads

Function stack details

Summary

• Short Intel® Cluster Studio XE overview with a single slide per tool, showing the history and HPC-related purpose of each tool that is part of Intel® Cluster Studio XE

• Short introduction to each tool showing explicit command lines for the simplest usage scenarios. These introductions can be viewed as “Quick Start Guides”. The emphasis is on simplicity.

• Support for hybrid threading + MPI applications through the inclusion of new MPI-aware versions of Intel® Inspector XE and Intel® VTune™ Amplifier XE

• Labs – practical exercises

Performance Caveats and Notes

• Performance varies with each application, regardless of the technology and methods used.

• Certain types of HPC applications are amenable to acceleration and it is important to understand their characteristics.

• Once an application is identified to take advantage of acceleration, the high level and low level techniques are expected to work equally well.

Optimization Notice
