Intel Tools to Optimize HPC Systems
Intel Software Conference 2014, Werner Krotz-Vogel
Intel tools to optimize HPC systems May 2014
Copyright© 2014, Intel Corporation. All rights reserved. *Other brands and names are the property of their respective owners.
Agenda
Intel® Developer Products Overview
Intel® Parallel Studio XE and Cluster Studio XE 2013 Overview
What’s new with XE 2015 Beta?
Where to get it?
Intel Confidential
Intel® Developer Products
Web App: Deploy apps on multiple platforms using one codebase
Performance Client: Native cross-platform C++ development for multimedia apps and more
System: Create fast, efficient embedded & mobile devices/systems in less time
Technical Computing: Improve application performance, scalability and reliability
Intel® XDK, Intel® Quark, Intel® INDE
More Cores. Wider Vectors. Performance Delivered.
Intel® Parallel Studio XE and Intel® Cluster Studio XE
• Industry-leading performance from advanced compilers
• Comprehensive libraries
• Parallel programming models
• Insightful analysis tools
[Figure: Scaling Performance Efficiently, from Serial Performance through Task & Data Parallel Performance to Distributed Performance. More Cores: Multicore, Many-core (50+ cores). Wider Vectors: 128 Bits, 256 Bits, 512 Bits.]
Efficiently Produce Fast, Scalable and Reliable Applications
Intel® Parallel Studio XE 2013 and Intel® Cluster Studio XE 2013 Service Pack 1
Phase | Product | Feature | Benefit
Build | Intel® Composer XE | Compilers, Performance and Threading Libraries | Out of the box performance
Build | Intel® MPI Library† | High Performance Message Passing (MPI) Library | Interconnect independence
Build | Intel® Advisor XE | Threading Prototyping Tool (Studio XE products only) | Simplifies parallel application design
Verify & Tune | Intel® VTune™ Amplifier XE | Performance Profiler | Find performance bottlenecks
Verify & Tune | Intel® Inspector XE | Memory & Threading Dynamic and Static Analysis | Code quality, improved security
Verify & Tune | Intel® Trace Analyzer & Collector† | MPI Performance Profiler | Find performance bottlenecks in cluster-based applications
Support for Latest Intel Processors and Coprocessors
New Product Announcements Embargoed until September 4, 8am Pacific Time
Product | Intel® Haswell microarchitecture | Intel® Broadwell microarchitecture | Intel® Xeon Phi™ coprocessor
Intel® C++ and Fortran Compiler | ✔ | ✔ | ✔
Intel® TBB library | ✔ | ✔ | ✔
Intel® MKL library | ✔ | ✔ | ✔
Intel® MPI library | ✔ | ✔ | ✔
Intel® VTune™ Amplifier XE† | ✔ | ✔ | ✔
Intel® Inspector XE†† | ✔ | ✔ | ✔
Now with Windows* support
† Hardware events for new processors added as new processors ship.
†† Analysis runs on multicore processors, provides analysis for multicore and many-core processors.
Intel® C++ and Fortran Compiler
Intel® Composer XE: Intel® C++, Intel® Fortran, with Performance Libraries
Industry leading application performance, serial and parallel
• Intel compilers: Intel Fortran and Intel C++ with Intel® Cilk Plus
• Intel Performance Libraries
- Intel® Threading Building Blocks
- Intel® Math Kernel Library
- Intel® Integrated Performance Primitives
• Architecture support: IA 32, Intel 64, Intel® Xeon Phi™ product family, Intel compatible processors
• Compatibility
- Windows: Visual* C++ and Visual Studio* 2008, 2010, 2012
- Linux, Mac OS X (including Mountain Lion): gcc and, for C++, Eclipse & Xcode for Mac
Performance, Compatibility, Support
Leadership Application Performance
More Performance for your C++ applications
• Just recompile
• Uses Intel® AVX and Intel® AVX2 instructions
• Intel® Xeon Phi™ product family support (Linux)
• Intel® Cilk™ Plus: Tasking and vectorization
More Performance for your Fortran applications
• Just recompile
• Intel® Xeon Phi™ product family: Linux compiler, debugger support
• Access to Intel® AVX and Intel® AVX2 instructions (-xa or /Qxa)
• Auto-parallelizer & directives to access SIMD instructions
• Coarrays & synchronization constructs support parallel programming
• Loop optimization directives: VECTOR, PARALLEL, SIMD
• More control over array data alignment (align arrayNbytes)
• New in 2013 XE SP1 release: more Fortran 2008 support
Up to 4x Faster Performance with Intel® Advanced Vector Extensions 512 (Intel® AVX-512) Support
Enables higher performance for the most demanding computational tasks
Intel® Compilers and Intel® Math Kernel Library will be updated in Q4 with AVX-512 support
- Significant leap to 512-bit SIMD support
- Increased compatibility with AVX
- One byte longer EVEX prefix, enabling additional functionality
- First implemented in the future Intel® Xeon Phi™ coprocessor, code named Knights Landing
[Figure: Peak single precision floating point performance for SSE / SSE2, AVX / AVX2 and AVX-512, labeled "up to 4x faster" and "up to 2x faster".]
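The "up to 4x" and "up to 2x" labels are consistent with simple lane counting: a 512-bit register holds 16 single-precision (32-bit) values, against 4 in a 128-bit register and 8 in a 256-bit register. The sketch below only reproduces that arithmetic. The helper names are illustrative and not part of any Intel API, and real peak throughput also depends on factors the slide does not state, such as clock frequency and instruction mix.

```cpp
#include <cassert>
#include <cstddef>

// Number of elements of `element_bits` each that fit in one SIMD register.
// Hypothetical helper for illustration only.
inline std::size_t simd_lanes(std::size_t register_bits, std::size_t element_bits) {
    assert(element_bits != 0 && register_bits % element_bits == 0);
    return register_bits / element_bits;
}

// How many times more elements a wider register holds than a narrower one.
inline std::size_t lane_ratio(std::size_t wide_bits, std::size_t narrow_bits,
                              std::size_t element_bits) {
    return simd_lanes(wide_bits, element_bits) / simd_lanes(narrow_bits, element_bits);
}
```

With 32-bit floats, `lane_ratio(512, 128, 32)` gives 4 and `lane_ratio(512, 256, 32)` gives 2, which matches the figure's labels under this reading.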
What’s New in Intel Composer XE 2015 Beta
• Support for offload to Intel® Graphics Technology
• Redesign of optimization reports including vectorization report
• New icl/icl++ compilers on OS X*
• Full C++11 language support: virtual overrides, inheriting constructors, deprecation of exception specifications, user defined literals, thread_local
• Full Fortran 2003 support (Parameterized Derived Types added)
• Fortran 2008 Blocks support
• Almost all OpenMP* 4.0 (only missing user-defined reductions)
• Keyword versions of SIMD pragmas added: _Simd, _Safelen, _Reduction
• Use arithmetical and logical operators with SIMD data types (like __m128)
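The C++11 items named on this slide are standard language features, so a small sketch can show them side by side. Nothing below is specific to the Intel compiler. The types and the `_kib` literal are made up for illustration, and `noexcept` stands in for the "deprecation of exception specifications" item as the replacement for `throw()`.

```cpp
#include <cassert>
#include <string>
#include <utility>

struct Shape {
    explicit Shape(std::string name) : name_(std::move(name)) {
        assert(!name_.empty());  // illustrative precondition
    }
    virtual ~Shape() = default;
    // noexcept replaces the deprecated dynamic exception specification throw().
    virtual double area() const noexcept { return 0.0; }
    std::string name_;
};

struct UnitSquare : Shape {
    using Shape::Shape;                                    // inheriting constructors
    double area() const noexcept override { return 1.0; }  // compiler-checked override
};

// User-defined literal: 4_kib evaluates to 4096.
constexpr unsigned long long operator""_kib(unsigned long long n) {
    return n * 1024ULL;
}

// thread_local: every thread gets its own independent counter.
inline int next_call_id() {
    thread_local int counter = 0;
    return ++counter;
}
```

If `area()` in `UnitSquare` were misspelled or had a different signature, `override` would turn the mistake into a compile error instead of silently declaring a new function.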
New Features Overview
• -ansi-alias enabled by default at -O2 and above on Linux C++
• -fast/-Ofast enables -fp-model fast=2
• gcc-compatible function multiversioning
• aligned_new header
• Fortran option -init=snan to initialize all uninitialized SAVEd scalar and array variables of type REAL and COMPLEX to signaling NaNs
• __intel_simd_lane() intrinsic to represent simd lane number in a SIMD vector function
• Compiler option -no-opt-dynamic-align to disable generation of multiple code paths depending on alignment of data
• Improved lambda function debugging
• Permit non-contiguous data transfers on #pragma offload
• gdb* debugger supports Fortran (Intel® Debugger removed)
• Ability to create custom install packages from online install
Supported Platforms in 2015 Beta
Released Linux* operating systems supported:
• Fedora* 20
• Red Hat Enterprise Linux* 6
• SUSE LINUX Enterprise Server* 11
• Ubuntu* 12.04 LTS (64-bit only), 13.10
• Debian* 6.0, 7.0
Support is also intended for these operating systems†:
• Fedora 21
• Red Hat Enterprise Linux 7
• SUSE LINUX Enterprise Server 12
• Ubuntu* 14.04 LTS
The following are no longer supported in this release:
• Red Hat Enterprise Linux 5
• SUSE LINUX Enterprise Server 10
† These operating systems are not released as of the date of this presentation. Intel Composer XE does not support operating systems until after they are officially released. Refer to product release notes for support details.
Redesigned Optimization Reports
• All reports (-opt-report, -vec-report, -par-report, -openmp-report) are now consolidated under a single -opt-report interface
• The other report options still work, but the information they generate, and the way it is generated, maps to the new model
• Output now defaults to one report file per generated object module. This can be changed using -opt-report-file=<filename|stderr|stdout>.
• Report information is designed to be more readable and actionable
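To try the consolidated report, it helps to have a loop the vectorizer has something to say about. The sketch below is one such candidate: a dot product marked with the OpenMP* 4.0 `simd` directive and a `reduction` clause. The function name is illustrative, compilers without OpenMP SIMD support simply ignore the pragma, and no report output is reproduced here because the slides do not show any.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Dot product written so that a vectorizing compiler can use SIMD instructions.
// Compiling this with the -opt-report option named on the slide would be one
// way to see whether and how the loop was vectorized (assumption: exact option
// spelling and output depend on the compiler version).
float dot(const std::vector<float>& a, const std::vector<float>& b) {
    assert(a.size() == b.size());
    const float* pa = a.data();
    const float* pb = b.data();
    const std::size_t n = a.size();
    float sum = 0.0f;
    // OpenMP 4.0: vectorize the loop and combine per-lane partial sums.
    #pragma omp simd reduction(+:sum)
    for (std::size_t i = 0; i < n; ++i) {
        sum += pa[i] * pb[i];
    }
    return sum;
}
```

The `reduction(+:sum)` clause matters: without it, the loop-carried dependence on `sum` is a typical reason for a vectorizer to decline the loop, which is exactly the kind of remark an optimization report is meant to surface.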
Intel® Math Kernel Library
Intel® Math Kernel Library (Intel® MKL)
• Vectorized and threaded for highest performance on all Intel and compatible processors
• De facto standard APIs for simple code integration
• Compatible with all C, C++ and Fortran compilers
• Royalty-free, per developer licensing for low cost deployment
#1 used math library in the world. Source: Evans Data 2011-2013 WW Developer Surveys
Just Link to the Next Intel® MKL Version to Realize New Processor Performance
What’s New in Intel MKL 11.2 Beta
• Cluster PARDISO: Intel® Direct Sparse Solver for Cluster (Intel® CPardiso) is a tool set for solving systems of linear equations with sparse matrices of millions of rows/columns. Intel® CPardiso provides an advanced implementation of modern algorithms and can be considered an extension of Intel MKL Pardiso to cluster computations.
• Atom optimizations (for Airmont) for the BLAS and FFT domains
• S/C/Z/DGEMM improvements on small matrix sizes
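For readers less familiar with the BLAS naming, DGEMM is the double-precision general matrix multiply, computing C = alpha * A * B + beta * C; the S, C and Z prefixes denote the single-precision, single-complex and double-complex variants. The reference loop below shows only what the operation computes. It is a plain sketch with a made-up function name, not MKL code and not a statement about how MKL implements or tunes small matrix sizes.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Reference C = alpha * A * B + beta * C for row-major matrices.
// A is m x k, B is k x n, C is m x n. Illustrative only; an optimized
// library would block and vectorize this very differently.
void reference_dgemm(std::size_t m, std::size_t n, std::size_t k,
                     double alpha, const std::vector<double>& A,
                     const std::vector<double>& B,
                     double beta, std::vector<double>& C) {
    assert(A.size() == m * k && B.size() == k * n && C.size() == m * n);
    for (std::size_t i = 0; i < m; ++i) {
        for (std::size_t j = 0; j < n; ++j) {
            double acc = 0.0;
            for (std::size_t p = 0; p < k; ++p) {
                acc += A[i * k + p] * B[p * n + j];
            }
            C[i * n + j] = alpha * acc + beta * C[i * n + j];
        }
    }
}
```

For small matrices the fixed per-call overhead can be large relative to this arithmetic, which is a plausible reason for small sizes to get separate attention.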
What’s New in Intel MKL 11.2 Beta
• Intel MKL Cookbook recipes: New document with recipes for assembling Intel MKL routines for solving complex problems
• Verbose mode for BLAS and LAPACK: MKL Verbose mode provides information about usage of MKL routines called by customers (set environment variable MKL_VERBOSE=1)
Intel® Inspector XE
Intel® Inspector XE - Deliver More Reliable Applications
Intel® Inspector XE and Intel® Parallel Studio XE family of suites
Intel® Inspector XE alone:
• Dynamic Analysis (Memory Errors, Threading Errors): Intel Inspector XE dynamically instruments & runs the application and watches for errors. Use any build, any compiler (debug build is best).
Added bonus features in Intel® Parallel Studio XE and Intel® Cluster Studio XE suites:
• Static Analysis (Code & Security Errors): Intel compiler inspects source. Use any compiler for production.
• Pointer Checker (Pointer Errors): Intel compiler run time checks. Use any compiler for production.
Static Analysis & Pointer Checker are only available in the Parallel Studio XE family of suites. Not sold separately.
What’s New in Intel Inspector XE 2015 Beta
Improved On-Demand Leak Reporting and Memory Growth Control!
New Memory usage graph – Get real-time information about memory in use on your system!
Thread Checking performance improved by 3X – with a reduction in memory footprint as well!
Intel® Advisor XE
Design Then Implement
Intel® Advisor XE: Threading Prototyping Tool
Less Effort, Less Risk, More Impact
1) Analyze it.
2) Design it. (Compiler ignores these annotations.)
3) Tune it.
4) Check it.
5) Do it!
Design Parallelism
• No disruption to regular development
• All test cases continue to work
• Tune and debug the design before you implement it
Implement Parallelism
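The "Design it" step relies on annotations that mark where parallelism might go while the program stays serial. The sketch below shows the general shape of such annotated code. The macros are local no-op stand-ins defined here so the example compiles without the tool; their names follow the annotation macros of the Advisor XE header as an assumption, so check the product documentation for the actual header and spellings. The function itself is made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// ASSUMPTION: stand-in no-op macros. With the real tool these would come
// from its annotation header instead of being defined here.
#ifndef ANNOTATE_SITE_BEGIN
#define ANNOTATE_SITE_BEGIN(site)
#define ANNOTATE_ITERATION_TASK(task)
#define ANNOTATE_SITE_END()
#endif

long long sum_of_squares(const std::vector<int>& values) {
    long long total = 0;
    ANNOTATE_SITE_BEGIN(sum_site);             // candidate parallel region starts
    for (std::size_t i = 0; i < values.size(); ++i) {
        ANNOTATE_ITERATION_TASK(square_task);  // each iteration is a candidate task
        total += static_cast<long long>(values[i]) * values[i];
    }
    ANNOTATE_SITE_END();                       // candidate parallel region ends
    assert(total >= 0);  // squares are non-negative; a negative total means overflow
    return total;
}
```

Because the annotations expand to nothing in a normal build, the serial program and its tests keep working unchanged. The shared update of `total` is the sort of data sharing that would need attention before this loop is actually parallelized.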
What’s New in Intel Advisor XE 2015 Beta
Improved Viewing and Advanced Modeling of Suitability information!
New Target Platforms option – See modeling based on Xeon or Xeon Phi!
New Iteration Space Modeling section – Run a smaller sample and see what happens when you scale up!
New Task details option - Information about differences between iterations moved to its own view for additional clarity!
Where to get it?
Intel® Parallel Studio XE Suites
Leading development suites for application performance
Create fast, reliable code
Group | Component | Intel® Cluster Studio XE | Intel® Parallel Studio XE
Analysis | Intel® VTune™ Amplifier XE - Performance Profiler | ● | ●
Analysis | Intel® Inspector XE - Memory & Thread Analyzer | ● | ●
Analysis | Static Analysis & Pointer Checker - Find Coding & Security Errors | ● | ●
Analysis | Intel® Advisor XE - Threading Prototyping Tool | ● | ●
Analysis | Intel® Trace Analyzer & Collector - MPI Optimizing Tool | ● | -
Compilers & Libraries | Intel® Compiler - Optimizing Compiler for C, C++ and Fortran | ● | ●
Compilers & Libraries | Intel® Integrated Performance Primitives† - Media and Data Optimizations | ● | ●
Compilers & Libraries | Intel® Threading Building Blocks† - Parallelize Applications for Performance | ● | ●
Compilers & Libraries | Intel® Math Kernel Library - High Performance Math | ● | ●
Compilers & Libraries | Intel® MPI Library - Flexible, Efficient and Scalable Messaging | ● | -
† Available for C, C++ only.
C, C++ only and Fortran only versions of Parallel Studio XE are also available.
Pricing and Availability
Includes | Intel® C++ Composer XE | Intel® Fortran Composer XE | Intel® Inspector XE | Intel® VTune™ Amplifier XE | Intel® MPI Library | Intel® Trace Analyzer and Collector | Price
Intel® Parallel Studio XE | • | • | • | • | - | - | $2,299
Intel® Cluster Studio XE | • | • | • | • | • | • | $2,949
Additional configurations, including floating and academic, are available at:
http://intel.com/software/products
Legal Disclaimer & Optimization Notice
INFORMATION IN THIS DOCUMENT IS PROVIDED “AS IS”. NO LICENSE, EXPRESS OR IMPLIED, BY ESTOPPEL OR OTHERWISE, TO ANY INTELLECTUAL PROPERTY RIGHTS IS GRANTED BY THIS DOCUMENT. INTEL ASSUMES NO LIABILITY WHATSOEVER AND INTEL DISCLAIMS ANY EXPRESS OR IMPLIED WARRANTY, RELATING TO THIS INFORMATION INCLUDING LIABILITY OR WARRANTIES RELATING TO FITNESS FOR A PARTICULAR PURPOSE, MERCHANTABILITY, OR INFRINGEMENT OF ANY PATENT, COPYRIGHT OR OTHER INTELLECTUAL PROPERTY RIGHT.
Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products.
Copyright © , Intel Corporation. All rights reserved. Intel, the Intel logo, Xeon, Xeon Phi, Core, VTune, and Cilk are trademarks of Intel Corporation in the U.S. and other countries.
Optimization Notice
Intel’s compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel microprocessors. These optimizations include SSE2, SSE3, and SSSE3 instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel. Microprocessor-dependent optimizations in this product are intended for use with Intel microprocessors. Certain optimizations not specific to Intel microarchitecture are reserved for Intel microprocessors. Please refer to the applicable product User and Reference Guides for more information regarding the specific instruction sets covered by this notice.
Notice revision #20110804