HPC performance and development tuning tools for scientists to go parallel faster with Allinea
Get Performance on Intel® Xeon Phi™ with Allinea MAP and Allinea DDT
Discovering bottlenecks without pain
In my Parallel Universe…
… we develop new antibiotics faster than bacteria develop resistance
… every household can prototype and evolve their own 3D-printed designs
… accurate simulation of the natural world is taken for granted
So I decided to…
… create parallel development tools for scientists:
We’re accelerating the pace of scientific progress
HPC on the critical path to progress
[Figure: performance against time (years) across three eras: Single-Core, Multi-Core, Many-Core]
Constraints: power, complexity of algorithms
Constraints: power, parallel software availability, scalability
Constraints: programming models
• Parallel profiler designed for:
‒ C/C++, Fortran
‒ MPI code
Interdependent or independent processes
‒ Multithreaded code
Monitor the main threads for each process
‒ Accelerated codes
GPUs, Intel® Xeon Phi™
• Improve productivity:
‒ Helps you detect performance issues quickly and easily
‒ Tells you immediately where your time is spent in your source code
‒ Helps you to optimize your application efficiently
Allinea MAP: Increase application performance
• Support for I/O metrics
‒ I/O can be a major bottleneck in HPC systems
‒ Find the optimal configuration for your file system.
Benefit: Broader profiling and analysis capabilities to solve
even more performance issues.
• Support for Intel® Xeon Phi™
‒ Already supported on Allinea DDT
‒ Officially extended to profiling
Benefit: Ensure you are getting the best performance from
new technology.
Allinea MAP 4.2: New features in 2013
Optimizing for Intel® Xeon Phi™: Where do you start?
“Code that’s well-optimized for the host usually performs pretty well on the cards”
- Almost everybody
Optimizing for Intel® Xeon Phi™: But what matters?
[Figure labels: Vectorization, Other stuff, Performance]
Optimizing for Intel® Xeon Phi™: Is my code well-vectorized?
… maybe?
Allinea Performance Reports: Is my code well-vectorized?
Optimizing for Intel® Xeon Phi™: Is my code well-vectorized?
… maybe?
Not in this loop
(16.5% of total time)
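A loop that accounts for 16.5% of total time also puts a ceiling on what vectorizing it can buy. As a rough sketch using Amdahl's law (our addition, not part of the slides), the functions below estimate the whole-program speedup when only that loop gets faster. The function names are hypothetical, and the factor of 16 used in the note afterwards (the number of single-precision values in a 512-bit vector register on the coprocessor) is an idealized best case, not a measured figure.

```c
#include <assert.h>

/* Amdahl's law: overall speedup when a fraction p of the run time
   is accelerated by a factor s and the remainder is unchanged. */
double amdahl_speedup(double p, double s)
{
    assert(p >= 0.0 && p <= 1.0 && s > 0.0);
    return 1.0 / ((1.0 - p) + p / s);
}

/* Upper bound on the overall speedup as s grows without limit,
   i.e. the accelerated part ends up costing nothing. */
double amdahl_limit(double p)
{
    assert(p >= 0.0 && p < 1.0);
    return 1.0 / (1.0 - p);
}
```

Under these assumptions `amdahl_speedup(0.165, 16.0)` comes out at roughly 1.18 and `amdahl_limit(0.165)` at roughly 1.20: a worthwhile gain, but also a reminder that the remaining 83.5% of the profile still matters.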
• Full, graphical debugger designed for:
‒ C/C++, Fortran, Intel® Xeon Phi™, UPC, …
‒ MPI, OpenMP and mixed-mode code
• Unified interface with Allinea MAP:
‒ Just what you need when you’ve added
OpenMP and now everything segfaults!
‒ One interface eliminates learning curve
‒ Spend more time on your results
• Slash your time to develop:
‒ Reproduces and triggers your bugs instantly
‒ Helps you quickly understand where issues come from
‒ Helps you to fix them as swiftly as possible
Allinea DDT: Unified interface for debugging
Allinea at the forefront of science with COSMOS and Intel® Xeon Phi™
“While I was porting CAMB to offload certain parts of it to Intel® Xeon
Phi™, I wasted weeks debugging it because the offloads were basically
opaque. I only had print statements to help me.”
“Using DDT's new offload debugging I can now look at the offload
code and look at the state of the array on the Intel® Xeon Phi™ side
before it is manipulated”
“Fix is easy - either set NOCOPY->IN or just set the thing
to zero on the MIC side which is probably cheaper.”
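As we read the quote, an array passed to an offload region without copying arrived on the coprocessor uninitialized, and the two fixes are either to copy the host values in or to zero the array on the coprocessor. The sketch below illustrates the second option using Intel's offload pragma with `in` and `out` clauses. It is not taken from CAMB: the function name, array names and kernel are hypothetical, `out` is used instead of `nocopy` so that the result returns to the host in a single offload, and the `__INTEL_OFFLOAD` guard is assumed to be defined only by an Intel compiler with offload enabled, so other compilers simply run the loop on the host.

```c
#include <stddef.h>

/* Hypothetical kernel: writes 2 * in_data[i] into acc[i].
   acc is allocated on the coprocessor without being copied in,
   so its initial contents there are undefined. */
void accumulate_offload(const double *in_data, double *acc, size_t n)
{
#ifdef __INTEL_OFFLOAD
    /* in():  copy in_data from host to coprocessor before the block runs.
       out(): allocate acc on the coprocessor, copy it back afterwards. */
    #pragma offload target(mic) in(in_data : length(n)) out(acc : length(n))
#endif
    {
        for (size_t i = 0; i < n; ++i) {
            acc[i] = 0.0;                /* initialize on the coprocessor side */
            acc[i] += 2.0 * in_data[i];  /* accumulation now starts from zero  */
        }
    }
}
```

Zeroing inside the offloaded block avoids transferring an array of zeros over the bus, which is presumably why the quote calls it the cheaper fix.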
“I’m now using MAP – it shows that the code is fairly well vectorised at 70%.
This will have to be improved a bit to get the most out of the coprocessors.”
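To make "improving vectorization" concrete, here is a minimal sketch of two loop shapes, again not taken from CAMB and with hypothetical function names. In the first, each iteration depends on the result of the previous one, which generally prevents a compiler from straightforwardly vectorizing it. In the second, iterations are independent and the `restrict` qualifiers promise that the arrays do not overlap, which typically makes it a good candidate. Whether a given compiler actually vectorizes either loop depends on the compiler and its flags.

```c
#include <stddef.h>

/* Loop-carried dependence: iteration i needs the total
   produced by iteration i - 1. */
void running_sum(const float *x, float *out, size_t n)
{
    float total = 0.0f;
    for (size_t i = 0; i < n; ++i) {
        total += x[i];
        out[i] = total;
    }
}

/* Independent iterations: out[i] depends only on x[i] and y[i].
   restrict (C99) tells the compiler the arrays do not alias. */
void scale_add(const float *restrict x, const float *restrict y,
               float *restrict out, float a, size_t n)
{
    for (size_t i = 0; i < n; ++i)
        out[i] = a * x[i] + y[i];
}
```

A profiler's per-loop view is what tells you which of the loops shaped like the first one are worth restructuring.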
• Ten years of high-quality development tools
‒ Leading in HPC software tools market worldwide
‒ Global customer base
• Making parallel programming accessible to the widest range of
scientists and programmers
‒ Design an unrivaled, productive and easy-to-use development
environment…
‒ … To help you reach the highest level of performance and scalability
‒ Define a new standard of customer support
Allinea Software
Summary
The premier Intel® Xeon Phi™ development environment from Allinea
– Is your code ready for Intel® Xeon Phi™? Run a Performance Report!
– See which loops are important to vectorize with Allinea MAP
– Stay productive with full profiling and debugging on both host and
coprocessor
– Powerful unified interface with industry-leading technical support to help
you get the job finished faster
Visit us at our booth #1719 to see this in action!
Enter our Performance Reports competition to win a Kindle Fire every day!