Intel HPC Update
DESCRIPTION
Presentation from the HPC event at IBM Denmark - September 2013, Copenhagen
TRANSCRIPT
INTEL CONFIDENTIAL
Intel® HPC Portfolio
and Roadmap Update
Gareth Tucker
IBM Technical Account Manager EMEA
This slide MUST be used with any slides removed from this presentation
Legal Disclaimers
INFORMATION IN THIS DOCUMENT IS PROVIDED IN CONNECTION WITH INTEL PRODUCTS. NO LICENSE, EXPRESS OR IMPLIED, BY ESTOPPEL OR OTHERWISE, TO ANY INTELLECTUAL PROPERTY RIGHTS IS GRANTED BY THIS DOCUMENT. EXCEPT AS PROVIDED IN INTEL'S TERMS AND CONDITIONS OF SALE FOR SUCH PRODUCTS, INTEL ASSUMES NO LIABILITY WHATSOEVER AND INTEL DISCLAIMS ANY EXPRESS OR IMPLIED WARRANTY, RELATING TO SALE AND/OR USE OF INTEL PRODUCTS INCLUDING LIABILITY OR WARRANTIES RELATING TO FITNESS FOR A PARTICULAR PURPOSE, MERCHANTABILITY, OR INFRINGEMENT OF ANY PATENT, COPYRIGHT OR OTHER INTELLECTUAL PROPERTY RIGHT.
UNLESS OTHERWISE AGREED IN WRITING BY INTEL, THE INTEL PRODUCTS ARE NOT DESIGNED NOR INTENDED FOR ANY APPLICATION IN WHICH THE FAILURE OF THE INTEL PRODUCT COULD CREATE A SITUATION WHERE PERSONAL INJURY OR DEATH MAY OCCUR.
Intel may make changes to specifications and product descriptions at any time, without notice. Designers must not rely on the absence or characteristics of any features or instructions marked "reserved" or "undefined." Intel reserves these for future definition and shall have no responsibility whatsoever for conflicts or incompatibilities arising from future changes to them. The information here is subject to change without notice. Do not finalize a design with this information.
Intel product plans in this presentation do not constitute Intel plan of record product roadmaps. Please contact your Intel representative to obtain Intel’s current plan of record product roadmaps.
The products described in this document may contain design defects or errors known as errata which may cause the product to deviate from published specifications. Current characterized errata are available on request. This document contains information on products in the design phase of development.
All products, computer systems, dates, and figures specified are preliminary based on current expectations, and are subject to change without notice.
Contact your local Intel sales office or your distributor to obtain the latest specifications and before placing your product order.
Copyright © 2013, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others.
Legal Disclaimers Continued
Intel processor numbers are not a measure of performance. Processor numbers differentiate features within each processor family, not across different
processor families. Go to: http://www.intel.com/products/processor_number
Intel® HT Technology available on select Intel® processors. Requires an Intel® HT Technology-enabled system. Consult your system manufacturer.
Performance will vary depending on the specific hardware and software used. For more information including details on which processors support HT
Technology, visit http://www.intel.com/info/hyperthreading.
Intel® Virtualization Technology requires a computer system with an enabled Intel® processor, BIOS, virtual machine monitor (VMM). Functionality,
performance or other benefits will vary depending on hardware and software configurations. Software applications may not be compatible with all
operating systems. Consult your PC manufacturer. For more information, visit http://www.intel.com/go/virtualization
No computer system can provide absolute security under all conditions. Intel® Trusted Execution Technology (Intel® TXT) requires a computer system
with Intel® Virtualization Technology, an Intel TXT-enabled processor, chipset, BIOS, Authenticated Code Modules and an Intel TXT-compatible
measured launched environment (MLE). Intel TXT also requires the system to contain a TPM v1.2. For more information, visit
http://www.intel.com/technology/security
Requires a system with Intel® Turbo Boost Technology. Intel Turbo Boost Technology and Intel Turbo Boost Technology 2.0 are only available on select
Intel® processors. Consult your PC manufacturer. Performance varies depending on hardware, software, and system configuration. For more
information, visit http://www.intel.com/go/turbo
Intel® AES-NI requires a computer system with an AES-NI enabled processor, as well as non-Intel software to execute the instructions in the correct
sequence. AES-NI is available on select Intel® processors. For availability, consult your reseller or system manufacturer. For more information,
see http://software.intel.com/en-us/articles/intel-advanced-encryption-standard-instructions-aes-ni/
Intel, Intel Xeon, the Intel Xeon logo and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States
and other countries. Other names and brands may be claimed as the property of others
Legal Disclaimers: Performance
Performance tests and ratings are measured using specific computer systems and/or components and reflect the approximate performance of Intel products as
measured by those tests. Any difference in system hardware or software design or configuration may affect actual performance. Buyers should consult other
sources of information to evaluate the performance of systems or components they are considering purchasing. For more information on performance tests and on
the performance of Intel products, Go to: http://www.intel.com/performance/resources/benchmark_limitations.htm.
Relative performance is calculated by assigning a baseline value of 1.0 to one benchmark result, and then dividing the actual benchmark result for the baseline platform
into each of the specific benchmark results of each of the other platforms, and assigning them a relative performance number that correlates with the performance
improvements reported.
SPEC, SPECint, SPECfp, SPECrate, SPECpower, SPECjAppServer, SPECjEnterprise, SPECjbb, SPECompM, SPECompL, and SPEC MPI are trademarks of the Standard
Performance Evaluation Corporation. See http://www.spec.org for more information.
SAP and SAP NetWeaver are the registered trademarks of SAP AG in Germany and in several other countries. See http://www.sap.com/benchmark for more information.
Results have been estimated based on internal Intel analysis and are provided for informational purposes only. Any difference in system hardware or software design or
configuration may affect actual performance.
Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and
MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the
results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the
performance of that product when combined with other products.
This slide MUST be used with any slides with performance data removed from this presentation
Legal Disclaimers
INFORMATION IN THIS DOCUMENT IS PROVIDED IN CONNECTION WITH INTEL PRODUCTS. NO LICENSE, EXPRESS OR IMPLIED, BY ESTOPPEL OR OTHERWISE, TO ANY INTELLECTUAL PROPERTY RIGHTS IS GRANTED BY THIS DOCUMENT. EXCEPT AS PROVIDED IN INTEL'S TERMS AND CONDITIONS OF SALE FOR SUCH PRODUCTS, INTEL ASSUMES NO LIABILITY WHATSOEVER AND INTEL DISCLAIMS ANY EXPRESS OR IMPLIED WARRANTY, RELATING TO SALE AND/OR USE OF INTEL PRODUCTS INCLUDING LIABILITY OR WARRANTIES RELATING TO FITNESS FOR A PARTICULAR PURPOSE, MERCHANTABILITY, OR INFRINGEMENT OF ANY PATENT, COPYRIGHT OR OTHER INTELLECTUAL PROPERTY RIGHT.
A "Mission Critical Application" is any application in which failure of the Intel Product could result, directly or indirectly, in personal injury or death. SHOULD YOU PURCHASE OR USE INTEL'S PRODUCTS FOR ANY SUCH MISSION CRITICAL APPLICATION, YOU SHALL INDEMNIFY AND HOLD INTEL AND ITS SUBSIDIARIES, SUBCONTRACTORS AND AFFILIATES, AND THE DIRECTORS, OFFICERS, AND EMPLOYEES OF EACH, HARMLESS AGAINST ALL CLAIMS COSTS, DAMAGES, AND EXPENSES AND REASONABLE ATTORNEYS' FEES ARISING OUT OF, DIRECTLY OR INDIRECTLY, ANY CLAIM OF PRODUCT LIABILITY, PERSONAL INJURY, OR DEATH ARISING IN ANY WAY OUT OF SUCH MISSION CRITICAL APPLICATION, WHETHER OR NOT INTEL OR ITS SUBCONTRACTOR WAS NEGLIGENT IN THE DESIGN, MANUFACTURE, OR WARNING OF THE INTEL PRODUCT OR ANY OF ITS PARTS.
Intel may make changes to specifications and product descriptions at any time, without notice. Designers must not rely on the absence or characteristics of any features or instructions marked "reserved" or "undefined". Intel reserves these for future definition and shall have no responsibility whatsoever for conflicts or incompatibilities arising from future changes to them. The information here is subject to change without notice. Do not finalize a design with this information.
The products described in this document may contain design defects or errors known as errata which may cause the product to deviate from published specifications. Current characterized errata are available on request. Contact your local Intel sales office or your distributor to obtain the latest specifications and before placing your product order. Copies of documents which have an order number and are referenced in this document, or other Intel literature, may be obtained by calling 1-800-548-4725, or go to: http://www.intel.com/design/literature.htm
Intel does not control or audit the design or implementation of third party benchmarks or Web sites referenced in this document. Intel encourages all of its customers to visit the referenced Web sites or others where similar performance benchmarks are reported and confirm whether the referenced benchmarks are accurate and reflect performance of systems available for purchase.
SPEC, SPECint, SPECfp, SPECrate, SPECpower, SPECjAppServer, SPECjbb, SPECjvm, SPECWeb, SPECompM, SPECompL, SPEC MPI, SPECjEnterprise* are trademarks of the Standard Performance Evaluation Corporation. See http://www.spec.org for more information. TPC-C, TPC-H, TPC-E are trademarks of the Transaction Processing Performance Council. See http://www.tpc.org for more information.
Hyper-Threading Technology requires a computer system with a processor supporting HT Technology and an HT Technology-enabled chipset, BIOS and operating system. Performance will vary depending on the specific hardware and software you use. For more information including details on which processors support HT Technology, see http://www.intel.com/info/hyperthreading
Intel® Turbo Boost Technology requires a Platform with a processor with Intel Turbo Boost Technology capability. Intel Turbo Boost Technology performance varies depending on hardware, software and overall system configuration. Check with your platform manufacturer on whether your system delivers Intel Turbo Boost Technology. For more information, see http://www.intel.com/technology/turboboost
Intel processor numbers are not a measure of performance. Processor numbers differentiate features within each processor family, not across different processor families. Go to: http://www.intel.com/products/processor_number
Intel products are not intended for use in medical, life saving, life sustaining, critical control or safety systems, or in nuclear facility applications. All dates and products specified are for planning purposes only and are subject to change without notice.
Xeon® is a trademark of Intel Corporation in the U.S. and/or other countries.
Copyright © 2013 Intel Corporation. All rights reserved. Intel, the Intel logo, Xeon and Intel Core are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States and other countries. All dates and products specified are for planning purposes only and are subject to change without notice
Optimization Notice
Intel's compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations
that are not unique to Intel microprocessors. These optimizations include SSE2, SSE3, and SSSE3 instruction
sets and other optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any
optimization on microprocessors not manufactured by Intel.
Microprocessor-dependent optimizations in this product are intended for use with Intel microprocessors.
Certain optimizations not specific to Intel microarchitecture are reserved for Intel microprocessors. Please refer
to the applicable product User and Reference Guides for more information regarding the specific instruction
sets covered by this notice.
Notice revision #20110804
Intel® Xeon® Processor
E5-2600 v2 Product Family
Solve Real Problems, Deliver Real Results
Tick-Tock Development Model: Sustained Microprocessor Leadership
Intel® Core™ Microarchitecture
  TOCK (new microarchitecture): Xeon® 5300, 65nm
  TICK (new process technology): Xeon® 5400, 45nm
Intel® Microarchitecture Codename Nehalem
  TOCK (new microarchitecture): Xeon® 5500, 45nm
  TICK (new process technology): Xeon® 5600, 32nm
Intel® Microarchitecture Codename Sandy Bridge
  TOCK (new microarchitecture): Xeon® E5-2600, 32nm
  TICK (new process technology): Xeon® E5-2600 v2, 22nm
Intel® Microarchitecture Codename Haswell
  TOCK (new microarchitecture): Haswell, 22nm
  TICK (new process technology): Future, 14nm
Latest Micro-architecture on Leading Process Technology
At the Heart of a Modern Data Center
Intel® Xeon® E5-2600 v2 product family
Powerful: Up to 12 cores and 30MB cache, expected to deliver up to 30% [1] more performance in the same power envelope vs. the previous generation
Efficient: Leading 22nm manufacturing process reduces power usage. Supports Intel® Node Manager & Intel® Data Center Manager Software
Secure: Improved security with Intel® Secure Key & Intel® OS Guard for additional HW embedded security, plus enhanced AES-NI
1. Source: Intel internal measurements. SPECint*_rate_base2006, 28 March 2013, E5-26xx v2 (12C, 2.5GHz) vs. E5-2600 (8C, 2.9GHz). Results have been simulated and are provided for informational purposes only. Results were derived using simulations run on an architecture simulator or model. Any difference in system hardware or software design or configuration may affect actual performance. Intel product plans in this presentation do not constitute Intel plan of record product roadmaps. Please contact your Intel representative to obtain Intel’s current plan of record product roadmaps. For more information go to http://www.intel.com/performance
Unrelenting Focus on Power Efficiency
1. Source: Intel internal measurements: [Baseline Configuration and Score on SPECPower_ssj2013* benchmark. Idle power based on , Intel® Xeon ® processor E5- 26xx v2 (12C, 2.5GHz, 95W), 28 March 2013]. Results have been simulated and are provided for informational purposes only. Results were derived using simulations run on an architecture simulator or model. Any difference in system hardware or software design or configuration may affect actual performance. Intel product plans in this presentation do not constitute Intel plan of record product roadmaps. Please contact your Intel representative to obtain Intel’s current plan of record product roadmaps. For more information go to http://www.intel.com/performance
Active Power: Delivering up to 45% [1] power efficiency improvements through enhanced fine-grained power controls and the 22nm tri-gate process
Dynamic Power: Efficient Turbo that intelligently adapts to peak workload conditions and disengages when memory and I/O are the bottlenecks
Idle Power: Low-leakage process technology and power gating technology contribute to idle power up to 23% [1] lower than the previous generation
Xeon E5-2600 v2
REAL Performance Where it Counts
Xeon E5-2600 v2
IMPROVED: integrated I/O (PCIe 3.0)
NEW: security features
NEW: virtualization feature
IMPROVED: faster memory
50% MORE: cores / threads
50% MORE: last-level cache
~30% [1] LESS: idle power
1. Source: Intel internal measurements: [idle power, Intel® Xeon ® processor E5- 26xx v2 (12C, 2.5GHz, 95W), 28 March 2013]. Results have been simulated and are provided for informational purposes only. Results were derived using simulations run on an architecture
simulator or model. Any difference in system hardware or software design or configuration may affect actual performance. Intel product plans in this presentation do not constitute Intel plan of record product roadmaps. Please contact your Intel representative to obtain Intel’s
current plan of record product roadmaps. For more information go to http://www.intel.com/performance
Reduce Bottlenecks with Intel® Integrated I/O
Networking, storage, appliances: Increase I/O Performance
HPC, trading, large-scale analytics: Reduce I/O Latency
Better Together: Unleash the full I/O capabilities of Xeon® E5 with Intel® Ethernet X540 Server 10GbE Adapter or Intel® True Scale 7300 series HCAs
1. Source: Intel internal measurements of average time for an I/O device read to local system memory under idle conditions comparing Intel® Xeon® processor E5-2600 product family (230 ns) vs. Intel® Xeon® processor 5500 series (340 ns). See notes in backup for configuration details.
2. Source: 8 GT/s and 128b/130b encoding in PCIe* 3.0 specification enables double the interconnect bandwidth over the PCIe* 2.0 specification (www.pcisig.com/news_room/November_18_2010_Press_Release/).
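The two footnoted figures on this slide can be checked with simple arithmetic. The sketch below is illustrative only: the PCIe 3.0 values (8 GT/s, 128b/130b) and the latency values (230 ns, 340 ns) come from the footnotes, while the PCIe 2.0 values (5 GT/s, 8b/10b encoding) are not on the slide and are taken as an assumption from the PCIe 2.0 specification. The function name is hypothetical.

```python
# Per-lane, per-direction bandwidth from signalling rate and line-code efficiency.
# PCIe 3.0 numbers are from the slide footnote; PCIe 2.0 numbers are an
# assumption (5 GT/s with 8b/10b encoding), not stated in the deck.

def lane_bandwidth_gbytes(gt_per_s: float, payload_bits: int, total_bits: int) -> float:
    """Usable bandwidth of one lane in one direction, in GB/s."""
    return gt_per_s * payload_bits / total_bits / 8  # 8 bits per byte

pcie2 = lane_bandwidth_gbytes(5.0, 8, 10)
pcie3 = lane_bandwidth_gbytes(8.0, 128, 130)
ratio = pcie3 / pcie2

# I/O read latency quoted in footnote 1, in nanoseconds.
latency_5500_ns = 340
latency_e5_ns = 230
latency_reduction = 1 - latency_e5_ns / latency_5500_ns

print(f"PCIe 2.0 lane: {pcie2:.3f} GB/s")
print(f"PCIe 3.0 lane: {pcie3:.3f} GB/s ({ratio:.2f}x)")
print(f"I/O read latency reduction: {latency_reduction:.1%}")
```

Under these assumptions the per-lane ratio comes out just under 2x, which is what "double the interconnect bandwidth" rounds to, and the quoted latencies correspond to roughly a one-third reduction.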
Intel® Xeon® processor E5-2600 v2 Product Family
• Socket compatible replacement for Intel® Xeon® processor E5-2600 product family
• Up to 12 cores and 30MB cache expected to deliver up to 50% [1] more performance in same power envelope
• Improved security with Intel® Secure Key & Intel® OS Guard for additional HW embedded security
Intel Xeon Processor E5-2600 v2
• Up to 30MB Shared Cache
• Integrated PCI Express* 3.0, up to 40 lanes per socket
• 4 channels of up to DDR3 1866 MHz memory
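As a rough guide to what "4 channels of up to DDR3 1866" means per socket, the sketch below computes the theoretical peak memory bandwidth. It assumes a 64-bit (8-byte) data bus per channel and reads the slide's "1866 MHz" as 1866 mega-transfers per second; neither detail is stated in the deck, and sustained bandwidth in practice is lower than this peak. The function name is hypothetical.

```python
# Theoretical peak DRAM bandwidth per socket.
# Assumptions (not from the slide): 8-byte data bus per channel,
# and "1866 MHz" interpreted as 1866 MT/s.

def ddr_peak_gb_per_s(channels: int, mega_transfers_per_s: float, bus_bytes: int = 8) -> float:
    """Peak bandwidth in GB/s, with 1 GB = 1e9 bytes."""
    return channels * mega_transfers_per_s * 1e6 * bus_bytes / 1e9

peak_1866 = ddr_peak_gb_per_s(4, 1866)
peak_1600 = ddr_peak_gb_per_s(4, 1600)

print(f"4 x DDR3-1866: {peak_1866:.1f} GB/s per socket")
print(f"4 x DDR3-1600: {peak_1600:.1f} GB/s per socket")
```

The DDR3-1600 line is included only for comparison with the lower memory speed listed for some SKU segments.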
1. Baseline Configuration and Score on SPECVirt_sc2013* benchmark. Please contact your Intel representative to obtain Intel’s current plan of record product roadmaps. Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products. For more information go to http://www.intel.com/performance
World Record Performance
Source: http://newsroom.intel.com/community/intel_newsroom/blog/2013/06/17/intel-powers-the-worlds-fastest-supercomputer-reveals-new-and-future-high-performance-computing-technologies
• E5-2600 v2 featured in the #1 supercomputer “Milky Way-2” on the
Top500 list
• With 12 cores running up to 2.7 GHz, E5-2600 v2 delivers 259 GFlops
per socket, a 56% increase over the previous generation
• E5-2600 v2 also in 2 other supercomputers on the Top500 list - #54
and #329
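The 259 GFlops per socket figure is consistent with a simple peak calculation, sketched below. It assumes 8 double-precision floating-point operations per core per cycle (one 256-bit AVX add plus one 256-bit AVX multiply, four doubles each), which is not stated on the slide. The slide also does not say which previous-generation part the 56% refers to; the 8-core, 2.6 GHz baseline used here is an assumption chosen only because it reproduces that figure.

```python
# Theoretical peak double-precision GFLOPS per socket.
# Assumption: 8 DP FLOPs per core per cycle with AVX (not stated on the slide).

def peak_dp_gflops(cores: int, ghz: float, flops_per_cycle: int = 8) -> float:
    return cores * ghz * flops_per_cycle

v2_peak = peak_dp_gflops(12, 2.7)    # 12 cores at 2.7 GHz, from the slide
prev_peak = peak_dp_gflops(8, 2.6)   # hypothetical previous-generation baseline
increase = v2_peak / prev_peak - 1

print(f"E5-2600 v2 peak: {v2_peak:.1f} GFLOPS per socket")
print(f"Assumed baseline: {prev_peak:.1f} GFLOPS per socket")
print(f"Increase: {increase:.0%}")
```

These are peak rates at the quoted clock frequency; measured application performance depends on the workload and is normally lower.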
Intel® Xeon® Processor E5-2600 v2 Product Family: SKUs, LGA2011 (E5)

Model         Cores  TDP    Frequency  Cache (where noted)
E5-2620 v2    6C     80W    2.1GHz
E5-2609 v2    4C     80W    2.5GHz
E5-2670 v2    10C    115W   2.5GHz
E5-2640 v2    8C     95W    2.0GHz
E5-2603 v2    4C     80W    1.8GHz
E5-2630 v2    6C     80W    2.6GHz
E5-2690 v2    10C    130W   3.0GHz
E5-2680 v2    10C    115W   2.8GHz
E5-2650 v2    8C     95W    2.6GHz
E5-2660 v2    10C    95W    2.2GHz
E5-2697 v2    12C    130W   2.7GHz     30M
E5-2695 v2    12C    115W   2.4GHz     30M
E5-2687W v2   8C     150W   3.4GHz
E5-2667 v2    8C     130W   3.3GHz     25M
E5-2643 v2    6C     130W   3.5GHz     25M
E5-2637 v2    4C     130W   3.5GHz     15M
E5-2650L v2   10C    70W    1.7GHz
E5-2630L v2   6C     60W    2.4GHz

Segment features
Advanced: 8C-20MB, 10C-25MB cache; 8.0 GT/s QPI; Intel® HT Technology; DDR3-1866 (skt R), DDR3-1600 (skt B2); Intel® Turbo Boost Technology
Standard: 6C-15MB, 8C-20MB cache; 7.2 GT/s QPI; Intel® HT Technology; DDR3 1600; Intel® Turbo Boost Technology
Basic: 4C-10MB, 2C-5MB cache; 6.4 GT/s QPI; DDR3 1333
Segment Optimized: 2.5MB/c cache, more cache as noted; 8.0 GT/s QPI; Intel® HT Technology; DDR3-1866; Intel® Turbo Boost Technology
Low Power: 2.5MB/c cache; 10C 8.0 GT/s QPI, 6C 7.2 GT/s QPI; DDR3-1600; Intel® HT Technology; Intel® Turbo Boost Technology
Workstation Only SKU
T = Tray only SKUs
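The "2.5MB/c cache, more cache as noted" rule can be made concrete for the SKUs that carry an explicit cache figure in the listing above. The sketch below uses only the core counts and cache sizes shown on the slide; the dictionary and function names are hypothetical.

```python
# Compare the noted last-level cache of each SKU against the default
# rule of 2.5 MB per core quoted on the slide.

MB_PER_CORE = 2.5

# model: (cores, noted cache in MB), as listed on the slide
noted_cache = {
    "E5-2697 v2": (12, 30),
    "E5-2695 v2": (12, 30),
    "E5-2667 v2": (8, 25),
    "E5-2643 v2": (6, 25),
    "E5-2637 v2": (4, 15),
}

def extra_cache_mb(cores: int, noted_mb: float) -> float:
    """Cache beyond what 2.5 MB per core would give."""
    return noted_mb - cores * MB_PER_CORE

extra = {model: extra_cache_mb(c, mb) for model, (c, mb) in noted_cache.items()}

for model, mb in extra.items():
    print(f"{model}: {mb:+.1f} MB vs. 2.5 MB/core")
```

On these numbers the two 12-core parts sit exactly at 2.5 MB per core, while the 8-, 6- and 4-core frequency-oriented parts carry more cache than their core count alone would imply.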
INTEL CONFIDENTIAL - USE UNDER NDA ONLY UNLESS TAGGED “PUBLIC AT LAUNCH”.
Neusoft CT: Intel® Advanced Vector Extensions – Float 16
“The Intel Xeon Processor E5-2600 V2 generates complex CT images about
45 percent faster than the previous Intel Xeon Processor E5-2600. A study
that took 20 minutes can now be completed in just 11. Multiply that by the
hundreds of patients in a busy clinic and the time savings stretch to hours
per day. We also see performance gains of up to 1.54X using Float16
Instructions for select workloads. This is absolutely a preferred hardware
platform for Neusoft CT.”
Shuangxue Li, Vice President of Neusoft Medical Systems Co.,
Ltd., General Manager of Diagnostic Imaging Systems Division
www.neusoft.com
[Chart: Intel AVX, relative performance, the higher the better. Intel® Xeon® Processor E5-2600 V2 series with AVX Float 16 optimization: 1.54X relative to the same series without AVX Float 16 optimization]
Intel® AVX reduces the diagnosis latency and helps doctors make the right decision in the shortest time
Neusoft CT scanners depend, in part, on the efficiency of the
underlying image-generation software and its ability to deliver
high quality images quickly.
Benefits of the Intel® Xeon® processor E5 v2 family—Fast image
generation reduces wait times for patients and medical teams in
busy clinical settings.
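The quote gives both a percentage ("about 45 percent faster") and a concrete example (20 minutes down to 11). The sketch below shows how the two relate: the 20-to-11 example is a 45% reduction in elapsed time, which is not the same number as the throughput speedup factor. Which convention Neusoft intended is not stated, so this is an interpretation, and the function names are hypothetical.

```python
# Two common ways to express the same before/after timing.

def time_reduction(before: float, after: float) -> float:
    """Fraction of elapsed time saved."""
    return 1 - after / before

def speedup(before: float, after: float) -> float:
    """How many times more work fits in the same time."""
    return before / after

before_min, after_min = 20, 11   # study times from the quote
reduction = time_reduction(before_min, after_min)
factor = speedup(before_min, after_min)

print(f"Elapsed time reduced by {reduction:.0%}")
print(f"Equivalent speedup: {factor:.2f}x")
```

Keeping the two measures apart matters when comparing this slide with the others in the deck, which mostly quote speedup factors such as 1.38x or 1.53x.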
SunGard ALM Benchmark 5.8.6 Risk Analytics
“Having successfully migrated to Intel® Xeon® E5-2600 v2, we have seen
significant increases in processing power. We have run the large scenario
simulation engine that is able to take advantage of the increased number
of cores in the new Intel platform; the new platform increased our
performance by more than 38%. These improved results come at a time
when our customers are demanding faster results with even greater
granularity.”
Joe Sass, Director of Product Strategy for SunGard’s Ambit ALM business
SunGard’s Asset & Liability Risk Management solution provides
complete multidimensional analysis of the balance sheet,
incorporating interest rate risk, income simulation and market
valuation using deterministic and stochastic modeling.
www.sungard.com
Better Information and Analysis Means Better Decisions. SunGard ALM Risk Management Solutions
[Chart (Finance): elapsed time in sec, Intel® Xeon® processor E5-2600 vs. Intel® Xeon® processor E5-2600 V2; the E5-2600 V2 is 1.38x faster]
Paradigm GeoDepth* v2011.3 Seismic Imaging
“Our continued investment in the optimization of the GeoDepth software,
leveraging the Intel compilers and the Intel MKL library, enables our
customers to take immediate advantage of the 50% increase in compute
cores in this latest generation of Intel Xeon processors.”
Duane Dopkin, Executive Vice President, Technology
Duane Dopkin, Senior Vice President, Technology
GeoDepth* is the leading system for 3D and 2D velocity model building and
seismic imaging in time and depth. Through the integration of interpretation,
velocity analysis, model building, model updating, model validation, depth
imaging and time-to-depth conversion, GeoDepth provides the continuity
needed to produce high-quality, interpretable images consistent with other
available data.
Key Intel® Xeon® processor E5 v2 advantage—Increased memory speed
reduces communications overhead; 24 cores per two-socket server provide
application scalability; Intel® Hyper-Threading Technology and Intel® Turbo
Boost Technology provide much higher price performance.
Customer Benefits—Geophysicists can choose to apply the scalable
performance improvements to produce higher resolution images of the
subsurface, or to improve the throughput of their existing workload.
www.paradigm.com
[Chart (Energy): Paradigm 2011.3 Benchmarks, CSFWMIG Benchmark and CRAM Benchmark; relative performance, higher is better. 8 core Intel® Xeon® processor E5-2670: 1 (baseline) on both benchmarks; 12 core Intel® Xeon® processor E5-2697 V2: 1.45 and 1.59]
Scalable performance for high resolution seismic imaging
Intel® Xeon® Processor E5-2600 v2 Product Family
Star-CCM+* Engineering Analysis (Multi-Disciplinary)
“We redesigned the front end of our chassis almost exclusively using
simulations with CD-adapco software, the redesign added about 50 lbs.
of down force which is enough extra grip to give us about a tenth of a
second a lap.”
Andy Hogg, Aerodynamics manager, Michael Waltrip Racing
www.cd-adapco.com
HPC
STAR-CCM+ 8.04.007, Lemans 17M
Iteration time, sec - lower is better
Sandy Bridge 8-core @ 2.7 GHz: 17.7
Ivy Bridge 10-core @ 2.8 GHz: 13.0
Ivy Bridge 12-core @ 2.7 GHz: 11.5 (1.53x Faster)
Faster performance; High quality digital cinema; Faster time to Market
CD-adapco STAR-CCM+* provides comprehensive support for solving
complex engineering problems involving flow (of fluids or solids), heat
transfer and stress. It helps engineers automate workflows to perform
iterative design studies with minimal user interaction.
Faster simulation runtimes reduce simulation/prototyping timelines, to
improve design quality, and speed time to market.
Increased memory bandwidth of the E5-2600 v2 series allows better
utilization of its computational resources, significantly improving
run times.
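The "1.53x Faster" callout on this slide can be reproduced from the iteration times. The sketch below assumes that 17.7 s belongs to the Sandy Bridge 8-core part, 13.0 s to the Ivy Bridge 10-core part, and 11.5 s to the Ivy Bridge 12-core part. That pairing is inferred from the callout and is not labeled explicitly on the slide.

```python
# Iteration times in seconds for the STAR-CCM+ Lemans 17M benchmark.
# ASSUMPTION: the pairing of times to processors is inferred from the
# "1.53x Faster" callout, not read from explicit bar labels.
iteration_time_sec = {
    "Sandy Bridge 8-core @ 2.7 GHz": 17.7,
    "Ivy Bridge 10-core @ 2.8 GHz": 13.0,
    "Ivy Bridge 12-core @ 2.7 GHz": 11.5,
}


def speedup(baseline_sec, new_sec):
    """Speedup of a run relative to a baseline when lower time is better."""
    return baseline_sec / new_sec


baseline = iteration_time_sec["Sandy Bridge 8-core @ 2.7 GHz"]
speedups = {
    name: speedup(baseline, seconds)
    for name, seconds in iteration_time_sec.items()
}

for name, value in speedups.items():
    print(f"{name}: {value:.2f}x")
```

The 12-core ratio comes out just under 1.54, which matches the slide's 1.53x if the slide truncates rather than rounds.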
Intel® Xeon Phi™
Co-processor Update
PARALLELISM IS THE PATH FORWARD Intel is Your Roadmap
• More cores and more threads per core
• Wider Vector instructions
• Higher memory bandwidth
• Common languages, directives, libraries & tools
• Complements Intel® Xeon® processors
• Performance and energy efficiency for most workloads
• Parallel, Serial, Multicore + Vector
• Robust security and reliability
• Flexible foundation for growth and innovation
Most Commonly Used Parallel Processor
Optimized for Highly Parallel Application
PRODUCT LINEUP Intel® Xeon Phi™ Coprocessor
3xxx Family: Outstanding Parallel Computing Solution, Performance/$ leadership
3120P, 3120A: 6GB GDDR5, 240GB/s, >1TF DP
5xxx Family: Optimized for High Density Environments, Performance/watt leadership
5110P, 5120D: 8GB GDDR5, >300GB/s, >1TF DP, 225-245W
7xxx Family: Highest Performance, Most Memory, Performance leadership
7120P, 7120X: 16GB GDDR5, 352GB/s, >1.2TF DP
Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products. For more information go to http://www.intel.com/performance
* Theoretical acceleration using a highly-parallel Intel® Xeon Phi™ coprocessor versus a standard multi-core Intel® Xeon® processor
BIG GAINS FOR PARALLEL APPLICATIONS
Efficient vectorization, threading, and parallel execution drive higher performance for many applications
[Chart: Performance (1.00 to 7.00) as a function of Fraction Parallel (0.00 to 1.00) and % Vector (0% to 100%)]
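The slide does not state the model behind this chart. As a rough illustration only, the sketch below assumes an Amdahl's-law-style model in which a fraction of the work runs in parallel across cores and a fraction of that parallel work also vectorizes. The function name, the 60-core count and the 8-lane vector width are hypothetical choices for illustration, and the result is a speedup over a single scalar core rather than the coprocessor-versus-Xeon ratio plotted on the slide.

```python
def theoretical_speedup(parallel_fraction, vector_fraction, cores, vector_width):
    """Amdahl-style speedup relative to one scalar core.

    ASSUMPTION: this is an illustrative model, not the one used for the slide.
    parallel_fraction: share of the work that can be spread across cores.
    vector_fraction:   share of the parallel work that can use vector lanes.
    """
    if not (0.0 <= parallel_fraction <= 1.0 and 0.0 <= vector_fraction <= 1.0):
        raise ValueError("fractions must lie in [0, 1]")
    if cores < 1 or vector_width < 1:
        raise ValueError("cores and vector_width must be at least 1")

    # Relative time for the parallel work on one core: the vectorizable
    # share runs vector_width times faster, the remainder stays scalar.
    per_core_time = (1.0 - vector_fraction) + vector_fraction / vector_width

    serial_time = 1.0 - parallel_fraction
    parallel_time = parallel_fraction * per_core_time / cores
    return 1.0 / (serial_time + parallel_time)


CORES = 60        # hypothetical core count, for illustration only
VECTOR_WIDTH = 8  # hypothetical: eight double-precision lanes per vector

for p in (0.0, 0.5, 0.9, 1.0):
    row = [theoretical_speedup(p, v, CORES, VECTOR_WIDTH) for v in (0.0, 0.5, 1.0)]
    print(f"parallel={p:.2f}: " + ", ".join(f"{s:7.2f}" for s in row))
```

Under this model the gain stays close to 1 unless both fractions are high, which is the qualitative point of the chart: many cores and wide vectors only pay off for code that is both well threaded and well vectorized.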
PARALLELIZING FOR HIGH PERFORMANCE A Two Step Process
STARTING POINT
Typical serial code running on multi-core Intel® Xeon® processors
67.097 SECONDS
STEP 1. OPTIMIZE CODE
Parallelize and vectorize code and continue to run on multi-core Intel Xeon processors
0.46 SECONDS (145X FASTER)
STEP 2. USE COPROCESSORS
Run all or part of the optimized code on Intel® Xeon Phi™ coprocessors
0.197 SECONDS (2.3X FASTER than Step 1, 340X FASTER than the starting point)
Current Performance
Source: Intel Measured results as of October 26, 2012 Configuration Details: Please reference slide speaker notes. For more information go to http://www.intel.com/performance
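The three speedup figures on this slide follow from the three elapsed times. The sketch below assumes that 67.097 s is the serial starting point, 0.46 s the optimized code on Intel Xeon processors, and 0.197 s the run using coprocessors. This mapping is inferred from the ratios, since the extracted slide does not pair each time with its step explicitly.

```python
# Elapsed times in seconds, as listed on the slide.
# ASSUMPTION: the mapping of each time to a step is inferred from the ratios.
serial_sec = 67.097       # starting point: serial code on Xeon
optimized_sec = 0.46      # step 1: parallelized and vectorized on Xeon
coprocessor_sec = 0.197   # step 2: optimized code using Xeon Phi

step1_speedup = serial_sec / optimized_sec        # slide: "145X FASTER"
step2_speedup = optimized_sec / coprocessor_sec   # slide: "2.3X FASTER"
overall_speedup = serial_sec / coprocessor_sec    # slide: "340X FASTER"

print(f"Step 1: {step1_speedup:.1f}x")
print(f"Step 2: {step2_speedup:.2f}x")
print(f"Overall: {overall_speedup:.1f}x")
```

Almost all of the overall gain comes from step 1, the parallelization and vectorization work on the host processor. The coprocessor then multiplies that result by a much smaller factor.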
Intel® Xeon Phi™ product family
Based on Intel® Many Integrated Core (Intel® MIC) architecture
Leading performance for highly parallel workloads
Common Intel® Xeon® programming model seamlessly increases developer productivity
Launching on 22nm with >50 cores
Single Source
Compilers and Runtimes
Intel® Xeon® processor
Ground-breaking real-world application performance
Industry-leading energy efficiency
Meet HPC challenges and scale for growth
Highly-parallel Processing for Unparalleled Discovery Seamlessly solve your most important problems of any scale
Other brands and names are the property of their respective owners.
Shown at SC’12, November 2012
A GROWING ECOSYSTEM Developing today on Intel® Xeon Phi™ Co-processors
• Application: Sinopec iCluster PSDM is a key module in the Sinopec iCluster* seismic imaging system. The split-step Fourier prestack depth migration (SSF PSDM) algorithm is ideal for mild lateral velocity variations. It provides a one-way approximation with wave propagation performed in the frequency domain.
• Status: In-house code
• Usage Model: Offload
• Demonstrated Results: Dramatic scaling (5.3x) over baseline using two server nodes, each with two Intel® Xeon® processors and two Intel® Xeon Phi™ coprocessors
PERFORMANCE PROOF POINT:
ENERGY INDUSTRY Sinopec iCluster PSDM
“This will provide an amazing boost for the performance of the Sinopec iCluster seismic imaging system.”
Zhao Gaishan VP of Sinopec Geophysical Research Institute, November, 2012
[Chart: Speedup (Higher is Better), axis 0 to 6; bar labels as extracted: 1, 1.06, 5]
• 2S Intel® Xeon® processor E5-2680
• Intel® Xeon Phi™ Coprocessor (pre-production HW/SW)
• 2S Intel® Xeon® processor E5-2680 + 2 Intel® Xeon Phi™ Coprocessor, two node (pre-production HW/SW)
SOURCE: INTEL RESULTS AS OF JULY, 2013
Code Optimization Strategies:
software.intel.com/en-us/articles/optimize-seismic-imaging-processing-on-intel-xeon-phi
• Application: Black-Scholes financial modeling requires raw computational power plus high bandwidth between execution cores and memory
• Status: Case Study available
• Highlights: Dramatic scaling for both single- and double-precision computations
• Demonstrated Results:
• Intel® Xeon Phi™ coprocessor streaming store provides optimized cache and bandwidth usage
• Intel Xeon Phi coprocessor fast transcendental functions exp2() and log2() increase performance and accuracy in single precision (SP)
• Intel® Xeon® processors also benefit from using exp2()/log2()
• Compiler-based code generation enables plain C++ code that delivers higher performance than vector intrinsics
PERFORMANCE PROOF POINT:
FINANCIAL SERVICES Black-Scholes formula valuation
Speedup (Higher is Better)
• 2S Intel® Xeon® processor E5-2670: 1 (Single Precision), 1 (Double Precision)
• 2S Intel Xeon processor E5-2670 + Intel® Xeon Phi™ Coprocessor (pre-production HW/SW): 5.81 (Single Precision), 2.85 (Double Precision)
SOURCE: INTEL MEASURED RESULTS AS OF JULY, 2013
Read the Case Study: software.intel.com/en-us/articles/case-study-achieving-superior-performance-on-black-scholes-valuation-computing-using
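To make the exp2()/log2() point on this slide concrete, here is a minimal sketch of Black-Scholes call valuation that routes its exponential and logarithm through base-2 operations. It is not the code from the case study: the helper names are hypothetical, the European call formula is the textbook one, and in plain Python the rewrite only shows the mathematical identity and gives no performance benefit.

```python
import math

LOG2_E = math.log2(math.e)  # multiply a natural exponent by this to get a base-2 exponent
LN_2 = math.log(2.0)        # multiply a base-2 logarithm by this to get a natural logarithm


def exp_via_exp2(x):
    """e**x computed as 2**(x * log2(e)). Hypothetical helper for illustration."""
    return 2.0 ** (x * LOG2_E)


def log_via_log2(x):
    """ln(x) computed as log2(x) * ln(2). Hypothetical helper for illustration."""
    return math.log2(x) * LN_2


def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def black_scholes_call(spot, strike, rate, volatility, years):
    """Textbook Black-Scholes price of a European call option."""
    sqrt_t = math.sqrt(years)
    d1 = (log_via_log2(spot / strike)
          + (rate + 0.5 * volatility ** 2) * years) / (volatility * sqrt_t)
    d2 = d1 - volatility * sqrt_t
    discount = exp_via_exp2(-rate * years)
    return spot * norm_cdf(d1) - strike * discount * norm_cdf(d2)


# Illustrative inputs only (not from the slide).
price = black_scholes_call(spot=100.0, strike=100.0, rate=0.05,
                           volatility=0.2, years=1.0)
print(f"call price: {price:.4f}")
```

On hardware with fast base-2 transcendental instructions, the constants `LOG2_E` and `LN_2` can be folded into neighbouring multiplications, which is the kind of rewrite the slide credits for part of the gain.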
• Application: Extends MATLAB with a toolbox for employing spatially adaptive sparse grids in a flexible, modular way
• Status: Not released yet
• Workload Characteristics:
• Uses MPI, intrinsics, offload pragmas, and OpenMP
• Floating-point intensive and has a complex, non-linear kernel
• Innermost loop contains an if statement for efficient handling of high-dimensional grid boundaries (reduces computational complexity)
• Demonstrated Results:
• Highlight: Performance scales well to four Intel® Xeon Phi™ coprocessors per node
• Supports symmetric configurations with Intel® Xeon® processors and Intel® Xeon Phi™ coprocessors
PERFORMANCE PROOF POINT:
GOVERNMENT AND ACADEMIC RESEARCH LRZ/TUM SG++
Speedup (Higher is Better)
• 2S Intel® Xeon® processor E5-2670: 1
• 2S Intel Xeon processor E5-2670 + 2* Intel® Xeon Phi™ Coprocessors (pre-production HW/SW): 3.92
SOURCE: THIRD PARTY MEASURED RESULTS AS OF NOVEMBER, 2012
• Application: Weather Research and Forecasting (WRF)
• Status: WRF V3.5 was released 4/18/13
• Code Optimization:
• Approximately two dozen files with fewer than 2,000 lines of code were modified (out of approximately 700,000 lines of code in about 800 files, all Fortran standard compliant)
• Most modifications improved performance for both the host and the coprocessors
• Performance Measurements: Pre-release of WRF 3.5 (V3.5Pre) and the NCAR-supported CONUS2.5KM benchmark (a high-resolution weather forecast)
• Acknowledgments: There were many contributors to these results, including the National Renewable Energy Laboratory and The Weather Channel Companies
PERFORMANCE PROOF-POINT:
WEATHER AND CLIMATE RESEARCH WRF v3.5
Speedup (Higher is Better)
• 2S Intel® Xeon® processor E5-2670 with eight-node cluster configuration: 1
• 2S Intel® Xeon® processor E5-2670 + Intel® Xeon Phi™ coprocessor (pre-production HW/SW) with eight-node cluster configuration: 1.4
SOURCE: INTEL MEASURED RESULTS AS OF JULY, 2013
Next Generation
Intel® Xeon Phi™ Product Family (Codenamed Knights Landing)
All products, computer systems, dates and figures specified are preliminary based on current expectations, and are subject to change without notice.
Available in Intel's cutting-edge 14 nanometer process
Stand-alone CPU or PCIe coprocessor – not bound by ‘offloading’ bottlenecks
Integrated Memory - balances compute with bandwidth
Parallel is the path forward - Intel is your roadmap
Note that code name above is not the product name
Intel® HPC Fabric Update
HPC Expertise
Intellectual Property
World-class Interconnects
HPC Expertise
Fabric Management & Software
Highest Performance, Scalable IB Products
Low-latency Ethernet Switching
Data Center Ethernet Expertise
High Radix & Low Radix Switch Products
Market Leading Compute & Ethernet Products
Platform Expertise
Intel’s Comprehensive Connectivity and Fabric Portfolio
Unprecedented Rate of Innovation in HPC Fabric
Next Front of System Innovation: Fabrics
Connectionless
- Minimal on-adapter state
- No chance of cache misses as the cluster/fabric scales
- Maintains low end-to-end latency, even at scale
PSM Layer
- Performance Scaled Messaging (PSM): a lightweight interface between MPI (Message Passing Interface) and the InfiniBand device driver
- High MPI message rate performance
- Excellent short message efficiency
- Collective performance at scale without requiring special/hardware acceleration
Intel® True Scale HPC Fabric
Key Differentiators
The Advantages of Fabrics Integration
Problem:
• Power – System IO Interface Adds “10s Of Watts” Of Incremental Power
• Cost & Density – More Components On A Server Node
• Scalability – Processor Capacity & Memory Bandwidth Are Scaling Faster Than System IO Bandwidth
Solution:
• Removing The System IO Interface From The Fabrics Solution Reduces Power
• An Integrated Fabric Results In Fewer Components On The Server Node
• An Integrated Fabric Balances Fabric And Compute, Scaling Application Performance & Efficiency
[Diagram: Today: Intel® Processor, System IO Interface (PCIe) at 32 GB/sec, Fabric Controller, Fabric Interface at 10-20 GB/sec. Tomorrow: Intel® Processor with Fabric Controller, Fabric Interface at 100+ GB/sec]
Fabrics Integration Required to Scale Performance & Power
Thank You