
CASE STUDY

    Intel® Xeon® Processor E5-2680
    Intel® True Scale Fabric
    High-Performance Computing

    Infinite performance

    CHALLENGES

    Strong tools. Equip researchers with some of the best resources to conduct complex computational simulations

    Performance testing. Evaluate the performance and scalability of core applications when running on the latest Intel technology

    SOLUTIONS

    Processing power. Servers tested were powered by Intel Xeon processors E5-2680

    Interconnectivity. Server nodes were connected using Intel True Scale Fabric based on a quad data rate (QDR) InfiniBand* network

    TECHNOLOGY RESULTS

    Execution speed. Chroma* and Hybrid Monte Carlo* (HMC*) applications were sped up almost 16-fold when using 16 nodes connected by InfiniBand

    More capacity. Intel True Scale Fabric based on QDR-80 doubles the node bandwidth capacity and increases the message passing interface (MPI) rate by up to a factor of seven¹

    BUSINESS VALUE

    Growth potential. Core applications can run more simulations and support more users on a larger compute cluster underpinned by Intel True Scale Fabric based on QDR-80

    Higher quality. Scientists can produce research of a higher standard, driving the university's competitiveness and reputation

    Striving for excellence

    As a leader in scientific research in Portugal, the University of Coimbra values its computing resources very highly. Paulo Silva, a post-doctoral fellow at the university's Center for Computational Physics, explains: "Computer simulations provide a new way of conducting scientific research, enabling us to understand many phenomena that would be very difficult or impossible to study through traditional experimentation, or where our knowledge of the underlying theory is incomplete." Across the university, but especially in Silva's department, many scientists rely on these computational simulations to conduct their work. "High-performance computing (HPC) is therefore very important for maintaining our high standards of excellence and competitiveness in these fields of research," adds Silva.

    Committed to staying abreast of the latest and most compelling HPC solutions, the university chose to work with Intel to carry out a series of evaluations. The goal was to test the performance of the university's core quantum chromodynamics (QCD) applications on Intel technology, including the Intel Xeon processor E5 family and Intel True Scale Fabric based on InfiniBand.

    The applications to be tested were Chroma, HMC and Landau*. Landau is an application developed by Silva himself, but all three are based on the Chroma library developed by the United States Quantum Chromodynamics (USQCD) organization, a collaboration of scientists developing and using large-scale computers for lattice QCD calculations, which help them understand the results of particle and nuclear physics experiments involving QCD, the theory of quarks and gluons.

    University of Coimbra evaluates performance and scalability benefits of the latest Intel technology

    Since 1290, the University of Coimbra has been one of Portugal's leading higher education institutions. Its 24,500 students are supported by eight faculties, with study subjects ranging from art to engineering. The university is also a founding member of the Coimbra Group of European research universities. Wanting to provide some of the best possible research tools to students and industry research projects, the university tested the new Intel Xeon processor E5-2680 with Intel True Scale Fabric based on InfiniBand* to underpin and connect its server cluster and key applications.

    "A new cluster with powerful processors like the Intel Xeon processor E5-2680, connected by an InfiniBand network, would allow our computational scientists to boost the quality of their work."

    Paulo Silva, Post-Doctoral Fellow, University of Coimbra

Performance plus connectivity

    For the testing with Chroma and HMC, Coimbra University used an infrastructure based at the Intel lab in Swindon, UK. It was composed of a cluster of 16 nodes with Intel Xeon processors E5-2680 and Intel True Scale Fabric based on a QDR InfiniBand network.

    A slightly different set-up was needed for Landau. Typical InfiniBand solutions use only one InfiniBand host channel adapter (HCA) per node. In dual-socket architectures, only one socket, with its integrated PCIe bus, has direct access to the HCA. The other socket must transit the processor's socket-to-socket bus, since it does not have direct access to the first socket's PCIe bus and its attached InfiniBand adapter. This can have a significant impact on the MPI message rate and latency of a compute cluster, and can thereby seriously affect the performance of some applications, such as Landau. Consequently, Landau was tested on a version of Intel True Scale Fabric in QDR-80 configuration, which uses a dual-rail InfiniBand implementation.
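
    The socket-locality cost described above is commonly quantified with a simple ping-pong micro-benchmark. The following C sketch is illustrative only, not the benchmark used in the study: it measures average small-message latency between two MPI ranks, and running it with the ranks pinned to different sockets exposes the extra hop across the socket-to-socket bus to the adapter.

    /* pingpong.c - minimal MPI ping-pong latency sketch (illustrative only).
     * Build: mpicc -O2 pingpong.c -o pingpong
     * Run:   mpirun -np 2 ./pingpong
     */
    #include <mpi.h>
    #include <stdio.h>

    #define ITERS 10000
    #define MSG_BYTES 8   /* small message, typical of latency-bound traffic */

    int main(int argc, char **argv)
    {
        int rank;
        char buf[MSG_BYTES] = {0};

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        MPI_Barrier(MPI_COMM_WORLD);
        double t0 = MPI_Wtime();
        for (int i = 0; i < ITERS; i++) {
            if (rank == 0) {          /* send, then wait for the echo */
                MPI_Send(buf, MSG_BYTES, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
                MPI_Recv(buf, MSG_BYTES, MPI_CHAR, 1, 0, MPI_COMM_WORLD,
                         MPI_STATUS_IGNORE);
            } else if (rank == 1) {   /* echo every message back */
                MPI_Recv(buf, MSG_BYTES, MPI_CHAR, 0, 0, MPI_COMM_WORLD,
                         MPI_STATUS_IGNORE);
                MPI_Send(buf, MSG_BYTES, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
            }
        }
        double t1 = MPI_Wtime();

        if (rank == 0)  /* one-way latency is half the round-trip time */
            printf("avg one-way latency: %.2f us\n",
                   (t1 - t0) / (2.0 * ITERS) * 1e6);

        MPI_Finalize();
        return 0;
    }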

    Compelling results

    During testing, Coimbra University found that the time required to execute the Chroma and HMC applications was reduced more than 15.7-fold when going from one node to 16 nodes, showing that the QDR Intel True Scale Fabric offered near-linear scalability for these applications. This combination of scalability and performance was a key feature of pairing Intel True Scale Fabric with Intel Xeon processor-based technology.
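
    To put that figure in context, a 15.7-fold speed-up on 16 nodes corresponds to a parallel efficiency of roughly 98 percent. This is a back-of-the-envelope calculation from the figures above, not a result reported separately in the study:

    \[
    E = \frac{S_{16}}{16} = \frac{T_1 / T_{16}}{16} \approx \frac{15.7}{16} \approx 0.98,
    \]

    where \(T_1\) and \(T_{16}\) are the execution times on one and 16 nodes.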

    The communication patterns of Chroma and HMC showed a need for extensive small-message MPI throughput and strong collective performance, both of which benefit from Intel True Scale Fabric.
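
    Collective performance matters here because lattice QCD solvers repeatedly form global sums, for example the dot products inside each conjugate gradient iteration. The C sketch below is a hedged illustration of that pattern, not code from Chroma: each rank contributes a local partial result and a tiny MPI_Allreduce combines them, so the operation is bound by latency and message rate rather than bandwidth.

    /* global_sum.c - illustrative global dot-product reduction, the kind of
     * small collective that dominates lattice QCD solver communication.
     * Build: mpicc -O2 global_sum.c -o global_sum
     */
    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        int rank, nranks;
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &nranks);

        /* Stand-in for the dot product over locally owned lattice sites. */
        double local_dot = (double)(rank + 1);

        /* An 8-byte payload: the cost is dominated by latency and message
         * rate, which is why the interconnect matters more than bandwidth. */
        double global_dot = 0.0;
        MPI_Allreduce(&local_dot, &global_dot, 1, MPI_DOUBLE, MPI_SUM,
                      MPI_COMM_WORLD);

        if (rank == 0)
            printf("global dot product across %d ranks: %f\n",
                   nranks, global_dot);

        MPI_Finalize();
        return 0;
    }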

    The Landau application proved to be more communication-intensive, especially in terms of MPI message rate. In such cases, the team at Coimbra University found that having a single InfiniBand card per node may limit scalability. However, as soon as it tested the application on Intel True Scale Fabric in QDR-80 mode, the results again showed a direct link between node count and application performance.

    Intel True Scale Fabric in QDR-80 mode uses two cards per node in a dual-rail configuration, with each Intel True Scale Fabric adapter connected to the PCIe bus associated with one of the processor sockets. This implementation has two main benefits. First, it doubles the bandwidth capacity of a node compared to a single-rail QDR solution. Second, it gives both processor sockets direct access to their attached HCA, improving the MPI performance of the nodes by up to seven times, based on the tests and simulations run by the University of Coimbra. As a result, testing the Landau application on Intel True Scale Fabric in QDR-80 mode showed a performance improvement of as much as 40 percent at 16 nodes.

    Coimbra University also tested the Landau application in a reduced-bandwidth configuration of 20Gbps. This testing showed that the application was not bandwidth-sensitive. The key factor determining performance was the MPI message rate, which is one of the main improvements offered by Intel True Scale Fabric in QDR-80 mode.
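
    Message rate, as distinct from bandwidth, is typically measured by keeping a window of small non-blocking sends in flight and counting completed messages per second. The following C sketch shows the idea; it is an illustrative assumption, not the test Coimbra University ran.

    /* msgrate.c - illustrative small-message rate test between two ranks.
     * Build: mpicc -O2 msgrate.c -o msgrate
     * Run:   mpirun -np 2 ./msgrate
     */
    #include <mpi.h>
    #include <stdio.h>

    #define WINDOW 64     /* messages kept in flight per burst */
    #define BURSTS 1000
    #define MSG_BYTES 8

    int main(int argc, char **argv)
    {
        int rank, nranks;
        char buf[WINDOW][MSG_BYTES] = {{0}};
        MPI_Request req[WINDOW];

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &nranks);
        if (nranks != 2) {
            if (rank == 0) fprintf(stderr, "run with exactly 2 ranks\n");
            MPI_Abort(MPI_COMM_WORLD, 1);
        }

        MPI_Barrier(MPI_COMM_WORLD);
        double t0 = MPI_Wtime();
        for (int b = 0; b < BURSTS; b++) {
            for (int i = 0; i < WINDOW; i++) {
                if (rank == 0)
                    MPI_Isend(buf[i], MSG_BYTES, MPI_CHAR, 1, i,
                              MPI_COMM_WORLD, &req[i]);
                else
                    MPI_Irecv(buf[i], MSG_BYTES, MPI_CHAR, 0, i,
                              MPI_COMM_WORLD, &req[i]);
            }
            MPI_Waitall(WINDOW, req, MPI_STATUSES_IGNORE);
            MPI_Barrier(MPI_COMM_WORLD);  /* keep both ranks in step */
        }
        double t1 = MPI_Wtime();

        if (rank == 0)
            printf("message rate: %.0f msgs/s\n",
                   (double)WINDOW * BURSTS / (t1 - t0));

        MPI_Finalize();
        return 0;
    }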

    Research improvements

    "For these QCD applications, Intel True Scale Fabric offers an effective way to scale up the capacity of the applications, enabling us to benefit from the performance of the Intel Xeon processor E5-2680," Silva observes. "In very particular cases, the application needs high-message-rate performance in order to scale, for which using a single InfiniBand card can be a limiting factor. For such cases, Intel has developed Intel True Scale Fabric in QDR-80 mode, which enables the performance of applications to be scaled up with additional nodes in the compute cluster."

    He adds: "For large parallel computer simulations, a good interconnecting network is critical for the scalability of the applications we use. Without the Intel True Scale Fabric version of InfiniBand, the use of a large number of cores would not have the same benefit for the performance of our applications."

    "A new cluster with powerful processors like the Intel Xeon processor E5-2680, connected by an InfiniBand network, would allow our computational scientists to boost the quality of their work," he concludes. The university hopes to implement such a solution soon, with a view to creating a high-quality computing cluster that would enable it to take part in the Europe-wide PRACE* HPC grid, as well as benefiting its own researchers.

    Find the solution that's right for your organization. Contact your Intel representative, visit Intel's Business Success Stories for IT Managers (www.intel.co.uk/Itcasestudies) or explore the Intel.co.uk IT Center (www.intel.co.uk/itcenter).

    Lessons learned

    Coimbra University needs to provide its researchers with the optimum tools to carry out competitive research. In-depth testing showed the university that while strong performance is critical, its impact can be further enhanced by adding scalable interconnectivity through InfiniBand technology.

    Leading research center demonstrates the combined benefits of Intel technology and InfiniBand for HPC

    Copyright © 2013 Intel Corporation. All rights reserved. Intel, the Intel logo, Intel Xeon and Xeon Inside are trademarks of Intel Corporation in the U.S. and other countries. This document and the information given are for the convenience of Intel's customer base and are provided AS IS WITH NO WARRANTIES WHATSOEVER, EXPRESS OR IMPLIED, INCLUDING ANY IMPLIED WARRANTY OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, AND NONINFRINGEMENT OF INTELLECTUAL PROPERTY RIGHTS. Receipt or possession of this document does not grant any license to any of the intellectual property described, displayed, or contained herein. Intel products are not intended for use in medical, lifesaving, life-sustaining, critical control, or safety systems, or in nuclear facility applications.

    ¹ Intel does not control or audit the design or implementation of third party benchmark data or Web sites referenced in this document. Intel encourages all of its customers to visit the referenced Web sites or others where similar performance benchmark data are reported and confirm whether the referenced benchmark data are accurate and reflect performance of systems available for purchase.

    Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations, and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products. For more information go to http://www.intel.com/performance

    *Other names and brands may be claimed as the property of others. 0213/JNW/RLC/XX/PDF 328732-001EN