

ICT COMPETITIVENESS WEEK

HPC AND BIG DATA

WORKSHOP REPORT 17-20 SEPTEMBER 2012

RAPPORTEUR: JOHN BARR, INDEPENDENT ANALYST ([email protected])

Version 0.1, 26 September 2012, John Barr

INTRODUCTION

This report is a summary of the presentations, discussions and conclusions of the HPC and Big Data breakout session held during the ICT Competitiveness Week, 17-20 September 2012, in Brussels.

High Performance Computing is an instrument for innovation. Development of the European HPC industry and greater use of HPC systems, services and applications provide an opportunity for Europe to increase its competitiveness in world markets.

HPC is used to simulate complex processes and has been described as the third pillar of science, complementing theory and experimentation. HPC provides a better understanding of the fundamental sciences, supports more accurate weather and climate predictions, helps develop breakthrough medical treatments and enables the design, development and manufacture of more efficient, effective and profitable products ranging from widgets to computers to aircraft.

TRENDS

The transition to the next generation of supercomputers [1] is not simply more of the same, only bigger and faster; it represents a discontinuity in the technology deployed by the ICT industry. This discontinuity poses significant risks for the incumbent (mainly American) HPC vendors, and offers a realistic opportunity for the European HPC industry to expand its offerings.

[1] Current high end supercomputers deliver performance measured in Petaflop/s (10^15 floating point calculations per second, or flop/s). The next generation of supercomputers is known as Exascale; it will deliver Exaflop/s (10^18 flop/s) and will process Exabytes of data. The major stumbling block to reaching this goal by the end of the decade is power consumption, which has increased faster than performance in recent years. An improvement in power efficiency in excess of a factor of 50 is needed if Exascale systems are to be viable. A reduction in power consumption of this order has implications that change the way electronic components are built, how they are connected, and how the resultant systems are programmed.
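As a rough illustration of the scale of that power challenge (the figures are indicative, and the 20 MW envelope is a widely cited Exascale target rather than a number given in this report): leadership-class systems of this period deliver on the order of 10 Petaflop/s for on the order of 10 MW, roughly 1 Gigaflop/s per watt. An Exaflop/s system held within a 20 MW envelope would need 10^18 flop/s / (2 x 10^7 W) = 5 x 10^10 flop/s per watt, or 50 Gigaflop/s per watt, an improvement of roughly a factor of 50.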

Scalability is now extremely important for both entry level and high end HPC systems. At the entry level, new generations of microprocessors deliver increased performance by using more and more processor cores. While the software technology to exploit these cores exists today, few HPC applications are written to take advantage of it, and few programmers have the skills to do so. At the high end, Exascale systems will require applications that exploit millions of parallel threads, but the software development tools required to build such applications do not yet exist.
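To make the entry-level case concrete, the following minimal C sketch (illustrative only, not taken from the workshop material) uses OpenMP, one of the established shared-memory programming standards, to spread a simulation-style loop across all available processor cores. The point of the paragraph above is that, while such tools exist, comparatively few application codes and programmers use them, and nothing comparable yet exists for the millions of threads an Exascale application will need.

#include <stdio.h>
#include <omp.h>

#define N 10000000L

int main(void) {
    static double data[N];
    double sum = 0.0;

    /* The pragma splits the loop iterations across the available cores;
       the reduction clause combines the per-thread partial sums safely. */
    #pragma omp parallel for reduction(+:sum)
    for (long i = 0; i < N; i++) {
        data[i] = 0.5 * (double)i;   /* stand-in for real simulation work */
        sum += data[i];
    }

    printf("cores available: %d, sum = %e\n", omp_get_max_threads(), sum);
    return 0;
}

Built with, for example, gcc -fopenmp, the same source runs one thread per core with no further changes; the difficulty lies in restructuring real applications, with their long-lived data structures and algorithms, so that such directives actually help.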

The emergence of Big Data in mainstream computing has been driven by the availability of exponentially increasing volumes of real time and archive data from the internet, social networks, pervasive sensors, scientific instruments, medical records, media archives and large scale e-commerce. Updated analytical approaches are required in order to understand and process Big Data. Scientists, engineers and other long-time users of HPC platforms have a set of highly scalable tools and technologies that can profitably be redeployed to tackle the Big Data problem.
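As a hypothetical sketch of what redeploying those tools can mean in practice (the example and its details are illustrative, not taken from the workshop): the message-passing standard MPI, long used in scientific computing, already expresses the pattern at the heart of much Big Data analytics, namely scanning partitioned data in parallel and then combining the partial results.

#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);

    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* Each rank scans its own shard of the data; the trivial filter
       below stands in for real record processing. */
    long local_count = 0;
    for (long i = rank; i < 1000000; i += size) {
        if (i % 7 == 0) local_count++;
    }

    /* A single collective combines the partial results. */
    long total = 0;
    MPI_Reduce(&local_count, &total, 1, MPI_LONG, MPI_SUM, 0, MPI_COMM_WORLD);

    if (rank == 0) printf("matching records: %ld\n", total);

    MPI_Finalize();
    return 0;
}

Launched with, for example, mpirun -np 8 on a hypothetical binary built from this source, the same code runs unchanged from a workstation to thousands of nodes, which is the scalability the Big Data problem calls for.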

One option for providing a cost-effective infrastructure to process Big Data is to use Cloud Computing facilities. This is, indeed, a potential solution for some classes of problems, and can be cost-effective – but it is not a panacea for all large, complex problems. Some problems require regular communication between the many nodes in a parallel processing system, but most Cloud implementations do not use high performance networks, and cannot guarantee the location of neighboring components of a calculation. Other problems can be broken into many independent, parallel components and so appear to be well suited to a Cloud environment, but they require vast quantities of data to be transferred to and from the Cloud – which takes longer than the computation itself. The class of application that does run effectively in the Cloud is the embarrassingly parallel problem: one that is very easy to separate into a number of parallel tasks, and that operates on relatively small data sets.
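A rough, hypothetical illustration of that transfer cost (the figures are chosen for the example and do not come from the workshop): moving a 10 Terabyte data set over a sustained 1 Gbit/s connection takes 8 x 10^13 bits / 10^9 bits per second = 80,000 seconds, roughly 22 hours, in each direction, and for many such workloads the computation itself finishes in far less time than the two transfers.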

An ideal e-Infrastructure to address European needs for High Performance Computing would include capability systems (i.e. high end supercomputers that can meet the needs of the most onerous applications), capacity systems (i.e. systems that handle the throughput of a large number of smaller, related tasks; this can include Cloud Computing), and application or topic specific platforms that meet the particular needs of some domains (e.g. life sciences or weather/climate forecasting).

PROBLEM STATEMENT

At the core of High Performance Computing are parallel processing and the handling of large data sets. Today's low cost, multicore processors and the broad emergence of Big Data bring the potential of HPC to the masses, but the majority of staff, especially in SMEs, lack the skills required to exploit it.

Europe has a strong software base, but it needs to be modernized to respond to the needs of new applications. Furthermore, applications typically have a life of 10 or 20 years, making it difficult to fully exploit the latest hardware without a significant, costly, reengineering exercise.

Europe increasingly relies on HPC. It uses 24% of worldwide HPC systems, but supplies less than 5% of these systems. Europe is therefore exporting skills, jobs and revenue in a strategically important technology segment.

SWOT ANALYSIS

STRENGTHS

Europe has leading capabilities in the design of power-efficient microelectronics. This has been driven by the needs of the mobile market, but the requirements of the next generation of HPC systems include a dramatic reduction in power consumption, bringing a new opportunity to European industry.

Europe is a major producer of leading HPC software tools, libraries and applications in areas such as climate modelling, life sciences, engineering simulation and in the fundamental sciences.

PRACE provides services on leadership-class HPC systems to enable high impact scientific discovery and engineering research and development to enhance European competitiveness for the benefit of society. PRACE delivers world class computing and data management resources and services, complemented by training and collaboration with the HPC supply industry.

WEAKNESSES

Europe has been under-investing in HPC compared not only to the US, which is the leading supplier and user of HPC systems, but also to Japan and to emerging HPC nations such as China and India.

Support from the HPC research community for industry is limited, with access to large scale HPC systems and technology transfer used more as research tools than in support of production capabilities.


The low cost of today's multicore processors, combined with the need to process pervasive Big Data, brings the need for HPC to a much wider audience. However, the number of staff with HPC skills and understanding is small, which limits Europe's ability to exploit this opportunity.

OPPORTUNITIES

Targeted pre-commercial procurement, co-design and the European Technology Platform ETP4HPC could energise the European HPC supply side as Exascale technologies are developed between now and the end of the decade.

Disruptive Exascale technologies can cascade into mainstream ICT including enterprise data centres and mobile devices. A Petascale capability in a single, affordable cabinet (compared with the tens of millions of Euros and hundreds of cabinets required today) can transform the use of high end HPC in industry.

We have too much data and struggle to make sense of much of it! The automated use of HPC facilities is required to turn data into information, and information into knowledge.

THREATS

Established and emerging countries are investing heavily in HPC. For example, Japan has built the largest HPC system in the world, while Russia, China and India have all announced significant investments in their HPC technology programs.

Adoption of HPC by SMEs has a costly initial hurdle to overcome. There is also a 'dark demand' for HPC within the SME community: many SMEs could benefit from the use of HPC, but do not understand its potential or do not have the skills to leverage it.

KEY FACTORS OF SUCCESS

The provision of HPC skills and capabilities, aligned with vertical expertise within geographical regions, can bring the benefits that HPC delivers to a wider SME audience.

Application development is a European strength. However, unless these applications respond to the needs of knowledge production from Big Data and unless they integrate features available in the latest HPC systems, we will fall behind our international competitors.

The range of native European HPC components, systems, software and services needs to be expanded through the use of Pre-Commercial Procurement (PCP) and a strong R&D program for HPC. An ambitious industrial strategy from the ETP4HPC platform should guide these R&D efforts.

RECOMMENDATIONS

These three recommendations are part of a holistic plan to increase demand for HPC, provide better support for SMEs, and increase the capability of the European HPC supply industry.

1. Promote European and national/regional support for the many thousands of SMEs that can profitably leverage HPC in market segments such as banking, manufacturing and life sciences, including locally delivered expertise for SMEs in order to close the skills gap.


2. Establish a program of application modernisation for compute- and data-intensive simulations in traditional market segments such as engineering, as well as in emerging areas that process Big Data. This will ensure that applications can leverage the best of new technologies in order to remain competitive.

3. Invigorate the European HPC supply industry through the use of PCP and a fully-fledged R&D program for HPC. The next generation of HPC combines Big Compute and Big Data: it will deliver Exascale performance in supercomputers, Petascale performance at a departmental level, and a Terascale capability in your pocket.