extreme scale parallel and distributed systems – high performance computing systems current no. 1...

Extreme scale parallel and distributed systems

– High performance computing systems• Current No. 1 supercomputer Tianhe-2 at 33.86 petaflops• Pushing toward exa-scale computing by 2020, 32 times bigger than Tianhe-

2 (almost need to double the speed every year).• Many issues ranging from applications to systems such power, resilience,

networking, applications.

– Cloud computing data centers: Amazon EC2• Hugh push to move computing/storage to the cloud computing

infrastructure• Extreme scale to achieve the scale of economics• Applications are more diverse

– Networking infrastructure needs significant improvement– Security

– Big data platforms: hadoop cluster?• Huge hype• Not clear what is beyond the traditional HPC and cloud

computing platforms.

Issues related to extreme scale systems

• How to use the systems– Programming paradigms• What changes when the scale becomes big?

• How to build the systems– Hardware and systems issues• What changes when the scale becomes big?

Programming for extreme scale PDS

• Ease of use .vs. performance

• Distributed memory programming– Message Passing Interface (MPI)– Mapreduce (Hadoop)

• Hybrid shared memory and distributed memory programming– Matching the architecture -- CMP+SMP clusters– Hybrid OpenMP+MPI

• GPU/MIC programming and hybrid programming– More potential to achieve exa-scale within power limit– GPU, MIC– Hybrid GPU/MIC + MPI

Architecture/interconnects

• Extreme-scale PDSs are Internet-in-a-building– Traditional networking issues: topology, routing, flow

control, congestion control

Architecture/interconnects

• Current and Emerging network architectures– InfiniBand and 10/100-G E (technology)– Openflow and software defined networks

(network architecture)

– Recent topology/routing proposals for extreme scale systems• Achieving performance requirement with the budget

constraints.

System software and communication sub-systems

– Parallel IO systems– Topology aware job allocation and node mapping– Communication protocols– One-sided .vs. two-sided communications– Collective communication algorithms

Performance models and evaluation methods

• Performance modeling techniques for networks/systems/applications.

• Workload characterization.• Application tracing• Challenges in simulation and modeling of large

scale systems using realistic workloads

Resilience and power-awareness

• System and application resilience techniques and analysis

• Fault tolerance techniques in hardware and software

• Resource management for system resilience and availability.

• Energy efficient HPC• Energy efficient data centers

This course

• Targets students who are interested in research and development in large scale sytems.– Go through the recent advances in these subjects,

and bring you up-to-date in research in this area in general.

– Introduce software, algorithmic, and analytical tools and techniques that are necessary to perform research in this area.

extreme scale parallel and distributed systems – high performance computing systems current no. 1...

Documents

a 1.7 petaflops warm-water- cooled system: operational...

tianhe oil group is a comprehensive enterprise of equipment...

the blockchain identity - global risk institute...original...

your tefl capstone project guangzhou tianhe district markham...

sign&depth room 729a, tianhe district, guangzhou,guangdong,...

tianhe-pdf (1)

hybrid system & application - · pdf filenational university...

iasbaba’s prelims current affairs · iasbaba prelims...

integrating phonics into english class september 15, 2015...

hack the gibson - f-secure labs€¦ · 12/9/2013 ·...

report on the tianhe-2a system - icl · 2017-09-29 ·...

protein explorer: a petaflops special purpose computer...

beyond teraflops toward petaflops 2 october 2002...

erik p. debenedictis sandia national laboratories may 16,...

tianhe-2 report by prof. jack dongarra

world’s fastest supercomputer | tianhe - 2

tianhe oil group huifeng petroleum equipment co.pdf2

tianhe sep 2014

architectural considerations for petaflops and beyond bill...

picture of tianhe, the most powerful computer in the world...