Clouds for HPC? – slide deck transcript
Source: people.cs.bris.ac.uk/~simonm/conferences/isc09/lippert.pdf
TRANSCRIPT
Clouds for HPC? Potential? Challenges?
Thomas Lippert
Institute for Advanced Simulation, Jülich Supercomputing Centre
Session: Cloud Computing and HPC – Synergy or Competition?
ISC09, June 24, 2009
HPC
[Pyramid diagram: the European HPC service tiers, with approximate system counts per tier]
Tier-0 (≈1): European services – PRACE Principal Partners (Europe)
Tier-1 (≈10): national services – Gauss Centre General Partners (Germany)
Tier-2 (≈100): regional and topical services – Gauss Alliance
Tier-3 (≈1000): local services – HPC Cloud? Grid
Hardware Aspects
Leadership HPC systems: similar to large experimental projects
Machine life cycle of 3 to 5 years
Time scale of know-how: 15 to 30 years
Usage: 24 hours a day, 7 days a week
Most industries are more than 6 years behind leadership HPC
Tendency: "full transparency" of the machine becomes more and more of a utopia
Users need to know their machine like a physicist needs to know math:
assembler, SSE, MPI, parallelization strategy, scalability (see the sketch below)
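To make the last point concrete, here is a minimal sketch (not from the talk) of the SSE-level awareness the slide alludes to: the same reduction written as plain C and as an SSE2-vectorized loop, where the programmer has to know the vector width, the alignment rules and the instruction set of the target machine.

```c
/* Illustrative sketch only: a scalar sum vs. an SSE2 sum that
 * processes two doubles per instruction. */
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stdio.h>

/* Plain scalar sum. */
double sum_scalar(const double *x, int n) {
    double s = 0.0;
    for (int i = 0; i < n; i++)
        s += x[i];
    return s;
}

/* SSE2 sum: two doubles per add; assumes n even and x 16-byte aligned. */
double sum_sse(const double *x, int n) {
    __m128d acc = _mm_setzero_pd();
    for (int i = 0; i < n; i += 2)
        acc = _mm_add_pd(acc, _mm_load_pd(&x[i]));
    double tmp[2];
    _mm_storeu_pd(tmp, acc);
    return tmp[0] + tmp[1];
}

int main(void) {
    /* 16-byte aligned buffer so the aligned load is legal. */
    __attribute__((aligned(16))) double x[8] = {1, 2, 3, 4, 5, 6, 7, 8};
    printf("scalar: %g  sse: %g\n", sum_scalar(x, 8), sum_sse(x, 8));
    return 0;
}
```

Compiled with e.g. `gcc -O2 -msse2`, both functions return the same sum; the point is that exploiting the vector units requires explicit knowledge of the hardware.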
JUGENE, JuRoPA + HPC-FF @ Jülich
JUGENE (highest scalability): IBM Blue Gene/P, 72 racks, 294912 cores, 1 Petaflop/s peak, 144 TByte memory
JuRoPA (general-purpose HPC): SUN-blade cluster, 2208 nodes, 17664 cores, 207 TF peak, Intel Nehalem, 48 GB memory, Infiniband QDR (SUN M9), ParaStation Cluster-OS
HPC-FF (HPC for Fusion): Bull NovaScale R422-E2 cluster, 1080 nodes, 8640 cores, 101 TF peak, Intel Nehalem, 24 GB memory, Infiniband QDR (Mellanox), ParaStation Cluster-OS
Storage: 6 PByte disks, 25 PByte tape capacity
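As a back-of-envelope check on the quoted peak figures: theoretical peak is cores × clock × flops per cycle. Assuming 2.93 GHz Nehalem parts sustaining 4 double-precision flops per cycle (the clock rate is an assumption, not stated on the slide), the arithmetic reproduces both cluster numbers:

```c
/* Back-of-envelope check of the quoted peak performance figures.
 * Assumption (not on the slide): 2.93 GHz Nehalem cores sustaining
 * 4 double-precision flops per cycle. */
#include <stdio.h>

static double peak_tflops(long cores, double ghz, int flops_per_cycle) {
    return cores * ghz * flops_per_cycle / 1000.0;  /* GFlop/s -> TF */
}

int main(void) {
    printf("JuRoPA: %.0f TF\n", peak_tflops(17664, 2.93, 4)); /* ~207 TF */
    printf("HPC-FF: %.0f TF\n", peak_tflops(8640, 2.93, 4));  /* ~101 TF */
    return 0;
}
```

This prints roughly 207 TF for JuRoPA and 101 TF for HPC-FF, matching the slide.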
Needs of HPC Users
Effective usage of tier-0, tier-1 and tier-2 systems requires high-level support structures
Jülich: more than 50% of staff work as domain scientists, mathematicians and computer scientists in simulation labs
Support; research; community oriented; integrated in the community
Parallelization has come closer to theory and model
Simulation Labs
[Organizational diagram: a core group embedded in an associated community]
Core group tasks: disciplinary research, supportive tasks, outreach
Associated community: members and groups
SL Steering Committee
Expansion towards distributed European SLs
Simulation Labs @ FZJ and KIT
Simulation Laboratories (new): Earth & Environment, Plasma Physics, Energy, Biology, Molecular Systems, NanoMikro, Astro-Particle
Cross-Sectional Teams: Methods & Algorithms, Parallel Performance, Distributed Computing
Research Groups: Quantum Information, NIC Group
Education & Training Programmes
Example: Simulation Lab Biology
Research: protein folding & interaction, structure prediction, systems biology
Support: libraries, bio databases, benchmarking; Monte Carlo, FFT docking, machine learning (see the sketch below)
Codes: PROFASI, SMMP
Outreach: FZJ biological institutes (ISB, INM), Helmholtz Groups; regional: ABC of Life Science Informatics; international: UC Berkeley, Michigan Tech
[Illustration: protein 1LQ7]
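To make the "Monte Carlo" support item concrete, here is a minimal Metropolis step in a toy one-dimensional potential. This is an illustrative sketch of the method class only; it is not code from PROFASI or SMMP, and energy() is a hypothetical stand-in for a real force field:

```c
/* Minimal Metropolis Monte Carlo -- illustrative sketch only.
 * Compile with: cc mc.c -lm */
#include <stdio.h>
#include <stdlib.h>
#include <math.h>

static double energy(double x) {        /* toy double-well potential */
    return (x * x - 1.0) * (x * x - 1.0);
}

int main(void) {
    double x = 0.0, beta = 2.0;         /* state, inverse temperature */
    srand(42);
    for (int step = 0; step < 100000; step++) {
        double trial = x + 0.5 * (2.0 * rand() / RAND_MAX - 1.0);
        double dE = energy(trial) - energy(x);
        /* Metropolis criterion: always accept downhill moves,
         * accept uphill moves with probability exp(-beta * dE). */
        if (dE <= 0.0 || rand() / (double)RAND_MAX < exp(-beta * dE))
            x = trial;
    }
    printf("final state: %g (E = %g)\n", x, energy(x));
    return 0;
}
```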
National and European User Group
[Pie charts: funding programmes and research areas of compute-time users]
Programmes: DEISA, I3HP, Jülich Initiative, Soft Matter Composites, other
Fields: chemistry, many-particle physics, elementary particle physics, biology/biophysics, material science, soft matter, other
Proposals for computer time accepted from Germany and Europe
Peer review by international referees
Cloud @ Jülich
[The tier pyramid from above, repeated: Tier-0 PRACE down to Tier-3 local services, now with SoftComp @ JSC marked at the Tier-3 / local-services level, alongside Grid]
Users and Members of SoftComp
Member groups of the NoE SoftComp:
fzjg: Forschungszentrum Jülich
jogu: Johannes Gutenberg Universität Mainz
scr: Schlumberger Cambridge Research Limited
unid: Heinrich-Heine Universität Düsseldorf
upv: Universidad del Pais Vasco / Euskal Herriko Unibertsitatea
utcdr: University of Twente
uutr: Utrecht University
ulcrl: Unilever UK Central Resources Limited
German Federal Ministry for Education and Research (BMBF):
promotion of applications in economy and science by grid infrastructures
Our Cloud Computer: SoftComp Linux Cluster
125 compute nodes (500 cores), 2.5 TF
Heterogeneous, AMD Opteron
IB and GigE, ParaStation, Unicore
Access (UNICORE)
Open, extensible, interoperable
Strong security, workflow support, powerful clients, application integration
Widely established in academia (D-Grid, DEISA, EGI, SKIF-GRID, SoftComp) and industry (T-Systems, Philips, 52° North)
Average downloads per month: ~1200
Management of scientific data with metadata and scalable storage
User-defined execution environments with clouds and virtualisation
Parallel Codes
Why parallel?
Results in a shorter time
More precise results
Local system memory not sufficient
But: programs have to be adapted to run in parallel mode (see the sketch below)
Required:
Support of SimLab and cross-cutting groups
Mathematics
Performance analysis
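A sketch of what "adapting a program to run in parallel mode" typically means in the simplest case: distributing a loop across MPI ranks and combining partial results with a reduction. This is an illustrative pattern, not a specific JSC code:

```c
/* Sketch of adapting a serial summation to MPI: each rank works on a
 * strided subset of the index range, partial results are combined with
 * MPI_Reduce. Real codes also need domain decomposition, parallel I/O
 * and load balancing. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    const long N = 100000000;           /* global problem size */
    double local = 0.0, global = 0.0;
    /* Each rank sums its own strided subset of 1..N. */
    for (long i = rank + 1; i <= N; i += size)
        local += 1.0 / (double)i;       /* toy workload: harmonic sum */

    MPI_Reduce(&local, &global, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);
    if (rank == 0)
        printf("harmonic sum H_%ld = %.6f\n", N, global);
    MPI_Finalize();
    return 0;
}
```

Run with e.g. `mpicc sum.c -o sum && mpirun -np 8 ./sum`; more ranks shorten the time to solution and spread the memory footprint, the first and third motivations listed above.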
Lessons Learnt: Challenges for the HPC Cloud Provider
To serve beyond tier-3 and desktop applications, a cloud provider must
• offer leading-edge tier-3, -2, -1 and -0 high-performance systems
• guarantee absolute security
• guarantee absolute privacy
• care for long-term data storage and curation
• guarantee uninterrupted service for critical applications
• actively offer highest-level support and research for science communities and industry
• provide a SimLab-like support structure for HPC