April 10, 2008, Garching
Claudio Gheller, CINECA ([email protected])
The DEISA HPC Grid for Astrophysical Applications
Disclaimer
My background:
Computer science in astrophysics
My involvement in DEISA:
Support to scientific extreme computing projects (DECI)
I’m not:
A systems expert
A networking expert
Conclusions
DEISA is not Grid computing
It is (super) super computing
The DEISA project: overview
What it is:
DEISA (Distributed European Infrastructure for Supercomputing Applications) is a consortium of leading national EU supercomputing centres.
Goals:
Deploy and operate a persistent, production-quality, distributed supercomputing environment with continental scope.
When:
The project is funded by the European Commission from May 2004 to April 2008. It has been re-funded as DEISA2 from May 2008 to April 2010.
The DEISA project: drivers
o Support High Performance Computing.
o Integrate Europe’s most powerful supercomputing systems.
o Enable scientific discovery across a broad spectrum of science and technology.
o Make the best use of the resources at both site level and European level.
o Promote openness and the use of standards.
The DEISA project: what it is NOT
o DEISA is not a middleware development project.
o DEISA, actually, is not a Grid: it does not support Grid computing. Rather it supports Cooperative Computing.
The DEISA project: core partners
BSC, Barcelona Supercomputing Centre, Spain
CINECA, Consorzio Interuniversitario, Italy
CSC, Finnish Information Technology Centre for Science, Finland
EPCC/HPCx, University of Edinburgh and CCLRC, UK
ECMWF, European Centre for Medium-Range Weather Forecasts, UK
FZJ, Research Centre Juelich, Germany
HLRS, High Performance Computing Centre Stuttgart, Germany
LRZ, Leibniz Rechenzentrum Munich, Germany
RZG, Rechenzentrum Garching of the Max Planck Society, Germany
IDRIS, Institut du Développement et des Ressources en Informatique Scientifique, CNRS, France
SARA, Dutch National High Performance Computing, Netherlands
The DEISA project: Project Organization
Three activity areas
Networking:
management, coordination and dissemination
Service Activities:
running the infrastructure
Joint Research Activities:
porting and running scientific applications on the DEISA infrastructure
DEISA Activities, some (maybe too many…) details (1)
Service Activities:
• Network Operation and Support (FZJ leader). Deployment and operation of a gigabit-per-second network infrastructure for a European distributed supercomputing platform.
• Data Management with Global File Systems (RZG leader). Deployment and operation of global distributed file systems, as basic building blocks of the "inner" super-cluster, and as a way of implementing global data management in a heterogeneous Grid.
• Resource Management (CINECA leader). Deployment and operation of global scheduling services for the European super-cluster, as well as for its heterogeneous Grid extension.
• Applications and User Support (IDRIS leader). Enabling the adoption by the scientific community of the distributed supercomputing infrastructure, as an efficient instrument for the production of leading computational science.
• Security (SARA leader). Providing administration, authorization and authentication for a heterogeneous cluster of HPC systems, with special emphasis on single sign-on.
DEISA Activities, some (maybe too many…) details (2)
Scientific Applications Activities:
• JRA1: Material Science (RZG leader)
• JRA2: Cosmology (EPCC leader)
• JRA3: Plasma Physics (RZG leader)
• JRA4: Life Science (IDRIS leader)
• JRA5: Industry (CINECA leader)
• JRA6: Coupled Applications (IDRIS leader)
• JRA7: Access to Resources in Heterogeneous Environments (EPCC leader)
The DEISA Extreme Computing Initiative (DECI)
See http://www.deisa.org/applications
JRA2: Cosmological Applications
Goals:
• to avail the Virgo Consortium of the most advanced features of Grid computing by porting their production applications
– GADGET and FLASH
• to make effective use of the DEISA infrastructure
• to lay the foundations of a Theoretical Virtual Observatory
• Led by EPCC, which works in close partnership with the Virgo Consortium
– JRA2 managed jointly by Gavin Pringle (EPCC/DEISA) and Carlos Frenk (co-PI of both Virgo and VirtU)
– work progressed after gathering clear user requirements from the Virgo Consortium.
– requirements and results published as public DEISA deliverables.
Current DEISA status
• variety of systems connected via GÉANT/GÉANT2 (Premium IP)
• centres contribute 5% to 10% of CPU cycles to DEISA
– running projects selected from the DEISA Extreme Computing Initiative (DECI) calls
Premium IP is a service that gives traffic priority over other traffic on GÉANT: Premium IP traffic takes priority over all other services.
DEISA HPC systems
IDRIS IBM P4
ECMWF IBM P4
FZJ IBM P4
RZG IBM P4
HLRS NEC SX8
HPCX IBM P5
SARA SGI ALTIX
LRZ SGI ALTIX
BSC IBM PPC
CSC IBM P4
CINECA IBM P5
DEISA technical hints: software stack
• UNICORE is the grid “glue”
– not built on Globus
– EPCC developing UNICORE command-line interface
• Other components
– IBM’s General Parallel File System (GPFS)
  multicluster GPFS can span different systems over a WAN
  recent developments for Linux as well as AIX
– IBM’s Load Leveler for job scheduling
  Multicluster Load Leveler can re-route batch jobs to different machines
  also available on Linux
DEISA model
• large parallel jobs running on a single supercomputer
– network latency between machines not a significant issue
• jobs submitted, ideally, via UNICORE; in practice via Load Leveler
– re-routed where appropriate to remote resources
• Single-Sign-On access via GSI-SSH
• GPFS absolutely crucial to this model
– jobs have access to data no matter where they run
– no source code changes required
• standard fread/fwrite (or READ/WRITE) calls to Unix files
• also have a Common Production Environment
– defines a common set of environment variables
– defined locally to map to appropriate resources
• e.g. $DEISA_WORK will point to local workspace
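A minimal sketch of what "no source code changes" looks like from the application side, written in Python rather than the C or Fortran a production code would use. Only `$DEISA_WORK` comes from the slide; the helper names, the fallback directory and the file name are assumptions made for the illustration.

```python
import os
import tempfile


def workspace_dir(env=None):
    """Return the directory named by DEISA_WORK, or a temp dir if it is unset."""
    env = os.environ if env is None else env
    # The fallback is an assumption so that the sketch also runs outside DEISA.
    return env.get("DEISA_WORK", tempfile.gettempdir())


def write_and_read_back(directory, name, payload):
    """Plain file I/O: nothing here depends on where the file system lives."""
    path = os.path.join(directory, name)
    with open(path, "wb") as handle:
        handle.write(payload)
    with open(path, "rb") as handle:
        return handle.read()


if __name__ == "__main__":
    data = write_and_read_back(workspace_dir(), "snapshot_000.bin", b"\x00" * 16)
    print(len(data))
```

The application only ever sees a directory name and ordinary files, which is why the same code can run unchanged wherever the variable is defined.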
Running ideally on DEISA
• Fill all the gaps
• restart/continue jobs on any machine from file checkpoints
– no need to recompile application program
– no need to manually stage data
• multi-step jobs running on multiple machines
• easy access to data for post-processing after a run
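The restart-from-checkpoint pattern in the list above can be sketched as follows. This is a toy under stated assumptions: the JSON checkpoint format, the function names and the step counts are invented for the illustration and are not taken from any DEISA application.

```python
import json
import os
import tempfile


def load_checkpoint(path):
    """Return the saved state, or a fresh state if no checkpoint exists yet."""
    if os.path.exists(path):
        with open(path) as handle:
            return json.load(handle)
    return {"step": 0, "value": 0.0}


def save_checkpoint(path, state):
    """Write to a temporary file, then rename, so a crash leaves no partial file."""
    tmp_path = path + ".tmp"
    with open(tmp_path, "w") as handle:
        json.dump(state, handle)
    os.replace(tmp_path, path)


def run(path, last_step, max_steps_this_job):
    """Advance a toy computation until last_step or until this job's budget is spent."""
    state = load_checkpoint(path)
    done = 0
    while state["step"] < last_step and done < max_steps_this_job:
        state["value"] += 1.0  # stand-in for one real simulation step
        state["step"] += 1
        done += 1
        save_checkpoint(path, state)
    return state


if __name__ == "__main__":
    ckpt = os.path.join(tempfile.mkdtemp(), "run.ckpt")
    print(run(ckpt, last_step=10, max_steps_this_job=4))    # first job stops early
    print(run(ckpt, last_step=10, max_steps_this_job=100))  # second job resumes
```

If the checkpoint lives on a file system visible from every site, the second call could just as well be a job running on a different machine.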
Running on DEISA: Load Leveler
[Diagram: a job and the DEISA sites (IDRIS, ECMWF, FZJ, RZG, HLRS, HPCX, LRZ, CSC, CINECA, SARA, BSC), each labelled with its operating system and batch system. Labels shown: AIX with LL or LL-MC, Super-UX with NQS II, Linux with LSF, PBS Pro or LL.]
Running ideally on DEISA: Unicore
[Diagram: every DEISA site (IDRIS, FZJ, RZG, HLRS, CINECA, SARA, HPCX, LRZ, CSC, ECMWF, BSC) is shown with a UNICORE Gateway, an NJS with its IDB and UUDB, and its local operating system and batch system (AIX with LL or LL-MC, Super-UX with NQS II, Linux with LSF, PBS Pro or LL).]
GPFS Multicluster
[Network diagram: the national research networks SURFnet, UKERNA, FUNET, RedIris, GARR, RENATER and DFN are connected through GÉANT2 by 1 Gb/s and 10 Gb/s links. Legend: dedicated 10 Gb/s wavelength; 1 Gb/s LSP; old 1 Gb/s LSP (will be removed soon); dedicated 10 Gb/s wavelength (in preparation).]
HPC systems mount /deisa/sitename
users read/write directly from/to these file systems
/deisa/idr
/deisa/cne
/deisa/rzg
/deisa/fzj
/deisa/csc
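As a small illustration of the naming convention, the sketch below builds paths of the form `/deisa/<site>/...`. The site codes are the ones listed above; the function name and the directory layout below the site level are assumptions, since the slide does not show them.

```python
import posixpath

# Site codes copied from the mount points listed on the slide.
SITE_CODES = ("idr", "cne", "rzg", "fzj", "csc")


def deisa_path(site, *parts):
    """Build a path under /deisa/<site>; reject codes not shown on the slide."""
    if site not in SITE_CODES:
        raise ValueError(f"unknown site code: {site!r}")
    return posixpath.join("/deisa", site, *parts)


if __name__ == "__main__":
    # "someuser" and the file name are placeholders, not real DEISA paths.
    print(deisa_path("rzg", "someuser", "snapshot_000.bin"))
```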
DEISA Common Production Environment (DCPE)
DCPE… what is it?
both a set of software (the software stack) and a generic interface to access the software (based on the Modules tool)
• Required to both offer a common interface to the users and to hide the differences between local installations
• Essential feature for job migration inside homogeneous super-clusters
The DCPE includes:
• shells (Bash and Tcsh),
• compilers (C, C++, Fortran and Java),
• libraries (for numerical analysis, data formatting, etc.),
• tools (debuggers, profilers, editors, development tools),
• applications.
Modules Framework
o Modules tool chosen because it was well known by many sites and many users
o Public domain software
o Tcl implementation used
Modules make it possible:
o to offer a common interface to different software components on different computers,
o to hide different names and configurations,
o to manage each software package individually and load only those required into the user environment,
o for each user to change the version of each software package independently of the others,
o for each user to switch independently between the current default version of a software package and another one (older or newer).
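The behaviour listed above can be mimicked with a toy model. This is not the Modules tool or its interface: the class name, the package names, the versions and the install paths are all invented for the illustration.

```python
class ToyModules:
    """Toy model of the idea behind Modules; not the real tool or its API."""

    def __init__(self, catalogue, defaults):
        self.catalogue = catalogue  # name -> {version: site-specific install path}
        self.defaults = defaults    # name -> default version at this site
        self.loaded = {}            # this user's environment: name -> version

    def load(self, name, version=None):
        """Load the requested version, or the site default if none is given."""
        version = version or self.defaults[name]
        if version not in self.catalogue[name]:
            raise KeyError(f"{name}/{version} is not installed")
        self.loaded[name] = version

    def switch(self, name, version):
        """Replace the loaded version of one package, leaving the others alone."""
        self.load(name, version)

    def path(self, name):
        """Return the site-specific location hidden behind the generic name."""
        return self.catalogue[name][self.loaded[name]]


if __name__ == "__main__":
    site = ToyModules(
        {"mathlib": {"1.0": "/opt/siteA/mathlib-1.0", "2.0": "/opt/siteA/mathlib-2.0"}},
        {"mathlib": "2.0"},
    )
    site.load("mathlib")           # picks the site default
    site.switch("mathlib", "1.0")  # this user prefers the older version
    print(site.path("mathlib"))
```

The user-facing name stays the same at every site, while each site's catalogue maps it to its own installation.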
The HPC users’ vision
Initial vision: “full” distributed computing
[Diagram: Task1, Task2 and Task3 spread across the DEISA systems listed above.]
Impossible!!!!
The HPC users’ vision
Jump computing
[Diagram: a task shown on two of the DEISA systems listed above.]
Difficult…
HPC applications are… HPC applications!!!
Fine-tuned on the architectures
So… what…
Jump computing is useful to reduce queue waiting times.
“Find the gap… and fill it…” can work, and works better on homogeneous systems
[Diagram: a job and the DEISA sites with their operating systems and batch systems, as on the slide “Running on DEISA: Load Leveler”.]
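The “find the gap and fill it” idea can be sketched as a simple selection rule. This is an illustration only, not how Load Leveler decides: the function name and the wait times are invented, and the restriction to identical architectures is an assumption reflecting the remark that it works better on homogeneous systems.

```python
def pick_site(home_arch, sites):
    """Return the name of the compatible site with the shortest estimated wait.

    sites: iterable of (name, architecture, estimated_wait_hours) tuples.
    Only sites with the same architecture as the home site are considered,
    since a tuned binary is assumed not to run elsewhere without porting.
    """
    compatible = [s for s in sites if s[1] == home_arch]
    if not compatible:
        return None
    return min(compatible, key=lambda s: s[2])[0]


if __name__ == "__main__":
    # Architectures come from the slides; the wait times are invented.
    queue = [
        ("CINECA", "IBM P5", 30.0),
        ("HPCX", "IBM P5", 6.0),
        ("HLRS", "NEC SX8", 1.0),
    ]
    print(pick_site("IBM P5", queue))
```

In this made-up example the shortest queue overall is on a different architecture, so the job goes to the best compatible site instead.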
So… what…
Single image filesystem is a great solution!!!!! (even if moving data…)
[Diagram: all DEISA sites, with their operating systems and batch systems, attached to the DEISA GPFS SHARED FILESYSTEM.]
So… what…
The usual Grid solutions require learning new stuff…
Often scientists are not willing to…
DEISA relies on Load Leveler (or other common scheduling systems)… same scripts, same commands you are used to!!!
However, only IBM systems support LL…
The Common Production Environment offers a shared (and friendly) set of tools to the users.
However, compromises must be accepted…
[Chart: computing infrastructures placed by network latency (high to low) and degree of integration (low to high): Internet GRID; distributed computing and data grids (EGEE); capacity cluster; capacity supercomputer; distributed supercomputing (DEISA); capability supercomputer. Further labels: “Enabling computing”, “HPC centres”.]
Summing up…
Growing up, DEISA is moving away from a Grid.
In order to fulfill the needs of HPC users, it is trying to become a huge supercomputer.
On the other hand, DEISA2 must lead to a service infrastructure and users’ expectations MUST be matched (no more time for experiments…)
DECI: enabling Science to DEISA
o Identification, deployment and operation of a number of « flagship » applications requiring the infrastructure services, in selected areas of science and technology.
o European Call for proposals in May - June every year. Applications are selected on the basis of scientific excellence, innovation potential and relevance criteria, with the collaboration of the HPC national evaluation committees.
o DECI users are supported by the Applications Task Force (ATASKF), whose objective is to enable and deploy the Extreme Computing applications.
LFI-SIM DECI Project (2006)
Planck (useless) overview:
Planck is the 3rd generation space mission for the mapping and the analysis of the microwave sky: its unprecedented combination of sky and frequency coverage, accuracy, stability and sensitivity is designed to achieve the most efficient detection of the Cosmic Microwave Background ( CMB ) in both temperature and polarisation. In order to achieve the ambitious goals of the mission, unanimously acknowledged by the scientific community to be of the highest importance, data processing of extreme accuracy is needed.
Principal Investigator(s): Fabio Pasian (INAF-O.A.T.), Hannu Kurki-Suonio (Univ. of Helsinki)
Leading Institution: INAF-O.A. Trieste and Univ. of Helsinki
Partner Institution(s):
o INAF-IASF Bologna,
o Consejo Superior de Investigaciones Cientificas (Instituto de Fisica de Cantabria),
o Max-Planck Institut für Astrophysik Garching,
o SISSA Trieste,
o University of Milano,
o University “Tor Vergata” Rome
DEISA Home Site: CINECA
Need of simulations in Planck
NOT the typical DECI-HPC project !!!
Simulations are used to:
o assess likely science outcomes;
o set requirements on instruments in order to achieve the expected scientific results;
o test the performance of data analysis algorithms and infrastructure;
o help understand the instrument and its noise properties;
o analyze known and unforeseen systematic effects;
o deal with known physics and new physics.
Predicting the data is fundamental to understanding them.
Simulation pipeline
[Pipeline diagram. Processing steps: Generate CMB sky; Add foregrounds; “Observe” sky with LFI; Data reduction; Freq. merge; Comp. sep.; C(l) evaluation; Parameter evaluation. Inputs and data products: cosmological parameters; instrument parameters; reference sky maps; Time-Ordered Data; frequency sky maps; component maps; C(l).]
Knowledge and details increase over time, therefore the whole computational chain must be iterated many times
NEED OF HUGE COMPUTATIONAL RESOURCES
GRID can be a solution!!!
Planck & DEISA
DEISA was expected to be used to
o simulate many times the whole mission of Planck’s LFI instrument, on the basis of different scientific and instrumental hypotheses;
o reduce, calibrate and analyse the simulated data down to the production of the final products of the mission, in order to evaluate the impact of possible LFI instrumental effects on the quality of the scientific results, and consequently to refine appropriately the data processing algorithms.
Model 1
Model 2
Model 3
Model N
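The “Model 1 … Model N” picture can be sketched as the same chain applied to many hypotheses. Everything in the block is a stand-in: the function names, the `noise` parameter and the arithmetic are invented, and treating the runs as independent of each other is an assumption, not a statement about the actual Planck software.

```python
def simulate_mission(model):
    """Stand-in for simulating one full LFI mission under a given hypothesis."""
    samples = [model["noise"] * i for i in range(4)]
    return {"model": model["name"], "samples": samples}


def reduce_data(raw):
    """Stand-in for reduction and analysis: here just the mean of the samples."""
    samples = raw["samples"]
    return {"model": raw["model"], "mean": sum(samples) / len(samples)}


def run_campaign(models):
    """Run every model through the same chain, one after the other."""
    return [reduce_data(simulate_mission(m)) for m in models]


if __name__ == "__main__":
    hypotheses = [
        {"name": "Model 1", "noise": 1.0},
        {"name": "Model 2", "noise": 2.0},
    ]
    for result in run_campaign(hypotheses):
        print(result)
```

Under that independence assumption, each model could be submitted as a separate job, with only the final comparison needing all results together.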
Outcomes
o Planck simulations are essential to get the best possible understanding of the mission and to have a “conscious expectation of the unexpected”
o They also make it possible to plan Data Processing Centre resources properly
o The EGEE grid turned out to be more suitable for such a project, since it provides fast access to small/medium computing resources. Most of the Planck pipeline is happy with such resources!!!
o However, DEISA was useful to produce massive sets of simulated data and to perform and test the data processing steps that require large computing resources (lots of coupled processors, large memories, large bandwidth…)
o Interoperation between the two grid infrastructures (possibly based on the gLite middleware) is expected in the next years