Towards a Virtual European Supercomputing Infrastructure
Vision & issues
www.deisa.org
Sanzio Bassini
www.cineca.it
Mission statement
To contribute to a significant enhancement of capacities and capabilities of HPC in Europe, by the integration of national supercomputing infrastructures.
To deploy and operate a distributed multi-terascale European computing platform, based on a very strong coupling of existing national supercomputers. DEISA plans to operate as a virtual European supercomputing centre.
To contribute to the deployment of an extended, heterogeneous Grid computing environment for HPC in Europe, needed to interface the DEISA research infrastructures with the rest of the European IT infrastructures.
Subjects
Technical: Global file systems; heterogeneous extension; relations to Grid technologies; deployment of a high-bandwidth VPN with QoS
Other: Collaborations and pan-European perspective
Artist’s view of the infrastructure
[Figure: The DEISA super-cluster (Site A, Site B, Site C, Site D) on a high-bandwidth VPN with QoS. Other labels: National computing facility; Extended Grid services.]
Two levels of operation
Operating system
Global file systems
Multi-cluster batch managers
The integration of national supercomputers operates at this technology level, underneath the Globus Toolkit.
Grid Security Infrastructure
Grid data and resource management
Grid information services
DEISA heterogeneous resource management interfaces
Unicore, Grid application toolkits, etc.
The heterogeneous Grid extension operates here.
Grid middleware (Globus Toolkit)
Global File Systems
Global file system
A sophisticated software environment, necessary to provide a single system image of a clustered computing platform.
It provides global data management. Data in the GFS is “symmetric” with respect to all computing nodes.
A GFS encapsulates sophisticated distributed computing and Grid technologies. Applications do not need to be modified to benefit from GFS services.
Grid technologies work in the background and are not directly seen by end users.
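To illustrate the point that applications need not be modified, here is a minimal sketch that uses nothing but ordinary file I/O. The file name and the idea of a mount point such as /deisa/project are hypothetical; a temporary directory stands in for a global file system mount so that the sketch runs on any machine.

```python
import os
import tempfile


def write_result(shared_root: str, name: str, text: str) -> str:
    """Write a result file with plain file I/O; nothing here is GFS-specific."""
    path = os.path.join(shared_root, name)
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(text)
    return path


def read_result(shared_root: str, name: str) -> str:
    """Read the file back; on a real global file system this could run on another node."""
    with open(os.path.join(shared_root, name), encoding="utf-8") as handle:
        return handle.read()


# Assumption: a temporary directory replaces a real GFS mount point
# (for example a path like /deisa/project, which is a made-up name).
shared_root = tempfile.mkdtemp()
write_result(shared_root, "run42.out", "energy=-1.25\n")
print(read_result(shared_root, "run42.out"), end="")
```

The application code only sees a path. Whether that path is local disk or a file system shared across sites is decided by the infrastructure, which is the sense in which the Grid technologies stay in the background.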
The integration concept
Global distributed file system with continental scope: the key super-cluster integration technology.
[Figure: Site A, Site B, Site C and Site D on a VPN with QoS.]
Co-scheduling and meta-computing
As co-scheduling is not yet mature enough to be used massively in national supercomputing centers without disrupting the efficiency of national services…
… DEISA does not rely on meta-computing for massive science production. Instead, it adopts a simple, innovative operational model that relies on communication bandwidth to make the latency problem irrelevant.
Co-scheduling and meta-computing will only be used in selected cases of weakly coupled distributed applications.
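One way to read the bandwidth-versus-latency argument is through a simple transfer-time model, time = latency + size / bandwidth. The sketch below is only an illustration under assumed numbers (a 1 Gbit/s link and 30 ms latency, neither taken from DEISA): it compares the share of latency in a small message, typical of tightly coupled meta-computing, with its share in a bulk data transfer.

```python
def transfer_time(size_bytes: float, bandwidth_bps: float, latency_s: float) -> float:
    """Simple model: one latency period plus the time to push the bits through the link."""
    return latency_s + size_bytes * 8 / bandwidth_bps


def latency_share(size_bytes: float, bandwidth_bps: float, latency_s: float) -> float:
    """Fraction of the total transfer time that is spent waiting on latency."""
    return latency_s / transfer_time(size_bytes, bandwidth_bps, latency_s)


# Assumed, illustrative values (not DEISA measurements).
BANDWIDTH_BPS = 1e9   # 1 Gbit/s link
LATENCY_S = 0.030     # 30 ms wide-area latency

small_message = 1024   # 1 KiB, e.g. one message of a tightly coupled code
bulk_dataset = 10e9    # 10 GB, e.g. a data set moved with a job

share_small = latency_share(small_message, BANDWIDTH_BPS, LATENCY_S)
share_bulk = latency_share(bulk_dataset, BANDWIDTH_BPS, LATENCY_S)
print(f"latency share, small message: {share_small:.4f}")
print(f"latency share, bulk transfer: {share_bulk:.6f}")
```

Under these assumptions latency dominates the small message almost entirely, while it is a negligible part of the bulk transfer. This is consistent with running each job inside one site and moving whole jobs and their data between sites, rather than spreading one tightly coupled application across sites.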
The operational model
Global management of a “resource pool”, that will allow:
The availability of an integrated supercomputing environment for trans-national collaborations, providing global data management through global file systems.
Implementation of job migration across sites (transparent to end users) as a way of releasing significant resources in one particular site for demanding applications. Load balancing computational throughput at a European scale.
Support of distributed applications.
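The idea of migrating jobs to release resources at one site can be sketched as a small greedy procedure. This is a hypothetical illustration, not DEISA's scheduler: the `Job` and `Site` classes, the `make_room` function, and all node counts are invented for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Job:
    name: str
    nodes: int


@dataclass
class Site:
    name: str
    capacity: int                       # total nodes at this site
    jobs: list = field(default_factory=list)

    def free(self) -> int:
        """Nodes not currently used by any job at this site."""
        return self.capacity - sum(job.nodes for job in self.jobs)


def make_room(target: Site, others: list, needed: int) -> list:
    """Greedily migrate jobs away from `target` until `needed` nodes are free.

    Smallest jobs are considered first; each goes to the site with the most
    free nodes, and is skipped if it does not fit there.
    Returns the list of (job name, destination site name) migrations.
    """
    migrations = []
    if not others:
        return migrations
    for job in sorted(target.jobs, key=lambda j: j.nodes):
        if target.free() >= needed:
            break
        dest = max(others, key=Site.free)
        if dest.free() >= job.nodes:
            target.jobs.remove(job)
            dest.jobs.append(job)
            migrations.append((job.name, dest.name))
    return migrations


# Hypothetical resource pool of three sites.
site_a = Site("A", 100, [Job("j1", 30), Job("j2", 50), Job("j3", 10)])
site_b = Site("B", 100, [Job("k1", 60)])
site_c = Site("C", 80, [Job("m1", 20)])

moves = make_room(site_a, [site_b, site_c], needed=50)
print(moves, site_a.free())
```

A production system would also have to account for data location, queue policies and heterogeneous architectures; the sketch only shows how a shared view of the pool lets one site be freed for a demanding application.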
Heterogeneous integration
Fundamental strategic choice: no commitment to any particular technology or vendor.
This is an operating system (OS) issue, not a hardware issue.
At this point in time, heterogeneous extension means integrating AIX and Linux systems.
Linux may be the only OS whose survival seems guaranteed in the long term.
DEISA and Grid technologies
Large scale integration of IT systems is, for DEISA, primarily a strategic issue
DEISA is technology neutral. Technology choices follow from their capability to adapt to a pre-established operational model, and to provide real services to end users.
Three criteria for technology choices:
The need to have, as soon as possible, a stable and reliable European production platform (the “core” platform). Grid technologies work in the background through global file systems and multi-cluster batch managers.
The need to interface this platform with other systems in Europe (the “outer” Grid environment), through the deployment of traditional Grid environments.
The preparation of their future evolution by the integration of new Grid technologies as they reach maturity (a specific JRA activity).
User issues
Scientific strategies, identification of user communities: These are standard actions carried out by the centres, and a wealth of national scientific reports and strategic studies is available today. This national input must be repositioned in a larger European perspective.
Strong deployment of industrial applications: A specific joint research activity with industry, led by the FIAT Research Center in close connection with the DEISA steering committee, has been active since the beginning of the project.
First priority actions: An aggressive communication plan to inform the scientific community of the new horizons in computational science enabled by DEISA.
Measure of success: DEISA has to provide measurable impact on computational science. This issue will be followed by the DEISA Scientific Committee.
Monitoring access to resources: Standard service in national supercomputing centers. The dedicated inter-platform network will be included in the list of monitored resources.
Complementarities with other EU projects in the domain of supercomputing
Full complementarity between DEISA and HPC-Europe:
HPC-Europe deals with the provision of access services to supercomputing infrastructure.
More than 5000 users will be hosted by many DEISA partners to have access to large-scale supercomputing facilities, as a step towards the deployment of a Virtual European Supercomputing Infrastructure.
“Single point of access” and “Single point of user support”
Outreach: common shared vision
Cooperation with other leading projects
With EGEE: exploitation of middleware technologies for access to high-end computing facilities; next-generation Grid road map; dissemination.
With US TeraGrid: deployment of a global file system; characterization and implementation constraints of a global cyberinfrastructure.