INFSO-RI-508833 Enabling Grids for E-sciencE www.eu-egee.org EGEE is a project co-funded by the European Commission under contract INFSO-RI-508833 The EGEE project: building an international production grid infrastructure

Upload: jimena-bolas

Post on 14-Dec-2015


Page 1:

INFSO-RI-508833

Enabling Grids for E-sciencE

www.eu-egee.org

EGEE is a project co-funded by the European Commission under contract INFSO-RI-508833

The EGEE project: building an international production grid infrastructure

Page 2:

• EGEE - what is it and why is it needed?
• Middleware – current and future
• Operations – providing a stable service
• Networking – enabling collaboration
• Summary

The material for this talk has been contributed by many colleagues in the EGEE & LCG projects.

It is heavily based on Bob Jones’ talk at UK AHM 2004.

Page 3:

Technological push?

• Is Grid technology merely a ‘funding opportunity’?
• Is it an example of scientists wanting to do something because they (just about) could?
• Is it in fact a technology-driven activity, without any real purpose?

• Consider this diagram

Page 4:

Grids vs. Distributed Computing

• Existing distributed applications:
  – tend to be specialised systems
  – intended for a single purpose or user group
• Grids go further and take into account:
  – Different kinds of resources: not always the same hardware, data and applications
  – Different kinds of interactions: user groups or applications want to interact with Grids in different ways
  – Dynamic nature: resources and users added/removed/changed frequently

Page 5:

What is Grid Computing?

• A Virtual Organisation is:
  – People from different institutions working towards a common goal
  – Sharing distributed processing and data resources
• Grid infrastructure enables virtual organisations

“Grid computing is coordinated resource sharing and problem solving in dynamic, multi-institutional virtual organizations” (I. Foster)

Page 6:

The terms of the problem

• Technological progress produces more sophisticated digital sensors (particle physics detectors, satellites, radio-telescopes, synchrotrons…)

• Much of science is therefore becoming increasingly “data-intensive”

• Huge amounts of data need to be analyzed by large and geographically distributed scientific communities

• Consequently, single computers, clusters or supercomputers are not powerful enough for the necessary calculations and the data processing

Result: access to large facilities is difficult and expensive for the scientific community, particularly in less favoured countries => increase of the “electronic divide”

Page 7:

The Grid: a possible solution

• The World Wide Web provides seamless access to information stored in different geographical locations

• The Grid provides seamless access to computing power and data storage capacity distributed over the globe

• Relies on advanced software, called middleware:
  – authenticates, authorizes and accounts (AAA)
  – understands and locates the data which the scientist needs
  – distributes the computing processing to wherever in the world there is available and useful capacity
  – sends the results back

The name Grid was chosen by analogy with the electric power grid
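
The four middleware steps listed on this slide can be sketched as a toy pipeline. This is only an illustration under stated assumptions: `Site`, `submit` and every field name are hypothetical, the accounting part of AAA is left out, and no real EGEE or LCG interface is being reproduced.

```python
from dataclasses import dataclass


@dataclass
class Site:
    """Hypothetical description of one computing centre."""
    name: str
    free_cpus: int
    datasets: frozenset


def submit(user, dataset, sites, authorised_users):
    """Toy walk through the middleware steps named on the slide."""
    # 1. Authenticate / authorise the user (accounting is omitted here).
    if user not in authorised_users:
        raise PermissionError(f"{user} is not authorised")
    # 2. Locate the sites that hold the data the scientist needs.
    holders = [s for s in sites if dataset in s.datasets]
    # 3. Send the work to a holder that has available capacity.
    candidates = [s for s in holders if s.free_cpus > 0]
    if not candidates:
        raise RuntimeError("no site holding the data has free capacity")
    chosen = max(candidates, key=lambda s: s.free_cpus)
    chosen.free_cpus -= 1
    # 4. Send the result back to the user.
    return f"{dataset} processed at {chosen.name} for {user}"
```

The point of the sketch is that the user names only the data and never a site; the choice of where to run is made for them.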

Page 8:

Challenges

• Must share data between thousands of scientists with multiple interests

• Must connect major computer centres, not just PCs (not P2P computing)

• Must ensure that all data is accessible anywhere, anytime

• Must grow rapidly, yet remain reliable for more than a decade

• Must cope with the different access policies of computer centres

• Must ensure data security

Page 9:

Benefits

• Effective and seamless collaboration of dispersed communities, scientific first and then industrial

• Ability to run large-scale applications aggregating thousands of computers, for a very wide range of applications

• Transparent access to distributed resources from your desktop

• The term “e-Science” has been coined to express these benefits

• In the vision of the “Knowledge Grid”, the Grid can act as a unifying agent between applications and non-homogeneous data

Page 11:

What are the characteristics of a Grid system?

• Numerous Resources
• Ownership by Mutually Distrustful Organizations & Individuals
• Potentially Faulty Resources
• Different Security Requirements & Policies Required
• Resources are Heterogeneous
• Geographically Separated
• Different Resource Management Policies
• Connected by Heterogeneous, Multi-Level Networks

Page 12:

EGEE Overview

• Goal:
  – Create a world-wide production-quality Grid infrastructure for e-Science on top of present and future EU Research Networking infrastructure
• Build on:
  – EU and EU member states' major investments in Grid Technology
  – International connections (US and AP)
  – Several pioneering prototype results
  – Large Grid development teams in EU require major EU funding effort
• Approach:
  – Leverage current and planned national and regional Grid initiatives and infrastructures
  – Work closely with relevant industrial Grid developers, NRENs and US-AP projects
• http://www.eu-egee.org

[Figure labels: Applications; Geant-NREN networks; Grid infrastructure]

Page 13:

The (Science) Grid Vision

The Grid: networked data processing centres and “middleware” software as the “glue” of resources.

Researchers perform their activities regardless of geographical location, interact with colleagues, share and access data

Scientific instruments and experiments provide huge amounts of data

Page 14:

In 2 years EGEE will:

• Establish production quality sustained Grid services
  – 3000 users from at least 5 disciplines
  – over 8,000 CPUs, 50 sites
  – over 5 Petabytes (10^15 bytes) storage
• Demonstrate a viable general process to bring other scientific communities on board
• Propose a second phase in mid 2005 to take over from EGEE in early 2006

[Figure labels: Pilot; New]

Page 15:

EGEE and LCG

EGEE builds on the work of LCG to establish a grid operations service

• LCG (LHC Computing Grid) - Building and operating the LHC Grid
• A collaboration between:
  – The physicists and computing specialists from the LHC experiment
  – The projects in Europe and the US that have been developing Grid middleware
  – The regional and national computing centres that provide resources for LHC
  – The research networks

Page 16:

EGEE Activities

• 48 % service activities (Grid Operations, Support and Management, Network Resource Provision)

• 24 % middleware re-engineering (Quality Assurance, Security, Network Services Development)

• 28 % networking (Management, Dissemination and Outreach, User Training and Education, Application Identification and Support, Policy and International Cooperation)

32 million euros of EU funding over 2 years, starting 1st April 2004

Emphasis in EGEE is on operating a production grid and supporting the end-users
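
As a rough worked example, the percentages above can be turned into amounts. The slide does not say that the percentages are shares of the 32 million euro EU funding figure, so that is an assumption made here purely for illustration; `split_budget` is a hypothetical helper.

```python
TOTAL_EUR = 32_000_000  # EU funding figure quoted on the slide

# Activity shares quoted on the slide, in percent.
SHARES_PERCENT = {
    "service activities": 48,
    "middleware re-engineering": 24,
    "networking": 28,
}


def split_budget(total, shares_percent):
    """Split a total by integer percentages (assumes they apply to the total)."""
    if sum(shares_percent.values()) != 100:
        raise ValueError("shares must add up to 100%")
    # Integer arithmetic avoids floating-point rounding on the amounts.
    return {name: total * pct // 100 for name, pct in shares_percent.items()}


budget = split_budget(TOTAL_EUR, SHARES_PERCENT)
for name, amount in budget.items():
    print(f"{name}: {amount / 1e6:.2f} million euros")
```

Under that assumption, close to half of the funding goes to running the service, which matches the stated emphasis on operations and user support.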

Page 17:

• EGEE - what is it and why is it needed?
• Middleware – current and future
• Operations – providing a stable service
• Networking – enabling collaboration
• Summary

Page 18:

gLite

• “gLite” - the new EGEE middleware
• Service oriented - components that are:
  – Loosely coupled (by messages)
  – Accessible across network; modular and self-contained; clean modes of failure
  – So can change implementation without changing interfaces
  – Can be developed in anticipation of new uses
• … and are based on standards. Opens EGEE to:
  – New middleware (plethora of tools now available)
  – Heterogeneous resources (storage, computation…)
  – Interact with other Grids (international, regional and national)
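
The service-oriented properties on this slide (coupling by messages, clean modes of failure, changing an implementation without changing its interface) can be illustrated with a small sketch. Nothing here is gLite code: `StorageService`, `InMemoryStorage`, `LoggingStorage` and the message fields are hypothetical names chosen for the example.

```python
from typing import Protocol


class StorageService(Protocol):
    """The only thing a client depends on: messages in, messages out."""

    def handle(self, message: dict) -> dict: ...


class InMemoryStorage:
    """One possible implementation of the interface."""

    def __init__(self):
        self._files = {}

    def handle(self, message):
        if message["op"] == "put":
            self._files[message["name"]] = message["data"]
            return {"status": "ok"}
        if message["op"] == "get":
            if message["name"] in self._files:
                return {"status": "ok", "data": self._files[message["name"]]}
            # Clean failure mode: an error reply, not a crash in the client.
            return {"status": "error", "reason": "not found"}
        return {"status": "error", "reason": "unknown op"}


class LoggingStorage:
    """A second implementation: same interface, different behaviour inside."""

    def __init__(self, inner):
        self._inner = inner
        self.log = []

    def handle(self, message):
        self.log.append(message["op"])
        return self._inner.handle(message)


def client_roundtrip(service: StorageService) -> dict:
    """Client code that works unchanged with either implementation."""
    service.handle({"op": "put", "name": "run1.dat", "data": "events"})
    return service.handle({"op": "get", "name": "run1.dat"})
```

Because the client only exchanges messages, swapping `InMemoryStorage` for `LoggingStorage(InMemoryStorage())` needs no change on the client side.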

Page 19:

Architecture Guiding Principles

• Lightweight (existing) services
  – Easily and quickly deployable
  – Use existing services where possible as basis for re-engineering
• Interoperability
  – Allow for multiple implementations
• Resilience and Fault Tolerance
• Co-existence with deployed infrastructure
  – Reduce requirements on site components
  – Co-existence (and convergence) with LCG-2 and Grid3 are essential for the EGEE Grid service
• Service oriented approach
  – Follow WSRF standardization
  – No mature WSRF implementations exist to date so start with plain WS (WS-I)
  – Provide framework to others so higher-level services can be developed quickly

Architecture: https://edms.cern.ch/document/476451

Page 20:

• EGEE - what is it and why is it needed?
• Middleware – current and future
• Operations – providing a stable service
  – Needs more than middleware
  – Organisational, operational infrastructure
• Networking – enabling collaboration
• Summary

Page 21:

User-view of EGEE: a multi-VO Grid

[Figure labels: User Interface; Grid services; User Interface]

Page 22:

EGEE: adding a VO

EGEE has a formal procedure for adding selected new user communities (Virtual Organisations):

• Negotiation with one of the Regional Operations Centres

• Seek balance between the resources contributed by a VO and those that it consumes.

• Resource allocation will be made at the VO level.

• Many resources need to be available to multiple VOs: shared use of resources is fundamental to a Grid
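
One simple way to look at the balance between what a VO contributes and what it consumes is a per-VO usage ratio. The slide gives no formula, so the metric, the CPU-hour unit and the name `usage_ratio` are all assumptions made for this sketch.

```python
def usage_ratio(contributed, consumed):
    """Consumed / contributed per VO (hypothetical metric, e.g. in CPU hours).

    A ratio above 1 means the VO uses more than it provides; None marks a VO
    that contributes nothing, where the ratio is undefined.
    """
    ratios = {}
    for vo, given in contributed.items():
        used = consumed.get(vo, 0)
        ratios[vo] = None if given == 0 else used / given
    return ratios
```

For example, a VO that contributes 1000 CPU hours and consumes 500 has a ratio of 0.5, while one that contributes 200 and consumes 800 has a ratio of 4.0 and would be a candidate for renegotiation.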

Page 23:

SA1 - Operations

• Scale of the production service
  – April 2004: ~2000 CPUs over ~30 sites (LCG-1 → LCG-2)
  – December 2004: ~8000 CPUs over ~80 sites (migrated to Scientific Linux). This is far beyond the project milestones!
  – Continuous improvements to LCG-2 middleware
• Set-up of CIC/ROCs
  – Roles/responsibilities defined in execution plans, documented and implemented
• On-going:
  – Complete set-up of pre-production service
  – Deployment planning for gLite (EGEE1 M/W version)
  – Deploy accounting infrastructure

Page 24:

Running the Production Service

Grid deployment has entered a new phase

• Basic middleware is working
  – responsible now for a small fraction of the problems
• Outstanding performance/functionality issues
  – RLS, RB / little modularity & lack of consistent interfaces …
  – some solutions are being developed but many cannot be addressed in current software/architecture - set priorities for new middleware (gLite)
• Many operational issues
  – mis-configuration, out-of-date middleware, single points of failure, failover, management interfaces …
  – resources unsuitable for application needs (e.g. insufficient disk space)
  – slow response by sites to problems (holiday periods, security concerns)
  – new middleware will not help for many of these issues - grid partners must think Service

The grid still does not appear as a single coherent facility. Applications must adapt to the current service to gain maximum profit, but the result has been very effective for LHCb - ~3000 concurrent jobs

Page 25:

EGEE Operations (I): OMC and CIC

• Operation Management Centre
  – located at CERN, coordinates operations and management
  – coordinates with other grid projects
• Core Infrastructure Centres
  – behave as single organisations
  – operate core services (VO specific and general Grid services)
  – develop new management tools
  – provide support to the Regional Operations Centres

Page 26:

EGEE Operations: ROC

• Regional Operations Centre responsibilities and roles:
  – Testing (certification) of new middleware on a variety of platforms before deployment
  – Deployment of middleware releases + coordination + distribution inside the region
  – Integration of ‘Local’ VO
  – Development of procedures and capabilities to operate the resources
  – First-line user support
  – Bring new resources into the infrastructure and support their operation
  – Coordination of integration of national grid infrastructures
  – Provide resources for pre-production service

Page 27:

Production grid service

Launched Sept’03 with 12 sites, now more than 100 sites and continues to grow

Page 30:

Grid projects

Many Grid development efforts – all over the world

• NASA Information Power Grid
• DOE Science Grid
• NSF National Virtual Observatory
• NSF GriPhyN
• DOE Particle Physics Data Grid
• NSF TeraGrid
• DOE ASCI Grid
• DOE Earth Systems Grid
• DARPA CoABS Grid
• NEESGrid
• DOH BIRN
• NSF iVDGL

• EuroGrid (Unicore)
• DataTag (CERN, …)
• DataGrid (CERN, ...)
• Astrophysical Virtual Observatory
• GRIP (Globus/Unicore)
• GRIA (Industrial applications)
• GridLab (Cactus Toolkit)
• CrossGrid (Infrastructure Components)
• EGSO (Solar Physics)

• UK e-Science Grid
• Netherlands – VLAM, PolderGrid
• Germany – UNICORE, Grid proposal
• France – Grid funding approved
• Italy – INFN Grid
• Eire – Grid proposals
• Switzerland - Network/Grid proposal
• Hungary – DemoGrid, Grid proposal
• Norway, Sweden - NorduGrid

Page 31:

Authentication, Authorisation

• Authentication
  – User obtains certificate from CA
  – Connects to UI by ssh
  – Downloads certificate
  – Invokes Proxy server
  – Single logon – to UI - then Secure Sockets Layer with proxy identifies user to other nodes
• Authorisation - currently
  – User joins Virtual Organisation
  – VO negotiates access to Grid nodes and resources (CE, SE)
  – Authorisation tested by CE, SE: gridmapfile maps user to local account

[Figure labels: UI; CA; VO mgr; Personal; VO database; Gridmapfiles on CE, SE nodes; SSL (proxy); VO service]
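
The last authorisation step, where a gridmapfile maps a user to a local account, can be sketched as a lookup from certificate subject to account name. The sketch assumes the common layout of one quoted subject name followed by a local account per line; the subject names and accounts below are made up, `parse_gridmap` and `authorise` are hypothetical helpers, and features of real deployments beyond this simple one-to-one mapping are not modelled.

```python
import shlex


def parse_gridmap(text):
    """Parse lines of the form: "<certificate subject>" <local account>."""
    mapping = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = shlex.split(line)  # honours the quotes around the subject
        if len(parts) != 2:
            raise ValueError(f"unexpected gridmapfile line: {line!r}")
        subject, local_account = parts
        mapping[subject] = local_account
    return mapping


def authorise(subject, mapping):
    """Return the local account for a subject, or None if access is refused."""
    return mapping.get(subject)


# Made-up example content, not taken from any real site.
EXAMPLE = '''
# subject name                         local account
"/C=XX/O=Example Grid/CN=Jane Doe"     jdoe
"/C=XX/O=Example Grid/CN=John Smith"   biomed001
'''
```

With this in place, a CE or SE would call `authorise` with the subject presented over the secured connection and run the job or storage request under the returned local account, refusing it when the result is `None`.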

Page 32:

JRA3 - EGEE Authentication Scheme - EUGridPMA

• Policy Management Authority: “Club” of trusted Certification Authority managers (www.eugridpma.org)

Green: CA Accredited; Yellow: being discussed

Other Accredited CAs:
• DoEGrids (US)
• GridCanada
• ASCCG (Taiwan)
• CERN
• Russia (HEP)
• FNAL Service CA (US)
• Israel
• Pakistan

Greece: Hellasgrid CA (AUTH)

Page 33:

• EGEE - what is it and why is it needed?
• Middleware – current and future
• Operations – providing a stable service
• Networking – enabling collaboration
  – Current application communities
• Summary

Page 34:

Bringing new applications to the grid

1. Outreach events inform people about the grid / EGEE

2. Application experts discuss specific characteristics with the users

3. Migrate application to EGEE infrastructure with the help of EGEE experts

4. Initial deployment for testing purposes

5. Production usage - user community contributes computing resources for heavy production demands - “Canadian dinner party”

Page 35:

NA3 – User training and induction

• NA3 has been involved in more than 130 training events across the world
  – (including the GGF and other grid schools)
  – ~2000 people trained: induction; application developer; advanced; activity retreats
  – Material archive online with ~1000 presentations
• Strong links made with GILDA testbed and use of GENIUS portal
  – Regularly used as part of tutorials
• Essential element of the virtuous cycle for new communities
  – Training is one of the first things new communities need
• Process for handling feedback defined
  – Helping to improve material and organisation
• Roadmap for future events planned
  – Open to new suggestions
  – Produced status report and updated training plan taking into account lessons learned
• On-going
  – Plan for next EGEE M/W (gLite) training

Page 36:

EGEE User Support: infrastructure

General approach: 3 main support centers to guarantee 24/7 coverage, 365 days a year, and to provide a single point of contact to customers and to local Grid operations.

To ensure 24x7 support, it was decided to have 3 GGUS teams in different time zones. GGUS started off at Forschungszentrum Karlsruhe in Germany in 2003 and has had a partner group at Academia Sinica in Taiwan since April 2004. A third partner in North America will complete the 24-hour cycle.

Page 37:

EGEE User Support: infrastructure

The ROCs and VOs and the other project-wide groups such as the Core Infrastructure Center (CIC), middleware groups (JRA), network groups (NA) and service groups (SA) will be connected via a central integration platform provided by GGUS.

This central helpdesk keeps track of all service requests and assigns them to the appropriate support groups. In this way, formal communication between all support groups is possible. To enable this, each group has to build only one interface between its internal support structure and the central GGUS application.
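
The "one interface per group" idea can be sketched as a hub that records every request and forwards it to the group's single registered handler. This is not the GGUS application: `CentralHelpdesk`, `interfaces_needed` and the handler signature are hypothetical names used only to show the design.

```python
class CentralHelpdesk:
    """Toy hub: tracks all requests and assigns them to support groups."""

    def __init__(self):
        self._groups = {}   # group name -> handler (that group's one interface)
        self.tickets = []   # central record of every service request

    def register(self, group, handler):
        """A support group plugs in its single interface to the hub."""
        self._groups[group] = handler

    def submit(self, group, request):
        """Record the request centrally, then pass it to the chosen group."""
        if group not in self._groups:
            raise KeyError(f"no support group named {group!r}")
        ticket_id = len(self.tickets) + 1
        self.tickets.append((ticket_id, group, request))
        return self._groups[group](ticket_id, request)


def interfaces_needed(n_groups):
    """Interfaces to build with a central hub versus direct pairwise links."""
    via_hub = n_groups
    pairwise = n_groups * (n_groups - 1) // 2
    return via_hub, pairwise
```

With six support groups, a hub needs six interfaces where direct links between every pair of groups would need fifteen, which is the practical argument for routing everything through one central application.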

Page 38:

• More about applications and communities in the next talk

Page 39:

• EGEE - what is it and why is it needed?
• Middleware – current and future
• Operations – providing a stable service
• Networking – enabling collaboration
  – Current application communities
  – Enabling new and effective use of EGEE
• Summary

Page 40:

Who else can benefit from EGEE?

• EGEE Generic Applications Advisory Panel:

– For new applications

• EU projects: MammoGrid, Diligent, SEE-GRID …

• Expression of interest: Planck/Gaia (astroparticle), SimDat (drug discovery)

Page 41:

Intellectual Property

• The existing EGEE grid middleware (LCG-2) is distributed under an Open Source License developed by EU DataGrid
  – Derived from modified BSD - no restriction on usage (academic or commercial) beyond acknowledgement
  – Same approach for new middleware (gLite)
• Application software maintains its own licensing scheme
  – Sites must obtain appropriate licenses before installation

Page 42:

Summary

• EGEE is the first attempt to build a worldwide Grid infrastructure for data intensive applications from many scientific domains

• A large-scale production grid service is already deployed and being used for HEP and BioMed applications with new applications being ported

• Resources & user groups will rapidly expand during the project

• A process is in place for migrating new applications to the EGEE infrastructure

• A training programme has started with events already held

• Prototype “next generation” middleware is being tested (gLite)

• Plans for a follow-on project are being discussed

Page 43:

When will the grid disappear?

• Two possibilities:
  1. Grids will not fulfill their promise and fade into being a niche distributed computing domain
  2. Grids will become ubiquitous and easily usable – transparent to the user and so ‘disappear’
     – Following the trajectory of other networked services