EIROforum IT Working Group, 24 November 2015
This document is produced by Members of the EIROforum (http://www.eiroforum.org/) and is licensed under the Creative Commons CC‐BY 4.0 licence.
A European Open Science Cloud

Abstract

This document outlines the position of EIROforum on a European Open Science Cloud. It explores the essential characteristics of a European Open Science Cloud if it is to address the big data needs of the latest generation of Research Infrastructures. The high‐level architecture and key services as well as the role of standards are described. A governance and financial model is also described, together with the roles of the stakeholders, including commercial service providers and downstream business sectors, that will ensure a European Open Science Cloud can innovate, grow and be sustained beyond the current project cycles.

About the EIROforum

EIROforum partners are intergovernmental research organisations – CERN, ESA, EMBL, ESO, EuroFusion, European XFEL, ILL and ESRF – covering disciplines ranging from particle physics, space science and biology to fusion research, astronomy, and neutron and photon sciences. The partner organisations have a truly European governance, funding and remit, and in many cases share a global engagement. They are world leaders in basic research, as well as in managing and operating large research infrastructures and facilities. The EIROforum collaboration is helping European science reach its full potential through exploiting its unparalleled resources, facilities and expertise. By combining international facilities and human resources, EIROforum exceeds the research potential of the individual organisations, achieving world‐class scientific and technological excellence in interdisciplinary fields. EIROforum works closely with industry to foster innovation and to stimulate the transfer of technology.
Prepared by CERN IT department on behalf of the EIROforum IT Working Group.
Executive Summary

EIROforum members and other Research Infrastructure operators face unsustainable demand for computing and networking services to deliver the promise of Open Science. They need more cost‐effective approaches to collecting, processing, distributing and re‐using the rapidly growing amounts of data being produced by their instruments.

This will require innovative ways of providing the integrated IT infrastructure and operations expertise needed to run applications.

Currently in‐house resources, public e‐infrastructure and commercial cloud services are not integrated to provide a seamless environment for data‐intensive science. Existing services do not cover the full lifecycle of research from proposal submissions requesting access to Research Infrastructures, through to data acquisition, sharing and publication. Researchers are by‐passing their in‐house IT departments and publicly funded e‐Infrastructures to make use of commercial cloud services that offer innovative, easy‐to‐use solutions and fill the service gaps. This shadow IT innovation represents an opportunity to introduce change but must be undertaken with full knowledge of the policy aspects including data protection, intellectual property rights and applicable legislation.

A European Open Science Cloud has the potential to provide the means to link such services together and increase scientific output.

The Helix Nebula initiative (HNI) has brought together more than 40 service providers, research organisations, data providers and publicly funded e‐infrastructures. It has developed a hybrid cloud model with procurement and governance components suitable for the dynamic cloud market. A Pre‐Commercial Procurement (PCP) is being negotiated to build a new form of IT as a Service (IaaS) platform using open source solutions in a federated Science Cloud.

Procuring cloud services from providers on a pay‐per‐usage model on the operations budget rather than the capital budget offers both flexibility and scalability.

E‐infrastructure costs will become an integral part of the cost of doing science and, consequently, must be cost‐justified in terms of benefits and impact. Moving to the cloud can enable more flexible pricing models such as per core/hour or per request/transaction, or migration to Open Source Software (OSS) to control growing software licensing costs. Most publicly funded research organisations lack detailed cost models, which inhibits financial comparisons between traditional and cloud‐based solutions.

RIs need to understand the benefits as well as the full costs of ‘big data’ services and be able to manage their own procurements in a competitive marketplace, migrate use cases and existing infrastructures to the cloud paradigm, and adopt an appropriate collaborative governance model.

Services will be provisioned from commercial suppliers when they are not available in‐house or can be delivered externally on better terms (e.g. at shorter notice, lower cost or better performance). Publicly funded data centres will continue to guarantee long‐term data preservation and service supplier independence. A market assessment of the public research sector and downstream business sectors that could build on the data produced by Research Infrastructures is needed to build confidence in the business model and justify investments in a European Open Science Cloud by the supply‐side.
A significant difference compared to the current model is that funding agencies and research organisations will no longer provision services exclusively from their own in‐house resources.

Stakeholders in the public and commercial sectors must not only invest in the building blocks for the development of e‐Infrastructure listed in Table 1, but also in end‐user facing services and in training the next generation of IT‐savvy researchers. This will leverage the investments already made in the publicly funded e‐infrastructures and commercial cloud services.

All stakeholder groups need to work together to ensure wide adoption of competitive, secure, reliable and integrated computing services.

Many research organisations that operate research infrastructures do not have the mandate to provide IT services to their users for the management and processing of their experimental data and will require assistance to bridge the gap from data to knowledge acquisition. The guiding principle is that funding from stakeholders like the EC and national funding agencies will be focused on innovation of services and uptake by new user communities and business actors while the operational costs will be borne by the operating organisations and the user communities.

The funding model for a European Open Science Cloud must be designed so that the services can be sustained by their operating organisations.

The EC’s INFRASTRUCTURES 2016‐2017 work programme foresees new e‐Infrastructure for data and distributed computing and a pilot for the federation, networking and coordination of pan‐European research infrastructures and clouds in general. Looking further ahead, the EC has taken steps to ensure funding for GÉANT over the full duration of H2020 by introducing ‘Framework Partnership Agreements’ (FPA). The FPA model represents a more long‐term engagement that could encourage the integration of e‐infrastructures co‐funded via EC projects into the Research Infrastructures’ computing models. The application of the FPA approach to a European Open Science Cloud could establish the basis for the European Research Area’s digital commons and lead towards Science 2.0.

A European Open Science Cloud represents a strategic vision that can be a vector for introducing change in the service provisioning and computing models for the publicly funded research sector in the medium to long term.

A European Open Science Cloud has the potential to greatly improve the provisioning of IT services for Research Infrastructures to address their big data needs. It can encompass all the phases of the research lifecycle and offer a platform of joint innovation for the public and private sectors. It will significantly change the way IT services are procured, organised and funded. The key challenges are integrating frequently changing technologies, managing the complexity and identifying the optimal organisational and financial models. Researchers must be convinced that they will not lose control of their precious data. It is an ambitious undertaking requiring the active engagement of many stakeholders and careful planning of the technical, financial, legal and governance aspects. For it to succeed it must become a priority for all the actors involved, with monitoring by the funding agencies and regular assessment by the user communities.

This position paper is a rallying call for adoption of such a strategic approach within the EC and other funding bodies, working with the operators of Research Infrastructures.
Table 1 – major stakeholder groups
National funding agencies
Policy makers
Third sector
Granting bodies
European Commission
DG CONNECT
DG RTD
Research communities
Thought leaders
Peers
Scholarly publishers
Research Infrastructures
Policy‐makers
Operational staff
Data users
Public e‐infrastructures
Service providers
Host organisations
Technology providers
Commercial cloud service providers
Independent Software Vendors
Open Source developer communities
Standards bodies
Table 2 ‐ relevant EC co‐funded projects
AARC https://aarc-project.eu
Cloud for Europe http://www.cloudforeurope.eu/
EGI https://wiki.egi.eu/wiki/Main_Page/
EUDAT http://www.eudat.eu
GÉANT http://www.geant.net/
Helix Nebula http://www.helix-nebula.eu
Indigo Datacloud https://www.indigo-datacloud.eu/
OpenAIRE https://www.openaire.eu
PICSE http://www.picse.eu/
PRACE http://www.prace-ri.eu/
SLALOM http://www.slalom-project.eu/
Contents

Executive Summary
Contents
Overview
  Sustainability in a world experiencing the data tsunami
  Mind the Service Gap
  Need for a new way of procuring ICT services
  Open Science requires an integrated approach
  Hybrid cloud‐based solutions
Background
  Progress to date
  Pre‐Commercial Procurement
  Challenges facing Research Infrastructure operators
  Benefits of a hybrid approach for scalability
Commercial considerations
  Supply‐side
  Demand‐side
  Procurement
  The role of standards
Implementation: Scope
  Federated Approach
  Support services
Implementation: Connectivity
  Transport of huge amounts of data and the lack of high‐performance links
  Identity management
Implementation: Open Data
  Providing broader access to community‐specific solutions
  Data preservation
  Reproducibility of research
Governance
Investment
  Proprietary solutions are not solutions
  Public investment
  Investment in skills
  Long‐term strategic investment
Conclusions
Overview

- Sustainability in a world experiencing the data tsunami
- Mind the Service Gap
- Need for a new way of procuring ICT services
- Open Science requires an integrated approach
- Hybrid cloud‐based solutions
Sustainability in a world experiencing the data tsunami

Traditional ways of meeting the growing demand for computing and networking services capable of addressing the ‘Data Tsunami’ [1] are seen to be unsustainable by funding agencies as well as the infrastructure operators such as GÉANT and EGI. The cost of collecting, processing, distributing and re‐using the rapidly growing amounts of data produced by their instruments is a major concern for Research Infrastructure operators including the EIROforum members. A collaborative shift towards more cost‐effective ways of generating and using scientific data and a greater role for the users of that data is required in order to develop a sustainable future for the evolution of Open Science.
Mind the Service Gap

Over the last decade, driven by sustained funding from the EC, the e‐Infrastructure landscape across Europe has grown from regional prototypes to a set of pan‐European production resources including EGI, GÉANT and PRACE. This has resulted in a number of services within the context of each project, but there is no common, overarching goal and so user communities must invest significant effort to bring these services together.

Currently in‐house resources, public e‐infrastructure and commercial cloud services are not integrated to provide a seamless environment for data‐intensive science. Existing services do not cover the full lifecycle of research from proposal submissions requesting access to Research Infrastructures, through to data acquisition, sharing and publication. Researchers are by‐passing their in‐house IT departments and publicly funded e‐Infrastructures to make use of commercial cloud services that offer innovative, easy‐to‐use solutions to fill in the service gaps. This shadow IT innovation represents an opportunity to introduce change but must be undertaken with full knowledge of the policy aspects including data protection, intellectual property rights and applicable legislation.

A European Open Science Cloud has the potential to provide the means to link such services together and increase scientific output.
Need for a new way of procuring ICT services

Public research organisations have to find alternatives to the traditional route of purchasing and operating in‐house IT equipment which requires capital investment on the physical infrastructure (servers, network, storage) needed to run an application as well as operations expertise. Cloud computing has the potential to reduce IT expenditure while at the same time improving the scope for innovative and flexible high‐quality services. Procuring external cloud services from providers on a pay‐per‐usage model implies that infrastructure is no longer ‘institutionalised’ and the cost of cloud services can be found on the operations budget rather than the capital budget. There is ‘elasticity’ in cloud‐based services and cloud‐based infrastructure is inherently scalable.

[1] http://cordis.europa.eu/fp7/ict/e-infrastructure/docs/hlg-sdi-report.pdf
Open Science requires an integrated approach

‘Open Science’ is still in its infancy, driven predominantly by the availability of enabling technologies and the opportunities for new ways of working rather than by demand from society at large, according to a recent consultation [2]. Lack of integration of the existing infrastructures (and, by inference, access to the data they carry) was seen to be a barrier to adoption of those technologies and working practices by 86% of the individual scientists who responded to the survey.
Hybrid cloud‐based solutions

The Cloud for Europe project [3] has shown that uptake of cloud services by European Public Administrations is still very fragmented in terms of demand and procurement of IT services. The Helix Nebula [4] initiative, however, has demonstrated the potential of a hybrid model in which service providers, research organisations, data providers and publicly funded e‐infrastructures are brought together. Building on that potential will allow us to support and transform publicly funded research into data driven knowledge which is of value to the wider research community and downstream industries.

Helix Nebula has already brought innovation to the relationship between suppliers and users and introduced a wider range of new players to the marketplace. This provides a platform onto which a European Open Science Cloud initiative [5] will add a further much‐needed dose of innovation and accountability in the way technology is procured and deployed.

The goal of this position paper is to allow the EIROforum members to articulate their own expectations of the initiative by helping them to understand the new European Open Science Cloud and the way that it addresses the needs of the Infrastructure operators and users.
Background

- Progress to date
- Pre‐Commercial Procurement
- Challenges facing Research Infrastructure operators
- Benefits of a hybrid approach for scalability
Progress to date

Milestones on the journey initiated by Helix Nebula have included:

- Creation of a vibrant public‐private partnership of more than 40 organisations and companies.
- Development and continued monitoring of a strategic plan for cloud computing in the public research sector [6].
- Identification and evaluation (through testing and production use) of a hybrid cloud model meeting the needs of publicly funded research by linking commercial cloud services with e‐infrastructures [7].
- Validation of inclusive procurement models that address many examples of key procurement barriers, with a wizard tool allowing public organisations to analyse their procurement processes and determine a suitable procurement model for cloud services when their existing models are not a good match for the dynamic cloud market [8].
- An inclusive, transparent and user driven governance structure capable of delivering on the initiative’s objectives.

[2] Public consultation on "Science 2.0: Science in transition"
[3] http://www.cloudforeurope.eu/downloads
[4] http://www.helix-nebula.eu/helix-nebula-vision
[5] http://dx.doi.org/10.5281/zenodo.16001
[6] http://www.helix-nebula.eu/publications/deliverables/d92-strategic-plan-scientific-cloud-computing-infrastructure-europe-three
Hybrid clouds combine private infrastructure and operations with shared infrastructure and operations. A typical hybrid cloud use case would be the relocation of the presentation tier (user interface) and the logic tier, where the application knowledge is encapsulated, to an off‐site cloud, with both tiers communicating with the database stored and managed within the organisation’s own IT infrastructure.

In order for the demand‐side users to be encouraged to purchase cloud computing services, the services offered must be economically advantageous compared to other means of procuring IT services.
Pre‐Commercial Procurement

Promotion of joint procurement has led to the creation of an expanding procurement network of publicly funded research organisations and establishment of a new Pre‐Commercial Procurement (PCP), the Helix Nebula Science Cloud (HNSciCloud).

HNSciCloud is designed to pull together publicly‐funded e‐Infrastructures using open source solutions, to build a hybrid Infrastructure as a Service (IaaS) platform. It will host a competitive marketplace of European cloud players where they can develop their own services for a wider range of users beyond research and science, including downstream business sectors that can make use of publicly funded research data.

The goal is to establish a sustainable European Open Science Cloud serving Europe’s Research Infrastructures, communities and related business sectors and surpassing the capacity currently available via existing public e‐infrastructures and the in‐house facilities of research organisations.

It will be based on the migration of Infrastructure as a Service into the more general IT as a Service consisting of software tools and applications and the platforms on which they run. Services will be provisioned from commercial suppliers when they are not available in‐house or can be delivered externally on better terms (e.g. at shorter notice, lower cost or better performance). Publicly funded data centres will continue to guarantee long‐term data preservation and commercial service supplier independence.
Challenges facing Research Infrastructure operators

HNSciCloud will enable the federation, networking and coordination of existing Research Infrastructures and scientific clouds in preparation for what the 2016 INFRASTRUCTURES Work Programme calls the “European Open Science Cloud for Research”. It brings Europe’s technical development, policy and procurement activities together to remove fragmentation and support Research Infrastructure operators facing three key challenges:

- Empowering them to understand the benefits as well as the full costs of ‘big data’ services and manage their own procurements in a competitive marketplace
- Migrating use cases and existing infrastructures to the cloud paradigm
- Selecting an appropriate collaborative governance model that avoids the barriers that currently inhibit a ‘joined up’ way of working by involving the research user community, the research infrastructures and the research funding bodies.

[7] http://www.helix-nebula.eu/publications/deliverables/d62-roadmap-the-integration-and-interoperation-of-commercial-cloud-e
[8] http://www.picse.eu/publications/deliverables/d-21-research-procurement-model
We expect the scale and range of services being provisioned from commercial suppliers to gradually increase over time as the cloud market matures and Open Science becomes embedded in the research lifecycle. A significant difference compared to the current model is that funding agencies and research organisations will no longer provision services exclusively from their own in‐house resources.

In an answer to a written question in the European Parliament about the current position regarding procurement of the European Science Cloud, Commissioner Oettinger stated that: “The Commission has supported pathfinding studies on the use of hybrid models, bringing together public research organisations and e‐infrastructures with commercial suppliers to build a common platform offering a range of services to research communities. This can be achieved by building on cloud technologies easily accessible to users and by promoting procurement of cloud services to encourage innovation on the supply side.” The role of Helix Nebula and the HNSciCloud in shaping that position is clear.
Benefits of a hybrid approach for scalability

If there is significant variation in demand, there may be an opportunity to reduce operating expenditure by matching the supply of resources to the level of demand. By employing a hybrid cloud model, an organisation can quickly and economically add resources as needed by bursting out of its private IT infrastructure to commercial cloud processing and storage capacity. A cloud‐bursting scenario can provide the benefits of cost savings, maximum utilisation of on‐premises resources and rapid innovation, but also has its own set of challenges in ensuring the performance, agility, security and management aspects of a hybrid cloud infrastructure.

By intermixing private and public cloud infrastructures, organisations are able to use the hybrid model to leverage in‐house and off‐site resources. The hybrid model allows organisations to rely on the cost‐effective commercial cloud for non‐sensitive operations and on the private cloud for critical, particularly sensitive operations. This provides enhanced agility to move applications easily between the in‐house and off‐site resources, taking into account aspects of policy, cost, security and availability.
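The cloud‐bursting scenario described above can be illustrated with a minimal sketch. The demand figures, the in‐house capacity and the price per core‐hour are hypothetical values chosen only for illustration, and the function names are not taken from any Helix Nebula tool.

```python
def burst_plan(hourly_demand, in_house_cores):
    """Split each hour's demand (in cores) into an in-house part and a cloud overflow."""
    plan = []
    for demand in hourly_demand:
        local = min(demand, in_house_cores)       # served on-premises
        burst = max(demand - in_house_cores, 0)   # overflow sent to the commercial cloud
        plan.append((local, burst))
    return plan


def burst_core_hours(plan):
    """Total core-hours bought from the commercial cloud."""
    return sum(burst for _, burst in plan)


def in_house_utilisation(plan, in_house_cores):
    """Fraction of the on-premises capacity that is actually used."""
    return sum(local for local, _ in plan) / (in_house_cores * len(plan))


# Hypothetical workload: cores requested in five consecutive hours.
demand = [400, 450, 1200, 900, 300]
capacity = 500                # hypothetical number of in-house cores
price_per_core_hour = 0.05    # hypothetical price, arbitrary currency

plan = burst_plan(demand, capacity)
cloud_hours = burst_core_hours(plan)
print("cloud core-hours:", cloud_hours)
print("cloud cost:", round(cloud_hours * price_per_core_hour, 2))
print("in-house utilisation:", round(in_house_utilisation(plan, capacity), 2))
```

Sizing the in‐house capacity for the typical load rather than the peak keeps on‐premises utilisation high, while the overflow is paid for only in the hours when it occurs. The sketch ignores data transfer, security and policy constraints, which the text identifies as real challenges of a hybrid infrastructure.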
Commercial considerations

- Supply‐side
- Demand‐side
- Procurement
- The role of standards
Supply‐side

One important consideration is that this approach must generate benefit for the providers, who have the responsibility of ensuring that they have the physical infrastructure to meet their users’ demand and that their performance meets agreed service quality levels. Without an accurate view of future demand, planning for variable costs such as staff, replacement servers or coolers, and electricity supplies can all be very difficult, and optimising the distribution of virtual machines presents a major challenge. The more unpredictable and spiky the workloads, the greater the economic benefit of sharing the same services across diverse research communities in the public and private sectors. Analysis of the procurements made via Helix Nebula suggests there is insufficient installed capacity currently available in the European market to satisfy the exceptional demand that will be generated by the latest generation of research infrastructures. Significant investments by the supply‐side, based on accurate predictions of future usage, will be necessary. Consequently it is important that a market assessment of the public research sector and downstream business sectors that could build on the data produced by Research Infrastructures is performed (similar to that performed by the UK government for public sector information [9]) in order to build confidence in the business model and justify investments in a European Open Science Cloud by the supply‐side.

There are also licensing implications when transitioning from a scale‐up architecture to a scale‐out architecture: some applications are licensed per‐instance or per‐CPU, often over an annual term. In this instance, there can be significant cost implications of adding new instances to a pool of resources. In time, application vendors will follow infrastructure service providers in moving to more flexible pricing models such as per core/hour or per request/transaction. The alternative is to use Open Source Software (OSS) where the license cost issue is non‐existent.
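The licensing point can be made concrete with a small sketch comparing an annual per‐instance licence with per core/hour pricing for a scale‐out pool that occasionally bursts. All fees, prices and pool sizes below are hypothetical, and the two pricing functions are simplifications rather than any vendor's actual terms.

```python
HOURS_PER_YEAR = 8760


def annual_per_instance_cost(peak_instances, fee_per_instance):
    """Annual licences must cover the peak pool size, even if most instances sit idle."""
    return peak_instances * fee_per_instance


def licensed_core_hours(baseline_instances, burst_instances, burst_hours, cores_per_instance):
    """Core-hours actually consumed: a year-round baseline plus a short burst."""
    instance_hours = baseline_instances * HOURS_PER_YEAR + burst_instances * burst_hours
    return instance_hours * cores_per_instance


def usage_based_cost(core_hours, price_per_core_hour):
    """Pay only for the core-hours consumed."""
    return core_hours * price_per_core_hour


# Hypothetical pool: 10 instances all year, 40 extra instances for 200 hours.
baseline, burst, burst_hours, cores = 10, 40, 200, 4
fee_per_instance = 1000       # hypothetical annual fee
price_per_core_hour = 0.02    # hypothetical usage price

per_instance = annual_per_instance_cost(baseline + burst, fee_per_instance)
core_hours = licensed_core_hours(baseline, burst, burst_hours, cores)
per_usage = usage_based_cost(core_hours, price_per_core_hour)
print("per-instance licensing:", per_instance)
print("per core/hour licensing:", round(per_usage, 2))
```

Under the per‐instance model the 40 burst instances are licensed for the whole year although they run for only 200 hours, which is the cost effect the paragraph above describes. Which model is cheaper in practice depends entirely on the actual fees and usage pattern.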
Demand-side

As identified in the GEANT Expert Group report [10], the user communities will increasingly be called upon to pay for the services they receive if e-infrastructures on which users can depend are to continue to survive. E-infrastructure costs will be an integral part of the cost of doing science and, consequently, e-infrastructure investments must make a substantial and sustainable impact in order to be justified in terms of costs and benefits.

A study of the cost effectiveness of European dedicated HTC and HPC computing e-infrastructures for research compared to equivalent commercial leased or on-demand offerings was performed by the eFISCAL project [11] in 2011. The conclusion was that the ratio of CAPEX (CAPital EXpenditure) to OPEX (OPerational EXpenditure) for e-infrastructures was 30% to 70% and manpower accounted for approximately 50% of the costs (CAPEX+OPEX). A Total Cost of Ownership (TCO) study [12] was performed by SAP Research on specific CERN in-house services within the context of the Helix Nebula FP7 project.

Both of these studies indicated that most publicly funded research organisations lack detailed cost models for individual services. Financial comparisons between traditional and cloud-based solutions would need a set of guidelines for such organisations proposing which category of costs should be included or excluded. It is important to recognise that shifting the procurement of IT services to a pay-per-usage model will normally have a limited impact on TCO since the bulk of expenditure over the lifetime of an application is not related to the purchase of physical infrastructure. It is also the case that not all publicly funded research centres are in a position to make accurate estimations of the TCO of in-house IT services since some contributing costs are borne by different departments.

The adoption of cloud computing services by public research organisations requires additional justification in terms of the benefits of the new ways of working that cloud-based services enable. Research organisations justify their investments in IT services by the impact made on the end-user communities in terms of scientific output. To gauge this impact it is necessary to understand the needs and activities of the end-users. Factors such as pattern of demand and transitional costs need to be included in any financial analysis of a potential cloud computing solution.

[9] Market Assessment of Public Sector Information, May 2013, https://www.gov.uk/government/uploads/system/uploads/attachment_data/file/198905/bis-13-743-market-assessment-of-public-sector-information.pdf
[10] http://cordis.europa.eu/fp7/ict/e-infrastructure/docs/geg-report.pdf
[11] http://www.efiscal.eu/files/deliverables/D2%203%20Executive%20Summary%20-%20Computing%20e-Infrastructure%20cost%20calculations%20and-business%20_models_vam1-final.pdf
[12] http://www.helix-nebula.eu/publications/deliverables/d73-costing-exercise-comparing-in-house-vs-cloud-based-operation-the-cern

A European Open Science Cloud will need to perform IT capacity planning for all engaged research communities on a regular basis. As an example, the WLCG project has a Computing Resources Scrutiny Group [13] which reviews the computing resources for the LHC experiments on an annual basis.
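As a rough illustration of the kind of cost model the studies found lacking, the eFISCAL proportions quoted in this section (about 30% CAPEX, 70% OPEX, with manpower near 50% of the total) can be applied to an annual budget and compared with a pay-per-use price. This is only a sketch under stated assumptions: the function names, the budget, the core count, the utilisation and the cloud price are all hypothetical, and only the three proportions come from the text.

```python
from dataclasses import dataclass

HOURS_PER_YEAR = 24 * 365


@dataclass
class CostBreakdown:
    capex: float     # capital expenditure share of the total
    opex: float      # operational expenditure share of the total
    manpower: float  # part of CAPEX+OPEX spent on staff, not an extra cost


def breakdown(total_annual_cost: float) -> CostBreakdown:
    """Split a total using the proportions reported by eFISCAL."""
    return CostBreakdown(
        capex=0.30 * total_annual_cost,
        opex=0.70 * total_annual_cost,
        manpower=0.50 * total_annual_cost,
    )


def in_house_cost_per_core_hour(total_annual_cost: float,
                                cores: int,
                                utilisation: float) -> float:
    """Cost of each core-hour actually used; idle capacity still has to be paid for."""
    if not 0 < utilisation <= 1:
        raise ValueError("utilisation must be in the interval (0, 1]")
    used_core_hours = cores * HOURS_PER_YEAR * utilisation
    return total_annual_cost / used_core_hours


def cheaper_option(in_house_price: float, cloud_price: float) -> str:
    """Name the cheaper option on price alone (transition costs are ignored)."""
    return "in-house" if in_house_price <= cloud_price else "cloud"


# Hypothetical centre: 1 MEUR per year, 2000 cores, 80% average utilisation.
price = in_house_cost_per_core_hour(1_000_000.0, 2000, 0.8)
print(breakdown(1_000_000.0), round(price, 4), cheaper_option(price, 0.05))
```

Lowering the `utilisation` argument shows why the pattern of demand matters: the same budget spread over fewer used core-hours raises the effective in-house price.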
Procurement

The EC-funded ‘Procurement Innovation for Cloud Services in Europe’ (PICSE [14]) project is identifying barriers to procurement of cloud services by public research organisations and is developing a new procurement model to overcome them. With the advent of cloud computing, the delivery of ICT services is going through a fundamental change. However, while cloud technology service options continue to evolve, procurement processes and policies of public research organisations have remained firmly rooted in historical practices that are no longer effective. In order for public research organisations of all sizes to take advantage of the best the cloud market has to offer, a more flexible and agile procurement model must be identified and implemented.

PICSE has contacted a number of public sector organisations and initiatives (including CERN [15], Cloud for Europe [16], DG DIGIT [17], ECMWF [18], EMBL [19], ESA [20], ESRF [21], Europeana [22], GRNET [23] and Umeå University [24]) to discuss their current practices. The main challenges identified that need to be addressed in the procurement of cloud services can be summarised as follows:

• As with all purchases of new technologies, procuring innovative services requires new skills and competences.
• Organisational/cultural barriers to cloud adoption are very important, especially when the organisation is purchasing cloud for the first time.
• Financial issues may arise due to the new way to evaluate costs in moving to the cloud.
• Legal/organisational issues may be encountered due to the particularities of cloud service deployments, e.g. applicable law, data location restrictions, data protection, etc.
• Security, including network security, data protection, privacy, data and service portability, and interoperability are all elements to be considered when identifying the cloud solutions to purchase.
• Vendor lock-in (dependency on the vendor) and vendor viability are aspects that have to be considered.
[13] http://wlcg.web.cern.ch/collaboration/management/computing-resources-scrutiny-group
[14] http://www.picse.eu/
[15] http://home.web.cern.ch/
[16] http://www.cloudforeurope.eu
[17] http://ec.europa.eu/dgs/informatics/identity_en.htm
[18] http://www.ecmwf.int/
[19] http://www.embl.de/
[20] http://www.esa.int/ESA
[21] http://www.esrf.eu/
[22] http://www.europeana.eu/
[23] https://www.grnet.gr/
[24] http://www.umu.se/
• Dynamic and changing cloud services must be monitored to ensure proper performance and benefit realisation. Service level agreements (SLAs) must be drafted and managed diligently, an area where the EU SLALOM project has begun working [25].
• Vendor contract negotiation is complicated and critical. There are no standard contracts for cloud. The SLALOM project is finalising a cloud service contract template with equitable terms and conditions for suppliers and customers.
• Contract termination conditions need to be carefully evaluated. Porting data to another cloud or non-cloud solution may involve high costs. Cloud escrow is also missing.
These challenges have an impact on all the steps of the procurement process. There is a clear impact on skills and knowledge required. IT managers within public research organisations should have a clear understanding of the new technology being purchased.

Functionally similar to financial market brokers, cloud brokers match provider supply with consumer demand. This model benefits all parties: experiencing more predictable demand, cloud providers can better optimize their workflow to minimize costs; cloud users access cheaper rates offered by brokers; and cloud brokers generate profit from charging fees. Including such brokerage models in a European Open Science Cloud could reduce the risks that arise from market instability. The adoption of a hybrid cloud model will also help to reduce the impact of market instabilities on a European Open Science Cloud.
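The brokerage idea can be sketched as a simple matching step: a broker holds offers from several providers and places each request with the cheapest offer that still has enough free capacity. The sketch is illustrative only; the `Offer` and `Broker` names, the flat percentage fee, the provider names and the prices are hypothetical and do not describe any real brokerage service.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Offer:
    provider: str
    price_per_core_hour: float
    free_cores: int


class Broker:
    """Toy broker: matches demand to the cheapest offer with spare capacity."""

    def __init__(self, offers: List[Offer], fee_rate: float = 0.05):
        self.offers = offers
        self.fee_rate = fee_rate  # hypothetical fee charged on top of the price

    def place(self, cores: int, hours: float) -> Optional[Tuple[str, float]]:
        """Return (provider, total price including fee), or None if nothing fits."""
        candidates = [o for o in self.offers if o.free_cores >= cores]
        if not candidates:
            return None
        best = min(candidates, key=lambda o: o.price_per_core_hour)
        best.free_cores -= cores  # reserve the capacity with that provider
        base_cost = cores * hours * best.price_per_core_hour
        return best.provider, base_cost * (1 + self.fee_rate)


broker = Broker([
    Offer("provider-a", price_per_core_hour=0.05, free_cores=100),
    Offer("provider-b", price_per_core_hour=0.04, free_cores=10),
])
print(broker.place(cores=8, hours=10))  # cheapest offer with room is chosen
print(broker.place(cores=8, hours=10))  # falls back once that offer is full
```

Because the broker can fall back to another provider when one offer is exhausted or withdrawn, the user is partly shielded from changes in any single supplier.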
The role of standards

Standards improve transparency and comparability for service users. They open up new markets for suppliers and offer equal access conditions, particularly for small and medium-sized companies. Standards also improve the quality, security and sustainability of products and services, and adoption of suitably defined standards exposes the supplier’s unique selling propositions. Open standards can be adopted to provide interoperability between parts of the infrastructure, portability from one cloud service provider to another and trust in the integrity (provenance, reliability, etc.) of the infrastructure that has been built.

Emerging cloud standards for application orchestration provide template-driven descriptions of applications as a transparent way of abstracting the relationships between cloud applications and services and the underlying platform or infrastructure. One example of this is TOSCA (Topology and Orchestration Specification for Cloud Applications) from OASIS [26], selected by the Horizon 2020 EC co-funded INDIGO-DataCloud [27] project. This gives suppliers and users interoperable descriptions of cloud-hosted services and applications, including their components, relationships, dependencies, requirements, and capabilities. TOSCA has the potential to expand customer choice, improve reliability, and reduce cost and time-to-value, facilitating the agile, continuous delivery of applications (DevOps) across their entire lifecycle.

Portability is another significant property since prospective users want to avoid vendor lock-in when they choose to use cloud services. Users need to know that they can move their data and applications between multiple cloud service providers at low cost and with minimal disruption. Portability through the appropriate standardisation of APIs, data models, data formats and vocabularies will help automate business processes surrounding cloud computing procurement, enable straightforward technical integration between the client and provider, and allow for flexible and dynamic application deployments across multiple clouds.
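To make the idea of a template-driven application description more concrete, the sketch below models an application as components with declared dependencies and derives a deployment order from them. It uses a plain Python dictionary purely for illustration and is not TOSCA syntax; the component names and the `deployment_order` helper are hypothetical.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical provider-neutral description: each component lists what it requires.
template = {
    "database": {"requires": []},
    "storage": {"requires": []},
    "analysis_service": {"requires": ["database", "storage"]},
    "web_portal": {"requires": ["analysis_service"]},
}


def deployment_order(description: dict) -> list:
    """Return components ordered so that every dependency is deployed first.

    graphlib raises CycleError if the declared dependencies form a cycle.
    """
    graph = {name: set(spec["requires"]) for name, spec in description.items()}
    return list(TopologicalSorter(graph).static_order())


print(deployment_order(template))
```

Since the description says nothing about a particular provider, the same ordering could in principle be handed to any orchestrator that understands the format, which is the portability argument made in this section.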
[25] http://www.slalom-project.eu/
[26] https://www.oasis-open.org/committees/tc_home.php?wg_abbrev=tosca
[27] https://www.indigo-datacloud.eu/
Trust and confidence in cloud computing services rely on work such as ENISA’s CIIP [28] (Critical Information Infrastructure Protection) initiative which defines appropriate strategies, policies and specific measures for protecting information on the cloud. The underlying cause of many of the risks and challenges associated with cloud computing is that the user passes over responsibility for data and for applications to the cloud service provider and the provider has a multi-tenant environment in which resources are shared. In addition to the many economic and technological advantages that cloud computing offers to research communities, there are also significant security benefits in migrating applications and usage to the cloud, as noted by ENISA. The shared resources available in clouds also potentially include rare expertise, shared best practices and advanced security technologies, beyond the means or abilities of the vast majority of SMEs, many larger companies and even many government bodies, to provide for their in-house systems.

A truly interoperable cloud will encourage adoption by users, safe in the knowledge that they can change providers, or use multiple providers, without significant technical challenges or effort. This will expand the size of markets in which cloud providers operate.
Implementation: Scope
• Federated approach
• Support services
Federated Approach

A European Open Science Cloud should offer an initial portfolio of services corresponding to the list of e-Infrastructure services documented by eIRG in its blue paper of 2010 [29] with the technical characteristics identified by the High Level Expert Group on Scientific Data in their “Riding the Wave” report from the same year [30].

Implementations for the majority of the foreseen services already exist at varying levels of maturity. The key challenges are integrating frequently changing technologies, managing the complexity and identifying the optimal organisational and financial models. Researchers must be convinced that they will not lose control of their precious data. The data centres operated by public research organisations can provide such guarantees. They can rapidly expand the available capacity by making use of commercial service providers offering commodity compute and data services as part of the hybrid cloud model. By keeping a “safe copy” of the research data, the public research organisations can also insulate the researcher communities from changes in service provider and technology.

A European Open Science Cloud should take a bottom-up approach to implementation, starting with IaaS. Integration should start with a common catalogue of services and a federated identity management system offering a single sign-on facility to access services across all suppliers. Starting bottom-up is essential to get the core technical, financial, and policy principles right. IaaS can be introduced without impacting higher-level user-facing services that will require a significant software investment. It also represents a strategy with lower risk because the IaaS market is more mature than the PaaS and SaaS markets.
[28] https://www.enisa.europa.eu/activities/Resilience-and-CIIP
[29] http://e-irg.eu/documents/10920/238805/e-irg_blue_paper_2010
[30] http://cordis.europa.eu/fp7/ict/e-infrastructure/docs/hlg-sdi-report.pdf
The services of a European Open Science Cloud will need to be integrated with a range of resources currently operated by public organisations to form a hybrid cloud solution.

Realisation of the benefits of a hybrid cloud is inhibited by many barriers related to procurement, trustworthiness, technical standards and legal terms of reference, risk of vendor lock-in and so on. The overall challenge is to overcome these barriers in order to boost productivity by stimulating all stakeholder groups to work together to ensure wide adoption of competitive, secure, reliable and integrated computing services.

In order for a European Open Science Cloud to be deployed rapidly, it is essential to build on the existing infrastructures. This requires an agreed overarching architecture and the commitment of the service operators to make a European Open Science Cloud a priority. There must also be agreement by all the stakeholders on the governance structure and financial model to ensure a European Open Science Cloud can grow, innovate and be sustained.

The EGI Federated Cloud [31] is an example of an inter-disciplinary approach to infrastructure implementation allowing data sharing and collaboration between research communities. It is a grid of academic private clouds and virtualised resources, built around open standards and focusing on the requirements of the scientific community. Technical consistency in the service delivery between participating suppliers is ensured by use of recommended publicly defined interface specifications such as OCCI [32], CDMI [33] and OVF [34].

The experience gathered by EGI in managing its federated infrastructure [35] will be directly relevant and provide insight into making a larger portfolio of capacity style HPC services for data centric applications accessible to its existing user-base. Working with commercial cloud service providers will inject the innovation potential created by the uptake of cloud computing in research and business sectors.

The complementary expertise developed by PRACE and related projects in efficient parallel programming paradigms and optimising software for a range of architectures is also directly relevant to a European Open Science Cloud and application/service developers. The HPC capability services offered by the PRACE centres should be integrated to form part of the overall ecosystem. This will require the PRACE HPC centres to participate in the federated identity management scheme and data sharing services described below.
Support services

Support services will also be required to ensure the operational staff in the public research organisations can resolve end-user support issues as quickly and as efficiently as possible. Similarly, security response services will be necessary to handle incidents that may affect the platform. The publicly operated infrastructures that are part of the hybrid cloud already have user-support and Computer Security Incident Response teams (CSIRTs) in place, but they do not fully interoperate. All cloud services supported by a European Open Science Cloud, whether operated by commercial service providers or public organisations, will need to be integrated into these structures.
[31] https://www.egi.eu/infrastructure/cloud/
[32] http://occi-wg.org/about/specification/
[33] http://www.snia.org/cdmi
[34] http://www.dmtf.org/standards/ovf
[35] https://wiki.egi.eu/wiki/Fedcloud-tf:UserCommunities
Implementation: Connectivity
• Transport of huge amounts of data
• Identity management
Transport of huge amounts of data and the lack of high-performance links

In order for a European Open Science Cloud to operate effectively, it is necessary to ensure there is sufficient network capacity to permit data ingress from the Research Infrastructures.

GÉANT [36] is the high bandwidth pan-European research and education backbone that interconnects National Research and Education Networks (NRENs) across Europe and provides worldwide connectivity through links with other regional networks. The GÉANT network is the primary means of connecting the research organisations and universities to the commercial providers. The Helix Nebula initiative has already demonstrated that it is possible to make practical use of the data centres of commercial cloud service providers over the GÉANT network.

GÉANT Open [37] is a service allowing NRENs and approved commercial organisations to exchange connectivity for the public Research and Education sector with NOC support, SLA monitoring, a defined policy [38] and cost model [39]. Commercial cloud service providers are expected to add the cost of connection and usage of GÉANT Open to the price of the cloud services delivered to the research community. Commercial cloud service providers will also want to offer the same types of cloud services to customers from business sectors and will have to integrate alternative network providers, which will allow the stakeholders to compare the efficiency and cost effectiveness of all the network services provided by the different suppliers.

The opening up of a European Open Science Cloud to users beyond the publicly funded research sector is essential if it is to attract investment from the private sector and support a vibrant innovation cycle. Looking further into the future, a European Open Science Cloud could be more closely linked to the data acquisition and real-time requirements of Research Infrastructures. For example, European XFEL, ILL and ESRF together with Eurofusion sites all require online or rapid feedback in order to prepare for the next experimental run. This implies important increases in network capacity. Similar real-time needs will also be important for the applications of new detector techniques addressed by the ATTRACT consortium [40].
Identity management

eduGAIN [41] is an international inter-federation service interconnecting research and education identity federations. It enables the secure exchange of information related to identity, authentication and authorisation between participating federations. eduGAIN provides an infrastructure for establishing trusted communications between Identity Providers (IdPs) and Service Providers (SPs) in different participating federations. End-users authenticate at IdPs and obtain access to services delivered by SPs.

[36] http://www.geant.net/
[37] http://www.geant.net/Services/ConnectivityServices/Documents/GEANT%20Open%20Service%20Brief.pdf
[38] http://www.geant.net/Services/ConnectivityServices/Documents/GN3PLUS13-1439-12_geant_open_exchange_production_policy_v4_3.pdf
[39] http://www.geant.net/Services/ConnectivityServices/Documents/GEANT%20Open%20Service%20Description.p
[40] http://www.attract-eu.org/
[41] http://services.geant.net/edugain/Pages/Home.aspx

Federated identity management is also gaining traction in business sectors as shown by the rising popularity of Universal 2nd Factor (U2F) as an authentication standard created by the FIDO (Fast IDentity Online) Alliance [42], an industry group established to standardize authentication technology and devices that can simplify and strengthen two-factor authentication for businesses and consumers. So it will be essential for eduGAIN to ensure it can engage with commercial IdPs and SPs to avoid isolating the research and education community. The recently started AARC (Authentication and Authorisation for Research and Collaboration) [43] H2020 project intends to further develop eduGAIN and it is essential that a primary goal of this project should be to ensure eduGAIN can support a European Open Science Cloud in production usage.
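The trust relationship between Identity Providers and Service Providers can be illustrated with a toy model: an IdP issues a signed statement about a user, and an SP accepts it only if the IdP appears in shared federation metadata and the signature checks out. This is a deliberately simplified sketch and not how eduGAIN is implemented. Production federations typically rely on public-key signatures and published metadata, whereas the shared-secret HMAC, the host names and the function names below are assumptions made to keep the example self-contained.

```python
import hashlib
import hmac
import json
from typing import Optional

# Hypothetical inter-federation metadata: the IdPs that participants agree to trust.
# Shared secrets stand in for the public keys a real federation would publish.
FEDERATION_METADATA = {
    "idp.university-a.example": b"secret-a",
    "idp.lab-b.example": b"secret-b",
}


def issue_assertion(idp: str, user: str, key: bytes) -> dict:
    """IdP side: state who the user is and sign the statement."""
    payload = json.dumps({"idp": idp, "user": user}, sort_keys=True)
    signature = hmac.new(key, payload.encode(), hashlib.sha256).hexdigest()
    return {"payload": payload, "signature": signature}


def accept_assertion(assertion: dict, metadata: dict) -> Optional[str]:
    """SP side: return the user name if the assertion is trusted, else None."""
    claims = json.loads(assertion["payload"])
    key = metadata.get(claims["idp"])
    if key is None:
        return None  # the issuing IdP is not listed in the federation metadata
    expected = hmac.new(key, assertion["payload"].encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, assertion["signature"]):
        return None  # payload was altered or signed with a different key
    return claims["user"]


token = issue_assertion("idp.university-a.example", "alice", b"secret-a")
print(accept_assertion(token, FEDERATION_METADATA))
```

The service never sees the user's credentials; it only needs the federation metadata, which is what allows a single sign-on to work across suppliers.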
Implementation: Open Data
• Broader access to community-specific solutions
• Data preservation
• Reproducibility of research
Providing broader access to community-specific solutions

Providing access to third-party open data requires appropriate management structures for data as well as the connectivity allowing interchange of the data itself. The value chain for information can be considered in three layers: data providers, value-added providers and downstream users. The Global Earth Observation System of Systems (GEOSS [44]) is an example of a common infrastructure provided by a community of data providers. The ‘GEOSS Portal’ is a single Internet access point for users seeking data, imagery and analytical software packages relevant to all parts of the globe. GEOSS does not offer to host datasets or guarantee that they are always available but simply makes them accessible from their original sites. GEO has a working group which has recently defined three conditions for legal interoperability among multiple datasets from different sources to exist [45]:
• use conditions are clearly and readily determinable for each of the datasets,
• the legal use conditions imposed on each dataset allow creation and use of combined or derivative products, and
• users may legally access and use each dataset without seeking authorization from data creators on a case-by-case basis, assuming that the accumulated conditions of use for each and all of the datasets are met.
Similarly, fourteen research infrastructures in the biological, biomedical and environmental sciences developed commonly agreed principles of data management and sharing. The document [46] produced by the BiomedBridges project makes key recommendations on how data management and sharing via the research infrastructures can be supported and encouraged:

1. The RIs encourage data sharing and reuse and support the notion that public funders should encourage Open Access to data from publicly funded research where possible.
[42] https://fidoalliance.org/
[43] https://aarc-project.eu
[44] http://www.earthobservations.org/geoss.php
[45] https://www.earthobservations.org/documents/dswg/Annex%20VI%20-%20%20Mechanisms%20to%20share%20data%20as%20part%20of%20GEOSS%20Data_CORE.pdf
[46] Principles of data management and sharing at European Research Infrastructures, February 2014, http://dx.doi.org/10.5281/zenodo.8304
2. Some data may only be shared under certain conditions and with appropriate safekeeping mechanisms in place, such as personally identifiable data, data subject to ethical or legal restrictions, or restrictions for intellectual property protection.
3. To encourage data sharing, systematic reward and recognition mechanisms are necessary.
4. Proposals for publicly funded research at RIs should include a data management plan concerning the deposition of data in long-term archives that addresses specific resources and activities (including standardisation of data production and curation/annotation).
5. Funding for tools and activities connected to data deposition must be available.
6. Systems, services and resources must be in place to facilitate straightforward data deposition by researchers, including support concerning the necessary data use agreements and consent forms for data with data protection or intellectual property requirements.
7. Systems are also needed to capture and track data provenance and use.
8. To ensure necessary trust by data providers or depositors, RIs must guarantee high standards of security and traceability.

The UK is ranked top of 86 countries by the Open Data Barometer [47], which measures a country’s readiness to secure benefits from open data, its publication of key datasets and evidence of emerging impacts from open government data. The 2015 Open Data Institute report “Open data means business: UK innovation across sectors and regions” [48] provides convincing arguments for learning from the private sector when it comes to managing the sharing of public sector data, highlighting the role of value-added providers. The UK’s central repository of public sector open data, data.gov.uk, contains nearly 15,000 datasets published with an Open Government Licence. Examples include geospatial/mapping data (OpenStreetMap [49]), transport-related data (Traveline [50]), demographics/social data (Office for National Statistics [51]) and business data (Companies House [52]). Best practices include the adoption of Open Data Certificates [53] and the use of the Creative Commons [54] public domain licence (CC0) and attribution licence (CC-BY). The Creative Commons attribution and share-alike licence (CC-BY-SA) is also used, but may limit a company’s ability to use that data for commercial products and services by requiring them to also attach the same open licence to the data they derive.

Some data can never be “open” in the literal sense and specific authorization may be required (e.g. for medical patient data). However, the “FAIR” principles of Findability, Accessibility, Interoperability and Reusability [55] should still be respected and form the basis for a European Open Science Cloud data policy.

OpenAIRE [56] is a network of Open Access repositories, archives and journals that support Open Access policies. It links more than 580 data providers, integrating more than 10 million Open Access publications, related to about 25,000 organisations and 45,000 projects from 3 funders. OpenAIRE is contributing to the Linked Open Data movement, and has recently launched the DLI Service [57], for Data Literature Interlinking. A European Open Science Cloud will be interfaced as a content provider to this resource and as a consumer of service APIs which will allow others to build integrated data discovery and analysis services.

[47] http://barometer.opendataresearch.org/report/analysis/rankings.html
[48] http://theodi.org/open-data-means-business-uk-innovation-sectors-regions
[49] http://www.openstreetmap.org/
[50] http://www.traveline.info/
[51] http://www.ons.gov.uk/
[52] https://www.gov.uk/government/organisations/companies-house
[53] https://certificates.theodi.org/
[54] https://creativecommons.org/
[55] http://datafairport.org/
[56] https://www.openaire.eu

The Zenodo digital repository, powered by Invenio and operated by CERN as part of OpenAIRE, has been extended with important features that greatly improve data sharing and it has become very popular with researchers from many disciplines around the world. In particular, Zenodo now offers persistent identifiers for data objects, so datasets and software from the popular GitHub code repository as well as publications can be cited, and includes interfaces permitting metadata to be harvested.

EUDAT [58] is developing a collaborative data infrastructure (CDI) for European research communities. The B2 services suite currently consists of the B2SAFE service for implementing data management policies within the EUDAT CDI, the B2STAGE service which provides tools and APIs to interact with the EUDAT CDI, the B2SHARE data repository service to store and share research data, the B2FIND service for finding research data, and the B2DROP service as EUDAT’s DropBox-like service to synchronise and exchange data within a trusted environment.

Metadata and indexing facilities across the set of services from OpenAIRE repositories and EUDAT data services as well as engaged cloud service providers are seen as being particularly relevant.
Data preservation

Data centres operated by the group of publicly funded research organisations and related third parties provide compute and storage services to the research community as well as access to scientific datasets and publications. Next generation “data factories”, including the Research Infrastructures on the ESFRI roadmap, are characterised by data volumes that can extend from multiple PetaBytes to several ExaBytes and even beyond (such as the SKA [59]), serving up to several thousands of researchers around the world, as well as many more potential users via Open Access.

Data preservation, for current and future re-use and sharing, is a fundamental component of on-going data management plans and there is common agreement on the OAIS model (ISO 14721) together with closely related standards (ISO 16363 and 16919). This approach focuses almost exclusively on management of repository data and additional capabilities are needed to satisfy the key use cases driving data (knowledge) preservation, sharing and re-use in a multi-disciplinary environment. These additional capabilities require a good understanding of who will re-use the data (“the consumers”) together with knowledge capture from the Open Scientists who are “the producers” (OAIS terms) of the data.

Preservation policies implemented in a measurable and certifiable manner across shared e-infrastructures together with domain and institutional repositories would stimulate much wider re-use of data through the captured and preserved knowledge, as well as the capability to preserve and re-use data and knowledge for significantly longer periods of time. This translates to a larger return on investment for the funding agencies, together with associated scientific, educational and cultural benefits.
[57] https://www.openaire.eu/dliservice
[58] http://www.eudat.eu
[59] In the preprint “Imaging SKA-Scale data in three different computing environments”, Richard Dodson (ICRAR) et al., 2 November 2015, the authors compared commercial cloud services (AWS), a cluster and an HPC installation on performance, usability and cost for SKA image workloads and rated the cloud services favourably.
Reproducibility of research

Federated cloud-based services will improve reproducibility and transparency (serving Responsible Research & Innovation principles, as envisaged by the OpenAIRE & FOSTER [60] report [61]), facilitating wider access for the knowledge-based industries, and letting the free flow of ideas and knowledge speed up innovation and delivery of added value to the marketplace.

The RDA Reproducibility Interest Group defined a set of high-priority services for reproducibility of Open Science, as follows [62]:

1) Persistent linking and availability of data and code (via repositories or other mechanisms) used in the generation of published research results, with the publication itself;
2) Development, encouragement, and adoption of meta-data standards for data and code, especially for those linked to publications;
3) Development, encouragement, and adoption of data and code publication, authorship, and citation practices, especially for those linked to publications;
4) Development and adoption of appropriate tools and computational infrastructure that enable: the sharing of research workflows and permit replication of computational scientific findings; the persistent linking of all digital scholarly objects used to generate research findings such as datasets in repositories; and versioning of digital scholarly objects to ensure persistent reproducibility.

To support reproducible science a European Open Science Cloud will need to integrate a network of Zenodo-like repository services and link them to the computing services to ensure that registering and storing research outputs becomes a simple and standard operation at the end of the compute cycle. In addition, this will enable users to analyse or re-analyse the registered data with the referenced codes and extend it with their own software, directly contributing to open science workflows.
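The persistent linking and versioning of data and code described in this section can be illustrated by a small manifest that ties checksums of the artefacts to a publication identifier, the sort of record a compute job might register with a repository when it finishes. This is a minimal sketch under stated assumptions: the manifest layout, the function names and the placeholder identifier are hypothetical and do not correspond to the API of Zenodo or any other repository.

```python
import hashlib
from datetime import datetime, timezone


def sha256_of(content: bytes) -> str:
    """Checksum used to detect any later change to an artefact."""
    return hashlib.sha256(content).hexdigest()


def build_manifest(publication_id: str, version: str, artefacts: dict) -> dict:
    """Link named artefacts (data, code) to a publication via their checksums."""
    return {
        "publication": publication_id,
        "version": version,
        "created": datetime.now(timezone.utc).isoformat(),
        "artefacts": {name: sha256_of(data)
                      for name, data in sorted(artefacts.items())},
    }


def verify(manifest: dict, artefacts: dict) -> bool:
    """True only if every registered artefact is present and unchanged."""
    for name, digest in manifest["artefacts"].items():
        if name not in artefacts or sha256_of(artefacts[name]) != digest:
            return False
    return True


outputs = {
    "results.csv": b"run,value\n1,0.42\n",
    "analysis.py": b"print('analysis')\n",
}
# "example-publication-id" is a placeholder, not a real persistent identifier.
manifest = build_manifest("example-publication-id", "1.0", outputs)
print(verify(manifest, outputs))
```

Anyone re-running the analysis later can call `verify` against the files they retrieved to confirm they are working with exactly the registered version.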
Governance
• Shadow IT and the changing role of IT departments
• Inclusive governance structure
• End-users and procurers at the heart of the decision making process
Disruptive technologies such as cloud offer a myriad of possibilities but come with new pressures for service provisioning. Cloud technology is more accessible to users, meaning they are more knowledgeable about what products and services they need and, due to the rapidly growing and easily accessible cloud services market, they have alternatives to their traditional supplier for acquiring them. Around the world, IT departments are being by-passed as users procure their own cloud services directly. This growing tendency of individuals and workgroups to sign up for commercially operated cloud services without any involvement from their IT departments creates serious risks for public organisations. The risks from such shadow cloud services include issues with data security, transaction integrity, business continuity and regulatory compliance.

Consequently the role of service provisioning for IT departments has to change to become more of a broker for technology and services. In this new role it is important for the IT department to know what is available on the market, how well it works, to be able to assess providers, validate security, understand service levels and ensure policies and legislation are respected. So there is an urgent need to organise the introduction of commercial cloud services in the public research sector in a consolidated and secure manner. Forming a network of public research organisations that can procure cloud services will attract the interest of service suppliers as well as funding agencies. The majority of this procurement funding will be directed to service providers and the approach has the advantage of permitting the procuring organisations to choose which services and providers receive these funds and thus represents a change to the established funding model for public sector IT services. Bringing together the public and private sector in the innovation cycle will strengthen Europe’s global competitiveness and encourage the creation of new and sustainable jobs and the promotion of growth.

[60] https://www.fosteropenscience.eu/project/
[61] https://www.fosteropenscience.eu/sites/default/files/pdf/927.pdf
[62] https://rd-alliance.org/sites/default/files/case_statement/RDA-ReproducibilityIG-Revised-2_0.pdf

The introduction of procurement of pay-per-use cloud services by funding agencies and research organisations on behalf of their end-users represents a significant change to e-Infrastructures and will impact the governance model. Currently publicly funded e-Infrastructures are supplier driven while a European Open Science Cloud puts procurers and users at the heart of the decision making process. It will be necessary to establish an inclusive governance structure where all the stakeholders are represented and avoid a monopoly of any procurer, supplier or research community. The governance principles have to ensure the interests of both public and private participants are met and that a European Open Science Cloud becomes sustainably attractive and beneficial for all stakeholders from both sectors.

A European Open Science Cloud will be a cornerstone of an open science commons and its governance model needs to take into account the realities of the public research sector with the following objectives:

1. Enable integration of existing e-Infrastructures with commercial cloud computing effectively and efficiently
2. Ensure alignment with the Digital Single Market, foster coherence, equitability and inclusiveness
3. Ensure participation of all stakeholders and fair balance of their needs and interests
4. Ensure transparency, openness and responsiveness
5. Ensure value for money and fair incentives and returns
6. Continuously manage legal and ethical compliance and other risks
7. Ensure accountability and responsibility of stakeholders and decision makers
8. Manage the identity and brand of a European Open Science Cloud and ensure sustainable innovation and growth.

In addition a European Open Science Cloud would become a critical ICT infrastructure for the European Research Area and would need to be protected by identifying vulnerabilities and ensuring an operational security plan is in place to minimize the detrimental effects of disruptions.

The governance structure is composed of several bodies as shown in Figure 1.
Each body in the governance structure has a specific role and composition:

• Board of Procurers: this grouping of all procurers (research organisations, funding agencies etc.) is the ultimate decision making body of a European Open Science Cloud.
• Policy Advisory Board: experts addressing legal, contractual and ethical aspects to ensure that a European Open Science Cloud is compatible with European legislation. It would ensure the application of best practices for the contractual aspects of delivering cloud services including service level agreements implementing recognised policies for trust, security and privacy, notably for data protection; certification requirements; a code of conduct; and terms and conditions that respect European legislation.
• Procurement and Assessment Agency: one or more organisations commissioned by the Board of Procurers to perform the joint procurement and centralised billing of services on behalf of all procurers as well as gather data necessary to measure a set of agreed Key Performance Indicators (KPIs). Having an organisation to oversee the procurement process, certify and enrol service providers as well as handle the contractual arrangements between suppliers and procurers with centralised billing would simplify the operation and expansion of a European Open Science Cloud.
• End-User Board: grouping of end-users from engaged research communities including the long-tail of science to provide a consultative opinion on the relevance and added value of delivered services. End-users contribute applications software, data and publications. Responsibility for all data that is made available, linked or accessed via the services provided by the project remains with the data providers, and the data must have been obtained in accordance with the laws and regulations in operation in the country in which the data provider resides. This includes any requirement for approval from an appropriate ethics committee or other regulatory body.
• Suppliers Forum: consultative forum open to all cloud service suppliers (commercial and publicly funded) who want to enter into a dialogue with the procurers and end-users and provide input on all aspects of a European Open Science Cloud.
• Technical Board: grouping of technical experts to assess the technical maturity and suitability of services, including security aspects.
• External Advisory Committee: grouping of external experts from the public and commercial sectors that will provide advice to the Board of Procurers on the state and future directions of a European Open Science Cloud.
The details of the appointments, voting rights and procedures of the various bodies remain to be defined, together with how to handle the situation where a participating organisation is both a service supplier and a procurer.
Figure 1: Governance Structure
The relationship of a European Open Science Cloud to H2020 assumes that the e-infrastructure programme includes two parallel tracks: production, supporting the sustainability of pan-European e-infrastructures, and innovation, representing changes to the production services. The production track builds on national structures to ensure the long-term operation, maintenance and evolution of a set of services provided to a wide range of user groups across the borders of individual member states. The production track delivers a portfolio of horizontal core networking, compute and data services that provide the backbone of a European Open Science Cloud through the integration and consolidation of e-infrastructure platforms and the creation of a common service catalogue. The innovation track is organised as short-term, competitive cycles where the best proposals are developed into prototypes that are assessed against agreed criteria to become candidates for inclusion as production services. A service lifecycle manages individual services starting from their conception, development in the innovation track, transition into the production track, operation and eventual retirement. The transition from prototype service to production service is a decision that involves the stakeholders represented in the governance structure described above. It is expected that Research Infrastructures, including ESFRI projects, will become stakeholders of a European Open Science Cloud as procurers and end-users.
The Internet2 NET+ [63] initiative contains many of the aspects necessary for the governance of a European Open Science Cloud and can be a good source of inspiration.
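The service lifecycle described above (conception, development in the innovation track, transition into the production track, operation, retirement) can be pictured as a small state machine. The following is only an illustrative sketch and not part of the EIROforum proposal: the `Stage` names, the `Service` class and the `governance_approved` flag are hypothetical, and the single approval flag is an assumed stand-in for the stakeholder decision that gates the move from prototype to production service.

```python
from dataclasses import dataclass, field
from enum import Enum


class Stage(Enum):
    """Lifecycle stages named in the text (labels are illustrative)."""
    CONCEPTION = "conception"
    INNOVATION = "innovation track (prototype)"
    PRODUCTION = "production track (operation)"
    RETIRED = "retired"


# Assumption: the lifecycle is strictly linear, as the text lists it.
NEXT_STAGE = {
    Stage.CONCEPTION: Stage.INNOVATION,
    Stage.INNOVATION: Stage.PRODUCTION,
    Stage.PRODUCTION: Stage.RETIRED,
}


@dataclass
class Service:
    name: str
    stage: Stage = Stage.CONCEPTION
    history: list = field(default_factory=list)

    def advance(self, target, governance_approved=False):
        """Move the service to the next stage if the move is allowed."""
        if NEXT_STAGE.get(self.stage) is not target:
            raise ValueError(
                f"{self.name}: cannot go from {self.stage.name} to {target.name}"
            )
        # Hypothetical gate: the prototype-to-production transition is a
        # decision of the stakeholders in the governance structure.
        if target is Stage.PRODUCTION and not governance_approved:
            raise PermissionError(
                f"{self.name}: production transition needs governance approval"
            )
        self.history.append(self.stage)
        self.stage = target


if __name__ == "__main__":
    svc = Service("example-data-service")  # hypothetical service name
    svc.advance(Stage.INNOVATION)
    svc.advance(Stage.PRODUCTION, governance_approved=True)
    svc.advance(Stage.RETIRED)
    print([s.name for s in svc.history], "->", svc.stage.name)
```

Keeping the approval as an explicit argument makes the governance decision visible at the one point where the text requires it, while the other transitions stay purely operational.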
Investment
Proprietary solutions are not solutions; In-house investment; Investment in skills; Long-term strategy
Proprietary solutions are not solutions
The cost of providing licences to popular proprietary software packages for the users of Research Infrastructures continues to increase. As an example, between 2008 and 2014, CERN’s spending on software doubled without any significant increase in the number of licences. Moving to a cloud model where software licences are rented on a pay-per-use basis may help stem this increase. But some proprietary software packages have an effective monopoly in the research domain and their market dominance can offset any potential savings. It is essential that there is appropriate investment in open source solutions in key domains so they can be supported by multiple providers. We must leverage the richness in the diversity of European suppliers and match it with the expertise available in production e-Infrastructures, demonstrating the technical feasibility of interoperability between these players.
The European Technology Platform for High Performance Computing project [64] published a Strategic Research Agenda for achieving HPC leadership in Europe [65] which specifically highlights the upcoming big-data challenges for leading research activities and the relevance of cloud services:
[63] http://www.internet2.edu/vision-initiatives/initiatives/internet2-netplus/
[64] http://www.etp4hpc.eu/
[65] http://www.etp4hpc.eu/wp-content/uploads/2013/06/ETP4HPC_book_singlePage.pdf
“Europe is in a unique position to excel in the area of HPC Usage and Big Data owing to the experience level of current and potential users (and the recognition of the importance of data by such users as CERN, ESA, and biological data banks) and the presence of leading ISVs for large‐scale business applications. Europe should exploit that knowledge to create competitive solutions for big‐data business applications, by providing easier access to data and to leading‐edge HPC platforms, by broaden the user base (e.g., through Cloud Computing and Software as a Service (SaaS), and by responding to new and challenging technologies.”
There is no clear business case for purely commercial HPC services at the scale of PRACE tier-0 installations, but smaller-scale commercial ‘HPC in the cloud’ offerings are starting to appear on the market. This will help address the shortfall between supply and demand for capability HPC services as seen at PRACE [66], where typically only one third of the requests can be satisfied. The use of capability HPC services by the commercial sector, in particular SMEs, is being investigated by the EC-funded Fortissimo project [67]. This will make hardware, expertise, applications, visualisation and tools available on a pay-per-use basis. In parallel, the UberCloud Marketplace [68] is offering on-demand access to HPC services for individual engineers and scientists.
Public investment
The steps described above will need considerable public investment as well as investment from commercial service providers to bring the platform together. In order for the research community to be able to benefit fully from the existence of a European Open Science Cloud, it has to expand beyond the basic IaaS level and provide higher-level services that are closer to the needs of the daily work of a researcher. The HNSciCloud PCP project provides a vehicle for joint investment in IaaS services and a similar approach should be envisaged for higher-level software services. The natural follow-on step for successful PCP projects is to procure at a larger scale with PPI co-funded projects that could significantly increase the capacity and impact of a European Open Science Cloud. This will take a sustained investment by all the stakeholders in both the public and commercial sectors, not only in cloud technology, supporting infrastructure and strategic software but also in end-user facing services which will simplify access to a European Open Science Cloud.
Significant investment in software capability will be absolutely essential to obtain the best performance from current and future computer and storage architectures. Many sciences today benefit from commodity CPU and disk storage, but there are significant architectural changes in modern CPUs (memory layout, I/O paths, accelerators, vector units, etc.) which mean it will be necessary for science to invest heavily in software and training in order to migrate application codes, retrain programmers and fully exploit these new technologies. This investment in software is essential to maintain European competitiveness in this area, and should include coordination of existing expertise to the benefit of diverse communities.
Investment in skills
The design, creation and operation of e-infrastructure services are essential tools in the development of skills and competencies for the European market. The ability to fully exploit the potential for knowledge and job creation that is locked up in the datasets and algorithms at the centre of Open Science will require the nurturing of a new generation of data scientists with a core set of ICT skills. The EIROforum organisations have core competences in training and education which can contribute to this activity. A European Open Science Cloud can build on this and similar initiatives to help train the next generation of IT-savvy researchers, and also improve outreach to the general public.
[66] PRACE annual report 2014, May 2015, ISBN 978902169416
[67] http://www.fortissimo-project.eu/project/the-project.html
[68] https://www.theubercloud.com/store/
Long-term strategic investment
A European Open Science Cloud must leverage the investments already made in Europe for the publicly funded e-infrastructures and commercial cloud services. Through Horizon 2020, the EC and national funding agencies have recently confirmed their commitments to GÉANT, AARC, EGI, OpenAIRE, EUDAT and PRACE. In order to ensure full synergies, DG CONNECT foresees that e-infrastructure projects will be grouped into clusters of related projects. This new phase of funding for the clusters of e-infrastructure projects offers the EC a window of opportunity and a means to focus on establishing a European Open Science Cloud. In parallel DG RTD intends to fund a pilot action that will encourage the uptake of a European Open Science Cloud by the Research Infrastructures. Close coordination between DG RTD and DG CONNECT funded projects will facilitate the establishment of a European Open Science Cloud.
The financial plan for a European Open Science Cloud should be designed so that the services can be sustained by their operating organisations according to a continuum of funding models ranging from sponsored resources for peer-reviewed scientific cases to communities who would pay for the services they receive. Additional resources will be required in order for these services to be expanded and to serve a wider range of users. The European Commission together with regional, national and thematic funding agencies will need to become stakeholders and contribute to the expansion of a European Open Science Cloud. The guiding principle is that funding from such stakeholders will be focused on innovation of services and uptake by new user communities and business actors, while the operational costs will be borne by the operating organisations and the user communities.
Below is a non-exhaustive list of areas where funding agencies can contribute to the creation of a European Open Science Cloud:
Development of new services to be deployed on the e-infrastructure. Significant effort will be required to co-develop scalable services that can operate in a distributed virtual environment and serve a wide range of users.
Financial incentive scheme to increase adoption of services by users including ‘long-tail of science’ research groups and SMEs.
Encouraging the use of the services by new research communities (e.g. curation of data-sets, connection of identity federations, deployment of community-specific services etc.).
Development of training and educational activities building on the cloud services to maximise their impact. This can also include expansion of services to support volunteer computing so that researchers can build citizen-cyberscience communities and further engage the general public in science.
Organisation of user forum events as well as outreach and dissemination to a range of audiences and production of material for policy-related activities.
International collaboration (beyond Europe) through interoperation with equivalent structures in other regions of the world.
Expansion of the network of service providers across the European member states to address national and thematic needs.
Many research organisations that operate research infrastructures do not have the mandate to provide cloud services to their users for the management and processing of their experimental data. This represents a gap in the scientific lifecycle and a missed opportunity to highlight the results and impact of publicly funded research. These research organisations will require assistance to bridge this gap by supporting their users so they can make use of cloud services to manage and process their experimental data.
The European Commission’s INFRASTRUCTURES 2016-2017 work programme foresees a pilot action addressing the federation, networking and coordination of pan-European research infrastructures and clouds for the purpose of increasing research and science data availability and use. It also foresees Data and Distributed Computing e-infrastructures for Open Science which should cooperate with the pilot action. The combined focus of these funding calls should provide an incentive for the existing e-infrastructures and Research Infrastructures to work together to form the basis of a European Open Science Cloud.
Looking further ahead, the EC has taken steps to ensure funding for GÉANT over the full duration of H2020 by introducing ‘Framework Partnership Agreements’ (FPA). The FPA model represents a more long-term engagement that could encourage the integration of e-infrastructures co-funded via EC projects into the computing models of the Research Infrastructures, which need to plan for future decades [69]. The application of the FPA approach to a European Open Science Cloud could establish the basis for the European Research Area’s digital commons and lead towards Science 2.0 [70].
Conclusions
Cloud computing represents a paradigm shift in the way IT resources are provisioned for research communities. Traditionally the IT departments of research organisations have developed and operated in-house the services that their users required. But commercial cloud services are a disruptive technology, with easy-to-use commodity services made available, often on a ‘freemium’ basis, to users at a global scale. Consequently the role of IT departments is changing as users by-pass their traditional service provision channels to get the on-demand services they want, thereby introducing shadow IT services that are outside the policy and security boundaries of research organisations. This is impacting data-intensive science and how e-infrastructure services are used by researchers and judged by funding agencies.
This wave of change is taking place within the broader context of Open Science bringing ever-greater transparency, accessibility and accountability, wherein stakeholders in the research process increasingly expect to be able to access and reuse the outputs of taxpayer-funded research. From the grassroots, Open Access first emerged from the High Energy Physics scholarly research community [71], who saw benefit in no longer waiting for traditional publication schedules before sharing research findings (and, subsequently, data and software code). Top-down, governments and other funders see openness as a catalyst for increasing public and commercial engagement with research, bringing about both societal and commercial benefit.
This new reality represents a threat to the established service procurement and delivery models but also an opportunity. In an era of rationalisation and budget concentration, all means of optimising service delivery and reducing operational costs must be considered.
[69] EIROforum discussion paper: Long-term sustainability of Research Infrastructures, http://www.eiroforum.org/downloads/20150325_discussion-paper-research-infrastructures-sustainability.pdf
[70] http://ec.europa.eu/research/consultations/science-2.0/background.pdf
[71] Open Access: Unlocking the Value of Scientific Research, Richard K. Johnson (SPARC), March 2004, http://www.sparc.arl.org/sites/default/files/media_files/OpenAccess_RKJ_preprint.pdf
The EIROforum members have extreme IT needs that increase with the progress of the research infrastructures they operate, while the budget envelope for IT remains, at best, unchanged. Cloud computing and the cloud services market did not exist when the computing models for many ESFRI research infrastructures were conceived. These computing models must evolve to become more agile and opportunistic, capable of using IT capacity in whatever form it is delivered, be it in a grid, cloud, HPC or even a volunteer structure. We expect commercial cloud services to play an increasing role in these computing models. Commercial sectors are investing heavily in cloud services, leading to a rapid expansion of the market and a breath-taking rate of innovation that the publicly funded research sector cannot match but can leverage, and so profit from such advances.
A European Open Science Cloud represents a strategic vision that can be a vector for introducing change in the service provisioning and computing models for the publicly funded research sector in the medium to long term. It has the potential to greatly improve the provisioning of IT services for Research Infrastructures to address their big data needs. It can encompass all the phases of the research lifecycle and offer a platform of joint innovation for the public and private sectors. It will significantly change the way IT services are procured, organised and funded.
The key challenges are integrating frequently changing technologies, managing the complexity and identifying the optimal organisational and financial models. Researchers must be convinced that they will not lose control of their precious data. It is an ambitious undertaking requiring the active engagement of many stakeholders and careful planning of the technical, financial, legal and governance aspects. For it to succeed it must become a priority for all the actors involved, with monitoring by the funding agencies and regular assessment by the user communities. This position paper is a rallying call for the EC and other funding bodies to adopt such a strategic approach and to work with the operators of Research Infrastructures.