Grid cloud comparative study
Comparing Grids and Clouds: evolution or revolution?
Marc-Elian Bégin, SixSq Sàrl, Geneva, Switzerland, www.sixsq.com
ECHOGRID
Athens, Greece, June 9, 2008
Background
This presentation is based on material developed for EGEE:
www.eu-egee.org
Content
- Context of comparative study
- Grid: EGEE/gLite
- Cloud: Amazon Web Services
- Comparison summary
- Conclusions
- Recommendations
Context of comparative study
This presentation is a summary of the report "An EGEE Comparative Study: Grids and Clouds - evolution or revolution?", by Marc-Elian Bégin
https://edms.cern.ch/file/925013/3/EGEE-Grid-Cloud.pdf
Objective:
- As cloud computing gains popularity and traction, grid computing needs to be positioned with respect to cloud computing
- Compare real implementations and production offerings:
  - EGEE/gLite grid production service
  - Amazon Web Services, with focus on EC2 and S3
Outcome:
- Identified convergence paths
- Recommendations for managing convergence going forward
Acknowledgment
Many people provided comments, suggestions and feedback.
Special thanks go to:
- Bob Jones, CERN
- James Casey, CERN
- Charles Loomis, CNRS and SixSq partner
EGEE application domains: Archeology, Astronomy, Astrophysics, Civil Protection, Comp. Chemistry, Earth Sciences, Finance, Fusion, Geophysics, High Energy Physics, Life Sciences, Multimedia, Material Sciences
EGEE in numbers: >250 sites, 48 countries, >50,000 CPUs, >20 PetaBytes, >10,000 users, >150 VOs, >150,000 jobs/day
Grid: EGEE/gLite
EGEE highlights:
- Federated but separately administered resources (multiple sites, countries and continents)
- Heterogeneous resources
- Distributed, multiple research user communities grouped in Virtual Organisations (VO)
- Mostly publicly funded at local, national and international levels
- Range of data models, from massive, hard-to-replicate data sources to transient datasets composed of files of varied sizes
Grid: EGEE/gLite (2)
Provided services:
- Basic services (focus of comparison with AWS)
  - Computing Element (CE)
  - Storage Element (SE)
- Higher-level services
  - Workload Management System (WMS)
  - File & Metadata Catalog Services
  - File Transfer Service (FTS)
  - Virtual Organization Management Service (VOMS)
For more info: Bob Jones, EGEE Project Director, CERN
Amazon Web Services
- EC2 (Elastic Compute Cloud) is the computing service of Amazon
  - Based on hardware virtualisation (Xen)
  - Users request virtual machine instances, pointing to an image (public or private) stored in S3
  - Users have full control over each instance (e.g. access as root, if required)
  - Requests can be issued via SOAP and REST
- S3 (Simple Storage Service) is a service for storing and accessing data on the Amazon cloud
  - From a user's point of view, S3 is independent from the other Amazon services
  - Data is built in a hierarchical fashion, grouped in buckets (i.e. containers) and objects
  - Data is accessible via SOAP, REST and BitTorrent
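To make the bucket/object model concrete, here is a minimal sketch of how a REST address for an S3 object can be built. It assumes the virtual-hosted URL style (bucket name as a subdomain of s3.amazonaws.com) and a publicly readable object; private objects need signed requests, which are not shown. The helper name `s3_object_url`, the bucket and the key are made up for illustration.

```python
from urllib.parse import quote

S3_ENDPOINT = "s3.amazonaws.com"  # public S3 endpoint host


def s3_object_url(bucket: str, key: str, torrent: bool = False) -> str:
    """Build the REST URL of an object stored in a bucket.

    Keys may contain '/' to mimic a hierarchy, so slashes are preserved
    while other unsafe characters are percent-encoded.
    """
    url = f"https://{bucket}.{S3_ENDPOINT}/{quote(key, safe='/')}"
    # Appending '?torrent' asks S3 for a .torrent file describing the
    # object instead of the object itself.
    return url + "?torrent" if torrent else url


if __name__ == "__main__":
    # 'example-bucket' and the key are hypothetical names.
    print(s3_object_url("example-bucket", "images/grid node.img"))
    print(s3_object_url("example-bucket", "images/grid node.img", torrent=True))
```

A plain HTTP GET on the first URL would fetch the object; the second form is how the BitTorrent access mentioned on the slide is reached.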
Amazon Web Services (2)
Other AWS services:
- SQS (Simple Queue Service)
- SimpleDB
- Billing services: DevPay
- Elastic IP (Static IPs for Dynamic Cloud Computing)
- Multiple Locations
Costs
Cost study for computing upgrade at CERN for LHC (by Ian Bird, Tony Cass, Bernd Panzer-Steindel and Les Robertson)
- Cost summary for providing 40 MSI2000 of computing:
  - Custom data centre construction: 4.4 MCHF (~2.7 M€)
  - Using EC2: 92 MCHF (~56.9 M€)
- The cost of 4.4 MCHF doesn't include software licence and man-power costs
- Comparison is made difficult by the reference Amazon has chosen for its EC2 Compute Unit, e.g. one EC2 Compute Unit (ECU) provides the equivalent CPU capacity of a 1.0-1.2 GHz 2007 Opteron or 2007 Xeon processor
Our calculation for 40 MSI2000 on EC2: 57 MCHF (~35.3 M€)
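As a quick reading aid, the sketch below works out the cost ratios implied by the figures on this slide. It only divides the quoted numbers; the constant and function names are illustrative, and the ratios inherit every caveat of the slide (the 4.4 MCHF figure excludes licences and man-power).

```python
# Figures quoted on the slide, in millions of Swiss francs (MCHF).
CUSTOM_DATA_CENTRE_MCHF = 4.4   # construction only: no licences, no man-power
EC2_STUDY_MCHF = 92.0           # estimate from the CERN cost study
EC2_OWN_ESTIMATE_MCHF = 57.0    # the presenters' own calculation


def cost_ratio(cloud_mchf: float, in_house_mchf: float) -> float:
    """Return how many times larger the cloud figure is."""
    return cloud_mchf / in_house_mchf


if __name__ == "__main__":
    for label, cost in [("CERN study", EC2_STUDY_MCHF),
                        ("own estimate", EC2_OWN_ESTIMATE_MCHF)]:
        ratio = cost_ratio(cost, CUSTOM_DATA_CENTRE_MCHF)
        print(f"EC2 ({label}): {cost:.0f} MCHF, "
              f"about {ratio:.0f}x the data centre figure")
```

Depending on which EC2 estimate is used, the cloud figure is roughly 13 to 21 times the construction cost.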
Costs: EGEE workload in 2007
CPU: 114 million hours
Data:
- 25 PB stored
- 11 PB transferred
Estimated cost if performed with Amazon's EC2 and S3: ~38 M€ ($58,688,679.08)
http://gridview.cern.ch/GRIDVIEW/same_index.php
http://calculator.s3.amazonaws.com/calc5.html? (17/05/08)
[Chart: cost breakdown in $: Storage 47,288,679.08; Xfer 102,759.08; CPU 11,400,000]
How much would it cost to run EGEE's 2007 workload on Amazon?
This page shows how the cost of running the EGEE workload from 2007 (all VOs, all sites) on Amazon S3 and EC2 was calculated.
CPU: $11,400,000
- EGEE in 2007 consumed 114 million CPU-hours
- Typical EGEE site CPU corresponds to Amazon "small instance" (1.7 GB of memory, 1 EC2 Compute Unit (1 virtual core with 1 EC2 Compute Unit), 160 GB of instance storage, 32-bit platform): $0.10/hour
Storage: $47,185,920
- EGEE in 2007: 25 PB == 25600 TB == 26214400 GB
- Amazon S3 pricing (Europe): $0.18 per GB-month of storage used
- The figure above uses the $0.15 per GB-month rate: $0.15 * 26214400 GB * 12 months (314572800 GB-months)
Data Xfer: $1,397,983
- EGEE aggregate data xfer between all sites: 11.42 PB == 11694 TB == 11974737 GB, hence an average of 997895 GB/month. Assume 50% in and 50% out
- Amazon data xfer pricing:
  - $0.10 per GB - all data transfer in
  - $0.17 per GB - first 10 TB / month data transfer out
  - $0.13 per GB - next 40 TB / month data transfer out
  - $0.11 per GB - next 100 TB / month data transfer out
  - $0.10 per GB - data transfer out / month over 150 TB
TOTAL: $59,983,903 (data xfer component: $1,397,983.32)
Monthly xfer in each direction: 498947.375 GB
In Euros: 47,486,548 (for a total of $69,421,087.32); 9,500,000.00
EGEE07 figures, put into http://calculator.s3.amazonaws.com/calc5.html?:
- storage (GB-months), US: 314572800
- data xfer-in: 498947.375
- data xfer-out: 498947.375
- small instance CPU hrs: 114000000

Month    Storage         Xfer           CPU             total $         total euro      total chf       total gbp
feb'08   56,623,104.00   1,397,983.32   11,400,000.00   69,421,087.32   47,486,548
may'08   47,288,679.08   102,759.08     11,400,000.00   58,688,679.08   37,734,473.10   61,532,145.58   29,977,003.50
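The arithmetic behind these figures can be retraced with a short sketch. It uses only the inputs listed in the spreadsheet, plus three labelled assumptions: 1 TB is taken as 1024 GB (as the spreadsheet's own conversions do), the last transfer-out tier is taken to start after the first 150 TB (10 + 40 + 100), and storage is priced at the $0.15 per GB-month used in the formula. All names are illustrative.

```python
GB_PER_TB = 1024  # binary units, matching 25 PB == 26214400 GB in the spreadsheet

CPU_HOURS = 114_000_000              # EGEE CPU-hours consumed in 2007
CPU_PRICE = 0.10                     # $ per small-instance hour
STORED_GB = 25 * 1024 * GB_PER_TB    # 25 PB kept for 12 months
STORAGE_PRICE = 0.15                 # $ per GB-month (rate used in the formula)
XFER_GB_PER_MONTH = 11_974_737 / 12  # 11.42 PB moved over the year
XFER_IN_PRICE = 0.10                 # $ per GB, all transfer in
# (tier size in TB, $ per GB) for transfer out; None = open-ended last tier,
# ASSUMED here to start after the first 150 TB (10 + 40 + 100).
XFER_OUT_TIERS = [(10, 0.17), (40, 0.13), (100, 0.11), (None, 0.10)]


def tiered_cost(gb: float, tiers) -> float:
    """Charge each slice of the monthly volume at its own tier price."""
    cost, remaining = 0.0, gb
    for size_tb, price in tiers:
        slice_gb = remaining if size_tb is None else min(remaining, size_tb * GB_PER_TB)
        cost += slice_gb * price
        remaining -= slice_gb
        if remaining <= 0:
            break
    return cost


def monthly_transfer_cost(total_gb: float) -> float:
    """Split the monthly volume 50% in / 50% out and price both directions."""
    half = total_gb / 2
    return half * XFER_IN_PRICE + tiered_cost(half, XFER_OUT_TIERS)


cpu = CPU_HOURS * CPU_PRICE
storage = STORED_GB * 12 * STORAGE_PRICE
xfer_month = monthly_transfer_cost(XFER_GB_PER_MONTH)

if __name__ == "__main__":
    print(f"CPU (year):       ${cpu:,.2f}")
    print(f"Storage (year):   ${storage:,.2f}")
    print(f"Transfer (month): ${xfer_month:,.2f}")
    print(f"Sum:              ${cpu + storage + xfer_month:,.2f}")
```

Under these assumptions the monthly transfer charge comes to roughly $102,759, the same value as the may'08 Xfer entry. Adding that single month of transfer to the yearly CPU and storage costs lands within a cent of the $58,688,679.08 total, which suggests, although the source does not say so, that the calculator total covers only one month of data transfer.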
High-level deployment of LCG grid resources
Where could the cloud be, given that transferring data across the cloud border has a cost?
Can BitTorrent Help?
Using BitTorrent, transfers are not metered by the cloud if requesting the same files
Performance
EC2, S3 bandwidth performance summary
The conclusions from [6] regarding the EC2 -> EC2 transfers are that "basically we're getting a full gigabit between the instances."
Test type    Transfer (MB/sec)   Remarks
EC2 -> EC2   75.0                Using curl on 1-2 GB files, without SSL
S3 -> EC2    49.8                Using 8 x curl on 1 GB files, with SSL
S3 -> EC2    51.5                Using 8 x curl on 1 GB files, without SSL
EC2 -> S3    53.8                Using 12 x curl on 1 GB files, with SSL
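To put the measured rates in network terms, the sketch below converts MB/s into Gbit/s and estimates how long a given volume would take at a sustained rate. It assumes decimal units (1 MB = 10^6 bytes, 1 TB = 10^6 MB), which the source does not specify; the function names are illustrative.

```python
# Measured rates from the table, in MB/s.
MEASURED_MB_PER_S = {
    "EC2 -> EC2 (no SSL)": 75.0,
    "S3 -> EC2 (SSL)": 49.8,
    "S3 -> EC2 (no SSL)": 51.5,
    "EC2 -> S3 (SSL)": 53.8,
}


def to_gbit_per_s(mb_per_s: float) -> float:
    """Convert MB/s to Gbit/s, assuming 1 MB = 10**6 bytes."""
    return mb_per_s * 8 / 1000


def hours_to_move(terabytes: float, mb_per_s: float) -> float:
    """Hours needed to move a volume at a sustained rate (1 TB = 10**6 MB)."""
    return terabytes * 1_000_000 / mb_per_s / 3600


if __name__ == "__main__":
    for test, rate in MEASURED_MB_PER_S.items():
        print(f"{test}: {to_gbit_per_s(rate):.2f} Gbit/s, "
              f"{hours_to_move(1, rate):.1f} h per TB")
```

With these units, the 75 MB/s EC2 -> EC2 result corresponds to about 0.6 Gbit/s of payload, or a little under four hours per terabyte for a single stream.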
Performance (2)
- Like AWS, CERN has opted to separate its storage and compute farms
- CERN can deliver a sustained 70 GB/s data throughput between the storage and compute farms
- A large-scale performance analysis is not available for AWS
Scale
Is EC2 (Elastic Compute Cloud) really elastic?
- The scale of EGEE is already established and well documented
- The scale of AWS is unknown, although the latest experiments seem to indicate good scaling
- Both systems now have SLAs in place, including penalties (partial refund) from Amazon when not honoured
- Elastic IP and Multiple Locations provide building blocks for users to deploy resilient services, while EGEE is already massively distributed (>250 sites)
AWS Cloud interfaces
No middleware!!
Resource-side grid middleware?
Ease of Use
- Key to the success of AWS is the choice of technologies:
  - HTTP(S)/REST and support for ROA (Resource Oriented Architecture)
  - Hardware virtualisation (Xen based)
  - X.509 certificates
- This backs up the claim from Amazon that AWS requires no middleware (for the user!)
- However, the level of service provided by AWS is lower than that of EGEE
- For EGEE/gLite, several MB of middleware are required to use the grid
Service Mapping
Ease of use comes at a cost: the cost of simplicity
- The basic constructs that the EC2 and S3 services offer do not currently meet all the requirements of grid users and do not replace the high-level services provided by gLite, e.g.:
  - File Transfer Service (FTS)
  - Workload Management System (WMS)
  - Grid catalogues such as ARDA Metadata Catalogue (AMGA), LCG File Catalog (LFC) or GANGA
- Are all users using the grid the same way?
- Should we revisit the way the grid is used and accessed?
- Who should be responsible for providing different levels of functionality?
Collaboration and Virtual Organisations
- Grids are used by large and/or distributed communities of collaborators
- Virtual Organisations support this concept, with services such as VOMS
- Only primitive ACLs are provided by AWS. Can we bridge the gap?
- Scientific collaborations include the need for resources to be contributed and connected to the grid. Can the cloud be augmented by custom data centres?
Application Software Deployment
- Grid application software is often required to be installed at data centres for jobs to execute successfully
- Several operating systems and platforms are required to host grid jobs
- Hardware virtualisation could alleviate these burdens:
  - Grid application software can be baked into a virtual image
  - Data centres do not have to provide a specific operating system, since it is defined at the level of the VM
  - Hardware virtualisation provides a high level of control to the user (e.g. root) and a high level of control and security for hosts
Interoperability
Assuming that several cloud computing providers come to be, which interfaces matter?
BOTH!!!
Standards
- Since simple is beautiful, if the interfaces proposed by cloud services like AWS become popular with grid users, they might change the standardisation landscape
- HTTP, REST, Xen and BitTorrent are already largely standardised
- What is left at that level?
  - REST access to storage
  - Virtual image formats
  - Instantiation API (perhaps based on REST)
  - Metering interfaces (including monitoring)
- A reference open source implementation is missing
- What about higher-level services? Which ones?
Conclusions
- Cloud computing is gaining traction, especially with the Amazon Web Services (AWS) commercial offering
- Grid (e.g. EGEE) has a larger scope; however, technological choices and simple interfaces such as those of AWS are relevant to the grid world
- The question "what is the usage pattern that will emerge in the coming years?" remains unanswered and will have to be carefully tracked
- None of the resources contributed to the EGEE grid come from commercial offerings, such as Amazon. Will this change?
- Technologies such as REST, HTTP, hardware virtualisation and BitTorrent could displace existing means of access to grid resources
Conclusion (2)
- EGEE has an opportunity to lead the next generation of e-Infrastructure by integrating new advancements such as cloud computing
- Hardware virtualisation could lower the operations cost of large infrastructures
- It is important that new development does not distract from ensuring continuity of the current production grid
- A roadmap should be defined to include cloud technology in current e-Infrastructures in an incremental and harmonious fashion
Recommendations
1. Promote/support the development of an open source cloud middleware distribution, based on interfaces similar to current commercial offerings
2. Promote the standardisation of the cloud, with the above-mentioned implementation as a potential reference
3. Identify a convergence path between cloud services such as EC2 and S3 and the current EGEE security model based on VOMS
4. Virtualise all key grid services (e.g. information system, metadata catalogues, security service) with the goal of being able to deploy these on EC2-like resources
5. Promote/lobby the need for experiments (e.g. LHC/HEP, Life science) and other grid users to virtualise their applications, with the goal of being able to deploy them on EC2-like resources
6. As a follow-on to point 5, promote/lobby the need for all service dependencies that grid user applications have to also be virtualised
7. Launch/support a feasibility study to verify that monitoring of cloud jobs can be performed at the hypervisor level, such that monitoring is independent from the virtualised applications
8. Upgrade current metadata catalogues to support HTTP(S) endpoints and S3-like metadata
9. Explore the feasibility of running BitTorrent on grid sites