Grid Computing – Introduction
Sathish Vadhiyar
Generic Grid Architecture/Components
Resource Layer
• High-speed networks and routers
• Computers
• Databases
• Online instruments
Service Layers
• User Portals
• Authentication
• Scheduling & Co-Scheduling
• Naming & Files
• Events
• Grid Access & Info
• Problem Solving Environments
• Application Science Portals
• Resource Discovery & Allocation
• Fault Tolerance
Software
OK, I have built some software. Is mine a Grid software?
Ian Foster’s three-point checklist:
1. coordinates resources not subject to centralized control
2. using standard, open, general-purpose protocols and interfaces
3. to deliver non-trivial qualities of service
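The checklist can be read as a simple conjunction: a system counts as a Grid only if all three points hold. The snippet below is a minimal sketch of that reading. The class name `SystemProfile`, its field names and the two example systems are hypothetical and chosen only for illustration. Deciding whether each point really holds for a given system remains a judgment call that code cannot make.

```python
from dataclasses import dataclass


@dataclass
class SystemProfile:
    """Hypothetical yes/no answers to the three checklist questions."""
    coordinates_decentralized_resources: bool  # point 1
    uses_standard_open_protocols: bool         # point 2
    delivers_nontrivial_qos: bool              # point 3


def is_grid(profile: SystemProfile) -> bool:
    """A system passes the checklist only if all three points hold."""
    return (
        profile.coordinates_decentralized_resources
        and profile.uses_standard_open_protocols
        and profile.delivers_nontrivial_qos
    )


# Illustrative examples (assumed, not taken from the slides):
# a single-site cluster under one central scheduler fails point 1.
local_cluster = SystemProfile(False, True, True)
# a multi-institutional system built on open protocols passes all three.
multi_site_system = SystemProfile(True, True, True)
```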
Some Myriad Definitions
• “Coordinated resource sharing and problem solving in dynamic, multi-institutional virtual organizations”
• “Anatomy of the grid – highly flexible sharing relationships, sophisticated and precise levels of control over use of shared resources, sharing of varied resources, diverse usage modes.”
• “Controlled sharing – not free access”
• “Infrastructure enabling integrated, collaborative use of resources”
• “Sharing resources can vary dynamically over time”
• More colorful definitions keep coming
• Common keywords: coordinated, shared, multi-institutions, controlled, usage, collaboration
Differences with Other Technologies
• Enterprise-level distributed computing – limited cross-organizational support
• Current distributed computing approaches do not provide a general resource-sharing framework that addresses Virtual Organization (VO) requirements.
• WWW – just client-server. Lacks richer interaction models
• Technologies like CORBA, Java, DCOM – single organization, limited scope
• Some of the Grid techniques complement existing techniques.
Grids vs Conventional Distributed Computing (Nemeth and Sunderam)
Distributed Computing
• Virtual pool of nodes
• Set of nodes static. Users have login access. They explicitly know about nodes
• VM constructed out of a priori knowledge
• Resource assignment implicit
• Resource owning
Grid Computing
• Virtual pool of wide range of resources
• Set of nodes static/dynamic. Resources dynamic and diverse – can vary in number, can vary in performance
• Difficult for user to get a priori knowledge
• User abstraction at resource layers
• Resource sharing
• Apps. – resource requirements more than can be solved on machines “owned”
Continued (Nemeth and Sunderam)
Motivating examples
SETI@home
• To search for new life and civilizations
• Uses individual computers’ idle time by running the SETI@home screen saver
• The screen saver retrieves data, analyzes it and reports results back to the SETI project
• Looking for an extra-terrestrial signal over a 12-second period
• Each work unit takes 10 to 50 hours on an average computer – 2.4 to 3.8 trillion floating point operations
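The retrieve, analyze and report cycle described above follows a master-worker pattern, which can be sketched roughly as below. This is an illustration only, not SETI@home's actual client or server code. The names `WorkUnit`, `split_into_work_units`, `analyze` and `run_project` are hypothetical, and the threshold-based "analysis" is a stand-in for the real signal processing. In the real project each work unit would be processed on a different volunteer PC rather than in a local loop.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class WorkUnit:
    unit_id: int
    samples: List[float]


def split_into_work_units(recording: List[float], unit_size: int) -> List[WorkUnit]:
    """Project side: cut one long recording into fixed-size work units."""
    return [
        WorkUnit(unit_id=start // unit_size, samples=recording[start:start + unit_size])
        for start in range(0, len(recording), unit_size)
    ]


def analyze(unit: WorkUnit, threshold: float) -> Dict:
    """Volunteer side (placeholder analysis): flag the unit if its peak exceeds the threshold."""
    peak = max(unit.samples)
    return {"unit_id": unit.unit_id, "peak": peak, "candidate": peak > threshold}


def run_project(recording: List[float], unit_size: int, threshold: float) -> List[int]:
    """Distribute units, collect reported results, and return candidate unit ids."""
    units = split_into_work_units(recording, unit_size)
    # Simulated locally; each call stands in for one volunteer's screen saver.
    results = [analyze(unit, threshold) for unit in units]
    return [r["unit_id"] for r in results if r["candidate"]]


# Made-up data: ten quiet samples with one spike at index 7.
recording = [0.1] * 10
recording[7] = 9.0
candidates = run_project(recording, unit_size=4, threshold=1.0)
```

With a unit size of 4, the spike at index 7 falls into the second work unit (id 1), so that is the only candidate reported back.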
Steps and Statistics
Steps:
• Data collected from the Arecibo telescope in Puerto Rico onto tapes and shipped to the SETI@home lab at UC Berkeley
• Break tapes -> work units -> given to users
• Find candidate signals reported from users
Other steps:
• Checking data integrity
• Removing radio frequency interference (RFI)
• Identify final candidates
Statistics:
• 208,174,383 work units
• 1,261 tapes
Statistics from 1999-2004 (total):
• Users: 5,054,812
• Results received: 1,459,999,962
• Total CPU time: 1,988,719.151 years
• Floating point operations: 5.278185e+21
• Average CPU time per work unit: 11 hr 55 min 56.3 sec
Images and statistics from SETI web site
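The 1999-2004 totals can be checked against each other with a little arithmetic. The sketch below divides the total CPU time and the total floating point operations by the number of results received. Two assumptions are made here that the slides do not state: a year is taken as 365 days, and "results received" is treated as the count of processed work units. Under those assumptions the average comes out at about 11 hr 55 min 56 sec, matching the quoted figure, and the operations per result come out at about 3.6 trillion, inside the 2.4 to 3.8 trillion range quoted for a work unit.

```python
HOURS_PER_YEAR = 365 * 24  # assumption: 365-day years

# Totals quoted in the statistics above.
results_received = 1_459_999_962
total_cpu_years = 1_988_719.151
total_flops = 5.278185e21


def average_cpu_time_per_result(cpu_years: float, results: int):
    """Return (hours, minutes, seconds) of CPU time per result."""
    total_seconds = cpu_years * HOURS_PER_YEAR * 3600
    per_result = total_seconds / results
    hours, remainder = divmod(per_result, 3600)
    minutes, seconds = divmod(remainder, 60)
    return int(hours), int(minutes), seconds


def flops_per_result(flops: float, results: int) -> float:
    """Average floating point operations per result."""
    return flops / results


hours, minutes, seconds = average_cpu_time_per_result(total_cpu_years, results_received)
print(f"{hours} hr {minutes} min {seconds:.1f} sec per result")
print(f"{flops_per_result(total_flops, results_received):.3e} operations per result")
```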
Climateprediction.net
• Forecast climate in the 21st century
• 3 steps – explore current model, validate against past climate, forecast 21st century climate
• Different models (in terms of initial conditions, forcing [volcanoes, solar activity etc.], parameters [approximations or ranges of fixed values in the model, e.g. ice size in ocean, friction between different ocean layers]) distributed to different users
• Massive ensemble experiment
From climateprediction.net
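One way to picture a massive ensemble experiment is as the set of all combinations of initial conditions, forcings and parameter values, with each combination handed to a different user. The sketch below builds such an ensemble and assigns its members round-robin. It assumes a full factorial design and round-robin assignment purely for illustration; the slides do not say how climateprediction.net actually samples or schedules its models. All names and values (`build_ensemble`, `assign_to_users`, `ice_size`, `ocean_friction` and so on) are hypothetical.

```python
from itertools import product
from typing import Dict, List, Sequence


def build_ensemble(
    initial_conditions: Sequence[str],
    forcings: Sequence[str],
    parameter_ranges: Dict[str, Sequence[float]],
) -> List[dict]:
    """Return one model configuration per combination of inputs."""
    names = sorted(parameter_ranges)
    value_combinations = product(*(parameter_ranges[name] for name in names))
    members = []
    for ic, forcing, values in product(initial_conditions, forcings, value_combinations):
        members.append({
            "initial_condition": ic,
            "forcing": forcing,
            "parameters": dict(zip(names, values)),
        })
    return members


def assign_to_users(members: List[dict], users: Sequence[str]) -> Dict[str, List[dict]]:
    """Hand out ensemble members to users in round-robin order."""
    assignment = {user: [] for user in users}
    for index, member in enumerate(members):
        assignment[users[index % len(users)]].append(member)
    return assignment


# Made-up inputs for illustration only.
ensemble = build_ensemble(
    initial_conditions=["ic_a", "ic_b"],
    forcings=["volcanic", "solar"],
    parameter_ranges={"ice_size": [0.5, 1.0], "ocean_friction": [0.1, 0.2, 0.3]},
)
assignment = assign_to_users(ensemble, ["user1", "user2", "user3", "user4", "user5"])
```

Here 2 initial conditions, 2 forcings and 2 x 3 parameter values give 24 ensemble members, spread over 5 users.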
Steps
Experiment 1: Explore model sensitivity to parameters
Goal: Identify suitable ranges of parameters.
Methodology: Each simulation includes 3 phases:
• Calibration (15 yrs)
• Pre-industrial CO2 run (15 yrs)
• Double CO2 run (15 yrs)
Experiment 2: Simulation of 1950-2000
Goal: Assess model skill by making a probability based forecast of the past climate.
Methodology: Run the model with a range of initial conditions and parameters for the period 1950-2000. Compare model outputs with observations to assess how well the model performs.
Experiment 3: Simulation of 2000-2100
Goal: Make a probability based forecast of future climate.
Methodology: Run the model with a range of initial conditions, forcings and parameters for the period 2000-2100.
From climateprediction.net
Prime number generation - GIMPS
• Finding Mersenne prime numbers – 2^P - 1
• GIMPS aims to find the largest known Mersenne prime numbers
• 41st Mersenne prime found recently - 2^24,036,583 - 1 with 7,235,733 decimal digits !!!
• GIMPS found seven
• For mostly fun
• 1000s of Pentium PCs involved. Setup similar to SETI@home
• PCs do primality tests
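The slides do not name the primality test the PCs run. The standard test for Mersenne numbers is the Lucas-Lehmer test, which is the one sketched below: for an odd prime p, 2^p - 1 is prime exactly when s(p-2) is divisible by 2^p - 1, where s(0) = 4 and s(i+1) = s(i)^2 - 2. This is a minimal, unoptimized illustration; real searches on exponents in the millions rely on much faster multiplication. The helper `mersenne_digits` (a name chosen here) also checks the quoted digit count using the fact that 2^p - 1 has floor(p * log10(2)) + 1 decimal digits.

```python
import math


def lucas_lehmer(p: int) -> bool:
    """Return True if 2**p - 1 is prime. The exponent p must itself be prime."""
    if p == 2:
        return True  # 2**2 - 1 = 3 is prime; the recurrence below needs p > 2
    mersenne = (1 << p) - 1
    s = 4
    for _ in range(p - 2):
        s = (s * s - 2) % mersenne
    return s == 0


def mersenne_digits(p: int) -> int:
    """Number of decimal digits of 2**p - 1."""
    return math.floor(p * math.log10(2)) + 1


# Small prime exponents; only some of them give Mersenne primes.
small_prime_exponents = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31]
mersenne_prime_exponents = [p for p in small_prime_exponents if lucas_lehmer(p)]
print(mersenne_prime_exponents)
print(mersenne_digits(24_036_583))
```

The digit count for the exponent 24,036,583 evaluates to 7,235,733, in agreement with the figure quoted above.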
Other @home Projects
• genome@home – designing new genes that form working proteins in cells. To study protein evolution. Individual PCs design protein sequences
• folding@home – to study why proteins fold/misfold. Each PC simulates a particular kind of protein folding
• evolution@home – to understand and simulate evolution
• Compute-against-cancer – to study cancer cells and to design new cancer drugs
• FightAids@home – screen millions of candidate drug compounds
• Distributed.net – cryptography, secret key challenges
• More can be found in http://boinc.berkeley.edu/projects.php
The Telescience project
• Grid for remotely accessing microscopes, data analysis and visualization
• To study complex interactions of molecular and cellular biological structures and hence understand brain diseases
• Interactively steer a 400,000-volt electron microscope at UC San Diego
From TeleScience web site
References
• http://www.globus.org/research/papers/chapter2.pdf
• What is the Grid? A three point checklist. Ian Foster. GRIDToday, July 20, 2002.
• The Anatomy of the Grid: Enabling scalable virtual organizations. I. Foster, C. Kesselman, S. Tuecke. IJSA. 15(3), 2001.
• A Complete History of the Grid. Dr. Rob Baxter.
• Zsolt Nemeth, Mauro Migliardi, Dawid Kurzyniec and Vaidy Sunderam. A comparative analysis of PVM/MPI and computational grids. In EuroPVM/MPI 2002.
• Zsolt Nemeth and Vaidy Sunderam. A comparison of conventional distributed computing environments and computational grids. ICCS 2002.
• Zsolt Nemeth and Vaidy Sunderam. A formal framework for defining grid systems. CCGrid 2002.