Grid Services Presented by Karan Bhatia

Upload: jovita

Post on 21-Jan-2016


TRANSCRIPT

Page 1: Grid Services

Grid Services

Presented by Karan Bhatia

Page 2: Grid Services

2

Hype Curve

Page 3: Grid Services

3

Overview

• Grid Computing Background
– Definition
– Opportunities
– Markets

• Technical Challenges
– Security Infrastructure
– Resource Management
– Service Interoperability

• Summary

Page 4: Grid Services

4

Grid Computing is …

• “Co-ordinated resource sharing and problem solving in dynamic, multi-institutional virtual organizations.” [Foster, Kesselman, Tuecke]

– Co-ordinated - multiple resources working in concert, e.g. disk & CPU, or instruments & database, etc.

– Resources - compute cycles, databases, files, application services, instruments.

– Problem solving - focus on solving scientific problems

– Dynamic - environments that are changing in unpredictable ways

– Virtual Organization - resources spanning multiple organizations and administrative domains, security domains, and technical domains

Page 5: Grid Services

5

Grid Computing is … (Industry)

• “about finding distributed, underutilized compute resources (systems, desktops, storage) and provisioning those resources to users or applications requiring them.” [The Grid Report, Clabby Analytics]

– Distributed - all the resources lying around in departments or server rooms.

– Underutilized - typical utilization of “big iron” is 5 to 10%. Organizations save money by increasing utilization versus purchasing new resources.

– Resources - servers and server cycles, applications, data resources

– Provisioning - predict and schedule resource use depending on load.

Page 6: Grid Services

6

Types of Grids…

• Compute Grids
– SETI@home, Entropia, United Devices, Condor

• Data Grids
– Storage Resource Broker (SRB), Avaki, BIRN, GEON

• Collaboration Grids
– Instrumentation (telescience), applications

• Enterprise Grids
– Majority of commercial interest

• Partner Grids
– B2B, Academic/Govt Grids

• Service Grids
– “Utility” Computing, “On Demand”, pervasive, autonomic, etc…

Page 7: Grid Services

7

A Grid is …

• “the next generation Internet,”

• “all about free cycles à la SETI@home,”

• “a distributed object system,”

• “a new programming model,”

• “a replacement for high performance computing,”

Page 8: Grid Services

8

Example… TeleScience Grid

[Figure: the TeleScience Grid, linking imaging instruments, computational resources, large-scale databases, data acquisition and analysis, and advanced visualization.]

Page 9: Grid Services

9

Grid Resources - Networks

Page 10: Grid Services

10

Grid Resources - Compute

Page 11: Grid Services

11

Top 500.org

Page 12: Grid Services

12

Page 13: Grid Services

13

Another Grid Example … Google

• Queries
– 150 M queries/day (2000/s)
– 100 countries
– 3.3 B documents

• Hardware
– 15,000 Linux systems in 6 data centers
– 15 Tflop/s and 1000 TB total capacity
– 40-80 1U/2U servers per cabinet
– 100 Mb Ethernet switches per cabinet with gigabit uplinks
– Growth from 4000 systems (18 M queries/day)

Page 14: Grid Services

14

Grid Resources - Data

• SDSC Resources
– HPSS:
  • SDSC's central long-term data storage system
  • One of the world's largest IBM High Performance Storage System (HPSS) units
  • Currently holds more than a petabyte (a million gigabytes) of data in approximately 21 million files
  • Has the capacity to store six petabytes of data; files are added at an average rate of 10,000 gigabytes per month
– Storage-Area Network (SAN):
  • A 72-processor Sun Microsystems SunFire 15K high-end server and 11 Brocade switches (1,400 ports)
  • 225,000 gigabytes of networked disk storage for data-oriented applications

• 1 TB of data = $2500

Page 15: Grid Services

15

Protein Data Bank (PDB)

Page 16: Grid Services

16

Putting it all together… TeraGrid

Page 17: Grid Services

17

Grid Market

Page 18: Grid Services

18

Grid Companies

• IBM
– “on demand” solutions

• Sun Microsystems
– N1 initiative

• Oracle
– 10g

• Dell

• HP
– “utility” computing

• Platform Computing
– LSF, metaclustering

• United Devices
– Desktop grids

• DataSynapse

• Akamai

• Google?

• Sony online entertainment?

• Where’s Microsoft?

Page 19: Grid Services

19

Grid Organizations

• Global Grid Forum (GGF)

• Organization for the Advancement of Structured Information Standards (OASIS)

• Distributed Management Task Force (DMTF)

• World Wide Web Consortium (W3C)

• Globus Alliance

• NSF Middleware Initiative (NMI)

• NASA IPG

• DOE Science Grid

• EU DataGrid

• NSF TeraGrid

Page 20: Grid Services

20

Technical Challenges for Grid Computing

Page 21: Grid Services

21

Challenges: Security

• Grids traverse organizational boundaries
– Different administration domains have different authentication mechanisms
– Resources have different use agreements and sharing priorities

• Single sign-on
– Multiple passwords difficult to manage

• Rights delegation

• Trust
– Authentication of users
– Authorization of users
– Resource access

Page 22: Grid Services

22

Security

• Public Key Infrastructure
– Public key A.public
– Private key A.private

• Supports Encryption
– Message from A to B:
  • m’ = F(m, B.public), send m’ to B
  • B receives m’, m = F’(m’, B.private)

• Digital Signatures
– Signed message from A to B:
  • m’ = (m, F(m, A.private))
– Receiver uses A.public to verify that m’ is from A and has not been tampered with
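The two public-key operations on this slide can be sketched with textbook RSA. This is only an illustration under loud assumptions: the primes are tiny, there is no padding, and the helper names (`make_keypair`, `apply_key`, `digest`, `verify`) are invented for this example. It follows the conventional key usage: encrypt with the recipient's public key, sign with the sender's private key.

```python
import hashlib

def make_keypair(p, q, e):
    """Build a toy RSA key pair from two small primes (insecure, demo only)."""
    n = p * q
    phi = (p - 1) * (q - 1)
    d = pow(e, -1, phi)          # modular inverse of e (Python 3.8+)
    return (e, n), (d, n)        # (public key, private key)

def apply_key(value, key):
    """The function F from the slide: modular exponentiation with one key."""
    exponent, modulus = key
    return pow(value, exponent, modulus)

def digest(message, modulus):
    """Hash the message and reduce it so it fits the toy modulus."""
    h = int.from_bytes(hashlib.sha256(message).digest(), "big")
    return h % modulus

def verify(message, signature, public_key):
    """Check that the signature was produced with the matching private key."""
    return apply_key(signature, public_key) == digest(message, public_key[1])

# Hypothetical parties A (sender) and B (receiver).
A_public, A_private = make_keypair(61, 53, 17)
B_public, B_private = make_keypair(89, 97, 5)

# Encryption: A encrypts with B's public key, B decrypts with B's private key.
m = 1234                                  # must be smaller than B's modulus
cipher = apply_key(m, B_public)
recovered = apply_key(cipher, B_private)

# Signature: A signs a digest with A's private key, anyone checks with A.public.
text = b"run job 42"
signature = apply_key(digest(text, A_public[1]), A_private)

print(recovered == m, verify(text, signature, A_public))
```

Real deployments use large keys, padding schemes, and a vetted cryptography library rather than hand-rolled arithmetic.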

Page 23: Grid Services

23

Grid Security Infrastructure (GSI)

• A central concept in GSI authentication is the certificate.

• Every user and service on the Grid is identified via a certificate, a text file containing the following information:
– a subject name identifying the person or object that the certificate represents,
– the public key belonging to the subject,
– the identity of a Certificate Authority (CA) that has signed the certificate to certify that the public key and the identity both belong to the subject,
– the digital signature of the named CA.

Page 24: Grid Services

24

Proxy Certificate

• A proxy consists of a new certificate with a new public and private key.

• The new certificate contains the owner's identity modified slightly to indicate that it is a proxy.

• The new certificate is signed by the owner rather than a CA.

– This is called a self-signed certificate.

• The certificate also includes a time notation after which the proxy should no longer be accepted by others.

• Proxies have limited lifetimes in order to minimize the security vulnerability.

• Because the proxy isn't valid for very long, it doesn't have to be kept quite as secure as the owner's private key.
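The proxy idea can be sketched as a two-link chain: a CA signs the user's certificate, the user signs a short-lived proxy, and a verifier checks both signatures plus the expiry time. This is a toy model, not the GSI or X.509 format: the `Certificate` fields, the subject strings, the tiny RSA keys, and all function names are assumptions made for illustration.

```python
import hashlib
from dataclasses import dataclass

def make_keypair(p, q, e):
    """Toy RSA key pair from small primes (insecure, demo only)."""
    n, phi = p * q, (p - 1) * (q - 1)
    return (e, n), (pow(e, -1, phi), n)   # (public, private)

def digest(data, modulus):
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % modulus

@dataclass
class Certificate:
    subject: str        # who the certificate represents
    public_key: tuple   # the subject's public key
    issuer: str         # who signed it (a CA, or the owner for a proxy)
    not_after: float    # time after which it must be rejected
    signature: int = 0

    def payload(self):
        return f"{self.subject}|{self.public_key}|{self.issuer}|{self.not_after}".encode()

def sign(cert, issuer_private):
    d, n = issuer_private
    cert.signature = pow(digest(cert.payload(), n), d, n)
    return cert

def signed_by(cert, issuer_public):
    e, n = issuer_public
    return pow(cert.signature, e, n) == digest(cert.payload(), n)

def proxy_is_valid(proxy, user_cert, ca_public, now):
    """Accept the proxy only if it is unexpired and the chain verifies."""
    return (now < proxy.not_after
            and proxy.issuer == user_cert.subject
            and signed_by(proxy, user_cert.public_key)
            and signed_by(user_cert, ca_public))

ca_public, ca_private = make_keypair(61, 53, 17)
user_public, user_private = make_keypair(89, 97, 5)
proxy_public, proxy_private = make_keypair(101, 103, 7)

issued_at = 1_000_000.0   # fixed timestamp so the example is deterministic
user_cert = sign(Certificate("/O=Grid/CN=Alice", user_public,
                             "/O=Grid/CN=Demo CA", issued_at + 365 * 86400),
                 ca_private)
# The proxy gets a new key pair, a modified subject, and a 12-hour lifetime.
proxy = sign(Certificate("/O=Grid/CN=Alice/CN=proxy", proxy_public,
                         user_cert.subject, issued_at + 12 * 3600),
             user_private)

print(proxy_is_valid(proxy, user_cert, ca_public, issued_at + 3600))
print(proxy_is_valid(proxy, user_cert, ca_public, issued_at + 13 * 3600))
```

The short lifetime is what limits the damage if the proxy's private key leaks: after `not_after` the chain check fails regardless of the signatures.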


Page 25: Grid Services

25

Mutual Authentication


Page 26: Grid Services

26

Additional Challenges

• Certificate Management
– MyProxy

• Role-based Access Control
– CAS, VOM

• Authorization services

• Integration with applications & Portals

Page 27: Grid Services

27

Challenges: Resource Management

• Resources loosely-coupled
– Higher network latencies
– Planned and unplanned disruptions

• How to provide QoS guarantees?

• Case Study: Entropia Desktop Grids
– Additional trust/security issues

Page 28: Grid Services

29

Entropia 1: GIMPS

• Over 1.5 Billion CPU hours served

• 300,000+ machines, over 4 years operational

• Every PC and hardware config imaginable (proc, memory, disk, etc.)

• Every networking hookup imaginable

• Found 35th, 36th, 37th, 38th, and 39th Mersenne Primes

Page 29: Grid Services

30

Entropia 2: FightAids@home

• Sept 2000 launch

• Internet-Based

• 54,657 total machines

• 10,770,506 total hours of computation

• 27,881 peak billions of calculations/sec

Page 30: Grid Services

31

Entropia 3: DCGrid

• Enterprise focus
– Tremendous resources available in enterprise
– Complements other HPC resources

• Computing Platform
– Arbitrary application (open scheduling model)
– Security, unobtrusiveness, manageability guaranteed

• Focus on
– Pharmaceuticals, Chemicals, and Materials
– Financial Services

Page 31: Grid Services

32

DCGrid Architecture

Page 32: Grid Services

35

Server vs. Desktop Grids

• Server environment
– Fixed IP, always connected
– Always-on operation
– Moderate number of systems (10’s – 100’s)
– Dedicated use, trusted systems

• Desktop environment
– Dynamic, temporary IP, intermittent connection
– Off evenings, off weekends, off lunch
– Large numbers of systems (100’s – 1000’s - ?)
– Shared resources, potentially untrusted users

• These differences give rise to desktop Grid challenges

Page 33: Grid Services

36

Typical PC-Grid Environment

[Figure: plot over Time (hours), x-axis from 552 to 720, y-axis from 0 to 700.]

Page 34: Grid Services

37

PC-Grid Challenges

• Provide a stable compute environment for apps
– Isolate app from variable desktop environment

• Operate in environment of dynamic use
– Unobtrusiveness and Fault Tolerance are key!

• Provide simple application integration
– Support ANY Application without modification

• Provide centralized management console
– Zero additional management costs

Page 35: Grid Services

38

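[Figure: DCGrid layers and workflow. Layers: Job Management (Job Manager), Resource Scheduling (Subjob Scheduler), Physical Node Management (Node Manager). Work flows between the End-user and the Entropia Clients through these layers in numbered workflow steps; arrows are labeled computation, resource, and resource description.]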

Page 36: Grid Services

39

Stable Compute Environment

• Entropia Proprietary Sandbox
– Binary-level protection
– System virtualization (registry, file system, network)

• Open Scheduling Infrastructure
– Intelligent scheduling (match resources to subjob requirements)
– Manage subjob redundancy/fault tolerance
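Matching resources to subjob requirements and running redundant copies can be sketched as below. This is not Entropia's scheduler, which the slides describe only as proprietary; the `Client` and `Subjob` fields, the `copies` parameter, and the least-loaded tie-break are all assumptions chosen to keep the example small.

```python
from dataclasses import dataclass

@dataclass
class Client:
    name: str
    memory_mb: int
    disk_mb: int

@dataclass
class Subjob:
    name: str
    memory_mb: int   # minimum memory the subjob needs
    disk_mb: int     # minimum scratch disk the subjob needs

def eligible(client, subjob):
    """A client can run a subjob only if it meets every requirement."""
    return client.memory_mb >= subjob.memory_mb and client.disk_mb >= subjob.disk_mb

def schedule(subjobs, clients, copies=2):
    """Assign each subjob to up to `copies` eligible, least-loaded clients."""
    load = {c.name: 0 for c in clients}
    plan = {}
    for subjob in subjobs:
        candidates = [c for c in clients if eligible(c, subjob)]
        candidates.sort(key=lambda c: (load[c.name], c.name))
        chosen = candidates[:copies]      # redundancy: several clients per subjob
        for c in chosen:
            load[c.name] += 1
        plan[subjob.name] = [c.name for c in chosen]
    return plan

clients = [Client("pc1", 512, 1000), Client("pc2", 256, 1000), Client("pc3", 1024, 2000)]
subjobs = [Subjob("dock-1", 200, 500), Subjob("dock-2", 600, 500)]
plan = schedule(subjobs, clients)
print(plan)
```

Redundant copies trade extra cycles for fault tolerance: if one desktop is switched off mid-run, another copy of the same subjob can still finish.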

Page 37: Grid Services

40

Manage Dynamic Use

• PC primary use must be respected!

• Entropia Proprietary Sandbox
– Guaranteed to run at idle priority
– Limit application capability
– Monitor page faults, network access

• Management
– Provide time-of-use windows
– Different levels of unobtrusiveness

• Gathers 95+ % of cycles

Page 38: Grid Services

41

Application Integration

• Support any Win32 binary
– Language Neutral (C, C++, Fortran, Java, C#, etc.)
– Compiler/library Neutral

[Figure: applications (App A, App B, App C) pass through Application Preparation Tools onto the Open Grid Platform and are run on clients (Client1, Client2) with commands such as qsub and qstat.]

Page 39: Grid Services

42

Manageability

Page 40: Grid Services

43

Application Performance

[Figure: four throughput plots against Number of Clients, covering the applications GOLD, AUTODOCK, HMMER, and DOCK. Panels: Sequences per hour (series: Entropia, 1CPU SGI, 1CPU SUN, Linear (Entropia)); Throughput (Packets per Hour); Compounds per Hour (up to 50 clients); Compounds per Hour (up to 500 clients).]

Page 41: Grid Services

44

Scheduling Performance

[Figure: Job 14 Nodes (94 clients). Client ID (0 to 100) plotted against Time (secs), from 0 to 21600.]

Page 42: Grid Services

45

Challenges: Service Interoperability

• Trying to force homogeneity on users is futile. Everyone has their own preferences, sometimes even dogma.

• The Internet provides the model…

Page 43: Grid Services

46

Typical Application

[Figure: layered diagram of a typical Grid application.
– Users work with client applications
– Application services organize VOs & enable access to other services
– Collective services aggregate &/or virtualize resources
– Resources implement standard access & management interfaces
Components shown: Web Browser, Data Viewer Tool, Chat Tool, Simulation Tool, Telepresence Monitor, Web Portal, Registration Service, Credential Repository, Certificate Authority, Data Catalog, Compute Server (×2), Database service (×3), Camera (×2).]

Page 44: Grid Services

47

Typical Application

• Implementations are provided by a mix of– Application-specific code

– “Off the shelf” tools and services

– Tools and services from the Globus Toolkit

– Tools and services from the Grid community (compatible with GT)

• Glued together by…– Application development

– System integration

Page 45: Grid Services

48

How it Really Happens (without the Grid)

[Figure: the same layered diagram (Web Browser, Data Viewer Tool, Chat Tool, Simulation Tool, Telepresence Monitor, Web Portal, Registration Service, Credential Repository, Certificate Authority, Data Catalog, Compute Server ×2, Database service ×3, Camera ×2), with labels A through E. Component sources: Grid Community 0, Globus Toolkit 0, Off the Shelf 13, Application Developer 9.]

Page 46: Grid Services

49

How it Really Happens (with the Grid)

[Figure: the same layered diagram with Grid components substituted: Web Browser, Data Viewer Tool, Simulation Tool, Telepresence Monitor, Portal, portlet, MyProxy, Certificate Authority, Globus MCS/RLS, Globus Index Service, Globus GRAM (×2), Globus DAI (×3), Compute Server (×2), Database service (×3), Camera (×2). Component sources: Grid Community 4, Globus Toolkit 4, Off the Shelf 9, Application Developer 2.]

Page 47: Grid Services

50

Theory -> Practice

Page 48: Grid Services

51

What You Get in the Globus Toolkit

• OGSI (3.x)/WSRF (4.x) Core Implementation
– Used to develop and run OGSA-compliant Grid Services (Java, C/C++)

• Basic Grid Services
– Popular among current Grid users, common interfaces to the most typical services; includes both OGSA and non-OGSA implementations

• Developer APIs
– C/C++ libraries and Java classes for building Grid-aware applications and tools

• Tools and Examples
– Useful tools and examples based on the developer APIs

Page 49: Grid Services

52

Components in Globus Toolkit 3.0

[Figure: component chart with columns Security, Data Management, Resource Management, Information Services, and WS Core. Components shown: GSI, WS-Security, RFT (OGSI), RLS, WU GridFTP, Java WS Core (OGSI), OGSI C Bindings, MDS2, WS-Index (OGSI), Pre-WS GRAM, WS GRAM (OGSI).]

Page 50: Grid Services

53

Components in Globus Toolkit 3.2

[Figure: component chart with columns Security, Data Management, Resource Management, Information Services, and WS Core. Components shown: GSI, WS-Security, CAS (OGSI), SimpleCA, RFT (OGSI), RLS, OGSI-DAI, WU GridFTP, XIO, Java WS Core (OGSI), OGSI C Bindings, OGSI Python Bindings (contributed), pyGlobus (contributed), MDS2, WS-Index (OGSI), Pre-WS GRAM, WS GRAM (OGSI).]

Page 51: Grid Services

54

Planned Components in GT 4.0

[Figure: component chart with columns Security, Data Management, Resource Management, Information Services, and WS Core. Components shown: GSI, WS-Security, CAS (WSRF), SimpleCA, Authz Framework, RFT (WSRF), RLS, OGSI-DAI, New GridFTP, XIO, Java WS Core (WSRF), C WS Core (WSRF), pyGlobus (contributed), MDS2, WS-Index (WSRF), Pre-WS GRAM, WS-GRAM (WSRF), CSF (contribution).]

Page 52: Grid Services

55

Grid and Web Services Convergence

The definition of WSRF means that the Grid and Web services communities can move forward on a common base.

Page 53: Grid Services

Grid Services Example

• (from Sotomayor tutorial)

• MathService API:
– add(int x)
– subtract(int x)
– getValue()

Note 1: How is this different from
– Web Services?
– CORBA?
– COM/DCOM?

Note 2: This is too simple! What about
– co-ordination/workflows
– personalization
– presentation
– security
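The MathService interface can be pictured as a small stateful object: each instance keeps a running value that `add` and `subtract` modify and `getValue` returns. The sketch below is plain Python rather than the tutorial's Java and WSDL, and the assumption that the value starts at zero is made only for this illustration.

```python
class MathService:
    """Stateful service sketch: every instance holds its own running value."""

    def __init__(self):
        self._value = 0          # assumed initial state

    def add(self, x: int) -> None:
        self._value += x

    def subtract(self, x: int) -> None:
        self._value -= x

    def get_value(self) -> int:
        return self._value

# Two instances keep independent state, which is what makes them "transient
# services" rather than a single stateless endpoint.
first = MathService()
second = MathService()
first.add(5)
first.subtract(2)
second.add(10)
print(first.get_value(), second.get_value())
```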

Page 54: Grid Services

OGSI (or what is a grid service?)

• Using web service infrastructure
– MathService is defined by WSDL (like IDL)

<?xml version="1.0" encoding="UTF-8"?>
...
<types>
<xsd:schema targetNamespace="http://www.gt3tutorial.org/namespaces/0.2/core/gwsdl/Math"
    attributeFormDefault="qualified"
    elementFormDefault="qualified"
    xmlns="http://www.w3.org/2001/XMLSchema">
  <xsd:element name="add">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element name="value" type="xsd:int"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
  <xsd:element name="addResponse">
    <xsd:complexType/>
  </xsd:element>
...
</types>

<message name="AddInputMessage">
  <part name="parameters" element="tns:add"/>
</message>
<message name="AddOutputMessage">
  <part name="parameters" element="tns:addResponse"/>
</message>
...

<gwsdl:portType name="MathPortType" extends="ogsi:GridService">
  <operation name="add">
    <input message="tns:AddInputMessage"/>
    <output message="tns:AddOutputMessage"/>
    <fault name="Fault" message="ogsi:FaultMessage"/>
  </operation>
  <operation name="subtract">
    <input message="tns:SubtractInputMessage"/>
    <output message="tns:SubtractOutputMessage"/>
    <fault name="Fault" message="ogsi:FaultMessage"/>
  </operation>
  <operation name="getValue">
    <input message="tns:GetValueInputMessage"/>
    <output message="tns:GetValueOutputMessage"/>
    <fault name="Fault" message="ogsi:FaultMessage"/>
  </operation>
</gwsdl:portType>

</definitions>

Page 55: Grid Services

Basic Concepts

Page 56: Grid Services

The GridService PortType

• a “grid service” is a web service that implements the GridService PortType

<portType name="GridService">
  <operation name="setServiceData"> [snip] </operation>
  <operation name="destroy"> [snip] </operation>
  <operation name="requestTerminationAfter"> [snip] </operation>
  <operation name="requestTerminationBefore"> [snip] </operation>
  <operation name="findServiceData"> [snip] </operation>
</portType>

<gwsdl:portType name="GridService">
  <sd:serviceData maxOccurs="unbounded" minOccurs="1" modifiable="false" mutability="constant"
      name="interface" nillable="false" type="xsd:QName"/>
  <sd:serviceData maxOccurs="unbounded" minOccurs="0" modifiable="false" mutability="mutable"
      name="serviceDataName" nillable="false" type="xsd:QName"/>
  <sd:serviceData maxOccurs="1" minOccurs="1" modifiable="false" mutability="mutable"
      name="factoryLocator" nillable="true" type="ogsi:LocatorType"/>
  <sd:serviceData maxOccurs="unbounded" minOccurs="0" modifiable="false" mutability="extendable"
      name="gridServiceHandle" nillable="false" type="ogsi:HandleType"/>
  <sd:serviceData maxOccurs="unbounded" minOccurs="1" modifiable="false" mutability="mutable"
      name="gridServiceReference" nillable="false" type="ogsi:ReferenceType"/>
  <sd:serviceData maxOccurs="unbounded" minOccurs="1" modifiable="false" mutability="static"
      name="findServiceDataExtensibility" nillable="false" type="ogsi:OperationExtensibilityType"/>
  <sd:serviceData maxOccurs="unbounded" minOccurs="1" modifiable="false" mutability="static"
      name="setServiceDataExtensibility" nillable="false" type="ogsi:OperationExtensibilityType"/>
  <sd:serviceData maxOccurs="1" minOccurs="1" modifiable="false" mutability="mutable"
      name="terminationTime" nillable="false" type="ogsi:TerminationTimeType"/>
  <sd:staticServiceDataValues>
    <ogsi:findServiceDataExtensibility inputElement="ogsi:queryByServiceDataNames"/>
    <ogsi:setServiceDataExtensibility inputElement="ogsi:setByServiceDataNames"/>
    <ogsi:setServiceDataExtensibility inputElement="ogsi:deleteByServiceDataNames"/>
  </sd:staticServiceDataValues>
</gwsdl:portType>

Page 57: Grid Services

GridService PortType

• FindServiceData()
• QueryByServiceDataNames()
• GetServiceData()
• SetByServiceDataNames()
• DeleteByServiceDataNames()
• RequestTerminationAfter()
• RequestTerminationBefore()
• Destroy()

Page 58: Grid Services

Capabilities of a Grid Service

• 2-level naming (GSH vs. GSR)

• Factories

• Lifetime management

• Service Data Elements

• Event Notification

• ServiceGroups

Page 59: Grid Services

GSH versus GSR

• A GSH (Grid Service Handle) is a unique name for a Grid Service Instance

• A GSR (Grid Service Reference) is a perhaps temporary mechanism to access the Grid Service Instance
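The two-level naming can be sketched as a lookup table: the GSH is a stable name, and a resolver maps it to whatever GSR currently reaches the instance. The `HandleResolver` class, its methods, and the example URLs below are hypothetical and stand in for whatever resolution mechanism a real deployment uses.

```python
class HandleResolver:
    """Maps permanent handles (GSH) to current, possibly temporary, references (GSR)."""

    def __init__(self):
        self._table = {}

    def publish(self, gsh: str, gsr: str) -> None:
        """Record or replace the reference currently bound to a handle."""
        self._table[gsh] = gsr

    def resolve(self, gsh: str) -> str:
        """Return the reference a client should use right now."""
        return self._table[gsh]

resolver = HandleResolver()
handle = "http://example.org/ogsa/services/MathService/instance-1"   # made-up GSH

# The instance first lives on one host...
resolver.publish(handle, "soap://node-a.example.org:8080/math/1")
before = resolver.resolve(handle)

# ...then moves; the handle is unchanged, only the reference is replaced.
resolver.publish(handle, "soap://node-b.example.org:8080/math/1")
after = resolver.resolve(handle)

print(before, after)
```

Clients that hold only the handle keep working after the move, as long as they resolve it again when a reference stops responding.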

Page 60: Grid Services

Factories

• Create new instances of services dynamically

• Individualized Instances

• lifetime management techniques
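A factory with soft-state lifetime management can be sketched as follows. The class and method names are invented for this illustration, and `request_termination_after` is only loosely modeled on the GridService operation of the same name: here it simply pushes the termination time later, acting as a keepalive.

```python
import itertools

class ServiceInstance:
    """One individualized instance created by a factory."""

    def __init__(self, handle, termination_time):
        self.handle = handle
        self.termination_time = termination_time

    def request_termination_after(self, when):
        # Keepalive: extend the lifetime, never shorten it.
        self.termination_time = max(self.termination_time, when)

class Factory:
    """Creates instances on demand and reclaims those whose lifetime lapsed."""

    def __init__(self, base_handle):
        self._base = base_handle
        self._ids = itertools.count(1)
        self.instances = {}

    def create(self, now, lifetime):
        handle = f"{self._base}/{next(self._ids)}"
        instance = ServiceInstance(handle, now + lifetime)
        self.instances[handle] = instance
        return instance

    def sweep(self, now):
        """Destroy every instance whose termination time has passed."""
        expired = [h for h, i in self.instances.items() if i.termination_time <= now]
        for h in expired:
            del self.instances[h]
        return expired

factory = Factory("http://example.org/ogsa/MathService")   # made-up handle prefix
a = factory.create(now=0, lifetime=60)
b = factory.create(now=0, lifetime=60)
a.request_termination_after(300)    # client of `a` is still interested
expired = factory.sweep(now=100)    # `b` was never renewed, so it is reclaimed
print(expired, sorted(factory.instances))
```

Because unrenewed instances simply time out, a crashed client cannot leak service instances forever.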

Page 61: Grid Services

Service Data Elements

• Generalized State

– useful for describing capability

– Get/Set model similar to JavaBeans properties

• Can specify initial values in WSDL

• Integrated with Notification mechanism
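The get/set model for service data can be sketched as a named store with a per-element mutability flag. The `ServiceData` class is hypothetical, and it models only two of the mutability values that appear in the GridService definition ("constant" and "mutable"); the method names echo findServiceData and setServiceData but do not reproduce their real message formats.

```python
class ServiceData:
    """Named state elements with declared initial values and mutability."""

    def __init__(self):
        self._values = {}
        self._mutability = {}

    def declare(self, name, initial, mutability="mutable"):
        """Declare an element, much as initial values can be given in WSDL."""
        self._values[name] = initial
        self._mutability[name] = mutability

    def find_service_data(self, names):
        """Query by name: return the current value of each requested element."""
        return {n: self._values[n] for n in names}

    def set_service_data(self, name, value):
        """Update an element, refusing writes to constant ones."""
        if self._mutability[name] != "mutable":
            raise PermissionError(f"{name} is {self._mutability[name]}")
        self._values[name] = value

sde = ServiceData()
sde.declare("interface", "MathPortType", mutability="constant")
sde.declare("terminationTime", 100)

sde.set_service_data("terminationTime", 200)
snapshot = sde.find_service_data(["interface", "terminationTime"])

try:
    sde.set_service_data("interface", "OtherPortType")
    rejected = False
except PermissionError:
    rejected = True

print(snapshot, rejected)
```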

Page 62: Grid Services

Service Data Elements: GridService

• Interface

• ServiceDataName

• FactoryLocator

• GridServiceHandle

• GridServiceReference

• TerminationTime

Page 63: Grid Services

Notifications

• Source
– implements NotificationSourcePortType
– sends a notification message (XML Element) to Sinks

• Sink
– implements NotificationSinkPortType
– sends a notification subscription request to source
– causes a GridService Instance of porttype NotificationSubscription to be created
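The source/sink relationship is a publish-subscribe pattern in which each subscription is itself an object that can be managed. The sketch below uses plain Python objects; the class names mirror the roles on the slide, but the methods (`subscribe`, `publish`, `deliver`, `cancel`) are assumptions, not the OGSI operations.

```python
class Sink:
    """Receives notification messages."""

    def __init__(self, name):
        self.name = name
        self.received = []

    def deliver(self, topic, message):
        self.received.append((topic, message))

class Subscription:
    """Created by a subscribe request; cancelling it stops delivery."""

    def __init__(self, source, sink, topic):
        self.source, self.sink, self.topic = source, sink, topic

    def cancel(self):
        self.source.subscriptions.remove(self)

class Source:
    """Keeps subscriptions and pushes messages to the matching sinks."""

    def __init__(self):
        self.subscriptions = []

    def subscribe(self, sink, topic):
        subscription = Subscription(self, sink, topic)
        self.subscriptions.append(subscription)
        return subscription

    def publish(self, topic, message):
        for s in list(self.subscriptions):
            if s.topic == topic:
                s.sink.deliver(topic, message)

source = Source()
monitor = Sink("monitor")
subscription = source.subscribe(monitor, "value")

source.publish("value", 5)       # delivered
source.publish("other", 99)      # no subscriber for this topic
subscription.cancel()
source.publish("value", 7)       # no longer delivered

print(monitor.received)
```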

Page 64: Grid Services

ServiceGroups

• A grid service that maintains information about other grid services
• Can be used to implement a classic registry model
• Can be used for dataset replication
• A grid service can belong to more than one Service Group
• Membership in a ServiceGroup can be homogeneous or heterogeneous
• Service group portTypes are optional
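A ServiceGroup used as a classic registry can be sketched as a table of member handles with the port types each one offers. The `ServiceGroup` class, its methods, and the handles are invented for this example; the point is only that one service may appear in several groups and that a group may mix different kinds of members.

```python
class ServiceGroup:
    """Keeps information about member services and answers lookups."""

    def __init__(self, name):
        self.name = name
        self._members = {}

    def add(self, handle, port_types):
        self._members[handle] = set(port_types)

    def remove(self, handle):
        self._members.pop(handle, None)

    def find(self, port_type):
        """Registry lookup: handles of members that implement a port type."""
        return sorted(h for h, pts in self._members.items() if port_type in pts)

compute = ServiceGroup("compute-services")
everything = ServiceGroup("site-registry")     # heterogeneous membership

math_handle = "http://example.org/ogsa/MathService/1"      # made-up handles
data_handle = "http://example.org/ogsa/DataService/1"

compute.add(math_handle, ["GridService", "MathPortType"])
everything.add(math_handle, ["GridService", "MathPortType"])   # same service, second group
everything.add(data_handle, ["GridService", "DataPortType"])

math_services = everything.find("MathPortType")
all_services = everything.find("GridService")
print(math_services, all_services)
```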

Page 65: Grid Services

Grid Services: Summary

• Extends Web Services to support Transient Services
– WSDL 1.2 expected to include extensions

• Requires support for factories, lifetime management, soft-state management, and notifications

• Java implementation pretty solid
– Security implementation still shaky

Page 66: Grid Services

69

Other Challenges

• Developing user interfaces

• Data Management

• Scheduling/co-scheduling of resources

• Failure management

• Application development

• Performance

• Many others…

Page 67: Grid Services

70

What I hope you got from this talk

• Grid Computing is about
– Co-ordinated use of different resources
– Provisioning resources for increased utilization
– Scaling to large numbers of resources, services and users

• Many systems being built

• Many Applications being developed