
Upload: techglyphs

Post on 16-Apr-2017


Grid Computing BT9002 Part-2

By Milan K Antony


1. What is the Open Grid Services Architecture (OGSA)?

The Open Grid Services Architecture (OGSA) describes an architecture for a service-oriented grid computing environment for business and scientific use, developed within the Global Grid Forum (GGF). OGSA represents a new vision of both the grid and web services. By defining standard communication protocols and formats, OGSA provides the means to build truly large-scale, interoperable grid systems. The OGSA Working Group in the Global Grid Forum produces a set of documents detailing this vision. The OGSA vision is being instantiated in the Open Grid Services Infrastructure (OGSI). OGSA is based on several other Web service technologies, notably WSDL and SOAP, but it aims to be largely agnostic in relation to the transport-level handling of data.

Briefly, OGSA is a distributed interaction and computing architecture based around services, assuring interoperability on heterogeneous systems so that different types of resources can communicate and share information. OGSA has been described as a refinement of the emerging Web Services architecture, specifically designed to support Grid requirements. OGSA has been adopted as a grid architecture by a number of grid projects, including the Globus Alliance. Conceptually, OGSA was first suggested in a seminal paper by Ian Foster and colleagues called "The Physiology of the Grid", and later developed by GGF working groups, which resulted in a GGF informational document entitled The Open Grid Services Architecture, Version 1.5.

According to the OGSA Roadmap document, OGSA is:

An architectural process in which the GGF's OGSA Working Group collects requirements and maintains a set of informational documents that describe the architecture.

A set of normative specifications and profiles that document the precise requirements for a conforming hardware or software component.

Software components that adhere to the OGSA specifications and profiles, enabling deployment of grid solutions that are interoperable even though they may be based on implementations from multiple sources.


2. What are the major goals of OGSA?

The major goals of OGSA are:

Identifying the use cases that drive the OGSA platform components

Identifying and defining the core OGSA platform components

Defining hosting and platform specific bindings

Defining resource models and resource profiles with interoperable solutions

There has been a lot of activity in the GGF to define the use cases and core platform services. In the next section, we start with some use cases that drive the architecture behind OGSA.

3. List and explain OGSA functionalities for use case NFC.

This use case uses the following OGSA functionalities:

1. Discovery. The clients need to discover network services before they are used. Service brokers need to discover hardware and software availability.

2. Workflow management. A fusion Grid network service is a workflow of multiple components (remote execution, input and output data transfer, etc.).


3. Scheduling of service tasks. The service provider (or a broker acting on the service provider's behalf) needs to schedule resources in order to meet the execution constraints requested by the client.

4. Disaster recovery. As the service provider (or a broker acting on its behalf) strives to meet the client's end-to-end constraints, some degree of adaptation may have to be used to prevent failure.

5. Brokering. The service broker identifies software and platforms suitable for execution requested by the client.

6. Load Balancing. Some load balancing may be required to use the service provider's resources more efficiently.

7. Fault Tolerance. A reliable solution is needed in order to provide the time-critical execution capability.

8. Transport Management. Reliable transport management is essential to obtain the end-to-end QoS required by this application.

9. Legacy application management. Realizing the Grid's potential to deal with legacy issues was one of the foremost motivations for this project.

10. Services facilitating brokering. This capability is essential for the service broker to compose and later execute a workflow meeting the requested constraints.

11. Application and network-level firewalls. This is a long-standing problem in the fusion use case. It is made particularly difficult by the many different policies we are dealing with and particularly harsh restrictions at international sites.


12. Agreement-based interaction. This project requires agreement-based interaction capable of specifying and enacting agreements between clients and service providers (not necessarily human) and then composing those agreements into higher-level end user structures.

13. Authorization and usage policies. We also require use-policy specification and enforcement mechanisms as described above.

4. Describe briefly the Scenarios in NFC use case.

In the experimental scenario described above, a scientist at one of the NFC sites (a client site) needs to remotely run code installed and maintained at another NFC site (a service provider site) during an experiment within time bound T (typically on the order of 10 minutes). For a very simple execution, the following would be available on the service provider's side: a script that downloads experimental data for the application input once that data becomes available; a suitable “short-running” configuration of an application, capable of executing in less than T; a script delivering results to the client; and an execution plan, or workflow, describing the sequence of these actions and their QoS dependencies.

To ensure that the code executes with the required QoS (in this case, within time T), the scientist at the client site enters into a contract with the application server and as a result is guaranteed code execution within T any time it is requested during the experimental availability window (typically a day). Since only a few such executions may be requested during that day, and the service provider's resources have to be shared with other clients, it is essential that resource allocations are not overgenerous and that other software can share the resource with the time-critical application, getting preempted whenever the situation requires. When the client claims the execution based on the contract, the service provider initiates and monitors the run, adaptively recovering from failure of specific actions if needed. Depending on the importance of the run, the service provider could over-provision or replicate the run.

This scenario can become more sophisticated depending on the service in question. It is essential that the execution time and other QoS aspects experienced by the client are end-to-end. In other words, the service provider accounts not only for application execution, but also for database access, data transfer, and other activities. It is important to note that making data available before transfer time (replication) cannot be leveraged in this case, as the data becomes available dynamically. Similarly, in national (and potentially international) deployment, data transfer will become a significant factor, which cannot currently be reliably managed. Also, it is important that QoS-based execution is available to small fusion labs in small centers as well as large fusion labs in large centers.

Apart from time-critical execution, fusion codes can also require a mode of execution that is not time-critical but that provides accurate results, or the time requirement can be relaxed to completion by a certain deadline rather than within a specific amount of time.
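The end-to-end requirement in this scenario can be illustrated with a small calculation. The following Python sketch is only an illustration: the step names follow the scenario, but the `Step` class, the `fits_time_bound` function, and all of the time estimates are hypothetical and are not taken from the NFC project.

```python
from dataclasses import dataclass


@dataclass
class Step:
    """One action in the execution plan, with an assumed time estimate."""
    name: str
    estimated_seconds: float


def fits_time_bound(steps, bound_seconds):
    """Return (fits, total): whether the summed estimates stay within the bound."""
    total = sum(step.estimated_seconds for step in steps)
    return total <= bound_seconds, total


# Hypothetical estimates for one run; real values would come from monitoring.
workflow = [
    Step("download experimental data", 90.0),
    Step("run short-running application", 360.0),
    Step("deliver results to client", 60.0),
]

T = 600.0  # time bound T of 10 minutes, in seconds
fits, total = fits_time_bound(workflow, T)
print(f"total={total:.0f}s bound={T:.0f}s fits={fits}")
```

The point of the sketch is that the bound T is checked against the sum of all steps, including data transfer and result delivery, and not against the application run alone.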

5. What is OGSI? Briefly explain.

OGSA describes the features needed to implement the services provided by the grid as web services. It does not, however, provide the details of implementation. The Open Grid Services Infrastructure (OGSI) provides the formal, technical specification needed to implement grid services. It describes, in terms of the Web Services Description Language (WSDL), the interfaces that define a grid service. OGSI can therefore be regarded as the base for OGSA.

The Open Grid Services Infrastructure (OGSI) defines mechanisms for creating, managing, and exchanging information among entities called Grid services.

A Grid service is a Web service that conforms to a set of conventions (interfaces and behaviors) that define how a client interacts with it. In this unit, we focus on technical details, providing a full specification of the behaviors and Web Services Description Language (WSDL) interfaces that define a Grid service. A “Grid service instance” is a service that conforms to a set of conventions, expressed as WSDL interfaces, extensions, and behaviors, for such purposes as lifetime management, discovery of characteristics, and notification. OGSI version 1.0 defines a component model that extends WSDL and XML Schema definition to incorporate the concepts of stateful Web services, extension of Web services interfaces, etc.

6. Write a note on the service data concept.

A Grid service is a stateful Web service. This approach in OGSI identified the need for a common mechanism to expose a service instance's state data to service requestors for query, update, and change notification. Since this concept is applicable to any Web service, including those used outside the context of Grid applications, a common approach to exposing Web service state data, called “serviceData”, can be proposed.

In order to provide a complete description of the interface of a stateful Web service, it is necessary to describe the elements of its state that are externally observable. The service data concept can be extended to any stateful Web service that needs to declare its publicly available state information. The need to declare service data as part of the service's external interface is roughly equivalent to the idea of declaring attributes as part of an object-oriented interface described in an object-oriented interface definition language (IDL).

Since WSDL defines operations and messages for portTypes, the declared state of a service MUST be externally accessed only through service operations defined as part of the service interface. To avoid the need to define serviceData-specific operations for each serviceData element, the Grid service portType provides base operations for manipulating serviceData elements by name. The serviceData declaration is the mechanism used to express the elements of publicly available state exposed by the service's interface. OGSI defines extensible operations for querying (get), updating (set), and subscribing to notification of changes in serviceData elements.
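The idea of a few base operations that work on state elements by name, instead of one operation per element, can be sketched in plain Python. This is a toy in-memory model only: the class `ServiceDataSet` and its method names are hypothetical and do not reproduce the OGSI portType operations or their WSDL signatures.

```python
from typing import Any, Callable, Dict, List


class ServiceDataSet:
    """Toy model of named service data elements on a stateful service."""

    def __init__(self) -> None:
        self._elements: Dict[str, Any] = {}
        self._subscribers: Dict[str, List[Callable[[str, Any], None]]] = {}

    def declare(self, name: str, initial: Any = None) -> None:
        """Declare an element as part of the publicly visible state."""
        self._elements[name] = initial

    def get(self, name: str) -> Any:
        """Query an element by name; undeclared names raise KeyError."""
        return self._elements[name]

    def set(self, name: str, value: Any) -> None:
        """Update a declared element and notify its subscribers."""
        if name not in self._elements:
            raise KeyError(name)
        self._elements[name] = value
        for callback in self._subscribers.get(name, []):
            callback(name, value)

    def subscribe(self, name: str, callback: Callable[[str, Any], None]) -> None:
        """Register a callback for changes to a declared element."""
        if name not in self._elements:
            raise KeyError(name)
        self._subscribers.setdefault(name, []).append(callback)


# Usage: one generic get/set/subscribe trio serves every declared element.
state = ServiceDataSet()
state.declare("jobState", "pending")
seen = []
state.subscribe("jobState", lambda name, value: seen.append((name, value)))
state.set("jobState", "running")
print(state.get("jobState"), seen)
```

Only declared elements can be read, written, or subscribed to, which mirrors the rule that state is reachable solely through what the service interface declares.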

7. What are the most important OGSA basic services?

The most important basic services, which are derived from the use cases, are as follows:

Common Management Model (CMM)

Service domains

Policy

Security


Distributed data access and replication

Monitoring

Scheduling

Accounting/metering

Common distributed logging

Provisioning and resource management

The OGSA introduces the term platform services to denote services that provide functionalities that are basic. Platform services provide functionalities (i) on which other services build, (ii) that are common to several high-level services, and (iii) that are designed to be used primarily through the “extends” relationship. The functionality provided by a given platform service is, by definition, present in several high-level services. As a consequence, platform service functionalities permeate the high-level services, being pervasive within OGSA.

The above OGSA layers form the foundation for new high level management applications and middleware grid solutions.


8. What are the security challenges in a Grid environment? Explain briefly.

A fundamental construct underlying many of the required attributes of the Grid services architecture is that of service virtualization. It is virtualization of Grid services that underpins the ability to map common service semantic behavior seamlessly onto native platform facilities. Current OGSA design work focuses on the adaptation of the Web Services Description Language (WSDL) for this purpose, although other interface definition languages (IDLs) could also be used. Controlling access to services through robust security protocols and security policy is paramount to controlling access to VO resources and assets. Thus, authentication mechanisms are required so that the identity of individuals and services can be established. Service providers must implement authorization mechanisms to enforce policy over how each service can be used. The requirement for composition complicates issues of policy enforcement, as one must be able to apply and enforce policy at all levels of composition and to translate policies between levels of composition.

To address these challenges, an evolutionary approach to creating secure, integrated, and interoperable Grid services, based on a set of security abstractions that unify formerly dissimilar technologies, is proposed.

The security challenges faced in a Grid environment can be grouped into three categories: integration with existing systems and technologies, interoperability with different “hosting environments” (e.g., J2EE servers, .NET servers, Linux systems), and trust relationships among interacting hosting environments. Relationships among these three categories of challenges are shown in the figure. Now, we discuss the security challenges encountered in Grid environments.

The integration challenge

For both technical and pragmatic reasons, it is unreasonable to expect that a single security technology can be defined that will both address all Grid security challenges and be adopted in every hosting environment. Existing security infrastructures cannot be replaced overnight. For example, each domain in a Grid environment is likely to have one or more registries in which user accounts are maintained (e.g., LDAP directories). Such registries are unlikely to be shared with other organizations or domains. Similarly, authentication mechanisms deployed in an existing environment that is reputed secure and reliable will continue to be used. Each domain typically has its own authorization infrastructure that is deployed, managed, and supported. It will not typically be acceptable to replace any of these technologies in favor of a single model or mechanism. Thus, to be successful, a Grid security architecture needs to step up to the challenge of integrating with existing security architectures and models across platforms and hosting environments. This, in turn, requires that OGSA security be implementation agnostic, so that it can be instantiated in terms of any existing security mechanisms (e.g., Kerberos, PKI); extensible, so that it can incorporate new security services as they become available; and integratable with existing security services.

The interoperability challenge

Services that traverse multiple domains and hosting environments need to be able to interact with each other, thus introducing the need for interoperability at multiple levels. Let us examine these levels in the following points:

• At the protocol level, we require mechanisms that allow domains to exchange messages. This can be achieved via SOAP/HTTP, for example.

• At the policy level, secure interoperability requires that each party be able to specify any policy it may wish in order to engage in a secure conversation and that policies expressed by different parties can be made mutually comprehensible. Only then can the parties attempt to establish a secure communication channel and security context upon mutual authentication, trust relationship, and adherence to each other’s policy.

• At the identity level, we require mechanisms for identifying a user from one domain, in another domain. For any cross-domain invocation to succeed in a secure environment, mapping of identities and credentials must be made possible. This can be enforced at either end of a session through proxy servers or through trusted intermediaries acting as trust proxies.

The trust relationship challenge


Grid service requests can span multiple security domains. Trust relationships among these domains play an important role in the outcome of such end-to-end traversals. A service needs to make its access requirements available to interested entities, so that they can request secure access to it. Trust between end points can be presumed, based on topological assumptions (e.g., VPN), or explicit, specified as policies and enforced through exchange of some trust-forming credentials. In a Grid environment, presumed trust is rarely feasible due to the dynamic nature of VO relationships. The dynamic nature of the Grid in some cases can make it impossible to establish trust relationships among sites prior to application execution. Given that the participating domains may have different security technologies in their infrastructure, it then becomes necessary to realize the required trust relationships through some form of federation among the security mechanisms.

The trust relationship problem is made more difficult in a Grid environment by the need to support the dynamic, user-controlled deployment and management of transient services. End users create such transient services to perform request-specific tasks, which may involve the execution of user code. Controlled access to VO resources and services is clearly a critical aspect of a secure Grid environment.

Given the dynamic nature of Grids and the scale of the environment, serious challenges exist and need to be addressed in the area of security exposure detection, analysis, and recovery. In summary, security challenges in a Grid environment can be addressed by categorizing the solution areas:

a) Integration solutions, where existing services need to be used and interfaces should be abstracted to provide an extensible architecture;

b) Interoperability solutions, so that services hosted in different virtual organizations that have different security mechanisms and policies will be able to invoke each other; and

c) Solutions to define, manage, and enforce trust policies within a dynamic Grid environment.

A solution within a given category will often depend on a solution in another category. The dependency between these three categories of security items is illustrated in the figure. For example, any solution for federating credentials to achieve interoperability will be dependent on the trust models defined within the participating domains and the level of integration of the services within a domain.

In a Grid environment, where identities are organized in VOs that transcend normal organizational boundaries, security threats are not easily divided by such boundaries. Identities may act as members of the same VO at one moment and as members of different VOs the next, depending on the tasks they perform at a given time. Thus, while the security threats to OGSA fall into the usual categories (snooping, man-in-the-middle, intrusion, denial of service, theft of service, viruses and Trojan horses, etc.), the malicious entity could be anyone.

The size of some Grid environments introduces the need to deal with large-scale distributed systems. The number, size, and scalability of security components such as user registries, policy repositories, and authorization servers pose new challenges. Many cross-domain functions that may be statically pre-defined in other environments will require dynamic configuration and processing in a Grid environment.


9. What is Globus Toolkit?

The Globus Alliance is a community of organizations and individuals developing fundamental technologies behind the "Grid," which lets people share computing power, databases, instruments, and other on-line tools securely across corporate, institutional, and geographic boundaries without sacrificing local autonomy.

The Globus Toolkit is an open source software toolkit used for building Grid systems and applications. It provides a set of application programming interfaces (APIs) and software development kits (SDKs). It is being developed by the Globus Alliance and many others all over the world. A growing number of projects and companies are using the Globus Toolkit to unlock the potential of grids for their cause.

10. List and explain briefly the major components of OGSI.NET.

Globus provides components to implement resource management, data management, and information services, as illustrated in the figure.

The components are:

GRAM/GASS: The primary components of the resource management pyramid are the Grid Resource Allocation Manager (GRAM) and the Global Access to Secondary Storage (GASS).

MDS (GRIS/GIIS): Based on the Lightweight Directory Access Protocol (LDAP), the Grid Resource Information Service (GRIS) and Grid Index Information Service (GIIS) components can be configured in a hierarchy to collect information and distribute it. Together, these two services are called the Monitoring and Discovery Service (MDS). The LDAP query language is used to retrieve the desired information.

GridFTP: GridFTP is a standard extension to the normal File Transfer Protocol (FTP). GridFTP is a key component for secure, high-performance data transfer, and the protocol is optimized for high bandwidth across wide area networks.

GSI: This provides security functions including single/mutual authentication, confidential communication, authorization, and delegation.

Grid Security Infrastructure (GSI)

GSI provides elements for secure authentication and communication in a grid. The infrastructure is based on the SSL (Secure Sockets Layer) protocol, public key encryption, and X.509 certificates. For single sign-on, Globus adds some extensions to GSI. GSI is based on the Generic Security Service API. The main functions implemented by GSI are:

Single/mutual authentication

Confidential communication

Authorization

Delegation

Grid Resource Allocation Manager (GRAM)

GRAM is the module that provides remote execution and status management of the execution. When a job is submitted by a client, the request is sent to the remote host and handled by the gatekeeper daemon located on the remote host. The gatekeeper then creates a job manager to start and monitor the job. When the job is finished, the job manager sends the status information back to the client and terminates.

The figure depicts a conceptual view of GRAM. It contains the following elements:

The globusrun command

Resource Specification Language (RSL)

The gatekeeper daemon

The job manager


The forked process

Global Access to Secondary Storage (GASS)

Dynamically-Updated Request Online Coallocator (DUROC)

The globusrun command:

The globusrun command submits and manages remote jobs and is used by almost all GRAM client tools. This command provides the following functions:

Requesting job submission to remote machines

Transferring the executable files and the resulting job-submission output files

Resource Specification Language (RSL)

RSL is the language used by clients to submit a job. All job submission requests are described in RSL, including the executable file and the conditions under which it must be executed.
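To give a feel for what such a job description looks like, the sketch below renders a Python dictionary as a conjunction of `(attribute=value)` relations, which is the general shape of an RSL request. The helper names `to_rsl` and `quote` are hypothetical, the quoting rule is a simplifying assumption, and the attribute set shown is only a small illustrative example, not a complete or authoritative RSL generator.

```python
def quote(value):
    """Quote a value if it contains whitespace or RSL-significant characters.

    Assumption: embedded double quotes are escaped by doubling them.
    """
    text = str(value)
    if text == "" or any(ch.isspace() or ch in '()=&|"' for ch in text):
        return '"' + text.replace('"', '""') + '"'
    return text


def to_rsl(attributes):
    """Render a dict as '&(name=value)(name=value)...'.

    List or tuple values become a space-separated sequence of values.
    """
    parts = []
    for name, value in attributes.items():
        if isinstance(value, (list, tuple)):
            rendered = " ".join(quote(item) for item in value)
        else:
            rendered = quote(value)
        parts.append(f"({name}={rendered})")
    return "&" + "".join(parts)


# Illustrative job: which executable to run, with what arguments, how many times.
job = {
    "executable": "/bin/echo",
    "arguments": ["hello", "grid world"],
    "count": 1,
}
print(to_rsl(job))
# prints: &(executable=/bin/echo)(arguments=hello "grid world")(count=1)
```

A string of this form is what a client tool would hand to GRAM; the job manager on the remote host is then responsible for parsing it.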

Gatekeeper

The gatekeeper daemon builds the secure communication between clients and servers. It communicates with the GRAM client (globusrun) and authenticates the right to submit jobs. After authentication, the gatekeeper forks and creates a job manager, delegating to it the authority to communicate with clients.

Job manager

The job manager is created by the gatekeeper daemon as part of the job requesting process. It provides the interface that controls allocation through each local resource manager, such as a job scheduler (for example, LoadLeveler). The job manager functions are:

Parse the resource language, that is, break down the RSL scripts


Allocate job requests to the local resource managers.

Send callbacks to clients, if necessary

Receive the status and cancel requests from clients

Send output results to clients using GASS, if requested
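The division of labor between the gatekeeper and the job manager can be sketched as follows. This is a single-process toy model under stated assumptions: the classes `Gatekeeper` and `JobManager`, the `fake_scheduler` function, the user name, and the status strings are illustrative only, and nothing here uses the real Globus APIs, GSI authentication, or process forking.

```python
class JobManager:
    """Created per job; hands the job to a local resource manager and reports status."""

    def __init__(self, local_resource_manager, callback):
        self._run_locally = local_resource_manager
        self._callback = callback

    def run(self, job):
        self._callback("ACTIVE")
        try:
            output = self._run_locally(job)
        except Exception:
            self._callback("FAILED")
            raise
        self._callback("DONE")
        return output


class Gatekeeper:
    """Checks the right to submit jobs, then creates a job manager for the request."""

    def __init__(self, authorized_users, local_resource_manager):
        self._authorized = set(authorized_users)
        self._local_resource_manager = local_resource_manager

    def submit(self, user, job, callback):
        if user not in self._authorized:
            raise PermissionError(f"{user} may not submit jobs")
        manager = JobManager(self._local_resource_manager, callback)
        return manager.run(job)


def fake_scheduler(job):
    """Stand-in for a local scheduler; it only formats what would have run."""
    return f"ran {job['executable']} {' '.join(job['arguments'])}"


states = []
gatekeeper = Gatekeeper({"alice"}, fake_scheduler)
output = gatekeeper.submit(
    "alice", {"executable": "/bin/echo", "arguments": ["hello"]}, states.append
)
print(states, output)
```

The callback plays the role of the status messages sent back to the client, and the return value stands in for output that the real system would transfer using GASS.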

Global Access to Secondary Storage (GASS)


GRAM uses GASS for providing the mechanism to transfer the output file from servers to clients. Some APIs are provided under the GSI protocol to furnish secure transfers. This mechanism is used by the globusrun command, gatekeeper, and job manager.

Dynamically-Updated Request Online Coallocator (DUROC)

By using the DUROC mechanism, users are able to submit jobs to different job managers at different hosts or to different job managers at the same host.

Monitoring and Discovery Service (MDS)

MDS provides access to static and dynamic information of resources.

Basically, it contains the following components:

Grid Resource Information Service (GRIS)

Grid Index Information Service (GIIS)


Information Provider

MDS client

The figure represents a conceptual view of how the MDS components are interconnected. The resource information is obtained by the information provider and passed to GRIS. GRIS registers its local information with the GIIS, which also registers with another GIIS, and so on. MDS clients can get the resource information directly from GRIS (for local resources) and/or from a GIIS (for grid-wide resources).

The MDS uses LDAP, which provides the decentralized maintenance of resource information.


Resource information

Resource information contains the objects managed by MDS, which represent component resources as follows:

Infrastructure components. For example, the name of the job manager or the name of the running job.

Computer resources. For example, network interface, IP address, or memory size.

Grid Resource Information Service (GRIS)

GRIS is the repository of local resource information derived from information providers. GRIS is able to register its information with a GIIS, but GRIS itself does not receive registration requests. The local information maintained by GRIS is updated when requested, and cached for a period of time known as the time-to-live (TTL). If no request for the information is received by GRIS, the information will time out and be deleted. If a later request for the information is received, GRIS will call the relevant information provider(s) to retrieve the latest information.
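The time-to-live behavior described above can be sketched as a small cache in front of the information providers. This is an assumption-laden illustration: the `GrisCache` class, the provider function, and the TTL value are hypothetical, and expiry is done lazily on the next query instead of by actively deleting entries as a real GRIS might.

```python
import time


class GrisCache:
    """Caches provider results for ttl_seconds; stale entries trigger a refresh."""

    def __init__(self, providers, ttl_seconds, clock=time.monotonic):
        self._providers = providers      # name -> zero-argument callable
        self._ttl = ttl_seconds
        self._clock = clock              # injectable so the example can fake time
        self._cache = {}                 # name -> (value, expires_at)

    def query(self, name):
        now = self._clock()
        entry = self._cache.get(name)
        if entry is not None and now < entry[1]:
            return entry[0]              # still within its time-to-live
        value = self._providers[name]()  # call the information provider again
        self._cache[name] = (value, now + self._ttl)
        return value


# Usage with a fake clock so the timing is deterministic.
now = [0.0]
calls = []


def memory_provider():
    calls.append(now[0])
    return {"free_mb": 512}


gris = GrisCache({"memory": memory_provider}, ttl_seconds=30, clock=lambda: now[0])
gris.query("memory")   # first request: the provider is called
now[0] = 10.0
gris.query("memory")   # within the TTL: served from the cache
now[0] = 45.0
gris.query("memory")   # TTL expired: the provider is called again
print(len(calls))      # prints: 2
```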

Grid Index Information Service (GIIS)

GIIS is the repository that contains indexes of resource information registered by the GRIS and other GIISs. GIIS has a hierarchical mechanism, like DNS, and each GIIS has its own name.

Information providers

The information providers translate the properties and status of local resources to the format defined in the schema and configuration files.

MDS client

The MDS client is based on the LDAP client command, ldapsearch. A search for the resource information that you want in your grid environment is initially performed by the MDS client.

Hierarchical MDS

The MDS hierarchy mechanism is similar to the one used in DNS. GRIS and GIIS, at lower layers of the hierarchy, register with the GIIS at upper layers. Clients can query the GIIS for any information about resources that build a grid environment.
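The registration hierarchy can be pictured as a tree in which a query to an upper-level index fans out to everything registered beneath it. The sketch below models that in memory only; the classes `Gris` and `Giis`, the host names, and the attribute `memory_mb` are hypothetical, and no LDAP is involved.

```python
class Gris:
    """Leaf of the hierarchy: holds the resource attributes of one host."""

    def __init__(self, host, attributes):
        self.host = host
        self.attributes = dict(attributes)

    def search(self, predicate):
        return [
            (self.host, key, value)
            for key, value in self.attributes.items()
            if predicate(key, value)
        ]


class Giis:
    """Index node: accepts registrations from GRIS or lower-level GIIS nodes."""

    def __init__(self, name):
        self.name = name
        self._registrants = []

    def register(self, service):
        self._registrants.append(service)

    def search(self, predicate):
        results = []
        for service in self._registrants:
            results.extend(service.search(predicate))
        return results


# A site-level GIIS registers with a top-level GIIS, as lower layers do in MDS.
site_a = Giis("site-a")
site_a.register(Gris("node1", {"memory_mb": 2048}))
site_a.register(Gris("node2", {"memory_mb": 512}))

top = Giis("vo-top")
top.register(site_a)
top.register(Gris("node3", {"memory_mb": 4096}))

big = top.search(lambda key, value: key == "memory_mb" and value >= 1024)
print(big)
```

Querying `top` reaches hosts at every level, while querying `site_a` would return only the hosts registered at that site.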

Grid File Transfer Protocol (GridFTP)

GridFTP provides secure and reliable data transfer among grid nodes. The word GridFTP can refer to a protocol, a server, or a set of tools.

GridFTP protocol

GridFTP is a protocol intended to be used in all data transfers on the grid. It is based on FTP, but extends the standard protocol with facilities such as multistreamed transfer, auto-tuning, and Globus-based security. As the GridFTP protocol is still not completely defined, Globus Toolkit does not support the entire set of the protocol features currently presented.

GridFTP server and client

Globus Toolkit provides the GridFTP server and GridFTP client, which are implemented by the in.ftpd daemon and by the globus-url-copy command, respectively. They support most of the features defined in the GridFTP protocol. The GridFTP server and client support two types of file transfer: standard and third-party. The standard file transfer is where a client sends a local file to the remote machine, which runs the FTP server. An overview is shown in the figure.

Third-party file transfer is where there is a large file in remote storage and the client wants to copy it to another remote server, as illustrated in the figure.
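The difference between the two transfer types comes down to where the two endpoints live. The sketch below classifies a transfer from its source and destination URLs; the function name `transfer_kind`, the set of schemes treated as remote, and the host names are assumptions made for illustration, and the code does not invoke globus-url-copy or move any data.

```python
from urllib.parse import urlparse

# Assumption: these URL schemes denote an endpoint served by a remote (Grid)FTP server.
REMOTE_SCHEMES = {"gsiftp", "ftp"}


def transfer_kind(source_url, dest_url):
    """Classify a transfer as 'third-party', 'standard', or 'local'."""
    source_remote = urlparse(source_url).scheme in REMOTE_SCHEMES
    dest_remote = urlparse(dest_url).scheme in REMOTE_SCHEMES
    if source_remote and dest_remote:
        return "third-party"   # the data moves directly between two servers
    if source_remote or dest_remote:
        return "standard"      # one end is the client's own machine
    return "local"


# Hypothetical hosts and paths.
print(transfer_kind("file:///tmp/data.dat", "gsiftp://hostb.example.org/data/data.dat"))
print(transfer_kind("gsiftp://hosta.example.org/big.dat", "gsiftp://hostb.example.org/big.dat"))
```

In the third-party case the client only coordinates the copy, so the large file never has to pass through the client's machine.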

GridFTP tools

Globus Toolkit provides a set of tools to support GridFTP type of data transfers. The gsi-ncftp package is one of the tools used to communicate with the GridFTP Server.

The GASS API package is also part of the GridFTP tools. It is used by the GRAM to transfer the output file from servers to clients.

API and software developer's kit

Two other components are available to help develop Globus-related grid applications:

APIs

Developer's toolkit

API: Globus Toolkit APIs are basically implemented in the C language.
