The Lattice Project - A Grid Computing System

The Lattice Project - A Grid Computing System Michael P. Cummings Laboratory of Molecular Evolution Center for Bioinformatics and Computational Biology

Post on 02-Feb-2016

Page 1: The Lattice Project -  A Grid Computing System

The Lattice Project - A Grid Computing System

Michael P. Cummings

Laboratory of Molecular Evolution
Center for Bioinformatics and Computational Biology

Page 2: The Lattice Project -  A Grid Computing System

Acknowledgments

Core Middleware Development
  Adam Bazinet
  Daniel Myers (now MIT)
  John Fuetsch (now Dreamworks Animation)
  Stephen McLellan, Chris Milliron, Deji Akinyemi

Semantic Web-based Grid Services
  Sung Lee, Fujitsu Laboratories of America
  Nada Hashmi (now CBA, Saudi Arabia)
  David Wang, UMIACS

Page 3: The Lattice Project -  A Grid Computing System

Outline

Grid computing introduction and motivation
Goals of The Lattice Project
Basic architecture
Our current production Grid system
Implementation details
Results of usage
Research and development

Page 4: The Lattice Project -  A Grid Computing System

Grid Computing: A Definition

A model of distributed computing that uses geographically and administratively disparate resources. In Grid computing, individual users can access computers and data transparently, without having to consider location, operating system, account administration, and other details. In Grid computing, the details are abstracted, and the resources are virtualized.

Page 5: The Lattice Project -  A Grid Computing System

Why Go Grid?

Scientific problems are solved faster
  Parallel execution means higher throughput

Make compute resources a commodity
  Analogous to the electrical power grid

Foster growth and interaction in the research community
  Use of the Grid spans departments and domains
  Grid resources are typically shared resources

Page 6: The Lattice Project -  A Grid Computing System

Grid Computing: Advantages

Provides increased resources for research

Utilizes resources already purchased

Space and HVAC needs already met

Little increased administrative burden

Economically and environmentally appealing

Page 7: The Lattice Project -  A Grid Computing System

Research Projects Using the Grid

The Laboratory of David Fushman has run protein-protein docking algorithms on Lattice
  CNS is the primary Grid service in this project

Floyd Reed and Holly Mortensen from the Laboratory of Sarah Tishkoff have run a number of population genetics analyses
  MDIV and IM are the primary Grid services

The Laboratory of Molecular Evolution has run statistical phylogenetic analyses
  GSI is the primary Grid service

Page 8: The Lattice Project -  A Grid Computing System

Recent Grid Usage

IM – 0.13 CPU years (BOINC)
MDIV – 4.93 CPU years (BOINC)
CNS – 12.4 CPU years (BOINC)
GSI – 94.05 CPU years (Condor)

Total: 111.51 CPU years

BOINC participants in 21 countries
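The total on this slide can be checked by summing the per-service figures. The sketch below does that and also splits the total by resource type. The figures and the BOINC/Condor labels come from the slide; the dictionary layout and function name are illustrative assumptions.

```python
# CPU years per Grid service, as listed on the slide, tagged by resource type.
usage = {
    "IM": (0.13, "BOINC"),
    "MDIV": (4.93, "BOINC"),
    "CNS": (12.4, "BOINC"),
    "GSI": (94.05, "Condor"),
}

def cpu_years_by_resource(records):
    """Sum CPU years per resource type (BOINC, Condor)."""
    totals = {}
    for cpu_years, resource in records.values():
        totals[resource] = totals.get(resource, 0.0) + cpu_years
    return totals

by_resource = cpu_years_by_resource(usage)
total = sum(by_resource.values())

print(f"BOINC: {by_resource['BOINC']:.2f} CPU years")
print(f"Condor: {by_resource['Condor']:.2f} CPU years")
print(f"Total: {total:.2f} CPU years")
```

The sum reproduces the 111.51 CPU years stated on the slide, of which 17.46 ran on BOINC and 94.05 on Condor.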

Page 9: The Lattice Project -  A Grid Computing System

Outline

Grid computing introduction and motivation
Goals of The Lattice Project
Basic architecture
Our current production Grid system
Implementation details
Results of usage
Research and development

Page 10: The Lattice Project -  A Grid Computing System

The Lattice Project: Initial Goals

Develop a Grid system for scientific research that:
  Speeds up workflows by “Grid-enabling” various programs
  Is simple and intuitive
  Takes advantage of heterogeneous resources
  Is capable of managing large numbers of jobs (thousands)
  Supports multiple users and lowers the barriers to getting involved
  Is community-driven and supported

Page 11: The Lattice Project -  A Grid Computing System

Principles of Design

Make use of well supported open source software
  Globus Toolkit
  BOINC
  Condor

Engineered software should be scalable, modular, and robust

Expose programs as well-defined services
  Arbitrary user-supplied code cannot be run

Page 12: The Lattice Project -  A Grid Computing System

Outline

Grid computing introduction and motivation
Goals of The Lattice Project
Basic architecture
Our current production Grid system
Implementation details
Usage statistics
Research and development

Page 13: The Lattice Project -  A Grid Computing System

Terminology

Client: A Grid user interface OR a machine that performs computation

Grid Service: A Grid-enabled program

Scheduler: Decides where Grid jobs will run

Resource: Executes Grid jobs

Page 14: The Lattice Project -  A Grid Computing System

Basic Architecture (1 of 3)

Page 15: The Lattice Project -  A Grid Computing System

Basic Architecture (2 of 3)

Page 16: The Lattice Project -  A Grid Computing System

Basic Architecture (3 of 3)

Page 17: The Lattice Project -  A Grid Computing System

Outline

Grid computing introduction and motivation
Goals of The Lattice Project
Basic architecture
Our current production Grid system
Implementation details
Results of usage
Research and development

Page 18: The Lattice Project -  A Grid Computing System

Software Components

Globus Toolkit version 3.2.1
  Backbone of the Grid
  http://www.globus.org/

Condor-G
  Grid-level scheduler / resource broker
  http://www.cs.wisc.edu/condor/

BOINC: Berkeley Open Infrastructure for Network Computing
  SETI@home-style desktop grid
  http://boinc.berkeley.edu/

Custom components
  GSBL, GSG, Globus-BOINC adaptor, MDS-matchmaking bridge, user interface(s), administrative scripts, and much more

Page 19: The Lattice Project -  A Grid Computing System

Globus Toolkit 3

Key components:
  Globus Core
    Grid service hosting environment
  GSI – Grid Security Infrastructure
    Uses public key cryptography
    Secures communication
    Authenticates and authorizes Grid users
  WS GRAM – Job management
  GASS – Point to point file transfer
  MDS2 – Information provider

Page 20: The Lattice Project -  A Grid Computing System

Condor-G

Condor-G is part of the Condor suite

Resources and jobs send Condor-G descriptions of themselves called ClassAds

Condor-G matches Grid jobs to suitable resources, then submits and manages them

This process is called matchmaking

Page 21: The Lattice Project -  A Grid Computing System

BOINC

Most novel feature of our Grid

Public computing model

Untrusted resources

Potentially our largest resource

We have targeted 3 platforms: Windows / Linux x86 / Mac OS X

Page 22: The Lattice Project -  A Grid Computing System

Our Current Grid System

Page 23: The Lattice Project -  A Grid Computing System

User Interface

The “Grid Brick”: a machine used to submit Grid jobs
  Our primary interface for Grid users
  Command line clients mimic normal program execution

Lattice Intranet
  Provides instructions for submitting jobs and managing data input and output
  Provides tools for describing and monitoring jobs

Other possibilities:
  Web portal model of job submission
  A client capable of composing complex workflows using Task Computing and Semantic Web technology developed by collaborators at Fujitsu

Page 24: The Lattice Project -  A Grid Computing System

Basic Architecture – Client/Service

Page 25: The Lattice Project -  A Grid Computing System

Grid Client Stack

lattice_submit / lattice_retrieve

Service-specific* submit / retrieve scripts

Client.pm – base Perl module

Service-specific* submit / retrieve classes

GSBL – Grid Service Base Library

Globus API

Command-line Interface Perl Java

* Service-specific templates and stubs are created by the Grid Service Generator

Page 26: The Lattice Project -  A Grid Computing System

Grid Service Stack

Service-specific* Implementation

GSBL – Grid Service Base Library

Globus API

Grid Service Hosting Environment, a.k.a. “the container” Java

* Service-specific templates and stubs are created by the Grid Service Generator

Page 27: The Lattice Project -  A Grid Computing System

Tools for Writing Grid Services

Grid Service Base Library (GSBL)
  Java API for building Grid services with the Globus Toolkit
  Shields programmers from having to work with the Globus API directly
  Provides a high-level interface for operations such as job submission and file transfer

Grid Service Generator (GSG)
  Simplifies the process of creating Grid Services
  Intended for use with GSBL

Page 28: The Lattice Project -  A Grid Computing System

Grid Service Generator

Deploying a Grid service with GT3 is absurdly complicated
  Many files, namespaces: lots of potential typos

GSG takes as input a few parameters (service name, location, an XML argument description, etc.) and generates all requisite configuration files and skeleton Java classes
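The general idea of generating boilerplate from a few parameters can be sketched with simple text templates. This is only an illustration of the technique, not the GSG itself: the templates, file names, package name, and XML element below are hypothetical and do not reflect the real GT3 deployment files or GSG output.

```python
from string import Template

# Hypothetical skeleton template; the real GSG templates are not shown in the slides.
JAVA_SKELETON = Template('''\
package $package;

/** Skeleton for the $service_name Grid service (generated). */
public class ${service_name}Service {
    // TODO: implement service logic on top of GSBL
}
''')

# Hypothetical configuration stub, not the actual GT3 deployment descriptor format.
CONFIG_SKELETON = Template('''\
<service name="$service_name" location="$location"/>
''')

def generate_service(service_name, location, package):
    """Return a mapping of file name to generated file content."""
    params = {"service_name": service_name, "location": location, "package": package}
    return {
        f"{service_name}Service.java": JAVA_SKELETON.substitute(params),
        f"{service_name}-config.xml": CONFIG_SKELETON.substitute(params),
    }

files = generate_service("MrBayes", "lattice/services/MrBayes", "org.example.lattice")
for name in sorted(files):
    print(name)
```

Because every generated file is filled in from the same parameter set, a service name or namespace is typed once rather than repeated by hand across many files, which is the class of typo the slide describes.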

Page 29: The Lattice Project -  A Grid Computing System

Grid Services

Application   Condor (Linux/UNIX)   BOINC: Linux x86   BOINC: Win32   BOINC: Mac OS X
BLAST [1]     Yes                   No                 No             No
Clustal W     Yes                   Yes                Yes            Yes
CNS           Yes                   Yes                Yes            No
Lamarc        Yes                   Yes                Yes            Yes
MDIV          Yes                   Yes                Yes            Yes
Migrate-N     Yes                   Yes                Yes            Yes
Modeltest     Yes                   Yes                Yes            Yes
MrBayes       Yes                   Yes                Yes            Yes
ms            Yes                   Yes                Yes            Yes
Muscle        Yes                   Yes                Yes            Yes
PAUP* [2]     Yes                   No                 No             No
Phyml         Yes                   Yes                Yes            Yes
Pknots        Yes                   Yes                Yes            Yes
Seq-gen       Yes                   Yes                Yes            Yes
Snn           Yes                   Yes                Yes            Yes
ssearch       Yes                   Yes                Yes            Yes
Structure [3] Yes                   No                 No             No

Page 30: The Lattice Project -  A Grid Computing System

Grid Services

Creating Grid Services requires:
  Knowledge of the application
  Techniques for compiling and porting the application to various platforms
  Knowledge of the infrastructure so it can be effectively tested and deployed

Challenges:
  Maintaining bodies of Grid Service code as the number of applications grows and new versions of applications are released
  Minimizing the number of updates that need to be applied when the framework changes

Page 31: The Lattice Project -  A Grid Computing System

Basic Architecture - Scheduling

Page 32: The Lattice Project -  A Grid Computing System

Condor-G: ClassAds

Resources and jobs send Condor-G descriptions of themselves called ClassAds
  Jobs require certain capabilities of resources
  Resources advertise their capabilities

Similar to a dating service: a central broker points pairs of compatible jobs/resources at each other
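The broker idea can be illustrated with a minimal sketch in which each job names one required service and each resource advertises the services it offers. Real ClassAds are richer than this (both sides can carry requirement and ranking expressions), so the dictionaries, field names, and first-fit policy below are simplifying assumptions, not Condor-G's actual algorithm.

```python
# Resource ads list the Grid services each resource advertises (hypothetical data).
resource_ads = [
    {"name": "resource_a", "services": {"SSEARCH"}, "free_slots": 4},
    {"name": "resource_b", "services": {"PAUP*"}, "free_slots": 2},
    {"name": "resource_c", "services": {"MrBayes", "SSEARCH"}, "free_slots": 1},
]

# Job ads state the one service each job needs (hypothetical data).
job_ads = [
    {"id": 1, "requires": "MrBayes"},
    {"id": 2, "requires": "SSEARCH"},
    {"id": 3, "requires": "BLAST"},
]

def matchmake(jobs, resources):
    """Pair each job with the first resource that has the service and a free slot."""
    free = {r["name"]: r["free_slots"] for r in resources}
    matches = {}
    for job in jobs:
        for res in resources:
            if job["requires"] in res["services"] and free[res["name"]] > 0:
                free[res["name"]] -= 1
                matches[job["id"]] = res["name"]
                break
        else:
            matches[job["id"]] = None  # no compatible resource; job stays queued
    return matches

matches = matchmake(job_ads, resource_ads)
print(matches)
```

Here job 1 is matched to `resource_c`, job 2 to `resource_a`, and job 3 stays unmatched because no resource advertises BLAST.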

Page 33: The Lattice Project -  A Grid Computing System

Condor-G: ClassAds

Condor Collector

Resource A, Resource B, Resource C

I have MrBayes!

I have SSEARCH!

I have PAUP*!

Condor user

I need MrBayes!

Resource C, Condor user

I hear you have MrBayes?

Well, let's talk about that...

Page 34: The Lattice Project -  A Grid Computing System

Generating ClassAds

Job ClassAds are generated by the Condor-G job manager
  Job requirements are specified in the Grid service configuration files

Resource ClassAds are generated by extracting information from MDS
  Lattice information providers supply data required for matchmaking
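As a rough illustration of the second step, the sketch below turns a record of resource attributes into `Name = value` lines in the style of a ClassAd. The record contents and attribute names are hypothetical stand-ins for whatever the Lattice information providers actually publish, and the actual MDS-matchmaking bridge is not shown in the slides.

```python
def format_value(value):
    """Render a Python value as a ClassAd-style literal."""
    if isinstance(value, bool):  # check bool first: bool is a subclass of int
        return "TRUE" if value else "FALSE"
    if isinstance(value, (int, float)):
        return str(value)
    return '"' + str(value).replace('"', '\\"') + '"'

def to_classad(attributes):
    """Turn a dict of attributes into 'Name = value' lines."""
    return "\n".join(f"{name} = {format_value(v)}" for name, v in attributes.items())

# Hypothetical record, standing in for what an information provider might publish.
mds_record = {
    "Name": "resource_c",
    "OpSys": "LINUX",
    "FreeCpus": 12,
    "HasMrBayes": True,
}

print(to_classad(mds_record))
```

Strings are quoted, numbers are written bare, and booleans become `TRUE` or `FALSE`, so a job's requirements can then be tested against the advertised attributes.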

Page 35: The Lattice Project -  A Grid Computing System

Monitoring and Discovery System (MDS2)

Globus information services component
  LDAP based

Answers questions like:
  What resources are available?
  What capabilities do these resources have?
  What is the load on these resources?

This in turn allows for intelligent decisions to be made in areas such as scheduling and resource accounting

Page 36: The Lattice Project -  A Grid Computing System

Basic Architecture - Resources

Page 37: The Lattice Project -  A Grid Computing System

Current Grid Resources

http://lattice.umiacs.umd.edu/resources/

UMIACS Condor pool
  ~ 400 processors

BOINC pools
  Clients on campus > 100
  Public (off-campus) clients > 1000

Page 38: The Lattice Project -  A Grid Computing System

BOINC

Works on the “pull” model, that is:
  One or more servers create workunits
  Clients connect asynchronously, pull down work, and return the results

Clients are relatively lightweight and easy to install and manage

One client can process work for multiple projects

Participants can join teams and are given credit for the work they complete

http://lattice.umiacs.umd.edu/boinc_public
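The pull model can be sketched in a few lines: the server only queues workunits and waits, and every exchange is started by a client. The class and method names below are hypothetical and the example runs in a single process; it does not model BOINC's actual scheduler protocol.

```python
from collections import deque

class WorkServer:
    """Holds workunits and hands them out only when a client asks."""

    def __init__(self, workunits):
        self.pending = deque(workunits)
        self.results = {}

    def request_work(self):
        # Return the next workunit, or None when the queue is empty.
        return self.pending.popleft() if self.pending else None

    def report_result(self, workunit_id, result):
        self.results[workunit_id] = result

def run_client(server, compute):
    """Pull workunits until none remain, returning how many were processed."""
    done = 0
    while (unit := server.request_work()) is not None:
        unit_id, payload = unit
        server.report_result(unit_id, compute(payload))
        done += 1
    return done

server = WorkServer([(1, 10), (2, 20), (3, 30)])
processed = run_client(server, lambda x: x * x)
print(processed, server.results)
```

Because the server never contacts clients, machines can join, leave, or sit behind firewalls without any server-side bookkeeping beyond the queue of outstanding work.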

Page 39: The Lattice Project -  A Grid Computing System

Globus-BOINC Adapter

Consists of a number of components that allow us to run Grid Services on BOINC
  BOINC job manager
  Custom validator and assimilator

Registers BOINC with Globus as a GRAM-addressable resource

BOINC compatibility library eases the process of porting applications to BOINC
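Since BOINC resources are untrusted, results have to be checked before they are accepted. A common approach in BOINC projects is to send the same workunit to several clients and accept a result once enough copies agree. The slides do not say how the Lattice validator and assimilator work, so the sketch below is an assumption-laden illustration of that general idea, with hypothetical function names and data.

```python
from collections import Counter

def validate(results, quorum=2):
    """Return the canonical result if at least `quorum` results agree, else None."""
    if not results:
        return None
    value, count = Counter(results).most_common(1)[0]
    return value if count >= quorum else None

def assimilate(workunit_id, canonical, store):
    """Record the canonical result so the Grid job can be marked complete."""
    store[workunit_id] = canonical

store = {}
returned = ["-1234.5", "-1234.5", "-1200.0"]  # results from three untrusted clients
canonical = validate(returned)
if canonical is not None:
    assimilate(42, canonical, store)

print(store)
```

Two of the three returned values agree, so that value is accepted as canonical; with no agreement, `validate` returns `None` and the workunit would need further results.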

Page 40: The Lattice Project -  A Grid Computing System

Outline

Grid computing introduction and motivation
Goals of The Lattice Project
Basic architecture
Our current production Grid system
Implementation details
Results of usage
Research and development

Page 41: The Lattice Project -  A Grid Computing System

GT4 Research and Development

We are currently upgrading the Grid system to use Globus Toolkit 4.0
  GT4 adheres strictly to emerging and established Web service standards
  Actively developed and supported
  Many components have been greatly improved
    GridFTP/RFT (replace GASS)
    WS GRAM
    MDS4 (XML based; replaces MDS2, LDAP based)

Our basic architecture remains the same, and the upgrade has been made easier because of tools we have already developed (GSBL, GSG)

Page 42: The Lattice Project -  A Grid Computing System

More Information

Lattice Website http://lattice.umiacs.umd.edu/