Virtual Communities and Science in the Large Dr. Carl Kesselman ISI Fellow Director, Center for Grid Technologies Information Sciences Institute Research Professor Computer Science Viterbi School of Engineering University of Southern California

Post on 29-Jan-2016


Virtual Communities and Science in the Large

Dr. Carl Kesselman, ISI Fellow

Director, Center for Grid Technologies, Information Sciences Institute

Research Professor, Computer Science

Viterbi School of Engineering, University of Southern California

2

Acknowledgements

Ian Foster, with whom I developed many of these slides
Bill Allcock, Charlie Catlett, Kate Keahey, Jennifer Schopf, Frank Siebenlist, Mike Wilde @ ANL/UC
Ann Chervenak, Ewa Deelman, Laura Pearlman, Mike D’Arcy, Gaurang Mehta, SCEC @ USC/ISI
Karl Czajkowski, Steve Tuecke @ Univa
Numerous other fine colleagues
NSF, DOE, IBM for research support

3

Context: System-Level Science

Problems too large &/or complex to tackle alone …

4

Seismic Hazard Analysis (T. Jordan & SCEC)

InSAR Image of the Hector Mine Earthquake

A satellite-generated Interferometric Synthetic Aperture Radar (InSAR) image of the 1999 Hector Mine earthquake.

Shows the displacement field in the direction of radar imaging.

Each fringe (e.g., from red to red) corresponds to a few centimeters of displacement.

[Figure: Seismic Hazard Model, surrounded by: Seismicity; Paleoseismology; Local site effects; Geologic structure; Faults; Stress transfer; Crustal motion; Crustal deformation; Seismic velocity structure; Rupture dynamics]

5

SCEC Community Model

[Figure: SCEC community modeling pathways, numbered 1 to 5 in the diagram. Components: Earthquake Forecast Model; Attenuation Relationship; Intensity Measures; Unified Structural Representation (Faults, Motions, Stresses, Anelastic model); FSM; RDM; AWM; SRM; Ground Motions; Invert; Other Data (Geology, Geodesy)]

Pathways, in the order listed on the slide:
Standardized Seismic Hazard Analysis
Ground motion simulation
Physics-based earthquake forecasting
Ground-motion inverse problem
Structural Simulation

AWP = Anelastic Wave Propagation
SRM = Site Response Model
FSM = Fault System Model
RDM = Rupture Dynamics Model

6

Science Takes a Village …

Teams organized around common goals
  People, resource, software, data, instruments…
With diverse membership & capabilities
  Expertise in multiple areas required
And geographic and political distribution
  No location/organization possesses all required skills and resources
Must adapt as a function of the situation
  Adjust membership, reallocate responsibilities, renegotiate resources

7

Virtual Organizations

From organizational behavior/management:

"a group of people who interact through interdependent tasks guided by common purpose [that] works across space, time, and organizational boundaries with links strengthened by webs of communication technologies" (Lipnack & Stamps, 1997)

The impact of cyberinfrastructure
  People → computational agents & services
  Communication technologies → IT infrastructure, i.e. Grid

“The Anatomy of the Grid”, Foster, Kesselman, Tuecke, 2001

8

Forming & Operating (Scientific) Communities

Define VO membership and roles, & enforce laws and community standards
  I.e., policy
Build, buy, operate, & share community infrastructure
  Data, programs, services, computing, storage, instruments
Define and perform collaborative work
  Use shared infrastructure, roles, & policy
  Manage community workflow

9

Forming & Operating (Scientific) Communities

Define VO membership and roles, & enforce laws and community standards
  I.e., policy
Build, buy, operate, & share community infrastructure
  Data, programs, services, computing, storage, instruments
  Service-oriented architecture
Define and perform collaborative work
  Use shared infrastructure, roles, & policy
  Manage community workflow

10

Defining Community: Membership and Laws

Identify VO participants and roles
  For people and services
Specify and control actions of members
  Empower members → delegation
  Enforce restrictions → federate policy


11

Policy Challenges in VOs

Restrict VO operations based on characteristics of requestor
VO dynamics create challenges
  Intra-VO
    VO specific roles
    Mechanisms to specify/enforce policy at VO level
  Inter-VO
    Entities/roles in one VO not necessarily defined in another VO

[Figure labels: Access granted by community to user; Site admission-control policies; Policy of site to community; Effective Access]
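One way to read the diagram above is that a user's effective access is whatever all three policies allow at once. The following is a minimal sketch of that reading, assuming each policy can be written as a plain set of permitted operations. The operation names and the `effective_access` helper are hypothetical and do not come from the talk.

```python
def effective_access(community_to_user, site_to_community, site_admission):
    """Return the operations permitted by all three policies at once."""
    return set(community_to_user) & set(site_to_community) & set(site_admission)


# Hypothetical rights, chosen only to illustrate the intersection.
community_to_user = {"read", "write", "submit_job"}
site_to_community = {"read", "submit_job", "reserve_disk"}
site_admission = {"read", "write", "submit_job", "reserve_disk"}

rights = effective_access(community_to_user, site_to_community, site_admission)
print(sorted(rights))  # ['read', 'submit_job']
```

Note that the community cannot grant "write" here because the site never gave that right to the community in the first place.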

12

Core Security Mechanisms

Authentication and digital signature
  “Identity” of communicating party
Attribute Assertions
  C asserts that S has attribute A with value V
Delegation
  C asserts that S can perform O on behalf of C
Namespaces and Attribute mapping
  {A1, A2… An}vo1 → {A’1, A’2… A’n}vo2
Policy
  Entity with attributes A asserted by C may perform operation O on resource R
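The mechanisms above can be modeled as a few small data structures. This is a sketch under stated assumptions rather than any Globus API: the `Assertion` and `Policy` classes, the helper functions, and the issuer names are all hypothetical, and signature verification is deliberately left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Assertion:
    issuer: str     # C, the asserting authority
    subject: str    # S
    attribute: str  # A
    value: str      # V


@dataclass(frozen=True)
class Policy:
    issuer: str     # authority whose assertions this rule trusts
    attribute: str
    value: str
    operation: str  # O
    resource: str   # R


def map_assertion(assertion, mapping, mapping_issuer):
    """Re-issue a VO1 assertion in VO2's namespace, or return None if unmapped."""
    target = mapping.get((assertion.attribute, assertion.value))
    if target is None:
        return None
    return Assertion(mapping_issuer, assertion.subject, target[0], target[1])


def is_permitted(subject, operation, resource, assertions, policies):
    """True if some policy rule is satisfied by some assertion about the subject."""
    held = {(a.issuer, a.attribute, a.value) for a in assertions if a.subject == subject}
    return any(
        p.operation == operation
        and p.resource == resource
        and (p.issuer, p.attribute, p.value) in held
        for p in policies
    )


vo1 = Assertion("vo1-ata", "alice", "role", "analyst")
vo2 = map_assertion(vo1, {("role", "analyst"): ("group", "readers")}, "mapping-ata")
rules = [Policy("mapping-ata", "group", "readers", "read", "catalog")]
print(is_permitted("alice", "read", "catalog", [vo2], rules))   # True
print(is_permitted("alice", "write", "catalog", [vo2], rules))  # False
```

A real deployment would check a digital signature on every assertion before trusting its issuer field.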

13

Security Services for VO Policy

Attribute Authority (ATA)
  Issue signed attribute assertions (incl. identity, delegation & mapping)
Authorization Authority (AZA)
  Decisions based on assertions & policy
Use with message/transport level security

[Figure labels: VO A Service; VO ATA; VO AZA; Mapping ATA; VO B Service; VO User A; VO User B; Delegation Assertion (User B can use Service A); VO-A Attr; VO-B Attr; Resource Admin Attribute; VO Member Attribute]

14

Security Services in Practice

[Figure labels: VO; Users; Rights; Rights’; Compute Center; Access; Services (running on user’s behalf); Local policy on VO identity or attribute authority; CAS or VOMS issuing SAML or X.509 ACs; SSL/WS-Security with Proxy Certificates; Authz Callout: SAML, XACML; KCA; MyProxy]

15

Forming & Operating Scientific Communities

Define VO membership and roles, & enforce laws and community standards
  I.e., policy
Build, buy, operate, & share community infrastructure
  Data, programs, services, computing, storage, instruments
Define and perform collaborative work
  Use shared infrastructure, roles, & policy
  Manage community workflow

17

Beyond Science Silos: Service-Oriented Architecture

Decompose across network
Clients integrate dynamically
  Select & compose services
  Select “best of breed” providers
  Publish result as a new service
Decouple resource & service providers

[Figure labels: Function; Resource; Data Archives; Analysis tools; Discovery tools; Users]

Fig: S. G. Djorgovski

18

Decomposition Enables Separation of Concerns & Roles

User

Service Provider: “Provide access to data D at S1, S2, S3 with performance P”

Resource Provider: “Provide storage with performance P1, network with P2, …”

[Figure: data D and sites S1, S2, S3 shown at each level; labels: Replica catalog, User-level multicast, …]

19

Providing VO Services: (1) Integration from Other Sources

Negotiate service level agreements
Delegate and deploy capabilities/services
Provision to deliver defined capability
Configure environment
Host layered functions

[Figure labels: Community A … Community Z]

20

Deploying New Services

[Figure labels: Policy; Client; Environment; Activity; Allocate/provision; Configure; Initiate activity; Monitor activity; Control activity; Interface; Resource provider]

Current mechanisms include: GRAM, Workspaces (Keahey, et al), HAND (Qi, et al)

21

Virtualizing Existing Services into a VO

Establish service agreement with service
  E.g., WS-Agreement, GRAM
Delegate use to VO user

[Figure labels: User A; VO Admin; User B; VO User; Existing Services]

22

www.opensciencegrid.org

Jobs (2004)

Open Science Grid
  50 sites (15,000 CPUs) & growing
  400 to >1000 concurrent jobs
  Many applications + CS experiments; includes long-running production operations
  Up since October 2003; few FTEs central ops

23

Embedded Resource Management

• VO admin delegates credentials to be used by downstream VO services.
• VO admin starts the required services.
• VO jobs come in directly from the upstream VO Users.
• VO job gets forwarded to the appropriate resource using the VO credentials.
• Computational job started for VO.

[Figure labels: VO User; VO Admin; Client-side; VO Scheduler; Other Services; Monitoring and control; Headnode Resource Manager; Cluster Resource Manager; GRAM; Deleg; VO Job]

24

The Condor Brick

[Figure labels: Deploy Brick; Allocate resources, Initiate management services; Execute Jobs via Condor-C; Local Condor Environment; Public Network; Private Network; Allocate resources, Initiate job starters (i.e. glidein); GRAM; VO Admin; VO User]

25

Policy for Dynamic VO Service

[Figure labels: Hosting Environment; Container PDP; Service PDP; DoIt Service; VO PDP; VO ATA; User]

Create doit
AddPolicy if Role=VO/Admin
Role=HE/Service_Creator
CreateService if Role=HE/ServiceCreator
AddUser
DoIt if VO_PDP(Attrs)=yes & Role=HE/Doer
DoIt if Role=VO/Doer
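The rules on this slide suggest a chain of policy decision points (PDPs), where the hosting environment's service-level check calls out to the VO's own PDP. Here is a minimal sketch of that chaining, assuming a request carries a plain set of role strings. The function names are hypothetical, and the role spelling `HE/Service_Creator` is picked from the two variants that appear on the slide.

```python
def vo_pdp(roles):
    """VO-level rule from the slide: DoIt if Role=VO/Doer."""
    return "VO/Doer" in roles


def service_pdp(operation, roles):
    """Service-level rule: DoIt if VO_PDP(Attrs)=yes & Role=HE/Doer."""
    return operation == "DoIt" and vo_pdp(roles) and "HE/Doer" in roles


def container_pdp(operation, roles):
    """Container-level rule: CreateService only for hosting-environment service creators."""
    return operation == "CreateService" and "HE/Service_Creator" in roles


def authorize(operation, roles):
    """Route a request to the PDP responsible for that operation."""
    if operation == "CreateService":
        return container_pdp(operation, roles)
    return service_pdp(operation, roles)


print(authorize("DoIt", {"VO/Doer", "HE/Doer"}))          # True
print(authorize("DoIt", {"HE/Doer"}))                     # False: the VO PDP says no
print(authorize("CreateService", {"HE/Service_Creator"})) # True
```

The point of the chain is that both the hosting environment and the VO must agree before the dynamically created service runs an operation.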

26

Providing VO Services: (2) Coordination & Composition

Take a set of provisioned services …
… & compose to synthesize new behaviors
This is traditional service composition
  But must also be concerned with emergent behaviors, autonomous interactions
  See the work of the agent & PlanetLab communities

“Brain vs. Brawn: Why Grids and Agents Need Each Other," Foster, Kesselman, Jennings, 2004.

27

The Globus-Based LIGO Data Grid

LIGO Gravitational Wave Observatory

Replicating >1 Terabyte/day to 8 sites
>120 million replicas so far
MTBF = 1 month

www.globus.org/solutions

[Map labels: Birmingham; Cardiff; AEI/Golm]

28

Data Replication Service

Pull “missing” files to a storage system

[Figure labels: List of required Files; Data Replication Service; Reliable File Transfer Service; Local Replica Catalog; Replica Location Index; GridFTP. Groupings: Data Replication; Data Movement; Data Location]

“Design and Implementation of a Data Replication Service Based on the Lightweight Data Replicator System,” Chervenak et al., 2005
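The "pull missing files" idea can be sketched in a few lines. This is an illustration under stated assumptions, not the Data Replication Service itself: the local catalog is modeled as a set of file names, the replica index as a dictionary from file name to known source sites, and the `transfer` callback stands in for a reliable transfer service. All names are hypothetical.

```python
def plan_replication(required, local_catalog, replica_index):
    """Split required files into (file, source) transfers and files with no known replica."""
    transfers, unavailable = [], []
    for name in required:
        if name in local_catalog:
            continue  # already stored locally, nothing to pull
        sources = replica_index.get(name, [])
        if sources:
            transfers.append((name, sources[0]))  # naive choice: first known source
        else:
            unavailable.append(name)
    return transfers, unavailable


def replicate(required, local_catalog, replica_index, transfer):
    """Pull every missing file, register it locally, and return the files that could not be found."""
    transfers, unavailable = plan_replication(required, local_catalog, replica_index)
    for name, source in transfers:
        transfer(source, name)    # stand-in for the actual data movement
        local_catalog.add(name)   # record the new local replica
    return unavailable
```

A usage example: with `local_catalog = {"a"}` and `replica_index = {"b": ["site1"]}`, calling `replicate(["a", "b", "c"], local_catalog, replica_index, transfer)` moves only `"b"` and returns `["c"]`.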

29

Composing Resources … Composing Services

Physical machine: Procure hardware
Hypervisor/OS: Deploy hypervisor/OS
VM, VM: Deploy virtual machine
JVM: Deploy container
DRS, GridFTP, RLS: Deploy service
VO Services

Provisioning, management, and monitoring at all levels

30

Community Commons

What capabilities are available to VO?
  Membership changes, state changes
Require mechanisms to aggregate and update VO information
  VO-specific indexes

[Figure labels: S; A; Information; FRESH; MORE; The age of information]
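Because membership and state keep changing, an index of VO information has to age out stale entries. The following is a minimal sketch of one common approach, soft-state registration with a time-to-live, which is an assumption here rather than something the slide spells out. The `VOIndex` class and its methods are hypothetical, and time is passed in explicitly to keep the example deterministic.

```python
class VOIndex:
    """Aggregate service information and drop entries that have not been refreshed."""

    def __init__(self, ttl):
        self.ttl = ttl       # seconds an entry stays valid without a refresh
        self._entries = {}   # service name -> (info, time of last registration)

    def register(self, service, info, now):
        """Add or refresh an entry; services are expected to re-register periodically."""
        self._entries[service] = (info, now)

    def query(self, now):
        """Return only the entries that are still fresh at time `now`."""
        return {
            service: info
            for service, (info, stamp) in self._entries.items()
            if now - stamp <= self.ttl
        }


index = VOIndex(ttl=60)
index.register("gridftp", {"load": 0.2}, now=0)
index.register("gram", {"queue": 5}, now=50)
print(sorted(index.query(now=70)))  # ['gram']
```

The trade-off hinted at by the figure is that a longer time-to-live yields more information, while a shorter one yields fresher information.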

31

Monitoring and Discovery Services

[Figure labels: GT4 Container; MDS-Index; GRAM; RFT; GridFTP; adapter; User; Clients (e.g., WebMDS); Registration & WSRF/WSN Access; Custom protocols for non-WSRF entities; Automated registration in container; WS-ServiceGroup]

32

Forming & Operating Scientific Communities

Define VO membership and roles, & enforce laws and community standards
  I.e., policy
Build, buy, operate, & share community infrastructure
  Data, programs, services, computing, storage, instruments
  Service-oriented architecture
Define and perform collaborative work
  Use shared infrastructure, roles, & policy
  Manage community workflow

33

Collaborative Work

[Figure labels: Executed; Executing; Executable; Not yet executable; Query; Edit; Schedule; Execution environment; What I Did; What I Want to Do; What I Am Doing; Time]

34

Managing Collaborative Work

Process as “workflow,” at different scales, e.g.:
  Run 3-stage pipeline
  Process data flowing from expt over a year
  Engage in interactive analysis
Need to keep track of:
  What I want to do (will evolve with new knowledge)
  What I am doing now (evolve with system config.)
  What I did (persistent; a source of information)

[Figure labels: Template Generation; Abstract Workflow; Workflow Refinement; Workflow with executable nodes; Jobs; Execution Environment]

35

Problem Refinement

Given: desired result and constraints
  desired result (high-level, metadata description)
  application components
  resources in the Grid (dynamic, distributed)
  constraints & preferences on solution quality
Find: an executable job workflow
  A configuration that generates the desired result
  A specification of resources to be used
  Sequence of operations: create agreement, move data, request operation
  May create workflow incrementally as information becomes available
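The refinement step above can be illustrated with a toy planner. This is a sketch under stated assumptions, not the planner from the cited paper: jobs are given in dependency order as dictionaries, the replica catalog maps a file name to the site holding it, and `site_for` is a hypothetical site-selection callback. The sketch reuses results that already exist and inserts a transfer wherever an input lives at a different site than the job.

```python
def refine(abstract_jobs, replica_catalog, site_for):
    """Turn abstract jobs into a concrete list of transfer and run operations."""
    known = dict(replica_catalog)  # file name -> site, extended as jobs are planned
    plan = []
    for job in abstract_jobs:
        if job["output"] in known:
            continue  # desired result already exists, so the job is not needed
        site = site_for(job["name"])
        for name in job["inputs"]:
            location = known.get(name)
            if location is not None and location != site:
                plan.append(("transfer", name, location, site))
        plan.append(("run", job["name"], site))
        known[job["output"]] = site  # later jobs can find this output here
    return plan


jobs = [
    {"name": "extract", "inputs": ["raw"], "output": "clean"},
    {"name": "analyze", "inputs": ["clean"], "output": "result"},
]
plan = refine(jobs, {"raw": "siteA"}, lambda job_name: "siteB")
for step in plan:
    print(step)
```

With the catalog `{"raw": "siteA"}` the plan is one transfer followed by two runs; if the catalog already lists `"clean"`, the `extract` job drops out of the plan.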

"Mapping Abstract Complex Workflows onto Grid Environments," Deelman, Blythe, Gil, Kesselman, Mehta, Vahi, Arbree, Cavanaugh, Blackburn, Lazzarini, Koranda, 2003.

36

Trident: The GriPhyN Virtual Data System

[Figure stages: Workflow spec; Create Execution Plan; Grid Workflow Execution. Labels: VDL Program; Virtual Data catalog; Virtual Data Workflow Generator; Abstract workflow; Job Planner; Job Cleanup; Local planner; Statically Partitioned DAG; DAGman DAG; Dynamically Planned DAG; DAGman & Condor-G]
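The execution stage in the figure runs a directed acyclic graph (DAG) of jobs, releasing each job once its prerequisites have finished. The sketch below shows that idea with Python's standard `graphlib` module (Python 3.9+). It is not DAGMan, and the `run_dag` helper and the job names are hypothetical.

```python
from graphlib import TopologicalSorter


def run_dag(dependencies, run_job):
    """Run jobs so that every job starts only after all of its prerequisites are done.

    `dependencies` maps each job to the set of jobs it depends on.
    """
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises graphlib.CycleError if the graph is not acyclic
    order = []
    while sorter.is_active():
        for job in sorter.get_ready():  # all jobs whose prerequisites are complete
            run_job(job)
            order.append(job)
            sorter.done(job)            # unblocks the jobs that depend on this one
    return order


dependencies = {
    "plan": {"generate"},
    "execute": {"plan"},
    "cleanup": {"execute"},
}
order = run_dag(dependencies, lambda job: None)
print(order)  # ['generate', 'plan', 'execute', 'cleanup']
```

In a real system the jobs returned together by `get_ready()` could be submitted in parallel, with `done()` called as each one completes.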

37

Seismic Hazard Curve

[Figure: hazard curve. Vertical axis: Annual frequency of exceedance (Exceeded every year; Exceeded 1 time in 10 years; Exceeded 1 time in 100 years; Exceeded 1 time in 1000 years; Exceeded 1 time in 10,000 years). Horizontal axis: Ground Motion – Peak Ground Acceleration (0.1 to 0.6). Annotations: Ground motion that will be exceeded every year; Ground motion that a person can expect to be exceeded during their lifetime; Typical design for buildings; Typical design for hospitals; Typical design for nuclear power plant; 10% probability of exceedance in 50 years; Minor damage; Moderate damage; Carl’s house during Northridge]
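The "10% probability of exceedance in 50 years" annotation can be related to the annual-frequency axis with a short calculation. The sketch below assumes exceedances follow a Poisson process, which is the usual convention in hazard analysis but is not stated on the slide. Under that assumption the probability of at least one exceedance in t years is 1 - exp(-rate * t), so the rate follows by inverting that expression. The function name is hypothetical.

```python
import math


def annual_rate(probability, years):
    """Annual exceedance rate implied by a probability of exceedance over a time window.

    Assumes a Poisson process: probability = 1 - exp(-rate * years).
    """
    return -math.log(1.0 - probability) / years


rate = annual_rate(0.10, 50)
return_period = 1.0 / rate
print(f"annual rate: {rate:.5f}")                 # annual rate: 0.00211
print(f"return period: {return_period:.0f} yr")   # return period: 475 yr
```

So the building-design level on the curve sits at roughly 1 exceedance in 475 years, between the "1 time in 100 years" and "1 time in 1000 years" marks.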

38

SCEC Cybershake

Calculate hazard curves by generating synthetic seismograms from estimated rupture forecast

[Figure labels: Rupture Forecast; Strain Green Tensor; Synthetic Seismogram; Spectral Acceleration; Hazard Curve; Hazard Map]
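The step from a rupture forecast to a hazard curve can be sketched as a weighted count. This is a simplified illustration and not the CyberShake code: it assumes each forecast rupture has an annual rate and a list of intensity values taken from its synthetic seismograms, and it estimates the annual exceedance rate at each threshold as the rate-weighted fraction of simulations above that threshold. All names and numbers are hypothetical.

```python
def hazard_curve(ruptures, thresholds):
    """Annual rate of exceeding each threshold.

    `ruptures` is a list of (annual_rate, simulated_intensities) pairs.
    """
    curve = []
    for level in thresholds:
        total = 0.0
        for annual_rate, intensities in ruptures:
            exceeding = sum(1 for value in intensities if value > level)
            total += annual_rate * exceeding / len(intensities)
        curve.append(total)
    return curve


# Two hypothetical ruptures: a frequent weak one and a rare strong one.
ruptures = [
    (0.01, [0.1, 0.3]),
    (0.001, [0.5, 0.7]),
]
curve = hazard_curve(ruptures, thresholds=[0.2, 0.6])
print(curve)
```

Repeating this calculation for every site on a grid, at a fixed exceedance rate, gives the values plotted in a hazard map.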

39

Cybershake on the SCEC VO

[Figure labels: VO Scheduler; Workflow Scheduler/Engine; VO Service Catalog; Provenance Catalog; Data Catalog; TeraGrid Compute; TeraGrid Storage; SCEC Storage]

40

Summary (1): Community Services

Community roll, city hall, permits, licensing & police force
  Assertions, policy, attribute & authorization services
Directories, maps
  Information services
City services: power, water, sewer
  Deployed services
Shops, businesses
  Composed services
Day-to-day activities
  Workflows, visualization
Tax board, fees, economic considerations
  Barter, planned economy, eventually markets

41

Summary (2)

Community based science will be the norm
  Requires collaborations across sciences, including computer science
Many different types of communities
  Differ in coupling, membership, lifetime, size
Must think beyond science stovepipes
  Increasingly the community infrastructure will become the scientific observatory
Scaling requires a separation of concerns
  Providers of resources, services, content
Small set of fundamental mechanisms required to build communities

42

For More Information

Globus Alliance
  www.globus.org
NMI and GRIDS Center
  www.nsf-middleware.org
  www.grids-center.org
Infrastructure
  www.opensciencegrid.org
  www.teragrid.org
Background
  www.isi.edu/~carl
2nd Edition
  www.mkp.com/grid2