Programming a service Cloud

Rosa M. Badia, Jorge Ejarque, Daniele Lezzi, Raul Sirvent, Enric Tejedor

Grid Computing and Clusters Group Barcelona Supercomputing Center

Cloud Futures Workshop, Redmond, WA, 8-9 April 2010

Outline

• StarSs programming model

• COMPSs framework

• EMOTIVE Cloud

• COMPSs towards SOA and Clouds

• ServiceSs

• Conclusions

Star Superscalar Programming Model

Sequential Application

...
for (i=0; i<N; i++){
    T1 (data1, data2);
    T2 (data4, data5);
    T3 (data2, data5, data6);
    T4 (data7, data8);
    T5 (data6, data8, data9);
}
...

• Task selection + parameters direction (input, output, inout)

• Task graph creation based on data precedence

• Scheduling, data transfer, task execution

• Synchronization, results transfer

[Figure: task graph of the task instances generated by each loop iteration (T1, T2, T3, T4, T5 for iteration 0, then iteration 1, and so on), scheduled onto Parallel Resources (cluster, grid): Resource 1, Resource 2, Resource 3, ..., Resource N]
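The task graph creation step above can be sketched in a few lines of Java. This is a minimal illustration under stated assumptions, not the StarSs or COMPSs runtime: the class `TaskGraphSketch` and its helpers are hypothetical, and the slide does not give parameter directions, so the usage note assumes that each task reads all its arguments except the last one, which it writes. Only read-after-write dependencies are tracked here; a real runtime also has to deal with write-after-read and write-after-write cases.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical sketch: derives task dependencies from parameter directions. */
class TaskGraphSketch {
    enum Direction { IN, OUT, INOUT }

    /** One task parameter: a data name plus its declared direction. */
    static final class Param {
        final String data;
        final Direction dir;
        Param(String data, Direction dir) { this.data = data; this.dir = dir; }
    }

    // For every datum, the id of the last task that wrote it.
    private final Map<String, Integer> lastWriter = new HashMap<>();
    // predecessors.get(i) holds the ids of the tasks that task i must wait for.
    private final List<Set<Integer>> predecessors = new ArrayList<>();

    /** Registers a task in program order and returns its id (0, 1, 2, ...). */
    int addTask(Param... params) {
        int id = predecessors.size();
        Set<Integer> deps = new TreeSet<>();
        // Reads (IN, INOUT) depend on the last task that wrote the datum.
        for (Param p : params) {
            if (p.dir != Direction.OUT && lastWriter.containsKey(p.data)) {
                deps.add(lastWriter.get(p.data));
            }
        }
        // Writes (OUT, INOUT) make this task the new last writer.
        for (Param p : params) {
            if (p.dir != Direction.IN) {
                lastWriter.put(p.data, id);
            }
        }
        predecessors.add(deps);
        return id;
    }

    Set<Integer> dependenciesOf(int taskId) { return predecessors.get(taskId); }

    static Param in(String data) { return new Param(data, Direction.IN); }
    static Param out(String data) { return new Param(data, Direction.OUT); }
    static Param inout(String data) { return new Param(data, Direction.INOUT); }
}
```

Under the assumed directions, registering `T1(in data1, out data2)`, `T2(in data4, out data5)` and `T3(in data2, in data5, out data6)` makes T3 depend on T1 and T2, while T1, T2 and T4 have no predecessors and could run in parallel.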

StarSs programming model

• GRIDSs, COMPSs

• Tailored for Grids or clusters

• Data dependence analysis based on files

• C/C++, Java

• SMPSs

• Tailored for SMPs or homogeneous multicores

• Altix, JS21 nodes, Power5, Intel-Core2

• C or Fortran

• CellSs / GPUSs

• Tailored for Cell/B.E. processor / for GPUs

• C or Fortran

• NestedSs

• Hybrid approach that combines SMPSs and CellSs

COMPSs

• Componentised runtime

• Each component in charge of a functionality

Base technologies:

• Java as programming language

• ProActive:

• Reference implementation of the GCM model

• Used to build the components

• JavaGAT

• API that provides uniform access to different kinds of Grid middleware

• Used for job submission and file transfer

COMPSs Programming model – Application + interface

Java application

initialize(f1);

for (int i = 0; i < 2; i++) {
    genRandom(f2);
    add(f1, f2);
}

print(f2);

Java interface

public interface SumItf {

    @ClassName("example.Sum")
    @MethodConstraints(OSType = "Linux")
    void genRandom(
        @ParamMetadata(type = Type.FILE, direction = Direction.OUT)
        String f
    );

    @ClassName("example.Sum")
    ...

}

Callouts on the interface: implementation (@ClassName), task constraints (@MethodConstraints), parameter metadata (@ParamMetadata).
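To make the role of the annotated interface concrete, the sketch below shows how such annotations could be read with plain Java reflection. The annotation names come from the slide, but their definitions are not shown there, so the types declared here (`ClassName`, `MethodConstraints`, `ParamMetadata`, `Type`, `Direction`) are stand-ins written for this example, and `AnnotatedInterfaceSketch` is a hypothetical class, not part of COMPSs.

```java
import java.lang.annotation.Annotation;
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: stand-in annotations plus a reflective reader. */
class AnnotatedInterfaceSketch {
    enum Type { FILE, STRING, INT }
    enum Direction { IN, OUT, INOUT }

    // Assumed definitions; the slides only show how the annotations are used.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    @interface ClassName { String value(); }

    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    @interface MethodConstraints { String OSType() default ""; }

    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.PARAMETER)
    @interface ParamMetadata { Type type(); Direction direction(); }

    /** Same shape as the slide's interface, limited to the method shown in full. */
    interface SumItf {
        @ClassName("example.Sum")
        @MethodConstraints(OSType = "Linux")
        void genRandom(
            @ParamMetadata(type = Type.FILE, direction = Direction.OUT)
            String f
        );
    }

    /** Returns one description line per method that carries @ClassName. */
    static List<String> describeAll(Class<?> itf) {
        List<String> lines = new ArrayList<>();
        for (Method m : itf.getDeclaredMethods()) {
            ClassName impl = m.getAnnotation(ClassName.class);
            if (impl == null) {
                continue; // not declared as a task
            }
            StringBuilder sb = new StringBuilder(impl.value()).append(".").append(m.getName());
            MethodConstraints mc = m.getAnnotation(MethodConstraints.class);
            if (mc != null) {
                sb.append(" [OSType=").append(mc.OSType()).append("]");
            }
            // One inner array per parameter, holding that parameter's annotations.
            for (Annotation[] paramAnnotations : m.getParameterAnnotations()) {
                for (Annotation a : paramAnnotations) {
                    if (a instanceof ParamMetadata) {
                        ParamMetadata pm = (ParamMetadata) a;
                        sb.append(" ").append(pm.type()).append(":").append(pm.direction());
                    }
                }
            }
            lines.add(sb.toString());
        }
        return lines;
    }
}
```

Calling `AnnotatedInterfaceSketch.describeAll(AnnotatedInterfaceSketch.SumItf.class)` yields a single line naming the implementing class, the constraint and the direction of the file parameter, which is the information a runtime would need to select and schedule the task.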

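Custom Java Class Loader

[Figure: two instrumentation paths. Java: the Custom Loader uses Javassist, takes the Java app code and the annotated interface as input, and inserts calls to the COMPSs runtime. C/C++: the Stubs Generator takes the C/C++ app code and the interface as input, and inserts calls to the COMPSs runtime through JNI.]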

Runtime behavior

[Figure: the Custom Loader (Javassist) takes the Java code and the annotated interface; tasks T1, T2, T3 and T4 are dispatched to Grids and Clusters; data label: Files]

Java code

initialize(f1);

for (int i = 0; i < 2; i++) {
    genRandom(f2);
    add(f1, f2);
}

print(f2);
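The idea of replacing method calls with task submissions can be illustrated without bytecode rewriting. The sketch below deliberately does not use Javassist: it swaps in a JDK dynamic proxy (`java.lang.reflect.Proxy`) to intercept calls, which is a different mechanism from the one on the slides but shows the same effect. The interface `SumTasks`, the class `RecordingRuntime` and the string file names are assumptions made for the example.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: calls are recorded as tasks instead of being executed. */
class InterceptionSketch {
    /** Assumed task interface mirroring the methods used by the sample application. */
    interface SumTasks {
        void initialize(String f);
        void genRandom(String f);
        void add(String f1, String f2);
        void print(String f);
    }

    /** Stand-in for a runtime: it only records the submitted tasks in program order. */
    static final class RecordingRuntime implements InvocationHandler {
        final List<String> submitted = new ArrayList<>();

        @Override
        public Object invoke(Object proxy, Method method, Object[] args) {
            StringBuilder sb = new StringBuilder(method.getName()).append("(");
            for (int i = 0; args != null && i < args.length; i++) {
                if (i > 0) {
                    sb.append(", ");
                }
                sb.append(args[i]);
            }
            submitted.add(sb.append(")").toString());
            return null; // every task method in SumTasks is void
        }
    }

    /** Wraps the task interface so that each call reaches the recording runtime. */
    static SumTasks intercept(RecordingRuntime runtime) {
        return (SumTasks) Proxy.newProxyInstance(
                SumTasks.class.getClassLoader(),
                new Class<?>[] { SumTasks.class },
                runtime);
    }

    /** The sequential application from the slide, written against the proxy. */
    static List<String> runApplication() {
        RecordingRuntime runtime = new RecordingRuntime();
        SumTasks app = intercept(runtime);
        app.initialize("f1");
        for (int i = 0; i < 2; i++) {
            app.genRandom("f2");
            app.add("f1", "f2");
        }
        app.print("f2");
        return runtime.submitted;
    }
}
```

The application code stays sequential; only the object it calls changes. A bytecode-level approach such as the one on the slides avoids even that change, since the calls are rewritten when the class is loaded.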

HMMPfam: sample COMPSs application

• HMMER: set of tools for protein sequence analysis

• Based on statistical Hidden Markov Models (HMMs)

• hmmpfam: tool to compare a sequence against a database of HMMs (protein families)

• Computationally intensive

• Embarrassingly parallel

• HMMPfam: Java application that uses hmmpfam

• Query sequences / database segmentation

• Programmed in a totally sequential fashion

• Selection of remote methods using a separate Java interface

• hmmpfam computation, merging of results

HMMPfam – Annotated interface

public interface HMMPfamItf {

    @ClassName("worker.hmmer.HMMPfamImpl")
    void hmmpfam(
        @ParamMetadata(type = Type.STRING, direction = Direction.IN)
        String hmmpfamBin,
        @ParamMetadata(type = Type.STRING, direction = Direction.IN)
        String commandLineArgs,
        @ParamMetadata(type = Type.FILE, direction = Direction.IN)
        String seqFile,
        @ParamMetadata(type = Type.FILE, direction = Direction.IN)
        String dbFile,
        @ParamMetadata(type = Type.FILE, direction = Direction.OUT)
        String resultFile
    );

    @ClassName("worker.hmmer.HMMPfamImpl")
    void mergeSameSeq(
        @ParamMetadata(type = Type.FILE, direction = Direction.INOUT)
        String resultFile1,
        @ParamMetadata(type = Type.FILE, direction = Direction.IN)
        String resultFile2,
        @ParamMetadata(type = Type.INT, direction = Direction.IN)
        int aLimit
    );

    @ClassName("worker.hmmer.HMMPfamImpl")
    void mergeSameDB(
        @ParamMetadata(type = Type.FILE, direction = Direction.INOUT)
        String resultFile1,
        @ParamMetadata(type = Type.FILE, direction = Direction.IN)
        String resultFile2
    );
}

HMMPfam – Main program

public static void main(String args[]) throws Exception {
    // Segment the query sequences file, the database file or both (done sequentially)
    split(fSeq, fDB, seqFrags, dbFrags);

    for (String dbFrag : dbFrags) {
        // Launch hmmpfam for each pair of seq - db fragments
        for (String seqFrag : seqFrags) {
            HMMPfamImpl.hmmpfam(hmmpfamBin, finalArgs, seqFrag, dbFrag, output);
            seqNum++;
        }
        dbNum++;
    }

    while (outputs.size() > 1) {
        ListIterator<String> li = outputs.listIterator();
        while (li.hasNext()) {
            String firstOutput = li.next();
            String secondOutput = li.hasNext() ? li.next() : null;
            if (secondOutput == null) break;
            if (sameSeqFragment(firstOutput, secondOutput))
                // Merge output fragments of different db fragments (must take care when merging)
                HMMPfamImpl.mergeSameSeq(firstOutput, secondOutput, clArgs.getALimit());
            else if (sameDBFragment(firstOutput, secondOutput))
                // Merge output fragments of different sequence fragments (basically appending one to another)
                HMMPfamImpl.mergeSameDB(firstOutput, secondOutput);
            else
                // Avoid merging two output fragments of different sequence and db fragments
                li.previous();
        }
    }
}
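The merging phase of the main program is a pairwise reduction over the list of partial results. The sketch below isolates that pattern under simplifying assumptions: it ignores the distinction between `mergeSameSeq` and `mergeSameDB`, takes the merge operation as a callback, and drops the second result of each pair once it has been folded into the first. `PairwiseMergeSketch` and its method are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiConsumer;

/** Hypothetical sketch: pairwise reduction of partial result files. */
class PairwiseMergeSketch {
    /**
     * Repeatedly merges neighbouring results until a single one remains.
     * merge.accept(a, b) is expected to fold b into a, matching the INOUT
     * direction of the first parameter of the merge tasks.
     * Returns the number of merge rounds performed.
     */
    static int reduce(List<String> outputs, BiConsumer<String, String> merge) {
        int rounds = 0;
        while (outputs.size() > 1) {
            List<String> survivors = new ArrayList<>();
            for (int i = 0; i < outputs.size(); i += 2) {
                String first = outputs.get(i);
                if (i + 1 < outputs.size()) {
                    merge.accept(first, outputs.get(i + 1));
                }
                // An unpaired last element is carried over to the next round.
                survivors.add(first);
            }
            outputs.clear();
            outputs.addAll(survivors);
            rounds++;
        }
        return rounds;
    }
}
```

Within one round the merges touch disjoint pairs of results, so a task-based runtime is free to run them concurrently; only consecutive rounds depend on each other.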

HMMPfam – Tasks

public static void hmmpfam(String hmmpfamBin, String commandLineArgs, String seqFile, String dbFile, String resultFile) throws Exception {
    String cmd = hmmpfamBin + " " + commandLineArgs + " " + dbFile + " " + seqFile;

    // Execute command line
    Process hmmpfamProc = Runtime.getRuntime().exec(cmd);

    // Check the proper finalization of the process
    int exitValue = hmmpfamProc.waitFor();
    if (exitValue != 0) {
        throw new Exception("Exit value for hmmpfam is " + exitValue);
    }
}

public static void mergeSameSeq(String resultFile1, String resultFile2, int aLimit) throws Exception {...}
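The task shown on the slide does not show how the output ends up in `resultFile`. A possible variant, sketched below with `ProcessBuilder`, redirects the standard output of the process into that file. This rests on the assumption that the tool writes its results to standard output; the class name `HmmpfamTaskSketch` and the whitespace-based splitting of the options string are also choices made for this example, not something taken from the slides.

```java
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical variant of the hmmpfam task that captures output in resultFile. */
class HmmpfamTaskSketch {
    /** Builds the argument vector: binary, options, database file, sequence file. */
    static List<String> buildCommand(String hmmpfamBin, String commandLineArgs,
                                     String seqFile, String dbFile) {
        List<String> cmd = new ArrayList<>();
        cmd.add(hmmpfamBin);
        String trimmed = commandLineArgs.trim();
        if (!trimmed.isEmpty()) {
            // Simplification: options are split on whitespace, so quoted
            // arguments containing spaces are not supported.
            cmd.addAll(Arrays.asList(trimmed.split("\\s+")));
        }
        cmd.add(dbFile);
        cmd.add(seqFile);
        return cmd;
    }

    static void hmmpfam(String hmmpfamBin, String commandLineArgs, String seqFile,
                        String dbFile, String resultFile)
            throws IOException, InterruptedException {
        ProcessBuilder pb = new ProcessBuilder(
                buildCommand(hmmpfamBin, commandLineArgs, seqFile, dbFile));
        // Assumption: the results are written to standard output.
        pb.redirectOutput(new File(resultFile));
        pb.redirectError(ProcessBuilder.Redirect.INHERIT);

        // Check the proper finalization of the process
        int exitValue = pb.start().waitFor();
        if (exitValue != 0) {
            throw new IOException("Exit value for hmmpfam is " + exitValue);
        }
    }
}
```

Passing the command as a list avoids the tokenizing that `Runtime.exec(String)` performs on a single command string, which matters when file paths contain spaces.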

HMMPfam

HMMPfam performance

HMMPfam – EBI runs

• The European Bioinformatics Institute used HMMPfam in production runs

• ELIXIR project, tests on the MareNostrum supercomputer

• 7,500,000 protein sequences, divided into 150 files with 50,000 sequences each

• TIGRFAM database, containing 3418 models (HMMs)

• 150 jobs submitted (i.e. COMPSs-HMMPfam executions), one for each input sequences file

• Approximately 12 hours of execution time per job

• 64 worker processors per job + 4 processors for the master
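The figures above can be cross-checked with a little arithmetic. The helper below is a hypothetical convenience class, and the processor-hours value is only a rough estimate, since the slide gives the 12 hours per job as approximate.

```java
/** Hypothetical helper: back-of-the-envelope figures for the runs described above. */
class EbiRunArithmetic {
    /** Total number of sequences across all input files. */
    static long totalSequences(int files, int sequencesPerFile) {
        return (long) files * sequencesPerFile;
    }

    /** Number of sequence/model comparisons if every sequence meets every model. */
    static long sequenceModelPairs(long sequences, int models) {
        return sequences * models;
    }

    /** Processor-hours, assuming every job runs for the full hoursPerJob. */
    static long processorHours(int jobs, int workersPerJob, int masterProcsPerJob, int hoursPerJob) {
        return (long) jobs * (workersPerJob + masterProcsPerJob) * hoursPerJob;
    }
}
```

With the slide's numbers, 150 files of 50,000 sequences give 7,500,000 sequences, and 150 jobs on 64 + 4 processors for about 12 hours each come to roughly 122,400 processor-hours.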

EMOTIVE CLOUD

• EMOTIVE CLOUD – Barcelona

Elastic Management of Tasks for Virtualized Environments in the CLOUD

– is an open-source software infrastructure for implementing 'cloud computing' on clusters (v1.0 recently released)

– is an open-source collaborative software development project dedicated to providing an extensible, standards-based platform to address a broad range of needs in the resource management development space

EMOTIVE Cloud

• Scheduler

• Selects where to execute a task

• Virtualized Resource Management and Monitoring

• VM lifecycle management

• Creation of VMs

• VM monitoring

• VM destruction

• Data management

• Migration

• Checkpointing

• Data infrastructure

• Distributed file system

• EMOTIVE architecture: three different layers

SERA scheduler: SRLM and ERA

• SRLM

• Receives customer requests: job execution

• Negotiates the allocation with the resource agents

• Selects the resources which match the job requests

• Receives from ERA scheduling proposals for the selected resources

• Decides which is the best proposal

• Manages the execution lifecycle

• Monitors the execution, recovers in case of failure, tries to improve the execution

• ERA

• Produces scheduling proposals

• Finds schedules for the job requests using the semantic information of the resource descriptions and the provider rules

• Interacts with the different resources

• Reserves resources

• Creates VMs for the execution

• Submits the jobs on the selected resources

[Figure: SRLM, ERA, Semantic Scheduler, Resource Manager, Resources]
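The division of labour between SRLM and ERA (proposals are produced per resource, then the best one is chosen) can be sketched as follows. Everything in this example is an assumption made for illustration: the slides describe matching based on semantic resource descriptions and provider rules, whereas the sketch reduces this to numeric CPU and memory checks, and it takes "best" to mean the earliest estimated finish time. None of the class or field names come from SERA.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

/** Hypothetical sketch of a propose-then-select negotiation. */
class SeraNegotiationSketch {
    static final class Resource {
        final String name;
        final int cpus;
        final int memoryMb;
        final double queuedHours; // work already scheduled on this resource
        Resource(String name, int cpus, int memoryMb, double queuedHours) {
            this.name = name;
            this.cpus = cpus;
            this.memoryMb = memoryMb;
            this.queuedHours = queuedHours;
        }
    }

    static final class JobRequest {
        final int cpus;
        final int memoryMb;
        final double durationHours;
        JobRequest(int cpus, int memoryMb, double durationHours) {
            this.cpus = cpus;
            this.memoryMb = memoryMb;
            this.durationHours = durationHours;
        }
    }

    static final class Proposal {
        final String resource;
        final double estimatedFinishHours;
        Proposal(String resource, double estimatedFinishHours) {
            this.resource = resource;
            this.estimatedFinishHours = estimatedFinishHours;
        }
    }

    /** ERA role (simplified): propose a schedule only if the resource fits the job. */
    static Optional<Proposal> propose(Resource r, JobRequest job) {
        if (r.cpus < job.cpus || r.memoryMb < job.memoryMb) {
            return Optional.empty();
        }
        return Optional.of(new Proposal(r.name, r.queuedHours + job.durationHours));
    }

    /** SRLM role (simplified): gather the proposals and keep the earliest finish. */
    static Optional<Proposal> selectBest(List<Resource> resources, JobRequest job) {
        List<Proposal> proposals = new ArrayList<>();
        for (Resource r : resources) {
            propose(r, job).ifPresent(proposals::add);
        }
        return proposals.stream()
                .min(Comparator.comparingDouble((Proposal p) -> p.estimatedFinishHours));
    }
}
```

An empty result from `selectBest` corresponds to the case where no resource matches the request, which is where a scheduler could decide to ask for new VMs.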

Integration in a Service-Oriented and Cloud infrastructure

• Goal: moving the COMPSs runtime from the client side to a server SOA platform

• Characteristics of this environment:

• Execution of application tasks offered as services

• N applications can be served simultaneously

• Several COMPSs runtimes can be deployed, to serve the tasks from one or more applications

• Resource provisioning provided by a Cloud

COMPSs and EMOTIVE Cloud – Step 1

[Figure: pool of VMs: VM1, VM2, ..., VMn]

1. Existing pool of EMOTIVE VMs

2. COMPSs executes tasks on these VMs

COMPSs and EMOTIVE Cloud – Step 2

[Figure: pool of VMs: VM1, VM2, ..., VMn, plus a newly created VMn+1]

1. The Task Scheduler requests a pool of VMs from SERA

2. COMPSs executes tasks on these VMs

3. COMPSs requests the creation of more or “bigger” VMs (memory, CPU, etc.)
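The slides do not say what triggers the request for additional VMs. One simple possibility, sketched below, is to size the pool from the number of pending tasks. The policy, the class name `ElasticPoolSketch` and all parameters are assumptions made for illustration and are not taken from COMPSs or EMOTIVE.

```java
/** Hypothetical sketch: decide how many extra VMs to request for the pending load. */
class ElasticPoolSketch {
    /**
     * @param pendingTasks tasks waiting to be executed
     * @param currentVms   VMs already in the pool
     * @param slotsPerVm   tasks one VM can run at the same time
     * @param maxVms       upper bound on the pool size
     * @return number of additional VMs to request (never negative)
     */
    static int extraVmsNeeded(int pendingTasks, int currentVms, int slotsPerVm, int maxVms) {
        if (slotsPerVm <= 0) {
            throw new IllegalArgumentException("slotsPerVm must be positive");
        }
        // Ceiling division: VMs needed to run all pending tasks at once.
        int wanted = (pendingTasks + slotsPerVm - 1) / slotsPerVm;
        // Never shrink below the current pool here, never grow past the limit.
        int target = Math.min(Math.max(wanted, currentVms), maxVms);
        return Math.max(0, target - currentVms);
    }
}
```

For example, 10 pending tasks on a pool of 2 VMs with 4 slots each would lead to a request for 1 more VM under this policy.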

ServiceSs envisioned architecture

[Figure: envisioned ServiceSs architecture. User side: User and Apps. COMPSs application side: Java Apps hosted in WS Containers, a Runtime Manager, and COMPSs runtime instances 1 to N. Cloud: a Cloud Scheduler and Worker VMs 1 to M, with WS Containers.]

Conclusions

• COMPSs is a platform-unaware programming model that simplifies the development of applications in distributed environments

• Transparent data management, task execution

• Parallelization at task level

• Independent of platform: clusters, grids, clouds

• COMPSs evolution on top of SERA and EMOTIVE Cloud will enable execution on federated clouds

• SERA is already able to submit jobs to EC2

• Further evolution of COMPSs towards ServiceSs to enable the composition of services

• Graphical IDE to help deployment of services and development of applications

• Evolved runtime to support new features

• www.bsc.es/grid

• www.emotivecloud.net