Project18 Communication Design + Parallelization Camilo A Silva BIOinformatics Summer 2008


Page 1: Project18 Communication Design + Parallelization

Camilo A Silva, BIOinformatics, Summer 2008

Page 2: Goals

Design a communication structure for project18

Provide a clear map and detailed instructions for parallelizing the code

Oversee a self-managing fault system for project18

Share ideas on how to self-optimize project18

Page 3: Main Structure

A master node communicating with all slave nodes.

Page 4: Objective

The plan is to run project18 on different nodes at the same time. Each node will create an output file, which presents the discriminating probes found between the two genomes compared.

Page 5: How?

The master node will acquire information from the user regarding the different genomes to be compared for project18.

The master node will administer the data and create jobs for each slave node.

Each slave node will receive the data from the master node and start execution of project18.

After a node has completed its task, it will report its completion to the master node, which will determine whether there are more tasks to be completed. If there are, the next task will be given to that node.

When the program has finished, all results shall be stored in a predefined directory where they will be available for review.

Page 6: Communication Drawing Design

User input: { (g1,g2), (g1,g3), … , (g2,g3), (g3,g4), … }

Each pair produces one output file; for example, (g2,g3) yields g2_g3.txt, and so on.

Task-to-node mapping:
(g1,g2) → node0
(g1,g3) → node1
(g2,g3) → node7
(g3,g4) → ? (not yet assigned)
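As a hedged sketch of how the user input above could become the "g1*g2"-style task strings sent later by the master node: the helper below (buildQueue is an illustrative name, not from the slides) packs every unordered genome pair into a single queue string whose tasks are separated by ';'. This queue representation is an assumption used consistently in the sketches that follow.

/* Illustrative sketch: build a ';'-separated queue string of
   "g1*g2"-style tasks, one per unordered genome pair (gi, gj), i < j. */
#include <stdio.h>
#include <string.h>

void buildQueue ( char *queue , size_t size , const char *genomes [ ] , int n ) {
    queue [0] = '\0' ;
    for ( int i = 0 ; i < n ; i++ )
        for ( int j = i + 1 ; j < n ; j++ ) {
            char task [64] ;
            snprintf ( task , sizeof task , "%s*%s" , genomes [i] , genomes [j] ) ;
            if ( queue [0] != '\0' )
                strncat ( queue , ";" , size - strlen ( queue ) - 1 ) ;
            strncat ( queue , task , size - strlen ( queue ) - 1 ) ;
        }
}

For genomes { g1, g2, g3 } this produces "g1*g2;g1*g3;g2*g3", matching the pair set and file names shown above.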

Page 7: Parallel Program Design

[Diagram: parallel program design. The master node (M.N.) sits between start and end; tasks fan out to slave nodes 1 through 7, and completions (C) flow back to the master until the finish (F).]

M.N. = Master Node
1-7 = Slave Nodes
F = Finish
C = Completion

Page 8: Parallelization Roadmap

In the following slides, each section of the parallel program design code will be explained in order to parallelize project18. Each slide represents a single diagram element or section of the parallel design.

NOTE: if by any chance you need detailed information on the MPI functions, go to this link:

http://www.mpi-forum.org/docs/mpi2-report.pdf

Page 9: Start

/* In order to start a program using MPI, the following libraries must
   be present… One may add other libraries as necessary. */
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include "mpi.h"
#include "project18.h"

Page 10: Start (continued)

/* Sometimes one may want to define some constant variables.
   Other functions to be used need to be declared as well. */
#define MASTER_NODE 0
#define BUFFER 100
#define TAG 0

void createFolder ( const char *filename , const char *newFileName ) ;
int checkQueue ( char *queue ) ;
void assignSingleTasks ( int node , char *queue ) ;
void taskControl ( int node , char *queue ) ;
// etc…
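The slides declare checkQueue but never show its body. A minimal sketch is given below, assuming (an illustrative choice, not the author's) the ';'-separated queue string introduced on Page 6, e.g. "g1*g2;g1*g3":

/* Hypothetical checkQueue: counts the tasks remaining in a
   ';'-separated queue string; 0 means the queue is empty. */
int checkQueue ( char *queue ) {
    if ( queue == NULL || *queue == '\0' )
        return 0 ;
    int items = 1 ;                      /* at least one task present */
    for ( const char *p = queue ; *p ; p++ )
        if ( *p == ';' )
            items++ ;                    /* each ';' separates two tasks */
    return items ;
}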

Page 11: Start (continued)

/* To start an MPI program, one needs to initialize it in main().
   Program variables should be defined here as well. */
int main ( int argc , char *argv [ ] ) {
    MPI_Status status ;
    char filename [20] ;
    char fileToCreate [20] ;
    int my_rank , numOfNodes , queueItemsLeft , start = 1 ;
    ... // as many as needed

    // initializes the MPI program
    MPI_Init ( &argc , &argv ) ;

    // determines the rank of this node
    MPI_Comm_rank ( MPI_COMM_WORLD , &my_rank ) ;

    // finds out how many processors are active
    MPI_Comm_size ( MPI_COMM_WORLD , &numOfNodes ) ;

Page 12: Master node start

/* The master node is selected in order for the user to input some
   values for the project18 parameters. */
if ( my_rank == MASTER_NODE ) {
    // ask user for input…
    …
    // create a queue or a data structure of the like…
    …
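The input and queue-creation steps are left elided above. One hedged possibility, reusing the ';'-separated queue string and the buildQueue helper sketched earlier (the prompts, buffer sizes, and eight-genome cap are all illustrative assumptions), is:

    /* Hypothetical fill-in for the elided steps: read the genome names,
       then build the all-pairs task queue from them. */
    char queue [2048] ;
    char names [8][40] ;
    const char *genomes [8] ;
    int numGenomes = 0 ;

    printf ( "How many genomes to compare (2-8)? " ) ;
    scanf ( "%d" , &numGenomes ) ;
    for ( int i = 0 ; i < numGenomes ; i++ ) {
        printf ( "Name of genome %d: " , i + 1 ) ;
        scanf ( "%39s" , names [i] ) ;
        genomes [i] = names [i] ;
    }
    buildQueue ( queue , sizeof queue , genomes , numGenomes ) ;
    queueItemsLeft = checkQueue ( queue ) ;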

Page 13: Master node start, continued…

if ( my_rank == MASTER_NODE ) {
    …
    /* Now that a queue (or a data structure of the like) contains the
       genomes to compare, they need to be sent to each node in order to
       start execution of the program. Here is just an example of how
       this task could be done. */
    int i ;
    for ( i = 1 ; i < numOfNodes ; i++ ) {
        // get an item from the queue… suppose it is a string value like
        // this: char genomes [40] = "genome1*genome2" ;
        MPI_Send ( genomes , strlen ( genomes ) + 1 , MPI_CHAR , i , TAG ,
                   MPI_COMM_WORLD ) ;
        // find out the number of items left in the queue and send that
        // number as well (note the & : MPI_Send expects a buffer address)
        MPI_Send ( &queueItemsLeft , 1 , MPI_INT , i , TAG + 1 ,
                   MPI_COMM_WORLD ) ;
    }
    start = 0 ;
    while ( !start && checkQueue ( queue ) ) {
        …
    }
} // end of if

Page 14: Receiving the message from the Master node

/* After the if statement has sent the parameters to all nodes, each of
   them needs to receive the messages independently. */
…
if ( my_rank == MASTER_NODE ) {
    …
}
else {
    MPI_Recv ( &queueItemsLeft , 1 , MPI_INT , 0 , TAG + 1 ,
               MPI_COMM_WORLD , &status ) ;
    …
}

Page 15: IF

/* Since each node now has the required parameters to start project18,
   we need to know when a node is finished or the program is finished. */
…
else {
    // right after the prior receive…
    while ( queueItemsLeft ) {
        MPI_Recv ( genomes , BUFFER , MPI_CHAR , 0 , TAG ,
                   MPI_COMM_WORLD , &status ) ;
        …
    }
} // end of else

Page 16: Project18 execution

/* Project18 will be executed independently on each machine. An output
   file and a completion code are created at the end of the execution. */
else {
    …
    while ( queueItemsLeft ) {
        MPI_Recv ( ) ;
        …
        // Project18 execution … all necessary code goes here.
        // Since the output text files are independent and no two
        // processors write to the same file, the I/O is carried out in
        // plain C without the use of MPI-IO.
        // At the end of the code, add the following in order to send a
        // completion code to node 0, the master node. Please be reminded
        // that this is just an example; in practice the completion code
        // could be changed:
        char completion [ ] = "Process Completed" ;
        char completionCode [BUFFER] ;  // sized to match the master's receive
        sprintf ( completionCode , "%s node%d_%s" , completion , my_rank ,
                  genomes ) ;
        MPI_Send ( completionCode , strlen ( completionCode ) + 1 , MPI_CHAR ,
                   0 , TAG + 2 , MPI_COMM_WORLD ) ;
        MPI_Recv ( &queueItemsLeft , 1 , MPI_INT , 0 , TAG + 1 ,
                   MPI_COMM_WORLD , &status ) ;
    } // end of while
} // end of else

Page 17: Master Node, 2nd Part

/* The master node is the administrator of each process being sent to a
   node. In the last slide, we saw that each node sends a message
   specifying a completion code. This part of the code shows how the
   master node is able to manage all tasks. */
if ( my_rank == MASTER_NODE ) {
    …
    while ( checkQueue ( queue ) ) {
        MPI_Recv ( completionCode , BUFFER , MPI_CHAR , MPI_ANY_SOURCE ,
                   TAG + 2 , MPI_COMM_WORLD , &status ) ;
        // this is a special function that will be implemented as a
        // self-healing application
        taskControl ( status.MPI_SOURCE , queue ) ;
        // assigns a new task to the node that is available to receive one
        assignSingleTasks ( status.MPI_SOURCE , queue ) ;
    }

Page 18: assignSingleTasks (…)

This function is meant to check the data structure holding the tasks and select the next "genome" parameter to be processed.

Once the genome parameter is selected, it is sent to the node that is available, which in this case is identified by status.MPI_SOURCE. A sketch of one possible implementation follows.
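A minimal sketch under the same assumptions as before (';'-separated queue string; the count-then-task send order mirrors the receives on Pages 14-16). It is illustrative, not the author's implementation:

/* Hypothetical assignSingleTasks: tell the node how many tasks remain
   (0 lets it leave its while loop), then pop the first "g1*g2" task
   off the ';'-separated queue and send it to that node. */
void assignSingleTasks ( int node , char *queue ) {
    int itemsLeft = checkQueue ( queue ) ;

    // the node tests this count before posting its next task receive
    MPI_Send ( &itemsLeft , 1 , MPI_INT , node , TAG + 1 , MPI_COMM_WORLD ) ;
    if ( itemsLeft == 0 )
        return ;

    char task [64] ;
    char *sep = strchr ( queue , ';' ) ;
    if ( sep != NULL ) {
        size_t len = ( size_t )( sep - queue ) ;
        if ( len >= sizeof task ) len = sizeof task - 1 ;
        memcpy ( task , queue , len ) ;
        task [len] = '\0' ;
        memmove ( queue , sep + 1 , strlen ( sep + 1 ) + 1 ) ; // drop popped task
    } else {
        strncpy ( task , queue , sizeof task - 1 ) ;
        task [ sizeof task - 1 ] = '\0' ;
        queue [0] = '\0' ;                                     // queue is now empty
    }

    MPI_Send ( task , strlen ( task ) + 1 , MPI_CHAR , node , TAG ,
               MPI_COMM_WORLD ) ;
}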

Page 19: A brief pseudocode of Project18

// libraries + definitions
#include …
…
// main
int main (…) {
    // variable definitions…
    if ( rank == MASTER_NODE ) {
        // ask user for input, create the queue, and initialize all tasks
        while ( /* there are more items left in the queue */ ) {
            // receive completion signals, keep fault control and task
            // control active, and assign newly available tasks to
            // available nodes
        } // end while
    } // end if
    else {
        // receive the number of items left in the queue
        while ( /* there are more items left */ ) {
            // receive the genome parameter from the master node
            // EXECUTE PROJECT18
            // create output files
            // submit completion code to node0
            // finally, wait to receive an updated queueItemsLeft
        } // end while
    } // end of else
    MPI_Finalize ( ) ;
} // end of main

void checkQueue (…) { /* checks for items in the queue or any other data structure */ }

void taskControl (…) { /* makes sure that each task is completed accordingly and is successful */ }

void assignSingleTasks (…) { /* finds the next available queue item and sends it to the available node for processing */ }

Page 20: Self-healing + self-optimization

void taskControl ( ) is the function that will be in charge of verifying that each node completes its assigned task. This function will keep track of all completion codes as well. In case of a malfunction or unsuccessful completion, this function will make sure that the queue item that was not completed gets carried over and sent to another node. A sketch of this idea appears below.
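A hedged sketch of that idea, with loudly labeled assumptions: the completion code received on Page 17 and a per-node record of the last assigned task are visible as file-scope buffers (the slides specify neither), and a failed task is simply re-appended to the ';'-separated queue:

/* Hypothetical taskControl: if the node's completion code does not
   carry the success marker sent on Page 16, push its unfinished task
   back onto the queue so it can be reassigned to another node.
   completionCode and currentTask are assumed file-scope buffers; the
   queue buffer is assumed large enough for the re-appended task. */
extern char completionCode [ ] ;      // filled by the master's MPI_Recv
extern char currentTask [ ] [ 64 ] ;  // task last assigned to each node

void taskControl ( int node , char *queue ) {
    if ( strstr ( completionCode , "Process Completed" ) != NULL )
        return ;                      // task finished normally

    // malfunction: carry the unfinished item over for another node
    if ( queue [0] != '\0' )
        strcat ( queue , ";" ) ;
    strcat ( queue , currentTask [node] ) ;
}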

There could also be a function that oversees the functionality and processing of each node and its communications with the Master node. If there is bottlenecking, this function could provide support by changing the communication to be asynchronous instead of synchronous, as sketched below.
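One way such a change could look (an illustrative sketch, not part of the slides): the master swaps its blocking MPI_Recv for a nonblocking MPI_Irecv and polls with MPI_Test, doing other bookkeeping between polls:

/* Sketch: asynchronous version of the master's completion-code receive.
   Post a nonblocking receive, then poll it, doing other useful work
   (node monitoring, queue upkeep) while no message has arrived yet. */
MPI_Request request ;
int done = 0 ;

MPI_Irecv ( completionCode , BUFFER , MPI_CHAR , MPI_ANY_SOURCE , TAG + 2 ,
            MPI_COMM_WORLD , &request ) ;
while ( !done ) {
    MPI_Test ( &request , &done , &status ) ;
    if ( !done ) {
        // no completion code yet: monitor nodes, tidy the queue, etc.
    }
}
// status.MPI_SOURCE identifies the reporting node, exactly as before
taskControl ( status.MPI_SOURCE , queue ) ;
assignSingleTasks ( status.MPI_SOURCE , queue ) ;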

Page 21: Important thoughts

This roadmap presents, in a "simple" manner, how the program could be parallelized. It does not take into account any runtime challenges or other types of issues.

Please keep in mind that this design could always be modified for a better one.

Your input is surely appreciated