Parallel CORBA Objects
May 22nd, 2000
ARC « Couplage »
Christophe René (IRISA/IFSIC)
Contents
Introduction
Parallel CORBA object concept
Performance evaluation
Encapsulation example
Conclusion
Introduction
Objective
• To design a Problem Solving Environment able to integrate a large number of codes aiming at simulating a physical problem
• To perform multi-physics simulations (code coupling)
Constraints
• Simulation codes may be located on different machines → distributed processing
• Simulation codes may require high performance computers → parallel processing
Approach
• Combine parallel and distributed technologies using a component approach (MPI + CORBA)
CORBA Generalities
CORBA: Common Object Request Broker Architecture
Open standard for distributed object computing by the OMG
• Software bus, object oriented
• Remote invocation mechanism
• Hardware, operating system and programming language independence
• Vendor independence (interoperability)
Problems to face
• Performance issues
• Poor integration of high performance computing environments with CORBA
How does CORBA work ?
Interface Definition Language (IDL): describes the remote object
IDL compiler: generates stub and skeleton code
IDL stub (proxy): handles remote invocation
IDL skeleton: links the object implementation and the ORB

interface MatrixOperations {
  const long SIZE = 100;
  typedef double Vector[ SIZE ];
  typedef double Matrix[ SIZE ][ SIZE ];
  void multiply( in Matrix A, in Vector B, out Vector C );
};
[Diagram: the client invokes the object through the IDL stub; the request crosses the Object Request Broker (ORB) to the IDL skeleton and Object Adapter (OA), which deliver it to the object implementation on the server. The IDL compiler generates both stub and skeleton.]
Encapsulating MPI-based parallel codes into CORBA objects
Master/slave approach
• One SPMD code acts as the master whereas the others act as slaves
• The master drives the execution of the slaves through message-passing
Drawbacks
• Lack of scalability when communicating through the ORB
• Needs modifications to the original MPI code
Advantage
• Can be used with any CORBA implementation
[Diagram: the encapsulated MPI code consists of one MPI master process, which carries the CORBA stub, skeleton, OA and a scheduler, and several MPI slave processes; all processes run the SPMD code over the MPI communication layer, and the client reaches the master through the CORBA ORB.]
The master has to:
• select the method to invoke within the slave processes
• scatter data to the slave processes
• gather data from the slave processes
Master process: CORBA + MPI initialization
Slave processes: MPI initialization
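As a sketch of the scatter/gather bookkeeping the master performs, the fragment below splits an array into contiguous chunks for the slaves and merges partial results back. A real master would use MPI collectives (e.g. MPI_Scatterv/MPI_Gatherv); the helper names here are ours, not PaCO's, and the chunk boundaries follow the BLOCK rule given later in the slides.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Split `data` into nproc contiguous chunks, using the default block size
// block = (length + nproc - 1) / nproc, as defined later for BLOCK.
std::vector<std::vector<double>> scatter_block(const std::vector<double>& data,
                                               std::size_t nproc) {
    std::size_t block = (data.size() + nproc - 1) / nproc;
    std::vector<std::vector<double>> chunks(nproc);
    for (std::size_t i = 0; i < data.size(); ++i)
        chunks[i / block].push_back(data[i]);
    return chunks;
}

// Concatenate the chunks back into one array (what the master does with
// the slaves' partial results before replying through the ORB).
std::vector<double> gather_block(const std::vector<std::vector<double>>& chunks) {
    std::vector<double> out;
    for (const auto& c : chunks)
        out.insert(out.end(), c.begin(), c.end());
    return out;
}
```

Gathering the chunks of a scattered array reproduces the original array, which is exactly the round trip the master mediates between the ORB and the slaves.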
Master / Slave approach in details
[Diagram: the client's request travels through the CORBA ORB to the MPI master process (stub, skeleton, OA, scheduler), which dispatches (Disp.) the work to the MPI slave processes; every process runs the SPMD code over the MPI communication layer.]
Parallel CORBA object concept
A collection of identical CORBA objects
Transparent to the client:
• parallel remote invocation
• data distribution
[Diagram: a sequential client invokes, through a single stub, a parallel server (the parallel CORBA object): several CORBA objects, each with its own skeleton and OA, running the SPMD code over the MPI communication layer, all reachable through the CORBA ORB.]
Problems to face
Communication between:
• a sequential client and a parallel server
• a parallel client and a sequential server
• a parallel client and a parallel server
Implementation constraint
• Do not modify the ORB core, to keep interoperability features
Approach
• Modify stub and skeleton code
• Extend the IDL compiler
Extended-IDL
Collection specification
• Size specification = number of requests to send
• Shape specification, used to distribute arrays
Data distribution specification
• Scatter and gather elements of an array
Reduction operator specification
• Perform collective operations using request replies
Specifying number of objects in a collection
Several ways:
• integer value
• interval of integer values
• mathematical function (power, exponential, multiple)
• character “*”
interface[ 4 ] Example1 { /* ... */};
interface[ 2 .. 8 ] Example2 { /* ... */};
interface[ 2 ^ n ] Example3 { /* ... */};
interface[ * ] Example4 { /* ... */};
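The checks the Extended-IDL compiler and runtime would have to perform on a collection size can be sketched as below. These helper names are ours, purely illustrative; only the four kinds of specification come from the slides.

```cpp
#include <cassert>
#include <cstdint>

// interface[ 4 ] : exact size
bool matches_exact(std::uint32_t n, std::uint32_t spec) {
    return n == spec;
}

// interface[ 2 .. 8 ] : interval of integer values
bool matches_interval(std::uint32_t n, std::uint32_t lo, std::uint32_t hi) {
    return lo <= n && n <= hi;
}

// interface[ 2 ^ n ] : any power of `base`
bool matches_power(std::uint32_t n, std::uint32_t base) {
    if (base < 2) return n == 1;
    while (n > 1 && n % base == 0) n /= base;
    return n == 1;
}

// interface[ * ] : any size
bool matches_any(std::uint32_t) {
    return true;
}
```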
Shape depends on data distribution specification, but users may add special requirements
Shape of the object collection
How can we organize 8 objects?
Shape of the object collection (cont’d)
Specification of the shape:
• size of one dimension
  – integer value
  – mathematical function (multiple)
  – dependence between dimensions
interface[ 8: 2, 4 ] Example1 { /* ... */};
interface[ *: 2 ] Example2 { /* ... */};
interface[ *: *, 2 ] Example3 { /* ... */};
interface[ *: 2 * n ] Example4 { /* ... */};
interface[ x ^ 2: n, n ] Example5 { /* ... */};
Inheritance mechanism
Under some constraints:
• the numbers of processors must match
• the shapes of the virtual node arrays must match
interface[ * ] Example1 { /* ... */ };
interface[ 2 ^ n ] Example2 : Example1 { /* ... */ };
→ inheritance allowed

interface[ 2 ^ n ] Example1 { /* ... */ };
interface[ * ] Example2 : Example1 { /* ... */ };
→ inheritance not allowed

interface[ * ] Example1 { /* ... */ };
interface[ * : 2 ] Example2 : Example1 { /* ... */ };
→ inheritance allowed

interface[ *: 2 ] Example1 { /* ... */ };
interface[ *: 3 ] Example2 : Example1 { /* ... */ };
→ inheritance not allowed
Specifying Data distribution
New keyword: dist
Only arrays and sequences may be distributed
Available distribution modes: BLOCK, BLOCK( size ), CYCLIC, CYCLIC( size ), “*”

interface[ * ] Example {
  typedef double Arr1[ 8 ];
  typedef Arr1 Arr2[ 8 ];
  typedef sequence< double > Seq;
  void Op1( in dist[ CYCLIC ] Arr1 A, in Arr1 B,
            out dist[ BLOCK ][ * ] Arr2 C );
  void Op2( in dist[ BLOCK ] Seq A, inout Seq B );
};
Default values:
BLOCK = BLOCK( BlockSize ), where BlockSize = ( ArrayLength + ProcNb - 1 ) / ProcNb
CYCLIC = CYCLIC( 1 )
[Figure: distribution examples on 2 processors (0 and 1), showing BLOCK( 5 ) and CYCLIC( 3 ) layouts of an array.]
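The owner of each element under these two modes follows directly from the definitions above. A minimal sketch, assuming 0-based element and processor indices (helper names are ours):

```cpp
#include <cassert>
#include <cstddef>

// BLOCK: contiguous blocks of size (ArrayLength + ProcNb - 1) / ProcNb.
std::size_t block_owner(std::size_t index, std::size_t length,
                        std::size_t nproc) {
    std::size_t block = (length + nproc - 1) / nproc;  // default BlockSize
    return index / block;
}

// CYCLIC( chunk ): chunks of `chunk` elements dealt out round-robin.
// CYCLIC alone is CYCLIC( 1 ).
std::size_t cyclic_owner(std::size_t index, std::size_t chunk,
                         std::size_t nproc) {
    return (index / chunk) % nproc;
}
```

For a 10-element array on 2 processors, BLOCK gives elements 0..4 to processor 0 and 5..9 to processor 1, while CYCLIC( 3 ) alternates 3-element chunks between them.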
Mapping
Vector distribution on a processor matrix

interface[ * ] Example {
  typedef double Arr[ 8 ];
  void Op1( in dist[ BLOCK ] Arr A );
  void Op2( in dist[ BLOCK, 2 ] Arr A );
};
[Figure: resulting layouts of A for Op1 and Op2.]
Extended-IDL specification
Mapping (cont’d)

typedef double Arr1[ 8 ];
typedef Arr1 Arr2[ 8 ];

interface[ * ] Example1 {
  void Op1( in dist[ * ][ CYCLIC ] Arr2 A );
  void Op2( in dist[ CYCLIC, 2 ][ * ] Arr2 A );
};
→ specification allowed

interface[ * ] Example2 {
  void Op1( in dist[ BLOCK, 2 ][ BLOCK, 1 ] Arr2 A,
            out dist[ BLOCK ][ BLOCK ] Arr2 B );
};
→ specification allowed

interface[ * ] Example3 {
  void Op2( in dist[ CYCLIC, 2 ][ CYCLIC ] Arr2 A );
};
→ specification not allowed
Reduction operators
Reduction operators available:
• min, max
• addition (sum), multiplication (prod)
• bitwise operations (and, or, xor)
• logical operations (and, or, xor)

interface[ * ] Example1 {
  typedef double Arr[ 8 ];
  cland boolean Op1( in dist[ BLOCK ] Arr A, in double B );
  void Op2( in dist[ CYCLIC ] Arr A, inout cmin double B );
  void Op3( in dist[ CYCLIC( 3 ) ] Arr A, out csum double B );
};
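A reduction-qualified result means each object of the collection returns a partial value and the stub combines the replies before handing one value to the client. A minimal sketch of that combining step (illustrative names, not the generated code):

```cpp
#include <algorithm>
#include <cassert>
#include <limits>
#include <numeric>
#include <vector>

// csum: sum the replies of all objects in the collection.
double reduce_csum(const std::vector<double>& replies) {
    return std::accumulate(replies.begin(), replies.end(), 0.0);
}

// cmin: keep the smallest reply.
double reduce_cmin(const std::vector<double>& replies) {
    double best = std::numeric_limits<double>::infinity();
    for (double r : replies) best = std::min(best, r);
    return best;
}
```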
interface MatrixOperations {
  const long SIZE = 100;
  typedef double Vector[ SIZE ];
  typedef double Matrix[ SIZE ][ SIZE ];
  void multiply( in Matrix A, in Vector B, out Vector C );
  double minimum( in Vector A );
};
interface[ * ] MatrixOperations {
  const long SIZE = 100;
  typedef double Vector[ SIZE ];
  typedef double Matrix[ SIZE ][ SIZE ];
  void multiply( in Matrix A, in Vector B, out Vector C );
  double minimum( in Vector A );
};
Collection specification
interface[ * ] MatrixOperations {
  const long SIZE = 100;
  typedef double Vector[ SIZE ];
  typedef double Matrix[ SIZE ][ SIZE ];
  void multiply( in dist[ BLOCK ][ * ] Matrix A, in Vector B,
                 out dist[ BLOCK ] Vector C );
  double minimum( in Vector A );
};
Collection specification + data distribution specification
interface[ * ] MatrixOperations {
  const long SIZE = 100;
  typedef double Vector[ SIZE ];
  typedef double Matrix[ SIZE ][ SIZE ];
  void multiply( in dist[ BLOCK ][ * ] Matrix A, in Vector B,
                 out dist[ BLOCK ] Vector C );
  csum double minimum( in Vector A );
};
Collection specification + data distribution specification + reduction operator specification
Summary
Code Generation Problems
New type for distributed parameters: the distributed array
• amount of data to be sent to remote objects is known at runtime
• an extension of the CORBA sequence
• data distribution specification stored in distributed arrays
Skeleton code generation
• provide access to the data distribution specification
Stub code generation
• scatter and gather data among remote objects
• manage remote operation invocations
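The distributed-array type described above can be pictured as follows. This is an illustrative sketch with our own names, not PaCO's generated types: like a CORBA sequence it carries a runtime length, but it also records how its elements are distributed, so stub and skeleton can scatter, gather and redistribute them.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// One distribution mode per array dimension, mirroring the Extended-IDL
// dist[...] qualifiers (BLOCK, CYCLIC, or "*" for no distribution).
enum class DistMode { Block, Cyclic, None };

struct DistSpec {
    DistMode mode;      // which distribution applies to this dimension
    std::size_t chunk;  // block/cycle size; 0 means the default
};

template <typename T>
struct DArray {
    std::vector<T> local;        // the locally owned elements (runtime-sized)
    std::size_t global_length;   // length of the whole distributed array
    std::vector<DistSpec> dist;  // one specification per array dimension
};
```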
Stub code generation

Client code:
void multiply( const Matrix A, const Vector B, Vector C );
...
pco->multiply( A, B, C );
...

Extended-IDL specification:
void multiply( in dist[ BLOCK ][ * ] Matrix A,
               in Vector B,
               out dist[ BLOCK ] Vector C );

Generated stub entry points:
void multiply( const Matrix_Seq A, const Vector_Seq B, Vector_Seq C );
void multiply( const Matrix_DArray A, const Vector_DArray B, Vector_DArray C );

[Diagram: the stub turns one client invocation into one request per member of the parallel CORBA object (#1, #2), scattering A and C according to their distribution and sending B whole; each request reaches its skeleton/OA pair and SPMD code through the CORBA ORB, over the MPI communication layer.]
Parallel CORBA Object as client
Stub code generation when the client is parallel:
• assignment of remote object references to the stubs
• use of distributed data types as operation parameters in the stubs
• exchange of data through MPI by the stubs, to build requests and to propagate results
[Diagram: a parallel client (SPMD codes, each with its own stub, over the MPI communication layer) invokes a sequential server (skeleton, OA, object implementation) through the CORBA ORB.]
Parallel CORBA Object as client (cont’d)
Only one process has a lot of work:
• gather distributed data from the other processes
• send the single request
• scatter distributed data to the other processes
• broadcast the value of non-distributed data
[Diagram: a parallel client, a parallel CORBA object of size n (SPMD codes with stubs over the MPI communication layer), invokes a parallel server, a parallel CORBA object of size p (SPMD codes with skeletons and OAs), through the CORBA ORB.]
Parallel CORBA Object as client (cont’d)
p requests are dispatched among the n objects (cyclic distribution):
• p < n: data distribution handled by the stub
• p > n: data distribution handled by the skeleton
• p = n: user’s choice
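The cyclic dispatch above can be sketched in a few lines (helper names are ours): request i goes to object i mod n, so every object receives either ceil(p / n) or floor(p / n) requests.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Cyclic distribution of requests over the object collection.
std::size_t target_object(std::size_t request, std::size_t n_objects) {
    return request % n_objects;
}

// How many of the p requests land on each of the n objects.
std::vector<std::size_t> requests_per_object(std::size_t p, std::size_t n) {
    std::vector<std::size_t> count(n, 0);
    for (std::size_t i = 0; i < p; ++i)
        ++count[target_object(i, n)];
    return count;
}
```

With p = 5 requests and n = 4 objects, object 0 receives two requests and the others one each, which is the p > n case where the skeleton has to finish the data distribution.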
Naming Service
Currently (as defined by the OMG):
• provides methods to access a remote object through a symbolic name
• associates a symbolic name with one and only one object reference
Our needs:
• associate a symbolic name with a collection of object references
Implementation constraint: the object reference to the standard Naming Service and to the Parallel Naming Service must be the same:
orb->resolve_initial_references( “NameService” );
Our solution: add new methods to the Naming Service interface
Server side:
Example_impl* obj = new Example_impl();
NamingService->join_collection( Matrix_name, obj );
...
NamingService->leave_collection( Matrix_name, obj );

Client side:
objs = NamingService->resolve_collection( Matrix_name );
srv = Example::_narrow( objs );
...
srv->op1( A, B, C );
module CosNaming {
  ...
  interface NamingContext {
    ...
    typedef sequence< Object > ObjectCollection;
    void join_collection( in Name n, in Object obj );
    void leave_collection( in Name n, in Object obj );
    ObjectCollection resolve_collection( in Name n );
  };
};
Extension to the CosNaming IDL specification
Extension to the Naming Service
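The semantics of the three new operations (a name maps to a collection of references instead of exactly one) can be illustrated with a toy in-memory registry. This is not the CosNaming implementation, just the bookkeeping it implies; object references are modelled as plain strings.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <string>
#include <vector>

using ObjectRef = std::string;  // stand-in for a CORBA object reference

class CollectionRegistry {
    std::map<std::string, std::vector<ObjectRef>> table_;
public:
    // join_collection: add one more reference under the symbolic name.
    void join_collection(const std::string& name, const ObjectRef& obj) {
        table_[name].push_back(obj);
    }
    // leave_collection: remove a reference from the named collection.
    void leave_collection(const std::string& name, const ObjectRef& obj) {
        auto& v = table_[name];
        v.erase(std::remove(v.begin(), v.end(), obj), v.end());
    }
    // resolve_collection: return every reference bound to the name.
    std::vector<ObjectRef> resolve_collection(const std::string& name) const {
        auto it = table_.find(name);
        return it == table_.end() ? std::vector<ObjectRef>{} : it->second;
    }
};
```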
Implementation
Using the MICO implementation of CORBA
Library (not included in the ORB core)
• parallel CORBA object base class
• functions to handle distributed data
• data redistribution library interface
Extended-IDL compiler (extension of the MICO IDL compiler)
• parser, semantic analyzer, code generator
Experimental platform
• cluster of PCs
• parallel machine (NEC Cenju)
[Graph: throughput (Mb/s, 0 to 90) versus message size (0 to 80000 bytes) for MPI and CORBA.]
Comparison between CORBA and MPI
Benchmark: send / receive
Platform: 2 bi-Pentium III 500 MHz machines, 100 Mb/s Ethernet
Latency: MPI: 0.35 ms; CORBA: 0.52 ms
Differences due to: protocol, memory allocation

interface Bench {
  typedef sequence< long > Vector;
  void sendrecv( in Vector in_a, out Vector out_a );
};
Performance evaluation (cont’d)
Four experiments:
• Parallel CORBA object, through the ORB
[Diagram: two parallel CORBA objects (Code 1 and Code 2), each a set of SPMD codes with skeleton, OA and object implementation over an MPI communication layer, coupled through the CORBA ORB by a parallel client (Stub 1, Stub 2) with a parallel scheduler.]
• Master/slave, through file exchange (ASCII file, XDR file) or through the ORB
[Diagram: two encapsulated MPI codes (Code 1 and Code 2), each with an MPI master process (stub, skeleton, OA) and MPI slave processes running the SPMD code, coupled through the CORBA ORB by a client (Stub 1, Stub 2) with a scheduler.]
Performance evaluation (cont’d)
[Graphs: execution time (ms) versus number of objects in the collection (1, 2, 4, 8) for the ORB, PaCO, ASCII and XDR couplings; matrix order = 256, element type = long. Reported throughputs: 224 Mb/s, 56 Mb/s, 47 Mb/s and 37 Mb/s.]
Performance evaluation (cont’d)
[Graphs: execution time (ms) versus number of objects in the collection (1, 2, 4, 8) for the ORB and PaCO couplings; matrix order = 512 (reported throughput: 258 Mb/s) and matrix order = 1024 (reported throughput: 293 Mb/s).]
int main( int argc, char* argv[] )
{
  /* ... */
  MPI_Init( &argc, &argv );
  MPI_Comm_rank( MPI_COMM_WORLD, &id );
  MPI_Comm_size( MPI_COMM_WORLD, &size );
  /* ... */
  MPI_Send( ... );
  MPI_Recv( ... );
  /* ... */
  MPI_Finalize();
}
1. Adapt the original source code
New code:
int main( int argc, char* argv[] )
{
  /* ... */
  /* MPI_Init( &argc, &argv ); */
  MPI_Comm_rank( MPI_COMM_WORLD, &id );
  MPI_Comm_size( MPI_COMM_WORLD, &size );
  /* ... */
  MPI_Send( ... );
  MPI_Recv( ... );
  /* ... */
  /* MPI_Finalize(); */
}
Remove the invocations of MPI_Init() and MPI_Finalize()
Encapsulation example
Original code:
int main( int argc, char* argv[] )
{
  /* ... */
  MPI_Init( &argc, &argv );
  MPI_Comm_rank( MPI_COMM_WORLD, &id );
  MPI_Comm_size( MPI_COMM_WORLD, &size );
  /* ... */
  MPI_Send( ... );
  MPI_Recv( ... );
  /* ... */
  MPI_Finalize();
}

New code:
int app_main( int argc, char* argv[] )
{
  /* ... */
  /* MPI_Init( &argc, &argv ); */
  MPI_Comm_rank( MPI_COMM_WORLD, &id );
  MPI_Comm_size( MPI_COMM_WORLD, &size );
  /* ... */
  MPI_Send( ... );
  MPI_Recv( ... );
  /* ... */
  /* MPI_Finalize(); */
}
Rename the main function
Encapsulation example (cont’d)
typedef sequence< string > arg_type;

interface[ * ] Wrapper {
  void compute( in string directory, in arg_type arguments );
  void stop();
};
IDL interface

2. Define the IDL interface
void Wrapper_impl::compute( const char* directory, const arg_type_DArray& arguments )
{
  int argc = arguments.length() + 1;
  char** argv = new char*[ argc ];

  argv[ 0 ] = strdup( "..." ); /* Application name */
  for( int i0 = 1; i0 < argc; ++i0 )
    argv[ i0 ] = strdup( arguments[ i0 - 1 ] );

  chdir( directory );
  app_main( argc, argv );

  for( int i1 = 0; i1 < argc; ++i1 )
    free( argv[ i1 ] );
  delete [] argv;
}
Object implementation
3. Write method implementation
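The argv bookkeeping performed by compute() can be exercised on its own, without the CORBA and chdir() parts. A self-contained sketch (run_wrapped and the placeholder app_main body are ours; only the argv construction mirrors the slide):

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>
#include <string>
#include <vector>

// Stand-in for the renamed entry point of the encapsulated MPI code.
// Placeholder body: a real code would run the simulation.
int app_main(int argc, char* argv[]) {
    return argc;
}

// Build a C-style argv from the application name and an argument list,
// call the wrapped entry point, then release every duplicated string.
int run_wrapped(const std::string& app_name,
                const std::vector<std::string>& arguments) {
    int argc = static_cast<int>(arguments.size()) + 1;
    char** argv = new char*[argc];
    argv[0] = strdup(app_name.c_str());            // argv[0] = program name
    for (int i = 1; i < argc; ++i)
        argv[i] = strdup(arguments[i - 1].c_str());
    int rc = app_main(argc, argv);
    for (int i = 0; i < argc; ++i)
        free(argv[i]);                             // strdup'd with malloc
    delete[] argv;
    return rc;
}
```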
Encapsulation example (cont’d)
int main( int argc, char* argv[] )
{
  /* ... */
  PaCO_DL_init( &argc, &argv );
  orb = ORB_init( argc, argv );
  /* ... */
  srv = new Wrapper_impl();
  /* ... */
  orb->run();
  /* ... */
  PaCO_DL_exit();
}
Server

4. Write the main function of the server
int main( int argc, char* argv[] )
{
  /* ... */
  arg_type arguments;
  arguments.length( 2 );
  arguments[ 0 ] = strdup( "..." ); /* arg 1 */
  arguments[ 1 ] = strdup( "..." ); /* arg 2 */
  pos->compute( working_directory, arguments );
  /* ... */
}
Client

5. Write the client
Conclusion
We have shown that MPI and CORBA can be combined for distributed and parallel programming
The implementation depends on the underlying CORBA product: a standardized API for the ORB is needed
Response to the OMG RFI “Supporting Aggregated Computing in CORBA”