© 2015 Cisco and/or its affiliates. All rights reserved. Cisco Public
(Very) Loose Proposal to Revamp MPI_INIT and MPI_FINALIZE
These are the kinds of crazy ideas that we discuss at the MPI Forum
Jeffrey M. Squyres
Cisco Systems
23 September 2015
Before MPI-3.1, this could be erroneous
int my_thread1_main(void *context) {
    MPI_Initialized(&flag);
    // …
}

int my_thread2_main(void *context) {
    MPI_Initialized(&flag);
    // …
}

int main(int argc, char **argv) {
    MPI_Init_thread(…, MPI_THREAD_FUNNELED, …);
    pthread_create(…, my_thread1_main, NULL);
    pthread_create(…, my_thread2_main, NULL);
    // …
}

The two MPI_Initialized calls might run at the same time (!)
The MPI-3.1 solution
• MPI_INITIALIZED (and friends) are allowed to be called at any time
  …even by multiple threads
  …regardless of MPI_THREAD_* level
• This is a simple, easy-to-explain solution
  And probably what most applications do, anyway
• But many other paths were investigated
MPI_INIT / FINALIZE limitations
• Cannot call MPI_INIT more than once
• Cannot set error behavior of MPI_INIT
• Cannot re-initialize MPI after it has been finalized
• Cannot init MPI from different entities within a process without a priori knowledge / coordination

MPI Process
// Library 1
MPI_Initialized(&flag);
if (!flag) MPI_Init(…);

// Library 2
MPI_Initialized(&flag);
if (!flag) MPI_Init(…);
THIS IS INSUFFICIENT / POTENTIALLY ERRONEOUS
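One way to see why the check-then-init pattern is insufficient is to replay the unlucky interleaving by hand. The sketch below is plain C with no MPI in it: `toy_initialized` and `toy_init` are hypothetical stand-ins for MPI_INITIALIZED and MPI_INIT, and the two "libraries" are simulated in a single thread so the outcome is deterministic.

```c
#include <assert.h>

/* Hypothetical stand-ins for MPI state; not part of any MPI library. */
static int toy_is_initialized = 0;  /* what the stand-in "initialized" query reports */
static int toy_init_calls = 0;      /* how many times the stand-in init ran */

/* Stand-in for MPI_Initialized: reports the current state through flag. */
void toy_initialized(int *flag) {
    *flag = toy_is_initialized;
}

/* Stand-in for MPI_Init: counts every call and marks the state initialized. */
void toy_init(void) {
    toy_init_calls++;
    toy_is_initialized = 1;
}

/* Replays one bad interleaving: both libraries check before either inits.
   Returns the number of times the stand-in init was called. */
int replay_check_then_init(void) {
    int flag1, flag2;

    toy_is_initialized = 0;
    toy_init_calls = 0;

    toy_initialized(&flag1);   /* Library 1 checks: not initialized yet */
    toy_initialized(&flag2);   /* Library 2 checks: still not initialized */
    if (!flag1) toy_init();    /* Library 1 initializes */
    if (!flag2) toy_init();    /* Library 2 initializes a second time */

    return toy_init_calls;
}
```

Calling `replay_check_then_init()` returns 2: the init routine ran twice, which is exactly what MPI_INIT forbids. With real threads the same ordering can happen by chance, because the check and the init are two separate steps.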
1994 called.
They want their API design back.
What we should have
• Call MPI_INIT as many times as you like
• By whomever wants to call it

MPI Process
// Library 1
MPI_Init(…);
// Library 2
MPI_Init(…);
// Library 3
MPI_Init(…);
// Library 4
MPI_Init(…);
// Library 5
MPI_Init(…);
// Library 6
MPI_Init(…);
// Library 7
MPI_Init(…);
// Library 8
MPI_Init(…);
// Library 9
MPI_Init(…);
// Library 10
MPI_Init(…);
// Library 11
MPI_Init(…);
// Library 12
MPI_Init(…);
…but that has its own complications
Do you have to call MPI_FINALIZE exactly that many times?
Do you allow MPI_INIT after MPI_FINALIZE?
Or perhaps you only allow MPI_INIT before MPI has been finalized?
How can you tell if it’s safe to call MPI_INIT? Atomic “test-and-init”?
I IS CONFUSED
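One possible answer to the "atomic test-and-init" question is a reference count behind a lock. The sketch below is an assumption for illustration, not something the slides propose: `counted_init` and `counted_finalize` are hypothetical names, the real setup and teardown are reduced to comments, and a POSIX mutex makes the test and the init a single indivisible step. It takes the "finalize exactly as many times as you init" position and allows re-initialization after the last finalize.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical process-wide state; none of this is real MPI API. */
static pthread_mutex_t counted_lock = PTHREAD_MUTEX_INITIALIZER;
static int counted_users = 0;   /* number of init calls not yet finalized */

/* Test-and-init under one lock.
   Returns 1 if this call performed the real initialization, 0 otherwise. */
int counted_init(void) {
    int did_init = 0;

    pthread_mutex_lock(&counted_lock);
    if (counted_users == 0) {
        /* Real one-time setup would go here. */
        did_init = 1;
    }
    counted_users++;
    pthread_mutex_unlock(&counted_lock);

    return did_init;
}

/* Returns 1 if this call performed the real teardown, 0 if other users
   remain, and -1 if there was no matching init. */
int counted_finalize(void) {
    int result = 0;

    pthread_mutex_lock(&counted_lock);
    if (counted_users == 0) {
        result = -1;            /* unmatched finalize */
    } else if (--counted_users == 0) {
        /* Real teardown would go here. */
        result = 1;
    }
    pthread_mutex_unlock(&counted_lock);

    return result;
}
```

Each library can call `counted_init()` without checking first and later call `counted_finalize()` once. Only the first init and the last finalize do real work. This answers some of the questions above by fiat, which is part of why the slides look for a different abstraction.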
We need something new
The following are just (incomplete) crazy ideas
WARNING!
New MPI concept: a session
int my_thread1_main(void *context) {
    MPI_Session session;
    MPI_Session_create(…, &session);

    // Do MPI things

    MPI_Session_free(&session);
}

int my_thread2_main(void *context) {
    MPI_Session session;
    MPI_Session_create(…, &session);

    // Do MPI things

    MPI_Session_free(&session);
}

int main(int argc, char **argv) {
    pthread_create(…, my_thread1_main, NULL);
    pthread_create(…, my_thread2_main, NULL);
    …
}
Now featuring
100% less MPI_INIT!
Create communicators from sessions
int my_thread1_main(void *context) {
    MPI_Session session;
    MPI_Comm comm;
    MPI_Session_create(&session);
    MPI_Comm_create_from_session(session, &comm);

    // Do MPI things with comm

    MPI_Comm_free(&comm);
    MPI_Session_free(&session);
}

int my_thread2_main(void *context) {
    MPI_Session session;
    MPI_Comm comm;
    MPI_Session_create(&session);
    MPI_Comm_create_from_session(session, &comm);

    // Do MPI things with comm

    MPI_Comm_free(&comm);
    MPI_Session_free(&session);
}
Problems that sessions solve
Each entity (library?) in an OS process can have its own session
Any session-local state can be encapsulated in the handle
Entities can create / destroy sessions at any time …in any thread
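The encapsulation point can be illustrated with a toy handle. This is a sketch under assumptions, not the proposed API: `toy_session`, `toy_session_create`, and `toy_session_free` are hypothetical names, and the per-session error-behavior field is just one example of state that could live in a handle instead of in process-wide globals.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical per-session setting; real MPI error handlers work differently. */
typedef enum { TOY_ERRORS_ABORT, TOY_ERRORS_RETURN } toy_errmode;

/* All state belongs to the handle, so two libraries never share it. */
typedef struct toy_session {
    toy_errmode errmode;   /* chosen at creation time by the owning entity */
    int         ncomms;    /* communicators derived from this session */
} toy_session;

/* Creates an independent session. Returns 0 on success, -1 on failure. */
int toy_session_create(toy_errmode errmode, toy_session **session) {
    toy_session *s = malloc(sizeof *s);
    if (s == NULL) return -1;
    s->errmode = errmode;
    s->ncomms = 0;
    *session = s;
    return 0;
}

/* Frees a session and resets the caller's handle to NULL.
   Returns 0 on success, -1 if there was nothing to free. */
int toy_session_free(toy_session **session) {
    if (session == NULL || *session == NULL) return -1;
    free(*session);
    *session = NULL;
    return 0;
}
```

Two libraries can each create a session with different settings and free it on their own schedule. Freeing one leaves the other untouched, which is the property the check-then-init pattern cannot offer.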
…but what about MPI_COMM_WORLD?
MPI_COMM_WORLD. Sigh.
• When is MPI_COMM_WORLD created (and/or initialized)?
• When is MPI_COMM_WORLD destroyed?
• Can you use MPI_COMM_WORLD with any session?
There doesn’t seem to be an obvious relation between MCW and individual sessions (ditto for MPI_COMM_SELF)
What if we get rid of MPI_COMM_WORLD?
Problems that solves
• Addresses logical inconsistency with session concept
• Clean separation of communicators between sub-entities
  …maybe slightly better than we have it today (sub-entities dup’ing COMM_WORLD)
• Side effects:
  Fault tolerance issues become easier
  Opens some possibilities for scalability improvements
Problems that creates
• Users will riot
…but what if they don’t?
Open questions
• What would be the forward / backward compatibility strategy?
  E.g., deprecate INIT, FINALIZE, INITIALIZED, FINALIZED…?
• What are the other arguments to MPI_SESSION_CREATE?
• Can you call both MPI_INIT and MPI_SESSION_CREATE in the same process?
• Can you do anything else with a session?
Sooo… what happens next?
Come to MPI Forum meetings
Discuss this and other scintillating MPI topics
Thank you.