XTREEMOS APPLICATION EXECUTION MANAGEMENT: A SCALABLE APPROACH
Ramon Nou, Jacobo Giralt, Julita Corbalan, Enric Tejedor, J. Oriol Fitó, Josep M. Perez, Toni Cortes
Barcelona Supercomputing Center (BSC – CNS)
XtreemOS is funded by the European Commission through the Information Society Technology under contract IST-FP6-033576.
Outline
• XtreemOS Overview
• Application Execution Manager
• Job Execution Flow
• Monitoring
• Performance and scalability
  • Job Execution
  • Job Status
• Future
XtreemOS overview
• What is it?
  • A Linux-based operating system that supports Virtual Organizations for Grids.
• Several layers
XtreemOS overview
• Some key features:
  • Makes the Grid as easy to use as Linux.
  • Highly scalable.
  • Fault tolerant.
  • Able to run interactive jobs.
  • Extensible.
• Three node types (can be replicated):
  • Core
  • Resource
  • Client
Application Execution Manager
• Job management, monitoring, and resource management.
• Access point to submit and control jobs.
• Distributed and asynchronous.
• Extensible.
• Linux concepts in the Grid world:
  • Process-thread paradigm.
  • Signals.
Application Execution Manager
• Several distributed services:
  • Job Manager.
  • Execution Manager.
  • Reservation Manager.
  • …
• Semantics:
  • JobUnit: the set of processes of a Job running on one resource.
  • Job: a set of JobUnits, identified by a JobID. [Process-Thread]
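The Job/JobUnit semantics can be sketched as a small data model. This is a minimal illustration only: the class names, fields (`resource`, `pids`, `units`) and the `add_process` helper are assumptions made for the example, not the XtreemOS API.

```python
from dataclasses import dataclass, field


@dataclass
class JobUnit:
    """Processes of one job that run on a single resource (hypothetical model)."""
    resource: str                                   # resource hosting these processes
    pids: list[int] = field(default_factory=list)   # process IDs on that resource


@dataclass
class Job:
    """A set of JobUnits identified by a JobID, like a process owning threads."""
    job_id: str
    units: dict[str, JobUnit] = field(default_factory=dict)

    def add_process(self, resource: str, pid: int) -> None:
        # One JobUnit per resource: create it on first use, then append the PID.
        unit = self.units.setdefault(resource, JobUnit(resource))
        unit.pids.append(pid)

    def process_count(self) -> int:
        # Total number of processes across all JobUnits of this job.
        return sum(len(u.pids) for u in self.units.values())


job = Job("job-1")
job.add_process("node-a", 101)
job.add_process("node-a", 102)
job.add_process("node-b", 201)
```

Here `job` ends up with two JobUnits (one per resource) that together hold three processes, all addressed through the single JobID.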
Job Execution Flow
[Sequence diagram. Participants: XOSD JobMng, User, XOSD ExecMng, JobDirectory, RSS, Any XOSD, Kernel. Messages, in order: JID = createJob(JSDL); JID returned; runJob(JID); getResources(JSDL); schedules and executes process; job finished (all processes finished).]
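The message sequence of the job execution flow can be mimicked with a few in-memory stand-ins. This is a sketch under stated assumptions: the Python classes, the method names, the dict used in place of a JSDL document, and the choice of which component calls which are all hypothetical, and the kernel step is reduced to setting a flag.

```python
import itertools


class RSS:
    """Stand-in for the RSS box in the diagram: returns candidate resources."""
    def __init__(self, resources):
        self.resources = list(resources)

    def get_resources(self, jsdl):
        # Assumption: the JSDL document is reduced to a dict with a process count.
        return self.resources[: jsdl["processes"]]


class ExecMng:
    """Stand-in for the ExecMng on one resource: runs and tracks a process."""
    def __init__(self, resource):
        self.resource = resource
        self.finished = False

    def execute(self, jid):
        # A real ExecMng would hand the process to the kernel; here it ends at once.
        self.finished = True


class JobMng:
    """Stand-in for the JobMng: creates jobs, runs them, and tracks completion."""
    def __init__(self, rss):
        self.rss = rss
        self.jobs = {}
        self._ids = itertools.count(1)

    def create_job(self, jsdl):
        jid = f"JID-{next(self._ids)}"
        self.jobs[jid] = {"jsdl": jsdl, "execs": []}
        return jid

    def run_job(self, jid):
        job = self.jobs[jid]
        for resource in self.rss.get_resources(job["jsdl"]):
            exec_mng = ExecMng(resource)
            exec_mng.execute(jid)
            job["execs"].append(exec_mng)

    def is_finished(self, jid):
        # The job is finished once all of its processes have finished.
        execs = self.jobs[jid]["execs"]
        return bool(execs) and all(e.finished for e in execs)


job_mng = JobMng(RSS(["node-a", "node-b", "node-c"]))
jid = job_mng.create_job({"processes": 2})   # JID = createJob(JSDL)
job_mng.run_job(jid)                         # runJob(JID)
```

After `run_job`, `job_mng.is_finished(jid)` reports the job as finished because both of its simulated processes have ended.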
Monitoring
• System metrics.
• User-defined metrics.
• Different levels of information.
• Buffering.
• Each service maintains its own monitoring information (SCOPE).
  • ExecMng has information about processes.
  • JobMng has information about jobs.
  • ResMng has information about resources.
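One way to picture per-service monitoring with buffering is a bounded buffer owned by each service. The sketch below is an assumption about how such scoping could look, not the XtreemOS implementation; `ScopedMonitor`, its methods, and the metric names are invented for illustration.

```python
from collections import deque


class ScopedMonitor:
    """Keeps the metrics of one service (its scope) in a bounded buffer."""
    def __init__(self, scope, capacity=100):
        self.scope = scope
        self.buffer = deque(maxlen=capacity)   # oldest samples are dropped first

    def record(self, name, value):
        self.buffer.append((name, value))

    def latest(self, name):
        # Scan from the newest sample backwards; None if the metric is unknown.
        for metric, value in reversed(self.buffer):
            if metric == name:
                return value
        return None


exec_monitor = ScopedMonitor("ExecMng", capacity=2)
exec_monitor.record("cpu_time", 1.5)          # system metric
exec_monitor.record("cpu_time", 2.0)
exec_monitor.record("user.iterations", 40)    # user-defined metric
```

With a capacity of two, the first `cpu_time` sample is evicted, so the buffer holds only the two most recent samples.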
Performance & scalability
• Key points:
  • Collaboration with the Linux kernel.
  • No central storage (DHTs).
  • Can be replicated.
  • Does not search for the best global schedule, only for a good-enough local schedule.
• What is the performance without DHTs?
  • Typical VO: a small (100 nodes) local grid.
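To illustrate what "no central storage" can mean, the sketch below places job records on nodes with a consistent-hashing ring, one common building block of DHTs. It is a generic illustration, not the DHT used by XtreemOS; the node names and the `HashRing` class are hypothetical.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    # Map a string to a large integer position on the ring.
    return int(hashlib.sha1(key.encode()).hexdigest(), 16)


class HashRing:
    """Assigns each key to the first node clockwise from the key's hash."""
    def __init__(self, nodes):
        self._ring = sorted((_hash(n), n) for n in nodes)
        self._keys = [h for h, _ in self._ring]

    def node_for(self, job_id: str) -> str:
        # bisect_right finds the next node position; the modulo wraps around.
        i = bisect.bisect_right(self._keys, _hash(job_id)) % len(self._ring)
        return self._ring[i][1]


ring = HashRing(["core-1", "core-2", "core-3"])
owner = ring.node_for("JID-42")
```

Any node can compute `owner` locally from the JobID, so no central directory is needed to locate a job's record.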
Job Execution
• O(X²):
  • Resource management is needed for each submitted process.
• All processes belong to the same job (in other systems they would be independent jobs).
Job status
• Query information about all processes of a job with low overhead.
• Look up a job's finished status in 0.012 seconds (0.014 in GT5) without contacting the ExecMngs.
Future improvements
• Reduced internal communication times.
• Caching to reduce overhead.
• Some conclusions:
  • Kernel collaboration with «middleware» is important.
  • DHTs (not evaluated) are a good option to distribute data.
  • But still no high performance.
  • Including the concept 1 Job -> n Processes gives the user a lot of benefits.
    • Easy to understand, easy to manage.