BOINC Workshop 10
Hien Nguyen, Eshwar Rohit
University of Houston
Supervisors:
Dr. Jaspal Subhlok
University of Houston
Dr. David P. Anderson
SSL, U.C. Berkeley
Enabling interprocess communication for BOINC applications
RESEARCH GOAL
Hien Nguyen University of Houston
•Enable BOINC to efficiently support apps that require interprocess communication.
Goals:
•Easier programming for communicating applications
•Reduce execution time (not increase throughput)
Example Applications
REMD Protein Folding application
Each process runs a standard molecular simulation at a different temperature.

          P1   P2   P3   P4   P5   P6   P7   P8
STEP-1    270  280  290  300  310  320  330  340
STEP-2    280  270  290  300  320  310  340  330
STEP-3    280  290  270  300  320  340  310  330
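The temperature swaps between neighbouring processes can be illustrated with a minimal sketch. It assumes the standard Metropolis exchange test used in replica exchange molecular dynamics; the slides do not say which criterion the application uses, and the function name `try_swap`, the energy values, and the units are illustrative assumptions.

```python
import math
import random

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol*K); unit choice is an assumption


def try_swap(temps, energies, i, j, rng):
    """Metropolis test for exchanging the temperatures of processes i and j.

    temps and energies are lists indexed by process. Returns True and swaps
    the two temperatures in place if the exchange is accepted.
    """
    beta_i = 1.0 / (K_B * temps[i])
    beta_j = 1.0 / (K_B * temps[j])
    delta = (beta_i - beta_j) * (energies[i] - energies[j])
    # Accept with probability min(1, exp(delta)).
    if delta >= 0 or rng.random() < math.exp(delta):
        temps[i], temps[j] = temps[j], temps[i]
        return True
    return False


# Hypothetical energies: the colder process holds the higher-energy
# configuration, so this exchange is always accepted.
temps = [270.0, 280.0]
accepted = try_swap(temps, [-100.0, -120.0], 0, 1, random.Random(0))
```

Each exchange needs only the two energies and temperatures involved, which is the kind of moderate communication such an application requires.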
Example Applications
Or many other applications:
•Differential equation solvers (grid) (synchronous)
•Game playing with alpha/beta pruning (asynchronous)
•Search applications
•…
Suitable applications: moderate amount of communication.
DIFFICULTIES
[Figure: job execution on a fast host and a slow host up to a synchronization point; several hosts are marked X]
•Slows down overall execution speed
•Worse as the number of hosts increases
OUTLINE
1. Volpex Dataspace Overview
•IPC for volunteer environment
2. Integration With BOINC
•Process management
•Host selection
3. Future and Related Work
Volpex Dataspace
•Dataspace: global shared space that processes can use for information exchange without a temporal or spatial coupling.
[Figure: one process calls Put(ABC, 800) on the Volpex Dataspace Server; another process calls Get(ABC, ?) and receives 800]
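The Put/Get exchange can be sketched with a toy in-memory dataspace. This is only an illustration of the idea, not the Volpex API: the class `Dataspace`, its method names, and the blocking `get` with a timeout are assumptions made for the example.

```python
import threading


class Dataspace:
    """Toy in-memory stand-in for a dataspace server (not the Volpex API)."""

    def __init__(self):
        self._items = {}
        self._cond = threading.Condition()

    def put(self, key, value):
        """Store a value under a key and wake up any waiting readers."""
        with self._cond:
            self._items[key] = value
            self._cond.notify_all()

    def get(self, key, timeout=None):
        """Return the value for key, waiting until some process has put it."""
        with self._cond:
            found = self._cond.wait_for(lambda: key in self._items, timeout)
            if not found:
                raise TimeoutError(key)
            return self._items[key]


space = Dataspace()
space.put("ABC", 800)      # writer: Put(ABC, 800)
value = space.get("ABC")   # reader: Get(ABC, ?) receives 800
```

The reader only names the key, never the writer, and it may call `get` before or after the `put`. That is what the absence of spatial and temporal coupling means in practice.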
Volpex Dataspace – Fault Tolerance
[Figure: replicated processes issue the same Put(ABC, 800) and Get(ABC, ?) (returns 800) against the server; one replica is marked X]
Volpex DSS is designed to support redundant Put/Get operations, unlike Linda and its variants.
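One way a server could tolerate redundant operations from replicated processes is sketched below. It assumes every operation carries a logical process ID and a per-process sequence number, so that repeats from a replica can be recognised; this is an illustrative assumption, not a description of how Volpex DSS is implemented, and all names are hypothetical.

```python
class RedundantDataspace:
    """Toy dataspace that tolerates repeated Put/Get calls from replicas."""

    def __init__(self):
        self._items = {}
        self._applied_puts = set()  # (proc_id, seq) pairs already applied
        self._get_log = {}          # (proc_id, seq) -> value first returned

    def put(self, proc_id, seq, key, value):
        """Apply the first Put for (proc_id, seq); ignore replica repeats."""
        op = (proc_id, seq)
        if op in self._applied_puts:
            return False
        self._applied_puts.add(op)
        self._items[key] = value
        return True

    def get(self, proc_id, seq, key):
        """Return the same value to every replica issuing Get (proc_id, seq)."""
        op = (proc_id, seq)
        if op not in self._get_log:
            self._get_log[op] = self._items[key]
        return self._get_log[op]


dss = RedundantDataspace()
first = dss.put(1, 0, "ABC", 800)   # applied
repeat = dss.put(1, 0, "ABC", 800)  # same operation from a replica: ignored
seen = dss.get(2, 0, "ABC")         # 800 for every replica of process 2
```

Logging the value returned by each Get lets a slower replica see exactly what the faster one saw, even if the key has since been overwritten.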
Volpex Dataspace
•Why centralized? A central server raises scaling issues.
•But: volunteer hosts sit behind firewalls and accept no incoming connections.
INTEGRATION WITH BOINC
•Mechanics: Process management
Simultaneous process starting
Fault tolerance
Checkpoint/restart
•Policy: Host selection.
Job execution scheme
[Figure: volunteer hosts get work from the BOINC Scheduler; they put and get data items and checkpoints on the Volpex Dataspace Server; one host is marked X]
PROCESS MANAGEMENT
•Simultaneous process starting:
All processes start computation together, which reduces the resources wasted when processes have to wait for each other.
Volpex jobs have highest (infinite) priority: uninterruptible by other jobs.
While waiting for all processes of a Volpex job to be ready, the host can run other finite-priority volunteer jobs.
Use of boinc_temporary_exit()
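The start-up rule can be sketched as a small decision function. This is a simulation under stated assumptions, not BOINC code: the function name, the shared `ready` set, and the 60-second retry delay are hypothetical. In a real application the "exit" branch is where the process would give the host back, for example through boinc_temporary_exit().

```python
def start_or_back_off(ready, job_size, proc_id, retry_delay_s=60):
    """Decide whether a process may start computing.

    ready is the set of process IDs that have reported in so far (assumed to
    be kept at a central server). Returns ("run", 0) once all job_size
    processes are ready, otherwise ("exit", retry_delay_s) so the host can
    run other work and ask again later.
    """
    ready.add(proc_id)
    if len(ready) >= job_size:
        return ("run", 0)
    return ("exit", retry_delay_s)


ready = set()
decisions = [start_or_back_off(ready, 3, pid) for pid in (0, 1, 2)]
```

Only the last process to report in sees "run" immediately. The others find out on their next attempt, so the retry delay bounds how long an early process sits idle once the job is complete.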
PROCESS MANAGEMENT
•Fault tolerance
Dead instances are detected by a heartbeat mechanism: process instances regularly send heartbeats to the Volpex DSS.
The BOINC scheduler recruits a new volunteer host to replace the dead one.
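A heartbeat check of this kind can be sketched as follows. The class name, the explicit `now` arguments, and the timeout value are assumptions for illustration; the slides do not give the heartbeat period or the timeout used by Volpex DSS.

```python
class HeartbeatMonitor:
    """Tracks the last heartbeat time of each process instance."""

    def __init__(self, timeout_s):
        self.timeout_s = timeout_s
        self._last_seen = {}

    def beat(self, proc_id, now):
        """Record a heartbeat from proc_id at time now (in seconds)."""
        self._last_seen[proc_id] = now

    def dead(self, now):
        """Return the IDs whose last heartbeat is older than the timeout."""
        return sorted(
            p for p, t in self._last_seen.items() if now - t > self.timeout_s
        )


monitor = HeartbeatMonitor(timeout_s=30)
monitor.beat(1, now=0)
monitor.beat(2, now=0)
monitor.beat(1, now=25)       # process 2 stays silent
failed = monitor.dead(now=40)
```

Passing the clock in as `now` keeps the sketch deterministic; a server would use its own clock, and the IDs returned by `dead` are the instances that need a replacement host.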
PROCESS MANAGEMENT
•Hot spare policy:
[Figure: BOINC Scheduler, Volpex Dataspace Server, and a hot spare group that provides fast replacement for a host marked X]
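The hot spare idea can be sketched as a queue of standby hosts. The class and host names are hypothetical, and the first-in, first-out order is an assumption; the slides do not say how a spare is chosen from the group.

```python
from collections import deque


class HotSparePool:
    """Standby hosts that are ready to take over a failed process."""

    def __init__(self, spare_hosts):
        self._spares = deque(spare_hosts)

    def replace(self, failed_proc_id):
        """Give the failed instance's process ID to the next spare host.

        Returns (host, proc_id), or None when no spare is left and a new
        host has to be recruited instead.
        """
        if not self._spares:
            return None
        host = self._spares.popleft()
        return (host, failed_proc_id)


pool = HotSparePool(["spare-a", "spare-b"])
assignment = pool.replace(failed_proc_id=5)
```

Because the spare is already waiting, it can take over without the delay of recruiting a fresh host.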
PROCESS MANAGEMENT
•Checkpointing:
A process instance commits and uploads checkpoints to the Volpex DSS, which stores only the latest checkpoint for each process.
A restarted process instance requests its checkpoint from the Volpex DSS.
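A latest-only checkpoint store can be sketched in a few lines. The step counter used to decide which checkpoint is newer is an assumption, as are the class and method names; the slides only state that the server keeps the latest checkpoint for each process.

```python
class CheckpointStore:
    """Keeps only the most recent checkpoint of each process."""

    def __init__(self):
        self._latest = {}  # proc_id -> (step, data)

    def commit(self, proc_id, step, data):
        """Store the checkpoint unless a newer one is already held."""
        prev = self._latest.get(proc_id)
        if prev is None or step > prev[0]:
            self._latest[proc_id] = (step, data)

    def restore(self, proc_id):
        """Return (step, data) for a restarted instance, or None."""
        return self._latest.get(proc_id)


store = CheckpointStore()
store.commit(3, step=10, data=b"state-10")
store.commit(3, step=20, data=b"state-20")
store.commit(3, step=15, data=b"late upload")  # older than step 20: dropped
resume = store.restore(3)
```

Comparing step numbers means a delayed upload from a slower replica cannot overwrite newer state.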
HOST SELECTION
•Volpex job: the set of processes that form one job, submitted by a scientist.
•Has requirements on:
Deadline
Number of processes
•Has estimates of:
Total flops per process.
Flops between 2 consecutive checkpoints.
Memory usage, disk usage
HOST SELECTION POLICY
Criteria for selecting volunteer hosts to assign to a Volpex job: Speed and Availability.
Availability: the length of the interval during which a host is available without interruption (i.e., the BOINC client is allowed to compute).
HOST SELECTION POLICY
Job’s minimum requirements:
•Minimum speed: fast enough to meet the job’s deadline
Min speed = Total flops / Deadline
•Minimum expected availability: the host is continuously available for x hours, long enough to commit at least 1 checkpoint.
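The two requirements can be worked through numerically. The speed formula is the one given above. Deriving x as the time the host needs to compute the flops between two checkpoints is an assumption, since the slides do not say how x is obtained, and the job figures are made up for the example.

```python
def min_speed(total_flops, deadline_s):
    """Minimum host speed in flops per second: Total flops / Deadline."""
    return total_flops / deadline_s


def min_availability_hours(flops_between_checkpoints, host_flops_per_s):
    """Hours the host needs to reach its next checkpoint (assumed rule for x)."""
    return flops_between_checkpoints / host_flops_per_s / 3600.0


def host_qualifies(host_flops_per_s, expected_avail_hours,
                   total_flops, deadline_s, flops_between_checkpoints):
    """Check a host against both minimum requirements of a job."""
    fast_enough = host_flops_per_s >= min_speed(total_flops, deadline_s)
    needed = min_availability_hours(flops_between_checkpoints, host_flops_per_s)
    return fast_enough and expected_avail_hours >= needed


# Hypothetical job: 3.6e15 flops per process, 10-hour deadline,
# 7.2e14 flops between consecutive checkpoints.
required = min_speed(3.6e15, 36000)        # 1e11 flops/s
x = min_availability_hours(7.2e14, 2e11)   # 1 hour on a 2e11 flops/s host
ok = host_qualifies(2e11, 2.0, 3.6e15, 36000, 7.2e14)
```

A host running at 5e10 flops/s fails the speed test regardless of its availability, because it cannot finish the process before the deadline.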
Evaluate Host's Availability
•We want to predict the length of a host's availability interval.
•Method based on: “Exploiting Non-Dedicated Resources for Cloud Computing” by Artur Andrzejak, Derrick Kondo, and David P. Anderson (NOMS 2010).
Evaluate Host's Availability
Last value predictor: a simple predictor that uses the availability value from the last hourly interval before the prediction as the predicted availability for the next x-hour interval.
Combined with ranking hosts by predictability: the number of availability changes per week.
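The predictor and the ranking can be sketched on hourly availability traces. Representing each trace as a list of booleans (one per hour, oldest first) is an assumption made for the example, as are the function names and host labels; the cited paper defines the actual method.

```python
HOURS_PER_WEEK = 168


def last_value_prediction(trace):
    """Predict the next interval as the most recent hourly observation."""
    return trace[-1]


def changes_per_week(trace):
    """Average number of availability changes per week in an hourly trace."""
    flips = sum(1 for a, b in zip(trace, trace[1:]) if a != b)
    return flips / (len(trace) / HOURS_PER_WEEK)


def rank_hosts(histories):
    """Keep hosts predicted to be available, most predictable first."""
    candidates = [h for h, t in histories.items() if last_value_prediction(t)]
    return sorted(candidates, key=lambda h: changes_per_week(histories[h]))


histories = {
    "steady": [True] * 168,             # never changes
    "flaky": [True, False] * 84,        # changes every hour, last value False
    "mostly": [False] + [True] * 167,   # one change in the week
}
ranking = rank_hosts(histories)
```

The host whose last observation is "unavailable" is dropped by the predictor, and the remaining hosts are ordered so that those with the fewest changes per week come first.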
Evaluate Host's Availability
In essence: select hosts which change availability very rarely.
A process assigned to a host with high predictability does not necessarily need to be replicated.
IMPLEMENTATION STATUS
•Volpex utilities: for scientists to submit, abort or query status of a Volpex job.
•Modified BOINC scheduler: includes host selection for Volpex jobs.
•Modified Volpex DSS: handles new types of requests.
IMPLEMENTATION STATUS
[Figure: system architecture. Components: Scientist, BOINC Server, Database, BOINC Scheduler, Volpex Dataspace Server, hot spare group. Labeled interactions: submit job specs; create job & WU; scheduler request; scheduler reply; dynamically create result from WU; heartbeat; checkpoint; request procID; get procID; procID of failed instance; replacement. One host is marked X.]
FUTURE WORK
Experiment and evaluate with different degrees of freedom:
•Number of processes: 10 to 1M
•Communication pattern (local/global, synch/asynch)
•Size and frequency of communication
Goal: study (via live experiment or simulation) the performance of Volpex/BOINC over this space.
Applications: Eratosthenes, REMD Protein Folding
Enhance host selection policy
OTHER WORK
Volpex MPI:
•An MPI library designed for executing parallel applications in a volunteer environment.
•Direct communication between processes.
•Key Features
Controlled redundancy
Receiver based direct communication
Distributed sender based logging
More detail: “VolpexMPI: an MPI Library for Execution of Parallel Applications on Volatile Nodes” by Troy LeBlanc, Rakhi Anand, Edgar Gabriel, and Jaspal Subhlok.
Do you have an application?
We would be happy to cooperate with you.
Our team contacts:
•Dr. Jaspal Subhlok: [email protected]
•Dr. David Anderson: [email protected]
•Dr. Edgar Gabriel: [email protected]
•Hien Nguyen: [email protected]
•Eshwar Rohit: [email protected]
•Rakhi Anand: [email protected]
Our Website: http://www2.cs.uh.edu/~jsteach/volpex/index.htm