Session presented at Big Data Spain 2012 Conference, 16th Nov 2012, ETSI Telecomunicacion UPM, Madrid. www.bigdataspain.org
More info: http://www.bigdataspain.org/es-2012/conference/cloudMC-a-cloud-computing-map-reduce-implementation-for-radiotherapy/ruben-jimenez-and-hector-miras
![Page 1: CloudMC: A cloud computing map-reduce implementation for radiotherapy. RUBEN JIMENEZ & HECTOR MIRAS at Big Data Spain 2012](https://reader035.vdocument.in/reader035/viewer/2022081414/549d0448ac7959f12a8b48fd/html5/thumbnails/1.jpg)
CloudMC: A cloud computing map-reduce implementation
for radiotherapy
Rubén Jiménez Marrufo
Héctor Miras del Río
Carlos Miras del Río
Carles Gomà Estadella
Big Data Spain
http://www.bigdataspain.org
Madrid, November 16th, 2012
![Page 2: CloudMC: A cloud computing map-reduce implementation for radiotherapy. RUBEN JIMENEZ & HECTOR MIRAS at Big Data Spain 2012](https://reader035.vdocument.in/reader035/viewer/2022081414/549d0448ac7959f12a8b48fd/html5/thumbnails/2.jpg)
Contents
Introduction
• Radiotherapy
• Monte Carlo simulations for radiation transport
• Monte Carlo parallelization
• Clustering vs. Cloud Computing
• Cloud Computing for clinical radiation transport
CloudMC
• DEMO START
• Architecture
• Map Reduce
• Elasticity
• How did Radarc help us?
• Results
• Is it reinventing the wheel?
• Roadmap
• DEMO RESULTS
Questions & Answers
![Page 3: CloudMC: A cloud computing map-reduce implementation for radiotherapy. RUBEN JIMENEZ & HECTOR MIRAS at Big Data Spain 2012](https://reader035.vdocument.in/reader035/viewer/2022081414/549d0448ac7959f12a8b48fd/html5/thumbnails/3.jpg)
Introduction
Héctor Miras del Río
Department of Medical Physics, Virgen Macarena Hospital, Seville, Spain
Rubén Jiménez Marrufo
R&D Division, Icinetic TIC S.L., Seville, Spain
Carlos Miras del Río
R&D Division, Wedoit Innovacion Tecnologica, Seville, Spain
Carles Gomà
Centre for Proton Therapy, Paul Scherrer Institute, Villigen PSI, Switzerland
![Page 4: CloudMC: A cloud computing map-reduce implementation for radiotherapy. RUBEN JIMENEZ & HECTOR MIRAS at Big Data Spain 2012](https://reader035.vdocument.in/reader035/viewer/2022081414/549d0448ac7959f12a8b48fd/html5/thumbnails/4.jpg)
Introduction
Monte Carlo Simulations
Radiotherapy
Cloud Computing
![Page 5: CloudMC: A cloud computing map-reduce implementation for radiotherapy. RUBEN JIMENEZ & HECTOR MIRAS at Big Data Spain 2012](https://reader035.vdocument.in/reader035/viewer/2022081414/549d0448ac7959f12a8b48fd/html5/thumbnails/5.jpg)
Radiotherapy
Radiotherapy: the medical use of ionizing radiation, generally as part of cancer treatment, to control or kill malignant cells.
Radiotherapy treatment planning: the process of calculating, prior to radiotherapy, the radiation dose that will be absorbed by the object to be irradiated.
![Page 6: CloudMC: A cloud computing map-reduce implementation for radiotherapy. RUBEN JIMENEZ & HECTOR MIRAS at Big Data Spain 2012](https://reader035.vdocument.in/reader035/viewer/2022081414/549d0448ac7959f12a8b48fd/html5/thumbnails/6.jpg)
Monte Carlo simulations for radiation transport
![Page 7: CloudMC: A cloud computing map-reduce implementation for radiotherapy. RUBEN JIMENEZ & HECTOR MIRAS at Big Data Spain 2012](https://reader035.vdocument.in/reader035/viewer/2022081414/549d0448ac7959f12a8b48fd/html5/thumbnails/7.jpg)
Monte Carlo simulations for radiation transport
![Page 8: CloudMC: A cloud computing map-reduce implementation for radiotherapy. RUBEN JIMENEZ & HECTOR MIRAS at Big Data Spain 2012](https://reader035.vdocument.in/reader035/viewer/2022081414/549d0448ac7959f12a8b48fd/html5/thumbnails/8.jpg)
Monte Carlo simulation for radiation transport
Monte Carlo Simulations:
+ 👍 Gold standard algorithms for radiation calculations
- 👎 Extremely computationally intensive and very time-consuming.
![Page 9: CloudMC: A cloud computing map-reduce implementation for radiotherapy. RUBEN JIMENEZ & HECTOR MIRAS at Big Data Spain 2012](https://reader035.vdocument.in/reader035/viewer/2022081414/549d0448ac7959f12a8b48fd/html5/thumbnails/9.jpg)
Monte Carlo parallelization
Parallelization: run one simulation simultaneously on several nodes and merge the results.
Monte Carlo simulations are highly parallelizable because the primary events (histories) are independent of each other.
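A minimal sketch of why independent histories make the split possible. It is not a radiation transport code: the "simulation" is a toy hit counter, and the function names and seed scheme are assumptions made for illustration. Each node gets its own seed and a share of the histories, and the partial tallies are simply summed.

```python
import random

def simulate_histories(seed, histories):
    """Toy stand-in for a transport run: tally histories that land in a unit quarter circle."""
    rng = random.Random(seed)  # every node uses its own independent generator
    hits = 0
    for _ in range(histories):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            hits += 1
    return hits, histories

def merge(partials):
    """Combine per-node tallies into one estimate, as if a single run had produced them."""
    total_hits = sum(hits for hits, _ in partials)
    total_histories = sum(count for _, count in partials)
    return total_hits / total_histories

def run_split(total_histories, nodes, base_seed=12345):
    """Divide the histories among the nodes (run one after another here) and merge."""
    per_node = total_histories // nodes
    partials = [simulate_histories(base_seed + i, per_node) for i in range(nodes)]
    return merge(partials)

estimate = run_split(total_histories=40_000, nodes=8)
print(4 * estimate)  # close to pi, whatever the number of nodes
```

On a real cluster or cloud service the list comprehension in `run_split` would be replaced by one job per node; the merge step stays the same.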
![Page 10: CloudMC: A cloud computing map-reduce implementation for radiotherapy. RUBEN JIMENEZ & HECTOR MIRAS at Big Data Spain 2012](https://reader035.vdocument.in/reader035/viewer/2022081414/549d0448ac7959f12a8b48fd/html5/thumbnails/10.jpg)
Parallelization: Clustering vs. Cloud Computing
![Page 11: CloudMC: A cloud computing map-reduce implementation for radiotherapy. RUBEN JIMENEZ & HECTOR MIRAS at Big Data Spain 2012](https://reader035.vdocument.in/reader035/viewer/2022081414/549d0448ac7959f12a8b48fd/html5/thumbnails/11.jpg)
Cloud Computing for clinical radiation calculations
100-core cluster ≈ 20 000 €
160 years of computing time in an extra-small instance
Extra-small instance: 0.0142 € / h
tCPU = 100 h
Number of instances: n = 100
T(n) = 1.44 h
Cost / plan: 2 €
1000 patients / year
Cost / year: 2 000 €
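The figures on this slide can be checked with a few lines of arithmetic. The sketch below assumes that a plan is billed as number of instances × clock time × hourly price, with no rounding up to full hours; that billing rule is an assumption, the input values are the ones on the slide.

```python
HOURS_PER_YEAR = 24 * 365

price_per_hour = 0.0142   # EUR per extra-small instance hour (slide)
instances = 100           # n (slide)
clock_time_hours = 1.44   # T(n) for n = 100 (slide)
plans_per_year = 1000     # patients per year (slide)
cluster_cost = 20_000     # EUR for a 100-core cluster (slide)

# Assumed billing rule: every instance is paid for the whole clock time.
cost_per_plan = instances * clock_time_hours * price_per_hour
cost_per_year = cost_per_plan * plans_per_year

# Instance time that the price of the cluster would buy in the cloud.
equivalent_years = cluster_cost / price_per_hour / HOURS_PER_YEAR

print(round(cost_per_plan, 2))   # about 2 EUR per plan
print(round(cost_per_year))      # about 2 000 EUR per year
print(round(equivalent_years))   # about 160 years
```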
![Page 12: CloudMC: A cloud computing map-reduce implementation for radiotherapy. RUBEN JIMENEZ & HECTOR MIRAS at Big Data Spain 2012](https://reader035.vdocument.in/reader035/viewer/2022081414/549d0448ac7959f12a8b48fd/html5/thumbnails/12.jpg)
CloudMC
CloudMC offers an implementation of map/reduce on the Windows Azure cloud computing platform for the parallelization of MC simulations of radiation therapy dose distributions.
Non-intrusive
Multi-application: PENELOPE, Geant4, EGSnrc
Elasticity: resources are not reserved; a 1-hour simulation costs 1 hour.
![Page 13: CloudMC: A cloud computing map-reduce implementation for radiotherapy. RUBEN JIMENEZ & HECTOR MIRAS at Big Data Spain 2012](https://reader035.vdocument.in/reader035/viewer/2022081414/549d0448ac7959f12a8b48fd/html5/thumbnails/13.jpg)
CloudMC: DEMO
![Page 14: CloudMC: A cloud computing map-reduce implementation for radiotherapy. RUBEN JIMENEZ & HECTOR MIRAS at Big Data Spain 2012](https://reader035.vdocument.in/reader035/viewer/2022081414/549d0448ac7959f12a8b48fd/html5/thumbnails/14.jpg)
CloudMC Architecture
[Architecture diagram; labels: Worker Roles, UI, Service Management, Simulation files, Messages Queues, Cloud Storage, Cloud Hosted Services, SQL Azure, Users & Simulation, Repositories, Provisioning, MapReduce Factory, Entities, Services]
![Page 15: CloudMC: A cloud computing map-reduce implementation for radiotherapy. RUBEN JIMENEZ & HECTOR MIRAS at Big Data Spain 2012](https://reader035.vdocument.in/reader035/viewer/2022081414/549d0448ac7959f12a8b48fd/html5/thumbnails/15.jpg)
CloudMC: MapReduce
Sequence of actions when carrying out a MC simulation on n instances:
1. New simulation
- Simulation metadata is saved on SQL Azure.
- Simulation files are uploaded to the Azure Storage.
2. Map
- Generation of n initial independent seeds.
- Mapper: modification of the simulation config to divide the histories by n.
- Provisioning of the n worker roles.
- Sending of n “start” messages.
3. Parallel execution
Every worker role:
1. Reads a message from the queue and downloads the simulation files.
2. Executes the “fragmented” simulation.
3. Sends the results to the storage.
4. Sends an “end of simulation” message.
4. Reduce
- When the web role has read the n “end of simulation” messages, Resolver merges the n results uploaded to the storage.
- n-1 worker roles are scaled down.
5. End of simulation
- Finished simulation metadata is saved on SQL Azure.
- An e-mail notifies the user that the simulation has ended so that the results can be downloaded.
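The five steps of this slide can be sketched in memory as follows. This is an illustration only, not CloudMC code: a dictionary stands in for the cloud storage, two `deque` objects stand in for the message queues, the workers run one after another instead of in parallel, and all names (`run_simulation`, `toy_simulate`, the message fields) are hypothetical.

```python
import random
from collections import deque

def toy_simulate(seed, histories):
    """Stand-in for a fragmented MC run: mean of `histories` random samples."""
    rng = random.Random(seed)
    return sum(rng.random() for _ in range(histories)) / histories

def run_simulation(total_histories, n, simulate):
    storage = {}           # stands in for the cloud storage
    start_queue = deque()  # "start" messages
    end_queue = deque()    # "end of simulation" messages

    # 1. New simulation: input is uploaded to the storage.
    storage["input"] = {"histories": total_histories}

    # 2. Map: n seeds, histories divided by n, n "start" messages.
    for worker in range(n):
        start_queue.append({"worker": worker, "seed": 1000 + worker,
                            "histories": total_histories // n})

    # 3. Parallel execution (sequential here): read message, run, upload, notify.
    while start_queue:
        msg = start_queue.popleft()
        storage[f"result-{msg['worker']}"] = simulate(msg["seed"], msg["histories"])
        end_queue.append({"worker": msg["worker"], "status": "end of simulation"})

    # 4. Reduce: after n "end of simulation" messages, merge the n results.
    finished = [end_queue.popleft()["worker"] for _ in range(n)]
    partials = [storage[f"result-{worker}"] for worker in finished]
    storage["output"] = sum(partials) / len(partials)

    # 5. End of simulation: the merged result is ready for download.
    return storage["output"]

print(run_simulation(8000, 4, toy_simulate))  # close to 0.5
```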
![Page 16: CloudMC: A cloud computing map-reduce implementation for radiotherapy. RUBEN JIMENEZ & HECTOR MIRAS at Big Data Spain 2012](https://reader035.vdocument.in/reader035/viewer/2022081414/549d0448ac7959f12a8b48fd/html5/thumbnails/16.jpg)
CloudMC: Map
Most MC applications for radiation transport simulation read their configuration from text files.
Input A: Configuration files
• Simulation parameters
• Histories count
• Geometry & materials files
• …
• MapReduce parameters
Executable
Histories: 1015
Mapper: parametrized mapper that sets the number of histories and the seeds in the input files
Input B
Histories: 215
Mapped Executable
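A minimal sketch of a parametrized mapper for text configuration files. The `HISTORIES` and `SEED` keywords and the one-parameter-per-line layout are hypothetical and do not reproduce the input format of PENELOPE, Geant4 or EGSnrc; the keyword names are the parameters that would change from one MC application to another.

```python
import re

def map_config(text, n, histories_key="HISTORIES", seed_key="SEED", base_seed=1):
    """Return n config texts, each with the histories divided by n and its own seed."""
    hist_re = re.compile(rf"^({re.escape(histories_key)}[ \t]+)(\d+)[ \t]*$", re.MULTILINE)
    seed_re = re.compile(rf"^({re.escape(seed_key)}[ \t]+)(\d+)[ \t]*$", re.MULTILINE)

    match = hist_re.search(text)
    if match is None:
        raise ValueError("histories parameter not found")
    per_instance = int(match.group(2)) // n

    mapped = []
    for i in range(n):
        # Same file for every instance, except for the histories count and the seed.
        cfg = hist_re.sub(lambda m: f"{m.group(1)}{per_instance}", text)
        cfg = seed_re.sub(lambda m: f"{m.group(1)}{base_seed + i}", cfg)
        mapped.append(cfg)
    return mapped

example = "TITLE demo\nHISTORIES 1000000\nSEED 1\n"
configs = map_config(example, n=4)
print(configs[2])
```

Keeping the keywords as arguments is what makes the mapper reusable: only the parameters change between applications, not the code.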
![Page 17: CloudMC: A cloud computing map-reduce implementation for radiotherapy. RUBEN JIMENEZ & HECTOR MIRAS at Big Data Spain 2012](https://reader035.vdocument.in/reader035/viewer/2022081414/549d0448ac7959f12a8b48fd/html5/thumbnails/17.jpg)
CloudMC: Reduce
The results of MC applications for radiation transport simulation are files containing the distribution of dose, energy or any other magnitude, formatted in columns.
Mapped Executable
Dose distribution files
Output
Reducer: parametrized reducer that combines columns depending on the column type:
- Magnitude column
- Uncertainty column
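A minimal sketch of a reducer parametrized by column type. The combination rules are assumptions, since the slide does not state them: all partial runs are taken to use the same number of histories and to report values normalized per history, so magnitude columns are averaged and uncertainty columns are combined in quadrature and divided by the number of parts. The `"key"` type for coordinate columns is also hypothetical.

```python
import math

def reduce_columns(parts, column_types):
    """Merge partial result tables (lists of rows of floats) column by column."""
    n = len(parts)
    merged = []
    for rows in zip(*parts):  # the same row taken from every partial result
        out = []
        for col, kind in enumerate(column_types):
            values = [row[col] for row in rows]
            if kind == "magnitude":
                out.append(sum(values) / n)  # assumed rule: mean of the partial means
            elif kind == "uncertainty":
                out.append(math.sqrt(sum(v * v for v in values)) / n)  # assumed rule
            else:
                out.append(values[0])  # "key" column: identical in every part
        merged.append(out)
    return merged

parts = [
    [[0.0, 2.0, 0.3], [1.0, 1.0, 0.2]],
    [[0.0, 4.0, 0.4], [1.0, 3.0, 0.2]],
]
print(reduce_columns(parts, ["key", "magnitude", "uncertainty"]))
```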
![Page 18: CloudMC: A cloud computing map-reduce implementation for radiotherapy. RUBEN JIMENEZ & HECTOR MIRAS at Big Data Spain 2012](https://reader035.vdocument.in/reader035/viewer/2022081414/549d0448ac7959f12a8b48fd/html5/thumbnails/18.jpg)
CloudMC: MapReduce DSL
CloudMC uses a MapReduce DSL whose parameters adapt the Mapper and the Reducer to specific MC applications.
Mapper parameters Reducer parameters
![Page 19: CloudMC: A cloud computing map-reduce implementation for radiotherapy. RUBEN JIMENEZ & HECTOR MIRAS at Big Data Spain 2012](https://reader035.vdocument.in/reader035/viewer/2022081414/549d0448ac7959f12a8b48fd/html5/thumbnails/19.jpg)
CloudMC: Elasticity
Users choose the number of instances to use for each simulation.
CloudMC scales up the worker role to run a simulation and scales it down when the simulation finishes.
Windows Azure Service Management allows role scaling:
👍 REST API
👍 Based on XML config files
👎 Minimum of 1 instance
👎 Impossible to scale down specific instances (multi-tenant)
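A minimal sketch of XML-based scaling, assuming a simplified service configuration in which each role carries an `Instances` element with a `count` attribute. The fragment omits the XML namespace of a real Azure configuration file, the role name `Worker` is hypothetical, and the REST call that would submit the updated file is not shown.

```python
import xml.etree.ElementTree as ET

def set_instance_count(config_xml, role_name, count):
    """Return the configuration with the instance count of one role changed."""
    if count < 1:
        raise ValueError("a role needs at least 1 instance")
    root = ET.fromstring(config_xml)
    for role in root.iter("Role"):
        if role.get("name") == role_name:
            role.find("Instances").set("count", str(count))
            return ET.tostring(root, encoding="unicode")
    raise KeyError(role_name)

# Simplified, namespace-free fragment used only for this illustration.
config = (
    '<ServiceConfiguration serviceName="demo">'
    '<Role name="Worker"><Instances count="1" /></Role>'
    '</ServiceConfiguration>'
)

scaled_up = set_instance_count(config, "Worker", 64)  # before the simulation
scaled_down = set_instance_count(scaled_up, "Worker", 1)  # after the merge
print(scaled_up)
```

The count applies to the role as a whole, which matches the two drawbacks listed on the slide: it cannot go below 1 and it cannot name the instances to remove.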
![Page 20: CloudMC: A cloud computing map-reduce implementation for radiotherapy. RUBEN JIMENEZ & HECTOR MIRAS at Big Data Spain 2012](https://reader035.vdocument.in/reader035/viewer/2022081414/549d0448ac7959f12a8b48fd/html5/thumbnails/20.jpg)
CloudMC: How did Radarc help us?
[Architecture diagram; labels: Worker Roles, UI, Service Management, Simulation files, Messages Queues, User accounts, Cloud Storage, Cloud Hosted Services, SQL Azure, Users & Simulation, Repositories, Provisioning, MapReduce Factory, Entities, Services]
Formula Azure
≃ 50% generated code:
• ASP.NET MVC 3 UI
• C# App Services
• C# POCO Entities
• EF Code First
• SQL Azure DB
Focus on domain core: map/reduce, provisioning, fault tolerance, etc.
![Page 21: CloudMC: A cloud computing map-reduce implementation for radiotherapy. RUBEN JIMENEZ & HECTOR MIRAS at Big Data Spain 2012](https://reader035.vdocument.in/reader035/viewer/2022081414/549d0448ac7959f12a8b48fd/html5/thumbnails/21.jpg)
CloudMC: Results
Case study:
Simulation: ¹²⁵I seed in ophthalmic applicator.
Number of histories: 3·10⁹
MC code: PENELOPE, main program penEasy.
Results:
Worker instance size: extra-small
Clock time in 1 instance: 30 h
Clock time in 64 instances: 48 min (speed-up = 37x)
![Page 22: CloudMC: A cloud computing map-reduce implementation for radiotherapy. RUBEN JIMENEZ & HECTOR MIRAS at Big Data Spain 2012](https://reader035.vdocument.in/reader035/viewer/2022081414/549d0448ac7959f12a8b48fd/html5/thumbnails/22.jpg)
CloudMC: Results
Time vs number of instances study
T(n): clock time for 1 simulation in n instances.
tcpu: overall time used only in the simulation of n histories.
Δt0: non-parallelizable time for 1 instance.
a: non-parallelizable part of time proportional to n.
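The formula shown on this slide did not survive in the transcript. The definitions suggest a model of the form T(n) = tcpu / n + Δt0 + a·n, and the sketch below works under that assumption only. The measured clock times are those of the case study; the values chosen for Δt0 and a are hypothetical, picked so that tcpu = 100 h gives T(100) = 1.44 h as in the cost example.

```python
import math

def clock_time(n, t_cpu, dt0, a):
    """Assumed model: T(n) = t_cpu / n + dt0 + a * n (all times in hours)."""
    return t_cpu / n + dt0 + a * n

def best_instance_count(t_cpu, a):
    """n that minimises the assumed T(n): dT/dn = 0 gives n = sqrt(t_cpu / a)."""
    return math.sqrt(t_cpu / a)

# Measured in the case study: 30 h on 1 instance, 48 min on 64 instances.
measured_speed_up = 30 / (48 / 60)
efficiency = measured_speed_up / 64  # fraction of the ideal 64x speed-up

# Hypothetical parameters, chosen only so that T(100) = 1.44 h for t_cpu = 100 h.
t_cpu, dt0, a = 100.0, 0.24, 0.002
print(clock_time(100, t_cpu, dt0, a))
print(best_instance_count(t_cpu, a))
print(measured_speed_up, efficiency)
```

Under this model the term a·n eventually outweighs the gain from tcpu / n, so adding instances beyond the minimum of T(n) makes the simulation slower as well as more expensive.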
![Page 23: CloudMC: A cloud computing map-reduce implementation for radiotherapy. RUBEN JIMENEZ & HECTOR MIRAS at Big Data Spain 2012](https://reader035.vdocument.in/reader035/viewer/2022081414/549d0448ac7959f12a8b48fd/html5/thumbnails/23.jpg)
CloudMC: Is it reinventing the wheel?
Why not use Amazon Elastic MapReduce? (http://aws.amazon.com/es/elasticmapreduce)
• Our mapper and reducer were written for .NET
http://stackoverflow.com/questions/1190520/is-it-possible-to-write-map-reduce-jobs-for-amazon-elastic-mapreduce-using-net
Why not use Hadoop On Azure? (http://www.hadooponazure.com)
• First preview released in 2012.
• The cluster size must be reserved.
![Page 24: CloudMC: A cloud computing map-reduce implementation for radiotherapy. RUBEN JIMENEZ & HECTOR MIRAS at Big Data Spain 2012](https://reader035.vdocument.in/reader035/viewer/2022081414/549d0448ac7959f12a8b48fd/html5/thumbnails/24.jpg)
Roadmap
Testing with more MC applications: Geant4, EGSnrc, etc.
Support packages with specific MapReduce implementations
• Application to different domains
• Use of MEF to provide Mappers and Reducers in simulation packages
SDK to develop specific MapReduce implementation packages.
• Visual Studio Templates could facilitate the development of CloudMC packages
Enable multi-tenant environments
• Concurrent simulations require scaling down specific instances, which is not possible on Windows Azure.
![Page 25: CloudMC: A cloud computing map-reduce implementation for radiotherapy. RUBEN JIMENEZ & HECTOR MIRAS at Big Data Spain 2012](https://reader035.vdocument.in/reader035/viewer/2022081414/549d0448ac7959f12a8b48fd/html5/thumbnails/25.jpg)
Questions
![Page 26: CloudMC: A cloud computing map-reduce implementation for radiotherapy. RUBEN JIMENEZ & HECTOR MIRAS at Big Data Spain 2012](https://reader035.vdocument.in/reader035/viewer/2022081414/549d0448ac7959f12a8b48fd/html5/thumbnails/26.jpg)
CloudMC soon available at:
https://cloudmontecarlo.cloudapp.net
Thank you for your attention …
@hmiras
@rjimenez