adam e science_2013_1
![Page 1: Adam e science_2013_1](https://reader034.vdocument.in/reader034/viewer/2022052522/5480b3b3b379593a2b8b5aab/html5/thumbnails/1.jpg)
WS-VLAM Workflow Management System and its Applications
Adam Belloum, Institute of Informatics, University of Amsterdam [email protected]
Lunch meeting, Netherlands eScience Center, Amsterdam 2013
![Page 2: Adam e science_2013_1](https://reader034.vdocument.in/reader034/viewer/2022052522/5480b3b3b379593a2b8b5aab/html5/thumbnails/2.jpg)
Outline
• Introduction
– Life cycle of e-Science Workflow
• Different approaches to workflow scheduling
– Workflow Process Modeling & Management in Grid/Cloud
– Workflow and Web services (intrusive/non-intrusive)
• Provenance
• Computing in the browser
• Conclusions
![Page 3: Adam e science_2013_1](https://reader034.vdocument.in/reader034/viewer/2022052522/5480b3b3b379593a2b8b5aab/html5/thumbnails/3.jpg)
Workflow Management Systems
A workflow management system coordinates the execution of scientific applications on a set of distributed computing resources.
Computing task management
Grid/Cloud middleware
Data Security
Data provenance
Workflow Engine
Repositories
Data Management
http://www.youtube.com/watch?v=R6bTFrzaR_w&feature=player_embedded
![Page 4: Adam e science_2013_1](https://reader034.vdocument.in/reader034/viewer/2022052522/5480b3b3b379593a2b8b5aab/html5/thumbnails/4.jpg)
Life Cycle of a Scientific Experiment
(1) Problem investigation:
• Look for relevant problems
• Browse available tools
• Define the goal
• Decompose into steps
(2) Experiment prototyping:
• Design experiment workflows
• Develop necessary components
(3) Experiment execution:
• Execute experiment processes
• Control the execution
• Collect and analyse data
(4) Results publication:
• Annotate data
• Publish data
Shared repositories
Collaborative e-Science experiments: from scientific workflow to knowledge sharing, A.S.Z. Belloum, Vladimir Korkhov, Spiros Koulouzis, Marcia A. Inda, and Marian Bubak, IEEE Internet Computing, vol. 15, no. 4, pp. 39-47, July-Aug. 2011, doi: 10.1109/MIC.2011.87
![Page 5: Adam e science_2013_1](https://reader034.vdocument.in/reader034/viewer/2022052522/5480b3b3b379593a2b8b5aab/html5/thumbnails/5.jpg)
Model of computation
• Target: stream-based applications
– Engine co-allocates all workflow components
• Communication: time-coupled
– Assumes components are running simultaneously
– Synchronized p2p
WS-VLAM Engine: Architecture (1st generation)
Service host(s) compute element(s)
GRAM services
GT4 Java Container
RTSM Factory
Delegation service
Worker nodes
Pre/ws-GRAM/GT5
Client
Delegate
RTSM Instance
Workflow components
![Page 6: Adam e science_2013_1](https://reader034.vdocument.in/reader034/viewer/2022052522/5480b3b3b379593a2b8b5aab/html5/thumbnails/6.jpg)
WS-VLAM Features
• Provides streaming facilities between applications executed on geographically distributed resources.
• Composition and execution of hierarchical workflows.
• Remote graphical output.
• Detach/attach capability for long-running workflows.
• Provides monitoring facilities based on WS-Notification.
• Provides workflow farming possibilities.
More features: http://staff.science.uva.nl/~gvlam/wsvlam/demos/wsvlam-about.html
The SigWin-Detector workflow was developed in the VL-e project to detect ridges in, for instance, a gene expression sequence or the human transcriptome map (BMC Research Notes 2008, 1:63, doi:10.1186/1756-0500-1-63).
DNA curvature of the Escherichia coli chromosome
![Page 7: Adam e science_2013_1](https://reader034.vdocument.in/reader034/viewer/2022052522/5480b3b3b379593a2b8b5aab/html5/thumbnails/7.jpg)
Easy Deployment
Workflow composer (Java Web Start)
• WS-VLAM composer
• VBrowser
• Semantic tools
– SAW: Semantic Annotation for Workflow
– CLAMP: Connecting LAnguage for Modules & Programs
– HAMMER: Hybrid-bAsed MatchMaker for e-Science Resources
SARA: national supercomputing center
Server host
Production Grid
Experimental Environment
Storage
WSRF Services (2 services)
– WS-VLAM engine
– workflow component repository
![Page 8: Adam e science_2013_1](https://reader034.vdocument.in/reader034/viewer/2022052522/5480b3b3b379593a2b8b5aab/html5/thumbnails/8.jpg)
WS-VLAM composer
• The workflow engine may be invoked from other systems such as Taverna, Kepler, and P-GRADE
• Workflows may be made available to the entire community using a Web 2.0 approach
Workflow sharing and re-usability
![Page 9: Adam e science_2013_1](https://reader034.vdocument.in/reader034/viewer/2022052522/5480b3b3b379593a2b8b5aab/html5/thumbnails/9.jpg)
Comparing to other Workflow Management Systems
• WS-VLAM was studied in 2010 by Elts and Bungartz, of the Institute of Informatics at the Technical University of Munich, as a potential platform for PhD work
Grid-Workflow-Management-Systeme für die Ausführung wissenschaftlicher Prozessabläufe, Ekaterina Elts, Hans-Joachim Bungartz, http://staff.science.uva.nl/~gvlam/wsvlam/Publications/workflow_review_Elts.pdf
![Page 10: Adam e science_2013_1](https://reader034.vdocument.in/reader034/viewer/2022052522/5480b3b3b379593a2b8b5aab/html5/thumbnails/10.jpg)
WS-VLAM Compared to other Workflow Management Systems
by Ekaterina Elts, TU München
![Page 11: Adam e science_2013_1](https://reader034.vdocument.in/reader034/viewer/2022052522/5480b3b3b379593a2b8b5aab/html5/thumbnails/11.jpg)
WS-VLAM Compared to other Workflow Management Systems (continued)
by Ekaterina Elts, TU München
![Page 12: Adam e science_2013_1](https://reader034.vdocument.in/reader034/viewer/2022052522/5480b3b3b379593a2b8b5aab/html5/thumbnails/12.jpg)
WS-VLAM Engine: Architecture (2nd generation)
• Target: loosely coupled applications
– components are scheduled depending on data
– components are only activated when data is available
– no need for co-allocation
• Communication: time-decoupled
– messaging communication system
– components are not synchronized
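The data-driven activation described on this slide can be illustrated with a small sketch. This is not the WS-VLAM engine: the `Component` class, the `run_data_driven` function, and the in-memory queues standing in for the messaging system are all assumptions made for illustration. The point is only that a component runs when a message is waiting in its inbox, so nothing has to be co-allocated up front.

```python
from collections import deque


class Component:
    """Hypothetical workflow component with a time-decoupled input queue."""

    def __init__(self, name, func, downstream=None):
        self.name = name
        self.func = func              # work done on one message
        self.downstream = downstream  # name of the next component, or None
        self.inbox = deque()          # stands in for the messaging system


def run_data_driven(components, source_name, inputs):
    """Activate a component only while its inbox holds data."""
    by_name = {c.name: c for c in components}
    by_name[source_name].inbox.extend(inputs)
    results, activations = [], []
    # Keep scheduling as long as any component has pending data.
    while any(c.inbox for c in components):
        for comp in components:
            if not comp.inbox:
                continue  # no data available, so no activation
            message = comp.inbox.popleft()
            activations.append(comp.name)
            output = comp.func(message)
            if comp.downstream is None:
                results.append(output)
            else:
                by_name[comp.downstream].inbox.append(output)
    return results, activations


pipeline = [
    Component("read", lambda x: x * 2, downstream="align"),
    Component("align", lambda x: x + 1),
]
results, activations = run_data_driven(pipeline, "read", [1, 2, 3])
```

In this toy run the `align` component is activated exactly once per message produced by `read`, never earlier.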
![Page 13: Adam e science_2013_1](https://reader034.vdocument.in/reader034/viewer/2022052522/5480b3b3b379593a2b8b5aab/html5/thumbnails/13.jpg)
WS-VLAM Engine: Architecture (2nd generation)
Data-driven workflow coordination
Reginald Cushing, Spiros Koulouzis, Adam S. Z. Belloum, Marian Bubak, Prediction-based Auto-scaling of Scientific Workflows, MGC’2011, December 12th, 2011, Lisbon, Portugal
![Page 14: Adam e science_2013_1](https://reader034.vdocument.in/reader034/viewer/2022052522/5480b3b3b379593a2b8b5aab/html5/thumbnails/14.jpg)
Farming with WS-VLAM
• Task farming: task replication
– parameter sweep applications
– DNA sequencing
– Monte Carlo
– …
• Implements 3 types of farming:
– Auto farming: the number of tasks/services to run is proportional to the load
– One-to-one farming: a task is replicated for every message received
– Fixed farming: user-defined farming
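The three farming types can be read as three rules for choosing a replica count. The sketch below is one possible reading, not the WS-VLAM implementation: the function name `replicas` and the tuning knobs `messages_per_task` and `max_tasks` are hypothetical, and "proportional to the load" is interpreted as a capped ratio of queued messages to a per-task quota.

```python
import math


def replicas(policy, queued_messages, fixed=1, messages_per_task=10, max_tasks=100):
    """Return how many task instances to run under a farming policy.

    messages_per_task and max_tasks are illustrative knobs, not WS-VLAM settings.
    """
    if policy == "fixed":
        # Fixed farming: the user decides the number of replicas.
        return fixed
    if policy == "one-to-one":
        # One-to-one farming: one replica per message received.
        return queued_messages
    if policy == "auto":
        # Auto farming: replicas grow in proportion to the load, up to a cap.
        wanted = math.ceil(queued_messages / messages_per_task)
        return min(max_tasks, wanted)
    raise ValueError(f"unknown farming policy: {policy}")
```

With these assumed defaults, 95 queued messages give 10 replicas under auto farming and 95 under one-to-one farming.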
![Page 15: Adam e science_2013_1](https://reader034.vdocument.in/reader034/viewer/2022052522/5480b3b3b379593a2b8b5aab/html5/thumbnails/15.jpg)
Farming and Auto-scaling of Workflows
Reginald Cushing, Spiros Koulouzis, Adam S. Z. Belloum, Marian Bubak, Prediction-based Auto-scaling of Scientific Workflows, Proceedings of the 9th International Workshop on Middleware for Grids, Clouds and e-Science, ACM/IFIP/USENIX December 12th, 2011, Lisbon, Portugal
![Page 16: Adam e science_2013_1](https://reader034.vdocument.in/reader034/viewer/2022052522/5480b3b3b379593a2b8b5aab/html5/thumbnails/16.jpg)
Workflow as a Service (WFaaS)
Reduce Scheduling Overhead
Workflow as a Service: An Approach to Workflow Farming, Reginald Cushing, Adam S. Z. Belloum, V. Korkhov, D. Vasyunin, M.T. Bubak, C. Leguy ECMLS’12, June 18, 2012, Delft, The Netherlands
![Page 17: Adam e science_2013_1](https://reader034.vdocument.in/reader034/viewer/2022052522/5480b3b3b379593a2b8b5aab/html5/thumbnails/17.jpg)
Resource on-demand for Applications with Hard Deadline (Urgent Computing)
Resource on-demand using multiple cloud providers, Supercomputing 2010 and SCALE 2012
![Page 18: Adam e science_2013_1](https://reader034.vdocument.in/reader034/viewer/2022052522/5480b3b3b379593a2b8b5aab/html5/thumbnails/18.jpg)
Outline
Life cycle: (1) Problem investigation, (2) Experiment prototyping, (3) Experiment execution, (4) Results publication; shared repositories
• Introduction
– Life cycle of e-Science Workflow
• Different approaches to workflow scheduling
– Workflow Process Modeling & Management in Grid/Cloud
– Workflow and Web services (intrusive/non-intrusive)
• Provenance
• Computing in the browser
• Conclusions
![Page 19: Adam e science_2013_1](https://reader034.vdocument.in/reader034/viewer/2022052522/5480b3b3b379593a2b8b5aab/html5/thumbnails/19.jpg)
Web Services in eScience with WS-VLAM
• Web services (WS) offer interoperability and flexibility in a large-scale distributed environment.
• WS can be combined in a workflow to achieve more complex operations, but any workflow implementation potentially faces data transport problems or overloaded services.
Scale up the number of Web services to keep up with the incoming load
![Page 20: Adam e science_2013_1](https://reader034.vdocument.in/reader034/viewer/2022052522/5480b3b3b379593a2b8b5aab/html5/thumbnails/20.jpg)
Scaling up the number of Web services
• Tasks/jobs can be queued on the runqueue.
• The submission service listens on the runqueue and picks up new tasks to submit.
• Resources such as Grid or Cloud are abstracted using submitter plugins
– Enabling a new resource is a matter of writing its submitter (Condor, Ibis, …)
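The run queue and submitter plugins can be sketched as a small plugin interface. Everything named here is hypothetical: the `Submitter` base class, the two toy subclasses, and the `drain` function are illustrations of the pattern, and the real submitters (for Condor, Ibis, and so on) would call the respective middleware instead of returning a string.

```python
import queue
from abc import ABC, abstractmethod


class Submitter(ABC):
    """Hypothetical plugin interface: one subclass per resource type."""

    @abstractmethod
    def submit(self, task: str) -> str:
        """Send a task to the resource and return a receipt."""


class LocalSubmitter(Submitter):
    def submit(self, task: str) -> str:
        return f"local:{task}"


class CloudSubmitter(Submitter):
    def submit(self, task: str) -> str:
        return f"cloud:{task}"


def drain(run_queue, submitters):
    """Pick up queued tasks and hand each one to the submitter for its resource."""
    receipts = []
    while True:
        try:
            resource, task = run_queue.get_nowait()
        except queue.Empty:
            return receipts
        receipts.append(submitters[resource].submit(task))


run_queue = queue.Queue()
run_queue.put(("local", "align-1"))
run_queue.put(("cloud", "align-2"))
receipts = drain(run_queue, {"local": LocalSubmitter(), "cloud": CloudSubmitter()})
```

Under this design, enabling a new resource means adding one `Submitter` subclass and one dictionary entry; the queue-handling code does not change.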
Reginald Cushing, Spiros Koulouzis, Adam S. Z. Belloum, Marian Bubak, Dynamic Handling for Cooperating Scientific Web Services, 7th IEEE International Conference on e-Science, December 2011, Stockholm, Sweden
![Page 21: Adam e science_2013_1](https://reader034.vdocument.in/reader034/viewer/2022052522/5480b3b3b379593a2b8b5aab/html5/thumbnails/21.jpg)
Sequence Alignment Use Case
• Workflow with 2 pipelines. The pipelines perform sequence alignments using data from UniProtKB.
• Each pipeline performs 22500 alignments, i.e. 45100 total alignments in all.
• All modules are standard web services hosted in the modified Axis2 container.
• The alignments were performed using the BioJava API.
• Source and sink are part of the bootstrapping sequence.
– The source submits the getSequenceId service
– while the sink waits for output from the htmlRenderer
Reginald Cushing, Spiros Koulouzis, Adam S. Z. Belloum, Marian Bubak, Dynamic Handling for Cooperating Scientific Web Services, 7th IEEE International Conference on e-Science, December 2011, Stockholm, Sweden
![Page 22: Adam e science_2013_1](https://reader034.vdocument.in/reader034/viewer/2022052522/5480b3b3b379593a2b8b5aab/html5/thumbnails/22.jpg)
Scale up web services
Peaks in load (left) result in peaks in instances (right).
The fuzzy controllers scale up the web services to meet the demand.
![Page 23: Adam e science_2013_1](https://reader034.vdocument.in/reader034/viewer/2022052522/5480b3b3b379593a2b8b5aab/html5/thumbnails/23.jpg)
Enabling web services to consume and produce large distributed datasets
• In service orchestration, all data is passed to the workflow engine before it is delivered to a consuming WS
• Data transfers are made through SOAP, which is unfit for large data transfers
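A rough way to see the cost of routing data through the engine: in orchestration every intermediate dataset crosses the network twice, once from the producing service to the engine and once from the engine to the consumer. The function below only does that bookkeeping for a linear chain of services. Its name and parameters are hypothetical, and the `via_engine=False` case is an assumed point of comparison (direct service-to-service transfer), not a description of the authors' solution.

```python
def bytes_moved(payload_sizes, via_engine=True):
    """Total bytes crossing the network for a linear chain of services.

    payload_sizes[i] is the size of the dataset passed from service i to
    service i + 1. With via_engine=True each dataset travels
    producer -> engine -> consumer; otherwise producer -> consumer.
    """
    transfers_per_dataset = 2 if via_engine else 1
    return transfers_per_dataset * sum(payload_sizes)
```

For two intermediate datasets of 100 and 50 units, this gives 300 units moved through the engine against 150 for direct transfer.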
Enabling web services to consume and produce large distributed datasets, Spiros Koulouzis, Reginald Cushing, Konstantinos Karasavvas, Adam Belloum, Marian Bubak, IEEE Internet Computing, Jan/Feb 2012 (to be published)
![Page 24: Adam e science_2013_1](https://reader034.vdocument.in/reader034/viewer/2022052522/5480b3b3b379593a2b8b5aab/html5/thumbnails/24.jpg)
Enabling web services to consume and produce large distributed datasets
Enabling web services to consume and produce large distributed datasets, Spiros Koulouzis, Reginald Cushing, Konstantinos Karasavvas, Adam Belloum, Marian Bubak, IEEE Internet Computing, Jan/Feb 2012 (to be published)
Indexing Web services for information retrieval (NER) are tools that help biologists identify and retrieve information
• Index 8.4 GB of MEDLINE documents
![Page 25: Adam e science_2013_1](https://reader034.vdocument.in/reader034/viewer/2022052522/5480b3b3b379593a2b8b5aab/html5/thumbnails/25.jpg)
Outline
Life cycle: (1) Problem investigation, (2) Experiment prototyping, (3) Experiment execution, (4) Results publication; shared repositories
• Introduction
– Life cycle of e-Science Workflow
• Different approaches to workflow scheduling
– Workflow Process Modeling & Management in Grid/Cloud
– Workflow and Web services (intrusive/non-intrusive)
• Provenance
• Computing in the browser
• Conclusions
![Page 26: Adam e science_2013_1](https://reader034.vdocument.in/reader034/viewer/2022052522/5480b3b3b379593a2b8b5aab/html5/thumbnails/26.jpg)
Provenance / Reproducibility
• “A complete provenance record for a data object allows the possibility to reproduce the result and reproducibility is a critical component of the scientific method”
• Provenance: the recording of metadata and provenance information during the various stages of the workflow lifecycle
Workflows and e-Science: An overview of workflow system features and capabilities, Ewa Deelman, Dennis Gannon, Matthew Shields, Ian Taylor, Future Generation Computer Systems 25 (2009) 528-540
![Page 27: Adam e science_2013_1](https://reader034.vdocument.in/reader034/viewer/2022052522/5480b3b3b379593a2b8b5aab/html5/thumbnails/27.jpg)
History-tracing XML (FH Aachen)
Provides data/process provenance following an approach that
• maps the workflow graph to a layered structure of an XML document.
• This allows an intuitive and easily processable representation of the workflow execution path.
• Workflow components can eventually be electronically signed.
<patternMatch>
  <events>
    <PortResolved>provenance data</PortResolved>
    <ConDone>provenance data</ConDone>
    ...
  </events>
  <fileReader2>
    <events> ... </events>
    <sign-fileReader2> ... </sign-fileReader2>
  </fileReader2>
  <sffToFasta> Reference </sffToFasta>
  <sign-patternMatch> ... </sign-patternMatch>
</patternMatch>
M. Gerards, Adam S. Z. Belloum, F. Berritz, V. Sander, S. Skorupa, A History-tracing XML-based Provenance Framework for Workflows, WORKS 2010, New Orleans, USA, November 2010
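The mapping from workflow graph to layered XML shown on this slide can be sketched with Python's standard `xml.etree.ElementTree`. This is a minimal illustration, not the framework's code: the dictionary layout of a workflow node and the function name `to_provenance_xml` are assumptions, and the signature element is left as an empty placeholder. Element names follow the slide's example.

```python
import xml.etree.ElementTree as ET


def to_provenance_xml(node):
    """Map a workflow node to a nested XML element (hypothetical node layout).

    node = {"name": str, "events": [(tag, text), ...], "inputs": [node, ...]}
    """
    elem = ET.Element(node["name"])
    events = ET.SubElement(elem, "events")
    for tag, text in node.get("events", []):
        ET.SubElement(events, tag).text = text
    # Upstream components are nested inside the component that consumed them,
    # which gives the layered structure of the document.
    for upstream in node.get("inputs", []):
        elem.append(to_provenance_xml(upstream))
    # Empty placeholder where an electronic signature could be stored.
    ET.SubElement(elem, "sign-" + node["name"])
    return elem


workflow = {
    "name": "patternMatch",
    "events": [("PortResolved", "provenance data"), ("ConDone", "provenance data")],
    "inputs": [{"name": "fileReader2", "events": [], "inputs": []}],
}
root = to_provenance_xml(workflow)
xml_text = ET.tostring(root, encoding="unicode")
```

Because each upstream component sits inside its consumer, walking from the root element inwards retraces the execution path backwards from the final result.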
![Page 28: Adam e science_2013_1](https://reader034.vdocument.in/reader034/viewer/2022052522/5480b3b3b379593a2b8b5aab/html5/thumbnails/28.jpg)
Wave Propagation in Blood Flow
[Biomedical Engineering, Cardiovascular Biomechanics group, TUE]
Wave propagation model of blood flow in large vessels using an approximate velocity profile function: a biomedical study for which 3000 runs were required to perform a global sensitivity analysis of blood pressure wave propagation in arteries.
User interface to compose the workflow (top right), monitor the execution of the farmed workflows (top left), and monitor each run separately (bottom left).
Query interface for the provenance data collected from 3000 simulations of the “wave propagation model of blood flow in large vessels using an approximate velocity profile function”.
BigGrid project 2009, presented at the EGI/BigGrid technical forum 2010
![Page 29: Adam e science_2013_1](https://reader034.vdocument.in/reader034/viewer/2022052522/5480b3b3b379593a2b8b5aab/html5/thumbnails/29.jpg)
Alignment of DNA Sequence (Blast)
For each workflow run:
• The provenance data is collected and stored following the XML-tracing system
• The user interface allows events that occurred at runtime to be reproduced (replay mode)
• The user interface can be customized (the user can select the events to track)
• The user interface shows resource usage
The aim of the application is the alignment of DNA sequence data with a given reference database.
Ongoing work: UvA, AMC, FH Aachen
[Department of Clinical Epidemiology, Biostatistics and Bioinformatics (KEBB), AMC ]
![Page 30: Adam e science_2013_1](https://reader034.vdocument.in/reader034/viewer/2022052522/5480b3b3b379593a2b8b5aab/html5/thumbnails/30.jpg)
Sensitivity Analysis of Cardiovascular Models
The study covers 164 patient-specific input parameters (e.g. vessel diameter and length).
• 1,494,000 Monte Carlo runs are needed for this first analysis.
• Expected execution time on a PC: 14,525 hours (about 20 months)
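The figures on this slide can be checked with a few lines of arithmetic, reading the execution time as 14,525 hours. The per-run time is derived here, not stated on the slide, and `ideal_hours` is a hypothetical helper that assumes perfect speed-up with no scheduling or data-transfer overhead, which real farming will not achieve.

```python
RUNS = 1_494_000      # Monte Carlo runs, from the slide
TOTAL_HOURS = 14_525  # expected execution time on one PC, from the slide

# Implied average cost of a single run, in seconds.
seconds_per_run = TOTAL_HOURS * 3600 / RUNS

# Sequential duration in calendar terms (30-day months, a rough convention).
days = TOTAL_HOURS / 24
months = days / 30


def ideal_hours(workers):
    """Wall-clock hours if the runs were spread evenly over `workers` machines.

    Assumes perfect speed-up, so this is a lower bound, not a prediction.
    """
    return TOTAL_HOURS / workers
```

The implied cost is 35 seconds per run, and about 605 days (roughly 20 months) if the runs execute one after another, which is the motivation for farming the runs across many resources.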
FP7 Project VPH-Share “Virtual Physiological Human: Sharing for Healthcare”
![Page 31: Adam e science_2013_1](https://reader034.vdocument.in/reader034/viewer/2022052522/5480b3b3b379593a2b8b5aab/html5/thumbnails/31.jpg)
Protein Folding
![Page 32: Adam e science_2013_1](https://reader034.vdocument.in/reader034/viewer/2022052522/5480b3b3b379593a2b8b5aab/html5/thumbnails/32.jpg)
Outline
Life cycle: (1) Problem investigation, (2) Experiment prototyping, (3) Experiment execution, (4) Results publication; shared repositories
• Introduction
– Life cycle of e-Science Workflow
• Different approaches to workflow scheduling
– Workflow Process Modeling & Management in Grid/Cloud
– Workflow and Web services (intrusive/non-intrusive)
• Provenance
• Computing in the browser
• Conclusions
![Page 33: Adam e science_2013_1](https://reader034.vdocument.in/reader034/viewer/2022052522/5480b3b3b379593a2b8b5aab/html5/thumbnails/33.jpg)
Computing in the Browser (a different approach to desktop Grids)
• Objectives
– Distributed computing using web browsers
• Features:
– Instant volunteer computing (no third-party software installation)
• How does it work:
– Social media mediates the trust between the user and the volunteers asked to join the network.
– A user with a distributed application uses social media to get colleagues and friends to donate CPU.
– Colleagues and friends join the network by simply opening the shared URL.
– Computing starts almost instantly.
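The join-and-compute flow above can be simulated in a few lines. This is a toy model under loud assumptions: the `Coordinator` class and its methods are hypothetical, opening the shared URL is modelled as a single `join` call, and the browsers are replaced by a Python loop. The actual system runs its tasks as JavaScript inside the volunteers' browsers.

```python
from collections import deque


class Coordinator:
    """Hypothetical server-side task pool that volunteer browsers would poll."""

    def __init__(self, tasks):
        self.pending = deque(tasks)
        self.volunteers = set()
        self.results = {}

    def join(self, volunteer_id):
        # Opening the shared URL is modelled as a single join call.
        self.volunteers.add(volunteer_id)

    def next_task(self, volunteer_id):
        # Only volunteers that joined receive work; None means nothing to do.
        if volunteer_id not in self.volunteers or not self.pending:
            return None
        return self.pending.popleft()

    def report(self, task, result):
        self.results[task] = result


coordinator = Coordinator(tasks=range(5))
friends = ("alice", "bob")
for friend in friends:
    coordinator.join(friend)

# Volunteers take turns pulling tasks until everyone finds the pool empty.
idle = set()
while len(idle) < len(friends):
    for friend in friends:
        task = coordinator.next_task(friend)
        if task is None:
            idle.add(friend)
        else:
            coordinator.report(task, task * task)  # stand-in for real work
```

A pull model like this lets volunteers come and go freely, since a browser that never asks for work is simply never given any.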
![Page 34: Adam e science_2013_1](https://reader034.vdocument.in/reader034/viewer/2022052522/5480b3b3b379593a2b8b5aab/html5/thumbnails/34.jpg)
Computing in the Browser
Distributed computing on an Ensemble of Browsers, R. Cushing, G.a Putra, S. Koulouzis, A.S.Z Belloum, M.T. Bubak, C. de Laat IEEE Internet Computing, PrePress 10.1109/MIC.2013.3, January 2013
• The figure shows the distribution of 33,000 bio-informatics tasks on the global cluster of browsers.
Application
• Computing 33,000 bio-informatics tasks on the global cluster of browsers
• The experiment was announced via social media tools (Twitter, Facebook, LinkedIn) and project mailing lists.
• Volunteers were asked to open the Weevil web page http://elab.lab.uvalight.net/~weevil/ and agree to donate their CPU for 3 hours on Friday December 2011 from 12:00-14:00
[Figure: completed tasks and estimated aggregated power (GFLOPS) plotted against time (HH:MM)]
New Web technologies make JavaScript engines more powerful:
• Web Workers
• WebSocket
• WebGL
• WebCL
![Page 35: Adam e science_2013_1](https://reader034.vdocument.in/reader034/viewer/2022052522/5480b3b3b379593a2b8b5aab/html5/thumbnails/35.jpg)
Conclusions
• WS-VLAM offers features (farming, automatic scaling, hierarchical workflows, provenance, …) that have proved useful for a number of scientific applications
• WS-VLAM harnesses various types of computing resources (desktops, Grid, and Cloud resources)
• WS-VLAM has a modular design, which makes it easy to adapt and extend
![Page 36: Adam e science_2013_1](https://reader034.vdocument.in/reader034/viewer/2022052522/5480b3b3b379593a2b8b5aab/html5/thumbnails/36.jpg)
http://www.science.n/~gvlam/wsvlam/
http://www.commit-nl.nl/