![Page 1: Starting Work ow Tasks Before They’re Readyescience-2016.idies.jhu.edu/wp-content/uploads/2016/11/... · 2016-11-01 · Starting Work ow Tasks Before They’re Ready Wladislaw Gusew,](https://reader034.vdocument.in/reader034/viewer/2022042317/5f066ebf7e708231d417f7a4/html5/thumbnails/1.jpg)
Starting Workflow Tasks Before They’re Ready
Wladislaw Gusew, Björn Scheuermann
Computer Engineering Group, Humboldt University of Berlin
Agenda
- Introduction
- Execution semantics
- Methods and tools
- Simulation results
- Experimental results
- Conclusion
Big data in research
Scientific workflow example
- Directed Acyclic Graph (DAG)
- Executed on distributed systems
- Aggregation and broadcast types of tasks
- Demanding on network resources
Execution semantics
- But in reality resources are limited
- Execute only a subset of parent tasks concurrently (insufficient number of workers)
- Congestion of network (all parent tasks have the same priority)
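The effect of too few workers on the classic semantics can be illustrated with a minimal sketch. It assumes that an aggregation task becomes ready only once every parent has finished and that parents are assigned greedily to the next idle worker. The function name, the runtimes, and the greedy assignment are illustrative assumptions, not taken from the slides, and network transfer time is ignored.

```python
import heapq

def aggregation_start_time(parent_runtimes, num_workers):
    """Earliest start of an aggregation task under classic semantics:
    it becomes ready only once every parent task has finished."""
    # Min-heap holding the time at which each worker becomes idle.
    free_at = [0.0] * num_workers
    heapq.heapify(free_at)
    latest_finish = 0.0
    for runtime in parent_runtimes:
        start = heapq.heappop(free_at)   # next worker to become idle
        finish = start + runtime
        latest_finish = max(latest_finish, finish)
        heapq.heappush(free_at, finish)
    return latest_finish

# Eight hypothetical parent tasks of 10 s each.
parents = [10.0] * 8
enough_workers = aggregation_start_time(parents, num_workers=8)
too_few_workers = aggregation_start_time(parents, num_workers=2)
print(enough_workers, too_few_workers)
```

With two workers the parents run in four waves, so the aggregation task waits four times as long as it would with eight workers.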
Example execution
- Network congestion can slow down processing even further (effects of data losses at the transport protocol layer)
- High delay to the start of the aggregation task
- Low performance and high execution costs (e.g., in computation clouds)
What can we do to improve this?
List of actions:
1. Obtain information on the task’s input characteristics
2. Refine the workflow and inform the execution engine
3. Let the aggregation task "feel comfortable" in the changed setting
Obtaining input characteristics
1. Annotations to workflows
2. Manual code review
3. Automated profiling
Automated profiling
- Operating system instrumentation tool
- Enables interception of system calls (file open, read/write, file close)
- Record and evaluate log files with traces of conducted file accesses
Figure: Reads by mAdd in a small workflow (x-axis: execution progress [10^8 CPU cycles]; y-axis: read accesses [MB]).
Figure: Reads by mAdd in a medium-sized workflow (x-axis: execution progress [10^8 CPU cycles]; y-axis: read accesses [MB]).
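Evaluating such a trace could look like the following sketch. The slides do not name the instrumentation tool or its log format, so the strace-like line format, the regular expressions, the function name, and the file names below are all assumptions made for illustration.

```python
import re
from collections import defaultdict

# Assumed strace-like line formats; the actual tool and log format are not given.
OPEN_RE = re.compile(r'^open(?:at)?\(.*?"(?P<path>[^"]+)".*\)\s*=\s*(?P<fd>\d+)$')
READ_RE = re.compile(r'^read\((?P<fd>\d+),.*\)\s*=\s*(?P<n>\d+)$')
CLOSE_RE = re.compile(r'^close\((?P<fd>\d+)\)\s*=\s*0$')

def bytes_read_per_file(lines):
    """Sum the bytes read per file, keyed in order of each file's first read."""
    fd_to_path = {}            # currently open descriptors
    totals = defaultdict(int)  # path -> bytes read so far
    for line in lines:
        line = line.strip()
        if m := OPEN_RE.match(line):
            fd_to_path[int(m["fd"])] = m["path"]
        elif m := READ_RE.match(line):
            path = fd_to_path.get(int(m["fd"]))
            if path is not None:
                totals[path] += int(m["n"])
        elif m := CLOSE_RE.match(line):
            fd_to_path.pop(int(m["fd"]), None)
    return dict(totals)

# Hypothetical trace of an aggregation task reading two inputs one after another.
trace = [
    'openat(AT_FDCWD, "tile_a.fits", O_RDONLY) = 3',
    'read(3, "..."..., 4096) = 4096',
    'read(3, "..."..., 4096) = 1024',
    'close(3) = 0',
    'openat(AT_FDCWD, "tile_b.fits", O_RDONLY) = 3',
    'read(3, "..."..., 4096) = 2048',
    'close(3) = 0',
]
print(bytes_read_per_file(trace))
```

Because the result preserves the order of first access, it also indicates which inputs the task needs early and which it touches only later.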
Refining workflow by transforming DAG
Realizing virtual task split
- Real task is transparently wrapped
- FUSE (Filesystem in Userspace) enables the setup of a virtual file system in user space
- Access to input files is performed through our wrapper
- Wrapper is responsible for maintaining the correct execution logic
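The wrapper's core logic can be sketched without FUSE. The following in-memory stand-in assumes that a read of an input blocks until that input has been produced, so a task started early still sees complete data. The class `InputGate`, its methods, and the thread-based setup are hypothetical and are not the authors' implementation.

```python
import threading

class InputGate:
    """Blocks readers of an input until its producer has published it."""

    def __init__(self, names):
        self._ready = {name: threading.Event() for name in names}
        self._data = {}

    def publish(self, name, payload):
        # Called when a parent task has finished writing its output.
        self._data[name] = payload
        self._ready[name].set()

    def read(self, name, timeout=None):
        # Called by the wrapped task; waits until the input is complete.
        if not self._ready[name].wait(timeout):
            raise TimeoutError(f"input {name!r} not available")
        return self._data[name]

def aggregate(gate, names, out):
    """Toy aggregation task that consumes its inputs one after another."""
    total = 0
    for name in names:
        total += sum(gate.read(name, timeout=5))
    out.append(total)

names = ["p1", "p2", "p3"]
gate = InputGate(names)
result = []
worker = threading.Thread(target=aggregate, args=(gate, names, result))
worker.start()                       # aggregation starts before any input exists
for i, name in enumerate(names, start=1):
    gate.publish(name, [i, i])       # parents finish one by one
worker.join()
print(result)
```

The aggregation thread makes progress as soon as its next input is published, which is the behaviour the virtual file system provides to an unmodified task.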
Evaluation with the Montage workflow
Simulating workflow execution
- Java-based simulation framework for scientific workflows
- Simulates an execution on a Pegasus/HTCondor stack
- Use provided Montage workflows with 25, 50, 100, 1000 tasks
- A Python script performed the DAG transformation of the DAX files
- Network configured as bottleneck (by bandwidth limitation)
W. Chen and E. Deelman, "WorkflowSim: A toolkit for simulating scientific workflows in distributed environments," in eScience’12.
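The slides do not spell out the transformation rules, so the following is only one plausible reading of a virtual task split. It assumes that an aggregation task is replaced by a chain of parts, each waiting for a subset of the parents and for the preceding part. The function, the `#` naming scheme, and the toy DAG (a plain dictionary instead of a DAX file) are assumptions made for illustration.

```python
def split_aggregation(parents_of, task, group_size):
    """Return a new parent map in which `task` is replaced by a chain of parts.

    parents_of maps each task name to the list of its parent task names.
    """
    parents = parents_of[task]
    groups = [parents[i:i + group_size] for i in range(0, len(parents), group_size)]
    if len(groups) <= 1:
        # Nothing to split: return an unchanged copy.
        return {t: list(p) for t, p in parents_of.items()}
    last_part = f"{task}#{len(groups) - 1}"
    new_map = {}
    for t, p in parents_of.items():
        if t != task:
            # Former children of `task` now wait for the final part only.
            new_map[t] = [last_part if x == task else x for x in p]
    previous = None
    for index, group in enumerate(groups):
        part = f"{task}#{index}"
        # Each part needs its own group of parents plus the preceding part.
        new_map[part] = list(group) + ([previous] if previous is not None else [])
        previous = part
    return new_map

# Toy DAG: four parents feed one aggregation task, which feeds one child.
dag = {
    "p1": [], "p2": [], "p3": [], "p4": [],
    "mAdd": ["p1", "p2", "p3", "p4"],
    "mShrink": ["mAdd"],
}
for name, deps in split_aggregation(dag, "mAdd", group_size=2).items():
    print(name, deps)
```

In this reading the first part can be scheduled as soon as `p1` and `p2` are done, while `p3` and `p4` may still be running.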
Simulation results
Variation of number of tasks
Figure: Simulation results for 50 workers and max-min (x-axis: number of tasks, 25 / 50 / 100 / 1000; y-axis: total workflow runtime [s], log scale; series: Normal, Split; bar labels: 15%, 19%, 25%, 31%).
Variation of workers
Figure: Simulation results for Montage100 and min-min (x-axis: number of workers, 5 / 10 / 50 / 100; y-axis: total workflow runtime [s]; series: Normal, Split; bar labels: 10%, 14%, 26%, 25%).
Variation of scheduling algorithms
Figure: Simulation results for Montage100 on 100 workers (x-axis: scheduling algorithm, Min-min / Max-min / Round-robin / HEFT / DHEFT / Random; y-axis: total workflow runtime [s]; series: Normal, Split; bar labels: 25%, 27%, 28%, 25%, 17%, 34%).
Evaluation in a computing cluster
- Small cluster of up to 10 compute nodes
- Intel i7 CPU @ 2.5 GHz, 8 GB RAM, connected to a common network switch with 1 Gbit/s
- Execute Montage 133 workflow in Pegasus/HTCondor
- Network bandwidth was limited at the application layer to 10 Mbit/s
- 10 repetitions, mean values with 95% confidence intervals
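A mean with a 95% confidence interval over 10 repetitions can be computed as in the sketch below. The slides do not state which interval method was used, so the Student-t interval is an assumption, and the runtimes are made-up values for illustration only, not measurements from the talk.

```python
import math
import statistics

# Two-sided 95% Student-t critical value for 9 degrees of freedom (10 samples).
T_975_DF9 = 2.262

def mean_ci95_n10(samples):
    """Return (mean, half-width of the 95% confidence interval) for 10 samples."""
    if len(samples) != 10:
        raise ValueError("critical value is hard-coded for exactly 10 samples")
    mean = statistics.mean(samples)
    # Standard error of the mean, using the sample standard deviation.
    std_err = statistics.stdev(samples) / math.sqrt(len(samples))
    return mean, T_975_DF9 * std_err

# Hypothetical workflow runtimes in seconds from 10 repetitions.
runtimes = [100, 102, 98, 101, 99, 100, 103, 97, 100, 100]
mean, half_width = mean_ci95_n10(runtimes)
print(f"{mean:.1f} s +/- {half_width:.2f} s")
```

Two configurations whose intervals do not overlap differ by more than the run-to-run variation explains at this confidence level.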
Measurement results
Figure: Computing cluster results for 1...10 workers (x-axis: number of computing nodes, 1 to 10; y-axis: total workflow runtime [s]; series: Original Montage133, Transformed Montage133).
Conclusion
- Many "legacy" workflows exist which are executed with classic semantics
- Our approach is applicable to aggregation tasks that are often the most time-intensive tasks in a workflow
- By using DAG transformation, no changes to task implementations and execution engines are required
- Simulation and real experiment show that performance can be improved by up to 15%
- Potential of outperforming the original workflow grows with increasing #workers and #tasks