1
Towards Workflow Ecosystems Through Semantic and Standard Representations
Yolanda Gil
Information Sciences Institute and Department of Computer Science
University of Southern California http://www.isi.edu/~gil
@yolandagil
Daniel Garijo, Oscar Corcho
OEG-DIA Facultad de Informática,
Universidad Politécnica de Madrid http://purl.org/net/dgarijo
@dgarijo,@ocorcho {dgarijo,ocorcho}@fi.upm.es
2
Outline
“Towards Workflow Ecosystems
Through Semantic and Standard Representations”
Motivation
What is a workflow ecosystem
The WEST workflow ecosystem
Semantic and standard representations in WEST
3
Proliferation of Workflow Systems and
Workflow Functions
Workflow design
Workflow validation
Workflow execution
Workflow visualization
Workflow mining
End users need a more fluid way to utilize workflow functions based on initial application requirements and as those requirements evolve over time
4
Workflow ecosystems
Workflow ecosystems are integrations of heterogeneous workflow capabilities that scale up along 3 core dimensions:
• Functional heterogeneity: Diversity of workflow tools and workflow functions integrated, developed by independent parties
• Usage heterogeneity: A workflow output by a tool can be consumed by at least two other workflow tools
• Abstraction heterogeneity: A tool can import abstract or detailed views of a workflow based on the granularity the tool can handle
5
Interoperability of Workflow Systems:
Prior Work
Integrations of workflow systems
• Pegasus: Condor (execution) [Deelman et al 2006], PASOA (provenance) [Miles et al 2007], WINGS (generation) [Gil et al 2007], nanoHUB (creation) [McLennan et al 2013]
• Taverna: Galaxy (execution) [Abouelhoda et al 2012], myExperiment (repository) [De Roure et al 2009], DistillFlow (mining) [Starlinger et al 2012]
Workflow interchange languages
• Provenance: Open Provenance Model (OPM) [Moreau et al 2011], W3C PROV standard [Gil and Miles 2013]
• Workflows: OPMW [Garijo et al 2013], D-PROV [Missier et al 2013], WfProv [Belhajjame et al 2013], P-PLAN [Garijo and Gil 2012]
• IWIR [Plankensteiner et al WORKS’11]
• WS-BPEL, BPMN
6
Workflow ecosystems
Workflow ecosystems are integrations of heterogeneous workflow capabilities that scale up along 3 core dimensions:
• Functional heterogeneity: Diversity of workflow tools and workflow functions integrated, developed by independent parties
– Today: 1-2 workflow tools with 2-3 workflow functions developed by 2-3 institutions
• Usage heterogeneity: A workflow output by a tool can be consumed by at least two other workflow tools
– Today: only one consumer tool, and if there is more than one, they have the same function (e.g., two execution engines)
• Abstraction heterogeneity: A tool can import abstract or detailed views of a workflow based on the granularity the tool can handle
– Today: workflows exported must be fully imported
7
WEST (Workflow Ecosystems through STandards)
8
WEST: Workflow Capabilities and Tools
9
Workflow Capabilities and Tools in WEST:
(1) Workflow Generation
[Gil et al 2011]
10
Workflow Capabilities and Tools in WEST:
(2) Workflow Mapping and Execution
[Torri et al 2012]
[Deelman et al 2005]
[Mattmann et al 2006]
11
Workflow Capabilities and Tools in WEST:
(3) Workflow Mining
[Garijo et al 2013]
12
Workflow Capabilities and Tools in WEST:
(4) Workflow Visualization
[Hoekstra and Groth 2014]
13
Workflow Capabilities and Tools in WEST:
(5) Workflow Browsing
WExp
[Garijo et al 2011]
14
Workflow Capabilities and Tools in WEST:
(6) Workflow Documentation
[Gil et al 2012]
15
Workflow Capabilities and Tools in WEST:
(7) Workflow Sharing Repository
[Garijo et al 2013]
16
WEST: Semantic and Standard Representations
17
Types of Workflows Exchanged
Workflow Template (WT), Workflow Instance (WI), Workflow Execution (WE)
18
Overview of Semantic and Standard
Representations in WEST
[Figure: layered overview of the representations. Labels: OPM, PROV (generic provenance); P-Plan (plan definition, plan execution); OPMW (workflow template, workflow execution).]
19
Overview of Semantic and Standard Representations in WEST:
PROV and OPM
[Moreau et al 2013] [Moreau et al 2011]
20
Overview of Semantic and Standard Representations in WEST:
PROV and P-PLAN
[Moreau et al 2013]
[Garijo and Gil 2013]
21
Overview of Semantic and Standard Representations in WEST:
OPMW
[Garijo et al 2012]
22
Abstraction Heterogeneity:
Mapping Across Models Through Queries
Returning P-Plan from OPMW objects (OPMW → P-Plan):
CONSTRUCT {
  ?activity2 p-plan:isPrecededBy ?activity .
}
WHERE {
  ?activity a opmw:WorkflowTemplateProcess .
  ?activity2 a opmw:WorkflowTemplateProcess .
  ?activity2 opmw:uses / opmw:isGeneratedBy ?activity .
}
Returning PROV from OPMW objects (OPMW → PROV):
CONSTRUCT {
  ?activity a prov:Activity .
  ?activity2 a prov:Activity .
  ?activity2 prov:used ?u1 .
  ?u1 prov:wasGeneratedBy ?activity .
}
WHERE {
  ?activity a opmw:WorkflowExecutionProcess .
  ?activity2 a opmw:WorkflowExecutionProcess .
  ?activity2 opmv:used ?u1 .
  ?u1 opmv:wasGeneratedBy ?activity .
}
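To make the join in the OPMW to P-Plan query concrete, here is a minimal Python sketch that mimics it over an in-memory set of triples. It is an illustration only, not how WEST tools run the mapping (a SPARQL engine would evaluate the query directly). The function name, the tuple encoding of triples, and the `ex:` identifiers are assumptions made for this example.

```python
# Triples are plain (subject, predicate, object) tuples with prefixed names
# kept as strings, so no RDF library is needed for this illustration.
RDF_TYPE = "rdf:type"


def opmw_template_to_pplan(triples):
    """Mimic the OPMW -> P-Plan CONSTRUCT query (hypothetical helper)."""
    triples = set(triples)
    # ?x a opmw:WorkflowTemplateProcess
    processes = {s for (s, p, o) in triples
                 if p == RDF_TYPE and o == "opmw:WorkflowTemplateProcess"}
    # ?artifact opmw:isGeneratedBy ?activity
    generated_by = {}
    for s, p, o in triples:
        if p == "opmw:isGeneratedBy":
            generated_by.setdefault(s, set()).add(o)
    # ?activity2 opmw:uses / opmw:isGeneratedBy ?activity
    result = set()
    for s, p, o in triples:
        if p == "opmw:uses" and s in processes:
            for producer in generated_by.get(o, ()):
                if producer in processes:
                    result.add((s, "p-plan:isPrecededBy", producer))
    return result


# Hypothetical two-step template: step1 produces data1, step2 consumes it.
example = [
    ("ex:step1", RDF_TYPE, "opmw:WorkflowTemplateProcess"),
    ("ex:step2", RDF_TYPE, "opmw:WorkflowTemplateProcess"),
    ("ex:data1", "opmw:isGeneratedBy", "ex:step1"),
    ("ex:step2", "opmw:uses", "ex:data1"),
]
print(opmw_template_to_pplan(example))
```

The intermediate artifact (`ex:data1`) disappears from the result: the P-Plan view keeps only the ordering between steps, which is the coarser granularity this mapping is meant to provide.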
23
Abstraction Heterogeneity:
Mapping to Other Models Through Queries
Returning WfProv from OPMW objects (OPMW → WfProv):
CONSTRUCT {
  ?activity a wfprov:ProcessRun .
  ?activity2 a wfprov:ProcessRun .
  ?activity2 wfprov:usedInput ?u1 .
  ?u1 wfprov:wasOutputFrom ?activity .
}
WHERE {
  ?activity a opmw:WorkflowExecutionProcess .
  ?activity2 a opmw:WorkflowExecutionProcess .
  ?activity2 opmv:used ?u1 .
  ?u1 opmv:wasGeneratedBy ?activity .
}
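The PROV and WfProv queries share the same WHERE pattern and differ only in the terms they construct. The sketch below shows that idea in Python: one matching routine plus a small table of target terms copied from the two CONSTRUCT clauses. The function name, the `TARGETS` table, the tuple encoding, and the `ex:` identifiers are assumptions made for illustration; this is not a WEST API.

```python
RDF_TYPE = "rdf:type"

# Target terms taken from the CONSTRUCT clauses of the two queries.
TARGETS = {
    "prov": {"cls": "prov:Activity",
             "used": "prov:used",
             "generated": "prov:wasGeneratedBy"},
    "wfprov": {"cls": "wfprov:ProcessRun",
               "used": "wfprov:usedInput",
               "generated": "wfprov:wasOutputFrom"},
}


def export_execution(triples, target):
    """Rewrite OPMW execution triples into the chosen vocabulary (sketch)."""
    vocab = TARGETS[target]
    triples = set(triples)
    runs = {s for (s, p, o) in triples
            if p == RDF_TYPE and o == "opmw:WorkflowExecutionProcess"}
    # ?activity2 opmv:used ?u1
    used = {(s, o) for (s, p, o) in triples
            if p == "opmv:used" and s in runs}
    # ?u1 opmv:wasGeneratedBy ?activity
    generated = {(s, o) for (s, p, o) in triples
                 if p == "opmv:wasGeneratedBy" and o in runs}
    out = set()
    for consumer, artifact in used:
        for produced, producer in generated:
            if produced == artifact:
                out.add((producer, RDF_TYPE, vocab["cls"]))
                out.add((consumer, RDF_TYPE, vocab["cls"]))
                out.add((consumer, vocab["used"], artifact))
                out.add((artifact, vocab["generated"], producer))
    return out


# Hypothetical trace: run1 writes file1, run2 reads it.
trace = [
    ("ex:run1", RDF_TYPE, "opmw:WorkflowExecutionProcess"),
    ("ex:run2", RDF_TYPE, "opmw:WorkflowExecutionProcess"),
    ("ex:file1", "opmv:wasGeneratedBy", "ex:run1"),
    ("ex:run2", "opmv:used", "ex:file1"),
]
for triple in sorted(export_execution(trace, "wfprov")):
    print(triple)
```

Under these assumptions, supporting another compatible vocabulary would amount to adding one more entry to the table, which mirrors how the slides add a WfProv query next to the PROV one.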
24
WEST: Information Flow (1)
25
WEST: Information Flow (2)
26
Is There Functional, Usage, and Abstraction
Heterogeneity in WEST?
27
Functional Heterogeneity in WEST
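[Figure: WEST tools annotated with the group that develops each tool and the workflow function it provides. Groups shown: USC/ISI ISD, UCLA/USC, USC/ISI CST, NASA/JPL & Apache, UPM, VUA. Functions shown: Generation, Mining, Repository, Execution (three tools), Visualization, Browsing, Documentation. Tool label shown: WExp.]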
28
Usage Heterogeneity in WEST
Workflow output: Consumed by (2 to 5 tools)
• Workflow Instance (from WINGS): Pegasus, Apache OODT
• Workflow Execution (from Apache OODT): WINGS, OPMW Repository, Organic Data Science Wiki
• Workflow Execution (from Pegasus): WINGS, OPMW Repository, Organic Data Science Wiki
• Workflow Execution (from WINGS): OPMW Repository, Organic Data Science Wiki, FragFlow, WExp, Prov-o-viz
• Workflow Template (from WINGS): OPMW Repository, Organic Data Science Wiki, FragFlow, WExp, Prov-o-viz
• Workflow Template (from LONI): OPMW Repository, Organic Data Science Wiki, FragFlow, WExp, Prov-o-viz
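The usage heterogeneity criterion (each workflow output is consumed by at least two other tools) can be checked mechanically against the listing above. The following Python sketch encodes that listing as a dictionary and counts consumers. The variable and function names are assumptions for illustration, and the check covers only the number of consumers, not whether they provide different workflow functions.

```python
# Consumer tools per workflow output, transcribed from the slide.
FULL_SET = ["OPMW Repository", "Organic Data Science Wiki",
            "FragFlow", "WExp", "Prov-o-viz"]
EXEC_SET = ["WINGS", "OPMW Repository", "Organic Data Science Wiki"]

CONSUMERS = {
    "Workflow Instance (from WINGS)": ["Pegasus", "Apache OODT"],
    "Workflow Execution (from Apache OODT)": EXEC_SET,
    "Workflow Execution (from Pegasus)": EXEC_SET,
    "Workflow Execution (from WINGS)": FULL_SET,
    "Workflow Template (from WINGS)": FULL_SET,
    "Workflow Template (from LONI)": FULL_SET,
}


def consumer_counts(consumers):
    """Number of distinct consumer tools for each workflow output."""
    return {output: len(set(tools)) for output, tools in consumers.items()}


def has_usage_heterogeneity(consumers, minimum=2):
    """True if every output has at least `minimum` distinct consumers."""
    return all(n >= minimum for n in consumer_counts(consumers).values())


counts = consumer_counts(CONSUMERS)
for output, n in counts.items():
    print(f"{output}: {n} consumer tools")
print("Usage heterogeneity:", has_usage_heterogeneity(CONSUMERS))
```

With the data from the slide, the counts fall between 2 and 5, matching the "2 to 5 tools" stated in the listing header.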
29
Abstraction Heterogeneity:
Different Tools Retrieve Different Workflow Views
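[Figure: WEST tools labeled with the models each one retrieves. Labels shown: PROV; P-PLAN, OPMW; P-PLAN; OPMW. Tool label shown: WExp.]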
30
Conclusions:
Workflow ecosystems and WEST
Workflow ecosystems scale up integration:
• Functional heterogeneity: Diversity of workflow tools and workflow functions integrated, developed by independent parties
– Today: 1-2 workflow tools with 2-3 workflow functions from 2-3 institutions
– WEST: 9 workflow functions from 6 research groups
• Usage heterogeneity: A workflow output by a tool can be consumed by at least two other workflow tools
– Today: only 1 consumer tool, or if several then same function
– WEST: 2-5 consumer tools with different functions
• Abstraction heterogeneity: A tool can import abstract or detailed views of a workflow based on the granularity the tool can handle
– Today: workflows exported must be fully imported
– WEST: 4+ models provide different granularity
31
Benefits
Interoperability across tools with different workflow functions
Flexibility to interchange data at different granularities across tools
Facilitating the integration of content modeled in other (compatible) vocabularies
32
Limitations and Future Work
Less expressivity than IWIR and D-PROV
Converters should be included in each workflow tool
No general “workflow ecosystem APIs” yet
33
Thank you!
http://www.wings-workflows.org
http://www.isi.edu/~gil
Wings contributors: Varun Ratnakar, Daniel Garijo (UPM), Ricky Sethi, Hyunjoon Jo, Jihie Kim, Yan Liu, Dave Kale, Ralph Bergmann (U Trier), William Cheung (HKBU), Pedro Gonzalez & Gonzalo Castro (UCM), Paul Groth (VUA)
Wings collaborators: Ewa Deelman & Gaurang Mehta & Karan Vahi (USC), Sofus Macskassy (ISI), Natalia Villanueva & Ari Kassin (UTEP)
Wings/OODT: Chris Mattmann (JPL), Paul Ramirez (JPL), Dan Crichton (JPL), Rishi Verma (JPL)
Biomedical workflows: Phil Bourne & Sarah Kinnings (UCSD), Chris Mason (Cornell), Joel Saltz & Tahsin Kurc (Emory U.), Jill Mesirov & Michael Reich (Broad), Randall Wetzel (CHLA), Shannon McWeeney & Christina Zhang (OHSU)
Geosciences workflows: Chris Duffy (PSU), Paul Hanson (U Wisconsin), Tom Harmon & Sandra Villamizar (U Merced), Tom Jordan & Phil Maechling (USC), Kim Olsen (SDSU)
And many others!