Towards Workflow Ecosystems Through Semantic and Standard Representations

Yolanda Gil (Information Sciences Institute and Department of Computer Science, University of Southern California); Daniel Garijo and Oscar Corcho (OEG-DIA, Facultad de Informática, Universidad Politécnica de Madrid)

Upload: dgarijo

Post on 03-Jul-2015


DESCRIPTION

Workflows are increasingly used to manage and share scientific computations and methods. Workflow tools can be used to design, validate, execute and visualize scientific workflows and their execution results. Other tools manage workflow libraries or mine their contents. Although much recent work has addressed workflow system integration and common workflow interlinguas, interoperability among workflow systems remains a challenge. Ideally, these tools would form a workflow ecosystem in which a workflow can be created with one tool, executed with another, visualized with a third, and a repository of such workflows or their executions mined with yet another. In this paper, we describe our approach to creating a workflow ecosystem through the use of standard models for provenance (OPM and W3C PROV) and extensions (P-PLAN and OPMW) to represent workflows. The ecosystem integrates different workflow tools with diverse functions (workflow generation, execution, browsing, mining, and visualization) created by a variety of research groups. This is, to our knowledge, the first time that such a variety of workflow systems and functions has been integrated.

TRANSCRIPT

Page 1: Towards Workflow Ecosystems Through Semantic and Standard Representations

Towards Workflow Ecosystems Through Semantic and Standard Representations

Yolanda Gil

Information Sciences Institute and Department of Computer Science

University of Southern California

http://www.isi.edu/~gil

@yolandagil

Daniel Garijo, Oscar Corcho

OEG-DIA, Facultad de Informática, Universidad Politécnica de Madrid

http://purl.org/net/dgarijo

@dgarijo, @ocorcho

{dgarijo,ocorcho}@fi.upm.es

Page 2

Outline

“Towards Workflow Ecosystems Through Semantic and Standard Representations”

Motivation

What is a workflow ecosystem

The WEST workflow ecosystem

Semantic and standard representations in WEST

Page 3

Proliferation of Workflow Systems and Workflow Functions

Workflow design

Workflow validation

Workflow execution

Workflow visualization

Workflow mining

End users need a more fluid way to utilize workflow functions based on initial application requirements and as those requirements evolve over time

Page 4

Workflow ecosystems

Workflow ecosystems are integrations of heterogeneous workflow capabilities that scale up along 3 core dimensions:

• Functional heterogeneity: Diversity of workflow tools and workflow functions integrated, developed by independent parties

• Usage heterogeneity: A workflow output by a tool can be consumed by at least two other workflow tools

• Abstraction heterogeneity: A tool can import abstract or detailed views of a workflow based on the granularity the tool can handle

Page 5

Interoperability of Workflow Systems: Prior Work

Integrations of workflow systems

• Pegasus: Condor (execution) [Deelman et al 2006], PASOA (provenance) [Miles et al 2007], WINGS (generation) [Gil et al 2007], nanoHUB (creation) [McLennan et al 2013]

• Taverna: Galaxy (execution) [Abouelhoda et al 2012], myExperiment (repository) [De Roure et al 2009], DistillFlow (mining) [Starlinger et al 2012]

Workflow interchange languages

• Provenance: Open Provenance Model (OPM) [Moreau et al 2011], W3C PROV standard [Gil and Miles 2013]

• Workflows: OPMW [Garijo et al 2013], D-PROV [Missier et al 2013], Wfprov [Belhajjame et al 2013], P-PLAN [Garijo and Gil 2012]

• IWIR [Plankensteiner et al WORKS’11]

• WS-BPEL, BPMN

Page 6

Workflow ecosystems

Workflow ecosystems are integrations of heterogeneous workflow capabilities that scale up along 3 core dimensions:

• Functional heterogeneity: Diversity of workflow tools and workflow functions integrated, developed by independent parties

– Today: 1-2 workflow tools with 2-3 workflow functions developed by 2-3 institutions

• Usage heterogeneity: A workflow output by a tool can be consumed by at least two other workflow tools

– Today: only one consumer tool, and if there is more than one, they have the same function (e.g., two execution engines)

• Abstraction heterogeneity: A tool can import abstract or detailed views of a workflow based on the granularity the tool can handle

– Today: workflows exported must be fully imported

Page 7

WEST (Workflow Ecosystems through STandards)

Page 8

WEST: Workflow Capabilities and Tools

Page 9

Workflow Capabilities and Tools in WEST: (1) Workflow Generation

[Gil et al 2011]

Page 10

Workflow Capabilities and Tools in WEST: (2) Workflow Mapping and Execution

[Torri et al 2012]

[Deelman et al 2005]

[Mattmann et al 2006]

Page 11

Workflow Capabilities and Tools in WEST: (3) Workflow Mining

[Garijo et al 2013]

Page 12

Workflow Capabilities and Tools in WEST: (4) Workflow Visualization

[Hoekstra and Groth 2014]

Page 13

Workflow Capabilities and Tools in WEST: (4) Workflow Browsing

WExp

[Garijo et al 2011]

Page 14

Workflow Capabilities and Tools in WEST: (5) Workflow Documentation

[Gil et al 2012]

Page 15

Workflow Capabilities and Tools in WEST: (6) Workflow Sharing Repository

[Garijo et al 2013]

Page 16

WEST: Semantic and Standard Representations

Page 17

Types of Workflows Exchanged

WT (workflow template), WI (workflow instance), WE (workflow execution)

Page 18

Overview of Semantic and Standard Representations in WEST

[Figure: diagram relating the models OPM/PROV, P-Plan, and OPMW. Recoverable labels: Generic Provenance, Plan Definition, Plan Execution, Workflow template, Workflow execution, "Execution of".]

Page 19

Overview of Semantic and Standard Representations in WEST: PROV and OPM

[Moreau et al 2013] [Moreau et al 2011]

Page 20

Overview of Semantic and Standard Representations in WEST: PROV and P-PLAN

[Moreau et al 2013]

[Garijo and Gil 2013]

Page 21

Overview of Semantic and Standard Representations in WEST: OPMW

[Garijo et al 2012]

Page 22

Abstraction Heterogeneity: Mapping Across Models Through Queries

Returning P-Plan from OPMW objects:

CONSTRUCT {
  ?activity2 p-plan:isPrecededBy ?activity .
}
WHERE {
  ?activity a opmw:WorkflowTemplateProcess .
  ?activity2 a opmw:WorkflowTemplateProcess .
  ?activity2 opmw:uses / opmw:isGeneratedBy ?activity .
}

Returning PROV from OPMW objects:

CONSTRUCT {
  ?activity a prov:Activity .
  ?activity2 a prov:Activity .
  ?activity2 prov:used ?u1 .
  ?u1 prov:wasGeneratedBy ?activity .
}
WHERE {
  ?activity a opmw:WorkflowExecutionProcess .
  ?activity2 a opmw:WorkflowExecutionProcess .
  ?activity2 opmv:used ?u1 .
  ?u1 opmv:wasGeneratedBy ?activity .
}
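To make the P-Plan mapping on this slide concrete, here is a minimal Python sketch of what the first CONSTRUCT query computes. It is not part of WEST: the triples are plain string tuples rather than a real RDF store, the function name `template_to_pplan` is invented for illustration, and the `ex:` identifiers are hypothetical. Only the `opmw:` and `p-plan:` terms come from the query itself.

```python
RDF_TYPE = "rdf:type"


def template_to_pplan(triples):
    """Derive p-plan:isPrecededBy links, mirroring the P-Plan CONSTRUCT query."""
    triples = set(triples)
    # Steps typed as OPMW template processes (the two `a` patterns in WHERE).
    steps = {s for s, p, o in triples
             if p == RDF_TYPE and o == "opmw:WorkflowTemplateProcess"}
    # Index: variable -> steps that generate it (opmw:isGeneratedBy).
    producers = {}
    for s, p, o in triples:
        if p == "opmw:isGeneratedBy" and o in steps:
            producers.setdefault(s, set()).add(o)
    # Follow the path opmw:uses / opmw:isGeneratedBy from each consuming step.
    links = set()
    for s, p, o in triples:
        if p == "opmw:uses" and s in steps:
            for producer in producers.get(o, ()):
                links.add((s, "p-plan:isPrecededBy", producer))
    return links


# Hypothetical two-step template: ex:align feeds ex:plot through ex:aligned.
template = [
    ("ex:align", RDF_TYPE, "opmw:WorkflowTemplateProcess"),
    ("ex:plot", RDF_TYPE, "opmw:WorkflowTemplateProcess"),
    ("ex:aligned", "opmw:isGeneratedBy", "ex:align"),
    ("ex:plot", "opmw:uses", "ex:aligned"),
]
pplan_view = template_to_pplan(template)
print(sorted(pplan_view))
```

The result keeps only the ordering between steps and drops the intermediate data variable, which is the coarser view a P-Plan consumer would ask for.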

Page 23

Abstraction Heterogeneity: Mapping to Other Models Through Queries

Returning WfProv from OPMW objects:

CONSTRUCT {
  ?activity a wfprov:ProcessRun .
  ?activity2 a wfprov:ProcessRun .
  ?activity2 wfprov:usedInput ?u1 .
  ?u1 wfprov:wasOutputFrom ?activity .
}
WHERE {
  ?activity a opmw:WorkflowExecutionProcess .
  ?activity2 a opmw:WorkflowExecutionProcess .
  ?activity2 opmv:used ?u1 .
  ?u1 opmv:wasGeneratedBy ?activity .
}
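The PROV and WfProv queries share one WHERE pattern and differ only in the terms they emit. The following Python sketch illustrates that observation under simplifying assumptions: triples are plain string tuples, and `TARGET_VOCAB`, `export_execution`, and the `ex:` identifiers are hypothetical names introduced here. The target terms are copied from the CONSTRUCT clauses of the two queries.

```python
RDF_TYPE = "rdf:type"

# (type of a run, "used" property, "generated by" property) per target model.
TARGET_VOCAB = {
    "prov": ("prov:Activity", "prov:used", "prov:wasGeneratedBy"),
    "wfprov": ("wfprov:ProcessRun", "wfprov:usedInput", "wfprov:wasOutputFrom"),
}


def export_execution(triples, target):
    """Re-express OPMW execution triples in the chosen target vocabulary."""
    run_type, used_term, generated_term = TARGET_VOCAB[target]
    triples = set(triples)
    runs = {s for s, p, o in triples
            if p == RDF_TYPE and o == "opmw:WorkflowExecutionProcess"}
    view = set()
    for consumer, p1, artifact in triples:
        if p1 != "opmv:used" or consumer not in runs:
            continue
        # Join on the artifact: which run generated what this run used?
        for art, p2, producer in triples:
            if p2 == "opmv:wasGeneratedBy" and art == artifact and producer in runs:
                view.update({
                    (producer, RDF_TYPE, run_type),
                    (consumer, RDF_TYPE, run_type),
                    (consumer, used_term, artifact),
                    (artifact, generated_term, producer),
                })
    return view


# Hypothetical trace: ex:run1 produced ex:out1, which ex:run2 consumed.
trace = [
    ("ex:run1", RDF_TYPE, "opmw:WorkflowExecutionProcess"),
    ("ex:run2", RDF_TYPE, "opmw:WorkflowExecutionProcess"),
    ("ex:out1", "opmv:wasGeneratedBy", "ex:run1"),
    ("ex:run2", "opmv:used", "ex:out1"),
]
prov_view = export_execution(trace, "prov")
wfprov_view = export_execution(trace, "wfprov")
```

Supporting another compatible vocabulary in this sketch would mean adding one entry to the table, which parallels writing one more CONSTRUCT query against the same OPMW data.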

Page 24

WEST: Information Flow (1)

Page 25

WEST: Information Flow (2)

Page 26

Is There Functional, Usage, and Abstraction Heterogeneity in WEST?

Page 27

Functional Heterogeneity in WEST

[Figure: WEST diagram annotated with the group behind each tool and the function each tool provides. Group labels (some repeated): USC/ISI ISD, UCLA/USC, USC/ISI CST, NASA/JPL & Apache, UPM, VUA. Function labels: Generation, Mining, Repository, Execution (three tools), Visualization, Browsing, Documentation. Tool label: WExp.]

Page 28

Usage Heterogeneity in WEST

Workflow output | Consumed by (2 to 5 tools)

Workflow Instance (from WINGS) | Pegasus, Apache OODT

Workflow Execution (from Apache OODT) | WINGS, OPMW Repository, Organic Data Science Wiki

Workflow Execution (from Pegasus) | WINGS, OPMW Repository, Organic Data Science Wiki

Workflow Execution (from WINGS) | OPMW Repository, Organic Data Science Wiki, FragFlow, WExp, Prov-o-viz

Workflow Template (from WINGS) | OPMW Repository, Organic Data Science Wiki, FragFlow, WExp, Prov-o-viz

Workflow Template (from LONI) | OPMW Repository, Organic Data Science Wiki, FragFlow, WExp, Prov-o-viz
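As a small worked check of the table above against the usage heterogeneity criterion (each workflow output consumed by at least two other tools), the rows can be written as a Python dictionary. The names `CONSUMERS` and `has_usage_heterogeneity` are illustrative only; the rows are transcribed from this slide.

```python
DOWNSTREAM = {"OPMW Repository", "Organic Data Science Wiki",
              "FragFlow", "WExp", "Prov-o-viz"}
ENGINE_TRACE_USERS = {"WINGS", "OPMW Repository", "Organic Data Science Wiki"}

# Workflow output -> tools that consume it, as listed on the slide.
CONSUMERS = {
    "Workflow Instance (from WINGS)": {"Pegasus", "Apache OODT"},
    "Workflow Execution (from Apache OODT)": ENGINE_TRACE_USERS,
    "Workflow Execution (from Pegasus)": ENGINE_TRACE_USERS,
    "Workflow Execution (from WINGS)": DOWNSTREAM,
    "Workflow Template (from WINGS)": DOWNSTREAM,
    "Workflow Template (from LONI)": DOWNSTREAM,
}


def has_usage_heterogeneity(consumers, minimum=2):
    """True if every workflow output has at least `minimum` consumer tools."""
    return all(len(tools) >= minimum for tools in consumers.values())


counts = {output: len(tools) for output, tools in CONSUMERS.items()}
for output, n in counts.items():
    print(f"{output}: {n} consumers")
print("Criterion met:", has_usage_heterogeneity(CONSUMERS))
```

The counts range from 2 to 5, which matches the column header on the slide.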

Page 29

Abstraction Heterogeneity: Different Tools Retrieve Different Workflow Views

[Figure: WEST diagram showing which representation each tool retrieves. Recoverable labels: PROV; P-PLAN, OPMW; P-PLAN; OPMW; WExp.]

Page 30

Conclusions: Workflow ecosystems and WEST

Workflow ecosystems scale up integration:

• Functional heterogeneity: Diversity of workflow tools and workflow functions integrated, developed by independent parties

– Today: 1-2 workflow tools with 2-3 workflow functions from 2-3 institutions

– WEST: 9 workflow functions from 6 research groups

• Usage heterogeneity: A workflow output by a tool can be consumed by at least two other workflow tools

– Today: only 1 consumer tool, or if several then same function

– WEST: 2-5 consumer tools with different functions

• Abstraction heterogeneity: A tool can import abstract or detailed views of a workflow based on the granularity the tool can handle

– Today: workflows exported must be fully imported

– WEST: 4+ models provide different granularity

Page 31

Benefits

Interoperability across tools with different workflow functions

Flexibility to interchange data at different granularities across tools

Facilitating the integration of content modeled in other (compatible) vocabularies

Page 32

Limitations and Future Work

Less expressivity than IWIR and D-PROV

Converters should be included in each workflow tool

No general “workflow ecosystem APIs” yet

Page 33

Thank you!

http://www.wings-workflows.org

http://www.isi.edu/~gil

Wings contributors: Varun Ratnakar, Daniel Garijo (UPM), Ricky Sethi, Hyunjoon Jo, Jihie Kim, Yan Liu, Dave Kale, Ralph Bergmann (U Trier), William Cheung (HKBU), Pedro Gonzalez & Gonzalo Castro (UCM), Paul Groth (VUA)

Wings collaborators: Ewa Deelman & Gaurang Mehta & Karan Vahi (USC), Sofus Macskassy (ISI), Natalia Villanueva & Ari Kassin (UTEP)

Wings/OODT: Chris Mattmann (JPL), Paul Ramirez (JPL), Dan Crichton (JPL), Rishi Verma (JPL)

Biomedical workflows: Phil Bourne & Sarah Kinnings (UCSD), Chris Mason (Cornell), Joel Saltz & Tahsin Kurc (Emory U.), Jill Mesirov & Michael Reich (Broad), Randall Wetzel (CHLA), Shannon McWeeney & Christina Zhang (OHSU)

Geosciences workflows: Chris Duffy (PSU), Paul Hanson (U Wisconsin), Tom Harmon & Sandra Villamizar (U Merced), Tom Jordan & Phil Maechling (USC), Kim Olsen (SDSU)

And many others!