

V School of Information Engineering

Master of Science in Information Engineering

ICT for Health Care and Life Sciences

Dipartimento di

Elettronica e Informazione

Davide Chicco [email protected] http://www.davidechicco.it/

2

ICT for Health Care and Life Sciences

Service Oriented Architectures (SOA) (Bio)Web Services

2nd part

Davide Chicco [email protected]

Marco Masseroli [email protected]

3

Workflows, a reference

What is a workflow?

• Definition: a workflow is a sequence of connected steps

• A depiction of a sequence of operations, declared as work of a person, a group of persons, or one or more simple or complex mechanisms.

• Workflow may be seen as any abstraction of real work

• The flow being described may refer to a document or product that is being transferred from one step to another

from Wikipedia in English

4

Workflows, a reference (2)

What is a workflow?

• A workflow is a model to represent real work for further assessment, e.g., for describing a reliably repeatable sequence of operations.

• More abstractly, a workflow is a pattern of activity enabled by a systematic organization of resources, defined roles and mass, energy and information flows, into a work process that can be documented and learned.

• Workflows are designed to achieve processing intents of some sort, such as physical transformation, service provision, or information processing.

from Wikipedia in English

5

Workflows, a reference (3)

What makes a workflow?

• A workflow can usually be described using formal or informal flow diagramming techniques, showing directed flows between processing steps.

• Single processing steps or components of a workflow can basically be defined by three parameters:

1. input description: the information required to complete the step

2. transformation rules, algorithms, which may be carried out by associated human roles or machines, or a combination

3. output description: the information, material and energy produced by the step and provided as input to downstream steps.

from Wikipedia in English
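The three parameters of a step can be sketched in code. This is only an illustrative sketch: the `Step` class and the temperature conversion are assumptions made for the example, not part of any workflow standard.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Step:
    """One workflow step: input description, transformation rule, output description."""
    input_description: str
    transform: Callable[[float], float]
    output_description: str

    def run(self, value: float) -> float:
        # Apply the transformation rule to the input.
        return self.transform(value)


# Hypothetical step: convert a body temperature from Celsius to Fahrenheit.
to_fahrenheit = Step(
    input_description="temperature in degrees Celsius",
    transform=lambda c: c * 9 / 5 + 32,
    output_description="temperature in degrees Fahrenheit",
)
result = to_fahrenheit.run(37.0)
```

Here the output of `to_fahrenheit` could be handed as input to a downstream step that expects a temperature in Fahrenheit.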

6

Workflows, a reference (4)

What does a workflow need?

• Components can only be plugged together if the output of one previous (set of) component(s) is equal to the mandatory input requirements of the following component.

• Thus, the essential description of a component actually comprises only input and output that are described fully in terms of data types and their meaning (semantics).

• The algorithms' description or rules' description need only be included when there are several alternative ways to transform one type of input into one type of output – possibly with different accuracy, speed, etc.

from Wikipedia in English
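A minimal sketch of the plugging rule, assuming that data types alone describe input and output (semantics are ignored here). The `Component` class and the `connect` function are hypothetical names introduced for the example.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class Component:
    name: str
    input_type: type
    output_type: type
    transform: Callable[[Any], Any]


def connect(first: Component, second: Component) -> Component:
    """Chain two components, refusing the connection if the types do not match."""
    if first.output_type is not second.input_type:
        raise TypeError(
            f"cannot plug {first.name} into {second.name}: "
            f"{first.output_type.__name__} is not {second.input_type.__name__}"
        )
    return Component(
        name=f"{first.name} -> {second.name}",
        input_type=first.input_type,
        output_type=second.output_type,
        transform=lambda value: second.transform(first.transform(value)),
    )


parse = Component("parse", str, int, int)               # text in, integer out
double = Component("double", int, int, lambda n: n * 2)  # integer in, integer out
pipeline = connect(parse, double)
```

Calling `connect(double, parse)` raises a `TypeError`, because `double` produces an integer while `parse` requires text.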

7

Workflows, a reference (5)

What does a workflow need?

• When the components are non-local services that are invoked remotely via a computer network, such as Web Services, additional descriptors, such as quality of service (QoS) and availability, also must be considered.

from Wikipedia in English
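As a rough sketch of how such descriptors might be used, the code below picks one remote service among equivalent candidates. The descriptor fields, the threshold and the service names are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class ServiceDescriptor:
    name: str
    availability: float      # fraction of time the service answered, 0.0 to 1.0
    mean_response_ms: float  # average response time in milliseconds


def pick_service(candidates, min_availability=0.95):
    """Return the fastest service among those that are available often enough."""
    usable = [s for s in candidates if s.availability >= min_availability]
    if not usable:
        return None
    return min(usable, key=lambda s: s.mean_response_ms)


# Hypothetical equivalent services offering the same operation.
services = [
    ServiceDescriptor("mirror-a", availability=0.99, mean_response_ms=420.0),
    ServiceDescriptor("mirror-b", availability=0.90, mean_response_ms=120.0),
    ServiceDescriptor("mirror-c", availability=0.97, mean_response_ms=310.0),
]
chosen = pick_service(services)
```

In this example `mirror-b` is the fastest candidate but is discarded because its availability is below the threshold.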

8

Workflows, a reference (6)

Example: teaching plan

9

Workflows, a reference (7)

Example 2: e-form compiling in a hospital

10

Workflows, a reference (8)

Workflows and UML

• Unified Modeling Language (UML), which has nothing to do with UMLS, the Unified Medical Language System!

• Workflow graphic depiction: Activity Diagrams

• representations of workflows of stepwise activities and actions with support for choice, iteration and concurrency

• activity diagrams can be used to describe the business and operational step-by-step workflows of components in a system.

• An activity diagram shows the overall flow of control.

Maybe you have (or should have!) studied workflows previously, during the Software Engineering course?

11

Workflows, a reference (9)

Activity diagram: an example for brainstorming

12

Workflows, a reference (10)

Software to create Workflows:

• Open Source:

• Dia http://live.gnome.org/Dia

• Calligra Flow http://www.calligra-suite.org/flow

• Proprietary:

• Microsoft Visio http://visio.microsoft.com

13

Workflows, a reference (11)

Easy exercise

• Draw a simple workflow, on paper or on your PC, for this process:

• a division of two values ( X / Y )

• Steps: read X; read Y; check whether Y is different from zero; if it is, compute the division; otherwise, repeat the reading of Y.

• The shapes to be used are: Start / End, Process (generic), Decision

14

Workflows, a reference (11)

Easy exercise: solution
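The same steps can also be written as a short program. This is one possible sketch, not the diagram from the slide; the function name and the simulated readers are assumptions made for the example.

```python
def division_workflow(read_x, read_y):
    """Follow the flowchart: read X, read Y, loop on the decision, then divide."""
    x = read_x()      # Process: read X
    y = read_y()      # Process: read Y
    while y == 0:     # Decision: is Y different from zero?
        y = read_y()  # No: repeat the reading of Y
    return x / y      # Yes: compute the division


# Simulated inputs: the Y reader returns 0 twice before a valid value.
y_values = iter([0, 0, 4])
result = division_workflow(lambda: 10, lambda: next(y_values))
```

With interactive input, the two readers could be replaced by `lambda: float(input("X: "))` and `lambda: float(input("Y: "))`.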

15

Workflows through webservices

The original problem: many services/operations of the workflow are located in different resources

16

Webservices: state of the art

• As organizations expand and technology evolves, application integration becomes increasingly important.

• Component reuse and interoperability requirements have driven companies to move toward a Service-Oriented Architecture (SOA), where self-contained business logic can be exposed and shared efficiently across applications and platforms.

• At the heart of recent success in SOA designs are Web services, a technology that enables disjoint applications to communicate with each other in a platform-independent and language-independent manner.

from McPressOnline.com

17

Webservices: state of the art

• Specifically, a Web service is a self-contained piece of software available via standard network protocols (such as HTTP(S), FTP, and SMTP) and exposed by a standardized interface, described in the Web Services Description Language (WSDL).

• The WSDL is a schema-defined XML document that includes all of the information an application requires to call, or consume, the Web service.

• Data is exchanged between the application and the Web service, using a standard XML messaging format called Simple Object Access Protocol (SOAP).

from McPressOnline.com
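To give an idea of what a SOAP message looks like, the sketch below builds a request with the Python standard library. The envelope namespace is the SOAP 1.1 one; the service namespace, the operation `GetSequenceLength` and its parameter are hypothetical and do not belong to any real service.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
SERVICE_NS = "http://example.org/sequence"               # hypothetical service namespace


def build_request(operation: str, params: dict) -> bytes:
    """Wrap an operation call and its parameters in a SOAP Envelope/Body."""
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    call = ET.SubElement(body, f"{{{SERVICE_NS}}}{operation}")
    for name, value in params.items():
        ET.SubElement(call, f"{{{SERVICE_NS}}}{name}").text = str(value)
    return ET.tostring(envelope, encoding="utf-8")


message = build_request("GetSequenceLength", {"accession": "P12345"})
```

In practice the operation names and parameter types would be read from the WSDL of the service, usually by a client library rather than by hand.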

18

Webservices: state of the art

• Web services, like all technology, are rapidly evolving. Implementations range from basic Remote Procedure Calls (RPCs) to loosely coupled, document-style messaging.

• The WSDL and SOAP specifications are expanding to include support for secure, reliable, transactional Web services.

• Enhancements in these areas are helping to make Web services the preferred solution for application integration throughout the industry.

from McPressOnline.com

19

Webservices: advantages

The original problem: many services/operations of the workflow are located in different resources

1) Interoperability - This is the most important benefit of Web Services.

• Web Services typically work outside of private networks, offering developers a non-proprietary route to their solutions.

• Services developed are likely, therefore, to have a longer life-span, offering better return on investment of the developed service.

• Web Services also let developers use their preferred programming languages.

• In addition, thanks to the use of standards-based communications methods, Web Services are virtually platform-independent.

from Msdn Microsoft

20

Webservices: advantages

2) Usability

• Web Services allow the business logic of many different systems to be exposed over the Web.

• This gives your applications the freedom to choose the Web Services that they need.

• Instead of re-inventing the wheel for each client, you need only include additional application-specific business logic on the client-side.

• This allows you to develop services and/or client-side code using the languages and tools that you want.

from Msdn Microsoft

21

Webservices: advantages

3) Reusability

• Web Services provide not a component-based model of application development, but the closest thing possible to zero-coding deployment of such services.

• This makes it easy to reuse Web Service components as appropriate in other services.

• It also makes it easy to deploy legacy code as a Web Service.

from Msdn Microsoft

22

Webservices: advantages

4) Deployability

• Web Services are deployed over standard Internet technologies.

• This makes it possible to deploy Web Services even across a firewall, to servers running on the Internet on the other side of the globe.

• Also, thanks to the use of proven community standards, underlying security (such as SSL) is already built in.

from Msdn Microsoft

23

Webservices: disadvantages

Unfortunately, webservices present disadvantages, too...

1) Simplicity is not always good

• Although the simplicity of Web services is an advantage in some respects, it can also be a hindrance.

• Web services use plain text protocols that use a fairly verbose method to identify data.

• This means that Web service requests are larger than requests encoded with a binary protocol.

• The extra size is really only an issue over low-speed connections, or over extremely busy connections.

from Msdn Microsoft
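The size difference can be seen with a small experiment. This is an illustrative sketch: the record, its field names and the binary layout are assumptions chosen for the example.

```python
import struct
import xml.etree.ElementTree as ET

# Hypothetical record to transmit.
reading = {"patient_id": 1042, "heart_rate": 72, "temperature": 36.6}

# Text encoding: each value is wrapped in named XML tags.
root = ET.Element("reading")
for name, value in reading.items():
    ET.SubElement(root, name).text = str(value)
xml_bytes = ET.tostring(root)

# Binary encoding: two unsigned 32-bit integers and one 64-bit float, no field names.
binary_bytes = struct.pack(
    "<IId", reading["patient_id"], reading["heart_rate"], reading["temperature"]
)

print(len(xml_bytes), "bytes as XML,", len(binary_bytes), "bytes as binary")
```

The binary form is compact but meaningless without prior agreement on the layout, while the XML form carries its own field names.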

24

Webservices: disadvantages

2) Long-term sessions

• Although HTTP and HTTPS (the core Web protocols) are simple, they weren't really meant for long-term sessions.

• Typically, a browser makes an HTTP connection, requests a Web page and maybe some images, and then disconnects. In a typical CORBA or RMI environment, a client connects to the server and might stay connected for an extended period of time. The server may periodically send data back to the client.

• This kind of interaction is difficult with Web services, and you need to do a little extra work to make up for what HTTP doesn't do for you.

from Msdn Microsoft

25

Webservices: disadvantages

3) Client and server are not aware of each other

• The problem with HTTP and HTTPS when it comes to Web services is that these protocols are stateless.

• The interaction between the server and client is typically brief and when there is no data being exchanged, the server and client have no knowledge of each other.

• For example, if a client makes a request to the server, receives some information, and then immediately crashes due to a power outage, the server never knows that the client is no longer active.

• The server needs a way to keep track of what a client is doing and also to determine when a client is no longer active.

from Msdn Microsoft

26

Webservices: disadvantages

4) Timeout

• Typically, a server sends some kind of session identification to the client when the client first accesses the server. The client then uses this identification when it makes further requests to the server.

• This enables the server to recall any information it has about the client. A server must usually rely on a timeout mechanism to determine that a client is no longer active.

• If a server doesn't receive a request from a client after a predetermined amount of time, it assumes that the client is inactive and removes any client information it was keeping. This extra overhead means more work for Web service developers.

from Msdn Microsoft
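A minimal sketch of the session identification and timeout mechanism described above. The `SessionStore` class is hypothetical, and the clock is passed in explicitly so that the behaviour is easy to follow.

```python
import uuid


class SessionStore:
    """Server-side table of client sessions that expire after a period of silence."""

    def __init__(self, timeout_seconds: float):
        self.timeout_seconds = timeout_seconds
        self._sessions = {}  # session id -> (last seen time, client data)

    def create(self, now: float) -> str:
        """First access: generate the identification sent back to the client."""
        session_id = uuid.uuid4().hex
        self._sessions[session_id] = (now, {})
        return session_id

    def touch(self, session_id: str, now: float):
        """Return the client data and refresh the timer, or None if the session expired."""
        self.purge(now)
        entry = self._sessions.get(session_id)
        if entry is None:
            return None
        self._sessions[session_id] = (now, entry[1])
        return entry[1]

    def purge(self, now: float) -> None:
        """Remove the information kept for clients that have been silent too long."""
        expired = [sid for sid, (seen, _) in self._sessions.items()
                   if now - seen > self.timeout_seconds]
        for sid in expired:
            del self._sessions[sid]


store = SessionStore(timeout_seconds=30.0)
sid = store.create(now=0.0)
still_active = store.touch(sid, now=20.0)   # request within the timeout
after_silence = store.touch(sid, now=60.0)  # 40 s of silence: session removed
```

A real server would take `now` from a clock such as `time.monotonic()` and run the purge periodically.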

27

Webservices: some other benefits

The original problem: many services/operations of the workflow are located in different resources

Exposing the function on the network

• A Web service is a unit of managed code that can be invoked remotely over HTTP; that is, it can be activated by HTTP requests.

• So, Web services allow you to expose the functionality of your existing code over the network.

• Once it is exposed on the network, other applications can use the functionality of your program.

from JavaBeat.net
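The idea of exposing existing code over HTTP can be sketched with the Python standard library. This is a simplification: it answers plain HTTP requests with JSON rather than SOAP, and the function, parameter names and port are assumptions made for the example.

```python
import json
from urllib.parse import parse_qs


def body_mass_index(weight_kg: float, height_m: float) -> float:
    """Existing business logic that we want to expose."""
    return weight_kg / (height_m ** 2)


def application(environ, start_response):
    """WSGI entry point: read query parameters, call the function, return JSON."""
    query = parse_qs(environ.get("QUERY_STRING", ""))
    try:
        weight = float(query["weight_kg"][0])
        height = float(query["height_m"][0])
        status, payload = "200 OK", {"bmi": body_mass_index(weight, height)}
    except (KeyError, ValueError, ZeroDivisionError):
        status, payload = "400 Bad Request", {"error": "expected weight_kg and height_m"}
    body = json.dumps(payload).encode("utf-8")
    start_response(status, [("Content-Type", "application/json"),
                            ("Content-Length", str(len(body)))])
    return [body]


def serve(port: int = 8000) -> None:
    """Run the service locally with the standard library's reference WSGI server."""
    from wsgiref.simple_server import make_server
    with make_server("", port, application) as httpd:
        httpd.serve_forever()
```

After calling `serve()`, another application could invoke the function with a request such as `http://localhost:8000/?weight_kg=70&height_m=1.75`.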

28

Webservices: some other benefits

Connecting different applications

• Web Services allow different applications to talk to each other and share data and services among themselves.

• Other applications can also use the services offered by Web Services. For example, a VB or .NET application can talk to Java Web Services, and vice versa.

• So, Web Services are used to make applications platform- and technology-independent.

from JavaBeat.net

29

Webservices: some other benefits

Standardized protocol

• Web Services use industry-standard protocols for communication.

• All four layers of the Web Services protocol stack (Service Transport, XML Messaging, Service Description and Service Discovery) use well-defined protocols.

• This standardization of the protocol stack gives businesses many advantages, such as a wide range of choices, lower costs due to competition, and higher quality.

from JavaBeat.net

30

Webservices: some other benefits

Low cost of communication

• Web Services use SOAP or REST-style messages over HTTP for communication, so you can use your existing low-cost Internet connection to implement Web Services.

• This solution is much less costly compared to proprietary solutions like EDI/B2B.

from JavaBeat.net

31

Webservices: some other benefits

Support for other communication

• Besides SOAP over HTTP, Web Services can also be implemented on other reliable transport mechanisms.

• So, you have the flexibility to use the communication means that fits your requirements and choice.

• For example, Web Services can also be implemented using the FTP protocol (Web services over FTP).

from JavaBeat.net

32

Webservices: some other benefits

Web services sharing

• These days, due to the complexity of business, organizations use different technologies such as EAI, EDI, B2B, Portals, etc. for distributed computing.

• Web Services support all these technologies, thus helping businesses to reuse existing investments in other technologies.

Web services are self describing

• Web Services are self-describing applications, which reduces software development time.

• This helps business partners to quickly develop applications and start doing business, saving time and money by cutting development time.

from JavaBeat.net

33

Webservices: some other benefits

Automatic discovery

• The automatic discovery mechanism of Web Services helps businesses to easily find service providers. It also helps your customers to find your services easily.

• With the help of Web Services, your business can also increase revenue by making its own Web Services available to others.

Business opportunity

• Web Services have opened the door to new business opportunities by making it easy to connect with partners.

from JavaBeat.net

34

(Bio) Web Services

Taverna

35

36

(Bio) Web Services

Galaxy

37

Galaxy

What is Galaxy?

• Galaxy is a scientific workflow and data integration platform that aims to make computational biology accessible to research scientists that do not have computer programming experience.

• Although it was initially developed for genomics research, it is largely domain agnostic and is now used as a general bioinformatics workflow management system.

from Wikipedia in English

38

Galaxy

When and where

• Developed by a research group at Penn State University, Pennsylvania, USA

• First version in 2005

• Developers and users community involved in the project

39

Galaxy

http://galaxy.psu.edu

40

Galaxy

“Galaxy: A platform for interactive large-scale genome analysis”

Belinda Giardine et al. (Genome Research, 2005)

[…] An interactive system that combines the power of existing genome annotation databases with a simple Web portal to enable users to search remote resources, combine data from independent queries, and visualize the results.

The heart of Galaxy is a flexible history system that stores the queries from each user; performs operations such as intersections, unions, and subtractions; and links to other computational tools. […]

41

Galaxy

Foundations in Galaxy

• Galaxy is "an open, web-based platform for performing accessible, reproducible, and transparent genomic science"

Accessibility

• Galaxy stresses a simple user interface over the ability to build complex workflows.

• This design choice makes it relatively easy to build typical analyses, but more difficult to build complex workflows that include, for example, looping constructs.

from Wikipedia in English

42

Galaxy

Reproducibility

• Reproducibility is a key goal of science: when scientific results are published, the publications should include enough information that others can repeat the experiment and get the same results.

• Galaxy supports reproducibility by capturing sufficient information about every step in a computational analysis, so that the analysis can be repeated, exactly, at any point in the future.

• This includes keeping track of all input, intermediate, and final datasets, as well as the parameters provided to, and the order of, each step of the analysis.
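The principle of recording every step so that an analysis can be repeated can be sketched as follows. This is a toy illustration, not Galaxy's implementation: the `History` class, the tool names and the data are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class History:
    """Ordered record of every step: tool name, parameters, input and output."""
    steps: list = field(default_factory=list)

    def run(self, tool_name, tool, data, **params):
        """Run one tool and record what went in, what came out, and the parameters."""
        result = tool(data, **params)
        self.steps.append({"tool": tool_name, "params": params,
                           "input": data, "output": result})
        return result

    def replay(self, tools, data):
        """Repeat the recorded steps, in order, with the recorded parameters."""
        for step in self.steps:
            data = tools[step["tool"]](data, **step["params"])
        return data


# Hypothetical tools acting on a list of read lengths.
tools = {
    "filter_min": lambda values, minimum: [v for v in values if v >= minimum],
    "scale": lambda values, factor: [v * factor for v in values],
}

history = History()
reads = [35, 80, 120, 50]
filtered = history.run("filter_min", tools["filter_min"], reads, minimum=50)
final = history.run("scale", tools["scale"], filtered, factor=2)
repeated = history.replay(tools, reads)
```

Because the history keeps the inputs, the intermediate dataset, the parameters and the order of the steps, replaying it on the same input gives the same final result.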

43

Galaxy

Transparency

• Galaxy supports transparency in scientific research by enabling researchers to share any of their Galaxy Objects either publicly, or with specific individuals.

• Shared items can be examined in detail, rerun at will and copied and modified to test hypotheses.

44

Galaxy

45

(Bio) Web Services

MyExperiment

46

MyExperiment

What is MyExperiment?

“The myExperiment Virtual Research Environment enables you and your colleagues to share digital items associated with your research — in particular it enables you to share and execute scientific workflows.

You can use MyExperiment.org to find publicly shared workflows […]”

definition on the MyExperiment.org website

http://www.myexperiment.org

47

MyExperiment

What is MyExperiment?

“MyExperiment is a social website for researchers sharing research objects such as scientific workflows.

Its website […] contains a significant collection of scientific workflows for a variety of workflow systems, most notably Taverna, but also other tools such as Bioclipse.

myExperiment has a REST API and is based on an open source Ruby on Rails codebase.

[…]”

definition on Wikipedia in English

48

MyExperiment

MyExperiment details

• Started in 2007

• By research groups at the University of Manchester and University of Southampton, United Kingdom

• Nowadays (November 2011) it has:

• over 5,000 members

• over 250 groups

• over 2,000 workflows

• over 450 files

• over 150 packs

49

MyExperiment

50

MyExperiment

52

(Bio) Web Services

BioCatalogue

53

BioCatalogue

What is BioCatalogue?

“The BioCatalogue is a centralised registry of curated life science web services.

It allows you to easily discover, register, annotate, monitor and use web services”

definition on the BioCatalogue.org website

http://www.biocatalogue.org

“The BioCatalogue is a curated catalogue of Life Science Web Services”

definition from Wikipedia in English

54

BioCatalogue

“BioCatalogue: a universal catalogue of web services for the life sciences”

Jiten Bhagat et al. (Nucleic Acids Res. 2010)

“[...] The BioCatalogue provides a common interface for registering, browsing and annotating Web Services to the Life Science community.

Services in the BioCatalogue can be described and searched in multiple ways based upon their technical types, bioinformatics categories, user tags, service providers or data inputs and outputs.”

55

BioCatalogue

“Services in the BioCatalogue are also subject to constant monitoring, allowing the identification of service problems and changes and the filtering-out of unavailable or unreliable resources.

The system is accessible via a human-readable ‘Web 2.0’-style interface and a programmatic Web Service interface. The BioCatalogue follows a community approach in which all services can be registered, browsed and incrementally documented with annotations by any member of the scientific community. [...]”

56

BioCatalogue

BioCatalogue, some details

• Started in 2009

• Collaboration between the myGrid project at the University of Manchester (UK) and the European Bioinformatics Institute

• based on an open source Ruby on Rails codebase

• Nowadays (November 2011) it has:

• 2,261 services

• 155 service providers

• 620 members

57

BioCatalogue

58

(Bio) Web Services

BioMoby

59

BioMoby

What is BioMoby?

“MOBY is a system for interoperability between biological data hosts and analytical services.

The MOBY-S system defines an ontology-based messaging standard through which a client will be able to automatically discover and interact with task-appropriate biological data and analytical service providers, without requiring manual manipulation of data formats as data flows from one provider to the next.”

definition on the BioMoby.org website

60

BioMoby

What is BioMoby?

“BioMOBY is a registry of web services used in bioinformatics.

It allows interoperability between biological data hosts and analytical services by annotating services with terms taken from standard ontologies.”

definition from Wikipedia in English

61

BioMoby

"BioMOBY: An open source biological web services proposal"

Mark Wilkinson et al. (Briefings in Bioinformatics 2002)

“BioMOBY is an Open Source research project which aims to generate an architecture for the discovery and distribution of biological data through web services.

Data and services are decentralised, but the availability of these resources, and the instructions for interacting with them, are registered in a central location called MOBY Central.”

62

BioMoby

“BioMOBY adds to the web services paradigm, as exemplified by Universal Data Discovery and Integration (UDDI), by having an object-driven registry query system with object and service ontologies.

This allows users to traverse expansive and disparate data sets where each possible next step is presented based on the data object currently in-hand […]”
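The idea of an object-driven registry query can be sketched in a few lines. This is a toy model, not the MOBY Central interface: the `Registry` class, the service names and the object types are hypothetical.

```python
from collections import defaultdict


class Registry:
    """Toy central registry: services are indexed by the object type they consume."""

    def __init__(self):
        self._by_input = defaultdict(list)

    def register(self, name: str, input_type: str, output_type: str) -> None:
        self._by_input[input_type].append((name, output_type))

    def next_steps(self, object_type: str):
        """List the services that accept the data object currently in hand."""
        return list(self._by_input.get(object_type, []))


# Hypothetical services and object types, for illustration only.
registry = Registry()
registry.register("fetch_sequence", "GeneIdentifier", "DNASequence")
registry.register("translate", "DNASequence", "ProteinSequence")
registry.register("gc_content", "DNASequence", "Number")

options = registry.next_steps("DNASequence")
```

A client holding a `DNASequence` is offered `translate` and `gc_content`; after calling `translate` it would query the registry again with `ProteinSequence`, and so on.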

63

BioMoby

64

The end