

PARTNER Grant Agreement Number 215840

WP22 - D.1 Existing ITC based medical collaborative infrastructures and applications using it, existing deficiencies and problems and expected added value

Faustin Laurentiu Roman

Host Organisation: CERN

Supervisors: Prof. Manjit Dosanjh, Prof. Jose Bernabeu

Date: November 2009

Health Grids: overview and added-value

PARTNER Work Package 22, Deliverable 1

Prototype Grid hadron therapy testbed

Faustin Laurentiu Roman

PARTNER ESR, CERN

[email protected]

November 2009

Abstract

This report gives an overview of state-of-the-art IT medical collaborative infrastructures that use Grids as their underlying technology. The evaluation looks mainly at the Grid infrastructure and subsystems, comparing the projects' approaches to secure medical data sharing. While the ACGT and caBIG projects are analyzed for their extensive middleware components and tools, the BEinEIMRT radiotherapy Grid is able to securely access external resources negotiated via Service Level Agreements. Lastly, two science gateways, HOPE and NCRI ONIX, provide examples of both clinical and research data exchange.

Based on the evaluation of these projects, a proposal is made for the PARTNER prototype Grid testbed. Conceptually, the Platform will be based on open and standard Grid middleware services for semantic data integration, providing data anonymization enforced by strong security policies.

This is the first deliverable of PARTNER Work Package 22 within the Marie Curie Initial Training Fellowship of the European Community's Seventh Framework Programme under contract number PITN-GA-2008-215840-PARTNER.

Supervisors:
CERN (host institute): Prof. Manjit Dosanjh, [email protected]
IFIC (associated institute): Prof. Jose Bernabeu, bernabeu@ific.uv.es

Co-Supervisors:
IFIC: Dr. Jose Salt Cairols, salt@ific.uv.es
IFIC: Dr. Gabriel Amoros, amoros@ific.uv.es

Contents

1 Introduction
    1.1 Motivation
    1.2 PARTNER and Health Grids
    1.3 Report Structure

2 Grid technology
    2.1 Introduction
    2.2 gLite
        2.2.1 Data Deployment Model
        2.2.2 Computational Deployment Model
    2.3 Medical Data Manager

3 ACGT
    3.1 Project overview
    3.2 Scenarios
    3.3 ACGT Architecture
    3.4 ACGT Tools

4 caBIG
    4.1 Project overview
    4.2 Services
    4.3 Infrastructure
        4.3.1 caGrid Metadata Services components:
        4.3.2 Security
        4.3.3 Other Components

5 Radiotherapy Grids
    5.1 Overview
    5.2 eIMRT
    5.3 BEinEIMRT project and Service Level Agreements
    5.4 Grid Components

6 Medical Knowledge Portals
    6.1 HOPE
    6.2 NCRI ONIX

7 Conclusions
    7.1 Requirements
    7.2 Challenges
    7.3 Grid Added Value
    7.4 Ethics and legislation

8 Recommendations
    8.1 Platform Proposal Draft
    8.2 Infrastructure
        8.2.1 Grid middleware
        8.2.2 Data Integration
    8.3 Sources of Data
    8.4 Standard interfaces
    8.5 Security
    8.6 Use cases

Chapter 1

Introduction

1.1 Motivation

In Europe, cancer is the second largest cause of death, responsible for 25% of all deaths, and is the biggest killer of people aged 45-64 [1]. Once diagnosed, cancer is usually treated with a combination of surgery, chemotherapy and radiotherapy. Radiation therapy injures or destroys cells in the area being treated (the "target tissue") by damaging their genetic material. The goal of radiation therapy is to damage as many cancer cells as possible, while limiting harm to nearby healthy tissue.

Hadron therapy is a form of radiotherapy that uses beams of energetic protons or ions (also called "hadrons") for cancer treatment. Hadron therapy makes use of the Bragg peak effect and the Relative Biological Effectiveness of hadrons to induce local damage in tumors [2], and is best suited for the treatment of radio-resistant tumors, which can account for 10% of cancer cases.

There are more than 25 proton therapy centers worldwide (end of 2009) and only two centers using carbon ions in operation [3], both in Japan. Europe will receive a boost in particle therapy [4] in the form of a new facility in Heidelberg, Germany, the Heidelberg Ion Therapy Center (HIT) [5], which started operation in November 2009, followed by another facility in Pavia, Italy, the National Center for Oncological Hadrontherapy (CNAO) [6], by 2010. Future plans include a French facility, ETOILE [7], and an Austrian one, MedAustron [8].

In numbers, 15% of the approximately 20 000 patients per 10 million inhabitants who are treated with conventional radiation would receive a better treatment with hadron beams [9]. This requires a proton therapy center for every 10 million people and a carbon ion center for every 50 million people.

All these are reasons to connect the future centers and the largest possible number of European hospitals with a powerful and integrated e-infrastructure based on shared use of knowledge for an optimized treatment of patients.

1.2 PARTNER and Health Grids

A key issue for the Particle Training Network for European Radiotherapy (PARTNER) Project [10] is the optimal use of the limited hadron-therapy facilities. This implies computer-assisted distribution and scheduling of patients as well as sharing and transferring of clinical data between practitioners, with easy and efficient communication tools and common standards for digitized, secure patient data.

The "Grid" concept means coordinated resource sharing and problem solving in dynamic, multi-institutional virtual organizations (VOs) whilst ensuring strict confidentiality. Sharing concerns not only file exchange but also direct access to computers, software, data, and other resources [11].

Health Grids are Grid infrastructures comprising applications, services or middleware components that deal with the specific problems arising in the processing of biomedical data. Resources in health Grids are databases, computing power, medical expertise and even medical devices [12].

The objective of PARTNER Work Package 22 [13] is to create such a collaborative Grid test-bed for hadron therapy, based on collaboration between centers in different European countries. An important part of the work will be to identify the common set of rules that allow medical data to be used for scientific analysis while respecting the national and European rules protecting patient data.

The Grid Platform involves the collaborative work of two work packages (22 and 23), with the rare tumor database (Work Package 24) as a test-case service, as described in Annex I of the PARTNER Proposal [10].

1.3 Report Structure

The previous sections introduced the motivation for this report and the importance of Grids in hadron therapy. The remainder of this report is structured as follows: Chapter 2 introduces the Grid technology, its basic components and high-level services; Chapters 3 to 6 look at the infrastructure and related subsystems of the evaluated projects; Chapter 7 summarizes the project evaluation and discusses open issues; and Chapter 8 outlines a proposal for the PARTNER Platform.

Chapter 2

Grid technology

2.1 Introduction

The "Grid" is a collection of resources, e.g. computers, storage, and services, that can dynamically join and leave the Grid. These resources are heterogeneous in every aspect, geographically distributed and connected by a wide-area network. The key concept is the ability to negotiate resource-sharing arrangements among a set of participating parties federated in VOs and then use the resulting resource pool for some purpose [14].

Grid Middleware refers to the security, resource management, data access, instrumentation, policy, accounting, and other services required for applications, users, and resource providers to operate effectively in a Grid environment. Middleware acts as a sort of "glue" which binds these services together.

A simplified picture of the layered Grid architecture [11] compared to the Internet protocol architecture is shown in Figure 2.1. Grid architecture components within each layer share common characteristics and can build on capabilities and behaviors provided by any lower layer. Resource and Connectivity protocols facilitate the sharing of individual resources implemented on a Fabric layer. These in turn can be used to construct a wide range of global services and applications at the Collective layer. The final layer comprises the user applications that operate within a VO environment. Applications are constructed in terms of, and by calling upon, services defined at any layer.
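The layering rule described above can be illustrated with a small Python sketch. It is only an illustration: the names GridLayer, may_build_on and usable_layers are hypothetical, and the bottom-to-top ordering of the five layers (Fabric, Connectivity, Resource, Collective, Application) is an assumption taken from the layered architecture of [11] as summarized in the text.

```python
from enum import IntEnum


class GridLayer(IntEnum):
    """Layers of the Grid architecture, ordered from lowest to highest (assumed order)."""
    FABRIC = 1
    CONNECTIVITY = 2
    RESOURCE = 3
    COLLECTIVE = 4
    APPLICATION = 5


def may_build_on(upper: GridLayer, lower: GridLayer) -> bool:
    """A component may build on capabilities provided by any strictly lower layer."""
    return lower < upper


def usable_layers(layer: GridLayer) -> list[GridLayer]:
    """Return every layer whose services a component at `layer` can call upon."""
    return [candidate for candidate in GridLayer if candidate < layer]
```

For instance, under these assumptions a Collective service may use Fabric, Connectivity and Resource capabilities, whereas a Fabric component has no lower layer to build on.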

2.2 gLite

To exemplify the various services of Grid middleware, gLite [15] was chosen as the Grid middleware implementation because of the very broad spectrum of applications and projects using it.

The gLite middleware, developed by the EGEE project [16], is a service-oriented Grid middleware providing services for managing distributed computing and storage resources and the required security, auditing and information services. The architecture is converging towards the Open Grid Service Architecture (OGSA) [17] and builds on top of the Web Services Architecture [18] plus a set of extensions (e.g., the Web Services Resource Framework [19]).

The following section briefly presents gLite middleware components grouped by service type. Only major gLite components considered relevant to the scope of the PARTNER prototype project are listed and described. Figure 2.2 depicts the gLite high-level services, which can be thematically grouped into five service groups: Access Services, Information and Monitoring Services, Job Management Services, Data Management Services and Security Related Services [20].

Figure 2.1: Grid compared with the Internet protocol architecture

Figure 2.2: gLite Services

2.2.1 Data Deployment Model

Since the PARTNER Platform is oriented towards data sharing services, below is a list of core gLite services that enable data sharing capabilities [21]. Figure 2.3 shows how the components work together in a Grid infrastructure [22].

• Access Services

· User Interface (UI) combines all the clients that allow the user to directly interact with the Grid services, e.g. CLI, API.

· Portals can provide the same functionality as the UI while being more universal and simpler to use.

• Security services

· Security is based on Public Key Infrastructure (PKI) X.509 technology using Certificate Authorities (CAs).

· Virtual Organization Membership Service (VOMS) is used for managing authorization, membership and member rights within a VO.

· Grid Policy BOX (G-PBox) is a policy framework used by VO and site administrators that helps in the creation and application of authentication policies between Grid services.

· MyProxy is used as a secure proxy store, e.g. for long-term operations.

· Hydra is a secure key storage and, together with the encrypted storage C library, provides on-the-fly block-level data encryption and decryption.

• Data Management Services

· Data storage is handled by Storage Resource Management (SRM) systems, which manage disk space, tape space or a combination of the two, e.g. the DPM, dCache or CASTOR SRM implementations.

· Disk Pool Manager (DPM) is the recommended solution for lightweight deployment at smaller sites; bigger sites usually choose dCache because of its robustness, scalability and advanced features; and CERN Advanced STORage (CASTOR) is an implementation used by sites that have both disk and tape storage.

· Grid File Transfer Protocol (GridFTP) is a high-performance, secure, reliable data transfer protocol optimized for high-bandwidth wide-area networks. It is used as the primary data transfer interface to Storage Elements.

· Metadata and Catalog Services are offered using the LCG File Catalog (LFC), which provides a hierarchical view of the storage locations of files and replicas. The LFC server communicates with a database (MySQL), where all the data is stored. The LFC catalog also exposes a Data Location Interface (DLI), a web service used by applications and Resource Brokers.

· AMGA is a generic metadata catalog that can replicate metadata between different AMGA instances, allowing the federation of metadata. It is widely used in biomedical VOs.

· The Local Transfer Service and the File Transfer Agents provide the file transfer/file placement service (FTS/FPS), which is used for moving files between Storage Elements (SEs).

• Information and Monitoring Services

· Grid services provide information about their status in a form defined by the Grid Laboratory for a Uniform Environment (GLUE) schema: hierarchical for BDII and relational for R-GMA.

· Berkeley Database Information Index (BDII) provides information about the status of Grid services and available resources.

· R-GMA Server is an information and monitoring system for use both by the Grid middleware and by applications.

· Service Discovery is a facility for locating suitable services, offered to both end users and other services. It is implemented as a client library front-end to one or more information systems.
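To make the idea of a file catalog with a hierarchical view of files and replicas more concrete, the following Python sketch models a toy in-memory catalog. It is not the LFC API: the class ReplicaCatalog, its methods and the example paths and storage URLs are all hypothetical and serve only to illustrate the mapping from a logical file name to one or more replica locations.

```python
from collections import defaultdict


class ReplicaCatalog:
    """Toy in-memory catalog mapping logical file names to replica locations."""

    def __init__(self) -> None:
        # logical file name -> set of storage URLs holding a copy of the file
        self._replicas: dict[str, set[str]] = defaultdict(set)

    def register(self, lfn: str, replica_url: str) -> None:
        """Record that a copy of the logical file is stored at replica_url."""
        self._replicas[lfn].add(replica_url)

    def replicas(self, lfn: str) -> list[str]:
        """Return all known replica locations of a logical file, sorted."""
        return sorted(self._replicas.get(lfn, set()))

    def list_dir(self, directory: str) -> list[str]:
        """Return the immediate children of a directory in the logical namespace."""
        prefix = directory.rstrip("/") + "/"
        children = set()
        for lfn in self._replicas:
            if lfn.startswith(prefix):
                # Keep only the first path component below the directory.
                children.add(lfn[len(prefix):].split("/", 1)[0])
        return sorted(children)
```

In this sketch a client resolves a logical name to its replicas and then chooses one to transfer from; a real catalog would persist the mapping in a database and enforce access control on each entry.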

2.2.2 Computational Deployment Model

For completeness, here are the extra services required for the computational Grid:

• Job Management Services

· A Computing Element (CE) interfaces the local resource management system, LRMS (e.g. LSF, PBS), to the Grid middleware. A CE includes a Grid Gate (GG), a Gatekeeper for the CE based on Globus that acts as a generic interface to the cluster LRMS, and the cluster itself, a collection of nodes or just one multiprocessor system where the jobs are run.

· The Worker Nodes behind the LRMS host all the necessary clients to interact with the Grid middleware from within a job.

· CREAM (Computing Resource Execution And Management) is the new type of CE developed in the EGEE project. It is a simple, lightweight service that implements all the operations required at the CE level. The interface of gLite-CE and CREAM with the underlying LRMS is implemented via BLAH.

· The Batch Local Ascii Helper (BLAH) acts as an interface to an LRMS whose submission interface and commands are available locally on a given host. Currently LSF, PBS and Condor are supported, with SGE planned. The BLAH service provides a common interface for job submission, job hold, job resume, job status, job cancel and proxy renewal.

Figure 2.3: gLite Architecture

• Workload Management services

· The Workload Management Service (WMS) is used to submit jobs to the Grid and to monitor them.

· The Resource Broker (RB) decides which resource should be used, based on a matchmaking process between submission requests and available resources.

· The Logging and Bookkeeping service (LB) keeps track of the job status information.
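The matchmaking step performed by a resource broker can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the gLite implementation: the Resource and JobRequest types, their attributes and the ranking rule (prefer the resource with the most free CPUs) are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Resource:
    """Hypothetical description of a computing resource published to the broker."""
    name: str
    free_cpus: int
    memory_mb: int
    supported_vos: frozenset


@dataclass(frozen=True)
class JobRequest:
    """Hypothetical submission request with its requirements."""
    vo: str
    cpus: int
    memory_mb: int


def matches(job: JobRequest, res: Resource) -> bool:
    """A resource matches if it serves the job's VO and meets its requirements."""
    return (
        job.vo in res.supported_vos
        and res.free_cpus >= job.cpus
        and res.memory_mb >= job.memory_mb
    )


def select_resource(job: JobRequest, resources: list) -> Optional[Resource]:
    """Filter the matching resources, then rank them by free CPUs (assumed rule)."""
    candidates = [res for res in resources if matches(job, res)]
    if not candidates:
        return None
    return max(candidates, key=lambda res: res.free_cpus)
```

The two phases, filtering on requirements and then ranking the survivors, are the essence of matchmaking; a production broker would draw resource descriptions from the information system rather than from a static list.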

2.3 Medical Data Manager

On top of the "classic" Grid middleware components, software frameworks can be used to enhance the usual middleware capabilities. The Medical Data Manager (MDM) [24] system provides sensitive data management on the EGEE Grid infrastructure.

MDM provides access to medical sources for Grid services and users while taking into account the constraints related to clinical practice. In particular, patient privacy is guaranteed, data location is maintained in acquisition centers, and the standard clinical data flow is preserved.

Figure 2.4: Medical Data Manager Architecture

In technological terms, the MDM service provides an interface between the DPM storage element and the Digital Imaging and Communications in Medicine (DICOM) storage system [25]. The DICOM system is designed for internal hospital usage, and the DPM-DICOM interface provides a method to use medical images securely in a Grid environment. The AMGA metadata catalog is used to store the medical-specific metadata of the images that are transferred to the DPM storage element. The architecture of this system can be seen in Figure 2.4.

A minimal metadata schema is defined in the MDM service for all stored images. It provides basic information on the patient owning the image, the image properties and the acquisition parameters. A centralized AMGA server is used to store all metadata segments extracted from the different images stored in several DICOM servers. A more realistic deployment scenario in a clinical environment would be to distribute the metadata over the different sites owning the data, both to improve reliability (avoiding a single point of failure) and to make the system more acceptable to the end users (sensitive information remains stored inside the hospital).

The Medical Data Manager software platform is a very good example of a Grid infrastructure that can share medical data, and it shows how various gLite components are combined to achieve this goal.

Chapter 3

ACGT

3.1 Project overview

Advancing Clinico-Genomic Clinical Trials on Cancer (ACGT) is an EU FP6 project aiming at developing open-source, semantic and Grid-based technologies in support of post-genomic clinical trials in cancer research [26].

The overall objective of the project is to develop an IT infrastructure that will support a large variety of analyses taking place in the context of clinical trials and bio-lab research in two cancer areas: breast cancer and childhood nephroblastoma. The project runs from February 2006 to January 2010, and the work is divided among 25 partners [27].

3.2 Scenarios

Two clinical trials are tested in the ACGT Platform:

• The SIOP trial, conducted by Saarland University and concerning Wilms tumor, looks at the identification of markers in serum that can be used as predictors of patient response to chemotherapy.

· In this scenario there is a small amount of data and a complex patient follow-up, which makes it a perfect test case for the ACGT ObTiMA and Oncosimulator tools, see Sec. 3.4.

• The TOP trial, led by Institut Jules Bordet and concerning breast carcinoma, tries to assess patient clinical response to different strategies in neo-adjuvant treatment with epirubicin.

· This trial will access large data sets (expression microarray-based) and is the ideal test case for the ACGT anonymisation and data mining pipeline services.

3.3 ACGT Architecture

Figure 3.1 shows the basic layers of the ACGT architecture [28], looking at the different functions performed within the ACGT environment and the data flow inside the infrastructure.

There is a logical separation of the Grid fabric into two levels that are supported by a security layer. The lower level has the following three layers:

• Hardware layer: basic hardware fabric and resources, network and primary databases

• Common Grid infrastructure provides remote access to individual resources from the Hardware layer and is based on Globus Toolkit 4 components [29]:

· Grid Resource Allocation and Management (GRAM) service, which provides a single interface for requesting and using remote system resources [30],

· GridFTP service for data transfer and

· MDS service for resource monitoring and discovery.

• Advanced Grid Middleware makes use of the Gridge Toolkit [31] to provide an extra abstraction layer on top of the common Grid infrastructure by using the following services:

· GAS: Authorization service, which is a key decision point for the rest of the components

· GDMS: for data management, providing a common access interface, metadata repository and data storage

· GRMS: GridLab Resource Management System, which provides meta-scheduling capabilities [32]

· Apart from the Gridge Toolkit there is OGSA-DAI [33], which allows data resources, such as file collections, relational or XML databases, to be accessed, integrated and federated inside the Platform.

The higher level consists of:

• ACGT Business Process Services, which provide integration of different data and resources, i.e. Ontology Services, Knowledge Discovery Services, VO Management Service, Mediator Services, Analytical Services and the Workflow enactor.

• User Access and High Level Interoperability layer, which gives end users access (through standalone applications or portals) by enabling specific ACGT scenarios and Visualization Tools. The portal is based on Gridsphere technology [34].

All five layers presented above are supported by a Security Layer enforced by standard Grid security mechanisms such as GSI (Grid Security Infrastructure) and GAS (Gridge Authorization Service).

On top of this Security Layer there is a Data Protection Framework (DPF) enforced by a third-party non-profit organisation, the Center for Data Protection [35].

Figure 3.1: The ACGT Architecture

3.4 ACGT Tools

Looking at the ACGT services and their individual relations within the platform (Figure 3.2), we have the following tools (from bottom to top):

• The Trial Builder: ObTiMA (Ontology Based Clinical Trial Management) captures the data definition and further design specifications for a clinical trial in a standardized way based on a formal ontology.

• Custodix Anonymisation Tool (CAT): anonymisation tool that sits between the hospital firewall and the outside world.

• Master Ontology and Mediator: support the integration of multi-level biomedical data.

• Workflow editor and enactor: graphical front-end for building experiments using ACGT components.

• Oncosimulator: cancer treatment optimization on a patient-individualized basis by performing in-silico experiments simulating the response of tumours and affected normal tissues to therapeutic schemes.

• GridR: Gridified version of the R statistical environment, providing a powerful framework for the analysis of clinico-genomic trials involving large amounts of data, e.g. microarray-based clinical trials.

• Literature Mining: finds interesting connections between seemingly disparate chunks of knowledge that may help solve a task at hand.

As an example of how different Grid components work together, Figure 3.3 shows the usage of the Oncosimulator. The batch job is executed in the Grid environment and involves complex computation, image manipulation and, moreover, anonymization of data collected from different sources.

Figure 3.2: ACGT Architecture Components

Figure 3.3: Oncosimulator Scenario Workflow

Chapter 4

caBIG

4.1 Project overview

caBIG [36] stands for the Cancer Biomedical Informatics Grid, an information network enabling the cancer community to share data and knowledge. The initiative is overseen by the National Cancer Institute Center for Biomedical Informatics and Information Technology (CBIIT), and as of August 2008 there are 46 NCI-designated Cancer Centers and 16 Community Cancer Centers connected to caBIG tools and infrastructure.

4.2 Services

The caBIG platform comprises many services. Depending on the area of use, caBIG provides frameworks that bundle all the tools and services for data integration together with the associated Grid infrastructure. These tools are organized into six caBIG domains, called Knowledge Centers:

1. caGrid - Knowledge resources for communities interested in learning about, using, and contributing to caGrid.

2. Vocabulary - Knowledge resources for individuals and institutions interested in making use of or extending caBIG tools and other vocabulary tools. Steward of the following tools: LexBIG/LexEVS, NCI Protege and LexWiki.

3. Data Sharing and Intellectual Capital - Centralized, authoritative repository of processes, model agreements, and other resources to encourage and facilitate data sharing to advance scientific discovery consistent with applicable legal, regulatory, ethical, and contractual requirements.

4. Clinical Trials Management Systems - clinical trials management, including: Cancer Central Clinical Participant Registry (C3PR), Cancer Adverse Events Reporting System (caAERS), Patient Study Calendar (PSC), caXchange, Labviewer/CTODS and C3D Connector.

5. Molecular Analysis Tools - molecular analysis, including: caArray, caIntegrator, geWorkbench and GenePattern.

6. Tissue/Biospecimen Banking and Technology Tools - biospecimen management, including: caTissue Suite, caTissue Core, and caTIES.

In the following sections only the caGrid infrastructure components will be described, as these are relevant to the present PARTNER Platform objectives.

4.3 Infrastructure

caGrid [37] is the backbone for data and message exchange across all tools. It provides the connectivity among the upper layers, a common identity and security management, message transport and routing, and secure access, query, and retrieval of data across tools.

caGrid is designed to support the sharing of well-defined data. Some of the ways this is achieved include the use of controlled vocabularies to label data items, the use of detailed information models to describe how data items relate to each other, and the use of concept definitions which indicate the precise meaning of items in the models.

caGrid uses Globus Toolkit [29] and Mobius [38], and tools developed by the NCI, such as the caCORE infrastructure [39], to deliver functionality. caGrid services are standard WSRF v1.2 services and can be accessed by any specification-compliant client.

4.3.1 caGrid Metadata Services components:

• Cancer Data Standards Repository (caDSR): registers the projects' data models as Common Data Elements (CDEs), which are semantically harmonized and then centrally stored and managed. The caDSR Grid service provides model discovery and traversal and caGrid standard metadata generation capabilities.

• Enterprise Vocabulary Services (EVS): a set of services and resources for controlled vocabulary that provides query access to the data semantics and the controlled vocabulary managed by the EVS.

• Global Model Exchange (GME): a DNS-like data definition registry and exchange service that is responsible for storing and linking together data models in the form of XML schemas.

• Globus Information Services: provides a generic framework for aggregation of service metadata, a registry of running Grid services, and a dynamic data generating and indexing node, suitable for use in a hierarchy or federation of services.

4.3.2 Security

A software package called "GAARDS" (Grid Authentication and Authorization with Reliably Distributed Services) implements the core security services for caGrid, as seen in Figure 4.2. GAARDS was developed on top of the Globus Toolkit and extends the Globus Grid Security Infrastructure (GSI) with the following components:

• Dorian service for the provisioning and management of Grid user accounts.


Figure 4.1: caGrid Architecture

• Grid Trust Service (GTS) for maintaining and provisioning a federated trust fabric consisting of trusted certificate authorities, allowing Grid services to make authentication decisions against the most recent information.

• Grid Grouper for a group-based authorization solution for the Grid.

• Authentication Service for issuing SAML assertions for existing credential providers so they may easily integrate with Dorian and other Grid credential providers.

• Credential Delegation Service for a client (the delegator) to be able to express a delegation policy, entitling a prescribed collection of other Grid entities (the delegates) to assume the delegator's identity for a limited time.
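
The delegation policy described in the last item can be pictured with a small sketch. The class and field names below are hypothetical and are not part of the GAARDS or Credential Delegation Service API; the sketch only shows the two checks implied by the text, namely membership in the prescribed collection of delegates and a time limit.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import FrozenSet


@dataclass(frozen=True)
class DelegationPolicy:
    """Hypothetical policy: who may act for the delegator, and until when."""
    delegator: str
    delegates: FrozenSet[str]
    expires_at: datetime

    def allows(self, entity: str, now: datetime) -> bool:
        # Both conditions must hold: listed delegate and policy not yet expired.
        return entity in self.delegates and now < self.expires_at


# Illustrative values only (assumed, not taken from the report).
issued = datetime(2009, 11, 1, 9, 0, tzinfo=timezone.utc)
policy = DelegationPolicy(
    delegator="dr_alice",
    delegates=frozenset({"workflow_service", "query_service"}),
    expires_at=issued + timedelta(hours=12),
)

inside_window = issued + timedelta(hours=1)
after_window = issued + timedelta(hours=13)
print(policy.allows("workflow_service", inside_window))  # listed, within the time limit
print(policy.allows("workflow_service", after_window))   # listed, but the policy expired
print(policy.allows("unknown_service", inside_window))   # not a listed delegate
```

A real delegation service would additionally bind the policy to signed credentials; that part is deliberately left out here.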

4.3.3 Other Components

The Introduce toolkit [40] helps a service developer by coordinating the various steps of service development and deployment via the tools provided by the GT, and by managing the necessary directories and files. It abstracts away the details of invoking the various tools, so that the developer is free to concentrate on implementing the domain-specific code.

caGrid Query Language (CQL) is a custom object-oriented query language for all data services deployed to the Grid. That is, all Grid queries are expressed in CQL, and each caGrid-compliant data service is required to be able to consume CQL queries. caGrid also provides a federated query processing service: the Federated Query Processor (FQP) takes DCQL (an extended form of CQL) as input to perform a federated query. DCQL queries are broken down into CQL queries, which are passed to the individual data services.
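
The fan-out idea behind federated querying can be sketched as follows. This is a simplified illustration under stated assumptions: the `DataService` class, the record layout and the `federated_query` function are hypothetical stand-ins and do not reproduce CQL, DCQL or any caGrid API. The sketch only shows one query being sent to several data services and the answers being merged.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical record type: a plain dictionary with a "type" field.
Record = Dict[str, object]


@dataclass
class DataService:
    """A toy data service that answers single-service queries."""
    name: str
    records: List[Record]

    def query(self, target: str, predicate: Callable[[Record], bool]) -> List[Record]:
        # Return the records of the requested type that satisfy the predicate.
        return [r for r in self.records if r["type"] == target and predicate(r)]


def federated_query(services, target, predicate):
    """Send the same query to every service and merge the results."""
    merged = []
    for service in services:
        for record in service.query(target, predicate):
            # Tag each hit with its origin so the caller can trace provenance.
            merged.append({**record, "source": service.name})
    return merged


site_a = DataService("site_a", [
    {"type": "Specimen", "id": 1, "diagnosis": "glioma"},
    {"type": "Specimen", "id": 2, "diagnosis": "melanoma"},
])
site_b = DataService("site_b", [
    {"type": "Specimen", "id": 7, "diagnosis": "glioma"},
    {"type": "Image", "id": 8, "diagnosis": "glioma"},
])

hits = federated_query(
    [site_a, site_b], "Specimen", lambda r: r["diagnosis"] == "glioma"
)
for hit in hits:
    print(hit["source"], hit["id"])
```

A real federated processor also has to join results across services; the sketch covers only the simpler aggregation case.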


Figure 4.2: caBIG Security

Lastly, there are two main types of caGrid services provided by service providers:

• Data Services share data on the Grid. The data might reside in a data repository such as a relational database (RDBMS), an XML database, or a file system. A common type of data service in caBIG is a "caCORE-backed" data service, i.e. a data service for which the back-end implementation is created by the caCORE SDK. caCORE provides the building blocks and tools needed to develop interoperable information management systems.

• Analytical Services provide access to analysis routines over the Grid, either through object-oriented client APIs or through strongly typed interfaces.


Chapter 5

Radiotherapy Grids

5.1 Overview

In radiotherapy, a treatment plan needs to be set up for each patient in order to know the dose delivered to the tumor and to the neighboring healthy tissue and organs. This is generally done with commercially available software, called Treatment Planning Systems (TPS), that is optimized for a particular treatment machine and uses analytical algorithms. However, it would be preferable to use Monte Carlo TPS, especially in cases involving anatomically inhomogeneous areas of the body, e.g. lungs, head and neck, where current TPS do not perform adequately. The main use cases of Monte Carlo TPS are:

1. verification of treatments and comparison with commercial TPS

2. search for treatment alternatives based on patient images and therapy constraints.

The challenge in both scenarios is to finish the computationally intensive Monte Carlo simulations in a reasonable time.

5.2 eIMRT

e-IMRT [41] was started in 2005 by CESGA, the University of Santiago de Compostela, the University of Vigo and the Complexo Hospitalario Universitario de Santiago (all of them in Galicia, Spain), in collaboration with the Computer Sciences Department of the University of Wisconsin-Madison. The project aims to exploit the computational power of Grids in order to speed up Monte Carlo TPS.

The eIMRT computational architecture, Fig. 5.1, consists of:

• a client that connects through Web Services to

• a Web server that analyzes the task and

• transfers it via a computing interface

• to computing resources that finally execute the jobs, either locally or on the Grid.


Figure 5.1: eIMRT Architecture

Figure 5.2: BEinEIMRT Architecture

5.3 BEinEIMRT project and Service Level Agreements

In going from proof-of-concept to production infrastructure, the eIMRT platform became a business experiment in the BEinGrid project (Business Experiments in Grid) [42]. In this way the eIMRT platform was able to securely access external resources negotiated via Service Level Agreements (SLAs), thus adding Quality of Service (QoS) to Grids [43].

The new platform makes use of the GridWay meta-scheduler for resource negotiation and allocation, and enhances security with the components provided by Vordel and Axiomatics, Fig. 5.2.

As an example, BEinEIMRT normally uses the services on local resources. If the available resources are not enough, the negotiation starts and adds more resources automatically using SLAs. This process is transparent to the final user, reducing the time to solution and increasing the QoS.
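
The "local first, negotiate the rest" behaviour can be sketched as below. This is a minimal illustration, assuming a simple price limit as the only negotiation criterion; the `Offer` class, the `plan_resources` function and the provider names are hypothetical and do not reflect the actual GridWay or BEinEIMRT interfaces.

```python
from dataclasses import dataclass


@dataclass
class Offer:
    """A hypothetical SLA offer from an external resource provider."""
    provider: str
    slots: int
    price_per_slot: float


def plan_resources(jobs, local_slots, offers, max_price_per_slot):
    """Use local slots first; cover any shortfall with the cheapest acceptable offers."""
    plan = {"local": min(jobs, local_slots)}
    shortfall = jobs - plan["local"]
    # Only offers within the price limit are considered, cheapest first.
    acceptable = sorted(
        (o for o in offers if o.price_per_slot <= max_price_per_slot),
        key=lambda o: o.price_per_slot,
    )
    for offer in acceptable:
        if shortfall == 0:
            break
        taken = min(shortfall, offer.slots)
        plan[offer.provider] = taken
        shortfall -= taken
    # Jobs that could not be placed under the given limits.
    plan["unplaced"] = shortfall
    return plan


offers = [
    Offer("provider_x", 20, 0.8),
    Offer("provider_y", 50, 0.5),
    Offer("provider_z", 100, 2.0),
]
plan = plan_resources(jobs=100, local_slots=40, offers=offers, max_price_per_slot=1.0)
print(plan)
```

With the assumed numbers, 40 jobs stay local, 50 go to the cheapest provider and the remaining 10 to the next one; the offer above the price limit is never used.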


Figure 5.3: Policy Based Access Control Workflow

5.4 Grid Components

The GridWay framework manages job execution and resource brokering on heterogeneous and dynamic Grids, making it a "meta-scheduler". It provides fault recovery mechanisms, dynamic scheduling, migration on request and opportunistic migration.

A Service Level Agreement (SLA) is a formal written agreement between the service provider and the client that states the delivery parameters of the service. SLAs contain clauses, incentive awards and penalty provisions, and are used to generate QoS agreements.

GridWay adopted the SLA model by adding a plugin, BrokerGW-SLA, that facilitates resource negotiation when resources are scarce. The negotiation is based on SLAs or pre-Agreements for the requirements and limits, and is performed through Web Services using the WS-Agreement standard [45].

Given the confidential nature of the data, security is enforced using Policy Based Access Control [48], which has two components¹:

• Policy Enforcement Point (PEP) is an XML Security Gateway that checks and validates all service requests against a security policy. The PEP delegates access control requests to an authorization service, the Policy Decision Point (PDP). It makes use of several standards: WS-Policy, WS-Trust and WS-Security.

• Policy Decision Point (PDP) allows multiple different applications, with embedded PEPs, to share the same policies. It is an authorization service based on the XACML standard for representing and evaluating access control policies and requests.

In short, the PEP mediates the access requests submitted by the users and enforces the access decisions established by the PDP according to the access control policies, Fig. 5.3.
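
The division of labour between PEP and PDP can be sketched in a few lines. This is an illustrative model only: the classes, the role/resource/action triple and the deny-by-default rule are assumptions made for the example, and they do not reproduce XACML or the commercial products mentioned in the footnote.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Rule:
    """One permitted (role, resource, action) combination."""
    role: str
    resource: str
    action: str


class PolicyDecisionPoint:
    """Holds the shared policies and answers permit/deny questions."""

    def __init__(self, rules: List[Rule]):
        self._rules = set(rules)

    def decide(self, role: str, resource: str, action: str) -> str:
        # Deny by default: only explicitly listed combinations are permitted.
        return "Permit" if Rule(role, resource, action) in self._rules else "Deny"


class PolicyEnforcementPoint:
    """Intercepts every request and enforces the PDP's decision."""

    def __init__(self, pdp: PolicyDecisionPoint, service):
        self._pdp = pdp
        self._service = service

    def handle(self, role: str, resource: str, action: str):
        if self._pdp.decide(role, resource, action) != "Permit":
            raise PermissionError(f"{role} may not {action} {resource}")
        # Only permitted requests ever reach the protected service.
        return self._service(resource, action)


pdp = PolicyDecisionPoint([Rule("physicist", "treatment_plan", "verify")])
pep = PolicyEnforcementPoint(pdp, lambda resource, action: f"{action} done on {resource}")

result = pep.handle("physicist", "treatment_plan", "verify")
print(result)
```

Because the policies live only in the PDP, several PEPs can be constructed around the same `pdp` object and will share its decisions, which mirrors the sharing described above.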

¹ The Radiotherapy Grid uses commercial products: Vordel XML Gateway 5 for the PEP and Axiomatics Policy Server 3 for the PDP.


Chapter 6

Medical Knowledge Portals

In the following subsections I describe two science gateways that enable researchers and clinicians to access data in a collaborative and secure way. The first portal focuses more on the clinical side, whereas the second focuses on research.

Both portals make use of Grid technologies, using Globus middleware, and have connections with the projects discussed in the previous sections, e.g. through gLite or caBIG components.

6.1 HOPE

HOPE (HOspital Platform for E-health) is a web platform developed jointly at CNRS and HealthGrid in France that allows hospital sites to exchange medical information. The platform offers a complete, transparent and secure way to manage medical data files containing images, physicians' prescriptions and treatment plans [23].

The platform is designed to enable telemedicine in a Grid environment. The concept of telemedicine covers all the potential means to exchange medical data or images, whatever the distance between two physicians. As an example of a gLite implementation, the HOPE architecture has the following characteristics:

• Grid services provided by gLite middleware.

• Web portal developed with the Gridsphere [34] portlet container.

• Medical information is stored locally in the hospital where it is produced, using the AMGA metadata catalog [51].

• Information between services deployed in different locations is exchanged using the SOAP messaging protocol.

• Medical images are stored anonymized and encrypted on the Grid, while their corresponding metadata are stored in the local AMGA server.

6.2 NCRI ONIX

The UK National Cancer Research Institute (NCRI) ONcology Information eXchange (ONIX) [49] is an online portal that makes available clinical and experimental cancer research data and resources from disparate sources through a common user interface, allowing researchers to perform more efficient and integrative science.

Currently ONIX's main features are:

• Resource Catalog - a central register of resources including databases, analytical tools and services created in collaboration with the community;

• Quick Search - a tool for targeted searching and orderly retrieval and presentation of information from a selection of highly relevant cancer research resources;

• Terminology Browser - an electronic dictionary providing users with controlled terminology for data annotation and modeling, or to generate ontology-driven (`intelligent') queries of cancer research data.

Globus Toolkit is the underlying Grid technology for the ONIX Portal, which also uses components from the caBIG platform for ontology and semantics, e.g. caDSR, GME and LexEVS.


Chapter 7

Conclusions

The early focus of Grids was on computational aspects, speeding up healthcare-related applications, with a lower priority given to data confidentiality. The next generation of Grids looked at data federation and interoperability, which had an impact mostly on privacy [52]. Today Grids are characterized by the adoption of Web Services, metadata, interoperability and service automation.

The application of Grid technology to healthcare is crystallized in the concept of the HealthGrid [12]. A number of projects have been exploring this concept: ACGT and caBIG developed middleware and tools for semantic data sharing, others used the computational power of Grids and added quality of service and resource negotiation using SLAs, while others took the available middleware and worked on data fusion and on new ways of exposing it to the users.

From the various projects analyzed, a set of common infrastructure features emerges that should be addressed in the PARTNER Platform:

• Grid middleware: used as a backbone, e.g. Globus Toolkit 4, Gridge Toolkit (Sec. 3.3), gLite

• Data collection: sources need to be interfaced using standards and providing anonymization on-the-fly, e.g. OGSA-DAI (Sec. 3.3), MDM (Sec. 2.3)

• Semantics and ontologies: the core of data translation and linking (Sec. 4.3.1)

• User access: web portals and WSRF Web Services, e.g. Gridsphere [34]

• Strong security policies: Grid Security Infrastructure and extra mechanisms, e.g. GAS and DPF (Sec. 3.3), GAARDS (Sec. 4.3.2), or PBAC (Sec. 5.4)

• SLA Negotiation can be used both for resource sharing and security control, like PBAC (Sec. 5.4) or CoreGrid [46]

Looking at how these projects implemented the HealthGrid concept, we can derive:

• the set of requirements needed by the hadron therapy community,

• the challenges in meeting these constraints,

• the Grid added value and

• the ethical and legal constraints of the environment.

7.1 Requirements

The PARTNER software platform should address the following requirements:

• Consist of easy-to-use applications that hide the Grid complexity, as healthcare professionals are too busy to learn complex tools.

• Be adaptable to the medical field and flexible to the "modus operandi" of its users.

• Be based on open, standard and widely used software frameworks.

• Be compatible with existing hospital data and image archiving systems (PACS).

• Be fault tolerant enough for use in clinical routine.

• Enforce the legal and ethical policies through its security mechanisms.

7.2 Challenges

We want to enable users to use the knowledge of the cancer biomedical research community. Conceptually this means data sharing, i.e. making the knowledge base available to the widest possible audience. At a more practical level, it means allowing the community to mine the ever-increasing volume of empirical data.

Sharing medical knowledge has challenges that need to be solved:

• Centers are in heavy production use (tools must be carefully thought out and a strategic plan developed)

• Cross-border legal policy implementation

• Patient records need to be linked and translated into unique semantic representations

• Technical specifications and different formats for electronic health records (EHR Standards: HL-7, ENV13606, OpenEHR)

• Incompatibility between standard versions (DICOM, HL7)

• Compatibility with legacy or closed (non-standard) systems

• Politics, mentality and willingness to share data.


7.3 Grid Added Value

As stressed in a recent report from the European Commission and the World Health Organization, "Information and communication technologies are changing health care delivery and are at the core of effective, responsive health systems. These technologies are key to connecting people, information and research to improve health in countries." [50]

The main added values of HealthGrids are:

• capability to query distant databases in a secure way

• computationally intensive algorithms can be executed much more quickly

• high-bandwidth reliable networks support large data transfers

• outsourcing IT services to a third party reduces costs

• collaborative work can be done across hospital, regional and national borders

7.4 Ethics and legislation

Healthcare has security requirements similar to those of other Grid applications, but the consequences of `failure' can have more impact due to the nature of the data, as a privacy violation is irreversible. In dealing with sensitive personal information, confidentiality is key. There are two sides to data privacy:

• ethics: the nature of the data has consequences for how it is treated

• legislation: failure to secure the privacy of data can have devastating consequences due to public opinion or lawsuits

PARTNER will deal with prospective, partly randomized studies to evaluate the effectiveness and toxicity of ion therapy. Therefore clinical trials will be carried out according to the current European and national legislation, e.g.:

1. Clinical investigation of medical devices for human subjects, EN ISO 14155-2,

2. Declaration of Helsinki in the latest version

3. EC Directive 2001/20/EC as it is implemented by the national legislation.

The data required by PARTNER will comprise clinical, imaging and molecular data and, for research, must be "identifiable", which makes this a sensitive problem. As an example of how this legislative problem can be solved, ACGT constructed a Data Protection Framework (DPF) on top of the normal Grid security infrastructure in order to comply easily with European data protection legislation. The DPF relies on anonymization of data, contracts between the involved parties, consent forms and an audit trail of every step the data takes (see Sec. 3.3).


In the PARTNER Project the sensitive aspects will be safeguarded through reviews by an internal Advisory Board for Ethical and Legal Issues (integrated into the Supervisory Board), the ethical committees at the various institutes and the national competent authorities.

The requirements for patient data security at the national and European levels are part of this project (WP22 and WP23) and will be analyzed in a future report.


Chapter 8

Recommendations

The previous chapters look at existing projects in order to define the PARTNER Platform, a prototype Grid hadron therapy testbed.

At present there is no infrastructure to support consistent storage and sharing of clinical data and analysis results among the emerging hadron therapy centers. Instead, data and results usually lie around on DVDs or on individuals' computers. There are additional reasons for having a collaborative data-sharing platform:

• The volume of cancer research data being generated is already significant, and the data is sometimes not fully recorded.

• Data needs to be integrated and translated into knowledge that can be easily accessed, analyzed and exploited.

• There is a unique opportunity to treat patients using novel techniques that can in turn lead to new discoveries.

Thus in order to achieve optimum results within the PARTNER project we will need:

• active interaction with hadron therapy centers for platform integration

• collaboration within the PARTNER network, especially with CERN, IFIC, Oxford and Surrey experts

• visits to reference hospitals for hands-on experience of systems and workflows

• comparison of different approaches and standards for clinical data management

• setting up a legal and security environment for medical data sharing

8.1 Platform Proposal Draft

Based on the PARTNER brainstorming meetings and the discussions with the other ESRs involved in the creation of the PARTNER Grid Platform, a proposed infrastructure emerged, depicted in Figure 8.1.

The PARTNER Platform will have the following components:


• Infrastructure gathers data from different sources, relying on a set of Grid middleware services through a data integration layer.

• Data Sources represent the databases which will be accessible in the infrastructure, e.g. particle therapy centers, hospitals and other repositories.

• Users and Use Cases will access the data integrated in the infrastructure through a common access portal.

These three parts communicate via

• Standard Interfaces and are embedded in a

• Security Framework which regulates data access and user rights.

In the following paragraphs, each part of the PARTNER software platform will be described in more detail.

8.2 Infrastructure

The infrastructure is the backbone of the whole platform and should provide a connection between user services and the data sources.

Although the data comes from disparate sources (mainly hospitals and particle therapy centers), the infrastructure should offer a common interface to all of it. In order to achieve this, the respective (clinical) databases have to be "integrated" using semantic mediation and ontologies.

These technologies will bridge the semantic gaps which exist across the different data sources. The infrastructure will therefore consist of two parts, a Grid infrastructure and a layer for data integration, explained in the following two subsections:

8.2.1 Grid middleware

The PARTNER platform will be connected using the gLite EGEE middleware. This requires a set of core services to be set up, such as VOMS, MyProxy, LFC, BDII, DPM, AMGA and Hydra.

On top of the gLite Grid middleware, higher-level services have to be deployed, such as the Medical Data Manager for sharing anonymized image data. Direct user access to the data is achieved using web portals, e.g. Gridsphere.

8.2.2 Data Integration

The PARTNER platform will integrate several databases and will provide a single and uniform query interface to the shared data. This means that the data integration unit has to translate user requests (stated in a standardized "platform-language") into requests appropriate for the relevant databases.

As an example of data integration, a user might send a query to the infrastructure; the query is "translated" by the data integration unit and forwarded to the databases. The results of this query are then returned to the user, respecting the confidentiality of the data and the user's access rights.

Figure 8.1: PARTNER Proposed Grid Platform

This part of the infrastructure comprises work on the standard vocabularies, semantics and ontology of the platform, and on higher-level services that allow the available data to be mapped into a unified structure.
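
The translation step can be illustrated with a small sketch. Everything in it is assumed for the purpose of the example: the field names, the two centres and the dictionary-based mappings are hypothetical, and a real data integration layer would rely on ontologies rather than on a flat renaming table.

```python
# Hypothetical mapping from platform-level terms to each source's local field names.
MAPPINGS = {
    "centre_a": {"tumour_site": "site_code", "dose_gy": "total_dose"},
    "centre_b": {"tumour_site": "localisation", "dose_gy": "dose"},
}


def translate_query(platform_query, mapping):
    """Rewrite a platform-level query into one source's local vocabulary."""
    return {mapping[field]: value for field, value in platform_query.items()}


def translate_result(row, mapping):
    """Map a local row back to platform terms; unmapped fields are not exposed."""
    reverse = {local: common for common, local in mapping.items()}
    return {reverse[k]: v for k, v in row.items() if k in reverse}


def run_everywhere(platform_query, databases):
    """Translate the query for each source, run it, and unify the answers."""
    results = []
    for name, rows in databases.items():
        mapping = MAPPINGS[name]
        local_query = translate_query(platform_query, mapping)
        for row in rows:
            if all(row.get(k) == v for k, v in local_query.items()):
                results.append(translate_result(row, mapping))
    return results


# Toy data standing in for two clinical databases.
databases = {
    "centre_a": [
        {"site_code": "skull_base", "total_dose": 60, "patient_name": "hidden"},
    ],
    "centre_b": [
        {"localisation": "skull_base", "dose": 66},
        {"localisation": "prostate", "dose": 70},
    ],
}

rows = run_everywhere({"tumour_site": "skull_base"}, databases)
print(rows)
```

The user only ever sees the platform vocabulary (`tumour_site`, `dose_gy`), while each source keeps its own schema.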

8.3 Sources of Data

Data sources will be primarily the emerging particle therapy centers in Europe, providing clinical data on radiation/particle therapy. Other clinics and referring hospitals might also become providers of patient data.

In providing access to medical data it is important that the privacy of patients is guaranteed: the patient data has to be pseudo-anonymized before leaving the hospital. Thus, the patient's identity is not visible by default but can be recovered if necessary.
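
One common way to obtain identifiers that are hidden by default yet recoverable is a keyed hash plus a lookup table that never leaves the hospital. The sketch below illustrates that idea; it is an assumption about how such a step could look, not a description of the PARTNER implementation, and the class, field names and key are hypothetical.

```python
import hashlib
import hmac


def pseudonym(patient_id: str, secret_key: bytes) -> str:
    """Derive a stable pseudonym from a patient ID with a keyed hash (HMAC-SHA256)."""
    return hmac.new(secret_key, patient_id.encode("utf-8"), hashlib.sha256).hexdigest()


class HospitalPseudonymizer:
    """Replaces direct identifiers before export and keeps the way back in-house."""

    def __init__(self, secret_key: bytes):
        self._key = secret_key
        self._reverse = {}  # pseudonym -> patient ID, kept inside the hospital

    def export_record(self, record: dict) -> dict:
        alias = pseudonym(record["patient_id"], self._key)
        self._reverse[alias] = record["patient_id"]
        # Drop the direct identifiers assumed for this example and attach the pseudonym.
        exported = {k: v for k, v in record.items() if k not in ("patient_id", "name")}
        exported["pseudonym"] = alias
        return exported

    def reidentify(self, alias: str) -> str:
        # Only the hospital, which holds the table, can recover the identity.
        return self._reverse[alias]


# Demo key for illustration only; a real key would be generated and stored securely.
hospital = HospitalPseudonymizer(b"demo-key-not-for-production")
shared = hospital.export_record(
    {"patient_id": "P-0001", "name": "Jane Doe", "diagnosis": "chordoma"}
)
print(sorted(shared))
print(hospital.reidentify(shared["pseudonym"]))
```

Removing direct identifiers is only part of the problem: free-text fields, dates and rare diagnoses can still identify a patient, so a production scheme would need a broader review.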

Since the hospital's data will typically be stored within the boundary of a firewall, approaches for accessing this data and standard interfaces to DICOM and HL7 compatible systems have to be discussed.

Additional information from laboratories, e.g. on tissue samples, could be included; ways of retrieving this information have to be explored.

8.4 Standard interfaces

All the layers of the PARTNER Platform have to communicate using standard interfaces. The infrastructure will be based on the OGSA architecture with a set of WSRF extensions. For layer interoperation the SOAP protocol, exposed through XML interfaces, seems most suitable. To access the data sources the Platform will need to be compliant with the DICOM and HL7 protocols.

8.5 Security

Security is one of the key components of the platform and extends into four areas: confidentiality, integrity, availability and accountability.

Security can be implemented in several modes:

• PKI user certificates, the basic Grid method for authentication and authorization (A&A)

• single sign-on, a new login for users of a common domain, e.g. an oncology department

• Policy Based Access coupled with SLA resource negotiation, adding an extra security layer

• A final layer, like the Data Protection Framework, to supervise data collection and usage


8.6 Use cases

Users can access the data using a "data portal" which provides access to the data services offered by the infrastructure. This data portal will be the entry point to the platform, and the security layer will enforce the authentication and authorization of the user.

The data portal has to understand the semantics of the query (a "platform-language", as outlined in Sec. 8.2.2) and the range and structure of the data that has to be accessed. The request will then be communicated to the sources via the infrastructure in a secure way.

A first version of this data portal will allow access to the rare tumor database and could probably be extended to function as a support system for patient referral.

Looking at existing projects, some other use cases might be relevant for the hadron therapy community and suitable for Grid technology:

• Verification of treatment plans (BEinEIMRT), coupled with image reconstruction, analysis and anatomic model reconstruction.

• Clinical trials generation and tracking (ACGT and caBIG)

• Expert Systems and Clinical Decision Support Systems (ACGT and caBIG)

• Data Tagging, image annotation and comments from clinicians on shared data (HOPE)

• Collecting machine data for quality assurance and safety by automatically identifying unexpected/unwanted events or correlations between them.

• Treatment plan comparisons between Japan/Europe

Services for these or other use cases could be implemented at a later stage by embedding the respective access portal in the security framework of the platform.


Bibliography

[1] E. Niederlaender, Causes of death in Europe, Eurostat KS-08-02-001-EN-C, 2006

[2] R. R. Wilson, Radiological use of fast protons, Radiology 47 487-91, 1946

[3] Particle Therapy Co-Operative Group, Website: http://ptcog.web.psi.ch/ptcentres.html

[4] V. Brower, European boost for particle therapy, Nature 457 139, 2009, doi:10.1038/457139a

[5] J. Debus, K.D. Gross, M. Pavlovic, Proposal for a Dedicated Ion Beam Facility for Cancer Therapy, Darmstadt: GSI, 1998

[6] M. Krengli, R. Orecchia, Medical aspects of the National Centre for Oncological Hadrontherapy (CNAO), Radiother. Oncol. 73 S21-3, 2004

[7] M. Bajard, J. Rochat, ETOILE Project: European Light Ion Oncological Treatment Centre vol I and II, LYCEN, Lyon: Université Claude Bernard, 2002

[8] T. Auberger, E. Griesmayer, Das Project Med-AUSTRON-Designstudie, Wiener Neustadt: Fotec, 2004

[9] U. Amaldi, G. Kraft, Particle accelerators take up the fight against cancer, CERN Courier Volume 46 No 10, 17-20, 2006

[10] PARTNER Project, Particle Training Network for European Radiotherapy, FP7 Grant agreement no.: 215840-2, Website: http://cern.ch/partner

[11] I. Foster, C. Kesselman and S. Tuecke, The Anatomy of the Grid: Enabling Scalable Virtual Organizations, International J. Supercomputer Applications, 15(3), 2001

[12] I. Andoulsi et al., SHARE Integrated Road Map II, 2008, Website: http://roadmap.healthGrid.org

[13] PARTNER Work Package 22, Website: https://espace.cern.ch/partnersite/workspace/faust/default.aspx

[14] I. Foster, C. Kesselman, Ch. 2 of The Grid: Blueprint for a New Computing Infrastructure, Computational Grids, Morgan-Kaufman, 1999

[15] gLite Grid middleware, Website: http://www.glite.org/

[16] European Grid for E-sciencE Project, INFSO-RI-031688, Website: http://www.eu-egee.org/


[17] I. Foster et al., The Open Grid Services Architecture, Version 1.5. OGS Draft 1.5-011

[18] Web Services Architecture, W3C Working Group Note 11, 2004, Website: http://www.w3.org/TR/2004/NOTE-ws-arch-20040211/

[19] OASIS Web Services Resource Framework (WSRF), Website: http://www.oasis-open.org/committees/tc_home.php?wg_abbrev=wsrf

[20] E. Laure et al., Programming the Grid with gLite, EGEE-TR-2006-001, CERN, 2006

[21] EGEE Deployment Guide, MSA 3.5.1, EGEE-III INFSO-RI-222667

[22] EGEE-II-MJRA 1.2, Functional description of Grid components, Website: https://edms.cern.ch/document/736259/2

[23] HOPE and ACGT, Website: http://eu-acgt.org/news/newsletters/summer-2009/single-article/archive/2009/july/article/hope-hospital-platform-for-e-health.html

[24] J. Montagnat et al., A Secure Grid Medical Data Manager Interfaced to the gLite Middleware, Journal of Grid Computing (Kluwer) 6 (1) 45-59, 2008, doi:10.1007/s10723-007-9088-2

[25] DICOM: Digital imaging and communication in medicine, Website: http://medical.nema.org/

[26] Advancing Clinico-Genomic Clinical Trials on Cancer (ACGT) Project, Website: http://www.eu-acgt.org/

[27] ACGT Consortium, Website: http://eu-acgt.org/consortium.html

[28] J. Pukacki, Grid Technologies for Cancer Research in the ACGT Project, in OGF25 - Grid technologies in e-Health, Catania, 2-6 March 2009, Website: www.ogf.org/OGF25/materials/1547/acgt_ogf.pdf

[29] I. Foster, Globus Toolkit Version 4: Software for Service-Oriented Systems, in IFIP International Conference on Network and Parallel Computing, Springer-Verlag LNCS 3779, pp 2-13, 2005

[30] GT4 GRAM, Website: http://www-unix.globus.org/toolkit/docs/4.2/4.2.1/execution/gram2/#gram2

[31] Gridge Toolkit, Website: http://www.Gridge.org/

[32] GRMS: GridLab Resource Management System, Website: http://www.Gridlab.org/WorkPackages/wp-9/

[33] OGSA-DAI, Website: http://www.ogsadai.org.uk/

[34] Gridsphere Project, Website: http://www.Gridsphere.org/Gridsphere/Gridsphere

[35] Center for Data Protection, Website: https://cdp.custodix.com/

[36] Cancer Biomedical Informatics Grid, Website: https://cabig.nci.nih.gov/


[37] caGrid, Website: http://caGrid.org/display/caGridhome/Home

[38] Mobius, Website: http://projectmobius.osu.edu/

[39] caCORE infrastructure, Website: http://ncicb.nci.nih.gov/infrastructure/cacoresdk

[40] S. Hastings et al., Introduce: An Open Source Toolkit for Rapid Development of Strongly Typed Grid Services, Journal of Grid Computing, vol. 5, pp. 407-427, 2007

[41] A. Gómez et al., Monte Carlo Verification of IMRT treatment plans on Grid, in From Genes to Personalized HealthCare: Grid Solutions for the Life Sciences, Proceedings of HealthGrid 2007, Nicolas Jacq and others, eds., IOS Press, 2007

[42] BEinGrid project, Website: http://www.beinGrid.eu/be25.html

[43] M.G. Bugeiro et al., Integration of SLAs with GridWay in BEinEIMRT project, 3rd Iberian Grid Infrastructure Conference (IBERGrid 2009), Valencia (Spain), May 2009, ISBN 978849745406-3

[44] E. Huedo, R.S. Montero, I.M. Llorente, A Framework for Adaptive Execution on Grids, Software - Practice and Experience, 34 (7), pp. 631-651, Ed. John Wiley & Sons, 2004

[45] A. Andrieux et al., Web Services Agreement Specification (WS-Agreement), Specification from the Open Grid Forum (OGF), 2007

[46] M. Parkin, R. M. Badia, J. Martrat, A Comparison of SLA Use in Six of the European Commission's FP6 Projects, CoreGrid Technical Report No. TR-0129, Website: http://www.coreGrid.net/mambo/images/stories/TechnicalReports/tr-0129.pdf

[47] RadiotherapyGrid Case Study, Website: http://www.it-tude.com/rg-case-study.html

[48] Policy Based Access Control, Website: http://www.Gridipedia.eu/1091.html

[49] NCRI ONIX, Website: http://www.ncri-onix.org.uk

[50] J. Dzenowagis, G. Kernen, WSIS report "Connecting for Health: global vision, local insight", ISBN 92 4 159390

[51] N. Santos, B. Koblitz, Distributed Metadata with the AMGA Metadata Catalog, in Workshop on Next-Generation Distributed Data Management HPDC-15, Paris, France, June 2006

[52] D. de Roure, M. Baker, N. R. Jennings and N. Shadbolt, The evolution of the Grid, in Grid Computing: Making the Global Infrastructure a Reality, Wiley, 65-100, 2003
