gift-cloud: a data sharing and collaboration platform for ... · gift-cloud: a data sharing and...

10
GIFT-Cloud: A data sharing and collaboration platform for medical imaging research Tom Doel a,1, *, Dzhoshkun I. Shakir a,1 , Rosalind Pratt a,b , Michael Aertsen c , James Moggridge d , Erwin Bellon c,e , Anna L. David b , Jan Deprest b,f , Tom Vercauteren a , Sébastien Ourselin a a Translational Imaging Group, Centre for Medical Imaging Computing, University College London, London, UK b Institute for Women’s Health, University College London, London, UK c Department of Imaging & Pathology, UZ Leuven, Leuven, Belgium d Department of Medical Physics, UCL Hospitals, London, UK e Department of Information Technology, UZ Leuven, Leuven, Belgium f Department of Obstetrics, UZ Leuven, Leuven, Belgium ARTICLE INFO Article history: Received 23 June 2016 Received in revised form 3 October 2016 Accepted 3 November 2016 ABSTRACT Objectives: Clinical imaging data are essential for developing research software for computer- aided diagnosis, treatment planning and image-guided surgery, yet existing systems are poorly suited for data sharing between healthcare and academia: research systems rarely provide an integrated approach for data exchange with clinicians; hospital systems are focused towards clinical patient care with limited access for external researchers; and safe haven environ- ments are not well suited to algorithm development.We have established GIFT-Cloud, a data and medical image sharing platform, to meet the needs of GIFT-Surg, an international re- search collaboration that is developing novel imaging methods for fetal surgery. GIFT- Cloud also has general applicability to other areas of imaging research. Methods: GIFT-Cloud builds upon well-established cross-platform technologies.The Server provides secure anonymised data storage, direct web-based data access and a REST API for integrating external software.The Uploader provides automated on-site anonymisation, en- cryption and data upload. Gateways provide a seamless process for uploading medical data from clinical systems to the research server. Results: GIFT-Cloud has been implemented in a multi-centre study for fetal medicine re- search.We present a case study of placental segmentation for pre-operative surgical planning, showing how GIFT-Cloud underpins the research and integrates with the clinical workflow. Conclusions: GIFT-Cloud simplifies the transfer of imaging data from clinical to research in- stitutions, facilitating the development and validation of medical research software and the sharing of results back to the clinical partners. GIFT-Cloud supports collaboration between multiple healthcare and research institutions while satisfying the demands of patient con- fidentiality, data security and data ownership. © 2016 The Authors. Published by Elsevier Ireland Ltd. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/). Keywords: Data sharing Biomedical research Cross-disciplinary research Anonymisation Deidentification Fetal surgery * Corresponding author.Translational Imaging Group, Centre for Medical Image Computing, University College London, Gower Street, London WC1E 6BT, UK. E-mail address: [email protected] (T. Doel). 1 Equal contribution by these authors. http://dx.doi.org/10.1016/j.cmpb.2016.11.004 0169-2607/© 2016 The Authors. Published by Elsevier Ireland Ltd. This is an open access article under the CC BY license (http:// creativecommons.org/licenses/by/4.0/). computer methods and programs in biomedicine 139 (2017) 181–190 journal homepage: www.intl.elsevierhealth.com/journals/cmpb

Upload: others

Post on 14-Aug-2020

0 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: GIFT-Cloud: A data sharing and collaboration platform for ... · GIFT-Cloud: A data sharing and collaboration platform for medical imaging research Tom Doel a,1*, Dzhoshkun I. Shakir

GIFT-Cloud: A data sharing and collaborationplatform for medical imaging research

Tom Doel a,1,*, Dzhoshkun I. Shakir a,1, Rosalind Pratt a,b,Michael Aertsen c, James Moggridge d, Erwin Bellon c,e, Anna L. David b,Jan Deprest b,f, Tom Vercauteren a, Sébastien Ourselin a

a Translational Imaging Group, Centre for Medical Imaging Computing, University College London, London, UKb Institute for Women’s Health, University College London, London, UKc Department of Imaging & Pathology, UZ Leuven, Leuven, Belgiumd Department of Medical Physics, UCL Hospitals, London, UKe Department of Information Technology, UZ Leuven, Leuven, Belgiumf Department of Obstetrics, UZ Leuven, Leuven, Belgium

A R T I C L E I N F O

Article history:

Received 23 June 2016

Received in revised form

3 October 2016

Accepted 3 November 2016

A B S T R A C T

Objectives: Clinical imaging data are essential for developing research software for computer-

aided diagnosis, treatment planning and image-guided surgery, yet existing systems are poorly

suited for data sharing between healthcare and academia: research systems rarely provide

an integrated approach for data exchange with clinicians; hospital systems are focused towards

clinical patient care with limited access for external researchers; and safe haven environ-

ments are not well suited to algorithm development. We have established GIFT-Cloud, a data

and medical image sharing platform, to meet the needs of GIFT-Surg, an international re-

search collaboration that is developing novel imaging methods for fetal surgery. GIFT-

Cloud also has general applicability to other areas of imaging research.

Methods: GIFT-Cloud builds upon well-established cross-platform technologies. The Server

provides secure anonymised data storage, direct web-based data access and a REST API for

integrating external software.The Uploader provides automated on-site anonymisation, en-

cryption and data upload. Gateways provide a seamless process for uploading medical data

from clinical systems to the research server.

Results: GIFT-Cloud has been implemented in a multi-centre study for fetal medicine re-

search.We present a case study of placental segmentation for pre-operative surgical planning,

showing how GIFT-Cloud underpins the research and integrates with the clinical workflow.

Conclusions: GIFT-Cloud simplifies the transfer of imaging data from clinical to research in-

stitutions, facilitating the development and validation of medical research software and the

sharing of results back to the clinical partners. GIFT-Cloud supports collaboration between

multiple healthcare and research institutions while satisfying the demands of patient con-

fidentiality, data security and data ownership.

© 2016 The Authors. Published by Elsevier Ireland Ltd. This is an open access article

under the CC BY license (http://creativecommons.org/licenses/by/4.0/).

Keywords:

Data sharing

Biomedical research

Cross-disciplinary research

Anonymisation

Deidentification

Fetal surgery

* Corresponding author. Translational Imaging Group, Centre for Medical Image Computing, University College London, Gower Street,London WC1E 6BT, UK.

E-mail address: [email protected] (T. Doel).1 Equal contribution by these authors.

http://dx.doi.org/10.1016/j.cmpb.2016.11.0040169-2607/© 2016 The Authors. Published by Elsevier Ireland Ltd. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).

c om pu t e r m e thod s and p r og r am s i n b i om ed i c i n e 1 3 9 ( 2 0 1 7 ) 1 8 1 – 1 9 0

journal homepage: www.int l .e lsevierheal th .com/ journals /cmpb

Page 2: GIFT-Cloud: A data sharing and collaboration platform for ... · GIFT-Cloud: A data sharing and collaboration platform for medical imaging research Tom Doel a,1*, Dzhoshkun I. Shakir

1. Introduction

GIFT-Cloud is a secure data platform that simplifies and au-tomates the process of accessing and sharing data for medicalimaging research. GIFT-Cloud has been developed to supportGIFT-Surg [1], a multi-institution collaboration between aca-demic and clinical researchers developing novel ways to imagethe fetus for fetal surgery.The GIFT-Cloud platform has generalapplicability and could be adapted to benefit medical imagingresearch projects in other fields.

Medical imaging is widely used in clinical practice, with ap-plications ranging from screening and diagnosis to treatmentplanning and image-guided surgery. The continuing develop-ment of novel imaging modalities, improved protocols andcomputer-assisted analysis software has huge potential toimprove disease assessment and patient outcome. These ad-vances require ever closer collaboration between academia,industry and healthcare providers.

One major issue in such collaborations is managing theimages and associated metadata needed for developing, testingand validating novel algorithms for medical image analysis.A wide range of clinical data are required that offer a repre-sentative sample of the disease condition being studied. Asresearch increasingly specialises on a narrower range of ap-plications and diseases, it becomes harder to obtain sufficientclinical data from any single institution. Firstly, for rare diseaseconditions individual hospitals may see only a limited numberof cases per year. Secondly, due to the rapid changes in imagingtechnology, data acquired using older hardware or protocolsmay be of less research value. Therefore, specialised researchincreasingly benefits from collaboration with multiple clini-cal institutions. The legal and technical requirements of datasharing are major considerations within any such projects.

Patient confidentiality and data protection are enshrinedin national, international and institutional regulations [2,3] lim-iting how and where patient-derived data are acquired andstored. Data use may require patient consent and permissionfrom ethics boards, and institutions may wish to protect in-stitutional copyright and intellectual property [4]. A significanttechnical challenge is the removal of personal identifiable in-formation (anonymisation) before data leave the secure clinicalenvironment. Naive approaches to anonymisation simplyremove personal data, but this makes it impossible to later re-combine data from the same subject, precluding multi-modal image analysis or follow-up studies. Further technicalchallenges include the sharing of data between protected securenetworks, obtaining unified access to data originating fromseparate storage systems, data backup, fault tolerance in thecase of interrupted data transfer and reconciling the need forlarge volumes of data with the bandwidth limitations of datatransfer between institutions.

Collaborative research projects often do not consider theseissues in the early project stages and this frequently resultsin ad-hoc and suboptimal solutions for sharing research data.Manually-driven processes are potentially prone to humanerror, and sharing data can quickly become a burden to clini-cians, discouraging data provision beyond the minimumnecessary. Conversely, the results of image processing re-search often do not easily flow back to the clinicians, limiting

their ability to evaluate this research and therefore its clini-cal potential.

The goal of GIFT-Cloud is to provide a flexible, clinician- andresearcher-friendly system for anonymising and sharing dataacross multiple institutions. GIFT-Cloud aims to satisfy thedemands of patient confidentiality, data security and legal agree-ments for data sharing and ownership. GIFT-Cloud brings anumber of benefits to the GIFT-Surg project.These include en-couraging more data sharing, standardising and simplifying thedata transfer and anonymisation processes, reducing the timeburden on clinicians and researchers, providing centralisedstorage and backup of research data and simplifying the sharingof researcher-derived results to clinicians for validation.

GIFT-Cloud achieves these aims by building upon exist-ing, well-established cross-platform technologies. GIFT-Cloud software is available for research use under a BSDSimplified (BSD-3) licence [5].

In this article we describe the GIFT-Cloud system and il-lustrate its use in the clinician–researcher workflow.The articleis structured as follows. Section 2 reviews published systemsin the field, and discusses the additional requirements that havemotivated the development of GIFT-Cloud. Section 3 is an over-view of the system features and technology. Section 4 presentsa case study of how data are acquired and exchanged usingGIFT-Cloud for an example application in the GIFT-Surg project.Section 5 presents a qualitative evaluation of the system in-cluding descriptions of features, scalability, performance andtesting. Finally, Section 6 presents discussions and conclusions.

2. State of the art

Healthcare institutions typically store their imaging data in aPicture Archiving and Communication System (PACS) [6], whichrelies on the DICOM standard [7] for communicating and man-aging the data. While data exchange with a PACS is highlystandardised, the same is not true for the overall ElectronicMedical Record (EMR). Both PACS and EMR may support remoteaccess but that is solely to support clinical routine and not re-search use of the data. This is because healthcare providershave obligations towards patient confidentiality and data se-curity and are naturally reluctant to give external users accessto their systems.

One potential approach which permits data sharing whileaddressing confidentiality issues is to create a “data safe haven”.This is a secure environment linked to a clinical institution inwhich research software can be run on patient-identifiable data,without giving researchers direct access to the data.To protectpatient confidentiality, only aggregated data are exported, withno output for individual patients. Data safe havens are suitedto using mature, stable software in clinical research. However,they are too restrictive for novel algorithm development, whereindividual patient results are necessary for analysing the per-formance and iteratively testing improvements to the algorithm.

In practise, therefore, developing novel image processing al-gorithms requires the research institution to host data on itsown systems. This requires transfer of data from the clinicalinstitutions to the research institution.Typically, anonymisationmust be performed on the clinical site to prevent identifiable

182 c om pu t e r m e thod s and p r og r am s i n b i om ed i c i n e 1 3 9 ( 2 0 1 7 ) 1 8 1 – 1 9 0

Page 3: GIFT-Cloud: A data sharing and collaboration platform for ... · GIFT-Cloud: A data sharing and collaboration platform for medical imaging research Tom Doel a,1*, Dzhoshkun I. Shakir

patient data from leaving the hospital. A common result is aprocess where clinicians manually anonymise data for storageon removable media or a shared system, after which research-ers must copy the data onto a shared resource within theacademic institution. Anonymisation is manually initiated bythe clinician either using functionality on the data generat-ing equipment, or from the PACS, or using standalone software.However, this is prone to patient data leakage if the softwareis not used with consistent settings, something that is diffi-cult to guarantee in a manual procedure [8]. Manual exportingprocesses are also error-prone, for example if individual imagesare mistakenly exported instead of series. Anonymisationsystems do not normally support anonymisation of pixel data.DicomCleaner [9] is one of the few programs supporting pixeldata anonymisation, although only in an interactive mode.

Research institutions are increasingly using server-based datamanagement systems to facilitate data sharing. Established re-search systems include XNAT [10], CTP [11], HID [12], COINS[13], LORIS [14], BIRN [15] and TCIA [16]. While mostly fo-cussed on specific research areas and particular workflows,these systems often have more general extensibility. Forexample, XNAT was developed for neuroimaging research butis now used in other areas [15,17]. These systems can consid-erably improve collaboration, but they do not always solve norsimplify the problem of obtaining and anonymising data fromclinical sources.This is particularly true where the clinical andresearch data are hosted on different networks, requiring anintermediate connection such as the Internet or a dedicateddata connection. While some research systems permit securedata upload from the hospital, they cannot in general be con-nected directly to the PACS or hardware devices due to risk ofsending non-anonymised data outside the clinical environ-ment and the security implications of connecting medicaldevices directly to the Internet. To address these concerns, anintuitive process is needed for clinicians to transfer data fromthe clinical to the research system while guaranteeinganonymisation and maintaining the data security of the hos-pital system.

These considerations motivate a “gateway” approach inwhich an intermediate server is installed within the clinicalenvironment. That server typically acts as a DICOM destina-tion allowing data to be pushed from the PACS or scanners.The received data are anonymised by the gateway and sent tothe server in the research institution. The technical detailsof the file transfer are hidden from the end user, but will typi-cally involve a direct data connection or Internet transfer usingsecure communication protocols. Internet-based transfer ismore practical due to the time and cost involved in installinga direct data link. However, to meet identity protection re-quirements the gateway must be installed within the clinicalsite, be able to receive data from the PACS and provide localdata anonymisation before encrypted upload to the remoteserver.

Integrating a gateway service solely with the hospital PACSmay be insufficient for obtaining all the required data, sincethe PACS system may not store all the data desired by re-searchers. This is because some data may not be in a formatsuitable for storage in the PACS, or because large datasets suchas endoscopic video or 3D ultrasound volumes may not be ar-chived if they are not considered to have future diagnostic

benefit, even if they have substantial value for researchers de-veloping and validating novel methods. A general data-sharing system needs to be flexible enough to support obtainingof data from multiple different systems and devices.

None of the existing research systems support all the fea-tures required by the GIFT-Surg project, such as grouping ofdatasets using pseudonymised patient identifiers, maintain-ing a hospital-side patient list of anonymised identifiers,providing a mechanism for client-side anonymisation of pixeland metadata and upload of data for multiple subjects simul-taneously. Rather than develop a new system from scratch, GIFT-Cloud builds upon the well-established XNAT system to providethese new capabilities, extending the server framework andadding custom gateway and uploading software to provide aflexible system that meet the needs of the clinicians andresearchers.

3. Description of GIFT-Cloud

3.1. System overview

A simplified system diagram of GIFT-Cloud is shown in Fig. 1.GIFT-Cloud consists of a central GIFT-Cloud Server and op-tional GIFT-Cloud Gateway servers within each clinicalinstitution. GIFT-Cloud is built using cross-platform technolo-gies, offering flexibility with regard to the required hardwareand operating systems. GIFT-Cloud provides data upload withfully automated, configurable, client-side anonymisation ofpatient information contained in both metadata and pixel data.The system provides automated subject matching and patientlist mapping while preserving subject anonymity in the re-search data. The system requirements are summarised inTable 1.

3.2. GIFT-Cloud Server

Anonymised research data are hosted on a dedicated GIFT-Cloud Server running a customised version of XNAT 1.6 [10].Users can access and download data within the collaborat-ing institutions via a secure website extended from the XNATweb interface. A Representational State Transfer ApplicationProgramming Interface (REST API) provides a set of protocolsand services that can be accessed programmatically, whichallows seamless integration of the research database directlyinto the software applications being developed by research-ers. Our GIFT-Cloud Server is a dedicated virtual machinerunning CentOS Linux 7.2 with the database managed usingPostgreSQL 9.2 and XNAT 1.6 running under Apache Tomcat7.0, Oracle Java 1.7 and the Apache 2.4 daemon.

3.3. GIFT-Cloud Uploader

Using the GIFT-Cloud Uploader software, users can interac-tively import data from the local file system or by querying alocal PACS. GIFT-Cloud Uploader is a cross-platform Java WebStart application, custom-developed for uploading data di-rectly by users or for running as a service on GIFT-CloudGateway servers. The software can be installed directly from

183c om pu t e r m e thod s and p r og r am s i n b i om ed i c i n e 1 3 9 ( 2 0 1 7 ) 1 8 1 – 1 9 0

Page 4: GIFT-Cloud: A data sharing and collaboration platform for ... · GIFT-Cloud: A data sharing and collaboration platform for medical imaging research Tom Doel a,1*, Dzhoshkun I. Shakir

the GIFT-Cloud website and can be updated without the needfor incoming connections into the hospital’s network. TheUploader makes use of open-source software including XNAT[10] (BSD Simplified), DicomCleaner [9] (BSD Simplified with ad-ditional clause) and dcm4che [18] (Mozilla Public License 1.1).

3.4. Integration with hospital systems

Optional Gateway servers allow each institution to provideimage uploading mechanisms suited to local policies, such assending data from the PACS or the electronic medical record.Each clinical institution can provide a Gateway system by in-

stalling the cross-platform GIFT-Cloud Uploader software onan appropriate on-site server. This server is provided by theinstitution in a way that conforms to their local security andmaintenance requirements. Users can DICOM Push data to theGateway from PACS, scanners and other DICOM-compatibledevices; the data will automatically be anonymised by theGateway before uploading. Non-DICOM data can be uploadedby copying to a shared folder on the Gateway server, an es-tablished mechanism that is simple to integrate with thehospital IT system while remaining secure within the hospi-tal environment. Data upload is fault-tolerant; if uploading fails,for example due to a server outage, the file is retained for up-loading at a later time. This shields the clinician and sendinginstitution from these concerns regarding the current statusof the research server.

3.5. Anonymisation

GIFT-Cloud automatically anonymises patient information inDICOM metadata and burnt-in annotations in pixel data. Fornon-DICOM data, custom anonymisation procedures are de-veloped as required. Anonymisation is performed by the GIFT-Cloud Uploader software on the uploading computer, whichmay be an individual computer where the user is performingdirect file upload, or the Gateway server. This localanonymisation ensures that patient information does not leavethe clinical institution.

The DICOM metadata anonymisation procedure is shownin Fig. 2. Fields required for correctly grouping data, such asthe patient identifier, series instance UID and study instanceUID, are pseudonymised using a one-way SHA-1 hash algo-rithm. This produces a consistent identifier for each subjectand series, ensuring data can always be grouped correctly, even

Fig. 1 – Simplified diagram illustrating collaboration using GIFT-Cloud. Data can be provided from hospitals as part of theroutine workflow for patient care. Researchers can upload analysis of the data, which are then available to othercollaborators. Confidential patient information never leaves the source hospital.

Table 1 – System requirements. This table describes thecore hardware and software requirements for thecomponents of the GIFT-Cloud system, and the currentversions used in the GIFT-Cloud system installed atUCL.

Minimumrequirement

UCLsystem

GIFT-Cloud ServerOperating system Linux/Windows/

macOSCentOS Linux 7.2

PostgreSQL 9.1 9.2Oracle Java SDK 1.7 1.7Apache Tomcat 7.0 7.0XNAT 1.6 1.6

GIFT-Cloud GatewayOperating system Linux/Windows/

macOSMicrosoft WindowsServer 2008 R2

Oracle Java JRE 1.7 1.8Java Advanced Imaging 1.1 1.1GIFT-Cloud Uploader 1.1.10 1.1.10

184 c om pu t e r m e thod s and p r og r am s i n b i om ed i c i n e 1 3 9 ( 2 0 1 7 ) 1 8 1 – 1 9 0

Page 5: GIFT-Cloud: A data sharing and collaboration platform for ... · GIFT-Cloud: A data sharing and collaboration platform for medical imaging research Tom Doel a,1*, Dzhoshkun I. Shakir

if uploaded at different times, while preserving pseudo-anonymity (see Fig. 2). Patient names are replaced by anarbitrary human-readable pseudonymous identifier (Re-search ID).Additional fields documented in the DICOM standardthat may contain information that could help to identify thepatient, such as date of birth or accession number, are removed.The anonymisation of other fields is configurable per-projectusing scripts written in the DicomEdit language [19]. All private(vendor-specific) fields, which are undocumented and some-times contain patient information, are removed unlessspecifically whitelisted in the anonymisation script. As an ad-ditional security measure, GIFT-Cloud Uploader verifies beforeeach upload that fundamental identifying fields (patient name,patient ID, date of birth and accession number) have been modi-fied, in order to prevent upload in the case of a misconfiguredor missing anonymisation script.

Some types of medical images contain patient informa-tion in annotations burnt into the pixel data. This is often thecase for 2D and 2D+t ultrasound images, and can in some casesbe true for MR localisers and scout images. This is a remnantof paper and film based organisation, which for those typesof applications is still widespread.These annotations are vendorspecific, but for a given scanner model, modality and soft-ware version, their location is generally consistent.The DICOMstandard contains a field to indicate the presence of burnt inannotations but this is optional and cannot be relied upon tosignal the presence and location of annotations [20].

Automated pixel data anonymisation is achieved using atemplate database (see Fig. 3). The Uploader uses DICOM in-formation describing the source and type of image, includingthe scanner model, scanner software version, image modal-ity and resolution, to find the template defining which regions

Fig. 2 – Diagram showing the process for anonymisation and subject matching during upload of imaging data to GIFT-Cloud. Anonymisation is performed within the clinical institution; if uploaded via PACS this is performed by the Gatewayserver, while if the user uploads data directly from their computer then this anonymisation is performed on their localcomputer. Before uploading, the image metadata are examined to determine the type of anonymisation required. ThePatient ID is translated into a pseudonymised patient identifier (PPID). This PPID is used to determine if the subject alreadyexists on the server and to obtain the pseudonymous Research ID for that subject. If no subject exists, then a new subject iscreated with a new pseudonymous Research ID. Within the metadata, the Patient ID is replaced by the PPID and the patientname is replaced by the Research ID. Required data grouping fields such as the Series Instance UID are replaced with aSHA-1 hash of their value.

185c om pu t e r m e thod s and p r og r am s i n b i om ed i c i n e 1 3 9 ( 2 0 1 7 ) 1 8 1 – 1 9 0

Page 6: GIFT-Cloud: A data sharing and collaboration platform for ... · GIFT-Cloud: A data sharing and collaboration platform for medical imaging research Tom Doel a,1*, Dzhoshkun I. Shakir

must be blacked out in each image. The black-out procedureis performed automatically on each image or video frame usingsoftware extended from DicomCleaner [9]. If the DICOM stan-dard permits an image type to contain burnt-in annotations,but no suitable template is found, then the data are not up-loaded. An interactive tool enables users to create newtemplates defining the regions to black out on the images fromscanners that are not yet in the template database.

3.6. Storing pseudonymous patient identifiers

Some research projects require clinicians to maintain a listmapping patient identifiers to the pseudonymous Research IDsused in the research data. This could for example be used inlongitudinal follow-ups with patients. The Gateway installedwithin each institution maintains this mapping in a localpassword protected file that does not leave the hospitalnetwork.

3.7. Data grouping

For multi-modal or time series data analysis, it is necessarythat data are grouped under the same subject label, even if those

data were uploaded at different times. However, typicalanonymisation processes destroy the identifiers that make suchgrouping possible. GIFT-Cloud solves this situation by using thepseudonymised patient, study and series identifiers to auto-matically match the data to the existing subjects and series.Because this mechanism is independent of the Gateway or localcomputer that performs the uploading, it ensures correct group-ing of data regardless of how the data are uploaded, even ifthey are transferred from different institutions.

3.8. Data security and access control

The GIFT-Cloud server does not receive nor store any per-sonal identifiable data, only pseudo-anonymised data. All dataexchange over the Internet is encrypted and a server certifi-cate ensures clients can only connect to the genuine server.Firewalls on the GIFT-Cloud Server and Gateways restrict accessonly to trusted clients.

Data access requires a personal account. GIFT-Cloud storesdata from each institution in a separate data group (see Fig. 4)and access to these groups can be individually configured foreach account. This supports adherence to data-sharing agree-ments that may vary between institutions.

Fig. 3 – Illustration of automated pixel data anonymisation procedure applied before uploading images. The image shown isa cropped ultrasound scan of a training phantom. The image metadata are examined to determine if pixel dataanonymisation is required. If so, a suitable template for the scanner model and image resolution is located that specifieswhich regions of the image contain personal identifiable data. These regions are blacked out before uploading. If nosuitable template is found, the data are not uploaded.

186 c om pu t e r m e thod s and p r og r am s i n b i om ed i c i n e 1 3 9 ( 2 0 1 7 ) 1 8 1 – 1 9 0

Page 7: GIFT-Cloud: A data sharing and collaboration platform for ... · GIFT-Cloud: A data sharing and collaboration platform for medical imaging research Tom Doel a,1*, Dzhoshkun I. Shakir

4. Case study: placental segmentation

To illustrate the integration of GIFT-Cloud with clinical work-flow, research and development, we describe an application infetal surgical planning and image-guided surgery that we arecurrently developing as part of the GIFT-Surg project.

4.1. Background

Identical twin placental abnormalities such as twin-to-twintransfusion syndrome (TTTS) [21] and selective intrauterinegrowth restriction (sIUGR) [22] can have a major impact on ma-ternal and fetal outcome. Both conditions can be treatedeffectively using minimally invasive fetoscopic surgery. Ad-vanced image-based surgical planning techniques have thepotential to improve efficacy and reduce treatment-related mor-bidity and mortality [23]. However, such an image-baseddiagnosis and surgical planning system requires robust andaccurate segmentation of fetal and maternal organs from MRIimages acquired during the diagnostic phase. To assist withthis, we have developed an advanced computational methodfor semi-automated organ segmentation, focussing initially onthe placenta [24]. The ongoing development and assessment

of this method requires recurring testing and validation withnew data.

4.2. Data

Retrospective and prospective fetal MRI scans are obtained fromUCLH and UZ Leuven as part of routine scanning, with ap-proval by the ethical committees and maternal consent forresearch use. For this research, datasets are selected that showcomplete placenta volumes for the second trimester.

4.3. Workflow

The workflow is illustrated in Fig. 5. The clinician selectsdatasets using the hospital PACS or the medical record, andsends them to the GIFT-Cloud Gateway node. The cliniciansneed not be concerned about anonymisation or upload pro-cedure which happens automatically. The hospital Gatewayinternally stores the mapping between patient ID andpseudonymised research ID for future reference. Researchersretrieve the anonymised data using software integrated withGIFT-Cloud, perform the segmentation using Slic-Seg [24] andupload the resulting segmentation mask to the server.The cli-nicians can access and evaluate the resulting segmentation

Fig. 4 – The data access permissions model for GIFT-Cloud. In this example, the current data sharing agreements betweeninstitutions permit User 1 to access data from Institution A but not from Institution B. User 2 is permitted to access datafrom both institutions.

187c om pu t e r m e thod s and p r og r am s i n b i om ed i c i n e 1 3 9 ( 2 0 1 7 ) 1 8 1 – 1 9 0

Page 8: GIFT-Cloud: A data sharing and collaboration platform for ... · GIFT-Cloud: A data sharing and collaboration platform for medical imaging research Tom Doel a,1*, Dzhoshkun I. Shakir

through the GIFT-Cloud web interface or by downloading thedata to their own systems.

5. Evaluation

5.1. Feature summary

Table 2 summarises key features available to end users of GIFT-Cloud.These include: standard features provided by XNAT; user-facing features implemented by GIFT-Cloud using XNAT’sprogrammatic REST API; and novel features developed forGIFT-Cloud.

5.2. Capacity

Data storage does not present a fundamental limitation for thiswork as storage is managed by a PostgreSQL database on anextensible hard disk. GIFT-Cloud is designed with an ex-pected need for storing 20,000 imaging series from 1000 subjects,comprising up to one million images in total. We expect theoperating limits of the system to be exponentially higher thanthis, based on XNAT usage by other projects. For example, theVanderbilt University IIS CCI database, based on XNAT, hostsmore than a quarter of a million scans for over 28,000 sub-jects [25].

Fig. 5 – Data workflow showing how GIFT-Cloud is used in the development of an algorithm for placental segmentation forfetal surgery applications. The clinician uses the PACS to send a suitable dataset to GIFT-Cloud. The data are sent by DICOMPush to the GIFT-Cloud Gateway, which anonymises, encrypts and uploads the data to the GIFT-Cloud server. Theresearcher can then download the anonymised data, perform the segmentation and upload the resulting segmentationmask, which will be grouped in the same subject as the anonymised data. The clinician can now download and comparethe anonymised data and segmentation for validation. Through such workflows GIFT-Cloud makes it possible to easilyshare new results during the active development stages of novel imaging methods.

Table 2 – Summary of GIFT-Cloud features includingthose provided by the XNAT system. GIFT-Cloud extendsXNAT with additional tools and novel features. XNATfeatures labelled “via API” are not part of the standardXMAT 1.6 user interface but are availableprogrammatically through the REST API and may beinvoked by third-party tools such as those provided byGIFT-Cloud.

Feature XNAT1.6

GIFT-Cloud

Web user interface ✓ ✓

REST API ✓ ✓

Automatic subject grouping by PPID lookup ✓

Interactive single-subject data upload ✓ ✓

Automated multi-subject data upload via API ✓

Remote DICOM Gateway for query/retrieve ✓ ✓

Remote DICOM Gateway for anonymisationand store

PACS integration via API ✓

PACS integration on-site anonymisation andencryption

Configurable client-side metadataanonymisation

✓ ✓

Configurable client-side pixel dataanonymisation

DICOM CT, MR, US support ✓ ✓

NIFTI, Analyze support via API ✓

Automatic MPEG to DICOM video conversion ✓

188 c om pu t e r m e thod s and p r og r am s i n b i om ed i c i n e 1 3 9 ( 2 0 1 7 ) 1 8 1 – 1 9 0

Page 9: GIFT-Cloud: A data sharing and collaboration platform for ... · GIFT-Cloud: A data sharing and collaboration platform for medical imaging research Tom Doel a,1*, Dzhoshkun I. Shakir

5.3. Scalability

The system allows the practical addition of unlimited new clini-cal sites by installing local Gateways, with no limit to thenumber of users who can access the system. The total capac-ity of the system is limited by the single server design, whichis adequate for our purposes. However, if individual sites regu-larly require large volumes of data upload or download, aredundant server design may be more appropriate.

5.4. Performance

GIFT-Cloud is designed to prioritise a low impact on networkinfrastructure over speed and latency; therefore data uploadis bandwidth-limited in a configurable manner. Speed of uploadis therefore not a consideration for this work.

5.5. Fault tolerance

The Gateway servers asynchronously store and upload data re-ceived from the PACS. To prevent data loss, files are physicallystored on the Gateway servers until notified of successfulstorage by the Server. If the Server connection is not avail-able or data upload fails, uploading is re-attempted multipletimes with an exponentially increasing delay.

5.6. Testing

The Server and Uploader software include a suite of auto-mated unit, integration and system tests. After installing theGateway systems at each hospital, the anonymisation mecha-nisms were tested with dummy data to ensure compliance withdata anonymisation policies.

5.7. Data governance

GIFT-Cloud fulfils the requirements of patient confidentialityand information governance processes within all the collabo-rating institutions and in the UK were passed by the appropriateNHS Caldicott guardian and were cleared by the London Hamp-stead Research Ethics Committee (15/LO/1488).

6. Discussion and conclusion

GIFT-Cloud provides a secure, open-source platform for two-way sharing of research data across multiple institutions [5].GIFT-Cloud is designed to meet the demands of collaborativeresearch projects by simplifying data transfer and automat-ing anonymisation processes.

GIFT-Cloud fulfils the following key requirements: firstly, theability to easily integrate with local IT infrastructure of the in-stitutions that provided clinical data and expertise, with theend user’s systems such as PACS or electronic medical record,and within routine clinical workflow. Secondly, the support forvaried collaboration agreements between institutions andrelated access control restrictions.Thirdly, the support not onlyfor DICOM data but also for including images and video avail-able in other commonly used formats.

Ongoing work will identify and integrate support for addi-tional anonymisation approaches that may be needed for futuredata protocols and modalities. Other planned improvementsinclude extending support for additional data formats and videocodecs, and adding a configurable uploading bandwidth limi-tation to reduce the impact of the Gateway servers on the PACSsystems when dealing with large data streams. An improvedconfiguration updating system will allow new modalities andpixel data anonymisation templates to be added via the serverwithout requiring software updates to the Gateway and usersoftware.

GIFT-Cloud will promote interdisciplinary research withinthe GIFT-Surg project, facilitating the development of new soft-ware for fetal surgical planning and image-guided surgery.Theautomated processes significantly reduce the time burden onclinicians and researchers in data anonymisation and trans-fer, encouraging more data sharing. Researchers benefit froma centralised repository of all available data with secure accessand automated backup. Research software can be written toautomatically fetch the data directly from the server, insteadof requiring the developer to manually copy data to theirmachine. Additionally, the ability to share segmentations andother analysis results back to the GIFT-Cloud Server allows clini-cal users to access and evaluate these results.

These features are useful not only to GIFT-Surg; they canmore generally benefit medical imaging research projects wheredata sharing is required between researchers and with clini-cal institutions. The supporting XNAT framework furtherprovides a wide range of customisations and extensible fea-tures. By building on cross-platform architecture and publishingour software under an open-source licence, we offer this systemfor improving collaboration across medical imaging research.

Funding

This work was supported through an Innovative Engineeringfor Health award by the Wellcome Trust [WT101957], the En-gineering and Physical Sciences Research Council (EPSRC) [NS/A000027/1] and a National Institute for Health ResearchBiomedical Research Centre UCLH/UCL High Impact Initia-tive. Jan Deprest is funded by the Fonds voor WetenschappelijkOnderzoek Vlaanderen (FWO) [1.8.012.07] and Great OrmondStreet Hospital Children’s Charity. Anna L. David is sup-ported at UCL/UCLH by funding from the Department of HealthNIHR Biomedical Research Centres funding scheme. SébastienOurselin receives funding from the EPSRC [EP/H046410/1, EP/J020990/1] and the MRC [MR/J01107X/1].

Conflicts of interest

None.

R E F E R E N C E S

[1] University College London, GIFT-Surg. http://www.gift-surg.ac.uk/, 2016. (Accessed 1 June 2016).

189c om pu t e r m e thod s and p r og r am s i n b i om ed i c i n e 1 3 9 ( 2 0 1 7 ) 1 8 1 – 1 9 0

Page 10: GIFT-Cloud: A data sharing and collaboration platform for ... · GIFT-Cloud: A data sharing and collaboration platform for medical imaging research Tom Doel a,1*, Dzhoshkun I. Shakir

[2] D.T. Fetzer, O.C. West, The HIPAA privacy rule and protectedhealth information: implications in research involvingDICOM image databases, Acad. Radiol. 15 (2008) 390–395.

[3] J.D. Robinson, Beyond the DICOM header: additional issuesin deidentification, Am. J. Roentgenol. 203 (2014) W658–W664.

[4] B. Fecher, S. Friesike, M. Hebing, What drives academic datasharing?, PLoS ONE 10 (2015) 25.

[5] GIFT-Surg, Open-source software on GitHub. https://github.com/gift-surg, 2016. (Accessed 22 September 2016).

[6] E. Bellon, M. Feron, T. Deprez, R. Reynders, B. Van den Bosch,Trends in PACS architecture, Eur. J. Radiol. 78 (2011) 199–204.

[7] National Electrical Manufacturers Association, Digitalimaging and communications in medicine (DICOM).http://medical.nema.org/, 2016. (Accessed 1 June16).

[8] K.Y.E. Aryanto, M. Oudkerk, P.M.A. van Ooijen, Free DICOMde-identification tools in clinical research: functioning andsafety of patient privacy, Eur. Radiol. 25 (2015) 3685–3695.

[9] PixedMed Publishing, DicomCleaner. http://www.dclunie.com/pixelmed/software/webstart/DicomCleanerUsage.html, 2016. (Accessed 1 June 2016).

[10] D.S. Marcus, T.R. Olsen, M. Ramaratnam, R.L. Buckner, Theextensible neuroimaging archive toolkit—an informaticsplatform for managing, exploring, and sharingneuroimaging data, Neuroinformatics 5 (2007) 11–33.

[11] Radiological Society of N. America, 2000 MIRC clinical trialsprocessor. http://www.rsna.org/ctp.aspx. (Accessed 1 June2016).

[12] I.B. Ozyurt, D.B. Keator, D. Wei, C. Fennema-Notestine, K.R.Pease, J. Bockholt, et al., Federated web-accessible clinicaldata management within an extensible neuroimagingdatabase, Neuroinformatics 8 (2010) 231–249.

[13] A. Scott, W. Courtney, D. Wood, R. de la Garza, S. Lane, M.King, et al., COINS: an innovative informatics andneuroimaging tool suite built for large heterogeneousdatasets, Front Neuroinform 5 (2011) 33.

[14] S. Das, A.P. Zijdenbos, J. Harlap, D. Vins, A.C. Evans, LORIS: aweb-based data management system for multi-centerstudies, Front Neuroinform 5 (2011) 37.

[15] K.G. Helmer, J.L. Ambite, J. Ames, R. Ananthakrishnan, G.Burns, A.L. Chervenak, et al., Biomedical informaticsresearch, enabling collaborative research using theBiomedical Informatics Research Network (BIRN), J. Am.Med. Inform. Assoc. 18 (2011) 416–422.

[16] K. Clark, B. Vendt, K. Smith, J. Freymann, J. Kirby, P. Koppel,et al., The Cancer Imaging Archive (TCIA): maintaining andoperating a public information repository, J. Digit. Imaging26 (2013) 1045–1057.

[17] Y.R. Gao, S.S. Burns, C.B. Lauzon, A.E. Fong, T.A. James, J.F.Lubar, et al., Integration of XNAT/PACS, DICOM, and researchsoftware for automated multi-modal image analysis, in: M.Y.Law, W.W. Boonn (Eds.), Medical Imaging, 2013. AdvancedPacs-Based Imaging Informatics and TherapeuticApplications, Spie-Int Soc Optical Engineering, Bellingham,2013.

[18] G. Zeilinger, dcm4che. http://www.dcm4che.org/, 2016.(Accessed 1 June 2016).

[19] K.A. Archie, D.S. Marcus, DicomBrowser: software forviewing and modifying DICOM metadata, J. Digit. Imaging 25(2012) 635–645.

[20] J.B. Freymann, J.S. Kirby, J.H. Perry, D.A. Clunie, C.C. Jaffe,Image data sharing for biomedical research-meeting HIPAArequirements for de-identification, J. Digit. Imaging 25 (2012)14–24.

[21] J.A. Deprest, A.W. Flake, E. Gratacos, Y. Ville, K. Hecher, K.Nicolaides, et al., The making of fetal surgery, Prenat. Diagn.30 (2010) 653–667.

[22] G.E. Chalouhi, M.A. Marangoni, T. Quibel, B. Deloison, N.Benzina, M. Essaoui, et al., Active management of selectiveintrauterine growth restriction with abnormal Doppler inmonochorionic diamniotic twin pregnancies diagnosed inthe second trimester of pregnancy, Prenat. Diagn. 33 (2013)109–115.

[23] R. Pratt, J. Deprest, T. Vercauteren, S. Ourselin, A.L. David,Computer-assisted surgical planning and intraoperativeguidance in fetal surgery: a systematic review, Prenat. Diagn.(2015).

[24] G. Wang, M.A. Zuluaga, R. Pratt, M. Aertsen, T. Doel, M.Klusmann, et al., Slic-Seg: a minimally interactivesegmentation of the placenta from sparse and motion-corrupted fetal MRI in multiple views, Med. Image Anal.(2016).

[25] R.L. Harrigan, B.C. Yvernault, B.D. Boyd, S.M. Damon, K.D.Gibney, B.N. Conrad, et al., Vanderbilt University Institute ofImaging Science Center for Computational Imaging XNAT: amultimodal data archive and processing environment,Neuroimage 124 (2016) 1097–1101.

190 c om pu t e r m e thod s and p r og r am s i n b i om ed i c i n e 1 3 9 ( 2 0 1 7 ) 1 8 1 – 1 9 0