artefact presentation teamplte

56
1 COLLABORATIVE DATASPACE LabKey User Conference Drienna Holman, SCHARP/FHCRC Britt Piehler, LabKey Software October 23, 2014

Upload: others

Post on 26-Apr-2022

8 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Artefact Presentation Teamplte

1

COLLABORATIVE DATASPACE

LabKey User Conference

Drienna Holman, SCHARP/FHCRC

Britt Piehler, LabKey Software

October 23, 2014

Page 2: Artefact Presentation Teamplte

2

AGENDA

Collaborative DataSpace (CDS) Program Overview

• Problem Statement

• Program Components

• Application Walk-through

• Considerations

Future of CDS

Page 3: Artefact Presentation Teamplte

3

Problem Statement

Page 4: Artefact Presentation Teamplte

4

“Dramatic shift in the culture and practice of sharing research data”

Page 5: Artefact Presentation Teamplte

5

DATA SHARING CURRENT PROCESS

Current process

Page 6: Artefact Presentation Teamplte

6

The Collaborative DataSpaceProgram

Page 7: Artefact Presentation Teamplte

7

Content-rich, annotated catalog

of integrated study data and

metadata

Outreach, analysis and support

services

Promotes engagement and data

consistency across networks

Web-based app combines CDS

data compendium and user

services and applies data

security model

CDS Program

Data ApplicationServices Governance

COLLABORATIVE DATASPACE PROGRAM COMPONENTS

Page 8: Artefact Presentation Teamplte

8

COLLABORATIVE DATASPACE PROGRAM WHAT IS CDS?

What CDS is (and what it isn’t)

CDS is not:

• Source for statistical proof

• Deep analysis environment

• Source for raw lab data

• Repository of “datasets”

CDS helps researchers:

• Discover basic facts and learn what data exist

• Perform quick, low-cost tests of ideas

• Compare data across studies and assay types

• Communicate findings

Page 9: Artefact Presentation Teamplte

9

The Components of CDS

Page 10: Artefact Presentation Teamplte

10

Application Walk-through

Page 11: Artefact Presentation Teamplte

11

Page 12: Artefact Presentation Teamplte

12

Sample Scenario:

Researchers have questioned whether there is a

relationship between a subject's vaccine response

and their Body Mass Index.

Do subjects with a high BMI show a reduced

immune response under a given vaccine protocol?

But first, a tour…

Page 13: Artefact Presentation Teamplte

13

The “Learn About…” section of the CDS allows

researchers to explore the data catalog from a

variety of perspectives.

A researcher might start here to assess which

studies might address a BMI/response relationship.

Page 14: Artefact Presentation Teamplte

14

Page 15: Artefact Presentation Teamplte

15

Page 16: Artefact Presentation Teamplte

16

Page 17: Artefact Presentation Teamplte

17

Page 18: Artefact Presentation Teamplte

18

The “Find Subjects” view allows for the creation of virtual

cohorts of subjects based on various criteria.

In our scenario, the researcher has used the Learn

About pages to identify two studies that measured both

BMI and Immune Response and wishes to investigate

these subjects further.

Page 19: Artefact Presentation Teamplte

19

Page 20: Artefact Presentation Teamplte

20

Page 21: Artefact Presentation Teamplte

21

Page 22: Artefact Presentation Teamplte

22

Page 23: Artefact Presentation Teamplte

23

Page 24: Artefact Presentation Teamplte

24

The “plot data” view allows researchers to explore data

relationships between measures for the virtual cohort

defined by the active filters.

In this case, the researcher will quickly plot the

participants’ BMI at enrollment against measured

Immune Response at all time points.

Page 25: Artefact Presentation Teamplte

25

Page 26: Artefact Presentation Teamplte

26

Page 27: Artefact Presentation Teamplte

27

Page 28: Artefact Presentation Teamplte

28

If the requested plot contains too many data points to

display clearly or efficiently, the CDS will automatically

switch to a “binned” plot, similar to a heat map.

Filtering the data further will show actual data points. For

example, the researcher may wish to narrow the cohort

to subjects who received selected vaccine products.

Page 29: Artefact Presentation Teamplte

29

Page 30: Artefact Presentation Teamplte

30

Page 31: Artefact Presentation Teamplte

31

Page 32: Artefact Presentation Teamplte

32

Page 33: Artefact Presentation Teamplte

33

Page 34: Artefact Presentation Teamplte

34

Page 35: Artefact Presentation Teamplte

35

At a glance, it appears that Obese subjects may have

slightly lower Immune Response than Normal,

Overweight, and Underweight subjects.

Data points can be colored to investigate whether there

might be a confounding variable, such as the subject’s

country of residence.

Page 36: Artefact Presentation Teamplte

36

Page 37: Artefact Presentation Teamplte

37

Page 38: Artefact Presentation Teamplte

38

The Data Grid view allows the researcher to view, sort,

and filter data rows, as well as add additional columns.

This dataset will be exported to analyze the

BMI/Immune Response relationship in more depth using

external statistical tools.

Finally, this subject group will be saved in the CDS for

future analysis.

Page 39: Artefact Presentation Teamplte

39

Page 40: Artefact Presentation Teamplte

40

The Other (Often Unappreciated) Components

of the CDS Program

Page 41: Artefact Presentation Teamplte

41

CDS application

Data harmonization/integration

Data annotation

User services

Outreach, training, support

Program and data governance

Important work is hidden beneath the surface

COLLABORATIVE DATASPACE PROGRAM COMPONENTS

Page 42: Artefact Presentation Teamplte

42

Integrated

CDS Program

COLLABORATIVE DATASPACE PROGRAM APPROACH

Application

Explore

Visualize

Export

Governance

Collaboration

Data Sharing

Harmonization

Services

Outreach

Analysis

Support

Application

Explore

Visualize

ExportData

Study Catalog

Integrated Data

Annotation

Page 43: Artefact Presentation Teamplte

43

Data

Study Catalog

Integrated Data

Annotation

Governance

Collaboration

Data Sharing

Harmonization

Services

Outreach

Analysis

Support

Application

Explore

Visualize

Export

Integrated

CDS Program

COLLABORATIVE DATASPACE PROGRAM APPROACH

Why are these other components important?

Page 44: Artefact Presentation Teamplte

44

CDS Program: Data Component

COLLABORATIVE DATASPACE PROGRAM COMPONENTS

Main Elements

• Study Catalog – Facilitate discovery and provide historical context

• Integrated Data – High volume of combined data

• Annotation – Critical to interpretation

Other Considerations

• Whether to restrict users to combine certain data

• CDS data model, data harmonization, controlled terminology

• Raw versus processed data

Page 45: Artefact Presentation Teamplte

45

CDS Program: Services Component

COLLABORATIVE DATASPACE PROGRAM COMPONENTS

Main Elements

• Outreach – Promote utilization and build community

• Analysis – Assist with analyses and interpretation

• Support – Combination of self-service and facilitated support

Other Considerations

• Maintaining accurate and current list of users

• User metrics

Page 46: Artefact Presentation Teamplte

46

CDS Program: Governance Component

COLLABORATIVE DATASPACE PROGRAM COMPONENTS

Main Elements

• Collaboration – Network engagement in governance

• Data Sharing – Global data sharing and use agreements

• Harmonization – Increase efficiency of data integration

Other Considerations

• Public access

• Flexible data sharing and collaboration levels

• Identify potential legal issues and informed consent concerns

Page 47: Artefact Presentation Teamplte

47

COLLABORATIVE DATASPACE PROGRAM APPROACH

Network

A

Network

B

CDS

Support

Visualize

Explore Export

Data

?

Outreach Analyze

Governance

Easier

Collaboration

Page 48: Artefact Presentation Teamplte

48

CDS shows what data were collected

and what is available without needing

help from others

Networks agree to a single

overarching agreement one time,

saving users cumbersome steps

Data are already joined,

harmonized, and annotated with

interactive tools, saving effort

Deeper data annotation and

resource information are all in

one place to facilitate

appropriate interpretation

DATA SHARING CDS PROCESS

Current process

CDS process

Page 49: Artefact Presentation Teamplte

49

Future of CDS

Page 50: Artefact Presentation Teamplte

50

CDS Program Extensibility

Data GovernanceApplication Services

COLLABORATIVE DATASPACE PROGRAM FUTURE DIRECTIONS

Page 51: Artefact Presentation Teamplte

51

Oversight, coordination,

data integration

Product design Product development Consultation on data

integration and system

development

COLLABORATIVE DATASPACE PROGRAM ACKNOWLEDGEMENTS

Page 52: Artefact Presentation Teamplte

52

Questions?

Page 53: Artefact Presentation Teamplte

53

Appendix

Page 54: Artefact Presentation Teamplte

54

WHAT’S NEXT FOR CDS ROADMAP

Proof of concept ImpactMake it real

• Built proof of concept

CDS application

• Manually added data

• Drafted temporary

agreements

• Establish data model

• Establish data integration

infrastructure

• Initiate data sharing

agreements

• Prepare for launch

Year 1 Year 2 Years 3 & 4

Pre-launch

• Expand feature improvements

• Expand data catalog

• Establish production-level data

pipelines

• Establish governance structure

• Establish user support services

Launch!

Post-launch

• Outreach to establish user base

• Proactive use of analytic services

• Science directed Program

Governance

• User feedback, system refinement,

bug fixes

• Assess impact

Page 55: Artefact Presentation Teamplte

55

CDS• Population-oriented exploration of

clinical and pre-clinical CAVD,

AVEG, HVTN data, areas of study

• Data, statistical, outreach and

user support services

Atlas

• Operations-oriented portal for

SCHARP networks focused on

data acquisition

• Study and dataset-based

sharing with light analytics

ImmuneSpace

• Analytic front-end over integrated

ImmPort data, including rich tools

for specialized analyses

• Systems biology approach to

high-dimensional/high-throughput

data; e.g., gene expression

ImmPort

• Repository for data generated

and submitted by NIAID/DAIT

investigators

• Focus on data standardization

and curation, with a handful of

analytic tools for selected data

types

COMPLEMENTARY PLATFORMS

Study

Operations

Endpoint

Analysis

Ancillary

Analysis

Hypothesis

Generation

Research

Review

Study

Design

Page 56: Artefact Presentation Teamplte

56

VisualizeAssay 2

THE EVOLUTION OF CDS DATA FLOW

Export Transform Integrate

Study Data

Meta Data

Annotation

CRF

Assay 1

Products

CDS

Application

Discover

Support

Explore

Export

?

Outreach

Analyze