A Vision for a Cancer Research Knowledge System

A Vision for a Cancer Research Knowledge System, September 14, 2016. Warren Kibbe, PhD, NCI Center for Biomedical Informatics (@wakibbe)


Page 1: A Vision for a Cancer Research Knowledge System

A Vision for a Cancer Research Knowledge System

September 14, 2016

Warren Kibbe, PhD
NCI Center for Biomedical Informatics

@wakibbe

Page 2: A Vision for a Cancer Research Knowledge System


Changing the Conversation Around Data Sharing

• How do we find data, software, standards?
• How can we make data, annotations, software, metadata accessible?
• How do we reuse data standards?
• How do we make more data machine readable?

NIH Data Commons

Data commons co-locate data, storage and computing infrastructure, and commonly used tools for analyzing and sharing data to create an interoperable resource for the research community.

*Robert L. Grossman, Allison Heath, Mark Murphy, Maria Patterson, A Case for Data Commons: Towards Data Science as a Service, to appear.

Source of image: Interior of one of Google’s data centers, www.google.com/about/datacenters/.

Page 3: A Vision for a Cancer Research Knowledge System


FAIR: Making data Findable, Accessible, Attributable, Interoperable, Reusable, and provide Recognition

Force11 white paper: https://www.force11.org/group/fairgroup/fairprinciples

Page 4: A Vision for a Cancer Research Knowledge System

Cancer Research Data Ecosystem

Diagram labels:
• Well characterized Research Data Sets
• Cancer Cohorts
• Patient Data
• EHR, Lab Data, Imaging, PROs, Smart Devices
• Decision Support
• Learning from every cancer patient
• Active research participation
• Research information donor
• Clinical Research
• Observational Studies
• Proteogenomics
• Imaging Data
• Clinical Trials
• Discovery
• Patient-engaged Research
• Surveillance
• Big Data
• Implementation Research
• SEER
• GDC

Page 5: A Vision for a Cancer Research Knowledge System


Cancer Research Data Commons Ecosystem

Genomic Data Commons

Data Validation and Harmonization

Imaging Data Commons

Proteomics Data Commons

Clinical Data Commons (Cohorts / Indiv.)

SEER (Populations)

Data Contributors and Consumers: Researchers, Patients, Clinicians, Institutions

Page 6: A Vision for a Cancer Research Knowledge System


Data Commons: Co-locate data, storage and computing infrastructure, and commonly used tools for analyzing and sharing data to create an interoperable resource for the research community.

Genomic Data Commons & Cloud Pilots

FAIR Principles: Microattribution, nanopublications, tracking the use of data, annotation of data, and use of algorithms support the data/software/metadata life cycle to provide credit and analyze the impact of data, software, analytics, algorithms, curation, and knowledge sharing.

Page 7: A Vision for a Cancer Research Knowledge System

GDC Content

Current
• TCGA: 11,353 cases
• TARGET: 3,178 cases

Coming soon
• Foundation Medicine: 18,000 cases
• Cancer studies in dbGaP: ~4,000 cases

Planned (1-3 years)
• NCI-MATCH: ~5,000 cases
• Clinical Trial Sequencing Program: ~3,000 cases
• Cancer Driver Discovery Program: ~5,000 cases
• Human Cancer Model Initiative: ~1,000 cases
• APOLLO – VA-DoD: ~8,000 cases

Genomic Data Commons
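The case counts on this slide can be tallied with a short script. The sketch below is illustrative only: the assignment of programs to the "Current", "Coming soon", and "Planned (1-3 years)" stages is an assumption about how the flattened slide was laid out, the "~" figures are treated as exact, and the names `GDC_CONTENT`, `stage_totals`, and `grand_total` are hypothetical.

```python
# Case counts as listed on the GDC Content slide ("~" values are approximate).
# ASSUMPTION: the grouping of programs into stages is a reading of the
# flattened slide text, not something the transcript states unambiguously.
GDC_CONTENT = {
    "Current": {
        "TCGA": 11_353,
        "TARGET": 3_178,
    },
    "Coming soon": {
        "Foundation Medicine": 18_000,
        "Cancer studies in dbGaP": 4_000,
    },
    "Planned (1-3 years)": {
        "NCI-MATCH": 5_000,
        "Clinical Trial Sequencing Program": 3_000,
        "Cancer Driver Discovery Program": 5_000,
        "Human Cancer Model Initiative": 1_000,
        "APOLLO - VA-DoD": 8_000,
    },
}


def stage_totals(content):
    """Return the summed case count for each stage."""
    return {stage: sum(programs.values()) for stage, programs in content.items()}


def grand_total(content):
    """Return the case count summed over every stage."""
    return sum(stage_totals(content).values())


if __name__ == "__main__":
    for stage, total in stage_totals(GDC_CONTENT).items():
        print(f"{stage}: {total:,} cases")
    print(f"All stages: {grand_total(GDC_CONTENT):,} cases")
```

Under this reading, the current content totals 14,531 cases, which is consistent with the "approximately 14,500 patients" figure on the Genomic Data Commons slide later in the deck. The all-stage total of 58,531 is still below the "> 100,000 cases" target on the following slide, although the slides do not say whether that target applies to the GDC alone.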

Page 8: A Vision for a Cancer Research Knowledge System

Towards a Cancer Knowledge System

• Continue genomic investigations of cancer
  • Need > 100,000 cases analyzed
  • Embrace all genomic platforms
  • Relationship of relapse and primary biopsies
• Incorporate associated clinical annotations
  • Clinical trial data
  • Observational, longitudinal standard-of-care data
  • N-of-1 clinical data
• Promote and curate biological investigations of cancer genetic variants
  • Driver vs. passenger mutations
  • Multiple phenotypic assays
  • Alterations in regulatory pathways – proteomics
  • Mechanisms of therapeutic resistance
  • Functional genomic investigations
• Integrative models for high-dimensional data

Genomic Data Commons

Page 9: A Vision for a Cancer Research Knowledge System

Development of the NCI Genomic Data Commons (GDC) To Foster the Molecular Diagnosis and Treatment of Cancer

GDC

Bob Grossman, PI
Univ. of Chicago

Ontario Inst. Cancer Res.
Leidos

Institute of Medicine, Towards Precision Medicine, 2011

Page 10: A Vision for a Cancer Research Knowledge System

GDC

Utility of a Cancer Knowledge System

• Identify low-frequency cancer drivers
• Define genomic determinants of response to therapy
• Compose clinical trial cohorts sharing targeted genetic lesions

Cancer information donors

Genomic Data Commons

Page 11: A Vision for a Cancer Research Knowledge System


NCI Cloud Pilots: Visualization, Compute, Pipelines, Workspaces

Genomic Data Commons

Genomic Data Commons / Cloud Pilot Ecosystem - Today

Data Submission, Harmonization, Visualization & Download

Broad FireCloud

SBG CGC

ISB CGC

Researchers

APIs

Web Interface

Researchers

Contributors

Authentication & Authorization through eRA Commons and dbGaP
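As an illustration of the "APIs" path in this diagram, the sketch below builds a metadata query for the GDC REST API without sending it. Nothing in it comes from the slides: the base URL `https://api.gdc.cancer.gov`, the `/cases` endpoint, the `filters`/`fields`/`format`/`size` parameters, the field names, and the project ID `TCGA-BRCA` are assumptions based on the public GDC API as commonly documented, and `build_cases_query` is a hypothetical helper name. It covers open-access metadata only; controlled-access data requires the authentication and authorization step shown on the slide, which is not sketched here.

```python
import json
from urllib.parse import urlencode

# ASSUMPTION: base URL of the public GDC REST API; it is not given in the slides.
GDC_API = "https://api.gdc.cancer.gov"


def build_cases_query(project_id, fields=("submitter_id", "primary_site"), size=5):
    """Build a GET URL that lists cases for one project (hypothetical helper).

    The filter is a JSON document of the form
    {"op": "in", "content": {"field": ..., "value": [...]}}.
    """
    filters = {
        "op": "in",
        "content": {"field": "project.project_id", "value": [project_id]},
    }
    params = {
        "filters": json.dumps(filters),   # JSON filter, URL-encoded below
        "fields": ",".join(fields),       # comma-separated fields to return
        "format": "JSON",
        "size": str(size),                # maximum number of records
    }
    return f"{GDC_API}/cases?{urlencode(params)}"


if __name__ == "__main__":
    # Print the URL only; fetching it is left to the reader's HTTP client.
    print(build_cases_query("TCGA-BRCA"))
```

Building the URL separately from sending the request keeps the example runnable offline and makes the query itself easy to inspect or log.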

Page 12: A Vision for a Cancer Research Knowledge System


Genomic Data Commons

Genomic Data Commons / Cloud Pilot Ecosystem - Near Term

Data Submission, Harmonization, Visualization & Download

Broad FireCloud

ISB CGC

Researchers

APIs

Web Interface

Researchers

Contributors

Authentication & Authorization through eRA Commons and dbGaP

SBG CGC

Commons @ AWS

Commons @ GCP

Commons @ Azure

New algorithms, tools, pipelines, visualizations

DockStore

Page 13: A Vision for a Cancer Research Knowledge System


Extra Slides

Page 14: A Vision for a Cancer Research Knowledge System

Blue Ribbon Panel Recommendations

• Network for Direct Patient Engagement
• Cancer Immunotherapy Translational Science Network
• Therapeutic Target Identification to Overcome Drug Resistance
• A National Cancer Data Ecosystem for Sharing and Analysis
• Fusion Oncoproteins in Childhood Cancers
• Symptom Management Research
• Prevention and Early Detection – Implementation of Evidence-based Approaches
• Retrospective Analysis of Biospecimens from Patients Treated with Standard of Care
• Generation of Human Tumor Atlases
• Development of New Enabling Cancer Technologies

Page 15: A Vision for a Cancer Research Knowledge System


Genomic Data Commons

• Unified knowledge base that promotes sharing of genomic and clinical data between researchers and facilitates precision medicine in oncology

• Contains standardized data from approximately 14,500 patients, derived from NCI programs, including:

- The Cancer Genome Atlas (TCGA)
- Therapeutically Applicable Research to Generate Effective Treatments (TARGET)
- Cancer Genome Characterization Initiative (CGCI)
- The Cancer Cell Line Encyclopedia (CCLE)

Page 16: A Vision for a Cancer Research Knowledge System


GA4GH

GenomicsAndHealth.org

Page 17: A Vision for a Cancer Research Knowledge System


Data Working Group

GA4GH

Page 18: A Vision for a Cancer Research Knowledge System


PMI – Oncology, the GDC and the Cloud Pilots Goals

• Support precision medicine-focused clinical research
• Enable researchers to deposit well-annotated (Interoperable) genomic data sets with the GDC
• Provide a single source (and single dbGaP access request!) to Find and Access these data
• Enable effective analysis and meta-analysis of these data without requiring local downloads – data Reuse
• Understand Contributions, Assess value through usage, and give Attribution to all users

Page 19: A Vision for a Cancer Research Knowledge System


PMI – Oncology, the GDC and the Cloud Pilots Goals

• Provide a data integration platform to allow multiple data types, multi-scalar data, temporal data from cancer models and patients through open APIs

• Work with the Global Alliance for Genomics and Health (GA4GH) to define the next generation of secure, flexible, meaningful, interoperable, lightweight interfaces – open APIs

• Engage the cancer research community in evaluating the open APIs for ease of use and effectiveness