ADVANCING TRANSMART ANALYTICAL CAPABILITIES WITH KNOWLEDGE CONTENT, tranSMART Community Meeting, Sirimon O’Charoen, November 6th, 2013

Upload: david-peyruc

Post on 19-Jan-2015


DESCRIPTION

tranSMART Community Meeting 5-7 Nov 13 - Session 5: Advancing tranSMART Analytical Capabilities with Knowledge Content

Sirimon O’Charoen, Thomson Reuters

To analyze data in tranSMART effectively, a biological-analysis, knowledge-based approach is needed. Through a case study, we will demonstrate how systems biology content can be integrated into tranSMART to enable functional analysis and biological interpretation. We will also share our experience and user feedback from various projects.

TRANSCRIPT

Page 1: tranSMART Community Meeting 5-7 Nov 13 - Session 5: Advancing tranSMART Analytical Capabilities with Knowledge Content

ADVANCING TRANSMART ANALYTICAL CAPABILITIES WITH KNOWLEDGE CONTENT

tranSMART Community Meeting

Sirimon O’Charoen

November 6th, 2013

Page 2

4 EXAMPLES OF tranSMART USE CASES

• Use case 1: Leveraging public datasets

• Use case 2: Finding information on variant and mutation

• Use case 3: Biological interpretation

• Use case 4: Implementing classification model

• Feedback from tranSMART users

Page 3

Leveraging Public Datasets

USE CASE 1

Page 4

Where else is the IL-33 gene significantly expressed?

Up-regulated in an asthma study

Page 5

What other genes are significantly expressed in ulcerative colitis?

Page 6

REG1A

Page 7

SLC6A14

Page 8

MICROARRAY REPOSITORY: PROCESSING PROCEDURE

A. Search for Datasets in public databases & Data loading

B. Quality Control (QC) testing of Raw Assays (filtering out of unsuitable, defective Assays)

C. GCRMA Processing of QC-approved Assays

D. Assays Annotation:
i. Assignment of experimental meta-data values to the Assays
ii. Assignment of experimental Assays Groups and their Comparisons

E. Statistical analysis of defined Comparisons:
i. Differential expression testing
ii. Calculation of Fold Changes
iii. Functional Descriptors calculation

Optional: cutoffs
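
Steps D and E above can be illustrated with a minimal Python sketch. It assumes the processed intensities are on a log2 scale, that each gene has one "case" and one "control" group, and that Welch's t statistic stands in for the differential expression test; the slides do not say which test or cutoffs the repository actually uses. Gene names, values, and cutoff defaults are made up for illustration.

```python
import math
import statistics


def log2_fold_change(case, control):
    """Difference of mean log2 intensities (case minus control)."""
    return statistics.fmean(case) - statistics.fmean(control)


def welch_t(case, control):
    """Welch's t statistic for two groups with possibly unequal variances."""
    var_case = statistics.variance(case) / len(case)
    var_control = statistics.variance(control) / len(control)
    diff = statistics.fmean(case) - statistics.fmean(control)
    return diff / math.sqrt(var_case + var_control)


def compare(expression, min_abs_lfc=1.0, min_abs_t=2.0):
    """Return the genes that pass both optional cutoffs (hypothetical defaults)."""
    hits = {}
    for gene, groups in expression.items():
        lfc = log2_fold_change(groups["case"], groups["control"])
        t = welch_t(groups["case"], groups["control"])
        if abs(lfc) >= min_abs_lfc and abs(t) >= min_abs_t:
            hits[gene] = (round(lfc, 2), round(t, 2))
    return hits


# Made-up log2 intensities for one comparison (case vs. control).
expression = {
    "GENE_A": {"case": [9.1, 9.4, 9.0], "control": [7.0, 7.2, 6.9]},
    "GENE_B": {"case": [5.0, 5.2, 4.9], "control": [5.1, 5.0, 5.1]},
}
hits = compare(expression)
print(hits)
```

Only GENE_A passes both cutoffs here. A real pipeline would also convert the statistic to a p-value and correct for multiple testing.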

Page 9

MICROARRAY REPOSITORY: QC PROCEDURE

• Datasets undergo rigorous quality control during processing

• An assay is removed from the dataset if it is identified as an outlier by the majority of QC metrics

• Users are able to see which tests the datasets passed/failed
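
The majority rule above can be sketched in a few lines of Python. The metric and assay names are hypothetical, and "majority" is read here as strictly more than half of the metrics; the slides do not list the metrics or say how ties are handled.

```python
def failed_assays(qc_flags):
    """Return assays flagged as outliers by more than half of the QC metrics."""
    removed = []
    for assay, flags in qc_flags.items():
        outlier_votes = sum(flags.values())  # True counts as 1
        if outlier_votes > len(flags) / 2:
            removed.append(assay)
    return removed


# Hypothetical per-metric results; True means "flagged as an outlier".
qc_flags = {
    "assay_01": {"metric_1": False, "metric_2": False, "metric_3": True},
    "assay_02": {"metric_1": True, "metric_2": True, "metric_3": False},
}
removed = failed_assays(qc_flags)
print(removed)
```

Keeping the per-metric flags, not just the final verdict, is what lets users see which tests a dataset passed or failed.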

Page 10

MICROARRAY REPOSITORY: ANNOTATION PROCEDURE

METACORE

Additional manual annotation of datasets increases the granularity and the number of groups and comparisons

Page 11

Finding Information on Variant and Mutation

USE CASE 2

Page 12

How does 17p13 deletion correlate with thalidomide response in chronic lymphocytic leukemia (CLL) patients?

WBC Reduction at Day 7

No aberration vs. 17p13 deletion

Page 13

What other diseases or drugs is 17p13 deletion associated with?

Page 14

GENE VARIANT RECORD

Page 15

GENE VARIANT ASSOCIATIONS

Page 16

GENE VARIANT API

Significance of genotype-phenotype relationships across the translational pipeline

[Diagram labels: DISEASE, DRUG, VARIANT; DISEASE, VARIANT]

IDENTIFY ACTIONABLE GENE VARIANTS

A. Establish variant significance
B. Characterize the variant
C. Assess the utility of the variant:
• Understanding Disease Mechanism
• Treatment & Response

DISEASE RECORD, RESPONSE RECORD

Disease Profiling, Diagnosis, Prognosis, Screening, Risk

Predicting Efficacy / Toxicity, Monitoring Efficacy / Toxicity, Selection for Therapy, Resistance

DISCOVERY, VALIDATION, APPLICATION

Preclinical in vitro & animal studies; Clinical studies in patient segments; FDA approvals; Clinical guidelines

HTP Studies, Candidate Studies

MANUALLY CURATED CONTENT FROM A RANGE OF SOURCES

Page 17

GENE VARIANT API: PROCEDURE

SOURCES

Conference abstracts, patents, peer-reviewed journal articles, clinical trial registries, clinical guidelines, and authority approval documents (e.g., FDA)

SOURCE SELECTION VS. REJECTION

• Retrospective selection or prospective screening for frontfile. Items are screened by a text-mining tool to identify and remove items that have no relevance to the Gene Variant API.

• All articles not removed are sent to manual selection by trained annotators who follow the policy.

SELECTION CRITERIA BY THE ANALYST

• A clear study design and valuable results are required by the analyst. The item must satisfy the requirements of evidence-based medicine in order to be taken into consideration.

• Statistics and/or a statement by the author on the variant's effect on health are required. If both components are absent, the item is rejected.
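
The selection logic above can be written as a small decision function. This is only a sketch: the field names are hypothetical, the text-mining screen and the annotators' judgment are reduced to booleans, and the evidence-based-medicine requirement is collapsed into a single "clear design" flag.

```python
from dataclasses import dataclass


@dataclass
class Item:
    title: str
    relevant_by_text_mining: bool  # outcome of the automated screen
    has_clear_design: bool         # analyst's judgment on study design and results
    has_statistics: bool
    has_author_statement: bool     # author states the variant's effect on health


def triage(item):
    """Return the curation decision for one source item."""
    if not item.relevant_by_text_mining:
        return "removed by text mining"
    if not item.has_clear_design:
        return "rejected: study design"
    # The item is rejected only if BOTH components are absent.
    if not (item.has_statistics or item.has_author_statement):
        return "rejected: no statistics or author statement"
    return "accepted"


items = [
    Item("abstract A", True, True, True, False),
    Item("patent B", True, True, False, False),
    Item("article C", False, True, True, True),
]
decisions = [triage(item) for item in items]
print(decisions)
```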

Page 18

Biological Interpretation

USE CASE 3

Page 19

INVASIVE BREAST CANCER STUDY

Find predictors for treatment response

Page 20

RCB 0/I RCB III

MARKER SELECTION WORKFLOW

Page 21

ENRICHMENT ANALYSIS

• NFIB – nuclear factor IB type, a potential biomarker in breast neoplasms
• STK24 – induction of apoptosis
• ESR1 – Estrogen receptor 1
• PGR – Progesterone receptor
• CDKN2A – Cyclin-dependent kinase inhibitor 2A
• Geminin – DNA replication inhibitor
• MCM2/5 – a regulatory subunit inhibiting the helicase complex
• CDC45L – Cell cycle control protein
• MSH6 and MSH2 – MutS homologues, proteins involved in DNA repair
• RAD50 – DNA repair protein (homologous recombination-dependent repair)
• DNMT1 – DNA methylation enzyme
• EZH2 – Histone methylation enzyme
• HDAC2 – Histone deacetylation enzyme

Significant signaling pathways enriched with differentially expressed genes in Responders vs. Non-responders

• Estrogen/Progesterone signaling
• Cell cycle regulation
• DNA damage repair
• Epigenetic regulation of gene expression
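
Pathway enrichment of a gene list is commonly scored with a hypergeometric (one-sided Fisher) test. The sketch below shows that calculation. Using this particular test is an assumption, since the slides do not state which statistic the enrichment tool applies, and the counts are made up.

```python
from math import comb


def enrichment_p_value(overlap, pathway_size, list_size, background_size):
    """Hypergeometric upper tail: P(X >= overlap) when list_size genes are
    drawn from background_size genes, pathway_size of which are in the pathway."""
    max_overlap = min(pathway_size, list_size)
    total = comb(background_size, list_size)
    tail = sum(
        comb(pathway_size, k) * comb(background_size - pathway_size, list_size - k)
        for k in range(overlap, max_overlap + 1)
    )
    return tail / total


# Made-up counts: 4 differentially expressed genes, 3 of them in a
# 5-gene pathway, against a background of 20 genes.
p = enrichment_p_value(overlap=3, pathway_size=5, list_size=4, background_size=20)
print(p)
```

A small p-value means the overlap is larger than expected by chance. With many pathways tested, the p-values would still need a multiple-testing correction.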

Page 22

PATHWAY MAPS

Brca1 and Brca2 in breast cancer

Cell cycle: Start of DNA replication in early S phase

Page 23

PATHWAY MAPS

Epigenetic alterations in ovarian cancer

PR action in breast cancer: stimulation of cell growth and proliferation

Page 24

Implementing Classification Model

USE CASE 4

Page 25

PATIENT STRATIFICATION MODEL

Page 26

IMPLEMENTING MODEL IN tranSMART

Sample Molecular Subtype

Associated clinical phenotype

Model

Biomarkers

Stratification rule

Mechanism

Drug target

Standard tranSMART

MetaCore Cytoscape

Pathways

Subnetworks

Additional functionality in tranSMART

Applicable for both One Mind and Orion projects
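
As a purely illustrative sketch, a stratification rule can be thought of as a function from a sample's biomarker values to a molecular subtype label. The marker names, thresholds, and subtype labels below are hypothetical; the slide does not describe the actual model or how it is wired into tranSMART.

```python
def assign_subtype(sample, rule):
    """Label a sample by checking every biomarker against its threshold."""
    above = all(sample[marker] >= cutoff for marker, cutoff in rule["thresholds"].items())
    return rule["if_all_above"] if above else rule["otherwise"]


# Hypothetical rule: both markers at or above their cutoffs means "subtype A".
rule = {
    "thresholds": {"MARKER_1": 8.0, "MARKER_2": 6.5},
    "if_all_above": "subtype A",
    "otherwise": "subtype B",
}
samples = {
    "S1": {"MARKER_1": 9.2, "MARKER_2": 7.0},
    "S2": {"MARKER_1": 9.0, "MARKER_2": 5.1},
}
subtypes = {name: assign_subtype(values, rule) for name, values in samples.items()}
print(subtypes)
```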

Page 27

SYSTEMS BIOLOGY TOOLS

Network/Pathway-based Approach

OMICs data + other data types including clinical response

Statistical Approach

• Drug Targets

• Drug Repositioning

• Biomarker Identification

• Biological Mechanism Reconstruction

• Drug Combinations

• Prognostic Biomarkers

• Predictive Biomarkers

SYSTEMS BIOLOGY TOOL LIBRARY: State-of-the-art methods

METABASE: The most comprehensive data available

Page 28

EXAMPLES OF NETWORK APPROACHES

Systems Biology (Zhang et al., 2011)

Probabilistic Inference (Su et al., 2009)

Pathway Activity (Lee et al., 2007)

RRFE (Johannes et al., 2010)

Subnetworks (Chuang et al., 2012)

Pathway Based (Kim et al., 2012)

Page 29

FEEDBACK FROM tranSMART USERS

Page 30

STATISTICAL ANALYSIS QUESTIONS

• How to control the Type I error rate?
– Testing set vs. validation set

• How to perform longitudinal analysis?
– Regression models

• How to identify covariates? Which variable has the highest correlation with the outcome?
– Multivariate analysis

• Other analysis methods/workflows
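
The question of which variable correlates most strongly with the outcome can be approached, as a first pass, by ranking variables on their Pearson correlation with it. The sketch below does only that univariate screen, not the multivariate analysis named on the slide. Variable names and values are made up.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def rank_by_correlation(variables, outcome):
    """Sort variables by absolute correlation with the outcome, strongest first."""
    scores = {name: pearson(values, outcome) for name, values in variables.items()}
    return sorted(scores.items(), key=lambda item: abs(item[1]), reverse=True)


# Made-up data for five subjects.
outcome = [1.0, 2.0, 3.0, 4.0, 5.0]
variables = {
    "age": [30, 45, 38, 50, 41],
    "dose": [10, 20, 30, 40, 50],
    "weight": [80, 75, 78, 70, 72],
}
ranking = rank_by_correlation(variables, outcome)
print(ranking)
```

Ranking on a held-out validation set, separate from the set used to pick the variables, is one way to keep such a screen from inflating the Type I error rate.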

Page 31

SCIENTIFIC QUESTIONS

• How to set up a QC framework around uploaded data? A community-developed QC standard?

• How to do cross-study analysis (more easily)?

• How to do cross-species analysis?

• How does the community report these (and bugs)?

Page 32

FEATURE WISH LIST

• Multiple improvements to R advanced workflows

• Sending results (gene list, patient subsets) from an advanced workflow back to summary statistics

• Saving a workflow (history/output)

• Using gene expression data to create subsets

• Viewing specific subject records

• Adding data types (e.g., date, longitudinal measurement)

• Improving exported tables

and many more…