transmart community meeting 5-7 nov 13 - session 5: advancing transmart analytical capabilities with...

Post on 19-Jan-2015

DESCRIPTION

tranSMART Community Meeting 5-7 Nov 13 - Session 5: Advancing tranSMART Analytical Capabilities with Knowledge Content

Sirimon Ocharoen, Thomson Reuters

To effectively analyze data in tranSMART, a biological analysis/knowledge-based approach is needed. Through a case study, we will demonstrate how systems biology content can be integrated in tranSMART to enable functional analysis and biological interpretation. We will also share our experience and user feedback from various projects.

TRANSCRIPT

ADVANCING TRANSMART ANALYTICAL CAPABILITIES WITH KNOWLEDGE CONTENT

tranSMART Community Meeting

Sirimon O’Charoen

sirimon.ocharoen@thomsonreuters.com

November 6th, 2013

2

4 EXAMPLES OF tranSMART USE CASES

• Use case 1: Leveraging public datasets

• Use case 2: Finding information on variant and mutation

• Use case 3: Biological interpretation

• Use case 4: Implementing classification model

• Feedback from tranSMART users

3

Leveraging Public Datasets

USE CASE 1

4

Where else is IL-33 gene significantly expressed?

Up-regulated in an asthma study

5

What are other genes significantly expressed in Ulcerative Colitis?

6

REG1A

7

SLC6A14

8

MICROARRAY REPOSITORY: PROCESSING PROCEDURE

A. Search for Datasets in public databases & Data loading

B. Quality Control (QC) testing of Raw Assays (filtering out of unsuitable defective Assays)

C. GCRMA Processing of QC-approved Assays

D. Assays Annotation:

i. Assignment of experimental meta-data values to the Assays

ii. Assignment of experimental Assays Groups and their Comparisons

E. Statistical analysis of defined Comparisons:

i. Differential expression testing

ii. Calculation of Fold Changes

iii. Functional Descriptors calculation

[Workflow diagram: steps (A) through (E). Optional: cutoffs]
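
Step E of the procedure above can be illustrated with a minimal Python sketch. It assumes the expression values are already on a log2 scale (as GCRMA output typically is), takes the log2 fold change as the difference of group means, and uses Welch's t statistic as a stand-in for the differential expression test. The slides do not say which test the repository applies, so the choice of statistic, the function name, and the sample values are all assumptions made for illustration.

```python
from math import sqrt
from statistics import mean, variance


def compare_groups(case, control):
    """Return (log2 fold change, Welch t statistic) for one probe.

    Both inputs are lists of log2-scale expression values.
    """
    log2_fc = mean(case) - mean(control)
    # Welch's statistic: difference of means over its standard error,
    # without assuming equal variances in the two groups.
    se = sqrt(variance(case) / len(case) + variance(control) / len(control))
    return log2_fc, log2_fc / se


# Hypothetical log2 intensities for one probe in two assay groups.
case = [8.1, 8.4, 8.0, 8.3]
control = [6.9, 7.2, 7.0, 7.1]
log2_fc, t_stat = compare_groups(case, control)
print(f"log2 FC = {log2_fc:.2f}, t = {t_stat:.2f}")
```

An optional cutoff, as mentioned on the slide, could then be applied to either returned value.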

9

MICROARRAY REPOSITORY: QC PROCEDURE

• Datasets undergo rigorous quality control during processing

• An assay is removed from the dataset if it is identified as an outlier by the majority of QC metrics

• Users are able to see which tests the datasets passed/failed
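
The majority rule above can be sketched as a simple vote over per-metric flags. This is only an illustration: the metric names are placeholders, "majority" is read here as a strict majority, and the slides do not list the actual QC metrics used.

```python
def is_outlier(flags):
    """True if a strict majority of QC metrics flag the assay."""
    return sum(flags.values()) > len(flags) / 2


def filter_assays(qc_results):
    """Split assays into kept and removed lists, preserving input order."""
    kept, removed = [], []
    for assay_id, flags in qc_results.items():
        (removed if is_outlier(flags) else kept).append(assay_id)
    return kept, removed


# Hypothetical QC outcomes: True means the metric flagged the assay as an outlier.
qc_results = {
    "assay_01": {"metric_a": False, "metric_b": False, "metric_c": True},
    "assay_02": {"metric_a": True, "metric_b": True, "metric_c": False},
}
kept, removed = filter_assays(qc_results)
print(kept, removed)
```

Keeping the per-metric flags alongside the decision is what would let users see which tests each assay passed or failed.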

MICROARRAY REPOSITORY: ANNOTATION PROCEDURE

10

METACORE

Additional manual annotation of datasets increases granularity & numbers of groups and comparisons

11

Finding Information on Variant and Mutation

USE CASE 2

12

How does 17p13 deletion correlate with thalidomide response in chronic lymphocytic leukemia (CLL) patients?

WBC Reduction at Day 7

No aberration vs. 17p13 deletion

13

What other diseases or drugs is 17p13 deletion associated with?

14

GENE VARIANT RECORD

15

GENE VARIANT ASSOCIATIONS

GENE VARIANT API

16

Significance of genotype-phenotype relationships across the translational pipeline

[Diagram labels: DISEASE, DRUG, VARIANT; DISEASE, VARIANT]

IDENTIFY ACTIONABLE GENE VARIANTS

A. Establish variant significance

B. Characterize the variant

C. Assess the utility of the variant:

• Understanding Disease Mechanism

• Treatment & Response

DISEASE RECORD

• Disease Profiling

• Diagnosis

• Prognosis

• Screening, Risk

RESPONSE RECORD

• Predicting Efficacy / Toxicity

• Monitoring Efficacy / Toxicity

• Selection for Therapy

• Resistance

DISCOVERY, VALIDATION, APPLICATION

• Preclinical in vitro & animal studies

• Clinical studies in patient segments

• FDA approvals

• Clinical guidelines

• HTP Studies

• Candidate Studies

MANUALLY CURATED CONTENT FROM A RANGE OF SOURCES

GENE VARIANT API: PROCEDURE

17

SOURCES

Conference abstracts, patents, peer-reviewed journal articles, clinical trial registries, clinical guidelines, and authority approval documents (e.g. FDA)

SOURCE SELECTION VS. REJECTION

• Retrospective selection or prospective screening for frontfile. Items are screened by a text-mining tool to identify and remove items that have no relevance to the Gene Variant API.

• All articles not removed are sent to manual selection by trained annotators who follow the policy.

SELECTION CRITERIA BY THE ANALYST

• A clear study design and valuable results are required by the analyst. The item must satisfy the requirements of evidence-based medicine in order to be taken into consideration.

• Statistics and/or a statement by the author on the variant's effect on health are required. If both components are absent, the item is rejected.
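
The analyst's rejection rule above can be written as a small predicate. This is a hypothetical sketch, not the curation tooling itself: the `Item` fields are invented for illustration, and the "valuable results" and evidence-based-medicine judgments are reduced to a single boolean for study design.

```python
from dataclasses import dataclass


@dataclass
class Item:
    """Hypothetical record of what an annotator found in a source item."""
    has_clear_design: bool
    has_statistics: bool
    has_author_statement: bool


def passes_analyst_criteria(item):
    # An item is rejected when both statistics and an author statement
    # on the variant's effect are missing; either one is sufficient.
    has_evidence = item.has_statistics or item.has_author_statement
    return item.has_clear_design and has_evidence


print(passes_analyst_criteria(Item(True, False, True)))
print(passes_analyst_criteria(Item(True, False, False)))
```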

18

Biological Interpretation

USE CASE 3

19

INVASIVE BREAST CANCER STUDY

Find predictors for treatment response

20

RCB 0/I RCB III

MARKER SELECTION WORKFLOW

21

• NFIB: nuclear factor IB type, a potential biomarker in breast neoplasms

• STK24: induction of apoptosis

• ESR1: Estrogen receptor 1

• PGR: Progesterone receptor

• CDKN2A: Cyclin-dependent kinase inhibitor 2A

• Geminin: DNA replication inhibitor

• MCM2/5: a regulatory subunit inhibiting the helicase complex

• CDC45L: Cell cycle control protein

• MSH6 and MSH2: MutS homologues, proteins involved in DNA repair

• RAD50: DNA repair protein (homologous recombination-dependent repair)

• DNMT1: DNA methylation enzyme

• EZH2: Histone methylation enzyme

• HDAC2: Histone deacetylase

Significant signaling pathways enriched with differentially expressed genes in Responders vs. Non-responders

• Estrogen/Progesterone signaling

• Cell Cycle regulation

• DNA damage repair

• Epigenetic regulation of gene expression

ENRICHMENT ANALYSIS
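
Pathway enrichment of a differentially expressed gene list is commonly scored with a hypergeometric (one-sided Fisher) test. The sketch below shows that calculation under stated assumptions. The slides do not specify which statistic the enrichment tool uses, and the gene counts are made up for illustration.

```python
from math import comb


def enrichment_p_value(universe, pathway, selected, overlap):
    """P(X >= overlap) when `selected` genes are drawn without replacement
    from `universe` genes, of which `pathway` belong to the pathway."""
    upper = min(pathway, selected)
    tail = sum(
        comb(pathway, k) * comb(universe - pathway, selected - k)
        for k in range(overlap, upper + 1)
    )
    return tail / comb(universe, selected)


# Hypothetical counts: 20,000 genes measured, 100 in the pathway,
# 300 differentially expressed, 8 of those in the pathway.
p = enrichment_p_value(20000, 100, 300, 8)
print(f"p = {p:.3g}")
```

A small p-value here means the pathway contains more differentially expressed genes than chance alone would predict. When many pathways are tested, a multiple-testing correction would normally follow.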

22

PATHWAY MAPS

Brca1 and Brca2 in breast cancer

Cell cycle: Start of DNA replication in early S phase

23

PATHWAY MAPS

Epigenetic alterations in ovarian cancer

PR action in breast cancer: stimulation of cell growth and proliferation

24

Implementing Classification Model

USE CASE 4

25

PATIENT STRATIFICATION MODEL

IMPLEMENTING MODEL IN tranSMART

Sample Molecular Subtype

Associated clinical phenotype

Model

Biomarkers

Stratification rule

Mechanism

Drug target

Standard tranSMART

MetaCore Cytoscape

Pathways

Subnetworks

Additional functionality in tranSMART

Applicable for both One Mind and Orion projects

SYSTEMS BIOLOGY TOOLS

Network/Pathway-based Approach

OMICs data + other data types including clinical response

Statistical Approach

• Drug Targets

• Drug Repositioning

• Biomarker Identification

• Biological Mechanism Reconstruction

• Drug Combinations

• Prognostic Biomarkers

• Predictive Biomarkers

SYSTEMS BIOLOGY TOOL LIBRARY

State of the art methods

METABASE

The most comprehensive data available

EXAMPLES OF NETWORK APPROACHES

• Systems Biology (Zhang et al., 2011)

• Probabilistic Inference (Su et al., 2009)

• Pathway Activity (Lee et al., 2007)

• RRFE (Johannes et al., 2010)

• Subnetworks (Chuang et al., 2012)

• Pathway Based (Kim et al., 2012)

29

FEEDBACK FROM tranSMART USERS

30

STATISTICAL ANALYSIS QUESTIONS

• How to control Type 1 error rate?

– Testing set vs. Validation set

• How to perform longitudinal analysis?

– Regression models

• How to identify covariance variables? Which variable has the highest correlation with the outcome?

– Multivariate analysis

• Other analysis methods/workflows
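
The testing set vs. validation set idea raised in the first question above can be sketched as a reproducible split of subjects, so that a finding generated in one set is confirmed in the other. The function name, the 30% hold-out fraction, and the subject IDs are assumptions for illustration; nothing here reflects an existing tranSMART feature.

```python
import random


def split_subjects(subject_ids, validation_fraction=0.3, seed=0):
    """Shuffle subject IDs reproducibly and hold out a validation set."""
    ids = sorted(subject_ids)
    random.Random(seed).shuffle(ids)
    n_validation = round(len(ids) * validation_fraction)
    # Returns (testing set, validation set); the two never overlap.
    return ids[n_validation:], ids[:n_validation]


# Hypothetical subject identifiers.
subjects = [f"S{i:03d}" for i in range(1, 11)]
testing, validation = split_subjects(subjects)
print(len(testing), len(validation))
```

Fixing the seed keeps the split stable across sessions, which matters if the validation set is to stay untouched during exploratory analysis.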

31

SCIENTIFIC QUESTIONS

• How to set a QC framework around uploaded data? A community-developed QC standard?

• How to do cross-study analysis (more easily)?

• How to do cross-species analysis?

• How does the community report these (and bugs)?

32

FEATURE WISH LIST

• Multiple improvements to R advanced workflows

• Sending results (gene list, patient subsets) from an advanced workflow back to summary statistics

• Saving a workflow (history/output)

• Using gene expression data to create subsets

• Viewing specific subject records

• Adding data types (e.g. date, longitudinal measurement)

• Improving exported tables

and many more ….
