transmart community meeting 5-7 nov 13 - session 5: advancing transmart analytical capabilities with...
Post on 19-Jan-2015
ADVANCING TRANSMART ANALYTICAL CAPABILITIES WITH KNOWLEDGE CONTENT
tranSMART Community Meeting
Sirimon O’Charoen
sirimon.ocharoen@thomsonreuters.com
November 6th, 2013
4 EXAMPLES OF tranSMART USE CASES
• Use case 1: Leveraging public datasets
• Use case 2: Finding information on variants and mutations
• Use case 3: Biological interpretation
• Use case 4: Implementing a classification model
• Feedback from tranSMART users
Leveraging Public Datasets
USE CASE 1
Where else is the IL-33 gene significantly expressed?
Up-regulated in an asthma study
Which other genes are significantly expressed in ulcerative colitis?
REG1A
SLC6A14
MICROARRAY REPOSITORY: PROCESSING PROCEDURE
A. Search for datasets in public databases and data loading
B. Quality control (QC) testing of raw assays (filtering out unsuitable or defective assays)
C. GCRMA processing of QC-approved assays
D. Assay annotation:
   i. Assignment of experimental metadata values to the assays
   ii. Assignment of experimental assay groups and their comparisons
E. Statistical analysis of defined comparisons:
   i. Differential expression testing
   ii. Calculation of fold changes
   iii. Functional descriptors calculation
Optional: cutoffs
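Step E can be sketched as follows. This is a minimal illustration, not the repository's actual pipeline: it assumes expression values are already on the log2 scale (as GCRMA output normally is), uses Welch's t statistic as a stand-in for the unspecified differential expression test, and treats the optional cutoff as a minimum absolute log2 fold change. The gene names, assay IDs, and function names are hypothetical.

```python
import math
import statistics


def log2_fold_change(group_a, group_b):
    """Difference of mean log2 intensities (group_a minus group_b)."""
    return statistics.mean(group_a) - statistics.mean(group_b)


def welch_t(group_a, group_b):
    """Welch's t statistic for two groups with possibly unequal variances."""
    se = math.sqrt(
        statistics.variance(group_a) / len(group_a)
        + statistics.variance(group_b) / len(group_b)
    )
    return (statistics.mean(group_a) - statistics.mean(group_b)) / se


def compare(expression, group_a, group_b, min_abs_log2fc=0.0):
    """expression: {gene: {assay_id: log2 value}}.

    Returns (gene, log2 fold change, t statistic) rows that pass the
    optional fold-change cutoff, largest absolute change first.
    """
    rows = []
    for gene, values in expression.items():
        a = [values[s] for s in group_a]
        b = [values[s] for s in group_b]
        lfc = log2_fold_change(a, b)
        if abs(lfc) >= min_abs_log2fc:
            rows.append((gene, lfc, welch_t(a, b)))
    return sorted(rows, key=lambda r: abs(r[1]), reverse=True)


# Hypothetical toy data: three disease assays (d*) and three controls (c*).
expression = {
    "GENE_X": {"d1": 9.0, "d2": 9.4, "d3": 9.2, "c1": 7.1, "c2": 7.3, "c3": 6.9},
    "GENE_Y": {"d1": 5.0, "d2": 5.2, "d3": 5.1, "c1": 5.1, "c2": 5.0, "c3": 5.2},
}
results = compare(expression, ["d1", "d2", "d3"], ["c1", "c2", "c3"], min_abs_log2fc=1.0)
```

With the cutoff set to 1.0 (a two-fold change), only GENE_X is retained in this toy comparison.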
MICROARRAY REPOSITORY: QC PROCEDURE
• Datasets undergo rigorous quality control during processing
• An assay is removed from the dataset if it is identified as an outlier by the majority of QC metrics
• Users are able to see which tests the datasets passed/failed
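The majority rule above can be sketched in a few lines. The slides do not name the QC metrics, so the metric names below are placeholders, and "majority" is assumed to mean strictly more than half of the metrics.

```python
def failed_majority(flags):
    """True if strictly more than half of the QC metrics flag the assay as an outlier."""
    return sum(flags.values()) > len(flags) / 2


def filter_assays(qc_flags):
    """qc_flags: {assay_id: {metric_name: is_outlier}}.

    Returns (kept, removed) lists of assay IDs. The per-metric flags stay
    available so users can see which tests each assay passed or failed.
    """
    kept, removed = [], []
    for assay_id, flags in qc_flags.items():
        (removed if failed_majority(flags) else kept).append(assay_id)
    return kept, removed


# Hypothetical metric names and results.
qc_flags = {
    "assay_1": {"metric_a": False, "metric_b": False, "metric_c": True},
    "assay_2": {"metric_a": True, "metric_b": True, "metric_c": False},
}
kept, removed = filter_assays(qc_flags)
```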
MICROARRAY REPOSITORY: ANNOTATION PROCEDURE
METACORE
Additional manual annotation of datasets increases the granularity and the number of groups and comparisons
Finding Information on Variants and Mutations
USE CASE 2
How does 17p13 deletion correlate with thalidomide response in chronic lymphocytic leukemia (CLL) patients?
WBC Reduction at Day 7
No aberration vs. 17p13 deletion
Which other diseases or drugs is 17p13 deletion associated with?
GENE VARIANT RECORD
GENE VARIANT ASSOCIATIONS
GENE VARIANT API
Significance of genotype-phenotype relationships across the translational pipeline
Diagram labels: DISEASE, DRUG, VARIANT; DISEASE, VARIANT
IDENTIFY ACTIONABLE GENE VARIANTS
A. Establish variant significance
B. Characterize the variant
C. Assess the utility of the variant:
• Understanding Disease Mechanism
• Treatment & Response
DISEASE RECORD
• Disease Profiling
• Diagnosis
• Prognosis
• Screening, Risk
RESPONSE RECORD
• Predicting Efficacy / Toxicity
• Monitoring Efficacy / Toxicity
• Selection for Therapy
• Resistance
DISCOVERY VALIDATION APPLICATION
Preclinical in vitro & animal studies
Clinical studies in patient segments
FDA approvals
Clinical guidelines
HTP Studies
Candidate Studies
MANUALLY CURATED CONTENT FROM A RANGE OF SOURCES
GENE VARIANT API: PROCEDURE
SOURCES
Conference abstracts, patents, peer-reviewed journal articles, clinical trial registries, clinical guidelines, and authority approval documents (e.g., FDA)
SOURCE SELECTION VS. REJECTION
• Retrospective selection or prospective screening for frontfile. Items are screened by a text-mining tool to identify and remove those with no relevance to the Gene Variant API.
• All articles not removed are sent for manual selection by trained annotators who follow the policy.
SELECTION CRITERIA BY THE ANALYST
• The analyst requires a clear study design and valuable results. The item must satisfy the requirements of evidence-based medicine to be taken into consideration.
• Statistics and/or a statement by the author on the variant's effect on health are required. If both components are absent, the item is rejected.
Biological Interpretation
USE CASE 3
INVASIVE BREAST CANCER STUDY
Find predictors for treatment response
RCB 0/I RCB III
MARKER SELECTION WORKFLOW
• NFIB – Nuclear factor I B, a potential biomarker in breast neoplasms
• STK24 – Induction of apoptosis
• ESR1 – Estrogen receptor 1
• PGR – Progesterone receptor
• CDKN2A – Cyclin-dependent kinase inhibitor 2A
• Geminin – DNA replication inhibitor
• MCM2/5 – A regulatory subunit inhibiting the helicase complex
• CDC45L – Cell cycle control protein
• MSH6 and MSH2 – MutS homologues, proteins involved in DNA repair
• RAD50 – DNA repair protein (homologous recombination-dependent repair)
• DNMT1 – DNA methylation enzyme
• EZH2 – Histone methylation enzyme
• HDAC2 – Histone deacetylation enzyme
Significant signaling pathways enriched with differentially expressed genes in Responders vs. Non-responders
• Estrogen/Progesterone signaling
• Cell cycle regulation
• DNA damage repair
• Epigenetic regulation of gene expression
ENRICHMENT ANALYSIS
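The slides do not say which statistic underlies the enrichment analysis. A common choice for this kind of pathway over-representation is the one-sided hypergeometric test, sketched below under that assumption. The pathway names and gene IDs are placeholders, not MetaCore content.

```python
from math import comb


def hypergeom_pvalue(universe, pathway_size, selected, overlap):
    """P(X >= overlap) when `selected` genes are drawn without replacement
    from `universe` genes, of which `pathway_size` belong to the pathway."""
    upper = min(pathway_size, selected)
    tail = sum(
        comb(pathway_size, k) * comb(universe - pathway_size, selected - k)
        for k in range(overlap, upper + 1)
    )
    return tail / comb(universe, selected)


def enrich(de_genes, pathways, universe):
    """Rank pathways by over-representation of differentially expressed genes.

    Returns (pathway name, overlap size, p-value) rows, most significant first.
    """
    de = set(de_genes) & universe
    rows = []
    for name, members in pathways.items():
        members = set(members) & universe
        overlap = len(de & members)
        p = hypergeom_pvalue(len(universe), len(members), len(de), overlap)
        rows.append((name, overlap, p))
    return sorted(rows, key=lambda r: r[2])


# Hypothetical toy inputs: a 10-gene universe and two 4-gene pathways.
universe = {f"g{i}" for i in range(1, 11)}
pathways = {
    "pathway_A": {"g1", "g2", "g3", "g4"},
    "pathway_B": {"g7", "g8", "g9", "g10"},
}
ranked = enrich({"g1", "g2", "g5"}, pathways, universe)
```

In practice the p-values would also be corrected for the number of pathways tested; that step is omitted here.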
PATHWAY MAPS
Brca1 and Brca2 in breast cancer
Cell cycle: Start of DNA replication in early S phase
PATHWAY MAPS
Epigenetic alterations in ovarian cancer
PR action in breast cancer: stimulation of cell growth and proliferation
Implementing a Classification Model
USE CASE 4
PATIENT STRATIFICATION MODEL
IMPLEMENTING MODEL IN tranSMART
Sample
Molecular subtype
Associated clinical phenotype
Model
Biomarkers
Stratification rule
Mechanism
Drug target
Standard tranSMART
MetaCore
Cytoscape
Pathways
Subnetworks
Additional functionality in tranSMART
Applicable to both One Mind and Orion projects
SYSTEMS BIOLOGY TOOLS
Network/Pathway-based Approach
OMICs data + other data types including clinical response
Statistical Approach
• Drug Targets
• Drug Repositioning
• Biomarker Identification
• Biological Mechanism Reconstruction
• Drug Combinations
• Prognostic Biomarkers
• Predictive Biomarkers
SYSTEMS BIOLOGY TOOL LIBRARY
State-of-the-art methods
METABASE
The most comprehensive data available
EXAMPLES OF NETWORK APPROACHES
Systems Biology (Zhang et al., 2011)
Probabilistic Inference (Su et al., 2009)
Pathway Activity (Lee et al., 2007)
RRFE (Johannes et al., 2010)
Subnetworks (Chuang et al., 2012)
Pathway Based (Kim et al., 2012)
FEEDBACK FROM tranSMART USERS
STATISTICAL ANALYSIS QUESTIONS
• How to control the Type I error rate?
– Testing set vs. validation set
• How to perform longitudinal analysis?
– Regression models
• How to identify covariates? Which variable has the highest correlation with the outcome?
– Multivariate analysis
• Other analysis methods/workflows
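As a starting point for the correlation question above, the sketch below ranks candidate variables by the absolute value of their Pearson correlation with an outcome. It is an illustration only: the variable names and values are made up, and a univariate ranking like this does not by itself control the Type I error rate, so any top-ranked variable would still need to be checked on a held-out validation set.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def rank_by_correlation(variables, outcome):
    """variables: {name: values}. Returns (name, r) pairs, strongest |r| first."""
    scores = {name: pearson(values, outcome) for name, values in variables.items()}
    return sorted(scores.items(), key=lambda kv: abs(kv[1]), reverse=True)


# Hypothetical measurements for five subjects.
variables = {
    "marker_a": [1, 2, 3, 4, 5],
    "marker_b": [5, 3, 4, 1, 2],
    "marker_c": [2, 2, 3, 2, 3],
}
outcome = [1.1, 1.9, 3.2, 3.9, 5.1]
ranked_variables = rank_by_correlation(variables, outcome)
```

Ranking on the absolute value keeps strongly negative associations (marker_b here) from being overlooked.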
SCIENTIFIC QUESTIONS
• How to set up a QC framework around uploaded data? A community-developed QC standard?
• How to do cross-study analysis (more easily)?
• How to do cross-species analysis?
• How can the community report these (and bugs)?
FEATURE WISH LIST
• Multiple improvements to R advanced workflows
• Sending results (gene list, patient subsets) from an advanced workflow back to summary statistics
• Saving a workflow (history/output)
• Using gene expression data to create subsets
• Viewing specific subject records
• Adding data types (e.g., dates, longitudinal measurements)
• Improving exported tables
and many more ...