transmart presentation
TRANSCRIPT
tranSMART v1.2 Case Study
for PredicTox
April 2015
Agenda
• What is PredicTox?
• Brief tranSMART overview
• Answering scientific questions with tranSMART’s help: a case study maximizing data value
• Questions?
A public-private partnership, led by the Reagan-Udall Foundation for the FDA, with the goals of:
• Applying systems-based approaches to better understand adverse events (AEs)
• Developing predictive models
Pilot project: one drug class and AE
• Use tranSMART as the platform for integration of clinical, preclinical and molecular data
PredicTox
tranSMART Data Warehouse Structure
Szalma S.; Koka, VC.; Khasanova, T.; Perakslis, E. Effective knowledge management in translational medicine. Journal of Translational Medicine 2010, 8:68
Analytical and visualization tools
Security / Access (enterprise vs. project level)
Patient Privacy
Diverse Data
Warehouse structure
PredicTox
tranSMART Data for PredicTox (so far)
• 18 gene expression data sets from GEO
• Human white blood cell data related to left ventricular dysfunction and to the drugs imatinib, sunitinib, and trastuzumab
• Preclinical studies with gene expression data from heart tissue of rats dosed with imatinib
• These datasets may provide confirmatory gene expression profiles that differentiate left ventricular dysfunction from other cardiac disease
• Information gleaned from these data may provide mechanistic insight into the cardiotoxicity of tyrosine kinase inhibitors
GSE21125: Blood Signature of Pre-heart Failure: A Microarray Study (Smih et al)
• Human white blood cells from healthy subjects, heart failure risk patients, asymptomatic left ventricular dysfunction (ALVD) patients, chronic heart failure patients, and acute heart failure patients
• Platform: RNG-MRC_HU25k_NICE
• PLoS ONE 6(6): e20414. doi:10.1371/journal.pone.0020414
GSE2535: In chronic myeloid leukemia white cells from cytogenetic responders and non-responders to imatinib have very similar gene expression signatures (Crossman et al)
• Analysis of peripheral blood and bone marrow of chronic myelogenous leukemia (CML) patients prior to imatinib (Gleevec) treatment. This study attempts to determine the transcriptional signature of imatinib non-responders.
• Platform: Affymetrix Human Genome U95 Version 2 Array
• Haematologica 2005; 90:459-464
Case Study Data
Analytical Rationale
• Imatinib is the first targeted therapy used to treat Philadelphia chromosome-positive CML
• It targets and inhibits the catalytic activity of the constitutively active tyrosine kinase Bcr-Abl
• It is also associated with reduced left ventricular ejection volume, indicative of left ventricular dysfunction
• With the data loaded into tranSMART, we investigated the correlation between gene expression signatures of patients with ALVD and those associated with imatinib response
  • Do these profiles have overlapping genes?
  • What are the functions of the overlapping genes?
  • Do these profiles show effects on similar pathways?
  • Can we apply the signature of one data set to cluster gene expression profiles from the other?
Case Study
Analysis Workflow
• Marker Selection Analyses: Gene Lists
• Investigate the Role of Shared Genes: Biomarkers?
• Pathway Enrichment Analysis: Mechanistic Similarities?
• Clustering Analysis: Hierarchical clustering using swapped gene list
Case Study
GSE2535 Marker Selection
Marker Selection calculates the genes that best differentiate two groups of samples within a dataset (here, imatinib responders vs. non-responders)
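The idea behind a marker selection step can be sketched as follows. This is a minimal illustration, not tranSMART's own Marker Selection workflow, whose algorithm the slides do not describe. It assumes a per-gene Welch t-statistic as the ranking score; the function names, the data layout, and the toy expression values are all hypothetical.

```python
import math
import statistics


def welch_t(a, b):
    """Welch's t-statistic for two independent groups of values."""
    se = math.sqrt(statistics.variance(a) / len(a) + statistics.variance(b) / len(b))
    return (statistics.mean(a) - statistics.mean(b)) / se


def rank_markers(expression, group_a, group_b, top_n=3):
    """Rank genes by |t| between two sample groups.

    expression: dict of gene -> dict of sample -> value (assumed layout).
    Returns the top_n (gene, t) pairs, most differentiating first.
    """
    scores = {}
    for gene, values in expression.items():
        a = [values[s] for s in group_a]
        b = [values[s] for s in group_b]
        scores[gene] = welch_t(a, b)
    ranked = sorted(scores, key=lambda g: abs(scores[g]), reverse=True)
    return [(g, scores[g]) for g in ranked[:top_n]]


# Toy values for illustration only; not taken from GSE2535.
expression = {
    "GENE_A": {"r1": 8.1, "r2": 7.9, "r3": 8.3, "n1": 5.0, "n2": 5.4, "n3": 4.9},
    "GENE_B": {"r1": 6.0, "r2": 6.2, "r3": 5.9, "n1": 6.1, "n2": 6.0, "n3": 6.2},
    "GENE_C": {"r1": 3.0, "r2": 3.4, "r3": 3.1, "n1": 4.6, "n2": 4.9, "n3": 4.4},
}
responders = ["r1", "r2", "r3"]
non_responders = ["n1", "n2", "n3"]

top = rank_markers(expression, responders, non_responders, top_n=2)
for gene, t in top:
    print(f"{gene}: t = {t:.2f}")
```

A positive t here means higher expression in the first group (responders in the toy data), a negative t the reverse.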
Case Study
Gene List Creation
• Gene lists gathered using the Marker Selection workflow in tranSMART were edited to remove control genes, repeats, unrecognized loci and ORFs
• The GSE21125 Marker Selection list, comparing healthy controls with ALVD patients, yielded 54 recognizable gene symbols when loaded into tranSMART
• The GSE2535 Marker Selection list, comparing responders vs. non-responders, yielded 92 recognizable gene symbols when loaded into tranSMART
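The list clean-up described above could be approximated along these lines. This is a sketch under stated assumptions: the slides do not say how the editing was done, and the three regular expressions (Affymetrix `AFFX` control probes, `LOC`-style unnamed loci, `CnorfN`-style open reading frames) are illustrative guesses at what "control genes, unrecognized loci and ORFs" look like in practice. The input list is a toy example.

```python
import re

# Assumed naming patterns, for illustration only.
CONTROL = re.compile(r"^AFFX")                           # array control probes
UNNAMED_LOCUS = re.compile(r"^LOC\d+$")                  # unrecognized loci
ORF = re.compile(r"^C(\d+|X|Y)orf\d+$", re.IGNORECASE)   # open reading frames


def clean_gene_list(symbols):
    """Drop blanks, controls, unnamed loci, ORFs and repeats, keeping order."""
    seen = set()
    cleaned = []
    for raw in symbols:
        symbol = raw.strip()
        if not symbol:
            continue
        if CONTROL.match(symbol) or UNNAMED_LOCUS.match(symbol) or ORF.match(symbol):
            continue
        key = symbol.upper()
        if key in seen:  # repeat of a symbol already kept
            continue
        seen.add(key)
        cleaned.append(symbol)
    return cleaned


# Toy input; only CACNA2D2 and the AFFX probe name appear in the slides.
raw_list = [
    "CACNA2D2",
    "AFFX-HUMGAPDH/M33197_5_at",
    "LOC100129",
    "C1orf21",
    "CACNA2D2",
    "VEGFA",
    "",
]
cleaned = clean_gene_list(raw_list)
print(cleaned)
```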
Case Study
Venn Diagram GSE21125 and GSE2535
• Compared gene lists from GSE21125 and GSE2535
• Shared gene: CACNA2D2
CACNA2D2: voltage-dependent calcium channel
• Homozygous mutation is associated with epileptic encephalopathy
• Null mutants in mice display seizures, cardiac abnormalities and premature death
• Known to promote tumorigenesis; overexpression is associated with increased cell proliferation (Oncogene. 2015 Jan 26. doi: 10.1038/onc.2014.467)
• A PubMed search for CACNA2D2 and left ventricular dysfunction yielded no results
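The Venn comparison amounts to a set intersection of the two gene lists. A minimal sketch, assuming gene symbols are compared case-insensitively; the function name is hypothetical and every symbol other than CACNA2D2 is a placeholder rather than a gene from the actual lists.

```python
def compare_gene_lists(list_a, list_b):
    """Return the three Venn regions for two gene lists (case-insensitive)."""
    a = {g.upper() for g in list_a}
    b = {g.upper() for g in list_b}
    return {
        "only_a": sorted(a - b),
        "shared": sorted(a & b),
        "only_b": sorted(b - a),
    }


# Placeholder lists; the real lists held 54 and 92 symbols.
alvd_list = ["CACNA2D2", "GENE_A1", "GENE_A2"]
imatinib_list = ["cacna2d2", "GENE_B1"]

regions = compare_gene_lists(alvd_list, imatinib_list)
print(regions["shared"])
```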
Case Study
CACNA2D2 Expression
GSE21125
• CACNA2D2 down-regulated in patients with ALVD (p = 6.41 x 10^-6)
GSE2535
• CACNA2D2 slightly down-regulated in patients unresponsive to imatinib treatment (p = 0.011)
• Gene expression data for CACNA2D2 show a statistically significant difference in these comparisons
Case Study
CACNA2D2 in GSE21125
• CACNA2D2 seems to highly differentiate the investigated pathologies (pairwise t-test)
• Not part of the blood gene expression signature in Smih et al
Case Study
Pathway Enrichment Analysis
• GSE21125 (Smih et al)
• GSE2535 (Crossman et al)
• Significance threshold: -log10(0.05) ≈ 1.3
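The 1.3 threshold is simply p = 0.05 on a -log10 scale. The sketch below shows that conversion together with one common way to obtain an enrichment p-value, a hypergeometric over-representation test. The slides do not state which test was used, so the test itself is an assumption, as are the function names, the gene universe of 20,000, the pathway size of 100, and the overlap of 3; only the list size of 54 comes from the slides.

```python
import math


def hypergeom_enrichment_p(population, pathway_size, list_size, overlap):
    """P(X >= overlap) when list_size genes are drawn from population genes,
    pathway_size of which belong to the pathway."""
    upper = min(pathway_size, list_size)
    total = math.comb(population, list_size)
    tail = sum(
        math.comb(pathway_size, k) * math.comb(population - pathway_size, list_size - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


def neg_log10(p):
    """Score plotted on the enrichment axis."""
    return -math.log10(p)


THRESHOLD = neg_log10(0.05)  # the significance line at about 1.3

# Hypothetical pathway: 100 of 20,000 genes, 3 of them in a 54-gene list.
p = hypergeom_enrichment_p(population=20000, pathway_size=100, list_size=54, overlap=3)
score = neg_log10(p)
is_significant = score > THRESHOLD
print(f"threshold = {THRESHOLD:.3f}, score = {score:.2f}, significant = {is_significant}")
```

Pathways whose score rises above the threshold line correspond to p < 0.05 before any multiple-testing correction.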
Case Study
GSE21125 Hierarchical Clustering with GSE2535 Marker Selection List
Clustering based on the GSE2535 gene list shows separation of Control profiles but does not effectively differentiate ALVD patients.
Case Study
GSE2535 Hierarchical Clustering with GSE21125 Marker Selection List
Applying the GSE21125 Marker Selection list does not distinguish imatinib responders from non-responders
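The swapped-list clustering in the two slides above can be sketched as: restrict one dataset's expression matrix to the genes selected in the other dataset, then cluster the samples hierarchically. This is an illustrative stand-in, not tranSMART's clustering workflow. It assumes Euclidean distance with average linkage, and the function names, sample names, gene names, and values are all hypothetical.

```python
import math


def euclidean(u, v):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))


def sample_profiles(expression, genes):
    """Build per-sample vectors over the listed genes that exist in the data.

    expression: dict of gene -> dict of sample -> value (assumed layout).
    """
    used = [g for g in genes if g in expression]
    samples = sorted(next(iter(expression.values())))
    return {s: [expression[g][s] for g in used] for s in samples}


def average_linkage(profiles, n_clusters=2):
    """Agglomerative clustering: repeatedly merge the two closest clusters."""
    clusters = [[s] for s in profiles]
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                total = sum(
                    euclidean(profiles[a], profiles[b])
                    for a in clusters[i]
                    for b in clusters[j]
                )
                dist = total / (len(clusters[i]) * len(clusters[j]))
                if best is None or dist < best[0]:
                    best = (dist, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [sorted(c) for c in clusters]


# Toy matrix; GENE_NOT_ON_ARRAY mimics a swapped-list gene missing from this platform.
expression = {
    "GENE_X": {"ctrl1": 1.0, "ctrl2": 1.2, "alvd1": 4.0, "alvd2": 4.3},
    "GENE_Y": {"ctrl1": 2.0, "ctrl2": 2.1, "alvd1": 0.5, "alvd2": 0.4},
    "GENE_Z": {"ctrl1": 3.0, "ctrl2": 0.1, "alvd1": 2.9, "alvd2": 0.2},
}
swapped_list = ["GENE_X", "GENE_Y", "GENE_NOT_ON_ARRAY"]

profiles = sample_profiles(expression, swapped_list)
clusters = average_linkage(profiles, n_clusters=2)
print(clusters)
```

In the toy data the borrowed list separates the groups cleanly; the slides report the opposite outcome for the real datasets, where the swapped lists separated samples poorly.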
Case Study
Further inspection of GSE2535
[Figure: heat maps of GSE2535 samples and probe sets for Cluster 1 (n=6877) and Cluster 2 (n=5748). Color scale: -4.00 to 4.00. Sample annotations: Column13 (Non-responder / Responder) and Column15 (Leipzig / Mannheim).]
Batch effect
Case Study
Summary
• Marker Selection analysis of GEO data sets GSE21125 and GSE2535 in tranSMART yielded gene lists of 54 and 92 genes, respectively
• These lists had one gene in common: CACNA2D2, a voltage-dependent calcium channel
  • Down-regulated in GSE21125 in patients with ALVD; distinguishes ALVD from the other pathologies studied
  • Down-regulated in non-responders to imatinib
  • Null mutants in mice display seizures, cardiac abnormalities and premature death
Case Study
Summary
• Pathway enrichment analysis
  • GSE21125: pathways involved in cell adhesion and cytoskeleton remodeling
  • GSE2535: pathways involved in VEGF signaling and ESR1 activation
  • The datasets share one pathway, “Cytoskeleton remodeling Role of PKA in cytoskeleton reorganization”, the significance of which remains to be investigated
• Hierarchical clustering analysis shows poor performance with swapped gene lists
Case Study
How to Get Involved
• Looking for partners: data, expertise, funding, and other resources
• Steering Committee and work groups forming soon; stay tuned for updates
• No membership fee; funding is raised from a variety of public and private sources
• For more info: contact [email protected] or see her after the talk
Extra Slides
Lessons learned
• The more attributes for the samples the better
• The more data the better!
• Need same tissue, same species, similar treatments, and similar measurements!
Case Study
Current Project Activities
• Securing data sharing agreements with pharma companies
• Gathering publicly available data
• Building the ontology of Cardiac Adverse Events
• Establishing the project governance structure
• Building out the tranSMART instance
• Rancho developed a use case using GEO data sets to demonstrate utility…
Pilot project: develop a centralized knowledge base that includes publicly available clinical and molecular data on tyrosine kinase inhibitors (TKIs), mAbs and cardiac AEs, specifically left ventricular dysfunction.
• Data goes into the tranSMART infrastructure
• Integrated knowledge base
• Mine information on biomarkers, non-clinical and clinical screens
• Assist in hypothesis generation and mechanistic-level understanding
PredicTox
Concept of tranSMART on data level
tranSMART
Examples of Data Stored in tranSMART
• Data from clinical trials
  • Demographics, medical history
  • Treatment information
  • Clinical outcomes, including AEs
  • OMICs-type data (gene expression, proteomics, RBM, SNPs)
• Pre-clinical studies
  • PK/PD data
  • OMICs-type data for animal models and cell lines
  • Toxicology data
Warehouse structure
Concept of tranSMART
Discovery Group
Preclinical Group
Clinical Development
tranSMART
Cytoskeleton remodeling Role of PKA in cytoskeleton reorganization
Case Study
Conclusions
• We went over tranSMART and explored the main functionality of the platform
• We used the platform to answer a scientific question
• Great data sets from GEO were curated and loaded into this tranSMART instance; they serve as a starting point and will provide useful comparisons for new, exciting data that are yet to come
• We need more data!
Case Study
Marker Selection vs. Signature from Smih et al.
• Tested the performance of the blood signature from the supporting reference and the Marker Selection gene list
• Performed hierarchical clustering analysis using:
  • the “Marker selection list” generated by comparing gene expression from healthy controls and ALVD patients
  • the 7-gene list from the paper
Case Study
GSE21125 Clustering with Marker Selection List
• The marker selection list was able to differentiate patient samples based on pathology
Case Study
Clustering with Smih et al Gene List
The list from Smih et al. performs well at clustering samples from patients with ALVD
Case Study