expression signatures as biomarkers: solving combinatorial problems with gene networks andrey...
Post on 19-Dec-2015
![Page 1: Expression signatures as biomarkers: solving combinatorial problems with gene networks Andrey Alexeyenko Department of Medical Epidemiology and Biostatistics,](https://reader030.vdocument.in/reader030/viewer/2022032800/56649d2a5503460f949ff67d/html5/thumbnails/1.jpg)
Expression signatures as biomarkers: solving
combinatorial problems with gene networks
Andrey Alexeyenko
Department of Medical Epidemiology and Biostatistics, Karolinska Institute
![Page 2: Expression signatures as biomarkers: solving combinatorial problems with gene networks Andrey Alexeyenko Department of Medical Epidemiology and Biostatistics,](https://reader030.vdocument.in/reader030/viewer/2022032800/56649d2a5503460f949ff67d/html5/thumbnails/2.jpg)
FunCoup is a data integration framework to discover
functional coupling in eukaryotic proteomes with
data from model organisms
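[Figure: are mouse genes A and B functionally coupled (A mouse ? B mouse)? Orthologs are found in human, fly, rat and yeast, and high-throughput evidence from these organisms is used.]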
Andrey Alexeyenko and Erik L.L. Sonnhammer. Global networks of functional coupling in eukaryotes from comprehensive data integration. Genome Research. Published in Advance February 25, 2009
![Page 3: Expression signatures as biomarkers: solving combinatorial problems with gene networks Andrey Alexeyenko Department of Medical Epidemiology and Biostatistics,](https://reader030.vdocument.in/reader030/viewer/2022032800/56649d2a5503460f949ff67d/html5/thumbnails/3.jpg)
FunCoup
• Each piece of data is evaluated
• Data FROM many eukaryotes (7)
• Practical maximum of data sources (>50)
• Predicted networks FOR a number of eukaryotes (10…)
• Organism-specific efficient and robust Bayesian frameworks
• Orthology-based information transfer and phylogenetic profiling
• Networks predicted for different types of functional coupling (metabolic, signaling etc.)
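One common way to combine many evaluated data sources in a Bayesian framework is naive Bayesian integration: each source contributes a log-likelihood ratio, and the ratios are summed. The sketch below assumes that scheme for illustration only; the slide does not spell out the exact formulation, and the prior and the per-source scores are made-up numbers.

```python
import math

def posterior_coupling(prior, log_likelihood_ratios):
    """Posterior probability of functional coupling for one gene pair.

    prior: assumed prior probability that a random pair is coupled.
    log_likelihood_ratios: one natural-log likelihood ratio per data
    source, treated as independent (the naive Bayes assumption).
    """
    log_odds = math.log(prior / (1.0 - prior)) + sum(log_likelihood_ratios)
    return 1.0 / (1.0 + math.exp(-log_odds))

# Hypothetical evidence for one pair: three supporting data sources.
example = posterior_coupling(0.001, [2.0, 1.5, 3.0])
print(round(example, 3))
```

With no evidence the function returns the prior, so every data source only shifts the score according to how informative it was judged to be.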
http://FunCoup.sbc.su.se
![Page 4: Expression signatures as biomarkers: solving combinatorial problems with gene networks Andrey Alexeyenko Department of Medical Epidemiology and Biostatistics,](https://reader030.vdocument.in/reader030/viewer/2022032800/56649d2a5503460f949ff67d/html5/thumbnails/4.jpg)
FunCoup was queried for any links between members of the TGFβ pathway (left blue circle) and frequent members of known cancer pathways (genes belonging to at least 7 out of 18 groups; right blue circle). MAPK1 and MAPK3 belonged to both categories.
TGFβ <-> cancer pathway cross-talk
![Page 5: Expression signatures as biomarkers: solving combinatorial problems with gene networks Andrey Alexeyenko Department of Medical Epidemiology and Biostatistics,](https://reader030.vdocument.in/reader030/viewer/2022032800/56649d2a5503460f949ff67d/html5/thumbnails/5.jpg)
FunCoup: recapitulation of known cancer pathways
Figure 5 from: The Cancer Genome Atlas Research Network. Comprehensive genomic characterization defines human glioblastoma genes and core pathways. Nature. 2008 Sep 4. [Epub ahead of print]
The same genes submitted to FunCoup. No TCGA data were used. Outgoing links are not shown.
![Page 6: Expression signatures as biomarkers: solving combinatorial problems with gene networks Andrey Alexeyenko Department of Medical Epidemiology and Biostatistics,](https://reader030.vdocument.in/reader030/viewer/2022032800/56649d2a5503460f949ff67d/html5/thumbnails/6.jpg)
Single molecular markers are (often) far from perfect. Combinations (signatures) should perform better.
The problem:
How to select optimal combinations?
×
Outcome, optimal treatment, severity/urgency, etc.
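To give a sense of scale for the selection problem, the number of possible signatures can be counted directly. The figures used here (1000 candidate probes, signatures of 20 or 50 probes) are the ones quoted on the procedure slide later in the talk; treating every subset as a candidate is an illustrative assumption.

```python
from math import comb

candidates = 1000  # candidate probes, as on the procedure slide

for n in (20, 50):
    total = comb(candidates, n)  # number of distinct N-probe sets
    # Report only the order of magnitude; the exact integer is huge.
    print(f"N={n}: about 10^{len(str(total)) - 1} possible signatures")
```

Exhaustive testing is out of reach at this scale, which is why some way of narrowing the search is needed.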
![Page 7: Expression signatures as biomarkers: solving combinatorial problems with gene networks Andrey Alexeyenko Department of Medical Epidemiology and Biostatistics,](https://reader030.vdocument.in/reader030/viewer/2022032800/56649d2a5503460f949ff67d/html5/thumbnails/7.jpg)
Biomarker discovery in network context
The idea:
Construct multi-gene predictors with regard to network context
• Reduce the computational complexity
• Make marker sets biologically sound
Accounting for network context is taking either:
a) network neighbors or
b) genes at remote network positions
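The two options can be sketched as a breadth-first search from a seed gene. This is a minimal illustration, not the method used in the talk: the toy network, the gene names A to E and the distance cut-off `min_dist` are all hypothetical.

```python
from collections import deque

def distances(network, seed):
    """Shortest path length (in links) from seed to every reachable gene."""
    dist = {seed: 0}
    queue = deque([seed])
    while queue:
        gene = queue.popleft()
        for nb in network.get(gene, ()):
            if nb not in dist:
                dist[nb] = dist[gene] + 1
                queue.append(nb)
    return dist

def neighbors(network, seed):
    """Option a): genes directly linked to the seed."""
    return {g for g, d in distances(network, seed).items() if d == 1}

def remote(network, seed, min_dist=3):
    """Option b): reachable genes at least min_dist links away."""
    return {g for g, d in distances(network, seed).items() if d >= min_dist}

# Hypothetical undirected network stored as an adjacency list.
toy = {"A": ["B", "E"], "B": ["A", "C"], "C": ["B", "D"],
       "D": ["C"], "E": ["A"]}
print(neighbors(toy, "A"), remote(toy, "A"))
```

Genes that cannot be reached from the seed at all are ignored here; whether they should count as "remote" is a design choice.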
![Page 8: Expression signatures as biomarkers: solving combinatorial problems with gene networks Andrey Alexeyenko Department of Medical Epidemiology and Biostatistics,](https://reader030.vdocument.in/reader030/viewer/2022032800/56649d2a5503460f949ff67d/html5/thumbnails/8.jpg)
“Rotterdam” dataset (Wang et al., 2005): 286 patients
Expression: ~22000 probes
Clinical data:
• Estrogen receptor status: +/–
• Lymph node status: all –
• Relapse: yes/no and time (days)
×
Procedure
Individual probe p-values (~22000):
Estrogen receptor-specific ability to predict relapse
Select most significant probes (1000):
Candidate members for marker signatures
Compile set of probes:
N probes at a time (e.g. N=20 or N=50)
1. Split data: 75% to train, 25% to test.
2. Produce a linear regression equation (weight terms step-wise, reward for performance, penalize for complexity) on the train subset.
3. Apply the equation to the test set to predict outcome (relapse yes/no).
4. Record the specificity/sensitivity (Type I/II error rates) as a ROC curve.
Repeat m times.
$\mathrm{RELAPSE} = \gamma_1 g_1 + \gamma_2 g_2 + \gamma_3 g_3 + \dots + \gamma_N g_N$
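The four steps can be sketched in plain Python. Several things here are assumptions: the data are synthetic (200 hypothetical patients, 5 probes), the weights come from ordinary least squares fitted by gradient descent rather than the step-wise, complexity-penalized fit named in step 2, and the ROC curve is summarized by its area.

```python
import random

def auc(scores, labels):
    """Area under the ROC curve: chance that a relapse case outscores
    a non-relapse case (ties count as one half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def fit_linear(X, y, steps=200, lr=0.05):
    """Least-squares weights and intercept by gradient descent."""
    n = len(X)
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(steps):
        errs = [b + sum(wi * xi for wi, xi in zip(w, row)) - t
                for row, t in zip(X, y)]
        b -= lr * sum(errs) / n
        w = [wi - lr * sum(e * row[j] for e, row in zip(errs, X)) / n
             for j, wi in enumerate(w)]
    return w, b

def run_procedure(X, y, m=10, seed=0):
    """Repeat m times: 75/25 split, fit on train, score test, record AUC."""
    rng = random.Random(seed)
    idx = list(range(len(X)))
    aucs = []
    for _ in range(m):
        rng.shuffle(idx)
        cut = int(0.75 * len(idx))
        train, test = idx[:cut], idx[cut:]
        w, b = fit_linear([X[i] for i in train], [y[i] for i in train])
        scores = [b + sum(wi * xi for wi, xi in zip(w, X[i])) for i in test]
        aucs.append(auc(scores, [y[i] for i in test]))
    return aucs

# Synthetic data: only the first two probes carry signal.
rng = random.Random(1)
X = [[rng.gauss(0, 1) for _ in range(5)] for _ in range(200)]
y = [1 if row[0] - row[1] + rng.gauss(0, 1) > 0 else 0 for row in X]
aucs = run_procedure(X, y)
print(f"mean AUC over {len(aucs)} splits: {sum(aucs) / len(aucs):.2f}")
```

Repeating the split matters because a single 25% test set gives a noisy estimate; the spread of the recorded values is what a histogram of prognosis quality would show.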
![Page 9: Expression signatures as biomarkers: solving combinatorial problems with gene networks Andrey Alexeyenko Department of Medical Epidemiology and Biostatistics,](https://reader030.vdocument.in/reader030/viewer/2022032800/56649d2a5503460f949ff67d/html5/thumbnails/9.jpg)
Procedure (continued)
Ways to compile the set of probes:
• Test X randomly retrieved sets
• Take the best ones
• Account for the network context
![Page 10: Expression signatures as biomarkers: solving combinatorial problems with gene networks Andrey Alexeyenko Department of Medical Epidemiology and Biostatistics,](https://reader030.vdocument.in/reader030/viewer/2022032800/56649d2a5503460f949ff67d/html5/thumbnails/10.jpg)
Candidate signature in the network
Biomarker candidates
![Page 11: Expression signatures as biomarkers: solving combinatorial problems with gene networks Andrey Alexeyenko Department of Medical Epidemiology and Biostatistics,](https://reader030.vdocument.in/reader030/viewer/2022032800/56649d2a5503460f949ff67d/html5/thumbnails/11.jpg)
Ready signature in the network
$\mathrm{RELAPSE} = \gamma_1\,\mathrm{EIF3S9} + \gamma_2\,\mathrm{CRHR1} + \gamma_3\,\mathrm{LYN} + \dots + \gamma_N\,\mathrm{KCNA5}$
![Page 12: Expression signatures as biomarkers: solving combinatorial problems with gene networks Andrey Alexeyenko Department of Medical Epidemiology and Biostatistics,](https://reader030.vdocument.in/reader030/viewer/2022032800/56649d2a5503460f949ff67d/html5/thumbnails/12.jpg)
Testing “top”, “free”, and “network” approaches
[Figure: two histograms, one for estrogen receptor-positive and one for estrogen receptor-negative patients. X axis: quality of prognosis relapse/no relapse (area under ROC curve), 90% to 97% for ER-positive and 93% to 99% for ER-negative. Y axis: frequency. Series: "netw" and "free"; "Top" is marked in each panel.]
![Page 13: Expression signatures as biomarkers: solving combinatorial problems with gene networks Andrey Alexeyenko Department of Medical Epidemiology and Biostatistics,](https://reader030.vdocument.in/reader030/viewer/2022032800/56649d2a5503460f949ff67d/html5/thumbnails/13.jpg)
Signature involves genes mutated in cancer
![Page 14: Expression signatures as biomarkers: solving combinatorial problems with gene networks Andrey Alexeyenko Department of Medical Epidemiology and Biostatistics,](https://reader030.vdocument.in/reader030/viewer/2022032800/56649d2a5503460f949ff67d/html5/thumbnails/14.jpg)
Tumour tcga-02-0114-01a-01w
Cancer individuality: each tumor is unique in its molecular state and set of
mutated/disordered genes
![Page 15: Expression signatures as biomarkers: solving combinatorial problems with gene networks Andrey Alexeyenko Department of Medical Epidemiology and Biostatistics,](https://reader030.vdocument.in/reader030/viewer/2022032800/56649d2a5503460f949ff67d/html5/thumbnails/15.jpg)
Partial correlations: a way to get rid of spurious links
[Figure: three genes linked pairwise, with correlations 0.7, 0.6 and 0.4]
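A first-order partial correlation shows how a link can turn out to be spurious. The sketch below uses the three values from the figure, but which value belongs to which pair is an assumption made for illustration: X and Z correlate at 0.7, Y and Z at 0.6, and X and Y at 0.4.

```python
import math

def partial_corr(r_xy, r_xz, r_yz):
    """Correlation between X and Y after removing the effect of Z."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz**2) * (1 - r_yz**2))

# Assumed assignment of the figure's values to gene pairs.
r_xz, r_yz, r_xy = 0.7, 0.6, 0.4

weak_link = partial_corr(r_xy, r_xz, r_yz)    # X-Y link, controlling for Z
strong_link = partial_corr(r_xz, r_xy, r_yz)  # X-Z link, controlling for Y
print(round(weak_link, 3), round(strong_link, 3))
```

Under this assignment the 0.4 link drops to roughly zero once Z is accounted for, while the 0.7 link stays substantial, so only the former would be discarded as spurious.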
![Page 16: Expression signatures as biomarkers: solving combinatorial problems with gene networks Andrey Alexeyenko Department of Medical Epidemiology and Biostatistics,](https://reader030.vdocument.in/reader030/viewer/2022032800/56649d2a5503460f949ff67d/html5/thumbnails/16.jpg)
Cancer individuality via network view
Functional coupling:
• transcription <-> transcription
• transcription <-> methylation
• methylation <-> methylation
• mutation <-> methylation
• mutation <-> transcription
• mutation <-> mutation
+ mutated gene
![Page 17: Expression signatures as biomarkers: solving combinatorial problems with gene networks Andrey Alexeyenko Department of Medical Epidemiology and Biostatistics,](https://reader030.vdocument.in/reader030/viewer/2022032800/56649d2a5503460f949ff67d/html5/thumbnails/17.jpg)
FunCoup is a framework for biomarker discovery:
• Markers can be discovered and presented in the network dimension.
• Choice of data types to incorporate is unlimited, from metabolite profiling to patient phenotypes.
Useful features:
• Web-based resource ready for further expansion and for presenting new research results in an interactome perspective.
• Cross-species network comparison of human and model organisms.
• Efficient query system to retrieve network environments of interest.
http://FunCoup.sbc.su.se
![Page 18: Expression signatures as biomarkers: solving combinatorial problems with gene networks Andrey Alexeyenko Department of Medical Epidemiology and Biostatistics,](https://reader030.vdocument.in/reader030/viewer/2022032800/56649d2a5503460f949ff67d/html5/thumbnails/18.jpg)
Thank you for your attention!
![Page 19: Expression signatures as biomarkers: solving combinatorial problems with gene networks Andrey Alexeyenko Department of Medical Epidemiology and Biostatistics,](https://reader030.vdocument.in/reader030/viewer/2022032800/56649d2a5503460f949ff67d/html5/thumbnails/19.jpg)
Decomposing biological context
[Figure: panels labelled Common, Developmental and Dioxin-enabled; values shown: rPLC = 0.88, rPLC = 0.95, rPLC = 0.76]
ANOVA (Analysis Of VAriance):
Look at F-ratios: signal of interest / residual (“error”) variance
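As a worked illustration of the F-ratio, the sketch below computes a one-way ANOVA by hand: the between-group mean square (the signal of interest) divided by the within-group mean square (the residual variance). The three groups of expression values are invented, and a one-way layout is an assumption; the analysis in the talk may use a richer design.

```python
def f_ratio(groups):
    """One-way ANOVA F-ratio for a list of groups of measurements."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Signal: variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, means))
    # Residual: variation of the values around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n_total - k))

# Hypothetical expression of one gene under three conditions.
conditions = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
print(f_ratio(conditions))
```

A large ratio means the differences between conditions stand out against the scatter within each condition.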
![Page 20: Expression signatures as biomarkers: solving combinatorial problems with gene networks Andrey Alexeyenko Department of Medical Epidemiology and Biostatistics,](https://reader030.vdocument.in/reader030/viewer/2022032800/56649d2a5503460f949ff67d/html5/thumbnails/20.jpg)
Accounting for edge features: dioxin-enabled vs. dioxin-sensitive links
Andrey Alexeyenko, Deena M Wassenberg, Edward K Lobenhofer, Jerry Yen, Erik LL Sonnhammer, Elwood Linney, Joel N Meyer. Transcriptional response to dioxin in the interactome of developing zebrafish. Submitted.
![Page 21: Expression signatures as biomarkers: solving combinatorial problems with gene networks Andrey Alexeyenko Department of Medical Epidemiology and Biostatistics,](https://reader030.vdocument.in/reader030/viewer/2022032800/56649d2a5503460f949ff67d/html5/thumbnails/21.jpg)