ELECTRICAL ENGINEERING & COMPUTER SCIENCE
UNIVERSITY OF TENNESSEE
Computation for Large Systems II: Applications and Analysis
ARC Winter School in Mathematical and Computational Biology
IMB, UQ, 5-9 July, 2010
Mike Langston
Professor
Department of Electrical Engineering and Computer Science
University of Tennessee
USA
8 July 2010
Outline of Talk
Foundations
Gene Coexpression Analysis
Clustering
Data Integration
Sample Applications to Human Health
Application to Model Organisms
Foundations
Systems Biology
• How do biological entities function in unison and at all levels of scale?
• Linkage, communication and networks (graphs!)
Correlation
Here are five mouse genes with Pearson correlations of at least 0.65. What of
• noise?
• experimental design?
• circadian rhythms?
• sex, tameness, etc?
• other confounds?
• other metrics?
Correlation: Coefficient Profiles
Sometimes via
• Pearson
• Spearman
• Mutual Information
• Etc
Other times we need
• p-values
• Bonferroni corrections
• q-values
• false discovery rates...
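A minimal sketch of the measures listed above, in plain Python. The function names are illustrative only, and the Benjamini-Hochberg adjustment is used here as one common way to obtain q-value-like quantities; the talk does not say which procedure was used.

```python
import math

def pearson(x, y):
    """Pearson correlation; returns 0.0 when either profile is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy) if sxx and syy else 0.0

def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            result[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return result

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

def bonferroni(pvalues):
    """Bonferroni correction: multiply each p-value by the number of tests."""
    m = len(pvalues)
    return [min(1.0, p * m) for p in pvalues]

def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values (controls the false discovery rate)."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):          # walk from largest p-value down
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

Bonferroni is the more conservative of the two corrections, so with many gene pairs it tends to keep far fewer correlations than a false-discovery-rate approach.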
Omics: key to deciphering complex systems
Humans: 10^13+ cells, 200+ cell types
Genome (blueprint, 20K+ genes, 10M+ polymorphisms)
Proteome (functional units, unknown # of proteins)
Transcriptome
Translation (tRNA) via transcription (mRNA)
Function and Signaling (siRNA, miRNA, etc)
Other: metabolome, lipidome, interactome, omeome!
Visualization: highly dependent on scale
Computational Tools - focus usually on dense subgraphs
Computational Tools: Maximum Clique
• must run often
• time is a limiting factor
• exploit fixed-parameter tractability (FPT)
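One standard way to exploit FPT for maximum clique is through vertex cover: a graph on n vertices has a clique of size n - k exactly when its complement has a vertex cover of size k. The sketch below uses the textbook bounded search tree for vertex cover. It is an illustration under that assumption, not the tuned codes used in the talk, and the function names are hypothetical.

```python
from itertools import combinations

def vertex_cover(edges, k):
    """Return a vertex cover of size <= k as a set, or None if none exists.

    Bounded search tree: one endpoint of any remaining edge must be in the
    cover, so branch on both choices. Recursion depth is at most k.
    """
    if not edges:
        return set()
    if k == 0:
        return None
    u, v = edges[0]
    for pick in (u, v):
        rest = [e for e in edges if pick not in e]
        sub = vertex_cover(rest, k - 1)
        if sub is not None:
            return sub | {pick}
    return None

def maximum_clique(vertices, edges):
    """Maximum clique via a minimum vertex cover of the complement graph."""
    present = {frozenset(e) for e in edges}
    complement = [(a, b) for a, b in combinations(sorted(vertices), 2)
                  if frozenset((a, b)) not in present]
    for k in range(len(vertices) + 1):    # smallest k with a cover is optimal
        cover = vertex_cover(complement, k)
        if cover is not None:
            return set(vertices) - cover
    return set()
```

The search tree has at most 2^k leaves, so the running time is exponential only in the parameter k rather than in the size of the graph.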
Computational Tools: Maximal Clique
• huge outputs
• various orderings
• memory is often the limiting factor
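Because the number of maximal cliques can be enormous, it helps to produce them one at a time instead of storing them all. A minimal sketch using the classical Bron-Kerbosch algorithm with pivoting, written as a generator; this is a textbook method chosen for illustration, not necessarily the enumeration code behind the talk.

```python
def maximal_cliques(adj):
    """Yield every maximal clique of an undirected graph, one at a time.

    adj maps each vertex to the set of its neighbours. Yielding each clique
    as it is found avoids holding the whole output in memory.
    """
    def expand(r, p, x):
        # r: current clique, p: candidates, x: already-explored vertices
        if not p and not x:
            yield frozenset(r)
            return
        pivot = max(p | x, key=lambda v: len(adj[v] & p))
        for v in list(p - adj[pivot]):
            yield from expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    yield from expand(set(), set(adj), set())
```

A caller can stream the cliques to disk or update running statistics (for example, the largest size seen so far) without ever materializing the full list.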
Computational Tools: Biclique
• new algorithms
• bipartite graphs
Computational Tools: Paraclique
• noisy data and/or soft clustering
Coexpression Analysis
Toolchain
[Figure: toolchain flowchart. Pipeline labels: Raw Data; cDNA or mRNA Microarrays; Normalization; Gene Expression Profiles; Real-Valued Matrix; Correlation Computation; Edge-Weighted Complete Graph; High-Pass Filtering; Thresholding; Unweighted Incomplete Graph; Graph Transforms. Method labels, arranged along an axis of Increasing Edge Density (and Increasing Problem Complexity): Unsupervised Methods; Principal Component Analysis; k-Means Clustering; k-Cores; k-Connected Components; HCS Subgraphs; Biclique; Paraclique; Maximal Clique; Maximum Clique; Clique-Centric Methods; NP-complete Problems; FPT VC Codes; HPC & Novel Methods.]
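The correlation and thresholding stages named in the toolchain can be sketched as follows: compute a correlation for every gene pair (the edge-weighted complete graph), then keep only pairs at or above a threshold (the unweighted incomplete graph). Filtering on the absolute value of the Pearson correlation is an assumption made for this sketch, as are the function names.

```python
import math
from itertools import combinations

def pearson(x, y):
    """Pearson correlation; returns 0.0 when either profile is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy) if sxx and syy else 0.0

def threshold_graph(profiles, threshold):
    """Build an unweighted graph from expression profiles.

    profiles maps gene -> list of expression values (all the same length).
    An edge is kept when |r| >= threshold (high-pass filtering).
    """
    adj = {gene: set() for gene in profiles}
    for a, b in combinations(profiles, 2):
        if abs(pearson(profiles[a], profiles[b])) >= threshold:
            adj[a].add(b)
            adj[b].add(a)
    return adj

def graph_size(adj):
    """Number of non-isolated vertices and number of edges."""
    vertices = sum(1 for nbrs in adj.values() if nbrs)
    edges = sum(len(nbrs) for nbrs in adj.values()) // 2
    return vertices, edges
```

The pairwise loop is quadratic in the number of genes, which is one reason the later stages of the toolchain lean on high-performance computing.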
Thresholding
| Method | Anoxia | Reoxygenation | Alpha | Absolute deviations from GO threshold |
|---|---|---|---|---|
| GO Functional Similarity | 0.97 | 0.92 | 0.85 | |
| Spectral Clustering | 0.93 | **0.97** | **0.89** | 0.04+0.05+0.04=0.13 |
| Maximal Clique-2 | 0.90 | 0.91 | 0.74 | 0.07+0.01+0.11=0.19 |
| Power | 0.88 | **0.94** | **0.96** | 0.09+0.02+0.11=0.22 |
| Bonferroni adjustment | 0.85 | **0.93** | **0.95** | 0.12+0.01+0.10=0.23 |
| Control-Spot | 0.93 | 0.83 | 0.70 | 0.04+0.09+0.15=0.28 |
| Maximal Clique-3 | 0.87 | 0.89 | 0.60 | 0.10+0.03+0.25=0.38 |
| Top 1 Percent | 0.81 | 0.81 | 0.72 | 0.16+0.11+0.13=0.40 |

Estimated threshold for each dataset, sorted by performance of the methods. GO functional similarity thresholds are the standard against which the methods are compared, summing absolute deviations across datasets (thresholds above GO are in bold).
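The ranking in the table can be reproduced directly: for each method, sum the absolute differences between its thresholds and the GO functional similarity thresholds. A small sketch using the values from the table; the variable and function names are illustrative.

```python
GO = {"Anoxia": 0.97, "Reoxygenation": 0.92, "Alpha": 0.85}

METHODS = {
    "Spectral Clustering":   {"Anoxia": 0.93, "Reoxygenation": 0.97, "Alpha": 0.89},
    "Maximal Clique-2":      {"Anoxia": 0.90, "Reoxygenation": 0.91, "Alpha": 0.74},
    "Power":                 {"Anoxia": 0.88, "Reoxygenation": 0.94, "Alpha": 0.96},
    "Bonferroni adjustment": {"Anoxia": 0.85, "Reoxygenation": 0.93, "Alpha": 0.95},
    "Control-Spot":          {"Anoxia": 0.93, "Reoxygenation": 0.83, "Alpha": 0.70},
    "Maximal Clique-3":      {"Anoxia": 0.87, "Reoxygenation": 0.89, "Alpha": 0.60},
    "Top 1 Percent":         {"Anoxia": 0.81, "Reoxygenation": 0.81, "Alpha": 0.72},
}

def total_deviation(thresholds, reference=GO):
    """Sum of absolute deviations from the reference, rounded to 2 places."""
    return round(sum(abs(thresholds[d] - reference[d]) for d in reference), 2)

# Methods ordered from closest to farthest from the GO thresholds.
ranking = sorted(METHODS, key=lambda name: total_deviation(METHODS[name]))
```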
| Method | Dataset | Estimated Threshold | Bootstrap Mean | Difference (a) | Bootstrap Standard Deviation |
|---|---|---|---|---|---|
| Maximal Clique-2 | Anoxia | 0.90 | 0.91 | -0.01 | 0.015 |
| | Reoxy | 0.91 | 0.93 | -0.02 | 0.009 |
| | Alpha | 0.74 | 0.78 | -0.04 | 0.057 |
| Spectral Clustering | Anoxia | 0.93 | 0.95 | -0.02 | 0.012 |
| | Reoxy | 0.97 | 0.97 | 0.00 | 0.011 |
| | Alpha | 0.89** | 0.95 | -0.06 | 0.017 |
| Top 1% | Anoxia | 0.81 | 0.83 | -0.02 | 0.011 |
| | Reoxy | 0.81 | 0.84 | -0.03 | 0.016 |
| | Alpha | 0.72** | 0.79 | -0.07 | 0.027 |
| Control Spot | Anoxia | 0.93 | 0.95 | -0.02 | 0.015 |
| | Reoxy | 0.83** | 0.90 | -0.07 | 0.034 |
| | Alpha | 0.70** | 0.82 | -0.08 | 0.043 |

Summary of bootstrap results. Estimated threshold is compared with the bootstrap distribution for selected methods.
(a) Estimated threshold minus bootstrap mean.
** Estimated threshold is more than 2 std. deviations from bootstrap mean.
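A rough sketch of the bootstrap comparison summarized above: resample the data with replacement, recompute the estimate each time, and flag the original estimate when it lies more than two standard deviations from the bootstrap mean. The `estimator` argument stands in for whichever threshold-selection method is being examined; it and the function names are placeholders, not the procedure used in the study.

```python
import random
import statistics

def bootstrap_summary(samples, estimator, n_boot=200, seed=0):
    """Mean and standard deviation of an estimator over bootstrap resamples."""
    rng = random.Random(seed)             # fixed seed for repeatability
    estimates = []
    for _ in range(n_boot):
        resample = [rng.choice(samples) for _ in samples]
        estimates.append(estimator(resample))
    return statistics.mean(estimates), statistics.stdev(estimates)

def far_from_bootstrap(estimate, boot_mean, boot_sd, n_sd=2.0):
    """True when the estimate is more than n_sd deviations from the mean."""
    return abs(estimate - boot_mean) > n_sd * boot_sd
```

For example, the Spectral Clustering row for Alpha (0.89 against a bootstrap mean of 0.95 with standard deviation 0.017) is flagged by this rule, while the Anoxia row (0.93 against 0.95 with 0.012) is not.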
Gene (vertex) comparisons:
• differential expression
• does not require multiple conditions
• compare the two lists of gene expression levels
Correlate (edge) comparisons:
• differential correlation
• requires multiple conditions in control versus stimulus
• compare two lists of gene-gene correlations
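Differential correlation can be sketched as: compute each gene-gene correlation once in the control data and once in the stimulus data, then report the pairs whose correlation changes by at least some amount. The cutoff `min_change` and the function names are assumptions for illustration; a real analysis would also assess the statistical significance of each change.

```python
import math
from itertools import combinations

def pearson(x, y):
    """Pearson correlation; returns 0.0 when either profile is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy) if sxx and syy else 0.0

def differential_correlation(control, stimulus, min_change=0.5):
    """List (gene_a, gene_b, r_control, r_stimulus) for altered pairs.

    control and stimulus map gene -> expression values across conditions.
    Only genes present in both datasets are compared.
    """
    genes = sorted(set(control) & set(stimulus))
    changed = []
    for a, b in combinations(genes, 2):
        r_c = pearson(control[a], control[b])
        r_s = pearson(stimulus[a], stimulus[b])
        if abs(r_c - r_s) >= min_change:
            changed.append((a, b, r_c, r_s))
    return changed
```

Each dataset needs several conditions per gene, since a correlation cannot be computed from a single measurement; this is the requirement noted in the list above.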
Putative network (clique) comparisons:
• differential topology
• compare cliques, sort by ontology, CREs, etc
• consider granularity, for example, with the clique intersection graph
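One way to picture a clique intersection graph: treat each clique as a node and connect two nodes when the cliques share genes. In the sketch below the `min_shared` parameter is a hypothetical knob for granularity; raising it links only cliques with heavier overlap.

```python
from itertools import combinations

def clique_intersection_graph(cliques, min_shared=1):
    """Map (i, j) index pairs to the number of genes cliques i and j share.

    Only pairs sharing at least min_shared genes are included.
    """
    cliques = [frozenset(c) for c in cliques]
    edges = {}
    for i, j in combinations(range(len(cliques)), 2):
        shared = len(cliques[i] & cliques[j])
        if shared >= min_shared:
            edges[(i, j)] = shared
    return edges
```

Comparing the intersection graphs built from two conditions is one way to look at differences in topology rather than in individual genes or edges.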
Clustering
Clique is the gold standard. But data is seldom without errors:
• false-discovery rate low via clique
• biggest problem is missing edges
What we really want are very dense subgraphs. It’s straightforward enough to use neighborhoods, but on real data:
• 1-neighborhoods produce edge densities of only around 16%.
• 2-neighborhoods produce edge densities of only around 6%.
The Paraclique Algorithm:
• A clique gloms onto highly connected vertices.
• Edge density stays north of 96%.
• Lift and separate.
[Figure labels: 466-paraclique; 280-clique]
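The glom step can be sketched as follows: start from a seed clique and repeatedly add any outside vertex that is adjacent to all but a few members of the current set. The exact rule here (at most `glom` missing edges to the current members) is an assumption for illustration, and the sketch takes the seed clique as input instead of computing a maximum clique. It does not by itself guarantee the edge densities quoted above.

```python
def paraclique(adj, seed_clique, glom=1):
    """Grow a seed clique by absorbing highly connected vertices.

    adj maps each vertex to the set of its neighbours. A vertex joins when
    it is non-adjacent to at most `glom` of the current members.
    """
    members = set(seed_clique)
    changed = True
    while changed:
        changed = False
        for v in sorted(set(adj) - members):
            if len(members - adj[v]) <= glom:
                members.add(v)
                changed = True
    return members

def edge_density(adj, members):
    """Fraction of possible edges present among the given vertices."""
    n = len(members)
    if n < 2:
        return 1.0
    edges = sum(len(adj[v] & members) for v in members) // 2
    return edges / (n * (n - 1) / 2)
```

To "lift and separate", one would remove the resulting paraclique from the graph and repeat on what remains, so that each vertex lands in at most one cluster.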
Methods Tested
Yes/no columns (marked Y): Allows Overlapping Clusters; Pre-specified Number of Clusters (k); Thresholded Correlations

| Method | Type | Tool | Result Range | Y marks | Other Parameters Tested |
|---|---|---|---|---|---|
| Ward | Hierarchical | R | k | Y | |
| Average | Hierarchical | R | k | Y | |
| McQuitty | Hierarchical | R | k | Y | |
| Complete | Hierarchical | R | k | Y | |
| K-Means | Partitioning | R | k | Y | |
| SOM | Neural network | MeV | k | Y | Grid size/type |
| QT Clust | Partitioning | MeV | 24-385 | | Maximum cluster diameters |
| CAST | Graph-based | MeV | 1-6162 | Y | |
| CLICK | Graph-based | Expander | 4-32 | | Cluster homogeneity |
| SAMBA | Graph-based | Expander | 7-30 | | Overlap prior factor |
| WGCNA | Graph-based | stand-alone | 4-160 | | Power, Module detection method |
| NNN | Graph-based | stand-alone | 23-52 | (Y) | Minimum neighborhood size |
| K-Clique Communities | Graph-based | CFinder/Ours | 1-68 | Y Y | Clique size |
| Maximal Clique | Graph-based | Ours | 1,000-64,000 | Y Y | |
| Paraclique | Graph-based | Ours | 8-615 | Y Y | Glom factor |
Algorithms Ranked by Quartile Comparisons

| Clustering Method | Average Quartile | Small (3-10 genes): Quartile | Small: BAT5 Jaccard | Medium (11-100 genes): Quartile | Medium: BAT5 Jaccard | Large (101-1000 genes): Quartile | Large: BAT5 Jaccard |
|---|---|---|---|---|---|---|---|
| K-Clique Communities | 1.00 | 1 | 0.7531 | 1 | 0.4465 | 1 | 0.4915 |
| Maximal Clique | 1.00 | 1 | 0.8433 | 1 | 0.4081 | | 0.0000 |
| Paraclique | 1.00 | 1 | 0.7576 | 1 | 0.4285 | 1 | 0.4169 |
| Ward (H) | 1.33 | 2 | 0.5782 | 1 | 0.4011 | 1 | 0.5723 |
| CAST | 1.67 | 1 | 0.7455 | 3 | 0.3146 | 1 | 0.4994 |
| QT Clust | 2.00 | 2 | 0.5473 | 2 | 0.3670 | 2 | 0.3944 |
| Complete (H) | 2.33 | 3 | 0.3933 | 2 | 0.3677 | 2 | 0.3419 |
| NNN | 2.67 | 2 | 0.5521 | 2 | 0.3705 | 4 | 0.2406 |
| K-Means | 3.00 | 4 | 0.2573 | 3 | 0.3015 | 2 | 0.3463 |
| SOM | 3.00 | 4 | 0.3260 | 2 | 0.3286 | 3 | 0.3282 |
| WGCNA | 3.00 | 3 | 0.4391 | 3 | 0.3106 | 3 | 0.2949 |
| Average (H) | 3.33 | 3 | 0.4087 | 4 | 0.2792 | 3 | 0.3037 |
| McQuitty (H) | 3.33 | 3 | 0.4594 | 3 | 0.3065 | 4 | 0.2868 |
| SAMBA | 3.50 | | 0.0000 | 4 | 0.1860 | 3 | 0.3298 |
| CLICK | 4.00 | 4 | 0.0339 | 4 | 0.1453 | 4 | 0.2817 |
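The Jaccard scores in the table measure overlap between two gene sets: the size of the intersection divided by the size of the union. A minimal sketch follows. Scoring each cluster against its best-matching annotation set, and reading "BAT5" as an average over the five best scores, are both assumptions made here for illustration.

```python
def jaccard(a, b):
    """Jaccard similarity of two collections, treated as sets."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def best_match(cluster, annotation_sets):
    """Highest Jaccard score of a cluster against any annotation set.

    annotation_sets maps an annotation name to its set of genes.
    """
    return max((jaccard(cluster, genes) for genes in annotation_sets.values()),
               default=0.0)

def average_top(scores, n=5):
    """Mean of the n largest scores (fewer if fewer are available)."""
    top = sorted(scores, reverse=True)[:n]
    return sum(top) / len(top) if top else 0.0
```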
Algorithm Pipelining, Clustering Suites, Teragrid
Relationship to QTLs
[Figure label: Seven Quantitative Trait Loci]
There’s a high probability that somewhere in here is a polymorphism controlling this trait.
Transcript abundance can be the phenotype!
[Figure labels: Concentrated Parental Alleles; Two Paracliques]
Data Integration
Phenotypic Data (e.g., diseased versus healthy patients)
Proteomic Data (e.g., amino acid peaks from mass spec)
Transcriptomic Data (e.g., gene expression from µarrays)
Genotypic Data: SNPs
• DNA sequence variations, each occurring when a single nucleotide in the genome differs between members of a species
• highly conserved throughout evolution and within population
• almost always just two alleles
• detected with SNP arrays designed to detect polymorphisms
[Figure: data integration schematic. Labels: Diseased; Healthy; Proteins; Protein Peak Factors; Protein-Gene Relationships; mRNA Co-expression Network; Natural Allelic Perturbations (SNPs), with example alleles T/C, C/G, A/T, G/G, C/T; Multi-Locus Genetic Regulatory Network Models; Putative Biomarkers.]
Application, Allergy
Data Description
• Mikael Benson, Göteborg, Sweden: 56 patients and 39 controls
• Affymetrix HU133 arrays
• roughly 33,000 genes
• hay fever, eczema
• nasal secretions, lymphocytes, skin
Preprocessing
• MAS5.0
• log transformed
• centered around zero with z scores
• probesets with consistently low expression levels removed
• replicates averaged
Threshold Selection
• chosen to balance graph densities
• AFFX spots retained for quality control
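Part of the preprocessing above can be sketched in a few lines: log-transform each probeset, drop probesets whose expression is consistently low, and center each remaining probeset around zero with z scores. The base-2 logarithm, the `min_level` cutoff, and the function name are assumptions for illustration. MAS5.0 summarization and replicate averaging are not shown.

```python
import math
import statistics

def preprocess(raw, min_level=None):
    """Log-transform, filter, and z-score expression data.

    raw maps probeset -> list of positive intensities, one per array.
    Probesets whose log2 values never reach min_level are removed, as are
    probesets with no variation (a z score is undefined for those).
    """
    processed = {}
    for probe, values in raw.items():
        logged = [math.log2(v) for v in values]
        if min_level is not None and max(logged) < min_level:
            continue                      # consistently low expression
        mean = statistics.mean(logged)
        sd = statistics.stdev(logged)
        if sd == 0:
            continue
        processed[probe] = [(v - mean) / sd for v in logged]
    return processed
```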
[Figure: Correlation Coefficient Distribution. Frequency (axis ticks 0 to 2,500,000) plotted against Correlation Value for two series: Patient and Control.]
Graph Properties

Control

| Threshold | Vertices | Edges | Maximal Cliques | Maximum Size |
|---|---|---|---|---|
| 0.88 | 8009 | 256346 | 240146378 | 84 |
| 0.89 | 7169 | 178144 | 15067064 | 79 |
| 0.90 | 6254 | 118900 | 1579041 | 71 |
| 0.91 | 5317 | 75541 | 243232 | 66 |
| 0.92 | 4415 | 45471 | 51315 | 59 |

Patient

| Threshold | Vertices | Edges | Maximal Cliques | Maximum Size |
|---|---|---|---|---|
| 0.88 | 5809 | 91152 | 2298595 | 61 |
| 0.89 | 4999 | 62271 | 447176 | 52 |
| 0.90 | 4146 | 40933 | 114030 | 45 |
| 0.91 | 3405 | 26031 | 41605 | 35 |
| 0.92 | 2628 | 11322 | 11322 | 28 |

ribosomal or RNA-related
T-lymphocytes or epithelial cells
Clique profiles using the five most highly represented genes:

| Control: Gene Symbol | Control: Clique membership | Patient: Gene Symbol | Patient: Clique membership |
|---|---|---|---|
| UBE1C | 29% | FGFR2 | 66% |
| RANBP6 | 27% | NFIB | 65% |
| DKFZP564O123 | 26% | PPL | 64% |
| SLC25A13 | 24% | FGFR3 | 64% |
| GTPBP4 | 21% | CDH3 | 56% |

Of course gene representation is only a small part of the story.
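A clique profile of this kind can be computed by counting, for each gene, the share of cliques that contain it. The sketch below assumes that "clique membership" means the fraction of enumerated cliques containing the gene; the function name is illustrative.

```python
from collections import Counter

def clique_membership(cliques, top=5):
    """Return the `top` genes by fraction of cliques that contain them.

    cliques is a sequence of gene collections. Each gene is counted at
    most once per clique.
    """
    total = len(cliques)
    if total == 0:
        return []
    counts = Counter(gene for clique in cliques for gene in set(clique))
    return [(gene, n / total) for gene, n in counts.most_common(top)]
```

With hundreds of thousands of maximal cliques, the counts can be accumulated while the cliques are being generated, so the full list never has to be stored.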
We can use traditional algorithmic tools• extract cores, cliques and other dense subgraphs• check for scale-freeness, putative TFs, hubs, etc
We can use commercial and other tools• sort subgraphs by ontological enrichment, CREs, etc• compare to literature, databases, etc• match genes and gene products with known interactions
It’s tempting to scan for your favorites...
But our goal is to identify altered interactions
Application, Allergy
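As a rough illustration of the "traditional algorithmic tools" bullet, the sketch below peels a k-core out of a small graph and lists high-degree vertices as candidate hubs. The gene labels, function names and thresholds are hypothetical, chosen only for this example; nothing here is taken from the speaker's own software.

```python
from collections import defaultdict

def build_graph(edges):
    """Return an undirected adjacency map {vertex: set of neighbours}."""
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:
            adj[u].add(v)
            adj[v].add(u)
    return dict(adj)

def k_core(adj, k):
    """Repeatedly delete vertices with fewer than k surviving neighbours."""
    alive = set(adj)
    changed = True
    while changed:
        changed = False
        for v in list(alive):
            if len(adj[v] & alive) < k:
                alive.discard(v)
                changed = True
    return alive

def hubs(adj, min_degree):
    """Vertices with degree >= min_degree, highest degree first."""
    ranked = sorted(adj, key=lambda v: (-len(adj[v]), v))
    return [v for v in ranked if len(adj[v]) >= min_degree]

# Toy coexpression graph (made-up labels): g1..g4 form a clique, g5 and g6 dangle.
edges = [("g1", "g2"), ("g1", "g3"), ("g1", "g4"), ("g2", "g3"),
         ("g2", "g4"), ("g3", "g4"), ("g4", "g5"), ("g5", "g6")]
graph = build_graph(edges)
core3 = k_core(graph, 3)   # the dense part: g1, g2, g3, g4
hub_list = hubs(graph, 4)  # only g4 has degree 4
```

Core peeling runs in polynomial time, so it can serve as a cheap filter before more expensive clique extraction on the surviving vertices.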
![Page 52: Computation for Large Systems II: Applications and Analysisbioinformatics.org.au/resources/ws10/presentations/... · 2014-03-20 · ELECTRICAL ENGINEERING & COMPUTER SCIENCE UNIVERSITY](https://reader034.vdocument.in/reader034/viewer/2022042402/5f133379e675f70f302e7209/html5/thumbnails/52.jpg)
Application, Allergy
Differential Analysis
Gene (vertex) comparisons
• differential expression
• does not require multiple conditions
• compare the two lists of gene expression levels
Correlate (edge) comparisons
• differential correlation
• requires multiple conditions in control, in dose
• compare the two lists of gene-gene correlations
Putative network (clique) comparisons
• differential topology
• focus on network aka clique differences
• consider the clique intersection graph
Ongoing Work
• 62 genes pass all three screens, 6 match a known pathway
• ITK (IL2-inducible T-cell kinase), studying in depth... moving to Illumina
For Impact
• concentrate on real data, and working with bench biologists
• strategic publications (e.g., Nature Genetics, PLoS Comp Bio, etc.)
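The edge-level comparison can be sketched as follows: compute gene-gene Pearson correlations separately in the control and dose groups, then keep the pairs whose correlation shifts by at least some amount. The expression values, gene labels and the `min_change` cutoff are invented for illustration, and the slides do not say which correlation measure or cutoff was actually used.

```python
from itertools import combinations
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def correlations(expr):
    """expr maps gene -> list of expression values, one per condition."""
    return {(g, h): pearson(expr[g], expr[h])
            for g, h in combinations(sorted(expr), 2)}

def differential_edges(control, dose, min_change):
    """Gene pairs whose correlation differs by >= min_change between groups."""
    rc, rd = correlations(control), correlations(dose)
    return {pair: (rc[pair], rd[pair])
            for pair in rc
            if pair in rd and abs(rc[pair] - rd[pair]) >= min_change}

# Hypothetical data: four conditions per group, three genes.
control = {"A": [1, 2, 3, 4], "B": [2, 4, 6, 8], "C": [4, 3, 2, 1]}
dose = {"A": [1, 2, 3, 4], "B": [8, 6, 4, 2], "C": [4, 3, 2, 1]}
changed = differential_edges(control, dose, min_change=1.0)
# Gene B flips its relationship with A and with C; the A-C edge is unchanged.
```

This also shows why the edge comparison needs multiple conditions per group: a correlation cannot be estimated from a single measurement per gene.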
![Page 55: Computation for Large Systems II: Applications and Analysisbioinformatics.org.au/resources/ws10/presentations/... · 2014-03-20 · ELECTRICAL ENGINEERING & COMPUTER SCIENCE UNIVERSITY](https://reader034.vdocument.in/reader034/viewer/2022042402/5f133379e675f70f302e7209/html5/thumbnails/55.jpg)
Outline of Talk
Foundations
Gene Coexpression Analysis
Clustering
Data Integration
Sample Applications to Human Health
Application to Model Organisms
![Page 56: Computation for Large Systems II: Applications and Analysisbioinformatics.org.au/resources/ws10/presentations/... · 2014-03-20 · ELECTRICAL ENGINEERING & COMPUTER SCIENCE UNIVERSITY](https://reader034.vdocument.in/reader034/viewer/2022042402/5f133379e675f70f302e7209/html5/thumbnails/56.jpg)
Application, Model Organisms
LD (linkage disequilibrium): a measure of statistical dependence between genetic markers
• non-random association of alleles at two or more loci
• the occurrence in a population of two linked alleles at a frequency higher or lower than expected on the basis of the individual frequencies
• not necessarily on the same chromosome
Reflects biologically meaningful association of loci
Generally a result of population history
• population genealogy
• recombination frequency
• co-adaptive allele selection
• natural selection
• other factors
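The "higher or lower than expected" wording corresponds to the textbook coefficient D = p_AB - p_A * p_B for two biallelic loci, often reported in normalized form as r^2. The sketch below computes both from haplotype counts. The counts and the function name are made up for illustration, and the talk itself appears to rely on correlation and mutual information rather than D.

```python
def linkage_disequilibrium(n_AB, n_Ab, n_aB, n_ab):
    """Return (D, r^2) for two biallelic loci given the four haplotype counts."""
    n = n_AB + n_Ab + n_aB + n_ab
    p_AB = n_AB / n            # observed frequency of the A-B haplotype
    p_A = (n_AB + n_Ab) / n    # frequency of allele A at the first locus
    p_B = (n_AB + n_aB) / n    # frequency of allele B at the second locus
    d = p_AB - p_A * p_B       # observed minus expected under independence
    denom = p_A * (1 - p_A) * p_B * (1 - p_B)
    # r^2 is undefined when a locus is monomorphic; 0.0 is returned here by convention.
    r2 = d * d / denom if denom > 0 else 0.0
    return d, r2

# Hypothetical counts: alleles A and B travel together more often than chance predicts.
d_linked, r2_linked = linkage_disequilibrium(40, 10, 10, 40)   # D about 0.15, r^2 about 0.36
# Hypothetical counts with no association: all four haplotypes equally common.
d_indep, r2_indep = linkage_disequilibrium(25, 25, 25, 25)     # D is 0
```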
![Page 59: Computation for Large Systems II: Applications and Analysisbioinformatics.org.au/resources/ws10/presentations/... · 2014-03-20 · ELECTRICAL ENGINEERING & COMPUTER SCIENCE UNIVERSITY](https://reader034.vdocument.in/reader034/viewer/2022042402/5f133379e675f70f302e7209/html5/thumbnails/59.jpg)
Application, Model Organisms
Evaluation of Mus musculus breeding strategies
Solution: Use SNPs, correlation, paraclique and proximity
Standard Inbred (SI)
Recombinant Inbred (RI)
BXD, LXS, etc.
The Collaborative Cross
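For readers unfamiliar with the term, a paraclique can be thought of as a clique that has been relaxed to absorb a few nearly fully connected vertices. The sketch below is a simplified reading of that idea, not the lab's implementation: it finds a largest clique by brute force, then adds any vertex that is non-adjacent to at most `glom` current members. The glom rule as written, the SNP labels and all function names are assumptions made for this example.

```python
from itertools import combinations

def maximal_cliques(adj):
    """Yield every maximal clique (Bron-Kerbosch without pivoting; toy graphs only)."""
    def expand(r, p, x):
        if not p and not x:
            yield set(r)
            return
        for v in list(p):
            yield from expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}
    yield from expand(set(), set(adj), set())

def paraclique(adj, glom):
    """Grow a largest clique by adding vertices that miss at most `glom` members."""
    clique = max(maximal_cliques(adj), key=lambda c: (len(c), sorted(c)))
    members = set(clique)
    grew = True
    while grew:
        grew = False
        for v in sorted(set(adj) - members):
            if len(members - adj[v]) <= glom:
                members.add(v)
                grew = True
    return clique, members

# Toy SNP graph (made-up labels): s1..s5 form a clique, s6 touches three of them,
# and s7 hangs off s6 only.
core = ["s1", "s2", "s3", "s4", "s5"]
edges = list(combinations(core, 2)) + [("s6", "s1"), ("s6", "s2"), ("s6", "s3"), ("s7", "s6")]
adj = {}
for u, v in edges:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)

clique, members = paraclique(adj, glom=2)
# s6 misses only two clique members and is absorbed; s7 misses five and is not.
```

In the mouse application, vertices would be SNPs and edges would join SNP pairs whose association exceeds a chosen threshold.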
![Page 60: Computation for Large Systems II: Applications and Analysisbioinformatics.org.au/resources/ws10/presentations/... · 2014-03-20 · ELECTRICAL ENGINEERING & COMPUTER SCIENCE UNIVERSITY](https://reader034.vdocument.in/reader034/viewer/2022042402/5f133379e675f70f302e7209/html5/thumbnails/60.jpg)
Application, Model Organisms
[Figure: four plots, each against a mutual information threshold running from 0.9 down to 0.1]
• Number of LD Networks: number of paracliques containing more than 3 SNPs, 67 SI vs. 89 BXD
• Number of Non-Syntenic LD Networks: number of paracliques containing more than 3 SNPs that cross multiple chromosomes, 67 SI vs. 89 BXD
• Chromosome Coverage, Standard Inbred (67 Inbred Strains): number of paracliques (size > 3), broken down by chromosomes spanned (1 to 17 chrs)
• Chromosome Coverage, Recombinant Inbred (89 BXD Strains): number of paracliques (size > 3), broken down by chromosomes spanned (1 to 4 chrs)
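The horizontal axis of these plots is a mutual information threshold. As a minimal sketch of how such a threshold could turn SNP genotypes into a graph, the code below computes mutual information (in bits) between pairs of SNPs across strains and keeps the pairs at or above a cutoff. The genotype coding ("B"/"D" for the parental allele), the SNP names and the cutoff are hypothetical, and the slides do not say whether the plotted values were normalized.

```python
from collections import Counter
from itertools import combinations
from math import log2

def mutual_information(x, y):
    """Mutual information in bits between two equal-length genotype sequences."""
    n = len(x)
    px, py, pxy = Counter(x), Counter(y), Counter(zip(x, y))
    return sum((c / n) * log2(c * n / (px[a] * py[b]))
               for (a, b), c in pxy.items())

def edges_at(snps, threshold):
    """SNP pairs whose mutual information is at least `threshold`."""
    return [(s, t) for s, t in combinations(sorted(snps), 2)
            if mutual_information(snps[s], snps[t]) >= threshold]

# Hypothetical genotypes for three SNPs across four strains.
snps = {
    "snp1": "BBDD",
    "snp2": "BBDD",   # identical pattern to snp1: 1 bit of shared information
    "snp3": "BDBD",   # statistically independent of snp1 and snp2 in this toy sample
}
strong_pairs = edges_at(snps, threshold=0.9)
```

Lowering the threshold admits more edges, which is why the paraclique counts in the plots change as the threshold moves from 0.9 toward 0.1.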
![Page 61: Computation for Large Systems II: Applications and Analysisbioinformatics.org.au/resources/ws10/presentations/... · 2014-03-20 · ELECTRICAL ENGINEERING & COMPUTER SCIENCE UNIVERSITY](https://reader034.vdocument.in/reader034/viewer/2022042402/5f133379e675f70f302e7209/html5/thumbnails/61.jpg)
Application, Model Organisms
Example of Contrasting Paraclique Profiles
[Figure: SNP membership of example paracliques for Standard Inbred and Recombinant Inbred strains, with SNPs grouped by chromosome (labels shown: Chr 1, Chr 4, Chr 7, Chr 11)]
![Page 62: Computation for Large Systems II: Applications and Analysisbioinformatics.org.au/resources/ws10/presentations/... · 2014-03-20 · ELECTRICAL ENGINEERING & COMPUTER SCIENCE UNIVERSITY](https://reader034.vdocument.in/reader034/viewer/2022042402/5f133379e675f70f302e7209/html5/thumbnails/62.jpg)
Collaborators
Computer Science, Mathematics, Molecular Biology, Statistics
Research Scientists (Incomplete!): Mikael Benson, Elissa Chesler, Frank Dehne, Vlad Estivill-Castro, Mike Fellows, Ivan Gerling, Frans Henskens, Reza Mobini, Pablo Moscato, Mark Ragan, Fran Rosamond, Arnold Saxton, Brynn Voy, Rob Williams
Current/Recent Students: Bhavesh Borate, Patricia Carey, John Eblen, Jeremy Jay, Denise Koessler, Zuopan Li, Zahra Mahoor, Sudhir Naswa, Clinton Nolan, Andy Perkins, Charles Phillips, Gary Rogers, Peter Shaw, Dinesh Weerapurage, Yun Zhang
![Page 63: Computation for Large Systems II: Applications and Analysisbioinformatics.org.au/resources/ws10/presentations/... · 2014-03-20 · ELECTRICAL ENGINEERING & COMPUTER SCIENCE UNIVERSITY](https://reader034.vdocument.in/reader034/viewer/2022042402/5f133379e675f70f302e7209/html5/thumbnails/63.jpg)
Come Join Us!
Biology, Computer Science, Mathematics, Statistics
The University of Tennessee and Oak Ridge National Laboratory
• Analysis of Algorithms
• Applications
  o Alcoholism, Allergy, Cancer, Diabetes, Neuroscience, Radiology...
• Complex Cross (210 RI Mouse Lines)
• High Performance Computing
• Metabolomics, Proteomics, Transcriptomics, ...
• Ontological Discovery
• RNAi and µRNA Interference
• Threshold Selection
• Time Series Analysis
• Work Alongside Both Computer and Domain Scientists!
![Page 64: Computation for Large Systems II: Applications and Analysisbioinformatics.org.au/resources/ws10/presentations/... · 2014-03-20 · ELECTRICAL ENGINEERING & COMPUTER SCIENCE UNIVERSITY](https://reader034.vdocument.in/reader034/viewer/2022042402/5f133379e675f70f302e7209/html5/thumbnails/64.jpg)
The Langston Lab (Geeks Я Us)