SUPPLEMENTARY FIGURE 01
Axis labels: Ortho Groups associated with 5,798 yeast genes; 126 ordered species (from “03-OG_vs_Species-Sorted.txt”)
Species groups: archaea, bacteria, non-chordate animals, fungi, plants, eukaryotic parasites, chordate animals
Supplementary Figure 1. Expanded heat-map showing conservation of yeast genes in each of the 131 species analyzed. Binarized data indicating the presence or absence of an ortholog of each protein are shown as green (presence) or grey (absence) for each of the 126 species analyzed in this manuscript (a subset of the species present in the OrthoMCL database). The individual species data were collapsed into taxonomic groups for Figure 01. See the Materials and Methods section for details on data binarization, species selection, and ordering of genes.
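The binarization and collapsing steps mentioned in this caption can be illustrated with a minimal sketch. It is not the authors' procedure, which is described in their Materials and Methods. The per-species scores, the species-to-group assignments, and the function names below are hypothetical; the 0.2 cutoff is borrowed from the threshold mentioned in the caption of Supplementary Figure 5, and the use of a per-group fraction is an assumption.

```python
from typing import Dict, List


def binarize(scores: Dict[str, float], threshold: float) -> Dict[str, int]:
    """Map each species' ortholog score to 1 (present) or 0 (absent)."""
    return {species: int(score >= threshold) for species, score in scores.items()}


def collapse(presence: Dict[str, int], groups: Dict[str, List[str]]) -> Dict[str, float]:
    """Fraction of species in each taxonomic group with an ortholog present."""
    return {
        group: sum(presence[s] for s in members) / len(members)
        for group, members in groups.items()
    }


# Hypothetical toy scores for a single yeast gene (not data from the study).
scores = {"E. coli": 0.0, "A. thaliana": 0.9, "H. sapiens": 0.5, "M. musculus": 0.1}
groups = {
    "bacteria": ["E. coli"],
    "plants": ["A. thaliana"],
    "chordate animals": ["H. sapiens", "M. musculus"],
}

presence = binarize(scores, threshold=0.2)  # assumed cutoff, see lead-in
fractions = collapse(presence, groups)
print(presence)
print(fractions)
```

Each gene then contributes one row of 0/1 values per species for the expanded heat-map, and one row of per-group fractions for a collapsed view.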
SUPPLEMENTARY FIGURE 02
Color scale: Species with Orthologs Present (0%, 0.1%, 100%)
Phylogenetic category labels:
Fungi + Non-Chordates + Plants (24 genes)
Fungi + Chordates + Plants (26 genes)
Fungi + Non-Chordates (12 genes)
Fungi + Chordates (21 genes)
Fungi + Non-Chordates + Plants + Bacteria (9 genes)
Fungi + Chordates + Plants + Bacteria (3 genes)
Fungi + Plants + Bacteria (32 genes)
Fungi + Animals + Bacteria (12 genes)
Fungi + Bacteria (20 genes)
Fungi + Non-Chordates + Plants + Archaea (3 genes)
Fungi + Plants + Archaea (9 genes)
All (- plants) (3 genes)
All (- chordates) (20 genes)
All (- animals) (13 genes)
Fungi + Bacteria + Archaea (6 genes)
All (- plants) (4 genes)
Supplementary Figure 2. Fine-scale analysis of Minor Phylogroups. Expanded view of the Minor Phylogroups with included labels for rough phylogenetic categories to the right. An asterisk (*) indicates that only one gene is present with the identified phylogenetic pattern, and due to space limitations is not fully described in the phylogenetic categories to the right.
SUPPLEMENTARY FIGURE 03
Supplementary Figure 3. Gene Ontology (GO) functional category term enrichment of phylogroups. GO-Slim Mapper was used to identify GO terms that are enriched in each phylogroup. The most significant results are presented in a heat-map with yellow intensity corresponding to significance of enrichment (see legend; the color intensity scale was defined using our significance threshold of p < 10^-7). Phylogroups analyzed are listed across the top of the heat-map.
Legend: Enrichment Significance (p = 1, p = 10^-3.5, p < 10^-7)
Row labels (Function):
nucleic acid binding TF
molecular function unknown
DNA binding
structural molecule activity
translation factor activity
ATPase activity
ligase activity
lyase activity
hydrolase activity (C-N bonds)
oxidoreductase activity
ribosome structural constituent
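The legend values for Supplementary Figure 3 (p = 1, p = 10^-3.5, p < 10^-7) are consistent with a color scale that is linear in -log10(p) and saturates at the significance threshold. The sketch below illustrates one such mapping. The linear form, the function name, and the 0-to-1 intensity range are assumptions, not details stated in the source.

```python
import math

# -log10 of the significance threshold p < 10^-7 given in the caption.
SATURATION_NEG_LOG10 = 7.0


def intensity(p: float) -> float:
    """Yellow intensity in [0, 1], assumed linear in -log10(p) and capped at the threshold."""
    if not 0.0 < p <= 1.0:
        raise ValueError("p must be in (0, 1]")
    scaled = -math.log10(p) / SATURATION_NEG_LOG10
    return min(1.0, max(0.0, scaled))


for p in (1.0, 10 ** -3.5, 1e-7, 1e-12):
    print(p, intensity(p))
```

Under this assumption, p = 1 maps to no color, p = 10^-3.5 to roughly half intensity, and any p at or below 10^-7 to full intensity.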
SUPPLEMENTARY FIGURE 04
Supplementary Figure 4. Comparison of phylogenetic break-down amongst defined sets of yeast genes. GO-Slim Mapper was used to identify GO terms that are enriched in each phylogroup. The most significant results are presented in a heat-map with yellow intensity corresponding to significance of enrichment (see legend; the color intensity scale was defined using our significance threshold of p < 10^-7). Phylogroups analyzed are listed across the top of the heat-map.
Panels: A, B, C, D
Gene sets shown:
Total genome (5,798)
Essential genes (1,109)
Genes of unknown function (1,222)
Essential genes of unknown function (26)
Axis: Percent (0 to 60)
Legend (PHYLOGROUPS): ALL; ALL (-archaea); ALL (-bacteria); ALL (-animals); EUKARYOTES; ANIMALS + FUNGI; PLANTS + FUNGI; FUNGI; MINOR PHYLOGROUPS; NO DATA
SUPPLEMENTARY FIGURE 05
Panel titles: Hierarchical clustering with optimal leaf ordering; Manual ordering
Panels: A, B
Supplementary Figure 5. Alternative clustering approaches result in similar clusters of genes. Comparison of manual ordering (employed in this study) and hierarchical clustering with optimal leaf ordering as in Bar-Joseph, 2001 (note that in accordance with the original figure the data for panel B was binarized using the 0.2 threshold and eukaryotic parasites were not included for the clustering). (A) and (B) The column order is the same as in Figure 1. The red bar refers to the group of genes found in all species except bacteria. Note that the main difference appears to be in the scattering of genes that were placed into a group called “minor clusters” for the original figure. (C) Venn diagram showing overlap of the genes (in all species except bacteria) identified by each method. The overlap is highly significant (p < 10^-308, hypergeometric distribution).
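A hypergeometric overlap test of the kind cited in this caption can be sketched as follows. This is an illustration only, not the authors' code. The computation is done in log space because a p-value this small underflows double-precision floating point, whose smallest normal value is about 2.2 x 10^-308. The set sizes and overlap below are hypothetical; only the total of 5,798 genes comes from the source.

```python
import math


def log_comb(n: int, k: int) -> float:
    """Natural log of the binomial coefficient C(n, k)."""
    return math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)


def log10_hypergeom_tail(total: int, set_a: int, set_b: int, overlap: int) -> float:
    """log10 of P(X >= overlap), where X counts members of set A among
    set_b genes drawn without replacement from `total` genes."""
    lowest = max(overlap, 0, set_b - (total - set_a))  # smallest feasible k
    highest = min(set_a, set_b)                        # largest feasible k
    if lowest > highest:
        return float("-inf")  # an overlap this large is impossible
    log_denominator = log_comb(total, set_b)
    logs = [
        log_comb(set_a, k) + log_comb(total - set_a, set_b - k) - log_denominator
        for k in range(lowest, highest + 1)
    ]
    # Log-sum-exp keeps the sum stable even when every term underflows.
    peak = max(logs)
    log_sum = peak + math.log(sum(math.exp(x - peak) for x in logs))
    return log_sum / math.log(10)


# Hypothetical example: two methods each select 1,000 genes and share 950 of them.
log10_p = log10_hypergeom_tail(total=5798, set_a=1000, set_b=1000, overlap=950)
print(f"log10(p) = {log10_p:.1f}")
```

Working with log10(p) directly gives a usable value where an ordinary floating-point p-value would print as 0.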