uc davis eve161 lecture 11 by @phylogenomics

Slides for UC Davis EVE161 Course Taught by Jonathan Eisen Winter 2014 Lecture 10: EVE 161: Microbial Phylogenomics Lecture #10: Era III: Genome Sequencing UC Davis, Winter 2014 Instructor: Jonathan Eisen 1

Upload: jonathan-eisen

Post on 10-May-2015

Category:

Education


TRANSCRIPT

Page 1: UC Davis EVE161 Lecture 11 by @phylogenomics

Slides for UC Davis EVE161 Course Taught by Jonathan Eisen Winter 2014

Lecture 10:

EVE 161: Microbial Phylogenomics

Lecture #10:

Era III: Genome Sequencing

UC Davis, Winter 2014 Instructor: Jonathan Eisen

Page 2: UC Davis EVE161 Lecture 11 by @phylogenomics

Where we are going and where we have been

• Previous lecture: 10: Genome Sequencing

• Current Lecture: 11: Genome Sequencing II

• Next Lecture: 12: Genome Sequencing III

Page 3: UC Davis EVE161 Lecture 11 by @phylogenomics

Comparative Genomics

Page 4: UC Davis EVE161 Lecture 11 by @phylogenomics

Structural Diversity

Page 5: UC Davis EVE161 Lecture 11 by @phylogenomics

Structural Diversity

In many organisms, there is a clear distinction in size between the chromosomes and the plasmids. However, as more complete genome sequences are determined, size is no longer an infallible criterion for distinguishing different types of genetic elements. For example, the halophilic archaeon Haloferax volcanii has five circular DNA elements with sizes of 2.92 Mb, 690 kb, 442 kb, 86 kb, and 6.4 kb (other examples are in Table 7.1). If size were the only criterion, we might consider the 690-kb element in H. volcanii to be a second chromosome because it is larger than the chromosome of B. aphidicola APS, for example. However, size is just a property that helps distinguish plasmids from chromosomes. More importantly, there are significant biological differences between plasmids and chromosomes. Some of these differences are discussed in the following sections.

Plasmids, unlike chromosomes, are generally “accessory” elements, carrying genes that are required only under certain conditions (Table 7.2). For example, the B. aphidicola APS plasmids encode genes needed to synthesize tryptophan and leucine, two of the amino acids that the bacteria provide for their host. The B. aphidicola APS chromosome encodes all the information for DNA replication, transcription, translation, cell-membrane and cell-wall formation, and the other genes required to assemble the core machinery of the cell. In E. coli O157:H7, the 92-kb plasmid encodes many virulence factors that contribute to the disease caused by this bacterium, whereas the chromosome encodes all the housekeeping functions. Because plasmids typically have only

170 Part II • THE ORIGIN AND DIVERSIFICATION OF LIFE

TABLE 7.1. Examples of bacteria with multiple genetic elements

Species                      Form               Size (kb)   Shape
Streptomyces coelicolor      Chromosome         8667        Linear
                             Plasmid            356         Linear
                             Plasmid            31          Circular
Agrobacterium tumefaciens    Chromosome         2842        Circular
                             Chromosome         2057        Linear
                             Plasmid            543         Circular
                             Plasmid            214         Circular
Borrelia burgdorferi         Chromosome         911         Linear
                             Plasmid (n = 11)   9–54        Circular/Linear
Brucella melitensis          Chromosome         2117        Circular
                             Chromosome         1178        Circular
Clostridium acetobutylicum   Chromosome         3941        Circular
                             Plasmid            192         Circular
Deinococcus radiodurans      Chromosome         2649        Circular
                             Plasmid            412         Circular
                             Plasmid            177         Circular
                             Plasmid            46          Circular
Ralstonia solanacearum       Chromosome         3716        Circular
                             Chromosome?        2095        Circular
Salmonella typhi             Chromosome         4809        Circular
                             Plasmid            218         Circular
                             Plasmid            107         Circular
Sinorhizobium meliloti       Chromosome         3654        Circular
                             Plasmid            1683        Circular
                             Plasmid            1354        Circular
Vibrio cholerae              Chromosome         2941        Circular
                             Chromosome         1072        Circular
Yersinia pestis              Chromosome         4654        Circular
                             Plasmid (n = 3)    10–96       Circular

Based on Bentley S.D. and Parkhill J. Annu. Rev. Genet. 38: 771–792, as adapted from Ohmachi M. 2002. Curr. Biol. 12: R427–428.
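The claim that size alone cannot separate plasmids from chromosomes can be checked against Table 7.1 itself. The following is a minimal sketch, not part of the original slides: the `ELEMENTS` list transcribes a subset of the table's rows, and the function name `plasmids_larger_than_smallest_chromosome` is hypothetical.

```python
# A subset of rows transcribed from Table 7.1: (species, form, size in kb).
# Entries given as ranges (e.g. the Borrelia plasmids) are left out.
ELEMENTS = [
    ("Streptomyces coelicolor", "Chromosome", 8667),
    ("Streptomyces coelicolor", "Plasmid", 356),
    ("Agrobacterium tumefaciens", "Chromosome", 2842),
    ("Agrobacterium tumefaciens", "Chromosome", 2057),
    ("Agrobacterium tumefaciens", "Plasmid", 543),
    ("Borrelia burgdorferi", "Chromosome", 911),
    ("Brucella melitensis", "Chromosome", 1178),
    ("Sinorhizobium meliloti", "Chromosome", 3654),
    ("Sinorhizobium meliloti", "Plasmid", 1683),
    ("Sinorhizobium meliloti", "Plasmid", 1354),
    ("Vibrio cholerae", "Chromosome", 2941),
    ("Vibrio cholerae", "Chromosome", 1072),
]


def plasmids_larger_than_smallest_chromosome(elements):
    """Return (species, size) for plasmids that exceed the smallest chromosome."""
    smallest = min(size for _, form, size in elements if form == "Chromosome")
    return [
        (species, size)
        for species, form, size in elements
        if form == "Plasmid" and size > smallest
    ]


print(plasmids_larger_than_smallest_chromosome(ELEMENTS))
```

In this subset, both Sinorhizobium meliloti plasmids are larger than the 911-kb Borrelia burgdorferi chromosome, so a pure size threshold would misclassify them.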

Page 6: UC Davis EVE161 Lecture 11 by @phylogenomics

What is a Plasmid

Chapter 7 • BACTERIAL AND ARCHAEAL GENETICS AND GENOMICS 171

accessory functions, an organism can usually survive without them provided it is not exposed to the specialized conditions for which the plasmids are needed. In turn, this means that plasmids are commonly lost from particular bacterial and archaeal strains.

In most species, only one copy of the chromosome (or, at most, a few) is present per cell. Plasmids, however, are frequently present in much greater copy number; sometimes there are hundreds of copies per cell. Allowing the copy number of plasmids to increase (while controlling chromosome copy number) in essence means that all of the genes on the plasmids undergo substantial gene duplication. For example, in B. aphidicola APS, the ratio of tryptophan and leucine plasmid number to chromosome number is greater than 10:1. This difference in copy number arises because plasmid and chromosomal replication are not coupled. Furthermore, the two frequently use entirely separate replication mechanisms. In addition, because plasmids and chromosomes use different replication systems, they frequently have different rates and patterns of mutation.

From an evolutionary point of view, the most important distinction between plasmids and chromosomes is the ease with which plasmids move between strains and even species. The mobility of plasmids plays a critical role in lateral gene transfer (see below). This transfer of plasmids results in very sporadic plasmid distribution patterns when different strains of one species or different species are compared.

In almost all species, there is only one chromosome and all other genetic elements are plasmids. There are, however, a few notable exceptions of bacteria with more than one chromosome. The causative agent of cholera, Vibrio cholerae, has two large genetic elements (2.9 and 1.1 Mb in size; see Table 7.1). Both encode multiple housekeeping genes and are found in all close relatives of this species (and thus do not have sporadic distribution patterns). Therefore, both elements qualify as chromosomes.

Agrobacterium tumefaciens, which causes crown gall tumors in plants, has an unusual pair of chromosomes: One is circular (as is typical for bacteria), but the other is linear. Once thought to be the exclusive province of eukaryotes, linear genetic elements have now been found in several species of bacteria.

Linear chromosomes are faced with a unique problem: DNA polymerases cannot replicate the ends of the chromosome, because the enzymes cannot replace the terminal RNA primer of the lagging strand (see Box 12.1). Without another mechanism for replicating the ends (i.e., the telomeres), linear chromosomes would become shorter with each round of replication. Eukaryotes use a specialized enzyme, telomerase, which adds a repeating DNA motif to the telomeres (see Fig. 8.17). Bacteria, like A. tumefaciens, appear

TABLE 7.2. Plasmid functions

Genetic Function of Plasmid   Gene Functions                                 Examples
Resistance                    Antibiotic resistance                          Rbk plasmid of Escherichia coli and other bacteria
Fertility                     Conjugation and DNA transfer                   F plasmid of E. coli
Killer                        Synthesis of toxins that kill other bacteria   Col plasmids of E. coli, for colicin production
Degradative                   Enzymes for metabolism of unusual molecules    TOL plasmid of Pseudomonas putida, for toluene metabolism
Virulence                     Pathogenicity                                  Ti plasmid of Agrobacterium tumefaciens, conferring the ability to cause crown gall disease on dicotyledonous plants

Page 7: UC Davis EVE161 Lecture 11 by @phylogenomics

Genome Size

Page 8: UC Davis EVE161 Lecture 11 by @phylogenomics

Genome Size

to use a similar mechanism for preserving the ends of their linear chromosomes. Furthermore, it appears that these replication systems arose independently in bacteria and eukaryotes and, thus, are an interesting example of convergent evolution.

Bacterial and Archaeal Genomes Are Much Smaller and More Compact Than Those of Eukaryotes

Bacterial and archaeal genomes are smaller than the vast majority of eukaryotic genomes (Fig. 7.1). Among bacteria, genomes range in size from 160 kb (the obligate symbiont Carsonella ruddii) to more than 13 Mb (the δ-proteobacterium Sorangium cellulosum). Archaeal genomes range from 490 kb (Nanoarchaeum equitans, a symbiotic species [Fig. 6.7]) to 5.7 Mb (the methanogen Methanosarcina acetivorans). The median genome size for both archaea and bacteria is approximately 2 Mb.

When comparing bacteria and archaea with eukaryotes, the difference in genome size is much greater than the difference in the number of genes. This is because the density of genes is very great within bacterial and archaeal genomes (Fig. 7.2). For example, the human genome is approximately 1000 times bigger than the E. coli K12 genome, yet humans have only about ten times as many protein-coding genes (Fig. 7.3). In fact, a number of bacteria and archaea have more protein-coding genes than some eukaryotes. Almost all species of Myxobacteria (a subgroup of fruiting-body-forming δ-proteobacteria, including S. cellulosum with the 13-Mb genome) have greater than 8000 protein-coding genes, which is more than in the model yeast species Saccharomyces cerevisiae and Schizosaccharomyces pombe.
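The human versus E. coli comparison implies a gene-density ratio that can be worked out from the two approximate ratios in the text alone. The sketch below uses only those two figures (1000-fold genome size, 10-fold gene count); the function name `relative_gene_density` is hypothetical and no absolute genome sizes are assumed.

```python
def relative_gene_density(genome_size_ratio, gene_count_ratio):
    """How many times denser in genes the smaller genome is.

    genome_size_ratio: larger genome size / smaller genome size
    gene_count_ratio:  larger genome gene count / smaller genome gene count
    """
    return genome_size_ratio / gene_count_ratio


# Human vs. E. coli K12, using the approximate ratios quoted in the text.
print(relative_gene_density(1000, 10))  # 100.0
```

Under these round numbers, E. coli K12 carries roughly 100 times as many protein-coding genes per unit of DNA as the human genome.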

The great density of genes within the genomes of bacteria and archaea is due to the paucity of noncoding DNA compared with that in eukaryotic genomes. Introns and intergenic regions (i.e., the DNA located between genes) are rare and generally small in bacteria and archaea. Instead, as mentioned in Chapter 6, many bacterial and archaeal genes are organized into operons, clusters of cotranscribed genes that use only a single promoter for the entire gene cluster. This organization helps to create a compact genome. The genes found in a single operon are usually involved in similar functions (e.g., the same metabolic pathway or a single-protein complex) (Fig. 7.4). Operons are a critical feature of the genomes of bacteria and archaea. For example, it is estimated that E. coli K12 has about 700 operons in its genome.

Eukaryotic genomes are bulky in part because they contain large numbers of repetitive DNA elements (Fig. 7.2). Common eukaryotic repetitive DNA elements include sim-

[Figure: genome size ranges, in number of base pairs on a logarithmic axis from 10^5 to 10^13, for Bacteria, Archaea, and Eukaryotes. Labeled examples include Nanoarchaeum equitans, Methanosarcina acetivorans, Escherichia coli, Bradyrhizobium japonicum, Myxobacteria, P. marius, Guillardia theta, Leishmania major, Schizosaccharomyces pombe, Paramecium tetraurelia, Arabidopsis thaliana, human, cockroach, moss, fern, and Amoeba dubia.]

FIGURE 7.1. Genome sizes in the three domains of life. A selection of genome sizes and size ranges from specific groups of organisms is indicated.

Page 9: UC Davis EVE161 Lecture 11 by @phylogenomics

Gene Density

Page 10: UC Davis EVE161 Lecture 11 by @phylogenomics

Gene Density

ple sequence repeats (e.g., microsatellites and minisatellites), gene duplications (both tandem arrays and pseudogenes), and transposable elements. Although bacterial and archaeal genomes contain repetitive DNA, the total amount is relatively small. For example, hundreds of thousands of copies of transposable elements are present in many eukaryotic genomes, yet in bacteria and archaea it is rare to have even 100 copies.

Pressure to Streamline Genomes Causes Bacteria and Archaea to Lose Genes Not Actively Maintained by Selection

To understand the evolution of bacterial and archaeal genomes, it is useful to ask why there is so much more noncoding DNA in most eukaryotic genomes. Clearly, some of the extra DNA in eukaryotes has important functions such as gene regulation. However, much of the noncoding DNA in eukaryotic genomes has been classified as either junk DNA or selfish DNA. Junk DNA appears to provide little benefit or no function to the organism. (In some cases this designation is a misnomer resulting from a lack of infor-

[Figure: two 50-kb map segments (scale 0 to 50 kb), (A) Human and (B) Escherichia coli, with a key marking genes, human pseudogenes, and repetitive DNA elements.]

FIGURE 7.2. Genome density. Comparison of the genome density and content of humans and Escherichia coli. Each segment is 50 kb in length and represents (A) a portion of the human β T-cell receptor locus and (B) a region of the E. coli K12 genome. Note the much greater proportion of genes (red boxes) in E. coli compared to humans.

[Figure: scatter plot of number of genes (0 to 30,000) against genome size (10^5 to 10^10, logarithmic axis) for bacteria, archaea, viruses, and eukaryotes.]

FIGURE 7.3. Genome size vs. number of protein-coding genes. The number of genes is highly correlated to genome size for bacteria, archaea, and viruses, but less so for eukaryotes. Many archaeal points (blue triangles) are hidden under bacterial ones (yellow squares).

Page 11: UC Davis EVE161 Lecture 11 by @phylogenomics

Number of genes

Page 12: UC Davis EVE161 Lecture 11 by @phylogenomics

Number of Genes

Page 13: UC Davis EVE161 Lecture 11 by @phylogenomics

Gene Arrangement

Page 14: UC Davis EVE161 Lecture 11 by @phylogenomics

Operons

[Figure: the lac operon, showing the CAP site, promoter, and operator upstream of lacZ, lacY, and lacA. Panel labels: β-galactosidase and transacetylase split lactose to galactose + glucose; lactose permease transports lactose into the cell. Chemical structures of lactose, galactose, and glucose are drawn.]

FIGURE 7.4. Lac operon from Escherichia coli. This operon consists of three genes whose transcription is regulated by a single promoter. The genes encode proteins involved in utilizing lactose, including a permease (encoded by lacY), which brings lactose into the cell from the outside, and two enzymes (encoded by lacZ and lacA), which split lactose into glucose + galactose (see pp. 52–53).

mation. Some stretches of “junk DNA” have been determined to be involved in gene regulation, chromatin organization, centromere activity, and other functions.) Selfish DNA is composed of mobile DNA elements that facilitate their own duplication, even if it is to the detriment of the host.

All of the many theories that have been proposed to explain why junk DNA and selfish DNA are less abundant in bacteria and archaea agree that there is some global pressure to keep total genome size small. This global pressure is most likely selection, although there may also be a bias toward deletion of DNA. Indeed, such a mechanism in bacteria and archaea could be responsible for keeping introns both small and rare, holding transposable elements in check, maintaining operons, and culling junk DNA. This global pressure and other theories on the evolution of genome size are discussed in more detail in Chapter 21. Here we discuss its effects on the general patterns of genomic evolution in bacteria and archaea.

The limited occurrence of introns in bacteria and archaea has many important consequences. For example, although eukaryotes can make thousands of protein products from a single gene by alternative splicing, this is not seen in bacteria and archaea. In addition, mixing and matching of protein domains is less common in bacteria and archaea than in eukaryotes, possibly because such events are caused mainly by recombination in introns.

The extensive use of operons also has significant consequences. In some respects, operons are a major constraint; mutations that break up the operon (e.g., by causing a rearrangement in the middle of the operon) may be quite detrimental. In other ways, operons facilitate rapid acquisition of new features by bacteria and archaea because they allow complete pathways to be transferred readily between strains or species (see the discussion of lateral transfer later in this chapter). In contrast, in many eukaryotes, with genes involved in the same pathway scattered around the genome, it is unlikely that all of the genes would be transferred at one time to another strain or species.

In bacteria and archaea, the pressure to streamline genomes (whether caused by mutation bias or selection for small genomes or both) means that genes that provide no advantage are rapidly lost (see Box 18.2). Thus, although vestigial genes may linger for long periods in eukaryotes, they do not linger in bacteria and archaea. For exam-

Page 15: UC Davis EVE161 Lecture 11 by @phylogenomics

Gene Content

Page 16: UC Davis EVE161 Lecture 11 by @phylogenomics

Shared Genes

Page 17: UC Davis EVE161 Lecture 11 by @phylogenomics

E. coli shared Genes

substantial variation in gene content among members of the same species have been reported in other lineages of bacteria and archaea. Thus, the diminishing number of core orthologous genes is simply an extension of something happening among close relatives.

How do such extensive differences in gene content among close relatives originate? One of the most important clues comes from comparing the genome structures of related species. (A graphical method for aligning circular genomes is introduced in Box 7.1; see Figs. 7.8 and 7.9.) In comparing E. coli K12 and O157:H7, the genes that are shared between the two strains not only are highly conserved at the sequence level; but

Graphical Alignment for Comparing Circular Genomes Using Dotplots

Comparing the arrangement of genomes is a critical tool for understanding how they evolve. This enables scientists to identify and characterize genome rearrangements (e.g., inversions and translocations) and to search for patterns and associations that may explain how and why certain events occur. For example, differences in gene order between species are frequently at sites where repetitive DNA is found, which suggests that recombination at the repetitive DNA may have led to rearrangements. One of the more useful methods compares two genomes on an x–y plot, a procedure commonly referred to as a dotplot.

Dotplots let people use their visual pattern-recognition skills to identify similarities. Their power and simplicity have made them a valuable analytical tool in fields beyond biology, including electrical engineering and computer science. Let us illustrate the method using some text-based examples. Figure 7.8A plots a familiar quotation against itself. The central diagonal line is the axis of identity. The outlying points represent text that repeats. A quick examination can distinguish a pattern that is repeated in its entirety (Fig. 7.8B) from one with some unique elements (Fig. 7.8C).
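The text-based dotplot described here can be sketched in a few lines. This is an illustrative reconstruction, not code from the course: the function names `dotplot` and `render` are hypothetical, tokens are whole words, and a match means exact equality.

```python
def dotplot(x_tokens, y_tokens):
    """Return a grid where cell [row][col] is True when y_tokens[row] == x_tokens[col]."""
    return [[x == y for x in x_tokens] for y in y_tokens]


def render(grid):
    """Draw the grid as text: '*' for a match, '.' for no match."""
    return "\n".join("".join("*" if cell else "." for cell in row) for row in grid)


words = "to be or not to be".split()
print(render(dotplot(words, words)))
```

For the quotation plotted against itself this prints:

```
*...*.
.*...*
..*...
...*..
*...*.
.*...*
```

Rows are printed top to bottom, so the axis of identity runs from the upper left to the lower right, whereas a plotted figure with its origin at the lower left shows it rising to the upper right. The off-diagonal marks are the repeated words "to" and "be".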

Because most bacterial and archaeal chromosomes are circular, a chromosome must first be “opened” before laying it out on the x- or y-axis. Although the circle can be linearized at any point, it is preferable to open each chromosome at its origin of replication (Fig. 7.9A). One linearized chromosome is then aligned along the x-axis with the origin of replication placed at the graphical origin. The other chromosome is similarly arranged along the y-axis. The two chromosomes are com-

[Figure: three-circle Venn diagram for E. coli strains MG1655 (K-12, nonpathogenic), EDL933 (O157:H7, enterohemorrhagic), and CFT073 (uropathogenic). Region counts as extracted: 585, 514, 204, 193, 2996, 1346, 1623.]

FIGURE 7.7. Number of shared proteins between strains of Escherichia coli. Note the large number of genes found in one strain but not the others (seen in the outer portions of each circle).

[Figure: three dotplots of text plotted against itself. (A) "to be or not to be"; (B) "a b c d e a b c d e"; (C) "a b c d e a b c y z d e".]

FIGURE 7.8. Dotplots of repeating text.

Box 7.1

Page 18: UC Davis EVE161 Lecture 11 by @phylogenomics

Gene Order

Page 19: UC Davis EVE161 Lecture 11 by @phylogenomics

Gene Order

Page 20: UC Davis EVE161 Lecture 11 by @phylogenomics

Page 21: UC Davis EVE161 Lecture 11 by @phylogenomics

Page 22: UC Davis EVE161 Lecture 11 by @phylogenomics

Origin of replication

Terminus of replication

Page 23: UC Davis EVE161 Lecture 11 by @phylogenomics

Origin of replication

Terminus of replication

Artificially Open Circle

Origin Terminus Origin Again
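Artificially opening the circle amounts to rotating the sequence so that it starts at the origin of replication. A minimal sketch of that step follows, assuming the chromosome is held as a Python string or list and the origin's index is already known; the function name `open_at_origin` is hypothetical.

```python
def open_at_origin(circular_seq, origin_index):
    """Linearize a circular sequence so that it begins at origin_index.

    Works on strings or lists; the index wraps around the circle.
    """
    if not circular_seq:
        return circular_seq
    origin_index %= len(circular_seq)
    return circular_seq[origin_index:] + circular_seq[:origin_index]


# Toy circle: 'O' marks the origin and 'T' the terminus.
print(open_at_origin("ccTddOaa", 5))  # OaaccTdd
```

The linearized form reads origin, terminus, and then wraps back toward the origin, which is the layout placed along each dotplot axis.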

Page 24: UC Davis EVE161 Lecture 11 by @phylogenomics

Origin of replication

Terminus of replication

Artificially Open Circle

Origin Terminus Origin Again

[Figure: dotplot axes. Genome 1 on the x-axis and Genome 2 on the y-axis, each running origin (O), terminus (T), origin (O).]

Page 25: UC Davis EVE161 Lecture 11 by @phylogenomics

Page 26: UC Davis EVE161 Lecture 11 by @phylogenomics

[Figure: dotplot of E. coli K12 against E. coli O157:H7, with circled regions labeled Island, Inversion, and Repeat.]

FIGURE 7.10. Conserved gene order in the backbone of Escherichia coli K12 and O157:H7. The two genomes were aligned with each other and the matching regions were plotted. The conserved order of genes in the backbone of the two E. coli strains is indicated by the diagonal line. Three important genomic regions are circled. An island present in one of the two strains causes a slight shift in the position of the main diagonal.

they also occur in virtually the same order in both strains (Fig. 7.10). The genes unique to each strain are clustered into “islands” interspersed among the stretches of common genes. Similar patterns of DNA “islands” within a conserved genome backbone have been found among other related bacteria or archaea.

How do these islands originate? There are two possibilities: insertion of DNA into the strain with the island or deletion of DNA in the strain without the island. Gene loss is very common and frequently very rapid in bacteria and archaea (e.g., Fig. 7.5). However, relying on gene loss alone to explain genomic islands becomes untenable as more and more species are compared. For example, when the genome of a third strain of E. coli was determined, it was found to have many additional islands that are absent from both K12 and O157:H7 (Fig. 7.7). For gene loss to explain all the islands in the various E. coli strains, their common ancestor would have required an enormous genome from which different regions were lost in different lineages. Indeed, such a mechanism would require ancestral species to have had bigger and bigger genomes further back in time. Thus genes must be acquired to offset gene loss. Acquisition of genes is one of the hallmarks of bacterial and archaeal evolution and is discussed on pages 182–191.
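The "enormous ancestor" argument can be made concrete with a rough calculation. The sketch below rests on assumptions that are not stated in the text: that the seven region counts in Figure 7.7 are 585, 514, 204, 193, 2996, 1346, and 1623, that 2996 is the central region shared by all three strains, and that each counted protein stands for one distinct gene. The function name `loss_only_ancestor_size` is hypothetical.

```python
# Assumed region counts of the three-strain Venn diagram in Fig. 7.7.
# Only their total matters here, not which count belongs to which region.
REGION_COUNTS = [585, 514, 204, 193, 2996, 1346, 1623]
CORE = 2996  # assumed: the region shared by all three strains


def loss_only_ancestor_size(region_counts):
    """If gene loss were the only process, the ancestor carried every gene
    seen in any descendant, i.e. the union of all Venn regions."""
    return sum(region_counts)


ancestor = loss_only_ancestor_size(REGION_COUNTS)
print(ancestor)                   # 7461
print(round(CORE / ancestor, 3))  # 0.402
```

Under these assumptions a loss-only ancestor would need about 7,461 proteins, of which only about 40% are shared by all three strains, and the required total would keep growing as more strains are added.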

Gene Order Changes Rapidly but with Strong Constraints

In addition to studying the location of genes found in one organism but not another, it is useful to compare the order of genes and other genomic features that are conserved between species. These comparisons reveal how genomes evolve and what the constraints are on the relative positioning of genes.

As with gene content, there is little conservation in the gene order between distantly related species (Fig. 7.11). Some sets of genes, however, are strongly conserved. The best example is the genes that encode many of the ribosomal proteins (Fig. 7.12). When such conservation occurs across such large evolutionary distances, it suggests that tightly coordinated regulation of transcription and translation is necessary for functionality. This is probably due in part to the coupling of transcription and translation in bacteria and archaea. In turn, the lack of coupling in eukaryotes may explain why there are few examples of gene-order conservation across such large distances.

When gene-order comparisons are made among closely related strains or species, rearrangements are frequently observed at sites of repetitive sequences such as transposons or duplicated genes (Fig. 7.13). Although repetitive DNA is less abundant in bacteria and archaea, it still plays a major role in genome evolution.

Comparing gene order among multiple sets of close relatives has revealed what types of rearrangements are most common. In bacteria and archaea, one of the most com-

Page 27: UC Davis EVE161 Lecture 11 by @phylogenomics

mon is symmetric inversion around the origin of replication (Fig. 7.14). Such inversions are seen in almost every comparison of moderately closely related strains or species. Although other rearrangements occur, the symmetric inversions serve as a useful tool for understanding some features of general evolution and we focus on them here.
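The effect of a symmetric inversion on gene coordinates can be shown with a toy model. The sketch below is an assumption-laden illustration, not the textbook's method: the chromosome is linearized with the origin at coordinate 0, the inverted segment extends `half_width` on each side of the origin, and the function name `symmetric_inversion` is hypothetical.

```python
def symmetric_inversion(position, genome_length, half_width):
    """Map a gene position through an inversion centered on the origin.

    Positions within half_width of the origin (coordinate 0, measured around
    the circle in either direction) are reflected to the other side of the
    origin; all other positions are unchanged.
    """
    distance_to_origin = min(position, genome_length - position)
    if distance_to_origin <= half_width:
        return (genome_length - position) % genome_length
    return position


# Toy genome of length 100 with an inversion reaching 20 units each way.
before = [5, 15, 50, 90]
after = [symmetric_inversion(p, 100, 20) for p in before]
print(list(zip(before, after)))  # [(5, 95), (15, 85), (50, 50), (90, 10)]
```

In this model the genes that keep their position sit on the main diagonal of a dotplot, while the inverted genes satisfy y = genome_length - x and fall on the opposite diagonal.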

Symmetric inversions around the origin are due to a combination of mutation bias and selection bias. To understand how mutation bias could cause this, it is helpful to understand some of the features of circular chromosome replication in bacteria and archaea. Replication of circular chromosomes almost always begins at a single region, referred to as the origin of replication. DNA replication proceeds bidirectionally from this origin, continuing until the replication forks collide on the other side of the DNA circle at the terminus of replication (Fig. 7.15). It is thought that the replication complex stands relatively still and the DNA is threaded through this complex, which would place the two replication forks close to each other. This threading can thus lead to symmetric inversions. If the DNA replication complexes were to slip and drop the DNA strands, they might restart replication by using the template from the opposite side of the origin to extend the recently replicated DNA, thereby causing an inversion. As the two replication forks should

[Figure: dotplot with the H. influenzae Rd chromosome (0 to 1,830,137) on the horizontal axis and the H. pylori 26695 chromosome (0 to 1,667,867) on the vertical axis.]

FIGURE 7.11. The lack of conservation of gene order between Haemophilus influenzae and Helicobacter pylori is illustrated. Linearized chromosomes of H. influenzae and H. pylori are plotted on the horizontal and vertical axes, respectively. Each dot represents a single pair of orthologous proteins. Genes in similar operons, which do exist, are too close together to give separated points on the scale used.

[Figure: ribosomal protein gene clusters (rpoBC, str, S10, spc, alpha) aligned across Sinorhizobium meliloti, Bacillus subtilis, Borrelia burgdorferi, Treponema pallidum, Helicobacter pylori, Escherichia coli, Haemophilus influenzae, Rickettsia prowazekii, Mycoplasma sp., Aquifex aeolicus, Thermotoga maritima, Deinococcus radiodurans, Mycobacterium tuberculosis, Chlamydia sp., Synechocystis, and Archaea. Gene order along the top: L11 (rplK), L1 (rplA), L10 (rplJ), L7/L12 (rplL), rpoB, rpoC, unknown, S12 (rpsL), S7 (rpsG), fusA, tufA, S10 (rpsJ), L3 (rplC), L4 (rplD), L23 (rplW), L2 (rplB), S19 (rpsS), L22 (rplY), S3 (rpsC), L16 (rplP), L29 (rpmC), S17 (rpsQ), L14 (rplN), L24 (rplX), L5 (rplE), S14 (rpsN), S8 (rpsH), L6 (rplF), L18 (rplR), S5 (rpsE), L30 (rpmD), L15 (rplO), secY, adk, map, infA, L36 (rpmJ), S13 (rpsM), S11 (rpsK), S4 (rpsD), rpoA, L17 (rplQ). Key: small-subunit r-protein genes, large-subunit r-protein genes, nonribosomal genes, unknown genes, breakpoint, gene insertion, Rho-independent terminator, missing gene.]

FIGURE 7.12. Conservation of gene order of ribosomal protein operons across bacterial and archaeal species.

Page 28: UC Davis EVE161 Lecture 11 by @phylogenomics

Slides for UC Davis EVE161 Course Taught by Jonathan Eisen Winter 2014

Chapter 7 • BACTERIAL AND ARCHAEAL GENETICS AND GENOMICS 179

mon is symmetric inversion around the origin of replication (Fig. 7.14). Such inversionsare seen in almost every comparison of moderately closely related strains or species. Al-though other rearrangements occur, the symmetric inversions serve as a useful tool forunderstanding some features of general evolution and we focus on them here.

Symmetric inversions around the origin are due to a combination of mutation biasand selection bias. To understand how mutation bias could cause this, it is helpful to un-derstand some of the features of circular chromosome replication in bacteria and archaea.Replication of circular chromosomes almost always begins at a single region—referred toas the origin of replication. DNA replication proceeds bidirectionally from this origin, con-tinuing until the replication forks collide on the other side of the DNA circle at the ter-minus of replication (Fig. 7.15). It is thought that the replication complex stands relativelystill and the DNA is threaded through this complex, which would place the two replica-tion forks close to each other. This threading can thus lead to symmetric inversions. If theDNA replication complexes were to slip and drop the DNA strands, they might restartreplication by using the template from the opposite side of the origin to extend the re-cently replicated DNA, thereby causing an inversion. As the two replication forks should

[Figure 7.11: dot plot. Horizontal axis: H. influenzae Rd chromosome (0 to 1,830,137). Vertical axis: H. pylori 26695 chromosome (0 to 1,667,867).]

FIGURE 7.11. The lack of conservation of gene order between Haemophilus influenzae and Helicobacter pylori is illustrated. Linearized chromosomes of H. influenzae and H. pylori are plotted on the horizontal and vertical axes, respectively. Each dot represents a single pair of orthologous proteins. Genes in similar operons, which do exist, are too close together to give separated points on the scale used.
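A dot plot of this kind can be sketched in a few lines of Python. The example below is only an illustration under stated assumptions: the gene names, positions, and ortholog pairs are hypothetical toy data, and `adjacency_conservation` is a simple score introduced here for illustration, not a method taken from the textbook or the lecture.

```python
from typing import Dict, List, Tuple


def dotplot_points(pos_a: Dict[str, int],
                   pos_b: Dict[str, int],
                   orthologs: List[Tuple[str, str]]) -> List[Tuple[int, int]]:
    """Return one (x, y) point per ortholog pair, sorted by position in genome A."""
    return sorted((pos_a[a], pos_b[b]) for a, b in orthologs)


def adjacency_conservation(points: List[Tuple[int, int]]) -> float:
    """Fraction of neighbors in genome A that are also neighbors in genome B.

    Assumes every y position is unique. Neighbors in either orientation count,
    so an inverted but otherwise intact block still scores as conserved.
    """
    if len(points) < 2:
        return 1.0
    ordered = sorted(points)
    # Replace raw coordinates in genome B by their rank (0, 1, 2, ...).
    y_rank = {y: rank for rank, y in enumerate(sorted(y for _, y in ordered))}
    ranks = [y_rank[y] for _, y in ordered]
    conserved = sum(1 for r1, r2 in zip(ranks, ranks[1:]) if abs(r1 - r2) == 1)
    return conserved / (len(ranks) - 1)


# Hypothetical toy data: four genes in genome A and their orthologs in genome B.
pos_a = {"a1": 100, "a2": 200, "a3": 300, "a4": 400}
pairs = [("a1", "b1"), ("a2", "b2"), ("a3", "b3"), ("a4", "b4")]

colinear_b = {"b1": 10, "b2": 20, "b3": 30, "b4": 40}
scrambled_b = {"b1": 30, "b2": 10, "b3": 40, "b4": 20}

colinear = adjacency_conservation(dotplot_points(pos_a, colinear_b, pairs))
scrambled = adjacency_conservation(dotplot_points(pos_a, scrambled_b, pairs))
```

With this toy data the colinear genome scores 1.0 and the scrambled one 0.0. Passing the points to any scatter-plot routine gives the dot plot itself.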

[Figure 7.12: gene-order diagram. Rows: Sinorhizobium meliloti, Bacillus subtilis, Borrelia burgdorferi, Treponema pallidum, Helicobacter pylori, Escherichia coli, Haemophilus influenzae, Rickettsia prowazekii, Mycoplasma sp., Aquifex aeolicus, Thermotoga maritima, Deinococcus radiodurans, Mycobacterium tuberculosis, Chlamydia sp., Synechocystis, Archaea. Operon clusters: rpoBC, str, S10, spc, alpha. Legend: small SU r-protein genes, large SU r-protein genes, nonribosomal genes, unknown genes, breakpoint, gene insertion, Rho-independent terminator, missing gene. Gene order shown: L11 (rplK), L1 (rplA), L10 (rplJ), L7/L12 (rplL), rpoB, rpoC, unknown, S12 (rpsL), S7 (rpsG), fusA, tufA, S10 (rpsJ), L3 (rplC), L4 (rplD), L23 (rplW), L2 (rplB), S19 (rpsS), L22 (rplY), S3 (rpsC), L16 (rplP), L29 (rpmC), S17 (rpsQ), L14 (rplN), L24 (rplX), L5 (rplE), S14 (rpsN), S8 (rpsH), L6 (rplF), L18 (rplR), S5 (rpsE), L30 (rpmD), L15 (rplO), secY, adk, map, infA, L36 (rpmJ), S13 (rpsM), S11 (rpsK), S4 (rpsD), rpoA, L17 (rplQ).]

FIGURE 7.12. Conservation of gene order of ribosomal protein operons across bacterial and archaeal species.


Page 29: UC Davis EVE161 Lecture 11 by @phylogenomics

Slides for UC Davis EVE161 Course Taught by Jonathan Eisen Winter 2014

Gene Order Again

Page 30: UC Davis EVE161 Lecture 11 by @phylogenomics

Slides for UC Davis EVE161 Course Taught by Jonathan Eisen Winter 2014

V. cholerae vs. E. coli All

[Dot plot. x-axis: V. cholerae Coordinates (ticks 0 to 3,000,000); y-axis: E. coli Coordinates (ticks 0 to 5,000,000).]

Eisen et al., 2000

Page 31: UC Davis EVE161 Lecture 11 by @phylogenomics

Slides for UC Davis EVE161 Course Taught by Jonathan Eisen Winter 2014

V. cholerae vs. E. coli Best

[Dot plot. x-axis: V. cholerae Coordinates (ticks 0 to 3,000,000); y-axis: E. coli Coordinates (ticks 0 to 5,000,000).]

Eisen et al., 2000

Page 32: UC Davis EVE161 Lecture 11 by @phylogenomics

Slides for UC Davis EVE161 Course Taught by Jonathan Eisen Winter 2014

V. cholerae vs. E. coli, Rotated

[Dot plot. x-axis: V. cholerae ORF Coordinates (ticks 0 to 3,000,000); y-axis: E. coli ORF Coordinates (ticks 0 to 5,000,000).]

Eisen et al., 2000

Page 33: UC Davis EVE161 Lecture 11 by @phylogenomics

Slides for UC Davis EVE161 Course Taught by Jonathan Eisen Winter 2014

Duplication and Gene Loss Model

Eisen et al., 2000

Page 34: UC Davis EVE161 Lecture 11 by @phylogenomics

Slides for UC Davis EVE161 Course Taught by Jonathan Eisen Winter 2014

V. cholerae vs. E. coli: Orthologs on Both Diagonals

[Dot plot. x-axis: V. cholerae ORF Coordinates (ticks 0 to 3,000,000); y-axis: E. coli ORF Coordinates (ticks 0 to 5,000,000).]

Eisen et al., 2000

Page 35: UC Davis EVE161 Lecture 11 by @phylogenomics

Slides for UC Davis EVE161 Course Taught by Jonathan Eisen Winter 2014

C. trachomatis vs C. pneumoniae

[Dot plot. x-axis: C. trachomatis MoPn; y-axis: C. pneumoniae AR39. The origin and terminus are marked.]

Page 36: UC Davis EVE161 Lecture 11 by @phylogenomics

Slides for UC Davis EVE161 Course Taught by Jonathan Eisen Winter 2014

Symmetric Inversion Model

[Schematic. Circular chromosomes carrying hypothetical genes 1 to 32 are drawn for the common ancestor of A and B and for each lineage at three time points (A1, A2, A3 and B1, B2, B3). The steps between time points are labeled "Inversion Around Terminus (*)" and "Inversion Around Origin (*)", with asterisks marking the inversion endpoints. Scatterplot panels compare the stages.]

Eisen et al., 2000

Page 37: UC Davis EVE161 Lecture 11 by @phylogenomics

Slides for UC Davis EVE161 Course Taught by Jonathan Eisen Winter 2014

The X-Files

[Dot plot panels: Streps; B. subt vs. Staph; M. tb vs. M. leprae (x-axis: Mycobacterium leprae, y-axis: Mycobacterium tuberculosis); Pyrococcus; Thermoplasmas; Pseudomonas.]

Page 38: UC Davis EVE161 Lecture 11 by @phylogenomics

Slides for UC Davis EVE161 Course Taught by Jonathan Eisen Winter 2014

[Figure 7.14 panels: (A) schematic of lineages A (A1, A2, A3) and B (B1, B2, B3) descending from the common ancestor of A and B, with steps labeled "Inversion around terminus (*)" and "Inversion around origin (*)" and chromosomes carrying hypothetical genes 1 to 32; (B) dot plot of V. cholerae chromosome I against V. parahaemolyticus chromosome I; (C) dot plot of V. cholerae chromosome I against Escherichia coli.]

FIGURE 7.14. X-alignments. (A) Schematic model of symmetric genome inversions. The model shows an initial speciation event, followed by a series of inversions in the different lineages (A and B). Inversions occur between the asterisks (*). Numbers on the chromosome refer to hypothetical genes 1–32. At time point 1, the genomes of the two species are still colinear (as indicated in the scatterplot of A1 vs. B1). Between time point 1 and time point 2, each species (A and B) undergoes a large inversion about the terminus (as indicated in the scatterplots of A1 vs. A2 and B1 vs. B2). This results in the between-species scatterplot looking as if there have been two nested inversions (A2 vs. B2). Between time point 2 and time point 3, each species undergoes an additional inversion (as indicated in the scatterplots of B2 vs. B3 and A2 vs. A3). This results in the between-species scatterplots beginning to resemble an X-alignment. (B) X-like alignment in dotplot of the main chromosomes of Vibrio cholerae (x-axis) and Vibrio parahaemolyticus (y-axis). (C) A weak X-like pattern exists even when comparing more distantly related species, in this case V. cholerae and E. coli. An X-like pattern indicates that the distance of a gene from the origin is conserved, but the side of the origin on which it is located is not conserved.
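The symmetric inversion model can be explored with a small simulation. The sketch below is a toy model under stated assumptions, not the analysis from Eisen et al., 2000: the chromosome is a Python list read clockwise from the origin, the terminus sits exactly halfway around (so the gene count must be even), and every inversion is perfectly symmetric. The function names are hypothetical.

```python
import random
from typing import List, Tuple


def symmetric_inversion(genome: List[int], k: int, around: str = "origin") -> List[int]:
    """Invert a segment of 2*k genes centered on the origin or the terminus.

    The origin lies between the last and first list elements; the terminus
    lies between the two middle elements.
    """
    n = len(genome)
    if n % 2 != 0:
        raise ValueError("this toy model needs an even number of genes")
    if not 0 < k <= n // 2:
        raise ValueError("k must be between 1 and half the genome length")
    result = genome[:]  # copy, so the input genome is left untouched
    if around == "origin":
        # The segment spans the end and the start of the list.
        result[:k] = genome[n - k:][::-1]
        result[n - k:] = genome[:k][::-1]
    elif around == "terminus":
        mid = n // 2
        result[mid - k:mid + k] = genome[mid - k:mid + k][::-1]
    else:
        raise ValueError("around must be 'origin' or 'terminus'")
    return result


def evolve(genome: List[int], steps: int, rng: random.Random) -> List[int]:
    """Apply a series of random symmetric inversions to one lineage."""
    for _ in range(steps):
        k = rng.randint(1, len(genome) // 2 - 1)
        genome = symmetric_inversion(genome, k, rng.choice(["origin", "terminus"]))
    return genome


def simulate_lineages(n_genes: int = 32, steps: int = 3,
                      seed: int = 0) -> Tuple[List[int], List[int]]:
    """Evolve two lineages (A and B) independently from one ancestor."""
    rng = random.Random(seed)
    ancestor = list(range(1, n_genes + 1))
    return evolve(ancestor, steps, rng), evolve(ancestor, steps, rng)


def dotplot(a: List[int], b: List[int]) -> List[Tuple[int, int]]:
    """One (index in A, index in B) point per shared gene."""
    index_in_b = {gene: i for i, gene in enumerate(b)}
    return [(i, index_in_b[gene]) for i, gene in enumerate(a)]


lineage_a, lineage_b = simulate_lineages()
points = dotplot(lineage_a, lineage_b)
# Every point falls on the diagonal or the anti-diagonal, which together form an X.
on_x = all(y == x or y == len(lineage_a) - 1 - x for x, y in points)
```

In this model each inversion moves a gene from index i to index n - 1 - i, so a gene keeps its distance from the origin and only switches sides. That is why all points land on one of the two diagonals.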

Chapter 7 • BACTERIAL AND ARCHAEAL GENETICS AND GENOMICS 181


Page 39: UC Davis EVE161 Lecture 11 by @phylogenomics

Slides for UC Davis EVE161 Course Taught by Jonathan Eisen Winter 2014

Gene Loss

Chapter 7 • BACTERIAL AND ARCHAEAL GENETICS AND GENOMICS 175

ple, B. aphidicola APS has undergone a massive reduction in its genome since it shared a common ancestor with E. coli (Fig. 7.5). This symbiont lives inside aphid cells where many genes required for the free-living lifestyle of E. coli are not needed.

Gene Content Is in Constant Flux in Bacteria and Archaea

The availability of hundreds of complete genome sequences enables scientists to examine how gene content evolves. The first analysis of this sort was performed using the first two sequenced genomes: M. genitalium and H. influenzae. Despite the fact that both species have very small genomes, hundreds of homologous genes were identified. These shared genes were proposed to be the “minimal gene set” of a bacterium; that is, they might represent the genes that are essential for making a bacterium (Fig. 7.6A).

However, as more genomes from different phylogenetic groups have been sequenced, the number of “core” homologous genes has diminished (Fig. 7.6B). The reason for this became clear when genomes from different strains of the same species were compared. This was first done with the pathogenic strain E. coli (O157:H7) and the E. coli K12 laboratory strain. Although these strains share approximately 4000 highly conserved genes, O157:H7 has more than 1000 genes not found in K12, and K12 has approximately 500 genes absent from O157:H7 (Fig. 7.7). Similar patterns of

[Figure 7.5: aligned gene maps in two rows, Ancestor and Buchnera, with a 10 kb scale bar. Gene labels shown include adk, htpG, recR, ybaB, dnaX, apt, priC, ybaM, aefA, acrR, acrA, acrB, RNA-ffs, amtB, ybaE, ybaX, ybaW, ybaV, ybaZ, ybaY, tesB, ginK, mdlB, mdlA, ybaO, ybaU, hupB, clpX, clpP, tig, bolA, cof, ybaN.]

FIGURE 7.5. Genome reduction in Buchnera endosymbionts of aphids. A fragment of two genomes is shown. (Top row) The putative ancestor of all aphid endosymbionts in the Buchnera genus. (Bottom row) The genome of the symbionts today. The massive amounts of gene loss are indicated by the genes colored white in the ancestral genome that are missing from the modern genome below. Orthologous genes between the two genomes are shown in the same color. Note the conservation of gene order between the two genomes despite the gene loss. The direction of gene transcription is indicated by the gene box being shifted above or below the black line.

[Figure 7.6: Venn diagrams. (A) Mycoplasma genitalium, 468 genes; Haemophilus influenzae, 1703 genes; 240 shared genes. (B) 80 shared genes.]

FIGURE 7.6. (A) Comparison of predicted protein-coding genes in the first two completed genomes Haemophilus influenzae and Mycoplasma genitalium. Approximately 240 genes are shared between the two species. (B) Comparison of the predicted protein-coding genes of the first 25 bacterial genomes (not all 25 circles are shown). Note that only about 80 genes can be identified as being shared among all of these species.
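The shrinking "core" (about 240 shared genes for two genomes, about 80 for 25) is what repeated set intersection produces. The following is a minimal sketch with hypothetical toy genomes; the genome names and gene sets are invented for illustration, and real comparisons work on homology clusters, not on matching gene names.

```python
from typing import Dict, List, Set


def core_genes(genomes: Dict[str, Set[str]]) -> Set[str]:
    """Genes (or gene families) present in every genome. Needs at least one genome."""
    return set.intersection(*genomes.values())


def core_size_curve(genomes: Dict[str, Set[str]], order: List[str]) -> List[int]:
    """Size of the shared gene set after adding each genome in the given order."""
    sizes = []
    core = None
    for name in order:
        core = set(genomes[name]) if core is None else core & genomes[name]
        sizes.append(len(core))
    return sizes


def strain_specific(a: Set[str], b: Set[str]) -> Set[str]:
    """Genes found in strain a but absent from strain b."""
    return a - b


# Hypothetical toy gene content; not real annotations.
toy_genomes = {
    "genome_1": {"rpoB", "dnaA", "recA", "gyrA", "ftsZ"},
    "genome_2": {"rpoB", "dnaA", "recA", "gyrA", "lacZ"},
    "genome_3": {"rpoB", "dnaA", "recA", "nifH"},
}

curve = core_size_curve(toy_genomes, ["genome_1", "genome_2", "genome_3"])
shared = core_genes(toy_genomes)
only_in_1 = strain_specific(toy_genomes["genome_1"], toy_genomes["genome_2"])
```

Here `curve` falls from 5 to 4 to 3 as genomes are added, and `only_in_1` plays the role of the strain-specific genes in the O157:H7 versus K12 comparison.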


Page 40: UC Davis EVE161 Lecture 11 by @phylogenomics

Slides for UC Davis EVE161 Course Taught by Jonathan Eisen Winter 2014

Gene Duplication

Page 41: UC Davis EVE161 Lecture 11 by @phylogenomics

Slides for UC Davis EVE161 Course Taught by Jonathan Eisen Winter 2014

Why Duplications Are Useful to Identify

• Allows division into orthologs and paralogs

• Improves functional predictions

• Helps identify mechanisms of duplication

• Can be used to study mutation processes in different parts of a genome

• Lineage-specific duplications may be indicative of species-specific adaptations
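One simple way to flag candidate lineage-specific duplications can be sketched as follows. This is a heuristic assumed here for illustration, not the method used in the lecture: two genes in the same genome are reported when they are more similar to each other than either is to any gene in another genome. The gene names, genome labels, and similarity scores are hypothetical.

```python
from typing import Dict, List, Tuple

Scores = Dict[Tuple[str, str], float]


def get_score(scores: Scores, a: str, b: str) -> float:
    """Similarity of two genes, looked up in either order; 0.0 if no hit."""
    return scores.get((a, b), scores.get((b, a), 0.0))


def lineage_specific_paralogs(genome_of: Dict[str, str],
                              scores: Scores) -> List[Tuple[str, str]]:
    """Within-genome pairs that are closer to each other than to any outside gene."""
    genes = sorted(genome_of)
    result = []
    for i, a in enumerate(genes):
        for b in genes[i + 1:]:
            if genome_of[a] != genome_of[b]:
                continue  # different genomes: not paralogs in this sketch
            within = get_score(scores, a, b)
            if within <= 0.0:
                continue  # no detectable similarity between the two genes
            best_outside = max(
                (max(get_score(scores, a, o), get_score(scores, b, o))
                 for o in genes if genome_of[o] != genome_of[a]),
                default=0.0,
            )
            if within > best_outside:
                result.append((a, b))
    return result


# Hypothetical toy data: three genes in genome "Cpn" and one in genome "Ctr".
genome_of = {"cp1": "Cpn", "cp2": "Cpn", "cp3": "Cpn", "ct1": "Ctr"}
scores = {
    ("cp1", "cp2"): 0.9,  # recent duplicates: very similar to each other
    ("cp1", "ct1"): 0.6,
    ("cp2", "ct1"): 0.6,
    ("cp3", "ct1"): 0.8,  # cp3 is closest to the gene in the other genome
    ("cp1", "cp3"): 0.3,
    ("cp2", "cp3"): 0.3,
}

candidates = lineage_specific_paralogs(genome_of, scores)
```

With this toy data only the pair (cp1, cp2) is reported. A phylogenetic tree of the gene family is the more reliable way to separate orthologs from paralogs; the score comparison is only a quick first pass.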

Page 42: UC Davis EVE161 Lecture 11 by @phylogenomics

Slides for UC Davis EVE161 Course Taught by Jonathan Eisen Winter 2014

Page 43: UC Davis EVE161 Lecture 11 by @phylogenomics

Slides for UC Davis EVE161 Course Taught by Jonathan Eisen Winter 2014

C. pneumoniae - All Paralogs

[Dot plot. x-axis: Query ORF Position (ticks 0 to 1,250,000); y-axis: Subject ORF Position (ticks 0 to 1,250,000).]

Page 44: UC Davis EVE161 Lecture 11 by @phylogenomics

Slides for UC Davis EVE161 Course Taught by Jonathan Eisen Winter 2014

C. pneumoniae Lineage-Specific Paralogs

[Dot plot. x-axis: Query ORF Position (ticks 0 to 1,250,000); y-axis: Subject ORF Position (ticks 0 to 1,250,000).]

Page 45: UC Davis EVE161 Lecture 11 by @phylogenomics

Slides for UC Davis EVE161 Course Taught by Jonathan Eisen Winter 2014

Expansion of MCP Family in V. cholerae

[Neighbor-joining (NJ) tree of MCP (methyl-accepting chemotaxis protein) family members from E. coli, B. subtilis, Synechocystis sp., H. pylori, C. jejuni, A. fulgidus, T. maritima, T. pallidum, B. burgdorferi, P. abyssi, P. horikoshii, D. radiodurans, M. tuberculosis, and V. cholerae. The V. cholerae proteins (VC and VCA locus tags) make up the largest set of leaves. Asterisks mark branches in the tree.]

Heidelberg et al. (2000)

Page 46: UC Davis EVE161 Lecture 11 by @phylogenomics

Slides for UC Davis EVE161 Course Taught by Jonathan Eisen Winter 2014

After the Genomes

• Better analysis and annotation

• Comparative genomics

• Functional genomics (Experimental analysis of gene function on a genome scale)

• Genome-wide gene expression studies

• Proteomics

• Genome-wide genetic experiments