genome composition

Post on 09-Jan-2016

DESCRIPTION

Genome Composition. Dan Graur. Genome Composition in Bacteria. Carsonella ruddii has a very low GC content. The selectionist explanation views GC content as an adaptation. G:C pairs are more stable than A:T pairs.

TRANSCRIPT

1

Genome Composition

Dan Graur

2

Genome Composition in Bacteria

Carsonella ruddii has a very low GC content.

4

5

The selectionist explanation views GC content as an adaptation.

G:C pairs are more stable than A:T pairs.

Preferential usage of amino acids encoded by GC-rich codons (e.g., ala and arg) and avoidance of amino acids encoded by GC-poor codons (e.g., ser and lys).

Adjacent thymines (T-T) are sensitive to UV radiation, which can fuse them into thymine dimers.

No empirical evidence

6

The mutationist explanation

Rate of substitution G/C → T/A is μ.

Rate of substitution T/A → G/C is ν.

Noboru Sueoka, University of Colorado

7

at equilibrium: $P_{GC} = \frac{\nu}{\nu + \mu}$
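The equilibrium value can be recovered from a simple two-state sketch. It assumes, consistently with the equilibrium formula, that μ is the per-site rate of G/C → T/A substitution and ν the rate of T/A → G/C substitution. The time variable t is introduced here only for illustration. Gains of GC come from the AT fraction and losses from the GC fraction, and the two balance at equilibrium:

```latex
\frac{dP_{GC}}{dt} = \nu\,(1 - P_{GC}) - \mu\,P_{GC} = 0
\quad\Longrightarrow\quad
P_{GC} = \frac{\nu}{\nu + \mu}
```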

8

GC mutational pressure: $\frac{\mu}{\nu} = \frac{1 - P_{GC}}{P_{GC}}$

9

$\frac{\mu}{\nu} = 3 \Rightarrow$ 25% GC (Mycoplasma capricolum)

$\frac{\mu}{\nu} = 1 \Rightarrow$ 50% GC (Escherichia coli)

$\frac{\mu}{\nu} = 0.33 \Rightarrow$ 75% GC (Micrococcus luteus)
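The three ratios above can be checked numerically. This is a minimal sketch that rearranges the equilibrium formula as P_GC = 1 / (1 + μ/ν); the function name `equilibrium_gc` is an illustrative choice, not something from the slides.

```python
def equilibrium_gc(mu_over_nu: float) -> float:
    """Equilibrium GC fraction for a given mu/nu ratio: P_GC = 1 / (1 + mu/nu)."""
    if mu_over_nu < 0:
        raise ValueError("mu/nu must be non-negative")
    return 1.0 / (1.0 + mu_over_nu)


# Ratios quoted on the slide (0.33 is taken to mean 1/3).
for ratio in (3.0, 1.0, 1.0 / 3.0):
    print(f"mu/nu = {ratio:.2f} -> {100 * equilibrium_gc(ratio):.0f}% GC")
```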

10

11

Differences in the way the leading and lagging strands of DNA are replicated can result in strand-dependent mutation patterns.

The expectation under no-strand-bias conditions is

fA = fT and fC = fG

12

Deviations from equal mutation rates between the two strands are quantified by the skew.

13

$S_{X=Y} = \frac{f_X - f_Y}{f_X + f_Y}$

The skew is a measure of inequality between the frequencies of nucleotides X and Y on a strand.
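As a rough illustration, the skew of one strand can be computed directly from nucleotide counts, since the sequence length cancels in the ratio of frequencies. The function name `skew` and the toy sequences are assumptions made for this sketch.

```python
def skew(seq: str, x: str, y: str) -> float:
    """Return (f_X - f_Y) / (f_X + f_Y) for nucleotides x and y on one strand."""
    seq = seq.upper()
    n_x = seq.count(x.upper())
    n_y = seq.count(y.upper())
    if n_x + n_y == 0:
        raise ValueError("neither nucleotide occurs in the sequence")
    # Counts can replace frequencies because the sequence length cancels.
    return (n_x - n_y) / (n_x + n_y)


print(skew("GGGCAT", "G", "C"))  # 3 G and 1 C -> (3 - 1) / (3 + 1) = 0.5
print(skew("ATAT", "A", "T"))    # equal counts -> 0.0
```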

14

If there are no violations of the no-strand-bias conditions:

$S_{X=Y} = 0$

15

Skew values are calculated for sliding windows of predetermined lengths, and are plotted on a skew diagram.
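A sliding-window calculation of this kind might look like the sketch below. The window and step sizes, the function name `sliding_skew`, and the toy sequence are all illustrative assumptions; real skew diagrams use windows of thousands of bases and a plotting library.

```python
def sliding_skew(seq, x="G", y="C", window=1000, step=500):
    """Yield (window_start, skew) pairs for successive windows along seq."""
    seq, x, y = seq.upper(), x.upper(), y.upper()
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        n_x, n_y = chunk.count(x), chunk.count(y)
        total = n_x + n_y
        # A window containing neither nucleotide is given a skew of 0.0 here.
        yield start, ((n_x - n_y) / total if total else 0.0)


# Toy sequence: a G-rich half followed by a C-rich half, so the skew changes sign.
toy = "GGGC" * 5 + "GCCC" * 5
points = list(sliding_skew(toy, window=8, step=4))
for start, value in points:
    print(start, value)
```

Plotting `value` against `start` gives the skew diagram; the position where the sign flips marks the boundary between the two toy regions.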

16

Bacillus subtilis

Figure labels: chirochore, chirochore

17

18

Chlamydia trachomatis

19

Compositional Properties of Eukaryotic Genomes

20

GC content of bacterial genomes ranges from ~24% to ~74%

Intergenomic variability

GC content of vertebrate genomes ranges from ~40% to ~45%

21

TTGACCGATGACCCCGGTTCAGGCTTCACCACAGTGTGGAACGCGGTCGTCTCCGAACTTAACGGCGACCCTAAGGTTGACGACGGACCCAGCAGTGATGCTAATCTCAGCGCTCCGCTGACCCCTCAGCAAAGGGCTTGGCTCAATCTCGTCCAGCCATTGACCATCGTCGAGGGGTTTGCTCTGTTATCCGTGCCGAGCAGCTTTGTCCAAAACGAAATCGAGCGCCATCTGCGGGCCCCGATTACCGACGCTCTCAGCCGCCGACTCGGACATCAGATCCAACTCGGGGTCCGCATCGCTCCGCCGGCGACCGACGAAGCCGACGACACTACCGTGCCGCCTTCCGAGAGATTGATGACAGCGCTGCGGCACGGGGCGATAACCAGCACAGTTGGCCAAGTTACTTCACCGAGCGCCCGCACAATACCGATTCCGCTACCGCTGGCGTAACCAGCCTTAACCGTCGCTACACCTTTGATACGTTCGTTATCGGCGCCTCCAACCGGTTCGCGCACGCCGCCGCCTTGGCGATCGCAGAAGCACCCGCCCGCGCTTACAACCCCCTGTTCATCTGGGGCGAGTCCGGTCTCGGCAAGACACACCTGCTACACGCGGCAGGCAACTATGCCCAACGGTTGTTCCCGGGAATGCGGGTCAAATATGTCTCCACCGAGGAATTCACCAACGACTTCATTAACTCGCTCCGCGATGACCGCAAGGTCGCATTCAAACGCAGCTACCGCGACGTAGACGTGCTGTTGGTCGACGACATCCAATTCATTGAAGGCAAAGAGGGTATTCAAGAGGAGTTCTTCCACACCTTCAACACCTTGCACAATGCCAACAAGCAAATCGTCATCTCATCTGACCGCCCACCCAAGCAGCTCGCCACCCTCGAGGACCGGCTGAGAACCCGCTTTGAGTGGGGGCTGATCACTGACGTACAACCACCCGAGCTGGAGACCCGCATCGCCATCTTGCGCAAGAAAGCACAGATGGAACGGCTCGCGGTCCCCGACGATGTCCTCGAACTCATCGCCAGCAGTATCGAACGCAATATCCGTGAACTCGAGGCCGAGGAATTCACCAACGACTTCATTAACTCGCTCCGCGATGACCGCAAGGTCGCATTCAAACGCAGCTACCGCGACGTAGACGTGCTGTTGGTCGACGACATCCAATTCATTGAAGGCAAAG

Interspecific variation among vertebrate genomes is low. However, vertebrates seem to have a much more complex intragenomic compositional organization (internal structure) than prokaryotic genomes.

22


How are nucleotides distributed along the genome? Uniform? Patchy? Clines?
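One simple way to look at this question is to compute GC content in consecutive windows along a sequence. The sketch below is illustrative only: the names `gc_content` and `gc_profile`, the window size, and the toy sequence are assumptions, and real analyses use windows of kilobases.

```python
def gc_content(seq: str) -> float:
    """Fraction of G and C among the A, C, G, T characters of seq."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    acgt = gc + seq.count("A") + seq.count("T")
    if acgt == 0:
        raise ValueError("sequence contains no A, C, G or T")
    return gc / acgt


def gc_profile(seq: str, window: int, step=None):
    """Return [(window_start, gc_fraction), ...]; windows do not overlap by default."""
    step = step or window
    return [
        (start, gc_content(seq[start:start + window]))
        for start in range(0, len(seq) - window + 1, step)
    ]


# Toy "patchy" sequence: an AT-only block, a GC-only block, then a mixed block.
toy = "ATATATAT" + "GCGCGCGC" + "ATGCATGC"
for start, gc in gc_profile(toy, window=8):
    print(start, gc)
```

A flat profile would suggest a uniform distribution, abrupt jumps a patchy one, and a steady drift a cline.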

23

“When vertebrate genomic DNA is randomly sheared into fragments 30-100 kb in size and the fragments are separated by base composition, the fragments cluster into a small number of classes distinguished from each other by their GC content. Each class is characterized by bands of similar, but not identical, base compositions.”

(Macaya et al. 1976; Thiery et al. 1976; Bernardi et al. 1985)

Equilibrium centrifugation in Cs2SO4 density gradient

24

carp

25

The Isochore Theory - Giorgio Bernardi

carp

26

27

Isochores do not merit the prefix “iso.”

Lander et al. (2001)

28

Post genomic era (2001)

Objections against the isochore theory:

“We can rule out a strict notion of isochores as compositionally homogeneous.” Lander et al. (2001)

“There are no isochores in chromosomes 21 and 22.” Häring and Kypr (2001)

Defense of the isochore theory:

“The conclusion of the authors that ‘isochores’ are not ‘strict isochores’ is correct, however isochores are fairly homogeneous regions.” Bernardi (2001)

29

30

In search of isochores…

Questions:

• Do isochores exist?

• Is the isochore theory a useful (or practical) concept?

31

Segmentation Models

• Assumption: Sequences can be partitioned into a number of segments each with a characteristic GC content.

• Each segment has a certain degree of internal homogeneity (or similarity).
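To make the idea of partitioning concrete, here is a minimal sketch of a single segmentation step. It uses one possible criterion, the Jensen-Shannon divergence between the GC compositions of the left and right parts; this is a technique swapped in for illustration, and the slides do not state which criterion was used. The names `entropy` and `best_split` are hypothetical.

```python
import math


def entropy(p: float) -> float:
    """Shannon entropy (bits) of a two-symbol source with P(GC) = p."""
    if p in (0.0, 1.0):
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))


def best_split(seq: str):
    """Return (position, divergence) of the cut that maximizes the
    Jensen-Shannon divergence between the two resulting segments."""
    is_gc = [1 if base in "GC" else 0 for base in seq.upper()]
    n = len(is_gc)
    if n < 2:
        raise ValueError("need at least two bases to split")
    total_gc = sum(is_gc)
    h_whole = entropy(total_gc / n)
    best_pos, best_div, left_gc = None, 0.0, 0
    for i in range(1, n):  # cut between positions i-1 and i
        left_gc += is_gc[i - 1]
        p_left = left_gc / i
        p_right = (total_gc - left_gc) / (n - i)
        # Entropy of the whole minus the length-weighted entropies of the parts.
        div = h_whole - (i / n) * entropy(p_left) - ((n - i) / n) * entropy(p_right)
        if div > best_div:
            best_pos, best_div = i, div
    return best_pos, best_div


print(best_split("ATATATATAT" + "GCGCGCGCGC"))  # cut at position 10
```

Applying the same step recursively to each part, with some stopping rule, would yield a full partition into segments with characteristic GC contents.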

32

In search of isochores…

Methodology:

• Rigorously define 6 attributes of isochores and of the isochore theory as applied to humans.

• Test the attributes against the human genome data.

33

Attributes of isochores

A1. Distinguishability: An isochore is a DNA segment that has a characteristic GC content that differs significantly from the GC content of adjacent isochores.

A2. Homogeneity: An isochore is more homogeneous in its composition than the chromosome on which it resides.

A3. Minimum length: The length of an isochore exceeds a certain cutoff value. In the literature, the most commonly mentioned value is 300 kb.

34

Attributes of the isochore theory in humans

A4. Genome coverage: The overwhelming majority of the human genome consists of segments abiding by A1-A3. Non-isochoric DNA takes up only a small fraction of the genome.

35

Attributes of the isochore theory in humans

A5. Isochore families: The human genome comprises five isochore families, each described by a particular Gaussian distribution of GC content.

36

Practicality of the isochore theory

A6. Isochore assignment into families: It is possible to classify each isochore into its isochore family based solely on its compositional properties.

37

Segment length distribution

The fitted regression line (solid line) indicates that the tail of the distribution exhibits power-law decay with an exponent of –2.38.

$P(L) \propto L^{-2.38}$
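A power-law exponent of this kind can be estimated as the slope of a straight line on log-log axes. The sketch below uses ordinary least squares on synthetic data generated with the exponent quoted on the slide; the function name and the data points are assumptions, and the slides do not say exactly how the regression was done.

```python
import math


def fit_power_law_exponent(lengths, freqs):
    """Least-squares slope of log10(freq) against log10(length)."""
    xs = [math.log10(length) for length in lengths]
    ys = [math.log10(freq) for freq in freqs]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Synthetic, noise-free data following P(L) proportional to L^-2.38.
lengths = [10_000, 30_000, 100_000, 300_000, 1_000_000]
freqs = [length ** -2.38 for length in lengths]
print(round(fit_power_law_exponent(lengths, freqs), 2))  # -2.38
```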

38

Power laws everywhere!

39

Isochore families

[Figure: most parsimonious Gaussian fit to putative isochores; labels 1 to 4]

40

Homogeneous “isochores” in vertebrates

41

Assignment into families

Classification errors reach values of 70%. Only a minute fraction of segments can be classified with an expected error under 5%.
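Why overlapping Gaussian families lead to large classification errors can be seen with a small sketch. Each family is modeled as a weighted normal distribution of GC content, and the expected error for a segment is one minus the posterior probability of its most likely family. The two families and their parameters below are HYPOTHETICAL, chosen only to show the effect; they are not the fitted values from the slides.

```python
import math


def normal_pdf(x: float, mean: float, sd: float) -> float:
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2 * math.pi))


def assignment_error(gc: float, families: dict):
    """Return (best_family, expected_error) for a segment with GC fraction gc.

    families maps a name to (weight, mean, sd); the expected error is
    1 minus the posterior probability of the best-scoring family.
    """
    scores = {name: w * normal_pdf(gc, m, s) for name, (w, m, s) in families.items()}
    total = sum(scores.values())
    best = max(scores, key=scores.get)
    return best, 1.0 - scores[best] / total


# HYPOTHETICAL parameters: two equally weighted, strongly overlapping families.
families = {"low": (0.5, 0.38, 0.03), "high": (0.5, 0.46, 0.03)}
print(assignment_error(0.42, families))  # midway between the means: error near 0.5
print(assignment_error(0.30, families))  # far in one tail: error well under 0.05
```

Only segments whose GC content lies far from the overlap region can be assigned with a small expected error.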

42

Summary

(A1) Distinguishability

(A2) Homogeneity

(A3) Minimum length X

(A4) Genome coverage

(A5) Isochore families

(A6) Isochore assignment into families X

43

Conclusion:

The isochore theory may have reached the limits of its usefulness as a description of genomic compositional structures.

44

45

46

As of December 2004

17 genetic codes

11 mitochondrial

5 nuclear

1 nuclear + mitochondrial

47

Lock & Key Hypothesis

48

Frozen accidents

Evolutionary Dead Ends

49

50

The codon-capture hypothesis

Thomas Jukes

51

AAA = lysine

Universal genetic code

52

AAA = asparagine

Echinodermata

53

Hemichordata

AAA = unassigned
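The three assignments of AAA listed on these slides can be held in a small lookup table, which makes the point that the meaning of a codon depends on the genetic code in use. The dictionary keys and the function name are illustrative choices for this sketch.

```python
# Meaning of the codon AAA under three codes, as listed on the slides.
AAA_ASSIGNMENT = {
    "universal": "lysine",
    "Echinodermata": "asparagine",
    "Hemichordata": None,  # unassigned
}


def translate_aaa(code: str) -> str:
    """Return the amino acid encoded by AAA under the named code."""
    amino_acid = AAA_ASSIGNMENT[code]
    return amino_acid if amino_acid is not None else "unassigned"


for code in AAA_ASSIGNMENT:
    print(f"{code}: AAA = {translate_aaa(code)}")
```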

54
