[HMG] 04 - Gene Evolution


Post on 03-Jul-2020



Gene Evolution

Forces affecting genome evolution

Genome Evolution

Genome changes

• Mutation
• Recombination
• Transposition
• Gene transfer (e.g., between organelles and nuclear DNA)
• Deletion and duplication
  – the major mechanism for the expansion in the size of genomes as organisms evolved from simple to more complex is duplication of whole genomes as well as duplication of specific sequences

Gene Duplication

Early recognized

“A redundant duplicate of a gene may acquire divergent mutations and eventually emerge as a new gene”

J. B. S. Haldane (1932)

The Bar gene duplication
• first duplication mutation described in the literature (1936)

• before the advent of biochemical and molecular biology techniques, only a few examples of duplicate genes were discovered
• late 1950s
  – the α- and β-chains of hemoglobin were recognized to have been derived from duplicate genes
• later
  – isozyme and cytological studies provided evidence for the frequent occurrence of gene duplication during evolution


The standard model of genome evolution

[Diagram: DNA sequence → DNA sequence (altered)]
• The engine: mutations
• The steering wheel: selection (purifying or positive) and random drift of neutral mutations

Redundancy creates

• Susumu Ohno (1970): gene duplication is the only means by which a new gene can arise
  – natural selection merely modified, while redundancy created
  – other means of creating new functions are now known, but Ohno's view remains largely valid

A new function is created by
a. duplicating an old gene
b. modifying one of the copies

Kimura and Ohta's Laws of Molecular Evolution

1. For each protein, the rate of evolution in terms of amino acid substitutions is approximately constant per year per site for various lines, as long as the function and tertiary structure of the molecule remain essentially unaltered

DNA packaging protein
• highly constrained
• slow evolution

Clotting proteins
• few constraints
• rapid evolution

  – rates of replacement substitutions are higher among functionally less important genes

Gene          Length (bp)   Replacement (per 10^9 y)   Silent (per 10^9 y)
Histones
  Histone 3   135           0.00                       6.39
  Histone 4   101           0.00                       6.12
Interferons
  α1          166           1.41                       3.53
  β1          159           2.21                       5.88
  γ           136           2.79                       8.59

Histones: involved with fundamental processes (DNA transcription and synthesis)
Interferons: less important (one of many immune system genes)

2. Functionally less important molecules or parts of molecules evolve faster than more important ones (in terms of mutant substitutions)

[Figure: preproinsulin and proinsulin, with the regions Signal, B chain, C peptide, and A chain (S S labels between the chains). Rates shown: 1.2 × 10^-9, 0.2 × 10^-9, and 1.1 × 10^-9 subst/site/year]

[Figure: Conservation in a typical gene, on the basis of 3,165 human-mouse pairs. Labeled features: start of transcription, start of translation, splice sites, polyadenylation site]


[Figure: substitution rates (axis 1 to 4, in units of 10^-9 substitutions/site/year) for pseudogenes, 5′ flanking region, 5′ UTR, nondegenerate sites, 2-fold degenerate sites, 4-fold degenerate sites, introns, 3′ UTR, and 3′ flanking region]

3. Those mutant substitutions that are less disruptive to the existing structure and function of the molecule (conservative substitutions) occur more frequently in evolution than more disruptive ones

4. Gene duplication must always precede the emergence of a gene having a new function

5. Selective elimination of definitely deleterious mutants and random fixation of selectively neutral or very slightly deleterious mutants occur far more frequently in evolution than positive Darwinian selection of definitely advantageous mutants


The standard model of genome evolution

[Diagram: DNA sequence → DNA sequence (altered)]
• The engine: single nucleotide changes, short insertions or deletions, inversions, recombinations, domain duplication, gene duplication, cluster duplication, segment duplication, chromosome duplication, genome duplication, etc.
• The steering wheel: selection (purifying or positive) and random drift of neutral mutations

Detection of natural selection

Overwhelming support for neutralist prediction
1. synonymous vs. non-synonymous substitution rates
2. accelerated rate of pseudogene evolution

Synonymous substitutions do NOT change the encoded amino acid
Non-synonymous substitutions DO change the encoded amino acid

• if DNA divergence includes neutral mutations
  – 3rd position should change more rapidly
  – synonymous mutations are more likely to be neutral
• if most DNA changes were due to adaptive evolution
  – most changes would occur in the 1st and 2nd codon positions

In most genes studied so far, synonymous sites change at a higher rate than non-synonymous sites
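The distinction between synonymous and non-synonymous changes, and the reason the 3rd codon position is expected to change fastest under neutrality, can be illustrated with a minimal Python sketch. It assumes the standard genetic code (written in the conventional TCAG order); the function names are hypothetical and are not part of the slides.

```python
# Standard genetic code in the conventional T, C, A, G ordering ("*" = stop).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def classify_substitution(codon_before: str, codon_after: str) -> str:
    """Label a codon change by whether the encoded amino acid changes."""
    before = CODON_TABLE[codon_before.upper()]
    after = CODON_TABLE[codon_after.upper()]
    return "synonymous" if before == after else "non-synonymous"


def synonymous_fraction_by_position() -> list:
    """For each codon position (1st, 2nd, 3rd), return the fraction of all
    single-base changes in sense codons that leave the amino acid unchanged."""
    fractions = []
    for pos in range(3):
        synonymous = total = 0
        for codon, amino_acid in CODON_TABLE.items():
            if amino_acid == "*":
                continue  # skip stop codons as starting points
            for base in BASES:
                if base == codon[pos]:
                    continue
                mutant = codon[:pos] + base + codon[pos + 1:]
                total += 1
                synonymous += CODON_TABLE[mutant] == amino_acid
        fractions.append(synonymous / total)
    return fractions


print(classify_substitution("GGT", "GGC"))  # Gly -> Gly
print(classify_substitution("GAA", "GTA"))  # Glu -> Val
print(synonymous_fraction_by_position())
```

Under this code table, changes at the 3rd position are the most often synonymous, a few at the 1st position are, and none at the 2nd position are, which is why neutral divergence is expected to concentrate at the 3rd position.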


Types of natural selection
1. purifying (negative) selection: removal of deleterious variants
2. diversifying (positive) selection: fixation of adaptive variants

Types of substitution rates for protein-coding genes
1. synonymous substitution rate (KS or dS)
   number of synonymous substitutions per synonymous site
   (# of synonymous changes divided by the # of synonymous sites)
   – rate of substitution for DNA changes that do not change the encoded amino acids
2. non-synonymous substitution rate (KA or dN)
   number of non-synonymous substitutions per non-synonymous site
   (# of non-synonymous changes divided by the # of non-synonymous sites)
   – rate of substitution for DNA changes that do change the encoded amino acids

The relative levels of these rates indicate the mode of selection for a gene
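The two definitions above reduce to a simple division, sketched here in Python. The counts are hypothetical values chosen only for illustration, and the sketch applies no correction for multiple substitutions at the same site, which practical estimators usually add.

```python
def substitution_rate(changes: float, sites: float) -> float:
    """Substitutions per site: number of changes divided by number of sites."""
    if sites <= 0:
        raise ValueError("number of sites must be positive")
    return changes / sites


# Hypothetical counts for one pair of aligned coding sequences (illustration only).
syn_changes, syn_sites = 12, 150.0
nonsyn_changes, nonsyn_sites = 9, 450.0

k_s = substitution_rate(syn_changes, syn_sites)        # synonymous rate, KS
k_a = substitution_rate(nonsyn_changes, nonsyn_sites)  # non-synonymous rate, KA

print(f"KS = {k_s:.3f}, KA = {k_a:.3f}")
```

Dividing by the number of sites of each kind matters because a coding sequence has far more non-synonymous sites than synonymous ones, so raw counts of changes are not directly comparable.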

Neutral theory provides the null model for tests of selection

KA/KS (dN/dS, ω)

[Scale: 0.0 ← conserving | 1.0 | diversifying →]

KA/KS ≈ 1: Neutral evolution (no selection)
  an equal number of silent and amino-acid replacement substitutions have been preserved since duplication

KA/KS « 1: Purifying selection
  more amino-acid replacement substitutions than silent substitutions have been eliminated since duplication
  • some amino-acid changes had deleterious effects

KA/KS » 1: Diversifying (positive) selection
  more amino-acid replacement substitutions than silent substitutions have been preserved since duplication
  • abundance of replacement substitutions conferring a selective advantage
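The three regimes on the KA/KS scale can be expressed as a small Python sketch. The `tolerance` that decides what counts as "approximately 1" is an arbitrary assumption made for illustration; the slides give no numerical cutoff, and in practice a statistical test against the neutral null model would be used instead.

```python
def classify_selection(k_a: float, k_s: float, tolerance: float = 0.2) -> str:
    """Map the ratio omega = KA/KS onto the three regimes of the scale.

    `tolerance` is a hypothetical cutoff for "omega is about 1".
    """
    if k_s <= 0:
        raise ValueError("KS must be positive to form the ratio")
    omega = k_a / k_s
    if abs(omega - 1.0) <= tolerance:
        return "neutral evolution"
    if omega < 1.0:
        return "purifying selection"
    return "diversifying (positive) selection"


print(classify_selection(0.02, 0.08))  # omega well below 1
print(classify_selection(0.10, 0.10))  # omega equal to 1
print(classify_selection(0.30, 0.10))  # omega well above 1
```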

Homology

Similarity due to inheritance from a common ancestor

Homologs: sequences that have common origins but may or may not have common activity

1. Orthologs: homologs produced by speciation
   • genes derived from a common ancestor that diverged due to divergence of the organisms they are associated with
   • tend to have similar function

2. Paralogs: homologs produced by gene duplication
   • genes derived from a common ancestral gene that duplicated within an organism and then subsequently diverged by accumulated mutation
   • tend to have slightly different functions

[Figure: an ancestral gene passes through duplication, evolution, speciation, and further evolution]

Paralogs: A1 and A2; B1 and B2
Orthologs: A1 and B1; A2 and B2
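The assignments in the figure follow from the event at which two genes last shared an ancestor. The Python sketch below encodes the figure as a small gene tree, under the assumption that species A and B each carry both copies after a duplication that preceded speciation. The `Node` class and the `relationship` helper are hypothetical names used for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    label: str                       # gene name for leaves, description for events
    event: Optional[str] = None      # "duplication" or "speciation"; None for leaves
    parent: Optional["Node"] = None


def ancestors(node: Optional[Node]) -> list:
    """Return the node followed by all of its ancestors up to the root."""
    chain = []
    while node is not None:
        chain.append(node)
        node = node.parent
    return chain


def relationship(x: Node, y: Node) -> str:
    """Classify two distinct genes by the event at their last common ancestor."""
    if x is y:
        raise ValueError("need two distinct genes")
    seen = {id(n) for n in ancestors(x)}
    for n in ancestors(y):
        if id(n) in seen:
            return "paralogs" if n.event == "duplication" else "orthologs"
    raise ValueError("genes share no common ancestor")


# Gene tree for the figure: duplication first, then speciation of each copy.
root = Node("ancestral gene", event="duplication")
copy1 = Node("copy 1", event="speciation", parent=root)
copy2 = Node("copy 2", event="speciation", parent=root)
A1, B1 = Node("A1", parent=copy1), Node("B1", parent=copy1)
A2, B2 = Node("A2", parent=copy2), Node("B2", parent=copy2)

print(relationship(A1, A2))  # last shared ancestor is the duplication
print(relationship(A1, B1))  # last shared ancestor is the speciation
```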


Types of Gene Duplication

An increase in the number of copies of a DNA segment can be brought about by several types of gene duplication
• classified according to the extent of the genomic region involved

1. partial or internal gene duplication
2. complete gene duplication
3. partial chromosomal duplication
4. complete chromosomal duplication
5. polyploidy or genome duplication


Extent of duplication   Mutational occurrence   Effects on fitness
partial gene            very frequent           deleterious if it affects reading frame
whole gene              frequent                deleterious only in organisms in which the genome is replicated as a unit
partial polysomy        rare                    almost invariably deleterious
polysomy                common                  almost invariably deleterious
polyploidy              common                  deleteriousness determined by the mode of reproduction and sex-determination

Equal & Unequal Crossing Over

Equal Crossing Over
[Figure]

Unequal Crossing Over
[Figure: the two products carry a deletion and a duplication]
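The difference between the two outcomes can be sketched in Python by treating each chromosome as a list of gene labels and a crossover as an exchange of the segments to the right of a breakpoint. Modeling misalignment as two different breakpoints is a simplifying assumption, and the gene labels are hypothetical.

```python
def crossover(chrom_a: list, chrom_b: list, break_a: int, break_b: int) -> tuple:
    """Exchange the segments to the right of each chromosome's breakpoint."""
    recombinant_1 = chrom_a[:break_a] + chrom_b[break_b:]
    recombinant_2 = chrom_b[:break_b] + chrom_a[break_a:]
    return recombinant_1, recombinant_2


chromosome = ["A", "B", "C"]

# Equal crossing over: aligned breakpoints, so gene content is unchanged.
equal = crossover(chromosome, chromosome, 2, 2)

# Unequal crossing over: misaligned breakpoints, so one product gains a copy
# of gene B (duplication) and the other loses it (deletion).
unequal = crossover(chromosome, chromosome, 2, 1)

print(equal)    # (['A', 'B', 'C'], ['A', 'B', 'C'])
print(unequal)  # (['A', 'B', 'B', 'C'], ['A', 'C'])
```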

Domains and exons

1. Functional domain: well-defined region within a protein that performs a specific function, e.g. substrate binding
2. Structural domain or module: well-defined region within a protein that constitutes a stable, independently folding, compact structural unit within the protein that can be distinguished from all the other parts

Defining the boundaries of a functional domain is often difficult
• functionality is in many cases conferred by amino-acid residues that are scattered throughout the polypeptide

Structural modules are collinear with the amino-acid sequence of a protein
• i.e., a module consists of a continuous stretch of amino acids

a. if functionality is conferred by a module, a duplication will increase the number of functional segments
b. if functionality is conferred by amino-acid residues scattered among different modules, a duplication may not be functionally desirable

Possible relationships between the protein structural domains and the arrangements of the exons in the gene
• each exon corresponds exactly to a structural domain
• approximate correspondence
• an exon encodes 2 or more domains
• a single structural domain is encoded by 2 or more exons
• lack of correspondence between exons and domains


α and β chains of the vertebrate hemoglobin
• consist of 4 domains, whereas their genes consist of only 3 exons, the second of which encodes two adjacent domains
• it was postulated that a merger occurred between two exons as a result of the loss of a central intron
• homologous globin genes in plants (leghemoglobins) were found to contain an additional intron at or very near the position predicted by the domain structure of globins
• a similar intron was found in the globin genes of a nematode

During the evolution of the globin-gene family from a 4-exon ancestral gene, several lineages lost some or all of their three introns, thereby generating a panoply of exon/intron permutations

[Figure: internal organization of the human α1 and β genes of the vertebrate hemoglobin]

• domain duplications at the protein level generally indicate that an exon duplication has occurred at the DNA level
  – it has been suggested that exon duplication is one of the most important types of internal gene duplication
• proteins show internal repeats of amino acid sequences
• these repeats often correspond to functional or structural domains

These findings suggest

1. the genes for many proteins were formed by internal gene duplication
2. the function of these proteins was improved by increasing their stability or the number of active sites
3. internal duplications can also provide redundant DNA segments for a gene to develop new functions
   – many complex genes might have evolved from small, simple primordial genes via internal duplication and subsequent modification

[Gene Evolution]

Internal duplicationsInternal duplicationsInternal duplicationsProteins with internal domain duplications taking up 50% or moreProteins with internal domain duplications taking up 50% or more of the of the total length of the proteintotal length of the protein

Gene DuplicationGene DuplicationGene Duplication

878722360360826826Villin Villin 100100774242284284Tropomyosin Tropomyosin αα chain chain 10010033195195584584Serum albumin Serum albumin

9999885757461461Ribonuclease/angiogenin inhibitor Ribonuclease/angiogenin inhibitor 858555354354

22117117223030

3358658638173817PrePre--propro--von Willebrand factorvon Willebrand factor5050557979790790Plasminogen Plasminogen 95952260960912801280Multidrug resistanceMultidrug resistance--1 P1 P--glycoportein glycoportein 79793348048019271927LactaseLactase--phlorizin hydrolase phlorizin hydrolase 5454226868251251InterleukinInterleukin--2 receptor 2 receptor

10010044108108423423Immunoglobulin Immunoglobulin εε chain C region chain C region 989833108108329329Immunoglobulin Immunoglobulin γγ chain C region chain C region 979722447447917917Hexokinase Hexokinase 949422207207439439Homopexin Homopexin

100100227474148148CalciumCalcium--dependent regulator protein dependent regulator protein 9696559191474474αα11ββ--glycoprotein glycoprotein

% repetition

# repeats

repeat length

protein lengthProtein
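The "% repetition" column can be approximated from the other three columns. The sketch below assumes that the percentage is simply (number of repeats × repeat length) / protein length, rounded and capped at 100; the function name `percent_repetition` is hypothetical, and the slide does not state how the column was computed. Some rows (for example lactase-phlorizin hydrolase) do not match this simple product exactly, so the published values were presumably derived from the actual repeat boundaries.

```python
def percent_repetition(protein_length: int, repeat_length: int, n_repeats: int) -> int:
    """Share of the protein covered by repeats, as a whole percentage capped at 100.

    Assumption: coverage is approximated as n_repeats * repeat_length.
    """
    if protein_length <= 0:
        raise ValueError("protein_length must be positive")
    covered = repeat_length * n_repeats
    return min(100, round(100 * covered / protein_length))


# Rows taken from the table: (protein, protein length, repeat length, # repeats)
rows = [
    ("Villin", 826, 360, 2),
    ("Serum albumin", 584, 195, 3),
    ("Plasminogen", 790, 79, 5),
]

for name, length, rep_len, n in rows:
    print(f"{name}: {percent_repetition(length, rep_len, n)}%")
```

For these three rows the approximation reproduces the tabulated values (87, 100 and 50).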

Domain duplications

Variable and Constant regions of immunoglobulin genes
• probably derived from a common primordial domain, but have since acquired distinct properties
• despite common molecular ancestry
  – the variable region of immunoglobulins binds antigens
  – the constant region mediates non-antigenic functions

Creation of new function

3 pathways can lead to the creation of a new function

1. de novo appearance from nonfunctional sequence due to accumulation of mutations
2. replacement due to change of one function into another
3. creation of a novel function from a redundant copy of an old function following duplication

(figure: word analogy, D I C E → D I R E → D A R E → C A R E → C A R D, shown alongside an unchanged copy D I C E)
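The word sequence on this slide can be read as an analogy for gradual change of function: each word differs from the previous one by a single letter, yet the last word shares no position with the first. The sketch below only checks that property; reading the figure as a stepwise substitution series is an assumption, and `hamming` is a hypothetical helper name.

```python
def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length words differ."""
    if len(a) != len(b):
        raise ValueError("words must have the same length")
    return sum(x != y for x, y in zip(a, b))


ladder = ["DICE", "DIRE", "DARE", "CARE", "CARD"]

# Distance between each pair of consecutive words
steps = [hamming(a, b) for a, b in zip(ladder, ladder[1:])]
print(steps)                            # [1, 1, 1, 1]

# Distance between the first and the last word
print(hamming(ladder[0], ladder[-1]))   # 4
```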

Following gene duplication 3 things may happen to the copies

a. all copies may retain the same function
b. some copies may die
c. some copies may evolve into new functions

Prevalence of gene duplication
• gene duplications arise spontaneously at high rates in bacteria, bacteriophages, insects and mammals, and are generally viable
• mutation is not the rate-limiting step in the evolutionary process of gene duplication
• only a small fraction of all duplicated genes are retained
• an even smaller fraction evolves new functions
• the probability of nonfunctionalization is much higher than that of evolving a new function
• an increase in gene number can occur quite rapidly under selection pressure for increased amounts of a gene product

Gene redundancy

Duplicated genes can be divided into 2 types

1. Invariant repeats
   • identical or nearly identical in sequence to one another
     – the repetition of identical sequences is correlated with the synthesis of increased quantities of the gene product that is required for the normal function of the organism (dose repetitions)
     – common whenever a metabolic need for producing large quantities of specific RNAs or proteins arises
   • examples: histone genes, rDNAs
2. Variant repeats
   • copies of a gene that differ in their sequence to a lesser or greater extent

1. Invariant repeats

Numbers of rRNA and tRNA genes per haploid genome in various organisms

Genome source                    # rRNA sets   # tRNA genes   genome size (bp)
Human mitochondrion                        1             22           2 × 10^4
Nicotiana tabacum chloroplast              2             37           2 × 10^5
Escherichia coli                           7           ~100           4 × 10^6
Neurospora crassa                       ~100         ~2,600           2 × 10^7
Saccharomyces cerevisiae                ~140           ~360           5 × 10^7
Caenorhabditis elegans                   ~55           ~300           8 × 10^7
Tetrahymena thermophila                    1           ~800           2 × 10^8
Drosophila melanogaster              120-240        590-900           2 × 10^8
Physarum polycephalum                 80-280         ~1,050           5 × 10^8
Human                                   ~300         ~1,300           3 × 10^9
Rattus norvegicus                    150-170         ~6,500           3 × 10^9
Xenopus laevis                       500-760    6,500-7,800           8 × 10^9

2. Variant repeats

“As long as there are other copies of a gene that function normally, a duplicate gene may accumulate deleterious mutations and become nonfunctional without adversely affecting the fitness of the organism”
J. B. S. Haldane (1933)

“Because deleterious mutations occur far more often than advantageous ones, a redundant duplicate gene is more likely to become nonfunctional than to evolve into a new gene”
Susumu Ohno (1972)

some copies may evolve into new functions
– can perform markedly different functions
  a. thrombin and trypsin
     • thrombin cleaves fibrinogen during the process of blood clotting
     • trypsin is a digestive enzyme
  b. lactalbumin and lysozyme
     • lactalbumin is a subunit of the enzyme that catalyzes the synthesis of the sugar lactose
     • lysozyme dissolves certain bacteria by cleaving the polysaccharide component of their cell walls
• in some cases, a novel function may be achieved through few substitutions
  – lactate dehydrogenase can be changed into a malate dehydrogenase by replacing just 1 out of its 317 amino acids

The origin of new gene function after gene duplication
• more complex species evolve by adding new gene functions

Young duplicate genes in the human genome

Young duplicate genes: duplicate genes with Ks < 0.3

• 250 pairs of young human duplicates studied
  – 145 showed significant evidence that one copy had evolved faster than the other at the amino-acid level
• Ka/Ks ratio
  – index of functional constraints
  – the smaller the Ka/Ks ratio is, the stronger the functional constraints are
• 65 pairs showed significantly different Ka/Ks ratios
  – after gene duplication, 26% of the duplicate pairs have experienced different functional constraints
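The quantities on this slide can be put into a small sketch. It assumes Ka and Ks have already been estimated for a gene pair (the estimation itself is not shown). The function names are hypothetical, and the three-way reading of the ratio (below, equal to, or above 1) is the conventional interpretation rather than something stated on the slide.

```python
def ka_ks(ka: float, ks: float) -> float:
    """Ratio of nonsynonymous (Ka) to synonymous (Ks) substitution rates."""
    if ks <= 0:
        raise ValueError("Ks must be positive to form the ratio")
    return ka / ks


def is_young_duplicate(ks: float, cutoff: float = 0.3) -> bool:
    """Young duplicates are defined on the slide as pairs with Ks < 0.3."""
    return ks < cutoff


def constraint(ratio: float) -> str:
    """Conventional reading of Ka/Ks (an assumption, not taken from the slide)."""
    if ratio < 1:
        return "functional constraint (purifying selection)"
    if ratio > 1:
        return "excess amino-acid change (possible positive selection)"
    return "no detectable constraint (neutral)"


# Share of the 250 studied pairs whose two copies differ significantly in Ka/Ks
print(f"{65 / 250:.0%}")   # 26%

# Hypothetical example values for one duplicate pair
print(is_young_duplicate(0.12), constraint(ka_ks(0.02, 0.12)))
```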

Ka/Ks > 1 in 113 genes

(figure: fast-evolving genes compared with slow-evolving genes)

Gene Families

result from complete gene duplication
the genes that belong to a group of repeated sequences in a genome
• functional or nonfunctional members of a gene family may
  a. reside in close proximity to one another on the same chromosome
  b. be located on different chromosomes

Superfamilies
• term coined by Dayhoff (1978) in order to distinguish distantly related proteins from closely related ones
• similarity <50% at the amino-acid level

Functionally similar genes are occasionally clustered, but usually dispersed throughout the genome
(figure: histone genes)

Duplicated regions on human chromosomes
(figure: paralogons on human chromosome 17)
(figure: segmental duplications; blue = intrachromosomal, red = interchromosomal)
• 24 of these regions correspond to known genomic disorders

(figure: evolutionarily related genes in the Bacillus subtilis genome)

Large genomes contain more paralogs than smaller genomes

Isozymes
• enzymes that catalyse the same biochemical reaction but may differ in
  a. tissue specificity
  b. developmental regulation
  c. electrophoretic mobility
  d. biochemical properties
• encoded by different loci, usually duplicated genes

Allozymes
– distinct forms of an enzyme
– encoded by different alleles at a single locus

Lactate dehydrogenase (LDH): developmental speciation
• 2 genes encode the α and β subunits of LDH in mammals
• these 2 subunits form 5 tetrameric isozymes, all of which catalyze either
  a. the conversion lactate → pyruvate in the presence of the oxidized coenzyme nicotinamide adenine dinucleotide (NAD+)
  b. pyruvate → lactate in the presence of the reduced coenzyme (NADH)

Isozymes rich in β subunits
– have a high affinity for NAD+
– function as true lactate dehydrogenase in aerobically metabolizing tissues, e.g. heart

Isozymes rich in α subunits
– have a high affinity for NADH
– are especially geared to serve as pyruvate reductases in anaerobically metabolizing tissues, e.g. skeletal muscle
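The count of 5 tetrameric isozymes follows from choosing 4 subunits from 2 types when the arrangement within the tetramer is ignored. A minimal sketch, assuming that only subunit composition distinguishes the isozymes (the function name `tetramers` is hypothetical):

```python
from collections import Counter
from itertools import combinations_with_replacement


def tetramers(subunits=("α", "β"), size=4):
    """List the distinct subunit compositions of a tetramer."""
    compositions = []
    for combo in combinations_with_replacement(subunits, size):
        counts = Counter(combo)
        # Label such as "α3β1"; subunit types that are absent are omitted
        compositions.append("".join(f"{s}{counts[s]}" for s in subunits if counts[s]))
    return compositions


print(tetramers())   # ['α4', 'α3β1', 'α2β2', 'α1β3', 'β4']
```

The two ends of the list correspond to the isozymes richest in α subunits and richest in β subunits.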

Developmental sequence of LDH production in the heart
(figure: proportions of the α-subunit and β-subunit during development)
• the more anaerobic the heart is (i.e., in the early stages of gestation), the higher the proportion of LDH isozymes rich in α subunits will be
• the two duplicate genes have become specialized to different tissues and to different developmental stages

Opsins: the color vision among Primates
– color vision in humans, apes, and Old World monkeys is mediated in the eye by 3 types of photoreceptor cells (cones), which transduce photic energy into electrical potentials

– each type of color-sensitive cone is maximally sensitive to a certain wavelength, depending on the kind of color-sensitive pigment (photopigment) present in the cone
• in humans, the red, green, and blue cones are maximally sensitive at approximately 560, 530, and 430 nm, respectively
• each color stimulates one or more kinds of cones
  – blue light stimulates blue cones
  – yellow light stimulates red and green cones equally
  – white light stimulates all three types of cones equally

• each color-sensitive photopigment consists of 2 parts
  1. a protein called opsin
  2. a lipid vitamin-A1 derivative (retinal)
• the color specificity is determined by the opsins
  – members of a superfamily of G-protein-coupled receptors

Genes coding for opsins
a. the blue opsin is encoded by an autosomal gene
b. the red and green opsins are encoded by X-linked genes
   • each X chromosome contains only one red-opsin gene, but
   • may contain more than one green-opsin gene

• the amino-acid sequences of the red and green opsins are 96% similar, but they share only 43% amino-acid similarity with the blue opsin
• the blue opsin gene and the ancestor of the green and red opsin genes diverged about 500 Mya
• the close linkage and high similarity between the red and green opsin genes point to a very recent gene duplication

(figure: X chromosome with the red and green opsin genes, 96% amino-acid similarity; autosome with the blue opsin gene, 43% amino-acid similarity, 500 Mya)

Color Deficiency (Color-Blindness)
– inability to distinguish one or more of the primary colors
– color-blind persons may be blind to one, two or all of the three colors

(figure: normal vision compared with each deficiency)

1. tritanopia: blindness to blue
   • cannot distinguish between blue and yellow
2. deuteranopia: blindness to green
   • unable to see the green part of the visible spectrum
3. protanopia: blindness to red
   • unable to distinguish between red and green
4. monochromatism, or total color-blindness
   • all hues are perceived as variations of gray

Dichromatism
– most common form of color-blindness
– affects ~12% of ♂ and ~0.2% of ♀
– many dichromatic persons are unaware that they are color-blind

(figure: Ishihara plates)

Genes coding for opsins
a. New-World monkeys
   – only one X-linked pigment gene
b. Old-World monkeys, including apes and humans
   – have two or more
• a duplication occurred about 25-35 Mya in the ancestor of Old-World monkeys after their divergence from the New-World monkeys
• as a consequence of this duplication, Old-World monkeys are trichromatic

Primates phylogenetic tree
(figure: tree marking the duplication of the X-linked pigment gene)

New-World monkeys possess only 2 loci for the opsins: one autosomal and one X-linked
– the exception are howler monkeys (Alouatta), which have one autosomal and two X-linked opsin genes
• however, in many New-World monkeys (e.g., squirrel monkeys and tamarins), the X-linked opsin locus is highly polymorphic
  – 2 of these alleles have maximal-sensitivity peaks similar to those of human red and green opsin
  – the third allele has an intermediate maximal-sensitivity peak
• heterozygous ♀ are trichromatic
• ♂ and homozygous ♀ are dichromatic
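The consequence of a polymorphic X-linked locus can be enumerated directly. The sketch below assumes three alleles at the single X-linked locus plus the autosomal blue opsin, with females carrying two X chromosomes and males one. The allele labels and function names are hypothetical.

```python
from itertools import combinations_with_replacement

# Hypothetical labels for the three X-linked alleles described above
ALLELES = ("red-like", "green-like", "intermediate")


def female_genotypes(alleles=ALLELES):
    """All unordered pairs of X-linked alleles (homozygous and heterozygous)."""
    return list(combinations_with_replacement(alleles, 2))


def vision(genotype) -> str:
    """Classify color vision from the X-linked alleles an individual carries."""
    # The autosomal blue opsin contributes one pigment in every individual
    pigments = 1 + len(set(genotype))
    return "trichromatic" if pigments >= 3 else "dichromatic"


for genotype in female_genotypes():
    print("♀", genotype, vision(genotype))

for allele in ALLELES:
    print("♂", (allele,), vision((allele,)))
```

Of the six possible female genotypes, the three heterozygous ones come out trichromatic, while every male and every homozygous female is dichromatic.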

(figure: color-vision phenotypes)
♀ woman: trichromatic
♂ man: trichromatic
♂ man, color blind: dichromatic
♀ New-World monkey, homozygous: dichromatic
♂ New-World monkey: dichromatic
♀ New-World monkey, heterozygous: trichromatic

Humans, apes and African monkeys
– achieved trichromatic vision by a mechanism akin to isozymes: distinct proteins encoded by different loci

Heterozygous ♀ squirrel monkeys
– achieve trichromacy through the use of two allozymes: distinct proteins encoded by different allelic forms at a single locus

Gene Loss

~7,000 genetic diseases documented in the medical literature
– mutations can easily destroy the function of a protein-coding gene

The vast majority of mutations are deleterious
1. eliminated quickly from the population, or
2. maintained at very low frequencies due to
   a. overdominant selection
   b. genetic drift

“Because deleterious mutations occur far more often than advantageous ones, a redundant duplicate gene is more likely to become nonfunctional than to evolve into a new gene”
Susumu Ohno (1972)

Gene DuplicationGene DuplicationGene Duplication[Gene Evolution]

Gene Duplication · Gene Loss

Unprocessed pseudogenes
– result from the nonfunctionalization or silencing of a gene due to deleterious mutations
– generally derived via the silencing of a duplicate functional gene
– contain multiple defects
  • frameshifts
  • premature stop codons
  • obliteration of splicing sites or regulatory elements
– it is difficult to identify the mutation that was the direct cause of gene silencing

Why are pseudogenes interesting?
– they provide information on how genomic DNA changes without evolutionary pressure
– they can be used as a model for determining the rates of nucleotide substitution, insertion and deletion
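The use of pseudogenes as a yardstick for unconstrained change can be sketched numerically. The snippet below is a minimal illustration and not a method taken from these slides: it counts mismatches between two aligned sequences of equal length and applies the Jukes-Cantor correction, K = -(3/4) ln(1 - 4p/3), one common way to turn an observed proportion of differences p into substitutions per site. The two sequences, the function names and the divergence time are all hypothetical.

```python
import math

def proportion_different(seq_a: str, seq_b: str) -> float:
    """Fraction of aligned sites that differ (gaps are not handled here)."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be aligned, non-empty and equal in length")
    mismatches = sum(1 for a, b in zip(seq_a, seq_b) if a != b)
    return mismatches / len(seq_a)

def jukes_cantor_distance(p: float) -> float:
    """Substitutions per site under the Jukes-Cantor model; requires p < 0.75."""
    if not 0 <= p < 0.75:
        raise ValueError("p must lie in [0, 0.75)")
    return -0.75 * math.log(1 - 4 * p / 3)

# Hypothetical aligned fragments of one pseudogene in two species.
pseudo_species_1 = "ATGGTGCACCTGACTCCTGA"
pseudo_species_2 = "ATGGTGCATCTGACTCCTGT"

p = proportion_different(pseudo_species_1, pseudo_species_2)
k = jukes_cantor_distance(p)

# Assumed divergence time in years; the rate is per site per year per lineage,
# hence the factor of 2 (both lineages accumulate changes).
divergence_time = 25e6
rate = k / (2 * divergence_time)
print(f"p = {p:.3f}, K = {k:.4f}, rate = {rate:.2e}")
```

Real analyses would also handle alignment gaps and might prefer a richer substitution model; the point here is only the step from observed differences to a rate.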

[Gene Evolution]

Gene Duplication · Gene Loss

Unprocessed pseudogene formation

[Figure: after a gene duplication, as time goes by, mutations occur (stop codon, frameshift, etc.) in the copy that is under no functional constraints; the result is a pseudogene (ψ) next to the functional copy]

[Gene Evolution]

Gene Duplication · Globin Evolution

The globin superfamily has experienced all the possible evolutionary pathways that can occur in families of repeated sequences
1. retention of original function
2. acquisition of new function
3. loss of function

In humans, the globin superfamily consists of 5 families
1. α-globin family on chromosome 16
2. β-globin family on chromosome 11
3. myoglobin, single member on chromosome 22
4. neuroglobin, single member on chromosome 14
5. cytoglobin, single member on chromosome 17

4 types of functional proteins
– myoglobin
– hemoglobin
– neuroglobin
– cytoglobin


[Gene Evolution]

Gene Duplication · Globin Evolution

The globin superfamily

[Figure: chromosomal locations of the globin genes on chromosomes 16, 11, 22, 14 and 17]

[Gene Evolution]

Gene Duplication · Globin Evolution

The globin superfamily
• the globins are very ancient in origin
  – globin-like proteins exist in all life forms studied
• neuroglobin was the first to branch off
• myoglobin and hemoglobin diverged 800 Mya
  – before the emergence of annelid worms

Myoglobin
– remained monomeric
– evolved a higher affinity for oxygen than hemoglobin
– became the oxygen-storage protein in muscles

Hemoglobin
– acquired a tetrameric structure
– became the oxygen carrier in blood
– much more refined and regulated

[Gene Evolution]

Gene Duplication · Globin Evolution

The globin superfamily

Mammalian hemoglobin
• has acquired several capabilities that are absent in myoglobin
  1. binding 4 oxygen molecules cooperatively
  2. responding to the acidity and carbon-dioxide concentration inside red blood cells
  3. regulating its oxygen affinity through the level of organic phosphate in the blood

The heteromeric structure of hemoglobin has facilitated these functional refinements
• made up of 2 types of chains
  – one encoded by an α family member
  – the other by a member of the β family
• the α and β families diverged following a gene duplication about 450-500 Mya
  – tandem duplication resulting in 2 linked genes on the same chromosome
• chromosomal linkage is preserved in ray-finned fishes and amphibians
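Cooperative binding is often summarized with the Hill equation, Y = pO2^n / (P50^n + pO2^n). The sketch below only illustrates why cooperativity matters for an oxygen carrier as opposed to an oxygen store. Every number in it is an assumption rather than a figure from these slides: the P50 and Hill coefficient n for each protein are rounded textbook-style values, and the lung and tissue partial pressures are chosen for illustration.

```python
def hill_saturation(po2: float, p50: float, n: float) -> float:
    """Fractional oxygen saturation from the Hill equation."""
    return po2 ** n / (p50 ** n + po2 ** n)

# Assumed, rounded parameters (pressures in mmHg).
MYOGLOBIN = {"p50": 2.8, "n": 1.0}    # monomer, no cooperativity
HEMOGLOBIN = {"p50": 26.0, "n": 2.8}  # tetramer, cooperative

LUNG_PO2, TISSUE_PO2 = 100.0, 30.0    # assumed partial pressures

released = {}
for name, params in (("myoglobin", MYOGLOBIN), ("hemoglobin", HEMOGLOBIN)):
    loaded = hill_saturation(LUNG_PO2, **params)
    unloaded = hill_saturation(TISSUE_PO2, **params)
    released[name] = loaded - unloaded
    print(f"{name}: lung {loaded:.2f}, tissue {unloaded:.2f}, "
          f"released {released[name]:.2f}")
```

Under these assumed values the high-affinity, non-cooperative protein stays almost fully loaded at tissue pressures, while the cooperative one gives up a much larger share of its oxygen.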

[Gene Evolution]

Gene Duplication · Globin Evolution

The globin superfamily

Organization of the α and β globin families
[Figure: the internal organization of the α1 and β genes is shown]


[Gene Evolution]

Gene Duplication · Globin Evolution

The globin superfamily

Mammalian hemoglobin
• the α and β families have diverged in both physiological properties and ontogenetic (developmental) regulation

[Figure: gene organization of the two families]
– α family, 4 functional genes: embryonic gene, fetal gene, adult genes, 3 unprocessed pseudogenes
– β family, 5 functional genes: embryonic gene, 2 fetal genes, 2 adult genes, unprocessed pseudogene

[Gene Evolution]

Gene Duplication · Globin Evolution

The globin superfamily

Mammalian hemoglobin
• the α and β families have diverged in both physiological properties and ontogenetic (developmental) regulation
  1. distinct hemoglobins appear at different developmental stages
     – ζ₂ε₂ and α₂ε₂ in the embryo
     – α₂γ₂ in the fetus
     – α₂β₂ and α₂δ₂ in adults
     – the θ1 gene is mainly transcribed 5-8 weeks after conception, at very low levels (the protein has not yet been detected in vivo)

[Gene Evolution]

Gene Duplication · Globin Evolution

The globin superfamily

Mammalian hemoglobin
• the α and β families have diverged in both physiological properties and ontogenetic (developmental) regulation
  1. distinct hemoglobins appear at different developmental stages
  2. differences in oxygen-binding affinity have evolved
     – embryonic and fetal hemoglobins (ζ₂ε₂, α₂ε₂ and α₂γ₂) have a higher oxygen affinity than adult hemoglobins (α₂β₂ and α₂δ₂)
     – they function better in the relatively hypoxic environment in which the embryo and the fetus reside

Gene duplication resulted in evolutionary refinements of physiological systems

[Gene Evolution]

Gene Duplication · Globin Evolution

The globin superfamily

α1 and α2
• produce identical polypeptides
• present in humans and all the apes
• arose about 20 million years ago

Gγ and Aγ
• arose after the separation of the simian lineage from the prosimians

[Gene Evolution]

Exon Shuffling & Mosaic Proteins · Types of exon shuffling

1. exon duplication
   – duplication of one or more exons in a gene (a type of internal duplication)
2. exon insertion
   – structural or functional domains are exchanged between proteins
3. exon deletion
   – removal of a segment of amino acids from the protein

Mosaic protein
– a protein encoded by a gene that contains regions that are also found in other genes
– mosaic proteins are evidence of exon shuffling during the evolution of their genes


[Gene Evolution]

Exon Shuffling & Mosaic Proteins · Tissue plasminogen activator (TPA)

• activated by blood-clotting Factor XIIa
• TPA converts plasminogen into its active form, plasmin, which dissolves fibrin, an insoluble fibrous protein in blood clots
• conversion of plasminogen to plasmin is accelerated by the presence of fibrin, the substrate of plasmin
  – fibrin polymers bind both plasminogen and TPA, aligning them for catalysis
  – plasmin is produced only in the proximity of fibrin (fibrin specificity)

Prourokinase
• precursor of the urinary plasminogen activator
• lacks fibrin specificity

[Gene Evolution]

Exon Shuffling & Mosaic Proteins · Tissue plasminogen activator (TPA)

Diagram of blood coagulation and fibrinolysis
• several mosaic proteins are involved

[Figure: the arrangement of the structural modules in the mosaic proteins is shown as boxes]

[Gene Evolution]

Exon Shuffling & Mosaic Proteins · Tissue plasminogen activator (TPA)

Arrangement of TPA structural modules

[Figure legend]
– F1: fibronectin type-1 module
– EG: epidermal growth-factor module
– KR: kringles
– serine proteinase (protease) region homologous to that of trypsin

TPA contains a 43-residue sequence at its N-terminal end that is absent in prourokinase
• this segment is homologous to one of the 3 finger domains of fibronectin responsible for the fibrin affinity
  – fibronectin: a large glycoprotein present in the plasma and on cell surfaces, which promotes cellular adhesion
• a deletion of this segment leads to a loss of the fibrin affinity of TPA

Exon shuffling must have been responsible for the acquisition of this domain by TPA from either fibronectin or a similar protein

[Gene Evolution]

Exon Shuffling & Mosaic Proteins · Tissue plasminogen activator (TPA)

Arrangement of TPA structural modules

TPA also contains
• a segment homologous to portions of the epidermal growth factor precursor and to the growth-factor-like regions of other proteins (Factors VII, IX, X, and XII)
• two structures similar to the kringles of plasminogen


[Gene Evolution]

Exon Shuffling & Mosaic Proteins · Tissue plasminogen activator (TPA)

Arrangement of TPA structural modules

TPA also contains, at its C-terminal end,
• regions homologous to the protease parts of trypsin and other trypsin-like serine proteinases (e.g. plasminogen)
  – enzymes that hydrolyze proteins into peptide fragments

[Gene Evolution]

Exon Shuffling & Mosaic Proteins · Tissue plasminogen activator (TPA)

Arrangement of TPA structural modules

TPA acquired at least 5 DNA segments from at least 4 other genes
– plasminogen
– epidermal growth factor
– fibronectin
– trypsin

The junctions of these acquired units coincide precisely with the borders between exons and introns
– further evidence of exon shuffling from one gene to another

[Gene Evolution]

Exon Shuffling & Mosaic Proteins · Phase limitations on exon shuffling

Phase limitations of the exonic structure
– must be respected for an exon to be inserted, deleted or duplicated without causing a frameshift in the reading frame

Introns are classified into 3 types according to the way in which the coding region is interrupted
– phase 0 if the intron lies between two codons
– phase 1 if it lies between the 1st and 2nd nucleotides of a codon
– phase 2 if it lies between the 2nd and 3rd nucleotides of a codon

Exons
– grouped into classes according to the phases of their flanking introns
  • class 0-0: exon flanked by a phase-0 intron at its 5' end and by a phase-0 intron at its 3' end
  • class 0-1: phase-0 intron at 5' and phase-1 intron at 3'
  • class 1-2: phase-1 intron at 5' and phase-2 intron at 3'
  • etc.
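The phase and class definitions above reduce to arithmetic modulo 3, which the following sketch makes explicit. It is an illustration under stated assumptions: the function names and the exon lengths are hypothetical, and the coding sequence is assumed to start at the first nucleotide of the first exon (no 5' untranslated region).

```python
def intron_phase(coding_nt_before: int) -> int:
    """Phase of an intron, given the number of coding nucleotides upstream of it.

    0: the intron lies between two codons
    1: between the 1st and 2nd nucleotides of a codon
    2: between the 2nd and 3rd nucleotides of a codon
    """
    if coding_nt_before < 0:
        raise ValueError("count of coding nucleotides cannot be negative")
    return coding_nt_before % 3

def exon_class(phase_5prime: int, phase_3prime: int) -> str:
    """Class label such as '0-1', built from the flanking intron phases."""
    for phase in (phase_5prime, phase_3prime):
        if phase not in (0, 1, 2):
            raise ValueError("intron phase must be 0, 1 or 2")
    return f"{phase_5prime}-{phase_3prime}"

# Hypothetical gene: coding lengths (nt) of four consecutive exons.
exon_lengths = [90, 121, 122, 60]

phases = []
coding_so_far = 0
for length in exon_lengths[:-1]:      # one intron after every exon but the last
    coding_so_far += length
    phases.append(intron_phase(coding_so_far))

# Internal exons are the ones with an intron on both sides.
internal_classes = [exon_class(phases[i], phases[i + 1])
                    for i in range(len(phases) - 1)]
print(phases, internal_classes)
```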

[Gene Evolution]

Exon Shuffling & Mosaic Proteins · Phase limitations on exon shuffling

Phase limitations of the exonic structure

Phase 0
  GGC AAG gtaagt ................ (Py)ₙncag GTC AAC
  Gly Lys                                   Val Asn

Phase 1
  GG CAA G gtaagt ................ (Py)ₙncag GT CAA C
     Gln G                                   ly Gln

Phase 2
  G GCA AG gtaagt ................ (Py)ₙncag G TCA AC
    Ala Ar                                   g Ser
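The three examples use the same exon nucleotides and differ only in the reading frame. The sketch below joins the two exon ends from the slide and translates the spliced sequence in each frame. The function name and the idea of expressing the frame as an offset are illustrative choices, and the codon table is deliberately limited to the codons that occur in this example.

```python
# Only the codons that occur in the slide's example are listed here.
CODON_TABLE = {
    "GGC": "Gly", "AAG": "Lys", "GTC": "Val", "AAC": "Asn",
    "CAA": "Gln", "GGT": "Gly",
    "GCA": "Ala", "AGG": "Arg", "TCA": "Ser",
}

UPSTREAM_EXON_END = "GGCAAG"      # last 6 nt before the intron (from the slide)
DOWNSTREAM_EXON_START = "GTCAAC"  # first 6 nt after the intron (from the slide)

def translate_across_junction(frame_offset: int):
    """Translate the spliced sequence, skipping `frame_offset` leading nt.

    Returns (amino_acids, intron_phase); the phase is the number of
    nucleotides of the junction codon that lie upstream of the intron.
    """
    spliced = UPSTREAM_EXON_END + DOWNSTREAM_EXON_START
    codons = [spliced[i:i + 3]
              for i in range(frame_offset, len(spliced) - 2, 3)]
    amino_acids = [CODON_TABLE[codon] for codon in codons]
    phase = (len(UPSTREAM_EXON_END) - frame_offset) % 3
    return amino_acids, phase

for offset in (0, 2, 1):
    residues, phase = translate_across_junction(offset)
    print(f"phase {phase}: {'-'.join(residues)}")
```

In the phase-1 and phase-2 cases the junction codon (GGT or AGG) is assembled from nucleotides on both sides of the intron, which is why the residue is split across the exon boundary in the figure.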

[Gene Evolution]

Exon Shuffling & Mosaic Proteins · Phase limitations on exon shuffling

Acceptance of mutants created by intronic recombination

Several levels of selection determine whether an intronic recombination mutant will be fixed or rejected
1. the chimeric intron must be spliced correctly
   – otherwise translation will probably run into a stop codon in the mRNA/intron region and form a truncated protein
2. the two non-orthologous introns must be in the same phase
   a. they must split the reading frame in the same phase
   b. the downstream exon must be translated in its original phase to prevent frameshift mutations
   c. symmetrical exons
3. the new protein must be able to adopt a stable conformation
4. there must be a selective advantage to having a new functional domain
   – the impact of exon insertion may initially be mitigated by alternative splicing


[Gene Evolution]

Exon Shuffling & Mosaic Proteins · Phase limitations on exon shuffling

Phase limitations of the exonic structure
– only symmetrical exons can be duplicated in tandem, inserted or deleted without affecting the reading frame
  • symmetrical exons: flanked by introns of the same phase at both ends

[Figure: genes annotated with their intron phases (0 2 1 1 1 2 1 2 1 and 0 2 1 1 2 1 2 1) together with symmetrical 1-1 exons]

[Gene Evolution]

Exon Shuffling & Mosaic Proteins · Phase limitations on exon shuffling

Phase limitations of the exonic structure
– only symmetrical exons can be duplicated in tandem, inserted or deleted without affecting the reading frame
– duplication or deletion of asymmetrical exons would disrupt the reading frame downstream
– the length of a symmetrical exon is always a multiple of 3 nucleotides
– insertion of symmetrical exons is also restricted
  • 0-0 exons can only be inserted into phase-0 introns
  • 1-1 exons can only be inserted into phase-1 introns
  • 2-2 exons can only be inserted into phase-2 introns

All the exons coding for the modules of mosaic proteins are symmetrical
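These rules follow from one observation: an exon flanked by a 5' intron of phase p5 and a 3' intron of phase p3 has a length congruent to (p3 - p5) modulo 3. The sketch below encodes that relation and the insertion rule. The function names are hypothetical, and the code only checks reading-frame bookkeeping, not splicing, folding or selection.

```python
def is_symmetrical(phase_5prime: int, phase_3prime: int) -> bool:
    """An exon is symmetrical when both flanking introns share one phase."""
    return phase_5prime == phase_3prime

def exon_length_mod3(phase_5prime: int, phase_3prime: int) -> int:
    """Exon length modulo 3 implied by the phases of its flanking introns."""
    return (phase_3prime - phase_5prime) % 3

def keeps_reading_frame(phase_5prime: int, phase_3prime: int) -> bool:
    """True if duplicating or deleting the exon keeps downstream codons in frame."""
    return exon_length_mod3(phase_5prime, phase_3prime) == 0

def can_insert(exon_phases: tuple, target_intron_phase: int) -> bool:
    """True if the exon can be inserted into an intron of the given phase."""
    phase_5prime, phase_3prime = exon_phases
    return (is_symmetrical(phase_5prime, phase_3prime)
            and phase_5prime == target_intron_phase)

for p5 in range(3):
    for p3 in range(3):
        label = "symmetrical" if is_symmetrical(p5, p3) else "asymmetrical"
        print(f"class {p5}-{p3}: {label}, "
              f"length mod 3 = {exon_length_mod3(p5, p3)}")
```

For example, `can_insert((1, 1), 1)` is true while `can_insert((1, 1), 0)` is false, matching the rule that a 1-1 exon fits only into a phase-1 intron.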

[Gene Evolution]

Exon Shuffling & Mosaic Proteins · Phase limitations on exon shuffling

Phase limitations of the exonic structure
– only symmetrical exons can be duplicated in tandem, inserted or deleted without affecting the reading frame
– duplication or deletion of asymmetrical exons would disrupt the reading frame downstream

[Figure: deletion, insertion and duplication shown for two genes, one with intron phases 2 2 2 2 2 2 2 and one with intron phases 2 2 0 0 1 1 0]

[Gene Evolution]

Exon Shuffling · Evolutionary role

• exon shuffling probably did not play a role in the formation of genes in the early stages of evolution
• it came into full bloom with the evolution of spliceosomal introns, which do not play a role in their own excision
  – these introns contain mainly nonessential parts
  – they could accommodate quantities of foreign DNA

Factors favouring intronic recombination
• middle repetitive sequences flanking an exon may facilitate the looping out or insertion of modules by intronic recombination
• Apolipoprotein
  – the number of tandem kringle domains ranges from 12 to 51 copies
    • in one variant, 24 of the 37 kringle domains have identical nucleotide sequences, suggesting very recent duplication
  – isoforms containing different numbers of kringle domains do not follow simple Mendelian patterns of inheritance
    • offspring often have apolipoprotein isoforms that differ from those of their parents

[Gene Evolution]

Alternative Pathways For Producing New Functions · Overlapping Genes

• a DNA segment can code for more than one gene product by using different reading frames or different initiation codons
  – widespread phenomenon in DNA and RNA viruses, as well as in organelles and bacteria
  – also known in nuclear eukaryotic genomes
• overlapping genes can also arise by the use of the complementary strand of a gene
  – e.g. the genes specifying tRNA(Ile) and tRNA(Gln) in the human mitochondrial genome
    • located on different strands
    • 3-nucleotide overlap between them: 5'-CTA-3' in tRNA(Ile), 5'-TAG-3' in tRNA(Gln)
  – also, the ND6 coding sequence corresponds to the complementary strand of cytB
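The two trinucleotides quoted for the tRNA overlap are the same three base pairs read from opposite strands. A short check, with a hypothetical helper name, makes this concrete:

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq: str) -> str:
    """Sequence of the opposite strand, written 5'->3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]

overlap_in_trna_ile = "CTA"   # 5'-CTA-3', from the slide
overlap_in_trna_gln = "TAG"   # 5'-TAG-3', from the slide

# The overlap on one strand should equal the reverse complement of the other.
print(reverse_complement(overlap_in_trna_ile) == overlap_in_trna_gln)
```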

[Gene Evolution]

Overlapping Genes

• ORFs are abundant throughout the genome
  – even a random DNA sequence might contain ORFs hundreds of bp long
• potential CDSs of considerable length exist
  1. in a different reading frame of an existing gene
  2. on the complementary strand
• an additional mRNA will be transcribed and translated into a new protein if
  a. by chance such an ORF contains an initiation codon and a transcription-initiation site
  b. such sites are created by mutation
• the rate of evolution is expected to be slower in DNA encoding overlapping genes than in similar DNA sequences that use only one reading frame
  – higher proportion of nondegenerate sites
  – reduced proportion of synonymous mutations
  – the 3rd codon position on a given strand is the 1st codon position on its complementary strand
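The statement that random DNA contains sizeable ORFs can be explored with a small simulation. The sketch below is illustrative only: the function names, the 10,000-bp sequence length, the fixed seed, and the definition of an ORF as a stretch from the first ATG to the next in-frame stop codon are all assumptions, not part of the slides. With 3 of the 64 codons being stops, an open stretch runs about 64/3 ≈ 21 codons on average, so ORFs of several hundred bp are the long tail rather than the rule.

```python
import random

STOPS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the complementary strand, written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def orf_lengths(seq):
    """Yield the length in bp (stop codon included) of each ATG...stop
    ORF found in the three reading frames of seq."""
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i                 # first initiation codon seen
            elif start is not None and codon in STOPS:
                yield i + 3 - start       # close the ORF at the stop
                start = None


def longest_orf(seq):
    """Longest ORF (bp) over all six reading frames; 0 if there is none."""
    lengths = list(orf_lengths(seq)) + list(orf_lengths(reverse_complement(seq)))
    return max(lengths, default=0)


if __name__ == "__main__":
    rng = random.Random(0)
    random_dna = "".join(rng.choice("ACGT") for _ in range(10_000))
    print("longest ORF in 10 kb of random DNA:", longest_orf(random_dna), "bp")
```

Scanning the reverse complement as well as the given strand covers both cases listed above: a different reading frame of the same strand and the complementary strand.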

Alternative Pathways For Producing New Functions

Alternative splicing

• production of different mRNAs from the same DNA segment
  – translated into different polypeptides
• the distinction between exons and introns is no longer absolute but depends on the mRNA of reference

2 types of exons
1. constitutive
   – included within all the mRNAs transcribed from a gene
2. facultative
   – exons that are sometimes spliced in and sometimes spliced out

2 types of alternative splicing
1. unconditional
   – two or more mRNA variants are produced in all tissues expressing the gene
2. conditional
   – tissue specific
   – developmental-stage specific
   – physiological-state specific

RNA splicing: removal of introns and ligation of adjacent exons

[Figure: splicing versus alternative splicing of a transcript with constitutive exons and a facultative exon, giving splice variants I and II]

Intron retention
• an unspliced intron can result in the addition of a peptide segment
  – the ORF must be maintained
• more commonly, intron retention results in the premature termination of translation due to frameshifts

[Figure: intron retention; alternative splicing gives splice variants I and II]

Alternative internal donor or acceptor sites
• excision of introns of different lengths, with complementary variation in the size of neighboring exons

[Figure: alternative internal donor site; alternative internal acceptor site]

Alternative transcription initiation and polyadenylation sites
• the different mRNAs produced from the same gene differ from one another only at their 5' or 3' ends
• alternative polyadenylation sites are common in eukaryotic nuclear genes

[Figure: alternative transcription initiation sites (each labeled TATA) and alternative polyadenylation sites (each labeled AATAA)]

Mutually exclusive exons
• 2 exons are never spliced out together, nor are both retained in the same mRNA
• M1 and M2 forms of pyruvate kinase
  – mutually exclusive use of exons 9 and 10 of a single gene

[Figure: mutually exclusive exons; alternative splicing gives splice variants I and II]


Cassette exons
• special case of mutual exclusivity
• a cassette is either spliced in or spliced out in the alternative mRNA
• usually the reading frame is maintained whether such an exon is in or out

[Figure: cassette exon; alternative splicing gives splice variants I and II]

Troponin-T gene
• 5 cassette exons in conjunction with 2 mutually exclusive exons
  – production of 64 different proteins from a single gene
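The figure of 64 follows from simple counting: each of the 5 cassette exons is independently in or out (2^5 = 32 combinations), and exactly one of the 2 mutually exclusive exons is used (× 2). A minimal sketch that enumerates the combinations; the exon labels are hypothetical placeholders, and the sketch assumes every combination is actually produced:

```python
from itertools import product

CASSETTE_EXONS = ["c1", "c2", "c3", "c4", "c5"]   # hypothetical labels
MUTUALLY_EXCLUSIVE = ["mxA", "mxB"]               # hypothetical labels


def splice_variants(cassettes, exclusive):
    """Enumerate every combination in which each cassette exon is either
    spliced in or out and exactly one mutually exclusive exon is used.
    The order of names inside a variant does not reflect exon order."""
    variants = []
    for included in product([False, True], repeat=len(cassettes)):
        kept = [name for name, keep in zip(cassettes, included) if keep]
        for choice in exclusive:
            variants.append(tuple(kept + [choice]))
    return variants


variants = splice_variants(CASSETTE_EXONS, MUTUALLY_EXCLUSIVE)
print(len(variants))  # 2**5 * 2 = 64
```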

Alternative splicing as a means of developmental regulation
Drosophila melanogaster
• at least 3 genes are involved in the process of sex determination
  – doublesex (dsx)
  – Sex-lethal (Sxl)
  – transformer (tra)
• these genes are spliced differently in males and females

[Figure: exon structures of dsx, tra, and Sxl, with the ♂ mRNA and ♀ mRNA produced from each gene; stop codons are marked]

Evolution of alternative splicing
• requires that an alternative splice junction site be created de novo
• such sites are created with an appreciable frequency by mutation
  – splicing signals are usually 5-10 nucleotides long
• many such examples are known

β+ thalassemia
• synonymous nucleotide substitution in the gene (GGT → GGA = glycine)
• however, not silent
• activation of a new splicing site in the β-globin gene
  – stronger than the old splice site
• production of a frameshifted protein
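That GGT and GGA encode the same amino acid can be confirmed against the standard genetic code. The sketch below builds the codon table in code and checks only the protein-level part of the argument; it does not model splice-site strength. The names `CODON_TABLE` and `is_synonymous` are illustrative assumptions.

```python
from itertools import product

# Standard genetic code, codons enumerated with bases in the order T, C, A, G
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def is_synonymous(codon_a, codon_b):
    """True if both codons specify the same amino acid (or both are stops)."""
    return CODON_TABLE[codon_a] == CODON_TABLE[codon_b]


print(CODON_TABLE["GGT"], CODON_TABLE["GGA"], is_synonymous("GGT", "GGA"))
# G G True
```

The check passes, so the substitution changes no amino acid, and any effect on the protein must come from the altered splicing.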


Concerted Evolution

An unexpected evolutionary phenomenon

• members of a repeated-sequence family are generally very similar to each other within one species
• members of the family from closely related species may differ greatly
• if each duplicate sequence evolves independently
  – the similarity between any two randomly chosen sequences within a species is expected to be the same as that between two sequences chosen from different species
• observed patterns reveal a high degree of within-species homogeneity among duplicated sequences


Xenopus and most other vertebrates
• the genes specifying the 18S and 28S rRNA are present in hundreds of copies and are arranged in one or a few tandem arrays
• each repeated unit consists of a transcribed and a nontranscribed segment
• rRNA genes in X. laevis and X. borealis
  1. 18S and 28S genes of the two species were virtually identical
  2. nontranscribed spacer (NTS) regions differed greatly between the two species
     – very similar within each individual and among individuals within a species

NTS regions in each species have evolved in concert but have diverged rapidly between species

[Figure: rRNA repeat units of Xenopus laevis and Xenopus borealis]

Evolutionary scenarios for the homogenization of a tandemly repeated array

a. Stringent Selection
• the function of the repeats depends on their specific nucleotide sequence
• beneficial mutations are fixed by positive selection (+)
• deleterious mutations are eliminated by purifying selection (–)

However
• NTS regions have no known function
  – do not appear to be subject to stringent selective constraints

[Figure: repeat array with mutations marked + or –]

a. Stringent Selection (NO)

b. Recent Multiplication
• the repeated family arises through the amplification of a single unit
• the homogeneity reflects the fact that there has not been enough time for the members of the multigene family to diverge from each other
• it is expected that the homogeneity of the family would gradually decrease
  – mutations would accumulate in the family members through genetic drift, particularly in regions that are not subject to stringent structural constraints

However
• intraspecific homogeneity of NTS regions in Xenopus does not decrease with evolutionary time

b. Recent Multiplication (NO)

c. Concerted Evolution
• an individual member of a gene family does not evolve independently of the other members of the family
• it exchanges sequence information with other members, reciprocally or nonreciprocally
• through genetic interactions among its members, a multigene family evolves in concert as a unit
• results in a homogenized set of nonallelic homologous sequences

concerted evolution requires
1. horizontal transfer of mutations among the family members (homogenization)
2. spread of mutations to all individuals in the population (fixation)


Mechanisms of concerted evolution

• Gene Conversion
• Unequal Crossing-Over
• Slipped-Strand Mispairing (Replication Slippage)
• Duplicative Transposition

all result in a homogenized set of nonallelic homologous sequences

Gene Conversion

nonreciprocal recombination process in which two sequences interact in such a way that one is converted by the other

• according to the chromatids involved in the process, gene conversion can be divided into several different types
  1. intrachromatid conversion: exchange between paralogous sequences on the same chromatid
  2. sister-chromatid conversion: exchange between paralogous sequences from complementary chromatids
  3. classical conversion: exchange between alleles at the same locus
  4. semiclassical conversion: exchange between paralogous genes from two homologous chromosomes
  5. ectopic conversion: exchange between paralogous sequences located on nonhomologous chromosomes

[Figure: the types of gene conversion (intrachromatid, sister-chromatid, classical, semiclassical, ectopic) drawn on a homologous chromosome pair and a nonhomologous chromosome]

• gene conversion has been found in all species and at all loci that were examined in detail
  – the most important types of gene conversion (GC) are the nonallelic conversions
• the rate of gene conversion varies with genomic location

Unbiased gene conversion
– sequence A has as much chance of converting sequence B as sequence B has of converting sequence A

Biased gene conversion
– the probabilities of gene conversion between two sequences in the two possible directions are unequal
• one sequence is then the master and the other is the slave
• more common than unbiased gene conversion
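The difference between unbiased and biased conversion can be made concrete with a toy simulation. The sketch below is a deliberately simplified model, not the method behind any result cited in the slides. A family of repeats starts with one variant copy `A` among `B` copies, pairs of repeats interact at random, and `bias` is the assumed probability that `A` converts `B` when the two differ. All names and parameter values are illustrative assumptions.

```python
import random


def conversion_event(repeats, bias, rng):
    """One gene-conversion event between two randomly chosen repeats.
    `bias` is the probability that variant 'A' is the donor when an 'A'
    copy and a 'B' copy interact (0.5 means unbiased)."""
    i, j = rng.sample(range(len(repeats)), 2)
    if repeats[i] != repeats[j]:
        winner = "A" if rng.random() < bias else "B"
        repeats[i] = repeats[j] = winner   # recipient now matches the donor


def run_until_homogeneous(n_repeats, bias, rng):
    """Start with one 'A' among 'B' copies; return the variant that takes over."""
    repeats = ["A"] + ["B"] * (n_repeats - 1)
    while len(set(repeats)) > 1:
        conversion_event(repeats, bias, rng)
    return repeats[0]


def fixation_frequency(n_repeats, bias, trials, seed=0):
    """Fraction of trials in which the initially rare variant 'A' wins."""
    rng = random.Random(seed)
    wins = sum(run_until_homogeneous(n_repeats, bias, rng) == "A"
               for _ in range(trials))
    return wins / trials


if __name__ == "__main__":
    for bias in (0.5, 0.55, 0.6):
        print(bias, fixation_frequency(n_repeats=10, bias=bias, trials=1000))
```

In this model the number of `A` copies performs a random walk, so with `bias = 0.5` the new variant takes over the family in about 1 of every n runs, and modest departures from 0.5 raise that fraction substantially.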


Unequal Crossing-Over

• reciprocal recombination process that creates
  – a sequence duplication in one chromatid or chromosome and
  – a corresponding deletion in the other
• may occur either
  – between the 2 sister chromatids of a chromosome during mitosis in a germ-line cell
  – between two homologous chromosomes at meiosis
• although the number of repeats either increases or decreases
  – both daughter chromosomes have a more homogeneous repeat make-up than the parental chromosome


• if this process is repeated
  – the numbers of each variant repeat on a chromosome will fluctuate with time
  – eventually one type will become dominant in the family

one type of repeat may spread throughout a gene family due to repeated rounds of unequal crossing-over

unequal crossing-over has also been suggested to have played a much more important role than gene conversion in the concerted evolution of the immunoglobulin VH gene family in mouse
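Repeated rounds of unequal crossing-over can be sketched as a toy model on a string of repeat variants. This is a simplified illustration under stated assumptions, not a model taken from the slides. The two sister chromatids carry the same array, a crossover joins breakpoint `p` on one to breakpoint `q` on the other, one product is passed on at random, and copy number is held between half and twice its starting value as a stand-in for selection on dosage. Function names, the maximum misalignment, and the starting array are hypothetical.

```python
import random


def unequal_crossover(array, p, q):
    """Reciprocal products of a crossover that joins breakpoint p on one
    sister chromatid to breakpoint q on the other (misaligned by p - q
    repeats): one product gains repeats, the other loses the same number."""
    return array[:p] + array[q:], array[:q] + array[p:]


def evolve(array, rounds, rng, max_shift=2):
    """Apply repeated unequal crossovers, passing on one product per round.
    Products outside the allowed copy-number range are discarded
    (assumption: a crude proxy for selection against dosage imbalance)."""
    min_len, max_len = max(1, len(array) // 2), 2 * len(array)
    for _ in range(rounds):
        n = len(array)
        p = rng.randint(0, n)
        q = min(max(p - rng.randint(-max_shift, max_shift), 0), n)
        child = rng.choice(unequal_crossover(array, p, q))
        if min_len <= len(child) <= max_len:
            array = child
    return array


if __name__ == "__main__":
    rng = random.Random(1)
    start = "ABABABABAB"   # two repeat variants, A and B
    for rounds in (0, 100, 1000, 10000):
        print(rounds, evolve(start, rounds, rng))
```

For example, `unequal_crossover("ABCD", 3, 1)` gives `("ABCBCD", "AD")`: a duplication of `BC` in one product and the matching deletion in the other. No new variants are ever created, so the composition can only drift among the existing types as copies are duplicated and lost.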


Relative roles of gene conversion and unequal crossing-over

Gene conversion has several advantages over unequal crossing-over

1. GC causes no change in gene number
   – unequal crossing-over (UCO) generates changes in the number of repeated genes within a family
   • this sometimes causes a significant dosage imbalance
     e.g. deletion of one of the two α-globin genes following unequal crossing-over gives rise to the mild form of α-thalassemia in homozygotes

2. GC can act as a correction mechanism on
   a. tandem repeats, but also on
   b. dispersed repeats within a chromosome, between homologous chromosomes, or between nonhomologous chromosomes
   – UCO is severely restricted when repeats dispersed on nonhomologous chromosomes are involved


Advantages of Gene Conversion over Unequal Crossing-Over
1. GC causes no change in gene number
2. GC can act as a correction mechanism on tandem and dispersed repeats
3. GC can be biased, i.e., have a preferred direction
   – experimental data from fungi have shown that bias in the direction of gene conversion is common and often strong
   – theoretical studies have shown that even a small bias can have a large effect on the probability of fixation of repeated mutants

Advantages of Unequal Crossing-Over over Gene Conversion
1. UCO is faster and more efficient in bringing about concerted evolution
   – at the mutation level, UCO occurs more frequently than GC
2. in a GC event, only a small region is involved
   – in yeast
     • an unequal crossing-over event involves on average ~20,000 bp
     • a GC tract cannot exceed 1,500 bp

Factors affecting the rate of concerted evolution

1. the number of repeats
   – the number of UCO events required for the fixation of a variant repeat increases roughly with n² (n = the number of repeats)
2. the arrangement of the repeats
   – dispersed arrangement
     a. causes UCO to lead to disastrous genetic consequences
     b. reduces the frequency of gene conversion
3. relative sizes of slowly and rapidly evolving regions within the repeat unit
   – both UCO and GC depend on sequence similarity for the misalignment of repeats
   – the more coding regions (slowly evolving) there are, the higher the rate of concerted evolution will be

Factors affecting the rate of concerted evolution (continued)

4. constraints on homogeneity
   – if the function requires large amounts of an invariable gene product
     e.g. rRNA and histone genes
     selection against variation
   – if the function requires a large amount of diversity
     e.g. immunoglobulin and histocompatibility genes
     selection against homogeneity

5. mechanisms of concerted evolution
   – concerted evolution under UCO is quicker than that under GC

6. population size
   • the time required for a variant to become fixed in a population depends on population size
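The n² scaling in factor 1 can be made concrete with a small sketch. It assumes the scaling is exactly quadratic (the slides only say "roughly"), and the function name and the reference family size are illustrative choices, not part of the source.

```python
def relative_uco_events(n_repeats: int, reference_repeats: int = 1) -> float:
    """Relative number of UCO events needed to fix a variant repeat.

    Assumes the number of events scales exactly with n^2, which the
    slides state only as a rough relationship.
    """
    if n_repeats < 1 or reference_repeats < 1:
        raise ValueError("repeat counts must be positive integers")
    return (n_repeats / reference_repeats) ** 2


# Compare families of different sizes against a 10-repeat family.
for n in (10, 100, 1000):
    ratio = relative_uco_events(n, reference_repeats=10)
    print(f"{n} repeats: {ratio:g}x the UCO events of a 10-repeat family")
```

Under this assumption, a tenfold larger family needs about a hundredfold more UCO events, which is why large families tend to homogenize more slowly.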

Evolutionary implications of concerted evolution

1. spread of advantageous mutations
   – spread of deleterious mutations is avoided by selection
2. retardation of divergence of duplicate genes
3. obliteration of evolutionary history
4. generation of allelic variation

[Figure: Gene conversion. 2 loci, 4 alleles → 2 loci, 3 alleles (decrease in variation)]

Detecting Concerted Evolution

Concerted evolution erases the record of molecular divergence during the evolution of paralogous sequences
– when observing similar paralogous sequences from a species, it is usually impossible to distinguish between two possible alternatives
  a. the sequences have only recently diverged from one another by duplication
  b. the sequences have evolved in concert

The phylogenetic approach can be a good choice


Detecting Concerted Evolution

The two α-globin genes in humans are almost identical
• thought to have duplicated quite recently
  – insufficient time for them to diverge in sequence
• duplicated α-globin genes were also discovered in distantly related species
• 2 possibilities
  a. multiple gene-duplication events occurred independently in many evolutionary lineages
  b. the two genes are quite ancient (duplicated once in the common ancestor), but their antiquity was subsequently obscured by concerted evolution

Detecting Concerted Evolution

The Aγ- and Gγ-globin genes in the great apes
– arose by a duplication that occurred approximately 55 Mya, after the divergence between prosimians and simians
• since the African apes diverged much later, we would expect the Gγ orthologous genes from apes to be much more similar to each other than to any of the Aγ paralogs

Detecting Concerted Evolution

[Figure: timeline showing the duplication of the Aγ- and Gγ-globin genes, followed by the divergence of the African apes]

Detecting Concerted Evolution

The Aγ- and Gγ-globin genes in the great apes
the orthologs should be closer to one another than the paralogs

[Figure: expected gene tree. An ancestral γ gene duplicates 55 Mya into Aγ and Gγ; speciation 5-7 Mya then leaves an Aγ and a Gγ copy in each species]

Detecting Concerted Evolution

The Aγ- and Gγ-globin genes in the great apes
in humans
  5′: Aγ ≠ Gγ at 7 out of 1,550 nucleotide positions (0.5%)
  3′: Aγ ≠ Gγ at 145 out of 1,550 nucleotide positions (9.4%)

– assuming that the 5′ and the 3′ parts are subject to similar functional constraints, the 5′ end of the gene underwent gene conversion
– intron 2 in both genes in all apes contains a stretch of (TG)n that can serve as a hotspot for recombination events involved in the process of gene conversion
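The two percentages quoted for the human genes follow directly from the difference counts. The short sketch below recomputes them and the 3′/5′ ratio; the function and variable names are illustrative only.

```python
def percent_divergence(differences: int, positions: int) -> float:
    """Percentage of aligned nucleotide positions that differ."""
    if positions <= 0:
        raise ValueError("positions must be positive")
    return 100.0 * differences / positions


# Counts taken from the slide: human A-gamma vs G-gamma.
five_prime = percent_divergence(7, 1550)
three_prime = percent_divergence(145, 1550)

print(f"5' divergence: {five_prime:.1f}%")
print(f"3' divergence: {three_prime:.1f}%")
print(f"3'/5' ratio:   {three_prime / five_prime:.1f}")
```

The roughly twentyfold gap between the two halves is what is hard to explain by functional constraint alone, and it motivates the gene-conversion interpretation of the 5′ end.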

Detecting Concerted Evolution

The Aγ- and Gγ-globin genes in the great apes

[Figure: phylogenetic tree of exon 3 of the Aγ and Gγ genes]

Orthologous genes from apes are more similar to each other if their 3′ parts (i.e. the 3rd exons) are considered


Detecting Concerted Evolution

The Aγ- and Gγ-globin genes in the great apes

[Figure: phylogenetic trees of the Aγ and Gγ genes for exon 3 and for exons 1 and 2]

The 5′ parts (exons 1 and 2)
• different phylogenetic pattern
• paralogous exons within each species resemble each other more than they resemble their orthologous counterparts in other apes
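The contrast between the exon 3 tree and the exons 1 and 2 tree can be reduced to one comparison: are paralogs within a species closer than orthologs between species? The sketch below is a simplified illustration of that test, not a tree-building method. The function name is hypothetical and the distance values are made up for the example; they are not taken from the slides.

```python
def grouping_pattern(d_paralogs_within: float, d_orthologs_between: float) -> str:
    """Classify how duplicated genes cluster.

    d_paralogs_within:   mean distance between the two paralogs inside a species
    d_orthologs_between: mean distance between orthologs of different species

    'by gene'    -> orthologs are closer (expected for an old duplication)
    'by species' -> paralogs are closer (pattern expected under concerted evolution)
    """
    if d_paralogs_within < d_orthologs_between:
        return "by species"
    return "by gene"


# Hypothetical distances (substitutions per site), for illustration only.
print("3' region:", grouping_pattern(d_paralogs_within=0.09, d_orthologs_between=0.02))
print("5' region:", grouping_pattern(d_paralogs_within=0.005, d_orthologs_between=0.02))
```

A region that groups "by species" although the duplication predates speciation is a candidate for gene conversion, but a real analysis would compare full trees rather than two averaged distances.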

Detecting Concerted Evolution

The Aγ- and Gγ-globin genes in the great apes

[Figure: phylogenetic trees of the Aγ and Gγ genes for exon 3 and for exons 1 and 2]

This tree contains an additional anomaly
• it clusters chimpanzee and gorilla as a clade

Detecting Concerted Evolution

The Aγ- and Gγ-globin genes in the great apes

[Figure: observed trees of the Aγ and Gγ genes for exon 3 and for exons 1 and 2, shown next to the expected tree for exons 1 and 2 if GC had only occurred in the human lineage]

Detecting Concerted Evolution

The Aγ- and Gγ-globin genes in the great apes

[Figure: observed trees of the Aγ and Gγ genes for exon 3 and for exons 1 and 2, shown next to the expected tree if GC had only occurred in the human lineage]

each of the 3 lineages has experienced multiple independent GC events

Detecting Concerted Evolution

The Aγ- and Gγ-globin genes in the great apes

[Figure: phylogenetic tree of exons 1 and 2]

Assuming that all parts of the genes evolve at equal rates
• it is possible to date the last gene conversion event by using
  a. the degrees of similarity between the two sequences, and
  b. the date for the gene duplication event
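One way to read this dating argument is as a simple proportion: under a constant rate, the unconverted region has been diverging since the duplication and the converted region only since the last conversion, so the ratio of their divergences scales the duplication date. The sketch below applies that idea to the human 5′ and 3′ counts and the 55 Mya duplication date quoted earlier in these slides. The function name is hypothetical, and this crude whole-region estimate is an illustration, not necessarily the calculation behind any published date.

```python
def date_last_conversion(d_converted: float,
                         d_unconverted: float,
                         t_duplication_mya: float) -> float:
    """Estimate the time since the last gene conversion event (in Mya).

    Assumes equal, constant substitution rates in both regions, so that
    divergence is proportional to the time since the sequences last
    shared a common template.
    """
    if d_unconverted <= 0:
        raise ValueError("the unconverted region must show some divergence")
    return t_duplication_mya * d_converted / d_unconverted


# Human A-gamma vs G-gamma: 7 differences (5') and 145 differences (3')
# over 1,550 positions each; duplication dated at about 55 Mya.
estimate = date_last_conversion(7 / 1550, 145 / 1550, 55.0)
print(f"last conversion roughly {estimate:.1f} Mya under these assumptions")
```

Because the estimate depends on which stretch of sequence is treated as converted, restricting the comparison to the most similar segment would give a more recent date.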

Detecting Concerted Evolution

The Aγ- and Gγ-globin genes in the great apes

[Figure: phylogenetic tree of exons 1 and 2]

1-2 Mya: last conversion event in the human lineage
– i.e., after the divergence between human and chimpanzee

GC in the chimp and gorilla lineages occurred independently


Detecting Concerted Evolution

GC in the evolution of X-linked color vision genes
Red and green opsin genes in humans
– intron 4 and intron 2 sequences of these two genes are identical (or nearly identical)
– unexpectedly low divergences, because the duplication event producing the two genes occurred before the separation of the human and Old World monkey lineages (<35 Mya)

[Figure: gene structure (exons 1-6, introns 2 and 4) annotated with the divergence between the red and green genes. Values as extracted, assignment to individual exons not recoverable: Ks/Ka = 5.6/1.5, 1.9/3.5, 7.6/3.9, 0.0/0.0, 0/0, 2.0/1.5 for the exons; K = 0.3 and K = 0.0 for the introns]

Detecting Concerted Evolution

GC in the evolution of X-linked color vision genes
Red and green opsin genes in humans
– divergences in the introns are significantly lower than both the synonymous and the nonsynonymous divergence in the coding sequences of exons 2-5
– high similarities in the intron sequences might be due to very recent GC, probably during evolution of the human lineage

[Figure: same gene-structure diagram (exons 1-6, introns 2 and 4, with Ks, Ka and K values) as on the previous slide]

Detecting Concerted Evolution

GC in the evolution of X-linked color vision genes
Red and green opsin genes in humans
– GC can occur in exons as well as in introns
– GC events in exons may be disadvantageous
  • reduce the differences between the red and green pigment genes
  • reduce the ability to distinguish between red and green colors
  the resultant changes may be eliminated from the population

High frequency of red-green or green-red fusion genes in human populations
– ~16% of Caucasian ♂
– ~21% of African-American ♂

suggests that during meiosis the red and green pigment genes may mispair frequently because of their high sequence similarity
• mispairing during meiosis increases the probability of GC
• high levels of similarity in introns facilitate mispairing and recombination, leading to production of hybrid genes

Detecting Concerted Evolution

GC in the evolution of X-linked color vision genes
Red and green opsin genes in humans, apes and OWM
– intron 4 sequences between the two genes have been strongly or completely homogenized in all 3 species

[Figure: divergence between the red and green genes in the three species. Species labels were not recovered; values are grouped in extraction order]

  Species   intron 4 K   exon 4+5 Ks   exon 4+5 Ka
  1         0.0          8.1           5.8
  2         0.3          6.6           5.1
  3         0.9          11.5          5.1
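The argument on this slide rests on comparing intron divergence with the divergence of the flanking exons. The sketch below encodes the three sets of values shown in the figure and flags comparisons in which intron 4 is far less diverged than exons 4 and 5. The species labels were not recoverable, so placeholder labels are used, and the 0.5 cutoff is an arbitrary illustrative threshold, not a statistical test from the source.

```python
from typing import NamedTuple


class Comparison(NamedTuple):
    label: str         # placeholder; species names were not recovered
    intron4_k: float   # divergence in intron 4
    exon45_ks: float   # synonymous divergence, exons 4+5
    exon45_ka: float   # nonsynonymous divergence, exons 4+5


comparisons = [
    Comparison("species 1", 0.0, 8.1, 5.8),
    Comparison("species 2", 0.3, 6.6, 5.1),
    Comparison("species 3", 0.9, 11.5, 5.1),
]


def intron_homogenized(c: Comparison, factor: float = 0.5) -> bool:
    """True if intron divergence is well below both exon measures.

    'factor' is a hypothetical cutoff chosen only for illustration.
    """
    return c.intron4_k < factor * min(c.exon45_ks, c.exon45_ka)


for c in comparisons:
    print(c.label, "intron 4 homogenized:", intron_homogenized(c))
```

Introns are normally expected to diverge at least as fast as synonymous sites, so an intron that is much more similar than the neighbouring exons points to recent homogenization of the intron rather than to constraint.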

Detecting Concerted Evolution

GC in the evolution of X-linked color vision genes
Red and green opsin genes in humans
– two or more conversion events may have occurred at different times in introns 4 of the two pigment genes in baboons

[Figure: divergences for exon 4+5 and intron 4 among the baboon pigment genes. Values in extraction order, pairing not recoverable: exon 4+5 Ks = 4.2, Ka = 0.0; intron 4 K = 1.1; exon 4+5 Ks = 5.8, Ka = 1.1; intron 4 K = 7.3; exon 4+5 Ks = 0.0, Ka = 0.7; intron 4 K = 1.0; exon 4+5 Ks = 9.0, Ka = 1.3; intron 4 K = 7.1]

Detecting Concerted Evolution

GC in the evolution of X-linked color vision genes
Red and green opsin genes in humans
– strong natural selection for maintaining the distinct functions of exons 4 and 5 of the red and green pigment genes has acted against sequence homogenization of these exons

[Figure: same exon 4+5 and intron 4 divergence diagram as on the previous slide]


Concerted evolution of genes and pseudogenes

Pseudogenes may represent reservoirs of genetic information that participate in the evolution of new genes, rather than relics of inactivated genes whose fate is genomic extinction
– the proximity of a gene to a pseudogene, however, may not only spell rebirth for the pseudogene, but also death for the gene

21-hydroxylase (cytochrome P21) gene
– example of gene death by concerted evolution
– involved in Congenital Adrenal Hyperplasia
– 10-exon gene located on chromosome 6 in a region in which many MHC and complement genes are interspersed with each other
– there is a paralogous unprocessed pseudogene in the vicinity
– in many organisms one of the genes became nonfunctional
– the nonfunctionalization event occurred independently in many lineages
  • ortholog of the human functional gene: pseudogene in mouse
  • ortholog of the human pseudogene: functional gene in mouse

Concerted evolution of genes and pseudogenes

21-hydroxylase (cytochrome P21) gene
– Congenital Adrenal Hyperplasia (21-hydroxylase deficiency)
– hundreds of mutations in the 21-hydroxylase gene have been described
– 75% of them are due to gene conversion