Shrish Tiwari CCMB, Hyderabad Comparative Genomics: Overview


Post on 27-Dec-2015

Introduction

• Genome sequences of 340 species are available (274 bacterial, 25 archaeal and 41 eukaryotic)

• An additional 848 prokaryotic and 560 eukaryotic genome projects are ongoing

• Comparison of genomes can provide insights into the functional regions as well as genome dynamics

Sequence Comparison

• Let us look at a simple example

A A T T G A - A T C G C C A
A - A T C A C A G - G A T C
5 matches, 6 mismatches, 3 indels

A A T T G A - A T C G C - C A
A A T - C A C A - G G A T C -
7 matches, 3 mismatches, 5 indels
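
The counts for the two alignments above can be checked with a short sketch. The function name `alignment_counts` and the use of "-" as the gap character are assumptions made for this illustration; any column with a gap in either row is counted as an indel.

```python
def alignment_counts(top: str, bottom: str, gap: str = "-") -> tuple[int, int, int]:
    """Return (matches, mismatches, indels) for two equal-length gapped rows."""
    if len(top) != len(bottom):
        raise ValueError("aligned rows must have the same length")
    matches = mismatches = indels = 0
    for a, b in zip(top, bottom):
        if a == gap or b == gap:
            indels += 1          # a gap in either row is an insertion/deletion
        elif a == b:
            matches += 1
        else:
            mismatches += 1
    return matches, mismatches, indels


# The two alignments from the slide, written without spaces.
first = alignment_counts("AATTGA-ATCGCCA", "A-ATCACAG-GATC")
second = alignment_counts("AATTGA-ATCGC-CA", "AAT-CACA-GGATC-")
print(first)   # (5, 6, 3)
print(second)  # (7, 3, 5)
```

Both rows pairs spell the same two ungapped sequences; only the placement of the gaps differs, which is why a scoring scheme is needed to choose between them.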

Sequence Comparison

• Requirements for sequence comparison:

– A scoring scheme or scoring matrix

– A search algorithm to identify the optimal alignment

• Scoring matrices available: PAM, BLOSUM

• Search algorithm used: Dynamic programming
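
As a rough illustration of the dynamic programming idea, the sketch below computes the optimal global alignment score in the Needleman-Wunsch style. The flat scores (match +1, mismatch -1, gap -1) and the function name are assumptions chosen for the example; they stand in for a real scoring matrix such as PAM or BLOSUM, which are defined for amino acids.

```python
def global_alignment_score(a: str, b: str,
                           match: int = 1, mismatch: int = -1, gap: int = -1) -> int:
    """Best global alignment score of a and b under a flat scoring scheme."""
    rows, cols = len(a) + 1, len(b) + 1
    # score[i][j] holds the best score for aligning a[:i] with b[:j].
    score = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        score[i][0] = i * gap            # a[:i] aligned against gaps only
    for j in range(1, cols):
        score[0][j] = j * gap            # b[:j] aligned against gaps only
    for i in range(1, rows):
        for j in range(1, cols):
            diagonal = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = score[i - 1][j] + gap   # gap in b
            left = score[i][j - 1] + gap # gap in a
            score[i][j] = max(diagonal, up, left)
    return score[-1][-1]


best = global_alignment_score("AATTGAATCGCCA", "AATCACAGGATC")
print(best)
```

Under this assumed scheme the second alignment in the earlier example scores 7 - 3 - 5 = -1, so the optimal score returned here can be no lower than -1.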

Applications

• Tracing our origins and history

• Assessing the diversity of a species

• Finding virulence genes

• Designing primers for novel species

• Identifying disease-causing mutations

• Predicting mutations in viral genomes and designing vaccines

Comparative Genomics

• Of distantly related species: look for similarities and conserved regions to infer functional regions of the genome; example: mouse and man

• Of closely related species: look for differences, identify the subtle mutations that make one species different from the other, and understand how genomes evolve; examples: chimp and man, virulent and benign E. coli

Comparative Genomics

• Comparison of the 73Kbp human β-globin region with the mouse and chimp genomes shows that 1) between human and mouse, only small stretches covering the first two exons and the intervening intron match, at ~73% identity, and 2) between human and chimp, almost the complete 73Kbp region matches, at ~97% identity
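
Percent identity figures like those above are computed from an alignment. The sketch below shows one common convention, counting identical bases over columns where neither sequence has a gap; this convention and the function name are assumptions for illustration, and other tools may normalise differently.

```python
def percent_identity(top: str, bottom: str, gap: str = "-") -> float:
    """Percentage of identical bases over the ungapped columns of an alignment."""
    if len(top) != len(bottom):
        raise ValueError("aligned rows must have the same length")
    # Keep only columns where both sequences contribute a base.
    columns = [(a, b) for a, b in zip(top, bottom) if a != gap and b != gap]
    if not columns:
        return 0.0
    identical = sum(a == b for a, b in columns)
    return 100.0 * identical / len(columns)


print(percent_identity("ACGT", "ACGA"))  # 75.0
```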

How different are we?

• Physical similarity is striking

How different are we?

• Socially, we have similar behaviour, including cooperation, warfare, politics and even bribery

Ape the toolmaker

Chimp Genome: Statistics

• Sequence of a single male, captive-born chimpanzee of the West African subspecies Pan troglodytes verus, obtained using a whole-genome shotgun approach

• The genome was assembled with the PCAP and ARACHNE programs

• The PCAP assembly is de novo; the ARACHNE assembly uses human genome build 34 to facilitate and confirm contig linking, and has greater continuity

Chimp Genome: Statistics

• 3.6-fold redundancy for autosomes and 1.8-fold for sex chromosomes; the assembly covers 94% of the chimp genome, with >98% of the sequence in high-quality bases (quality score >40, error rate <10^-4)

• 50% of the sequence (N50) is in contigs of length >15.7Kbp and supercontigs of length >8.6Mbp
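
The two assembly statistics above can be sketched as follows. This assumes the quality score follows the Phred convention (error probability 10^(-Q/10)), which is consistent with the slide's pairing of a score of 40 with an error rate of 10^-4, and it uses the usual definition of N50 as the length of the contig at which the running total, taken from the longest contig downwards, first reaches half of the assembled bases. Function names and the toy contig lengths are illustrative only.

```python
def phred_to_error(q: float) -> float:
    """Error probability for a Phred-style quality score q."""
    return 10 ** (-q / 10)


def n50(lengths: list[int]) -> int:
    """Length L such that contigs of length >= L hold at least half the bases."""
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length
    raise ValueError("no contig lengths given")


print(phred_to_error(40))               # error probability at quality 40
print(n50([2, 2, 2, 3, 3, 4, 8, 8]))    # 8: the two longest contigs hold 16 of 32 bases
```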

Chimp Genome Sequence

• Chimp genomes are polymorphic within and between subspecies

• 1.66 million high-quality SNPs identified, of which 1.01 million are heterozygous in the primary donor

• Diversity rate among West African chimps is 8 x 10^-4 (roughly the same as human diversity), and 17.6 x 10^-4 among Central African chimps
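
One way to read a diversity rate of this kind is as nucleotide diversity: the average number of differences per site between pairs of sequences sampled from the population. That reading is an assumption here, since the slide does not define the statistic, and the toy sequences below are invented for the example.

```python
from itertools import combinations


def nucleotide_diversity(seqs: list[str]) -> float:
    """Mean pairwise differences per site among equal-length aligned sequences."""
    if len(seqs) < 2:
        raise ValueError("need at least two sequences")
    length = len(seqs[0])
    if length == 0 or any(len(s) != length for s in seqs):
        raise ValueError("sequences must be non-empty and of equal length")
    pairs = list(combinations(seqs, 2))
    # Count differing sites for every pair, then average over pairs and sites.
    differences = sum(sum(a != b for a, b in zip(s, t)) for s, t in pairs)
    return differences / (len(pairs) * length)


print(nucleotide_diversity(["AAAA", "AAAT"]))  # 0.25
```

On this scale, a value of 8 x 10^-4 corresponds to roughly one difference every 1,250 bases between two randomly chosen copies.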

Genome Comparison

• Genome comparisons can help to reveal the molecular basis of these traits as well as the evolutionary mechanisms that have moulded our species

• Reciprocal nucleotide-level alignment of the chimp and human genome covers ~2.4Gbp of high quality sequence

Genome Comparison

• An observed difference nearly always reflects a single event in time, not multiple independent changes over time

• Most differences reflect random drift and hold extensive information about mutational processes

• A minority of functionally important changes underlie our phenotypic differences

Segmental Duplication

• Has had a larger impact (~2.7%) in altering the genomic landscape than single nucleotide substitutions (~1.2%)

• They are responsible for the emergence of new genes and adaptation of humans to their environment

• Human genome particularly enriched in genes resulting from recent duplications

Segmental Duplication

• 33% of human duplications (>94% identity) are not duplicated in chimpanzee

• An estimated duplication rate of 4-5Mbp per million years

• These have resulted in differences in gene expression, disease-causing duplications and change in the genomic landscape in general

Segmental Duplication

• Chimp-only duplications: in a cross-species comparison, 11 out of 17 were found only in chimp and not in man or other great apes, whereas 6 were also found in gorilla

• De novo duplications followed by deletion of older duplications are the most likely scenarios for excess of segmental duplications observed in human-ape genomes

Gene Evolution

• 13,454 pairs of human and chimp genes with unambiguous 1:1 orthology were used

• Rate of evolution of a gene assessed using the non-synonymous substitution rate KA

Gene Evolution

• The background rate is estimated as the synonymous substitution rate Ks

• KA/Ks is a measure of evolutionary constraint on a gene

• KA/Ks > 1 implies adaptive or positive selection, under the assumption that synonymous changes are neutral
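
The reading of the ratio can be sketched as a small helper. The slide states only the case KA/Ks > 1; the labels for the other two cases follow the standard interpretation and, like the function name and the input values, are assumptions added for illustration. A sharp cut-off at exactly 1 is a simplification, since real estimates carry sampling error.

```python
def selection_regime(ka: float, ks: float) -> str:
    """Label a gene by its KA/Ks ratio, assuming synonymous changes are neutral."""
    if ks <= 0:
        raise ValueError("Ks must be positive to form the ratio")
    ratio = ka / ks
    if ratio > 1:
        return "positive selection"     # amino acid changes fixed faster than background
    if ratio < 1:
        return "purifying selection"    # amino acid changes removed, gene is constrained
    return "neutral"


# Hypothetical rates, not taken from the human-chimp data.
print(selection_regime(0.002, 0.010))   # purifying selection
print(selection_regime(0.015, 0.010))   # positive selection
```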

Gene Evolution

• KA/Ks = 0.23 for the human-chimpanzee lineage, so, if synonymous changes are neutral, about 77% (1 - 0.23) of amino acid substitutions are removed by natural selection

• CpG and non-CpG substitutions at synonymous sites show lower divergence than in introns, ~50% and ~30% lower respectively, implying evolutionary constraint on synonymous substitutions

Gene Evolution

• 585 genes of the 13,454 human-chimp orthologues have KA/KI > 1

• Given the low divergence between the human and chimp genomes, the KA/KI statistic has a large variance

• Simulations show that KA/KI > 1 would be expected to occur by chance in 263 cases if purifying selection acts non-uniformly on genes

Gene Evolution

• The extreme outliers are:

– glycophorin C, mediates P. falciparum invasion pathways in human erythrocytes

– granulysin, mediates antimicrobial activity against intracellular pathogens

– protamines & semenogelins, involved in reproduction

– Mas-related gene family, involved in nociception

Conclusions

• Mean rate of single nucleotide changes is 1.23%, with <1.06% corresponding to fixed divergence

• Regional variation is the same in hominid and murid genomes, except in subtelomeric regions

• 25% of changes are at CpG sites, at rates that are similar in male and female germ lines

• Indels are fewer, but make 1.5% of the euchromatic sequence lineage-specific

Conclusions

• SINEs have been more active in human, while chimp has acquired two new retroviral elements

• Orthologous proteins typically differ by 2 amino acids, and ~29% are identical

• Amino acid altering changes are more frequent in hominids than in murids, but close to the level seen in human polymorphisms

• Substitution rate at silent sites is lower than at intronic sites => purifying selection

Is Y going extinct?

• X and Y chromosomes evolved from an autosomal pair in an ancient mammal nearly 300 million years ago

• Most Y genes lie in the X-degenerate regions

• X-degenerate region of Y does not recombine, which may lead to rapid gene loss

• Rate of gene loss estimated at 5 genes every million years

Is Y going extinct?

• Assuming gene loss occurs randomly and that human and chimp separated nearly 6 million years ago, many chimp Y genes are expected to have no functional orthologues in human

• Orthologues of all human X-degenerate genes and pseudogenes were searched

• Chimpanzee orthologues of 16 genes and 11 pseudogenes were identified

Is Y going extinct?

• All 11 chimp orthologues of the human pseudogenes were pseudogenes in the chimp as well, with the majority of inactivating mutations shared

• This indicates that none of the pseudogenes were lost between human and chimp in the last 6 million years

• GenScan and BLAST analysis of the chimp X-degenerate Y transcripts revealed that none were chimp specific

Is Y going extinct?

• Divergence of X-degenerate exons was compared with that of introns, for genes as well as pseudogenes

• The divergence was found to be lower in exons than in introns for genes, but the same or higher in pseudogenes

• These results suggest that purifying selection has been more effective during human evolution than previously assumed

J.F. Hughes et al. (2005) Nature 437, 101-104

Summary

• While we can learn a lot from a comparison of the human-chimp genomes, they are too much alike to get meaningful answers to many questions, e.g. a DNA sequence found in humans but missing in chimps: was it added in humans or lost in chimps?

Summary

• A difference found could be significant or just a variant within one species

• Sequences of other primates will be needed to establish the uniqueness of changes seen in humans and chimps

• Genomes of primates like the orang-utan and rhesus macaque are expected soon

Origin of Clothing

• Humans are infested with head and body lice

• Head louse lives and feeds on the scalp

• Body louse lives in clothing and feeds on the body

• Chimp louse used as outgroup

Origin of Clothing

• 2 sequences from mtDNA (ND4 and CYTB) and 2 from nuclear DNA (EF-1 and RPII) from 40 lice (26 head lice and 14 body lice) from 12 different geographic regions were used for analysis along with one chimpanzee louse

• Trees built using ND4 and CYTB nearly identical

Origin of Clothing

• Results:

– Greater diversity is seen in African lice than in non-African lice, suggesting an African origin for body lice

– Body louse originated ~72,000 years ago (assuming that human and chimp lice diverged ~5.5 million years ago)

– Demographic expansion of body lice correlates with the spread of modern humans out of Africa
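
The role of the calibration point in a date like this can be illustrated with a strict molecular clock, where divergence time is proportional to genetic distance. This is a simplified sketch and not necessarily the procedure used in the cited study; the function name and the distances in the example are hypothetical.

```python
def clock_date(node_distance: float, calibration_distance: float,
               calibration_years: float) -> float:
    """Age of a node under a strict clock, scaled from one calibrated split.

    node_distance: genetic distance accumulated since the node of interest
    calibration_distance: distance accumulated since the calibrated split
    calibration_years: assumed age of the calibrated split, in years
    """
    if calibration_distance <= 0:
        raise ValueError("calibration distance must be positive")
    # Same rate on all branches, so ages scale linearly with distance.
    return node_distance * calibration_years / calibration_distance


# Hypothetical distances; 5.5 million years is the calibration stated on the slide.
print(clock_date(0.5, 1.0, 5.5e6))  # 2750000.0
```

Because the estimate scales linearly with the calibration age, a different date for the human-chimp louse split would shift the inferred origin of body lice by the same proportion.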

Origin of Clothing

• Results indicate a recent origin of clothing, ~72,000 years ago

R. Kittler, M. Kayser and M. Stoneking (2003) “Molecular evolution of Pediculus humanus and the origin of clothing” Current Biology 13, 1414-1417

Conclusions

• Genomes of human and model organisms were sequenced in order to understand ourselves at the molecular level

• Comparative genomics studies have revealed interesting features of genome evolution so far

• This is just the tip of the iceberg!!
