molecular evidence for endosymbiosis perform blastp to investigate sequence similarity among domains...


Upload: lisa-barnett

Post on 29-Dec-2015


Page 1: Molecular evidence for endosymbiosis Perform blastp to investigate sequence similarity among domains of life Found yeast nuclear genes exhibit more sequence

Molecular evidence for endosymbiosis

• Perform blastp to investigate sequence similarity among domains of life

• Found yeast nuclear genes exhibit more sequence similarity (closer in evolutionary time) with archaeal genes

• Found yeast mitochondrial genes exhibit more sequence similarity with eubacterial genes

Page 2

t-test and significance

• A t-test determines whether two data sets come from the same population or whether their means differ significantly

• Calculate the mean and standard deviation of each data set, then derive a weighted (pooled) standard deviation to be used in the t-test

• Compare to t-critical value obtained from t-table or software
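The calculation on this slide can be sketched in a few lines of Python. This is a minimal illustration that assumes the equal-variance (pooled) two-sample form of the test. The function name and the sample values in the usage note are made up for the example and do not come from the lecture.

```python
import math
from statistics import mean, stdev


def pooled_t_statistic(sample_a, sample_b):
    """Return (t, degrees_of_freedom) for a two-sample pooled t-test."""
    n_a, n_b = len(sample_a), len(sample_b)
    s_a, s_b = stdev(sample_a), stdev(sample_b)  # sample standard deviations
    df = n_a + n_b - 2
    # Weighted (pooled) standard deviation of the two data sets.
    pooled_sd = math.sqrt(((n_a - 1) * s_a**2 + (n_b - 1) * s_b**2) / df)
    t = (mean(sample_a) - mean(sample_b)) / (pooled_sd * math.sqrt(1 / n_a + 1 / n_b))
    return t, df


if __name__ == "__main__":
    # Hypothetical measurements, for illustration only.
    t, df = pooled_t_statistic([1, 2, 3, 4, 5], [3, 4, 5, 6, 7])
    print(t, df)
```

With these made-up samples, |t| = 2.0 with 8 degrees of freedom. The two-tailed critical value at the 0.05 level for 8 degrees of freedom is 2.306, so the difference would not be called significant.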

Page 3

Origins of eukaryotic cells

Page 4

Martin-Muller hypothesis


Page 5

Evidence from phylogenetic relationships

Page 6

M. leprae vs. M. tuberculosis

• The M. leprae genome (3.2 Mb) is ~50% coding, compared with 4.4 Mb and 91% coding for M. tuberculosis

• Comparing genomes using MUMmer:

• http://www.tigr.org/tigr-scripts/CMR2/webmum/mumplot

Page 7

How MUMmer works:

• Uses suffix trees to create an internal representation of a genome sequence

• Identifies maximal unique matches (MUMs); version 2.0 streams sequence 2 against the suffix tree for sequence 1, whereas version 1.0 adds sequence 2 to the suffix tree for sequence 1

• Alignment via Smith-Waterman
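To make the idea of a maximal unique match concrete, here is a small brute-force sketch. It does not use a suffix tree and is not how MUMmer is implemented. It only shows what counts as a MUM: a match that cannot be extended in either direction and that occurs exactly once in each sequence. The function names and the `min_len` parameter are assumptions made for this example.

```python
def count_occurrences(text, pattern):
    """Count occurrences of pattern in text, including overlapping ones."""
    count, start = 0, text.find(pattern)
    while start != -1:
        count += 1
        start = text.find(pattern, start + 1)
    return count


def find_mums(seq_a, seq_b, min_len=4):
    """Return (pos_a, pos_b, length) for every maximal unique match.

    Brute force, suitable only for short toy sequences.
    """
    mums = []
    for i in range(len(seq_a)):
        for j in range(len(seq_b)):
            if seq_a[i] != seq_b[j]:
                continue
            # Skip starts that could be extended to the left (not maximal).
            if i > 0 and j > 0 and seq_a[i - 1] == seq_b[j - 1]:
                continue
            # Extend to the right as far as the sequences agree.
            length = 0
            while (i + length < len(seq_a) and j + length < len(seq_b)
                   and seq_a[i + length] == seq_b[j + length]):
                length += 1
            if length < min_len:
                continue
            match = seq_a[i:i + length]
            # Keep the match only if it is unique in both sequences.
            if count_occurrences(seq_a, match) == 1 and count_occurrences(seq_b, match) == 1:
                mums.append((i, j, length))
    return mums


if __name__ == "__main__":
    print(find_mums("GATTACAGG", "CCATTACATT"))
```

The nested loops make this quadratic or worse in sequence length. The suffix tree is what makes the same search practical on whole genomes.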

Page 8

Origin of species

• Mitochondrial DNA and human evolution

• Evolution of pathogens

Page 9

Phylogeny – data mining by biologists

• Molecular phylogenetics uses clustering techniques to discern relationships among different biological sequences

Page 10

Why phylogenetics?

• Understand evolutionary history

• Map pathogen strain diversity for vaccines

• Assist in epidemiology (e.g., the dentist-to-patient HIV transmission case)

• Aid in prediction of function of novel genes

• Biodiversity

• Microbial ecology

Page 11

Changes can occur

Page 12

Observing differences in nucleotides

• The simplest measure of distance between two sequences is to count the number of sites where the two sequences differ

• If not all sites are equally likely to change, the same site may undergo repeated substitutions

• As time goes by, the number of differences between two sequences becomes a less and less accurate estimator of the actual number of substitutions that have occurred
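The simple count of differing sites can be sketched as follows. This is a minimal example that assumes two aligned sequences of equal length with no gaps. The function name `p_distance` is chosen for illustration.

```python
def p_distance(seq_a, seq_b):
    """Proportion of aligned sites at which two sequences differ."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be aligned, non-empty and of equal length")
    differences = sum(1 for x, y in zip(seq_a, seq_b) if x != y)
    return differences / len(seq_a)


if __name__ == "__main__":
    print(p_distance("ACGTACGT", "ACGTTCGA"))
```

A site that changed twice, or changed and then changed back, still counts as at most one difference here. That is why this measure underestimates the true number of substitutions as divergence grows.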

Page 13

The relationship between time and substitutions is non-linear

Page 14

Various models have been generated to more accurately estimate distance and evolution

• All use the following framework:

Probability matrix

p_AC is the probability that a site starting with an A has a C at the end of time interval t, etc.

Base composition of sequence; f_A = frequency of A
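Written out, the framework might look like the following. The row and column order A, C, G, T is an assumption made for this sketch, since the matrix itself did not survive in the transcript.

```latex
P(t) =
\begin{pmatrix}
p_{AA} & p_{AC} & p_{AG} & p_{AT} \\
p_{CA} & p_{CC} & p_{CG} & p_{CT} \\
p_{GA} & p_{GC} & p_{GG} & p_{GT} \\
p_{TA} & p_{TC} & p_{TG} & p_{TT}
\end{pmatrix},
\qquad
\sum_{j} p_{ij} = 1 \ \text{for each row } i,
\qquad
f = (f_A,\ f_C,\ f_G,\ f_T), \quad f_A + f_C + f_G + f_T = 1
```

Each row sums to 1 because a site that starts as a given base must end the interval as one of the four bases.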

Page 15

Jukes-Cantor Model

• Distance between any two sequences is given by: $d = -\frac{3}{4}\ln\left(1 - \frac{4}{3}p\right)$

• p is the proportion of nucleotides that are different in the two sequences

• All substitutions are equally probable

– Each off-diagonal position in the matrix = $\alpha$; each diagonal position = $1 - 3\alpha$
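The Jukes-Cantor correction can be sketched directly from the formula on this slide. The function name is an assumption for the example. The input p is the observed proportion of differing sites.

```python
import math


def jukes_cantor_distance(p):
    """Jukes-Cantor distance d = -(3/4) ln(1 - (4/3) p).

    p is the observed proportion of differing sites. The formula is
    undefined for p >= 0.75, the expected value for unrelated sequences.
    """
    if not 0 <= p < 0.75:
        raise ValueError("p must satisfy 0 <= p < 0.75")
    if p == 0:
        return 0.0
    return -0.75 * math.log(1 - (4 / 3) * p)


if __name__ == "__main__":
    for p in (0.05, 0.3, 0.6):
        print(p, jukes_cantor_distance(p))
```

For small p the corrected distance is close to p itself. As p approaches 0.75 the correction grows without bound, which reflects the non-linear relationship between time and observed differences.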

Page 16

Kimura’s two parameter model

• $d = \frac{1}{2}\ln\left[\frac{1}{1-2P-Q}\right] + \frac{1}{4}\ln\left[\frac{1}{1-2Q}\right]$

• P and Q are proportional differences between the two sequences due to transitions and transversions, respectively.

• Accounts for transition bias in sequences (transversions are rarer than transitions)
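A minimal sketch of the two-parameter calculation is given below. It assumes aligned, ungapped DNA sequences that contain only A, C, G and T. The function names are chosen for illustration.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def transition_transversion_fractions(seq_a, seq_b):
    """Return (P, Q): proportions of sites differing by a transition / transversion."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be aligned, non-empty and of equal length")
    transitions = transversions = 0
    for x, y in zip(seq_a.upper(), seq_b.upper()):
        if x == y:
            continue
        # Purine<->purine or pyrimidine<->pyrimidine is a transition.
        if {x, y} <= PURINES or {x, y} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    n = len(seq_a)
    return transitions / n, transversions / n


def kimura_2p_distance(P, Q):
    """d = 1/2 ln[1/(1-2P-Q)] + 1/4 ln[1/(1-2Q)]."""
    if 1 - 2 * P - Q <= 0 or 1 - 2 * Q <= 0:
        raise ValueError("sequences are too divergent for this formula")
    return 0.5 * math.log(1 / (1 - 2 * P - Q)) + 0.25 * math.log(1 / (1 - 2 * Q))


if __name__ == "__main__":
    P, Q = transition_transversion_fractions("AAAAAAAAAA", "GAAAACAAAA")
    print(P, Q, kimura_2p_distance(P, Q))
```

In the toy pair above, one site differs by a transition (A to G) and one by a transversion (A to C), so P = Q = 0.1.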

Page 17

Evolutionary models

Page 18

Implementing models and building trees

Page 19

Rooted vs. unrooted

• Root – ancestor of all taxa considered

• Unrooted – relationship without consideration of ancestry

• Often specify root with an outgroup

– Outgroup: a distantly related species (e.g., an archaeal species as the outgroup for a set of mammals)

Page 20

Tree building

• Get protein/RNA/DNA sequences

• Construct multiple sequence alignment

• Compute pairwise distances (if necessary)

• Build tree – topology and distances

• Estimate reliability

• Visualize

Page 21

Distance methods

• UPGMA

• Neighbor joining

Page 22

Unweighted pair-group method using arithmetic averages (UPGMA)

• Assumes a constant rate of gene substitution (evolution)

• Clustering algorithm that measures distances between all sequences, merges the closest pair, recalculates distances to the merged node as an average, then merges the next closest pair, and repeats until one cluster remains

• Usually gives a rooted tree
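The clustering loop described above can be sketched as follows. This is a minimal illustration, not a production implementation. The input format (a dictionary keyed by `frozenset` pairs) and the Newick-style output string are choices made for this example.

```python
from itertools import combinations


def upgma(names, dist):
    """Build a rooted tree (Newick string) by UPGMA.

    names: list of unique taxon labels.
    dist:  dict mapping frozenset({a, b}) -> pairwise distance.
    """
    sizes = {name: 1 for name in names}      # cluster label -> number of leaves
    heights = {name: 0.0 for name in names}  # cluster label -> node height
    d = dict(dist)
    while len(sizes) > 1:
        # Merge the closest pair of current clusters.
        a, b = min(combinations(sizes, 2), key=lambda pair: d[frozenset(pair)])
        height = d[frozenset((a, b))] / 2
        merged = f"({a}:{height - heights[a]:g},{b}:{height - heights[b]:g})"
        size_a, size_b = sizes.pop(a), sizes.pop(b)
        # Distance to the merged node is the leaf-count-weighted average.
        for c in sizes:
            d[frozenset((merged, c))] = (
                size_a * d[frozenset((a, c))] + size_b * d[frozenset((b, c))]
            ) / (size_a + size_b)
        sizes[merged] = size_a + size_b
        heights[merged] = height
    (tree,) = sizes
    return tree + ";"


if __name__ == "__main__":
    # Hypothetical distances, for illustration only.
    distances = {
        frozenset(("A", "B")): 2,
        frozenset(("A", "C")): 6,
        frozenset(("B", "C")): 6,
    }
    print(upgma(["A", "B", "C"], distances))
```

Every leaf ends up at the same distance from the root. That is the constant-rate assumption expressed in the shape of the tree.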

Page 23

Testing the reliability of trees

• Interior branch test or Bootstrap analysis

• Bootstrap analysis – resample the alignment (subsequences, or site deletion or replacement), re-draw the trees, and count how many times the same branching is recovered. Bootstrap values of 70 (95) or greater are normally considered reliable
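The resampling step can be sketched as below. To keep the example short, it does not rebuild a full tree for each replicate. Instead it asks a simpler question, namely how often a given pair of sequences is still the closest pair, as a stand-in for "is this branching recovered?". That simplification, the function names, and the default of 100 replicates are all assumptions made for this illustration.

```python
import random


def bootstrap_columns(alignment, rng):
    """Resample alignment columns with replacement, keeping the same length."""
    n_sites = len(alignment[0])
    columns = [rng.randrange(n_sites) for _ in range(n_sites)]
    return ["".join(seq[c] for c in columns) for seq in alignment]


def closest_pair(alignment):
    """Indices (i, j) of the two sequences with the fewest differing sites."""
    pairs = [(i, j) for i in range(len(alignment)) for j in range(i + 1, len(alignment))]
    return min(
        pairs,
        key=lambda ij: sum(x != y for x, y in zip(alignment[ij[0]], alignment[ij[1]])),
    )


def bootstrap_support(alignment, pair, replicates=100, seed=0):
    """Percentage of bootstrap replicates in which `pair` is the closest pair."""
    rng = random.Random(seed)
    hits = sum(
        closest_pair(bootstrap_columns(alignment, rng)) == pair
        for _ in range(replicates)
    )
    return 100 * hits / replicates


if __name__ == "__main__":
    # Toy alignment, for illustration only.
    toy = ["ACGTACGTAC", "ACGTACGTAA", "TCGAACTTGC"]
    print(bootstrap_support(toy, (0, 1)))
```

In a real analysis each replicate alignment would be used to build a complete tree, and the support value for a branch would be the percentage of replicate trees that contain it.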

Page 24

Homework due on 10/6

• Discovery questions in Chapter 2

• 4, 25-27