Construction of Substitution Matrices
• BLOSUM: BLOcks SUbstitution Matrix
• PAM: Point Accepted Mutations
Substitution Matrices
• Contain values proportional to the probability that amino acid A mutates into amino acid B, for all pairs of amino acids, over a period of evolution
• Are constructed from a large and diverse sample of sequence alignments
• Multiple alignments of well-studied gene sequences from different species
• Use orthologs (functionally similar genes in different species)
• Observed substitutions tend to preserve function
• Minimal gaps
How to Construct Substitution Matrices
Tabulate substitutions
• A to A: 9867 times
• A to R: 2 times
• A to N: 9 times
• etc.
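The tabulation step can be sketched in a few lines of Python. This is a minimal illustration, not the procedure behind the counts shown on the slide: it counts unordered residue pairs column by column, in the style used for BLOSUM, whereas the original PAM tallies were derived from inferred ancestors in phylogenetic trees. The function name `tabulate_substitutions` and the toy alignment are hypothetical.

```python
from collections import Counter
from itertools import combinations

def tabulate_substitutions(alignment):
    """Count unordered residue pairs seen in each column of an alignment.

    `alignment` is a list of equal-length aligned sequences (hypothetical
    toy input). Pairs that involve a gap character are skipped.
    """
    counts = Counter()
    for column in zip(*alignment):
        for a, b in combinations(column, 2):
            if a == "-" or b == "-":
                continue
            # Sort so that (R, K) and (K, R) land in the same bucket.
            counts[tuple(sorted((a, b)))] += 1
    return counts

# Toy alignment of three sequences, assumed for illustration only.
toy_alignment = ["ARNA", "ARNA", "AKNA"]
counts = tabulate_substitutions(toy_alignment)
for pair, n in sorted(counts.items()):
    print(pair, n)
```

Identical pairs such as A to A are counted along with true substitutions, which is why the diagonal entries dominate the table.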
How to Construct Substitution Matrices (BLOSUM)
Finding the Random Mutation Rate
• Compute overall occurrence of an amino acid in a protein database
http://www.ebi.ac.uk/swissprot/sptr_stats/index.html
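As a rough sketch of this step, the background frequency of each amino acid can be computed by counting residues across a set of sequences. The two-sequence `toy_database` and the helper names are hypothetical stand-ins for a real protein database, and treating the chance pairing of two residues as the product of their frequencies is an independence assumption.

```python
from collections import Counter

def background_frequencies(sequences):
    """Return the fraction of all residues that each amino acid accounts for."""
    counts = Counter()
    for seq in sequences:
        counts.update(seq)
    total = sum(counts.values())
    return {aa: n / total for aa, n in counts.items()}

def expected_pair_rate(freqs, a, b):
    """Chance of seeing a paired with b if residues were independent (assumption)."""
    return freqs[a] * freqs[b]

# Hypothetical stand-in for a protein database.
toy_database = ["MKWA", "AAWR"]
freqs = background_frequencies(toy_database)
print(freqs["A"], freqs["W"])
print(expected_pair_rate(freqs, "W", "R"))
```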
Finding the Random Mutation Rate
Example:
Expected random mutation rate is 1 in 10,000 and observed mutation rate of W to R is 1 in 10
Score = log10(observed / expected) = log10(0.1 / 0.0001) = log10(1000) = +3
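The example above can be reproduced with a short Python sketch. It assumes a base-10 logarithm, which is what makes log(1000) equal 3; the function name `log_odds_score` is hypothetical, and the scaling and rounding applied to published matrices is omitted.

```python
import math

def log_odds_score(observed, expected):
    """Base-10 log of the ratio between observed and expected rates."""
    return math.log10(observed / expected)

# W to R: observed 1 in 10, expected by chance 1 in 10,000.
score = log_odds_score(observed=1 / 10, expected=1 / 10000)
print(round(score, 6))
```

A positive score means the substitution is seen more often than chance predicts, a negative score means it is seen less often, and zero means it occurs at the chance rate.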
PAM Matrices
• 1 PAM = 1 point mutation per 100 amino acids
• Does not take into account different evolutionary rates between conserved and non-conserved regions
• PAM1 is a 1% average change in amino acids
• PAM250: ?
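On the PAM250 question: higher-numbered PAM matrices are commonly obtained by multiplying the PAM1 mutation probability matrix by itself, so PAM250 corresponds to PAM1 raised to the 250th power. That is 250 accepted point mutations per 100 amino acids, not 250% of positions changed, because repeated mutations can hit the same site. The sketch below shows the extrapolation on a hypothetical two-letter alphabet; `toy_pam1` is invented for illustration and is not real PAM data.

```python
def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_pow(m, power):
    """Raise a square matrix to a non-negative integer power."""
    n = len(m)
    # Start from the identity matrix.
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(power):
        result = mat_mul(result, m)
    return result

# Hypothetical two-letter "PAM1": each residue changes with probability 1%.
toy_pam1 = [[0.99, 0.01],
            [0.01, 0.99]]
toy_pam250 = mat_pow(toy_pam1, 250)
for row in toy_pam250:
    print([round(p, 4) for p in row])
```

In this toy case each residue still has a little over a 50% chance of being unchanged after 250 steps, which illustrates why sequences at large PAM distances still retain detectable similarity.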
PAM vs. BLOSUM
Basic Local Alignment Search Tool (BLAST)
• Heuristic method
BLAST Algorithm
What can we search and compare?
DNA vs DNA
Protein vs Protein
DNA vs Protein
Protein vs DNA
Reading Frames
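Comparing DNA against protein means reading the DNA as codons, and a DNA sequence can be read in six frames: three offsets on the forward strand and three on the reverse complement. A minimal sketch that splits a sequence into codons for each frame follows; translation to amino acids is left out, and the function names and the example sequence are hypothetical.

```python
def reverse_complement(dna):
    """Return the reverse complement of an uppercase DNA string."""
    complement = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(complement[base] for base in reversed(dna))

def six_frames(dna):
    """Split a DNA string into codons for all six reading frames.

    Frames 0-2 are offsets 0, 1, 2 on the forward strand; frames 3-5 are
    the same offsets on the reverse complement. Trailing bases that do not
    fill a whole codon are dropped.
    """
    frames = []
    for strand in (dna, reverse_complement(dna)):
        for offset in range(3):
            usable = len(strand) - offset
            usable -= usable % 3
            frames.append([strand[i:i + 3]
                           for i in range(offset, offset + usable, 3)])
    return frames

frames = six_frames("ATGGCCTAA")
for index, codons in enumerate(frames):
    print(index, codons)
```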
The best BLAST program