License: http://creativecommons.org/licenses/by-sa/2.0/
From Genes to Proteins (and a little bit about alignments)
Prof: Rui Alves
[email protected]
973702406
Dept Ciencies Mediques Basiques, 1st Floor, Room 1.08
Website of the Course: http://web.udl.es/usuaris/pg193845/Courses/Bioinformatics_2007/
Course: http://10.100.14.36/Student_Server/
How Does BLAST Really Work?
• The BLAST programs improved the overall speed of searches while retaining good sensitivity (important as databases continue to grow) by breaking the query and database sequences into fragments ("words"), and initially seeking matches between fragments.
• Word hits are then extended in either direction in an attempt to generate an alignment with a score exceeding the threshold of "S".
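The word-matching step can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: it looks for exact word matches, whereas the real BLAST programs also accept similar ("neighborhood") words, and the function names split_into_words and find_seed_hits are made up for this example.

```python
from collections import defaultdict


def split_into_words(seq, n=3):
    """Return (offset, word) pairs for every overlapping word of length n."""
    return [(i, seq[i:i + n]) for i in range(len(seq) - n + 1)]


def find_seed_hits(query, subject, n=3):
    """Return (query_offset, subject_offset) pairs where words match exactly."""
    # Index the subject (database) sequence once, word -> list of offsets.
    index = defaultdict(list)
    for offset, word in split_into_words(subject, n):
        index[word].append(offset)

    # Look up every query word in the index.
    hits = []
    for q_off, word in split_into_words(query, n):
        for s_off in index.get(word, []):
            hits.append((q_off, s_off))
    return hits


if __name__ == "__main__":
    # Toy sequences, not real data.
    print(find_seed_hits("MSLI", "AAMSLIAA"))
```

Indexing the words first is what makes the search fast: each query word is looked up directly instead of being compared against every position of the database sequence.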
Extending the High Scoring Segment Pair (HSP)
[Figure: extending a word hit into an HSP. Labels: Neighborhood Score Threshold, Minimum Score, Significance Decay]
BLAST Algorithm
• Sequences are split into words (default n=3)
– Speed, computational efficiency
• Scoring of matches done using scoring matrices
• HSP = high scoring segment pair
– BLAST algorithm extends the initial "seed" hit into an HSP
• Local optimal alignment
• More than one HSP can be found
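A rough sketch of how a seed hit might be extended into an HSP follows. It is not the actual BLAST implementation: it uses a simple match/mismatch score instead of a real scoring matrix, allows no gaps, and stops extending once the running score falls more than x_drop below the best score seen so far. The names score_pair, extend_one_way, extend_seed and the default values are assumptions made for this example.

```python
def score_pair(a, b, match=1, mismatch=-1):
    """Toy scoring: +1 for identical residues, -1 otherwise (assumed values)."""
    return match if a == b else mismatch


def extend_one_way(query, subject, q, s, step, x_drop):
    """Walk from (q, s) in direction step; return the best gain and its length."""
    best = running = 0
    best_len = length = 0
    while 0 <= q < len(query) and 0 <= s < len(subject):
        running += score_pair(query[q], subject[s])
        length += 1
        if running > best:
            best, best_len = running, length
        elif best - running > x_drop:
            break  # score has decayed too far below the best: stop extending
        q += step
        s += step
    return best, best_len


def extend_seed(query, subject, q_off, s_off, n=3, x_drop=2):
    """Extend an exact word hit of length n into an ungapped HSP.

    Returns (score, query_start, subject_start, length).
    """
    seed = sum(score_pair(query[q_off + i], subject[s_off + i]) for i in range(n))
    right, right_len = extend_one_way(query, subject, q_off + n, s_off + n, +1, x_drop)
    left, left_len = extend_one_way(query, subject, q_off - 1, s_off - 1, -1, x_drop)
    return (seed + left + right, q_off - left_len, s_off - left_len,
            n + left_len + right_len)


if __name__ == "__main__":
    # Toy sequences; the seed "MSL" sits at offset 2 in both.
    print(extend_seed("AAMSLIWKK", "GGMSLIWTT", 2, 2))
```

An HSP found this way would then be kept only if its score reaches the minimum score S, and several seeds in the same pair of sequences can each give their own HSP.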
Where does the score (S) come from?
• The quality of each pair-wise alignment is represented as a score and the scores are ranked.
• Scoring matrices are used to calculate the score of the alignment base by base (DNA) or amino acid by amino acid (protein).
• The alignment score will be the sum of the scores for each position.
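As a small illustration of the last point, the sketch below sums per-position scores for two already aligned sequences of equal length. The DNA matrix values (+5 for a match, -4 for a mismatch) and the function names are assumptions chosen for the example, and gaps are not handled.

```python
def make_dna_matrix(match=5, mismatch=-4):
    """Build a toy DNA scoring matrix as a dict keyed by (base, base)."""
    bases = "ACGT"
    return {(a, b): (match if a == b else mismatch) for a in bases for b in bases}


def alignment_score(seq_a, seq_b, matrix):
    """Sum the matrix score of each aligned position (no gaps)."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    return sum(matrix[(a, b)] for a, b in zip(seq_a, seq_b))


if __name__ == "__main__":
    matrix = make_dna_matrix()
    query = "ACGT"
    candidates = ["ACGA", "ACGT", "TTGT"]  # toy sequences, not real data
    # Rank the candidate alignments from best to worst score.
    ranked = sorted(candidates, key=lambda s: alignment_score(query, s, matrix),
                    reverse=True)
    for subject in ranked:
        print(subject, alignment_score(query, subject, matrix))
```

For proteins the same function would work with an amino acid matrix in place of the toy DNA one.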
Predicting protein sequence from DNA sequence
• Protein sequence can be predicted by translating the cDNA and using the genetic code.
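A minimal sketch of this translation, assuming the cDNA starts in the correct reading frame and contains only the letters A, C, G and T. The table is the universal code written in the compact TCAG order; the name translate is chosen for this example.

```python
from itertools import product

# Universal genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... ('*' = stop).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
UNIVERSAL_CODE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cdna, code=UNIVERSAL_CODE):
    """Translate codon by codon from position 0, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(cdna) - 2, 3):
        aa = code[cdna[i:i + 3].upper()]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


if __name__ == "__main__":
    print(translate("ATGTCTCTTATATGA"))
```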
Codon Universal code
AAA K N(9,14)
AAC N
AAG K
AAT N
ACA T
ACC T
ACG T
ACT T
AGA R END(2) S(5,9,14,21) G(13)
AGC S
AGG R END(2) S(5,9,14,21) G(13)
AGT S
ATA I M(2,3,5,13,21)
ATC I
ATG M
ATT I
Variant code numbers: 2 = Vertebrate mitochondria; 3 = Yeast mitochondria; 5 = Invertebrate mitochondria; 9 = Echinoderm/Flatworm mitochondria; 13 = Ascidian mitochondria; 14 = Flatworm mitochondria; 21 = Trematode mitochondria
Codon Universal code
CAA Q
CAC H
CAG Q
CAT H
CCA P
CCC P
CCG P
CCT P
CGA R
CGC R
CGG R
CGT R
CTA L T(3)
CTC L T(3)
CTG L T(3) S(12)
CTT L T(3)
Variant code numbers: 3 = Yeast mitochondria; 12 = Candida
Codon Universal code
GAA E
GAC D
GAG E
GAT D
GCA A
GCC A
GCG A
GCT A
GGA G
GGC G
GGG G
GGT G
GTA V
GTC V
GTG V
GTT V
Codon Universal code
TAA END Q(6) Y(14)
TAC Y
TAG END Q(6,15) L(16,21)
TAT Y
TCA S END(21)
TCC S
TCG S
TCT S
TGA END W(2,3,4,5,6,9,13,14,21) C(10)
TGC C
TGG W
TGT C
TTA L
TTC F
TTG L
TTT F
Variant code numbers: 6 = Ciliate, Dasycladacean, Hexamita
Translating yeast mitochondrial cDNA into protein sequence
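cDNA: ATG TCT CTT ATA TGA ……… SECIS sequence
Universal code: Met Ser Leu Ile Ter
Yeast mitochondrial code: Met Ser Thr Met Trp
With a SECIS sequence downstream: TGA is read as selenocysteine (sCys) instead of Ter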
There is a gene with a considerably different protein sequence from the one we would predict from the universal genetic code!
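The difference can be checked with a short sketch that translates the same cDNA with two tables. The yeast mitochondrial table here is built by overriding only the codons marked with code 3 in the tables above (ATA, CTA, CTC, CTG, CTT, TGA); the names YEAST_MITO_CODE and translate_codons are made up for this example, and SECIS-dependent selenocysteine decoding is not modelled.

```python
from itertools import product

# Universal genetic code in TCAG codon order ('*' = stop).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
UNIVERSAL_CODE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Yeast mitochondrial variant: only the codons listed with "(3)" are changed.
YEAST_MITO_CODE = dict(
    UNIVERSAL_CODE,
    ATA="M",                             # Ile -> Met
    CTA="T", CTC="T", CTG="T", CTT="T",  # Leu -> Thr
    TGA="W",                             # stop -> Trp
)


def translate_codons(cdna, code):
    """Translate every complete codon, writing '*' for a stop codon."""
    return "".join(code[cdna[i:i + 3]] for i in range(0, len(cdna) - 2, 3))


if __name__ == "__main__":
    cdna = "ATGTCTCTTATATGA"
    print("universal :", translate_codons(cdna, UNIVERSAL_CODE))
    print("yeast mito:", translate_codons(cdna, YEAST_MITO_CODE))
```

Three of the five codons are read differently, so choosing the wrong table changes most of this short peptide and hides the fact that translation continues past TGA.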