intro sequence comparisons visualization alignments scoring
TRANSCRIPT
![Page 1: Intro Sequence comparisons Visualization Alignments Scoring](https://reader030.vdocument.in/reader030/viewer/2022020705/61fb85c12e268c58cd5f2bc1/html5/thumbnails/1.jpg)
Intro Sequence comparisons Visualization Alignments Scoring Algorithms
Last time
• Introduction
• What is Bioinformatics?
• Databases in Bioinformatics
![Page 2: Intro Sequence comparisons Visualization Alignments Scoring](https://reader030.vdocument.in/reader030/viewer/2022020705/61fb85c12e268c58cd5f2bc1/html5/thumbnails/2.jpg)
Today: Sequence comparisons
• Visualisation
• Different objectives
• Pairwise alignments
![Page 3: Intro Sequence comparisons Visualization Alignments Scoring](https://reader030.vdocument.in/reader030/viewer/2022020705/61fb85c12e268c58cd5f2bc1/html5/thumbnails/3.jpg)
Sequence comparisons: Goals
• What are the similarities?
• Local similarities: domains and motifs
• What is variable?
• Identify positions: basis for evolutionary studies
• Understand structural similarities
• Determine ancestry
![Page 7: Intro Sequence comparisons Visualization Alignments Scoring](https://reader030.vdocument.in/reader030/viewer/2022020705/61fb85c12e268c58cd5f2bc1/html5/thumbnails/7.jpg)
Homology
• Definition: Homology = common ancestry
• Principle: Similarity ⇒ homology
• Quote: "These sequences are somewhat homologous". Bad!
Similarity ≠ homology
• Correct: "These sequences are somewhat similar".
![Page 11: Intro Sequence comparisons Visualization Alignments Scoring](https://reader030.vdocument.in/reader030/viewer/2022020705/61fb85c12e268c58cd5f2bc1/html5/thumbnails/11.jpg)
Important questions
• When are two sequences significantly similar?
• How do we evaluate similarity?
![Page 13: Intro Sequence comparisons Visualization Alignments Scoring](https://reader030.vdocument.in/reader030/viewer/2022020705/61fb85c12e268c58cd5f2bc1/html5/thumbnails/13.jpg)
Data
• DNA: genes, genomes, non-coding DNA, etc.
• Codons
• RNA
• Peptides
![Page 14: Intro Sequence comparisons Visualization Alignments Scoring](https://reader030.vdocument.in/reader030/viewer/2022020705/61fb85c12e268c58cd5f2bc1/html5/thumbnails/14.jpg)
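Idea of dotplots
  Q V A S K I N T N E
S . . . • . . . . . .
V . • . . . . . . . .
A . . • . . . . . . .
T . . . . . . . • . .
K . . . . • . . . . .
I . . . . . • . . . .
Y . . . . . . . . . .
M . . . . . . . . . .
N . . . . . . • . • .
E . . . . . . . . . •
Put a dot where the residues are identical, then filter out randomness.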
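The dotplot idea can be sketched in a few lines of Python. This is only an illustration, not the tool behind the figures: the names `dotplot` and `filter_dots`, the `*` mark, and the filter settings (`window=3`, `min_matches=2`) are assumptions made for the example. The two ten-residue fragments are the ones on the slide.

```python
def dotplot(horizontal, vertical):
    """One text row per residue of `vertical`; '*' marks identical residues."""
    return [
        "".join("*" if v == h else "." for h in horizontal)
        for v in vertical
    ]


def filter_dots(horizontal, vertical, window=3, min_matches=2):
    """Keep a dot only where a diagonal window holds enough identities.

    This is one simple way to "filter out randomness"; the window size
    and threshold are illustrative assumptions.
    """
    rows = []
    for i in range(len(vertical) - window + 1):
        row = ""
        for j in range(len(horizontal) - window + 1):
            matches = sum(
                vertical[i + k] == horizontal[j + k] for k in range(window)
            )
            row += "*" if matches >= min_matches else "."
        rows.append(row)
    return rows


H = "QVASKINTNE"  # horizontal sequence from the slide
V = "SVATKIYMNE"  # vertical sequence from the slide

rows = dotplot(H, V)
filtered = filter_dots(H, V)

print("  " + H)
for residue, row in zip(V, rows):
    print(residue, row)
```

Isolated dots tend to disappear after filtering, while runs along a diagonal survive, which is what makes similar regions visible in larger plots.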
![Page 17: Intro Sequence comparisons Visualization Alignments Scoring](https://reader030.vdocument.in/reader030/viewer/2022020705/61fb85c12e268c58cd5f2bc1/html5/thumbnails/17.jpg)
Dotplots in practice
PttMAP20 (horizontal) vs. OsMAP20 (vertical)
[Figure: dotplot; horizontal axis ticks 0 and 100, vertical axis ticks 0 to 150]
![Page 21: Intro Sequence comparisons Visualization Alignments Scoring](https://reader030.vdocument.in/reader030/viewer/2022020705/61fb85c12e268c58cd5f2bc1/html5/thumbnails/21.jpg)
What happened here?
s1: A B C D
s2: A C B D
![Page 23: Intro Sequence comparisons Visualization Alignments Scoring](https://reader030.vdocument.in/reader030/viewer/2022020705/61fb85c12e268c58cd5f2bc1/html5/thumbnails/23.jpg)
Genomic dotplot
Many inversions around origin and termini of replication.
![Page 25: Intro Sequence comparisons Visualization Alignments Scoring](https://reader030.vdocument.in/reader030/viewer/2022020705/61fb85c12e268c58cd5f2bc1/html5/thumbnails/25.jpg)
Visualizing with alignment
OsMAP20   69 SQTSPKRSSPKHEQPLSYFRLHTEERAIKRAGFNYQVASKINTNEIIRR
             S+ +PK  + ++ +P   F+LHT +RA+KRA FNY VA+KI  NE  +R
PttMAP20  43 SKVAPKPFAKENTKPQE-FKLHTGQRALKRAMFNYSVATKIYMNEQQKR

OsMAP20  118 FEEKLSKVIEEREIKMMRKEMVHKAQLMPAFDKPFHPQRSTRPLTVPKE
               E++ K+IEE E++ MRKEMV +AQLMP FD+PF PQRS+RPLTVP+E
PttMAP20  91 QIERIQKIIEEEEVRTMRKEMVPRAQLMPYFDRPFFPQRSSRPLTVPRE

OsMAP20  167 PSF
             PSF
PttMAP20 140 PSF

OsMAP20   69 SQTSPKRSSPKHEQPLSYFRLHTEERAIKRAGFNYQVASKINTNEIIRRF 118
             |:.:||..:.::.:| ..|:|||.:||:|||.|||.||:||..||..:|.
PttMAP20  43 SKVAPKPFAKENTKP-QEFKLHTGQRALKRAMFNYSVATKIYMNEQQKRQ 91

OsMAP20  119 EEKLSKVIEEREIKMMRKEMVHKAQLMPAFDKPFHPQRSTRPLTVPKEPS 168
             .|::.|:|||.|::.||||||.:|||||.||:||.||||:||||||:|||
PttMAP20  92 IERIQKIIEEEEVRTMRKEMVPRAQLMPYFDRPFFPQRSSRPLTVPREPS 141

OsMAP20  169 F--LRLKC--CI 176
             |  :..||  ||
PttMAP20 142 FHMVNSKCWSCI 153
![Page 27: Intro Sequence comparisons Visualization Alignments Scoring](https://reader030.vdocument.in/reader030/viewer/2022020705/61fb85c12e268c58cd5f2bc1/html5/thumbnails/27.jpg)
Alignments
• Def: A pairwise alignment is a pairing of symbols between two sequences.
• Global alignment: Involves whole sequences.
• Local alignment: Involves parts of sequences.
• Semiglobal or ends-free alignment: Ignore "overhang" in similar sequences with different lengths
![Page 31: Intro Sequence comparisons Visualization Alignments Scoring](https://reader030.vdocument.in/reader030/viewer/2022020705/61fb85c12e268c58cd5f2bc1/html5/thumbnails/31.jpg)
Global vs local
Local:
OsMAP20   69 SQTSPKRSSPKHEQPLSYFRLHTEERAIKRAGFNYQVASKINTNEIIRRF 118
             |:.:||..:.::.:| ..|:|||.:||:|||.|||.||:||..||..:|.
PttMAP20  43 SKVAPKPFAKENTKP-QEFKLHTGQRALKRAMFNYSVATKIYMNEQQKRQ 91

OsMAP20  119 EEKLSKVIEEREIKMMRKEMVHKAQLMPAFDKPFHPQRSTRPLTVPKEPS 168
             .|::.|:|||.|::.||||||.:|||||.||:||.||||:||||||:|||
PttMAP20  92 IERIQKIIEEEEVRTMRKEMVPRAQLMPYFDRPFFPQRSSRPLTVPREPS 141

OsMAP20  169 F--LRLKC--CI 176
             |  :..||  ||
PttMAP20 142 FHMVNSKCWSCI 153

Global:
OsMAP20    1 MEK--TRKATSPKSSMTSSTGPKSPVRNGGSPPHKKSTSEFRGRKNESQI 48
             |||  |:.|.......:|.:.|.|....|.:....|..
PttMAP20   1 MEKAHTKSALKKLVKASSQSAPWSNAARGMAKDDLKDP------------ 38

OsMAP20   49 FRKGGQDSITLDESKRRSPTSQTSPKRSSPKHEQPLSYFRLHTEERAIKR 98
                      ..|:||       .:||..:.::.:| ..|:|||.:||:||
PttMAP20  39 ---------LYDKSK-------VAPKPFAKENTKP-QEFKLHTGQRALKR 71

OsMAP20   99 AGFNYQVASKINTNEIIRRFEEKLSKVIEEREIKMMRKEMVHKAQLMPAF 148
             |.|||.||:||..||..:|..|::.|:|||.|::.||||||.:|||||.|
PttMAP20  72 AMFNYSVATKIYMNEQQKRQIERIQKIIEEEEVRTMRKEMVPRAQLMPYF 121

OsMAP20  149 DKPFHPQRSTRPLTVPKEPSF--LRLKC--CIGGEFHRHFCYNA------ 188
             |:||.||||:||||||:||||  :..||  ||..:...::..:|
PttMAP20 122 DRPFFPQRSSRPLTVPREPSFHMVNSKCWSCIPEDELYYYFEHAHPHDHA 171

OsMAP20  189 -KAIK 192
              |.:|
PttMAP20 172 WKPVK 176
![Page 32: Intro Sequence comparisons Visualization Alignments Scoring](https://reader030.vdocument.in/reader030/viewer/2022020705/61fb85c12e268c58cd5f2bc1/html5/thumbnails/32.jpg)
More terminology
• Insertion
• Deletion
• Indel: an insertion or a deletion, when we don't know which
• Gap: indel in an alignment
• Indel character: usually "-"
 1 MEK--TRKATSPKSSMTSSTGPKSPVRNGGSPPHKKSTSEFRGRKNESQI 48
   |||  |:.|.......:|.:.|.|....|.:....|..
 1 MEKAHTKSALKKLVKASSQSAPWSNAARGMAKDDLKDP------------ 38

49 FRKGGQDSITLDESKRRSPTSQTSPKRSSPKHEQPLSYFRLHTEERAIKR 98
            ..|:||       .:||..:.::.:| ..|:|||.:||:||
39 ---------LYDKSK-------VAPKPFAKENTKP-QEFKLHTGQRALKR 71
![Page 33: Intro Sequence comparisons Visualization Alignments Scoring](https://reader030.vdocument.in/reader030/viewer/2022020705/61fb85c12e268c58cd5f2bc1/html5/thumbnails/33.jpg)
Choosing alignment?
OsMAP20   69 SQTSPKRSSPKHEQPLSYFRLHTEERAIKRAGFNYQVASKINTNEIIRR
             S+ +PK  + ++ +P   F+LHT +RA+KRA FNY VA+KI  NE  +R
PttMAP20  43 SKVAPKPFAKENTKPQE-FKLHTGQRALKRAMFNYSVATKIYMNEQQKR

OsMAP20  118 FEEKLSKVIEEREIKMMRKEMVHKAQLMPAFDKPFHPQRSTRPLTVPKE
               E++ K+IEE E++ MRKEMV +AQLMP FD+PF PQRS+RPLTVP+E
PttMAP20  91 QIERIQKIIEEEEVRTMRKEMVPRAQLMPYFDRPFFPQRSSRPLTVPRE

OsMAP20  167 PSF
             PSF
PttMAP20 140 PSF

OsMAP20   69 SQTSPKRSSPKHEQPLSYFRLHTEERAIKRAGFNYQVASKINTNEIIRRF 118
             |:.:||..:.::.:| ..|:|||.:||:|||.|||.||:||..||..:|.
PttMAP20  43 SKVAPKPFAKENTKP-QEFKLHTGQRALKRAMFNYSVATKIYMNEQQKRQ 91

OsMAP20  119 EEKLSKVIEEREIKMMRKEMVHKAQLMPAFDKPFHPQRSTRPLTVPKEPS 168
             .|::.|:|||.|::.||||||.:|||||.||:||.||||:||||||:|||
PttMAP20  92 IERIQKIIEEEEVRTMRKEMVPRAQLMPYFDRPFFPQRSSRPLTVPREPS 141

OsMAP20  169 F--LRLKC--CI 176
             |  :..||  ||
PttMAP20 142 FHMVNSKCWSCI 153
![Page 34: Intro Sequence comparisons Visualization Alignments Scoring](https://reader030.vdocument.in/reader030/viewer/2022020705/61fb85c12e268c58cd5f2bc1/html5/thumbnails/34.jpg)
Principle: Identity
• Def: The identity of an alignment is the fraction of identical paired symbols.
• Early selection criterion: Choose the alignment with the highest identity.
Here: 62/112 ≈ 55% identity
OsMAP20   69 SQTSPKRSSPKHEQPLSYFRLHTEERAIKRAGFNYQVASKINTNEIIRRF 118
             |:.:||..:.::.:| ..|:|||.:||:|||.|||.||:||..||..:|.
PttMAP20  43 SKVAPKPFAKENTKP-QEFKLHTGQRALKRAMFNYSVATKIYMNEQQKRQ 91

OsMAP20  119 EEKLSKVIEEREIKMMRKEMVHKAQLMPAFDKPFHPQRSTRPLTVPKEPS 168
             .|::.|:|||.|::.||||||.:|||||.||:||.||||:||||||:|||
PttMAP20  92 IERIQKIIEEEEVRTMRKEMVPRAQLMPYFDRPFFPQRSSRPLTVPREPS 141

OsMAP20  169 F--LRLKC--CI 176
             |  :..||  ||
PttMAP20 142 FHMVNSKCWSCI 153
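The identity figure on this slide can be checked with a short Python sketch. The helper name `identity` is an assumption for illustration; the two gapped strings are the three alignment blocks from the slide joined together, and gap columns are counted in the alignment length, as the slide's denominator of 112 suggests.

```python
def identity(aligned_a, aligned_b, gap="-"):
    """Return (identical pairs, alignment length) for two gapped sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # A column counts as identical only if both symbols agree and are not gaps.
    same = sum(1 for a, b in zip(aligned_a, aligned_b) if a == b and a != gap)
    return same, len(aligned_a)


os_map20 = (
    "SQTSPKRSSPKHEQPLSYFRLHTEERAIKRAGFNYQVASKINTNEIIRRF"
    "EEKLSKVIEEREIKMMRKEMVHKAQLMPAFDKPFHPQRSTRPLTVPKEPS"
    "F--LRLKC--CI"
)
ptt_map20 = (
    "SKVAPKPFAKENTKP-QEFKLHTGQRALKRAMFNYSVATKIYMNEQQKRQ"
    "IERIQKIIEEEEVRTMRKEMVPRAQLMPYFDRPFFPQRSSRPLTVPREPS"
    "FHMVNSKCWSCI"
)

same, length = identity(os_map20, ptt_map20)
print(f"{same}/{length} = {same / length:.0%}")
```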
![Page 36: Intro Sequence comparisons Visualization Alignments Scoring](https://reader030.vdocument.in/reader030/viewer/2022020705/61fb85c12e268c58cd5f2bc1/html5/thumbnails/36.jpg)
Scoring an alignment
• Identity loses information on similarity
• Better: assign a score to every pair of symbols, $s(x, y) = c$. Example for DNA:
s  A  T  G  C
A  2 -1  1 -1
T -1  2 -1  1
G  1 -1  2 -1
C -1  1 -1  2
• Indel scores: $s(x, -) = s(-, x) \stackrel{?}{=} -1$
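One way to express the example scores in code is as a small rule instead of a table. In the matrix above, identical bases score 2, A/G and T/C pairs (purine with purine, pyrimidine with pyrimidine) score 1, and all other mismatches score -1. The sketch below rebuilds the table from that rule; the function name `s` mirrors the slide's notation, and the indel score of -1 is the slide's tentative value, kept as a parameter.

```python
PURINES = frozenset("AG")
PYRIMIDINES = frozenset("CT")
GAP = "-"


def s(x, y, indel_score=-1):
    """Score for pairing symbols x and y (either may be the gap symbol)."""
    if x == GAP or y == GAP:
        return indel_score  # assumed value; the slide marks it with "?"
    if x == y:
        return 2
    pair = {x, y}
    if pair <= PURINES or pair <= PYRIMIDINES:
        return 1  # same chemical class (A/G or C/T)
    return -1


bases = "ATGC"
print("s " + " ".join(f"{b:>2}" for b in bases))
for x in bases:
    print(x, " ".join(f"{s(x, y):2d}" for y in bases))
```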
![Page 39: Intro Sequence comparisons Visualization Alignments Scoring](https://reader030.vdocument.in/reader030/viewer/2022020705/61fb85c12e268c58cd5f2bc1/html5/thumbnails/39.jpg)
Scoring an alignment
• Alignment $x', y'$ (the sequences with gap symbols inserted) from sequences $x$ and $y$. E.g.: $x$ = AAGTT, $y$ = AATT, alignment is
x' AAGTT
y' AA-TT
• Alignment score is
$$S(x', y') = \sum_{i=1}^{|x'|} s(x'_i, y'_i)$$
• Here:
$$S(x', y') = s(A, A) + s(A, A) + s(G, -) + s(T, T) + s(T, T)$$
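As a worked check of the sum, the sketch below scores the example alignment column by column. It uses the DNA example matrix from the earlier slide and assumes an indel score of -1 (the value the slides mark as tentative); `SCORES`, `pair_score`, and `alignment_score` are names made up for this illustration.

```python
BASES = "ATGC"
MATRIX = [
    [2, -1, 1, -1],   # A
    [-1, 2, -1, 1],   # T
    [1, -1, 2, -1],   # G
    [-1, 1, -1, 2],   # C
]
SCORES = {
    (a, b): MATRIX[i][j]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
}
INDEL = -1  # assumption: the slides leave this value open


def pair_score(a, b):
    """s(a, b) for one alignment column."""
    if a == "-" or b == "-":
        return INDEL
    return SCORES[(a, b)]


def alignment_score(aligned_x, aligned_y):
    """Sum of column scores over two gapped sequences of equal length."""
    if len(aligned_x) != len(aligned_y):
        raise ValueError("aligned sequences must have equal length")
    return sum(pair_score(a, b) for a, b in zip(aligned_x, aligned_y))


# s(A,A) + s(A,A) + s(G,-) + s(T,T) + s(T,T) = 2 + 2 - 1 + 2 + 2
print(alignment_score("AAGTT", "AA-TT"))  # 7
```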
![Page 41: Intro Sequence comparisons Visualization Alignments Scoring](https://reader030.vdocument.in/reader030/viewer/2022020705/61fb85c12e268c58cd5f2bc1/html5/thumbnails/41.jpg)
How do we choose an alignment?
• Want to choose the best global alignment
• Many alignments
• Given $x = x_1 x_2 \cdots x_m$ and $y = y_1 y_2 \cdots y_n$, find an alignment $x', y'$ that maximizes the score $S(x', y')$.
• Idea: Find the best way of ending the alignment
![Page 43: Intro Sequence comparisons Visualization Alignments Scoring](https://reader030.vdocument.in/reader030/viewer/2022020705/61fb85c12e268c58cd5f2bc1/html5/thumbnails/43.jpg)
How to end alignment: alternatives
One of:
$$\begin{array}{l|l} x_1 \cdots x_{m-1} & x_m \\ y_1 \cdots y_{n-1} & y_n \end{array} \qquad M_{m-1,n-1} + s(x_m, y_n)$$
or
$$\begin{array}{l|l} x_1 \cdots x_{m-1} & x_m \\ y_1 \cdots y_n & - \end{array} \qquad M_{m-1,n} + s(x_m, -)$$
or
$$\begin{array}{l|l} x_1 \cdots x_m & - \\ y_1 \cdots y_{n-1} & y_n \end{array} \qquad M_{m,n-1} + s(-, y_n)$$
![Page 47: Intro Sequence comparisons Visualization Alignments Scoring](https://reader030.vdocument.in/reader030/viewer/2022020705/61fb85c12e268c58cd5f2bc1/html5/thumbnails/47.jpg)
A recursion for max alignment score
Note: for global alignment
$$M_{0,0} = 0$$
$$M_{m,n} = \max \begin{cases} M_{m-1,n-1} + s(x_m, y_n) & m > 0,\ n > 0 \\ M_{m-1,n} + s(x_m, -) & m > 0,\ n \ge 0 \\ M_{m,n-1} + s(-, y_n) & m \ge 0,\ n > 0 \end{cases}$$
We get:
$$M_{m,n} = \max_{x', y'} S(x', y')$$
where the maximum is taken over all alignments $x', y'$ of $x$ and $y$.
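The recursion can be transcribed almost literally into Python. This is a sketch for small inputs only: it assumes the DNA example scores from the earlier slide with an indel score of -1, and the names `s` and `max_alignment_score` are illustrative. Memoization via `functools.lru_cache` keeps each $M_{m,n}$ from being recomputed.

```python
from functools import lru_cache


def s(a, b):
    """Example DNA scores; the indel score of -1 is an assumption."""
    if a == "-" or b == "-":
        return -1
    if a == b:
        return 2
    return 1 if {a, b} in ({"A", "G"}, {"C", "T"}) else -1


def max_alignment_score(x, y):
    """Best global alignment score M[len(x), len(y)] by direct recursion."""

    @lru_cache(maxsize=None)
    def M(m, n):
        if m == 0 and n == 0:
            return 0
        candidates = []
        if m > 0 and n > 0:   # end with x_m paired to y_n
            candidates.append(M(m - 1, n - 1) + s(x[m - 1], y[n - 1]))
        if m > 0:             # end with x_m paired to a gap
            candidates.append(M(m - 1, n) + s(x[m - 1], "-"))
        if n > 0:             # end with a gap paired to y_n
            candidates.append(M(m, n - 1) + s("-", y[n - 1]))
        return max(candidates)

    return M(len(x), len(y))


print(max_alignment_score("AAGTT", "AATT"))  # 7
```

Python's recursion limit makes this form impractical for long sequences; it is meant only to mirror the three cases of the formula.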
![Page 48: Intro Sequence comparisons Visualization Alignments Scoring](https://reader030.vdocument.in/reader030/viewer/2022020705/61fb85c12e268c58cd5f2bc1/html5/thumbnails/48.jpg)
Computing $M_{m,n}$
• Keep $M_{i,j}$ in a table
• Table + Recursion = Dynamic Programming
• Needleman-Wunsch algorithm
• $mn$ elements in table ⇒ time complexity is ∼ $mn$.
• When filling the table, note alternatives.
• Backtracking for retrieving the alignment.
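The table-filling and backtracking steps can be sketched as follows. This is a minimal illustration, not a reference implementation: it assumes the DNA example scores with an indel score of -1, and the names `needleman_wunsch`, `s`, and `back` are made up for the example. Ties between alternatives are broken in favour of the diagonal move, which is an arbitrary choice.

```python
def s(a, b):
    """Example DNA scores; the indel score of -1 is an assumption."""
    if a == "-" or b == "-":
        return -1
    if a == b:
        return 2
    return 1 if {a, b} in ({"A", "G"}, {"C", "T"}) else -1


def needleman_wunsch(x, y):
    """Return (score, aligned_x, aligned_y) for a best global alignment."""
    m, n = len(x), len(y)
    M = [[0] * (n + 1) for _ in range(m + 1)]
    back = [[None] * (n + 1) for _ in range(m + 1)]  # chosen alternative

    for i in range(m + 1):
        for j in range(n + 1):
            if i == 0 and j == 0:
                continue
            options = []
            if i > 0 and j > 0:
                options.append((M[i - 1][j - 1] + s(x[i - 1], y[j - 1]), "diag"))
            if i > 0:
                options.append((M[i - 1][j] + s(x[i - 1], "-"), "up"))
            if j > 0:
                options.append((M[i][j - 1] + s("-", y[j - 1]), "left"))
            M[i][j], back[i][j] = max(options, key=lambda o: o[0])

    # Backtrack from the bottom-right cell to recover one best alignment.
    ax, ay = [], []
    i, j = m, n
    while i > 0 or j > 0:
        move = back[i][j]
        if move == "diag":
            ax.append(x[i - 1]); ay.append(y[j - 1]); i -= 1; j -= 1
        elif move == "up":
            ax.append(x[i - 1]); ay.append("-"); i -= 1
        else:
            ax.append("-"); ay.append(y[j - 1]); j -= 1
    return M[m][n], "".join(reversed(ax)), "".join(reversed(ay))


score, ax, ay = needleman_wunsch("AAGTT", "AATT")
print(score)
print(ax)
print(ay)
```

The table has $(m+1)(n+1)$ cells and each is filled in constant time, which matches the ∼ $mn$ running time stated above.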
![Page 50: Intro Sequence comparisons Visualization Alignments Scoring](https://reader030.vdocument.in/reader030/viewer/2022020705/61fb85c12e268c58cd5f2bc1/html5/thumbnails/50.jpg)
DP and backtracking
From Eddy, Nature Biotech, 2004
![Page 51: Intro Sequence comparisons Visualization Alignments Scoring](https://reader030.vdocument.in/reader030/viewer/2022020705/61fb85c12e268c58cd5f2bc1/html5/thumbnails/51.jpg)
DP for local alignments
• Smith-Waterman algorithm
• Allow "restarting" from zero.
$$M_{0,0} = 0$$
$$M_{m,n} = \max \begin{cases} M_{m-1,n-1} + s(x_m, y_n) & m > 0,\ n > 0 \\ M_{m-1,n} + s(x_m, -) & m > 0,\ n \ge 0 \\ M_{m,n-1} + s(-, y_n) & m \ge 0,\ n > 0 \\ 0 & \leftarrow \text{Here!} \end{cases}$$
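A sketch of the local variant, under the same assumptions as before (DNA example scores, indel score -1; the names `smith_waterman` and `s` are illustrative). Because the indel score is negative, the extra 0 alternative makes the whole first row and column zero, so the table can simply be initialised with zeros. Backtracking starts at the highest-scoring cell and stops at the first cell that holds 0.

```python
def s(a, b):
    """Example DNA scores; the indel score of -1 is an assumption."""
    if a == "-" or b == "-":
        return -1
    if a == b:
        return 2
    return 1 if {a, b} in ({"A", "G"}, {"C", "T"}) else -1


def smith_waterman(x, y):
    """Return (score, aligned_x, aligned_y) for a best local alignment."""
    m, n = len(x), len(y)
    M = [[0] * (n + 1) for _ in range(m + 1)]
    best, best_pos = 0, (0, 0)

    for i in range(1, m + 1):
        for j in range(1, n + 1):
            M[i][j] = max(
                0,  # "restart" the alignment here
                M[i - 1][j - 1] + s(x[i - 1], y[j - 1]),
                M[i - 1][j] + s(x[i - 1], "-"),
                M[i][j - 1] + s("-", y[j - 1]),
            )
            if M[i][j] > best:
                best, best_pos = M[i][j], (i, j)

    # Backtrack from the best cell until a zero cell is reached.
    ax, ay = [], []
    i, j = best_pos
    while i > 0 and j > 0 and M[i][j] > 0:
        if M[i][j] == M[i - 1][j - 1] + s(x[i - 1], y[j - 1]):
            ax.append(x[i - 1]); ay.append(y[j - 1]); i -= 1; j -= 1
        elif M[i][j] == M[i - 1][j] + s(x[i - 1], "-"):
            ax.append(x[i - 1]); ay.append("-"); i -= 1
        else:
            ax.append("-"); ay.append(y[j - 1]); j -= 1
    return best, "".join(reversed(ax)), "".join(reversed(ay))


# The shared core AAGTT is found even though the flanks do not match.
print(smith_waterman("CCCAAGTTCCC", "GGAAGTTGG"))
```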