![Page 1: Michael Schroeder BioTechnological Center TU Dresden Biotec Introduction based on Chapter 1 Lesk, Introduction to Bioinformatics](https://reader035.vdocument.in/reader035/viewer/2022062321/56649e685503460f94b64afe/html5/thumbnails/1.jpg)
Michael Schroeder, BioTechnological Center, TU Dresden (Biotec)
Introduction
based on Chapter 1
Lesk, Introduction to Bioinformatics
![Page 2: Michael Schroeder BioTechnological Center TU Dresden Biotec Introduction based on Chapter 1 Lesk, Introduction to Bioinformatics](https://reader035.vdocument.in/reader035/viewer/2022062321/56649e685503460f94b64afe/html5/thumbnails/2.jpg)
By Michael Schroeder, Biotec, 2
Contents
Molecular biology primer
The role of computer science
Phylogeny
Sequence searching
![Page 3: Michael Schroeder BioTechnological Center TU Dresden Biotec Introduction based on Chapter 1 Lesk, Introduction to Bioinformatics](https://reader035.vdocument.in/reader035/viewer/2022062321/56649e685503460f94b64afe/html5/thumbnails/3.jpg)
26 June 2000: Draft of human genome sequenced!
1953: Watson and Crick discover the structure of DNA
2000: Draft of the human genome is announced
“The most wondrous map ever produced by humankind”
“One of the most significant scientific landmarks of all time, comparable with the invention of the wheel or the splitting of the atom”
![Page 4: Michael Schroeder BioTechnological Center TU Dresden Biotec Introduction based on Chapter 1 Lesk, Introduction to Bioinformatics](https://reader035.vdocument.in/reader035/viewer/2022062321/56649e685503460f94b64afe/html5/thumbnails/4.jpg)
High-throughput biomedicine
Microarrays: measure the activity of thousands of genes at the same time
Example: cancer. Compare activity with and without drug treatment
Result: hundreds of candidate drug targets
RNAi (Nobel Prize 2006, Fire and Mello): knock down genes and observe the effect
Example: infectious diseases. Which proteins orchestrate entry into the cell?
Result: hundreds of candidate proteins
Atomic force microscopes (Nobel Prize laureate Binnig): pull a protein out of the membrane and measure the force
Example: eye diseases resulting from misfolding
Result: hundreds of candidate residues
![Page 5: Michael Schroeder BioTechnological Center TU Dresden Biotec Introduction based on Chapter 1 Lesk, Introduction to Bioinformatics](https://reader035.vdocument.in/reader035/viewer/2022062321/56649e685503460f94b64afe/html5/thumbnails/5.jpg)
Drug Discovery
Challenge: Longer time to market, fewer drugs, exploding costs
Approach: Use of compound libraries and high-throughput screening
![Page 6: Michael Schroeder BioTechnological Center TU Dresden Biotec Introduction based on Chapter 1 Lesk, Introduction to Bioinformatics](https://reader035.vdocument.in/reader035/viewer/2022062321/56649e685503460f94b64afe/html5/thumbnails/6.jpg)
HTS and Bioinformatics
High-throughput technologies have completely changed the work of biomedical researchers
Challenge: Interpret (often large) results of screens
Approach: Before running secondary assays use bioinformatics and IT to assemble all possible information
![Page 7: Michael Schroeder BioTechnological Center TU Dresden Biotec Introduction based on Chapter 1 Lesk, Introduction to Bioinformatics](https://reader035.vdocument.in/reader035/viewer/2022062321/56649e685503460f94b64afe/html5/thumbnails/7.jpg)
Good News
Tens of thousands of 3D structures
Millions of sequences
Millions of articles
Hundreds of DBs/tools
![Page 8: Michael Schroeder BioTechnological Center TU Dresden Biotec Introduction based on Chapter 1 Lesk, Introduction to Bioinformatics](https://reader035.vdocument.in/reader035/viewer/2022062321/56649e685503460f94b64afe/html5/thumbnails/8.jpg)
Bad News: Data != Knowledge
How to analyse data, how to integrate data?
Computer science to the rescue…
![Page 9: Michael Schroeder BioTechnological Center TU Dresden Biotec Introduction based on Chapter 1 Lesk, Introduction to Bioinformatics](https://reader035.vdocument.in/reader035/viewer/2022062321/56649e685503460f94b64afe/html5/thumbnails/9.jpg)
Example: computer science is key for sequencing
The human genome is a string of length 3,200,000,000
Shotgun sequencing: break multiple copies of the string into shorter substrings
Example:
shotgunsequencing shotgunsequencing shotgunsequencing
cing en encing equ gun ing ns otgu seq sequ sh sho shot tg uenc un
Computing problem: Assemble strings
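The assembly problem can be sketched as a greedy overlap merge: repeatedly join the two fragments with the longest suffix/prefix overlap. This is only an illustration under simplifying assumptions (error-free reads, no repeats), not how real assemblers work; the names `overlap` and `greedy_assemble` and the four example reads are made up for this sketch.

```python
def overlap(a: str, b: str) -> int:
    """Length of the longest suffix of a that is also a prefix of b."""
    for k in range(min(len(a), len(b)), 0, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads):
    """Greedy superstring heuristic: merge the best-overlapping pair until one string is left."""
    # Remove duplicates, empty reads, and reads fully contained in another read.
    unique = sorted({r for r in reads if r})
    reads = [r for r in unique if not any(r != s and r in s for s in unique)]
    while len(reads) > 1:
        best_k, best_i, best_j = -1, 0, 1
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    k = overlap(a, b)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        # Join the pair, writing the shared overlap only once.
        merged = reads[best_i] + reads[best_j][best_k:]
        reads = [r for idx, r in enumerate(reads) if idx not in (best_i, best_j)]
        reads.append(merged)
    return reads[0] if reads else ""


# Hypothetical reads chosen so that consecutive reads overlap by three letters.
print(greedy_assemble(["shotgun", "gunseq", "sequenc", "encing"]))  # shotgunsequencing
```

With very short fragments or repeated substrings the greedy choice can pick a wrong overlap, which is why repeats are hard to assemble.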
![Page 10: Michael Schroeder BioTechnological Center TU Dresden Biotec Introduction based on Chapter 1 Lesk, Introduction to Bioinformatics](https://reader035.vdocument.in/reader035/viewer/2022062321/56649e685503460f94b64afe/html5/thumbnails/10.jpg)
Computer science key for sequencing
sh sho shot otgu tg gun un ns seq sequ equ uenc encing en cing ing
QUESTION: How can you handle long repetitive sequences?
Heeeeelllllllllllooooooo
QUESTION: Why was a draft announced? When was the final version ready?
![Page 11: Michael Schroeder BioTechnological Center TU Dresden Biotec Introduction based on Chapter 1 Lesk, Introduction to Bioinformatics](https://reader035.vdocument.in/reader035/viewer/2022062321/56649e685503460f94b64afe/html5/thumbnails/11.jpg)
Arabidopsis thaliana
mouse
rat
Caenorhabditis elegans
Drosophila melanogaster
Mycobacterium leprae
Vibrio cholerae
Plasmodium falciparum
Mycobacterium tuberculosis
Neisseria meningitidis Z2491
Helicobacter pylori
Xylella fastidiosa
Borrelia burgdorferi
Rickettsia prowazekii
Bacillus subtilis
Archaeoglobus fulgidus
Campylobacter jejuni
Aquifex aeolicus
Thermotoga maritima
Chlamydia pneumoniae
Pseudomonas aeruginosa
Ureaplasma urealyticum
Buchnera sp. APS
Escherichia coli
Saccharomyces cerevisiae
Yersinia pestis
Salmonella enterica
Thermoplasma acidophilum
![Page 12: Michael Schroeder BioTechnological Center TU Dresden Biotec Introduction based on Chapter 1 Lesk, Introduction to Bioinformatics](https://reader035.vdocument.in/reader035/viewer/2022062321/56649e685503460f94b64afe/html5/thumbnails/12.jpg)
DNA – the molecule of life
http://www.ornl.gov/hgmis
![Page 13: Michael Schroeder BioTechnological Center TU Dresden Biotec Introduction based on Chapter 1 Lesk, Introduction to Bioinformatics](https://reader035.vdocument.in/reader035/viewer/2022062321/56649e685503460f94b64afe/html5/thumbnails/13.jpg)
The genetic code
![Page 14: Michael Schroeder BioTechnological Center TU Dresden Biotec Introduction based on Chapter 1 Lesk, Introduction to Bioinformatics](https://reader035.vdocument.in/reader035/viewer/2022062321/56649e685503460f94b64afe/html5/thumbnails/14.jpg)
Protein Structure
DNA: Nucleotides are very similar, and hence the structure of DNA is very uniform
Proteins: Great variety in three-dimensional conformation to support diverse structures and functions
If heated, a protein “unfolds” to a biologically inactive structure; under normal conditions the protein folds
![Page 15: Michael Schroeder BioTechnological Center TU Dresden Biotec Introduction based on Chapter 1 Lesk, Introduction to Bioinformatics](https://reader035.vdocument.in/reader035/viewer/2022062321/56649e685503460f94b64afe/html5/thumbnails/15.jpg)
Paradox
Translation from DNA sequence to amino acid sequence is very simple to describe, but requires immensely complicated machinery (ribosome, tRNA)
The folding of the protein sequence into its three-dimensional structure is very difficult to describe, but occurs spontaneously
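To illustrate how simple translation is to describe, here is a minimal sketch that maps codons to one-letter amino acid codes using the standard genetic code. The function name `translate` and the example DNA string are assumptions for illustration; the sketch reads only the given strand in frame 0 and raises a `KeyError` on characters other than A, C, G, T.

```python
from itertools import product

# Standard genetic code, codons enumerated with bases in the order T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a coding DNA sequence codon by codon, stopping at the first stop codon."""
    dna = dna.upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":  # stop codon
            break
        protein.append(amino_acid)
    return "".join(protein)


# Hypothetical example: ATG (Met), GCC (Ala), TGG (Trp), TAA (stop).
print(translate("ATGGCCTGGTAA"))  # MAW
```

No comparably short program exists for the second half of the paradox, predicting the folded structure from the amino acid sequence.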
![Page 16: Michael Schroeder BioTechnological Center TU Dresden Biotec Introduction based on Chapter 1 Lesk, Introduction to Bioinformatics](https://reader035.vdocument.in/reader035/viewer/2022062321/56649e685503460f94b64afe/html5/thumbnails/16.jpg)
Central Dogma
DNA sequence determines protein sequence
Protein sequence determines protein structure
Protein structure determines protein function
![Page 17: Michael Schroeder BioTechnological Center TU Dresden Biotec Introduction based on Chapter 1 Lesk, Introduction to Bioinformatics](https://reader035.vdocument.in/reader035/viewer/2022062321/56649e685503460f94b64afe/html5/thumbnails/17.jpg)
Sequence vs. structure similarity
Picture from www.jenner.ac.uk/YBF/DanielleTalbot.ppt
![Page 18: Michael Schroeder BioTechnological Center TU Dresden Biotec Introduction based on Chapter 1 Lesk, Introduction to Bioinformatics](https://reader035.vdocument.in/reader035/viewer/2022062321/56649e685503460f94b64afe/html5/thumbnails/18.jpg)
Sequence vs. structure similarity
Picture from www.jenner.ac.uk/YBF/DanielleTalbot.ppt
High sequence similarity = high structure similarity
![Page 19: Michael Schroeder BioTechnological Center TU Dresden Biotec Introduction based on Chapter 1 Lesk, Introduction to Bioinformatics](https://reader035.vdocument.in/reader035/viewer/2022062321/56649e685503460f94b64afe/html5/thumbnails/19.jpg)
Sequence vs. structure similarity
Picture from www.jenner.ac.uk/YBF/DanielleTalbot.ppt
Low sequence similarity usually means low structure similarity
![Page 20: Michael Schroeder BioTechnological Center TU Dresden Biotec Introduction based on Chapter 1 Lesk, Introduction to Bioinformatics](https://reader035.vdocument.in/reader035/viewer/2022062321/56649e685503460f94b64afe/html5/thumbnails/20.jpg)
Sequence vs. structure similarity
Picture from www.jenner.ac.uk/YBF/DanielleTalbot.ppt
Low sequence similarity, but possibly still high structure similarity
![Page 21: Michael Schroeder BioTechnological Center TU Dresden Biotec Introduction based on Chapter 1 Lesk, Introduction to Bioinformatics](https://reader035.vdocument.in/reader035/viewer/2022062321/56649e685503460f94b64afe/html5/thumbnails/21.jpg)
11% sequence identity, yet the structures match perfectly
![Page 22: Michael Schroeder BioTechnological Center TU Dresden Biotec Introduction based on Chapter 1 Lesk, Introduction to Bioinformatics](https://reader035.vdocument.in/reader035/viewer/2022062321/56649e685503460f94b64afe/html5/thumbnails/22.jpg)
Sequence similarity is key concept
Similar sequences are a hint for common ancestry and possibly similar function
![Page 23: Michael Schroeder BioTechnological Center TU Dresden Biotec Introduction based on Chapter 1 Lesk, Introduction to Bioinformatics](https://reader035.vdocument.in/reader035/viewer/2022062321/56649e685503460f94b64afe/html5/thumbnails/23.jpg)
Sequence similarity is key concept
Similar sequences are a hint for common ancestry and possibly similar function
![Page 24: Michael Schroeder BioTechnological Center TU Dresden Biotec Introduction based on Chapter 1 Lesk, Introduction to Bioinformatics](https://reader035.vdocument.in/reader035/viewer/2022062321/56649e685503460f94b64afe/html5/thumbnails/24.jpg)
Sequence similarity is key concept
Example: v-sis vs. PDGF
Example from the early 80s:
v-sis in simian sarcoma virus leads to cancer in infected cells
PDGF in humans is a normal growth factor for cells
v-sis and PDGF are 85% similar
Alignment from: http://pdf.aminer.org/000/244/500/design_and_implementation_of_a_dna_sequence_processor.pdf
![Page 25: Michael Schroeder BioTechnological Center TU Dresden Biotec Introduction based on Chapter 1 Lesk, Introduction to Bioinformatics](https://reader035.vdocument.in/reader035/viewer/2022062321/56649e685503460f94b64afe/html5/thumbnails/25.jpg)
Sequence similarity is key concept
If an unknown sequence is found, deduce its function/structure indirectly by finding similar sequences, whose function/structure is known
Assumption: Evolution changes sequences “slowly”, often maintaining the main features of a sequence’s function/structure
![Page 26: Michael Schroeder BioTechnological Center TU Dresden Biotec Introduction based on Chapter 1 Lesk, Introduction to Bioinformatics](https://reader035.vdocument.in/reader035/viewer/2022062321/56649e685503460f94b64afe/html5/thumbnails/26.jpg)
Sequence similarity is key concept
Similar sequences are a hint for common ancestry and possibly similar function
![Page 27: Michael Schroeder BioTechnological Center TU Dresden Biotec Introduction based on Chapter 1 Lesk, Introduction to Bioinformatics](https://reader035.vdocument.in/reader035/viewer/2022062321/56649e685503460f94b64afe/html5/thumbnails/27.jpg)
Sequence similarity is a hint for evolutionary relationship
![Page 28: Michael Schroeder BioTechnological Center TU Dresden Biotec Introduction based on Chapter 1 Lesk, Introduction to Bioinformatics](https://reader035.vdocument.in/reader035/viewer/2022062321/56649e685503460f94b64afe/html5/thumbnails/28.jpg)
How similar are sequences?
>sp|P00674|RNP_HORSE Ribonuclease pancreatic (EC 3.1.27.5) (RNase 1) (RNase A) - Equus caballus (Horse).
KESPAMKFERQHMDSGSTSSSNPTYCNQMMKRRNMTQGWCKPVNTFVHEPLADVQAICLQKNITCKNGQSNCYQSSSSMHITDCRLTSGSKYPNCAYQTSQKERHIIVACEGNPYVPVHFDASVEVST
>sp|P00673|RNP_BALAC Ribonuclease pancreatic (EC 3.1.27.5) (RNase 1) (RNase A) - Balaenoptera acutorostrata (Minke whale) (Lesser rorqual).
RESPAMKFQRQHMDSGNSPGNNPNYCNQMMMRRKMTQGRCKPVNTFVHESLEDVKAVCSQKNVLCKNGRTNCYESNSTMHITDCRQTGSSKYPNCAYKTSQKEKHIIVACEGNPYVPVHFDNSV
>sp|P00686|RNP_MACRU Ribonuclease pancreatic (EC 3.1.27.5) (RNase 1) (RNase A) - Macropus rufus (Red kangaroo) (Megaleia rufa).
ETPAEKFQRQHMDTEHSTASSSNYCNLMMKARDMTSGRCKPLNTFIHEPKSVVDAVCHQENVTCKNGRTNCYKSNSRLSITNCRQTGASKYPNCQYETSNLNKQIIVACEGQYVPVHFDAYV
![Page 29: Michael Schroeder BioTechnological Center TU Dresden Biotec Introduction based on Chapter 1 Lesk, Introduction to Bioinformatics](https://reader035.vdocument.in/reader035/viewer/2022062321/56649e685503460f94b64afe/html5/thumbnails/29.jpg)
Multiple Alignment with ClustalW (www.ebi.ac.uk/clustalw)
CLUSTAL W (1.82) multiple sequence alignment

sp|P00674|RNP_HORSE  KESPAMKFERQHMDSGSTSSSNPTYCNQMMKRRNMTQGWCKPVNTFVHEPLADVQAICLQ 60
sp|P00673|RNP_BALAC  RESPAMKFQRQHMDSGNSPGNNPNYCNQMMMRRKMTQGRCKPVNTFVHESLEDVKAVCSQ 60
sp|P00686|RNP_MACRU  -ETPAEKFQRQHMDTEHSTASSSNYCNLMMKARDMTSGRCKPLNTFIHEPKSVVDAVCHQ 59
                      *:** **:*****:  :......*** **  *.**.* ***:***:**.   *.*:* *

sp|P00674|RNP_HORSE  KNITCKNGQSNCYQSSSSMHITDCRLTSGSKYPNCAYQTSQKERHIIVACEGNPYVPVHF 120
sp|P00673|RNP_BALAC  KNVLCKNGRTNCYESNSTMHITDCRQTGSSKYPNCAYKTSQKEKHIIVACEGNPYVPVHF 120
sp|P00686|RNP_MACRU  ENVTCKNGRTNCYKSNSRLSITNCRQTGASKYPNCQYETSNLNKQIIVACEG-QYVPVHF 118
                     :*: ****::***:*.* : **:** *..****** *:**: :::*******  ******

sp|P00674|RNP_HORSE  DASVEVST 128
sp|P00673|RNP_BALAC  DNSV---- 124
sp|P00686|RNP_MACRU  DAYV---- 122
                     *  *
![Page 30: Michael Schroeder BioTechnological Center TU Dresden Biotec Introduction based on Chapter 1 Lesk, Introduction to Bioinformatics](https://reader035.vdocument.in/reader035/viewer/2022062321/56649e685503460f94b64afe/html5/thumbnails/30.jpg)
Example: Number of Aligned Residues
Horse and Minke whale: 95
Minke whale and Red kangaroo: 82
Horse and Red kangaroo: 75
Conclusion: Horse and whale share the most identical residues
Horse and whale are placental, kangaroo is marsupial
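Pairwise counts like these can be obtained by comparing the rows of a multiple alignment column by column. The sketch below is only an illustration: the function name `identical_positions` is made up, and the rows are just the first ten alignment columns of the three ribonuclease sequences, so the numbers it prints are for that excerpt, not the full-length counts quoted above.

```python
from itertools import combinations


def identical_positions(row_a: str, row_b: str) -> int:
    """Count alignment columns in which both rows carry the same residue (gaps never count)."""
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have the same length")
    return sum(1 for x, y in zip(row_a, row_b) if x == y and x != "-")


# First ten columns of the alignment (excerpt only).
rows = {
    "horse": "KESPAMKFER",
    "whale": "RESPAMKFQR",
    "kangaroo": "-ETPAEKFQR",
}

for a, b in combinations(rows, 2):
    print(a, b, identical_positions(rows[a], rows[b]))
# horse whale 8
# horse kangaroo 6
# whale kangaroo 7
```

Even on this short excerpt the ordering is the same as for the full sequences: horse and whale agree most often, horse and kangaroo least.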
![Page 31: Michael Schroeder BioTechnological Center TU Dresden Biotec Introduction based on Chapter 1 Lesk, Introduction to Bioinformatics](https://reader035.vdocument.in/reader035/viewer/2022062321/56649e685503460f94b64afe/html5/thumbnails/31.jpg)
Example: Elephant and Mammoth
Mitochondrial cytochrome b from Siberian woolly mammoth (Mammuthus primigenius) preserved in arctic permafrost
African elephant (Loxodonta africana)
Indian elephant (Elephas maximus)
![Page 32: Michael Schroeder BioTechnological Center TU Dresden Biotec Introduction based on Chapter 1 Lesk, Introduction to Bioinformatics](https://reader035.vdocument.in/reader035/viewer/2022062321/56649e685503460f94b64afe/html5/thumbnails/32.jpg)
African elephant: sp|P24958|CYB_LOXAF
Mammoth: sp|P92658|CYB_MAMPR
Indian elephant: sp|O47885|CYB_ELEMA

CYB_LOXAF  MTHIRKSHPLLKIINKSFIDLPTPSNISTWWNFGSLLGACLITQILTGLFLAMHYTPDTM 60
CYB_MAMPR  MTHIRKSHPLLKILNKSFIDLPTPSNISTWWNFGSLLGACLITQILTGLFLAMHYTPDTM 60
CYB_ELEMA  MTHTRKFHPLFKIINKSFIDLPTPSNISTWWNFGSLLGACLITQILTGLFLAMHYTPDTM 60
           *** ** ***:**:**********************************************

CYB_LOXAF  TAFSSMSHICRDVNYGWIIRQLHSNGASIFFLCLYTHIGRNIYYGSYLYSETWNTGIMLL 120
CYB_MAMPR  TAFSSMSHICRDVNYGWIIRQLHSNGASIFFLCLYTHIGRNIYYGSYLYSETWNTGIMLL 120
CYB_ELEMA  TAFSSMSHICRDVNYGWIIRQLHSNGASIFFLCLYTHIGRNIYYGSYLYSETWNTGIMLL 120
           ************************************************************

CYB_LOXAF  LITMATAFMGYVLPWGQMSFWGATVITNLFSAIPYIGTNLVEWIWGGFSVDKATLNRFFA 180
CYB_MAMPR  LITMATAFMGYVLPWGQMSFWGATVITNLFSAIPYIGTDLVEWIWGGFSVDKATLNRFFA 180
CYB_ELEMA  LITMATAFMGYVLPWGQMSFWGATVITNLFSAIPYIGTNLVEWIWGGFSVDKATLNRFFA 180
           **************************************:*********************

CYB_LOXAF  LHFILPFTMIALAGVHLTFLHETGSNNPLGLTSDSDKIPFHPYYTIKDFLGLLILILLLL 240
CYB_MAMPR  LHFILPFTMIALAGVHLTFLHETGSNNPLGLTSDSDKIPFHPYYTIKDFLGLLILILFLL 240
CYB_ELEMA  FHFILPFTMVALAGVHLTFLHETGSNNPLGLTSDSDKIPFHPYYTIKDFLGLLILILLLL 240
           :********:***********************************************:**

CYB_LOXAF  LLALLSPDMLGDPDNYMPADPLNTPLHIKPEWYFLFAYAILRSVPNKLGGVLALLLSILI 300
CYB_MAMPR  LLALLSPDMLGDPDNYMPADPLNTPLHIKPEWYFLFAYAILRSVPNKLGGVLALLLSILI 300
CYB_ELEMA  LLALLSPDMLGDPDNYMPADPLNTPLHIKPEWYFLFAYAILRSVPNKLGGVLALFLSILI 300
           ******************************************************:*****

CYB_LOXAF  LGLMPLLHTSKHRSMMLRPLSQVLFWTLTMDLLTLTWIGSQPVEYPYIIIGQMASILYFS 360
CYB_MAMPR  LGIMPLLHTSKHRSMMLRPLSQVLFWTLATDLLMLTWIGSQPVEYPYIIIGQMASILYFS 360
CYB_ELEMA  LGLMPLLHTSKHRSMMLRPLSQVLFWTLTMDLLTLTWIGSQPVEHPYIIIGQMASILYFS 360
           **:*************************: *** **********:***************

CYB_LOXAF  IILAFLPIAGVIENYLIK 378
CYB_MAMPR  IILAFLPIAGMIENYLIK 378
CYB_ELEMA  IILAFLPIAGMIENYLIK 378
           **********:*******
![Page 33: Michael Schroeder BioTechnological Center TU Dresden Biotec Introduction based on Chapter 1 Lesk, Introduction to Bioinformatics](https://reader035.vdocument.in/reader035/viewer/2022062321/56649e685503460f94b64afe/html5/thumbnails/33.jpg)
Example: Elephant and Mammoth
Mammoth and African elephant have 10 mismatches, mammoth and Indian elephant 14.
Significant?
![Page 34: Michael Schroeder BioTechnological Center TU Dresden Biotec Introduction based on Chapter 1 Lesk, Introduction to Bioinformatics](https://reader035.vdocument.in/reader035/viewer/2022062321/56649e685503460f94b64afe/html5/thumbnails/34.jpg)
Similarity and Homology
Important difference:
Similarity is the measurement of resemblance of sequences
Homology: common ancestor
Similarity is gradual, homology is either true or false
Similarity = now, homology = past events
Homology is only very rarely directly observed (e.g. lab population, clinical study of viral infection)
Homology is inferred from sequence similarity
![Page 35: Michael Schroeder BioTechnological Center TU Dresden Biotec Introduction based on Chapter 1 Lesk, Introduction to Bioinformatics](https://reader035.vdocument.in/reader035/viewer/2022062321/56649e685503460f94b64afe/html5/thumbnails/35.jpg)
Homology = derived from common ancestor
Characteristics derived from a common ancestor are called homologous
E.g. eagle’s wing and human’s arm
Other apparently similar characteristics may have arisen independently by convergent evolution
E.g. eagle’s wing and bee’s wing. The most recent common ancestor of eagles and bees did not have wings
Homologous characters may diverge functionally
E.g. bones in the human middle ear and jaws of primitive fish
![Page 36: Michael Schroeder BioTechnological Center TU Dresden Biotec Introduction based on Chapter 1 Lesk, Introduction to Bioinformatics](https://reader035.vdocument.in/reader035/viewer/2022062321/56649e685503460f94b64afe/html5/thumbnails/36.jpg)
Example: Homology/Similarity
The assertion that the cytochrome b sequences are homologues means that there is a common ancestor
BUT:
1. Maybe cytochrome b functionally requires so many conserved residues and will hence occur in many species (in fact, this is not the case here)
2. Maybe cytochrome b has to function this way in elephant-like species, but in fact started out from different ancestors (i.e. convergent evolution)
3. Maybe mammoth and African elephant have fewer mismatches only because the Indian elephant’s DNA mutated faster
4. Maybe all of them acquired cytochrome b through a virus (horizontal gene transfer)
![Page 37: Michael Schroeder BioTechnological Center TU Dresden Biotec Introduction based on Chapter 1 Lesk, Introduction to Bioinformatics](https://reader035.vdocument.in/reader035/viewer/2022062321/56649e685503460f94b64afe/html5/thumbnails/37.jpg)
Similarity vs. Homology
Any sequences can be similar
Sequences are homologous if they evolved from a common ancestor
Homologous sequences:
Orthologs: similar biological function
Paralogs: different biological function (after gene duplication), e.g. lysozyme and α-lactalbumin, a mammalian regulatory protein
Assumption: Similarity is an indicator for homology
Note: the altered function of the expressed protein will determine whether the organism survives to reproduce, and hence passes on the altered gene
![Page 38: Michael Schroeder BioTechnological Center TU Dresden Biotec Introduction based on Chapter 1 Lesk, Introduction to Bioinformatics](https://reader035.vdocument.in/reader035/viewer/2022062321/56649e685503460f94b64afe/html5/thumbnails/38.jpg)
Sequence similarity is key concept
How similar are two sequences?
How to align the sequences?
How to align multiple sequences?
How to find motifs?
![Page 39: Michael Schroeder BioTechnological Center TU Dresden Biotec Introduction based on Chapter 1 Lesk, Introduction to Bioinformatics](https://reader035.vdocument.in/reader035/viewer/2022062321/56649e685503460f94b64afe/html5/thumbnails/39.jpg)
Sequence alignment
Global match: align all of one with all of the other sequence (mismatches, insertions, deletions)
And.--so,.from.hour.to.hour.we.ripe.and.ripe
||||    ||||||||||||||||||||||||   ||||||
And.then,.from.hour.to.hour.we.rot-.and.rot-
Local match: find region in one sequence that matches the other (mismatches, insertions, deletions; ends can be ignored)
  My.care.is.loss.of.care,.by.old.care.done,
    |||||||||    |||||||||||||   |||||| ||
Your.care.is.gain.of.care,.by.new.care.won
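A global match can be computed by dynamic programming. The following is a minimal sketch in the style of the Needleman-Wunsch algorithm; the function name `global_align` and the scoring values (match +1, mismatch -1, gap -1) are assumptions chosen for illustration, and real tools use substitution matrices and more refined gap penalties.

```python
def global_align(a: str, b: str, match: int = 1, mismatch: int = -1, gap: int = -1):
    """Return (score, aligned_a, aligned_b) for an optimal global alignment of a and b."""
    n, m = len(a), len(b)
    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + s,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,    # gap in b
                score[i][j - 1] + gap,    # gap in a
            )
    # Trace back from the bottom-right cell to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            s = match if a[i - 1] == b[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + s:
                out_a.append(a[i - 1])
                out_b.append(b[j - 1])
                i, j = i - 1, j - 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))


print(global_align("ACGT", "AGT"))  # (2, 'ACGT', 'A-GT')
```

A local match uses the same table with one change: cell values are not allowed to drop below zero and the traceback starts at the highest-scoring cell, so poorly matching ends are ignored.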
![Page 40: Michael Schroeder BioTechnological Center TU Dresden Biotec Introduction based on Chapter 1 Lesk, Introduction to Bioinformatics](https://reader035.vdocument.in/reader035/viewer/2022062321/56649e685503460f94b64afe/html5/thumbnails/40.jpg)
Sequence alignment
Motif search: find matches of short sequence in long sequence
Option: perfect, 1 mismatch, mismatches+gaps+insertions+deletions
        match
         ||||
for the watch to babble and to talk is most tolerable
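The "perfect" and "1 mismatch" options can be sketched as a sliding-window comparison. The function name `find_motif` and its parameter `max_mismatches` are assumptions for this illustration; the sketch does not handle gaps, insertions or deletions, which need an alignment algorithm.

```python
def find_motif(text: str, motif: str, max_mismatches: int = 0):
    """Return start positions where motif matches text with at most max_mismatches substitutions."""
    hits = []
    for start in range(len(text) - len(motif) + 1):
        window = text[start:start + len(motif)]
        # Count positions where the window and the motif differ (Hamming distance).
        mismatches = sum(1 for x, y in zip(window, motif) if x != y)
        if mismatches <= max_mismatches:
            hits.append(start)
    return hits


line = "for the watch to babble and to talk is most tolerable"
print(find_motif(line, "match"))                    # []
print(find_motif(line, "match", max_mismatches=1))  # [8]
```

Position 8 is the start of "watch", which differs from "match" only in its first letter.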
![Page 41: Michael Schroeder BioTechnological Center TU Dresden Biotec Introduction based on Chapter 1 Lesk, Introduction to Bioinformatics](https://reader035.vdocument.in/reader035/viewer/2022062321/56649e685503460f94b64afe/html5/thumbnails/41.jpg)
Sequence alignment
Multiple sequence alignment
No.sooner.---met.--------.but.they.look’d
No.sooner.look’d.--------.but.they.lo-v’d
No.sooner.lo-v’d.--------.but.they.sigh’d
No.sooner.sigh’d.--------.but.they.--asked.one.another.the.reason
No.sooner.knew.the.reason.but.they.-------------sought.the.remedy
No.sooner. .but.they.
![Page 42: Michael Schroeder BioTechnological Center TU Dresden Biotec Introduction based on Chapter 1 Lesk, Introduction to Bioinformatics](https://reader035.vdocument.in/reader035/viewer/2022062321/56649e685503460f94b64afe/html5/thumbnails/42.jpg)
Quick check
By now you should
Know the main data sources (sequence and structure)
Know the role that bioinformatics plays
Understand the difference between homology and similarity
Understand what sequence comparison and alignment are