Algorithms in Computational Biology
Fall 2005/6
Lecturer: Benny Chor (benny AT cs.tau.ac.il)
Lectures: Thursdays 18:00-21:00, Schreiber 007
Course Information
Requirements & Grades:
• 20-25% homework, in five to six assignments containing both “dry” and “wet” problems. Submission is due two weeks from posting. Homework submission is obligatory. You are strongly encouraged to solve the assignments independently (or at least give it a serious try).
• 75-80% exam. You must score above 55 on the exam for the homework grade to count.
Bibliography
Biological Sequence Analysis, R. Durbin et al., Cambridge University Press, 1998.
Introduction to Computational Molecular Biology, J. Setubal and J. Meidanis, PWS Publishing Company, 1997.
Algorithms on Strings, Trees, and Sequences: Computer Science and Computational Biology, D. Gusfield, Cambridge University Press, 1997.
Post-genome Informatics, M. Kanehisa, Oxford University Press, 2000.
More refs on course page.
Course Prerequisites
Computer Science and Probability Background:
• Computational Models
• Algorithms (“efficiency of computation”)
• Probability (any course)
Some Biology Background:
• Formally: None, to allow CS students to take this course.
• Recommended: Some molecular biology course, and/or a serious desire to complement your knowledge in Biology by reading the appropriate material.
Studying the algorithms in this course while acquiring enough biology background is far more rewarding than ignoring the biological context.
In the beginning God created the heaven and the earth.
And the earth was without form, and void; and darkness was upon the face of the deep.
And the Spirit of God moved upon the face of the waters.
And God said, Let there be light: and there was light.
(Genesis 1:1-3)
Let Us Start from the Beginning
• 13.7 billion years ago: the big bang.
• 5 billion years ago: our sun.
• 4.5 billion years ago: planet earth.
• 3.8 billion years ago: early forms of life.
• 3.8 bya – present: evolution.
Akilia island, Greenland
And if you prefer the scientific version
Evolution
A process where small random changes accumulate within species over time, till new ones form.
The Tree of Life: A classical, basic science problem, since Darwin’s 1859 “Origin of Species”.
Phylogeny Reconstruction
Goal: Given a set of species, reconstruct the tree which best explains their evolutionary history.
Phylogenetic Trees Based on What?
1. Morphology
2. Single genes
3. Whole genomes
What is Computational Biology?
Computational biology is the application of computational tools and techniques to molecular biology (primarily). It enables new ways of study in life sciences, allowing analytic and predictive methodologies that support and enhance laboratory work. It is a multidisciplinary area of study that combines Biology, Computer Science, and Statistics.
Computational biology is also called Bioinformatics, although many practitioners define Bioinformatics somewhat more narrowly, restricting it to the application of specialized software for deducing meaningful biological information.
Accepted Model of Species Evolution
Initial period: primordial soup, where “you are what you eat” (recombination events, horizontal transfers).
Later period: formation of distinct taxa. Speciation events induce a tree-like evolution.
Species Evolution (2)
The affinities of all the beings of the same class have sometimes been represented by a great tree... Charles Darwin, 1859
Why Bio-informatics?
An explosive growth in the amount of biological information necessitates the use of computers for cataloging, retrieving and analyzing mega-data (> 3 billion bps, > 30,000 genes).
• The human genome project.
• Improved technologies, e.g. automated sequencing.
• GenBank is now approximately doubling every year !!!
New Biotechnologies & Data
• Micro arrays - gene expression.
• 2D gels – protein expression.
• Multi-level maps - genetic, physical: sequence, annotation.
• Networks of protein-protein interactions.
• Cross-species relationships:
  • Homologous genes.
  • Chromosome organization.
http://www.the-scientist.com/yr2002/apr/research020415.html
BioInformatics Tools are Crucial !
• New biotechnology tools generate explosive growth in the amount of biological data.
• Impossible to analyze the data manually.
• Novel mathematical, statistical, algorithmic and computational tools are necessary!
Areas of Interest (very partial list)
• Building evolutionary trees from molecular (and other) data
• Efficiently reconstructing the genome sequence from sub-parts (mapping, assembly, etc.)
• Understanding the structure of genomes (Genes, SNP, SSR)
• Understanding function of genes in the cell cycle and disease
• Deciphering structure and function of proteins
• Diagnosing cancer based on DNA microarrays (“chips”)
SNP: Single Nucleotide Polymorphism
SSR: Simple Sequence Repeat
Much of this class has been edited from Nir Friedman’s lecture, which is available at www.cs.huji.ac.il/~nir. Changes made by Dan Geiger, then Shlomo Moran, and finally Benny Chor. Additional slides from Zohar Yakhini and Metsada Pasmanik.
Growth of DNA Sequence Data: GenBank
42 million sequences (“words”)
45 billion base pairs (“letters”)
The Protein Data Bank
PDB Content Growth (experimentally determined)
http://www.rcsb.org/pdb/
Four Aspects
Biological: What is the task?
Algorithmic: How to perform the task at hand efficiently?
Learning: How to adapt/estimate/learn parameters and models describing the task from examples?
Statistics: How to differentiate true phenomena from artifacts?
Example: Sequence Comparison
Biological: Evolution preserves sequences, thus similar genes might have similar function.
Algorithmic: Consider all ways to “align” one sequence against another.
Learning: How do we define “similar” sequences? Use examples to define similarity.
Statistical: When we compare to ~10^6 sequences, what is a random match and what is a true one?
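The statistical question can be made concrete with a back-of-the-envelope sketch. It assumes, purely for illustration, that each of the database comparisons independently produces a chance match with probability p. The function names and the values of p are hypothetical and not taken from the course.

```python
def expected_chance_hits(num_sequences, p_chance):
    """Expected number of purely random matches when one query is
    compared against num_sequences database entries (assumption:
    every comparison has the same chance-match probability)."""
    return num_sequences * p_chance


def prob_at_least_one_chance_hit(num_sequences, p_chance):
    """Probability of seeing one or more random matches, assuming
    the comparisons are independent."""
    return 1.0 - (1.0 - p_chance) ** num_sequences


if __name__ == "__main__":
    n = 10**6  # database size quoted on the slide
    for p in (1e-3, 1e-6, 1e-9):  # illustrative per-comparison probabilities
        print(p, expected_chance_hits(n, p), prob_at_least_one_chance_hit(n, p))
```

Under these assumptions, a match that looks like a one-in-a-million event is expected to occur about once by chance in a database of a million sequences, so it is weak evidence on its own.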
Topics I
Dealing with DNA/Protein sequences:
• Finding similar sequences
• Models of sequences: Hidden Markov Models
• Genome projects and how sequences are found
• Transcription regulation
• Protein Families
• Gene finding
Topics II
High throughput biotechnologies – potentials and computational challenges:
• DNA microarrays
• Applications to diagnostics
• Applications to understanding gene networks
Topics III (Structural BioInfo Course)
Protein World:
• How proteins fold - secondary & tertiary structure
• How to predict protein folds from sequence data
• How to predict protein function from its structure
• How to analyze protein changes from raw experimental measurements (MassSpec)
Algorithmics
Will introduce algorithmic techniques that are useful in computational genomics (and elsewhere):
• Dynamic programming, dynamic programming, dynamic...
• Suffix trees and arrays
• Probabilistic models: PSSM (Position Specific Scoring Matrices), HMM (Hidden Markov Models)
• Learning and classification, SVM (Support Vector Machines)
• Heuristics for solving hard optimization problems (many problems in comp. genomics are NP-hard)
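As a first taste of dynamic programming, here is a minimal sketch of the classic edit-distance recurrence between two strings. It is an illustration chosen for this summary, not necessarily the formulation used later in the course; the unit costs for substitution, insertion and deletion are assumptions.

```python
def edit_distance(a, b):
    """Minimum number of single-character substitutions, insertions
    and deletions (each with assumed cost 1) that turn a into b."""
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # turning a[:i] into the empty string costs i deletions
        for j, cb in enumerate(b, start=1):
            substitution = prev[j - 1] + (0 if ca == cb else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    print(edit_distance("AAG", "AGA"))
```

The table is filled row by row, so the running time is proportional to len(a) * len(b), while only two rows are kept in memory.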
• Each human cell contains 23 pairs of chromosomes.
• Chromosomes can be distinguished by size and by unique banding patterns.
• Male – XY, Female – XX.
• This set is from a male.
DNA – The Basic Genetic Material
Chromosomes
SKY – differentially dye-staining chromosomes
Watson and Crick
… On Feb. 28, 1953, Francis Crick walked into the Eagle pub in Cambridge, England, and, as James Watson later recalled, announced that "we had found the secret of life."
"The structure was too pretty not to be true."
-- JAMES D. WATSON, "The Double Helix"
1920-1958
(1953)
DNA - the Code for Life
Died from ovarian cancer
http://www.nobel.se/medicine/laureates/1962/index.html
The Double Helix
Source: Alberts et al
The Central Dogma of Molecular Biology
[Figure: DNA (replication) → transcription → mRNA → translation → protein → phenotype]
Watson-Crick Complementarity
[Table: % of each base and purine/pyrimidine base ratios, by DNA source: human, sheep, turtle, sea urchin, wheat, E. coli]
Conclusion: DNA strands are complementary (1953).
Genome Sizes
E. coli (bacteria): 4.6 x 10^6 bases
Yeast (simple fungi): 15 x 10^6 bases
Smallest human chromosome: 50 x 10^6 bases
Entire human genome: 3 x 10^9 bases
Mendel and the Discovery of Genes
• Gregor Mendel, 1822-1884, studied inheritance patterns of common pea traits.
• He concluded that traits were passed through generations in basic units which were later called genes.
Genetic Information
Genome – the collection of genetic information.
Chromosomes – storage units of genes.
Gene – basic unit of genetic information. Genes determine the inherited characters.
What is a Gene ?
DNA contains various recognition sites:
• Promoter signals.
• Transcription start signals.
• Start codon.
• Exon, intron boundaries.
• Transcription termination signal.
[Figure: gene layout showing the promoter, the transcribed region with its start codon, exons, introns and terminal codon, and the un-coded regions on both sides]
Control of the Human -Globin Gene
Alternative Splicing
Genes: How Many?
The DNA strings include:
• Coding regions (“genes”)
  • E. coli has ~4,000 genes
  • Yeast has ~6,000 genes
  • C. elegans has ~13,000 genes
  • Humans have ~32,000 genes
• Control regions
  • These typically are adjacent to the genes
  • They determine when a gene should be “expressed”
• So-called “junk” DNA (unknown function - ~90% of the DNA in human chromosomes)
Gene Finding
• Only 4% of the human genome codes for functional genes.
• Genes are found along large non-coding DNA regions.
• Repeats, pseudo-genes, introns, and contamination of vectors are confusing.
Gene Finding
Existing programs for locating genes within genomic sequences utilize a number of statistical signals and employ statistical models such as hidden Markov models (HMMs).
The problem is not solved yet, esp. for the newly discovered “RNA genes”.
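To give a feel for the simplest kind of signal such programs can look at, the sketch below scans for open reading frames (an ATG start codon followed in-frame by a stop codon). This is a deliberately naive stand-in and not an HMM gene finder: it ignores introns, the reverse strand and all statistical modeling. The function name, the `min_codons` threshold and the toy sequence are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(dna, min_codons=2):
    """Return sorted (start, end) index pairs of forward-strand ORFs.

    An ORF here runs from the first ATG seen in a reading frame to the
    next in-frame stop codon (end is exclusive and includes the stop).
    min_codons is the minimum number of codons before the stop.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # open a candidate ORF
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None  # close it and keep scanning this frame
    return sorted(orfs)


if __name__ == "__main__":
    toy = "CCATGGCCAAGTAAGG"  # made-up sequence for illustration
    for s, e in find_orfs(toy):
        print(s, e, toy[s:e])
```

On real genomes such a scan produces many false positives and misses spliced genes, which is one reason statistical models are needed.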
Diversity of Tissues in Stomach
How is this variety encoded and expressed ?
Central Dogma
[Figure: Gene → (transcription) → mRNA → (translation) → Protein]
Cells express different subsets of the genes in different tissues and under different conditions.
Transcription
Coding sequences can be transcribed to RNA
RNA nucleotides:
• Similar to DNA, slightly different backbone
• Uracil (U) instead of Thymine (T)
Source: Mathews & van Holde
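At the level of strings, the T-to-U difference can be sketched as follows. The sketch assumes the input is the coding (sense) strand written 5' to 3', so that the RNA has the same sequence with U in place of T; the function name is hypothetical.

```python
def transcribe(coding_strand):
    """Return the RNA string for a DNA coding strand (assumption: the
    input is the sense strand, so only T -> U changes)."""
    dna = coding_strand.upper()
    unexpected = set(dna) - set("ACGT")
    if unexpected:
        raise ValueError("not a DNA sequence: %s" % sorted(unexpected))
    return dna.replace("T", "U")


if __name__ == "__main__":
    print(transcribe("ATGGCCAAGTAA"))
```

If the template (antisense) strand were given instead, one would first take its reverse complement and then apply the same replacement.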
Transcription: RNA Editing
Exons hold information; they are more stable during evolution.
This process takes place in the nucleus. The mRNA molecules then diffuse through the nuclear membrane into the cytoplasm.
1. Transcribe to RNA
2. Eliminate introns
3. Splice (connect) exons
* Alternative splicing exists
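Steps 2 and 3 can be sketched on strings, assuming the exon coordinates are already known (in practice, finding them is the hard part). The toy pre-mRNA, its exon boundaries and the function name are made up for illustration.

```python
def splice(pre_mrna, exons):
    """Join the exon segments of a pre-mRNA string.

    exons is a list of (start, end) index pairs, end exclusive;
    everything outside the listed exons is treated as intron.
    """
    return "".join(pre_mrna[start:end] for start, end in sorted(exons))


# Toy transcript: three exons separated by two 12-base introns.
EXON1, INTRON1 = "AUGGCC", "GUAAGUUUUCAG"
EXON2, INTRON2 = "GAUGAU", "GUAUGUCCCCAG"
EXON3 = "AAGUAA"
PRE_MRNA = EXON1 + INTRON1 + EXON2 + INTRON2 + EXON3

if __name__ == "__main__":
    # Constitutive splicing keeps all three exons.
    print(splice(PRE_MRNA, [(0, 6), (18, 24), (36, 42)]))
    # Alternative splicing: the middle exon is skipped.
    print(splice(PRE_MRNA, [(0, 6), (36, 42)]))
```

The second call illustrates how one gene can yield more than one mRNA, which is what the footnote on alternative splicing refers to.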
RNA roles
• Messenger RNA (mRNA): Encodes protein sequences. Every three nucleotides translate to one amino acid (the protein building block).
• Transfer RNA (tRNA): Decodes the mRNA molecules to amino-acids. It connects to the mRNA with one side and holds the appropriate amino acid on its other side.
• Ribosomal RNA (rRNA): Part of the ribosome, a machine for translating mRNA to proteins. It catalyzes (like enzymes) the reaction that attaches the hanging amino acid from the tRNA to the amino acid chain being created.
...
New Roles of RNA
Cellular Regulation
http://www.nature.com/nature/journal/v408/n6808/fig_tab/408037a0_F1.html
http://www.sciencemag.org/content/vol298/issue5602/cover.shtml
COVER: Researchers are discovering that small RNA molecules play a surprising variety of key roles in cells. They can inhibit translation of messenger RNA into protein, cause degradation of other messenger RNAs, and even initiate complete silencing of gene expression from the genome.
Translation in Eukaryotes
http://www1.imim.es/courses/Lisboa01/slide1.6_translation.html
Animation: http://cbms.st-and.ac.uk/academics/ryan/Teaching/medsci/Medsci6.htm
Translation
• Translation is mediated by the ribosome.
• The ribosome is a complex of protein & rRNA molecules.
• The ribosome attaches to the mRNA at a translation initiation site.
• The ribosome then moves along the mRNA sequence and in the process constructs a sequence of amino acids (polypeptide), which is released and folds into a protein.
Genetic Code
There are 20 amino acids from which proteins are built.
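A compact way to hold the standard genetic code in a program is sketched below: the 64 codons are enumerated in U, C, A, G order and mapped to one-letter amino acid symbols, with `*` marking stop codons. The variable and function names are hypothetical, and the sketch assumes the reading frame starts at the first base of the input.

```python
BASES = "UCAG"
# One-letter amino acid codes for the 64 codons in UCAG order
# (standard genetic code); '*' marks the three stop codons.
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(mrna):
    """Translate an mRNA string codon by codon until a stop codon
    (or the end of the string) is reached."""
    mrna = mrna.upper()
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


if __name__ == "__main__":
    print(translate("AUGGCCAAGUAA"))
```

Since 64 codons encode only 20 amino acids plus stop, most amino acids are specified by more than one codon.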
Protein Structure
Proteins are poly-peptides of 70-3000 amino-acids
A protein's structure is (mostly) determined by the sequence of amino-acids that make up the protein
The Central Paradigm of Bio-informatics
Genetic information → Molecular structure → Biochemical function → Symptoms
Similarity Search in Databanks
Find sequences similar to a working draft.
As databanks grow, homologies get harder to detect, and quality is reduced.
Alignment Tools: BLAST & FASTA (time-saving heuristic approximations).
>gb|BE588357.1|BE588357 194087 BARC 5BOV Bos taurus cDNA 5'.
Length = 369
Score = 272 bits (137), Expect = 4e-71
Identities = 258/297 (86%), Gaps = 1/297 (0%)
Strand = Plus / Plus
Query: 17 aggatccaacgtcgctccagctgctcttgacgactccacagataccccgaagccatggca 76
|||||||||||||||| | ||| | ||| || ||| | |||| ||||| |||||||||
Sbjct: 1 aggatccaacgtcgctgcggctacccttaaccact-cgcagaccccccgcagccatggcc 59
Query: 77 agcaagggcttgcaggacctgaagcaacaggtggaggggaccgcccaggaagccgtgtca 136
|||||||||||||||||||||||| | || ||||||||| | ||||||||||| ||| ||
Sbjct: 60 agcaagggcttgcaggacctgaagaagcaagtggagggggcggcccaggaagcggtgaca 119
Query: 137 gcggccggagcggcagctcagcaagtggtggaccaggccacagaggcggggcagaaagcc 196
|||||||| | || | ||||||||||||||| ||||||||||| || ||||||||||||
Sbjct: 120 tcggccggaacagcggttcagcaagtggtggatcaggccacagaagcagggcagaaagcc 179
Query: 197 atggaccagctggccaagaccacccaggaaaccatcgacaagactgctaaccaggcctct 256
||||||||| | |||||||| |||||||||||||||||| ||||||||||||||||||||
Sbjct: 180 atggaccaggttgccaagactacccaggaaaccatcgaccagactgctaaccaggcctct 239
Query: 257 gacaccttctctgggattgggaaaaaattcggcctcctgaaatgacagcagggagac 313
|| || ||||| || ||||||||||| | |||||||||||||||||| ||||||||
Sbjct: 240 gagactttctcgggttttgggaaaaaacttggcctcctgaaatgacagaagggagac 296
Pairwise alignment:
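The "Identities" and "Gaps" figures in a report like the one above are simple column counts over the aligned rows (here 258/297 ≈ 0.869, shown as 86%). A minimal sketch of that counting, assuming two equal-length aligned rows with "-" as the gap symbol; the function name and the toy rows are hypothetical.

```python
def alignment_stats(row_a, row_b, gap="-"):
    """Return (identities, gaps, columns) for two aligned rows.

    A column counts as an identity when both rows carry the same
    non-gap symbol, and as a gap when either row carries the gap symbol.
    """
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have equal length")
    identities = sum(1 for x, y in zip(row_a, row_b) if x == y and x != gap)
    gaps = sum(1 for x, y in zip(row_a, row_b) if x == gap or y == gap)
    return identities, gaps, len(row_a)


if __name__ == "__main__":
    ident, gaps, cols = alignment_stats("ACGT-A", "ACCTTA")
    print("Identities = %d/%d, Gaps = %d/%d" % (ident, cols, gaps, cols))
```

Counting is the easy part; the alignment itself (where to place the gaps) is what BLAST and FASTA approximate heuristically.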
Multiple Sequence Alignment
Multiple alignment:
• Basis for phylogenetic tree construction.
• Useful to find protein families and functional domains.
Evolution
Evolution - a process in which small changes occur within species over time.
These changes are mainly monitored today using molecular sequences (DNA/proteins).
The Tree of Life: A classical, basic science problem, since Darwin’s 1859 “Origin of Species”.
Evolution
Related organisms have similar DNA:
• Similarity in sequences of proteins
• Similarity in organization of genes along the chromosomes
Evolution plays a major role in biology:
• Many mechanisms are shared across a wide range of organisms
• During the course of evolution existing components are adapted for new functions
The Tree of Life
Source: Alberts et al
Phylogeny Reconstruction
Goal: Given a set of species, reconstruct the tree which best explains their evolutionary history.
Today most phylogenetic trees are based on molecular sequence data (DNA or proteins).
Darwin (Origin of Species, 1859) and his contemporaries based their work on morphological and physiological properties (e.g. cold/warm blood, existence of scales, number of teeth, existence of wings, etc.). Paleontological data is still in use when constructing trees for certain extinct species (e.g. dinosaurs, mammoths, moas, unicorns, etc.)
Trees are Based on What ?
www.tomchalk.com/evolution.gif
Evolution
![Page 65: Algorithms in Computational Biology Fall 2005/6 Lecturer: Benny Chor (benny AT cs.tau.ac.il) Lectures: Thursdays 18:00-21:00, Schreiber 007](https://reader035.vdocument.in/reader035/viewer/2022070407/56649e375503460f94b27a9b/html5/thumbnails/65.jpg)
73
Example for Phylogenetic Analysis
Input: four nucleotide sequences: AAG, AAA, GGA, AGA, taken from four species.
Question: Which evolutionary tree best explains these sequences?
[Figure: a tree with leaves AGA, AAA, GGA, AAG; the two internal nodes and the root are all labeled AAA; edge labels 2, 1, 1.]
Total #substitutions = 4
One Answer (the parsimony principle): Pick a tree that has a minimum total number of substitutions of symbols between species and their originator in the evolutionary tree (also called a phylogenetic tree).
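The substitution count above can be checked with a short Python sketch. It assumes one reading of the figure (leaves AGA and AAA under one internal node, GGA and AAG under the other, every internal node labeled AAA). The helper names (`hamming`, `labeled_tree_cost`, `fitch_score`) are illustrative, and Fitch's rule is used here as one standard way to find the minimum number of substitutions for a fixed tree shape; the slide itself does not name an algorithm.

```python
def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def labeled_tree_cost(edges):
    """Total substitutions over (parent, child) pairs of a fully labeled tree."""
    return sum(hamming(parent, child) for parent, child in edges)


def fitch_site(node, i):
    """Fitch's rule at site i: return (candidate states, substitutions so far)."""
    if isinstance(node, str):  # leaf: the observed nucleotide
        return {node[i]}, 0
    left_states, left_cost = fitch_site(node[0], i)
    right_states, right_cost = fitch_site(node[1], i)
    common = left_states & right_states
    if common:  # the two children can agree: no new substitution
        return common, left_cost + right_cost
    return left_states | right_states, left_cost + right_cost + 1


def first_leaf(node):
    """Leftmost leaf of a tree written as nested pairs of strings."""
    while not isinstance(node, str):
        node = node[0]
    return node


def fitch_score(tree):
    """Minimum substitutions for a rooted binary tree of nested pairs."""
    n_sites = len(first_leaf(tree))
    return sum(fitch_site(tree, i)[1] for i in range(n_sites))


# Assumed reading of the slide's tree: every internal node is labeled AAA.
slide_edges = [
    ("AAA", "AAA"), ("AAA", "AAA"),  # root to the two internal nodes
    ("AAA", "AGA"), ("AAA", "AAA"),  # first internal node to its leaves
    ("AAA", "GGA"), ("AAA", "AAG"),  # second internal node to its leaves
]
print(labeled_tree_cost(slide_edges))  # 4

# The three ways to pair up four leaves.
topologies = [
    (("AGA", "AAA"), ("GGA", "AAG")),
    (("AAG", "AAA"), ("GGA", "AGA")),
    (("AAG", "GGA"), ("AAA", "AGA")),
]
for tree in topologies:
    print(tree, fitch_score(tree))  # scores: 4, 3, 4
```

Under these assumptions the labeled tree costs 4 substitutions, which is also the minimum for that pairing of leaves, while the pairing (AAG, AAA) with (GGA, AGA) needs only 3.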
![Page 66: Algorithms in Computational Biology Fall 2005/6 Lecturer: Benny Chor (benny AT cs.tau.ac.il) Lectures: Thursdays 18:00-21:00, Schreiber 007](https://reader035.vdocument.in/reader035/viewer/2022070407/56649e375503460f94b27a9b/html5/thumbnails/66.jpg)
74
DNA Microarrays (Chips)
![Page 67: Algorithms in Computational Biology Fall 2005/6 Lecturer: Benny Chor (benny AT cs.tau.ac.il) Lectures: Thursdays 18:00-21:00, Schreiber 007](https://reader035.vdocument.in/reader035/viewer/2022070407/56649e375503460f94b27a9b/html5/thumbnails/67.jpg)
75
A Modern Use of WC Complementarity
A binds to T
C binds to G
AATGCTTAGTC
TTACGAATCAG
Perfect match
AATGCGTAGTC
TTACGAATCAG
One base mismatch
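A minimal Python sketch of the pairing check follows. It assumes the two strands are written aligned position by position, as on the slide; the names `WC_PAIR` and `mismatches` are illustrative.

```python
# Watson-Crick pairing rules: A binds to T, C binds to G.
WC_PAIR = {"A": "T", "T": "A", "C": "G", "G": "C"}


def mismatches(target, probe):
    """Return the 0-based positions where target and probe fail to pair."""
    if len(target) != len(probe):
        raise ValueError("strands must have the same length")
    return [i for i, (t, p) in enumerate(zip(target, probe)) if WC_PAIR[t] != p]


probe = "TTACGAATCAG"
print(mismatches("AATGCTTAGTC", probe))  # [] : perfect match
print(mismatches("AATGCGTAGTC", probe))  # [5] : one base mismatch (G opposite A)
```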
![Page 68: Algorithms in Computational Biology Fall 2005/6 Lecturer: Benny Chor (benny AT cs.tau.ac.il) Lectures: Thursdays 18:00-21:00, Schreiber 007](https://reader035.vdocument.in/reader035/viewer/2022070407/56649e375503460f94b27a9b/html5/thumbnails/68.jpg)
76
Array Based Hybridization Assays (DNA Chips)
Unknown sequence (target)
Many copies.
Array of probes
![Page 69: Algorithms in Computational Biology Fall 2005/6 Lecturer: Benny Chor (benny AT cs.tau.ac.il) Lectures: Thursdays 18:00-21:00, Schreiber 007](https://reader035.vdocument.in/reader035/viewer/2022070407/56649e375503460f94b27a9b/html5/thumbnails/69.jpg)
77
Array Based Hyb Assays
• Target hybridizes to WC complementary probes only.
• Therefore, the fluorescence pattern is indicative of the target sequence.
![Page 70: Algorithms in Computational Biology Fall 2005/6 Lecturer: Benny Chor (benny AT cs.tau.ac.il) Lectures: Thursdays 18:00-21:00, Schreiber 007](https://reader035.vdocument.in/reader035/viewer/2022070407/56649e375503460f94b27a9b/html5/thumbnails/70.jpg)
78
Microarrays (“DNA Chips”)
Leading edge, future technologies (since 1988):
In a single experiment, measure the expression levels of thousands of genes.
• Find informative genes that may have predictive power for medical diagnosis.
• Potential for personalized medicine, e.g. kits for identifying cancer types and prescribing “personal” treatment.
![Page 71: Algorithms in Computational Biology Fall 2005/6 Lecturer: Benny Chor (benny AT cs.tau.ac.il) Lectures: Thursdays 18:00-21:00, Schreiber 007](https://reader035.vdocument.in/reader035/viewer/2022070407/56649e375503460f94b27a9b/html5/thumbnails/71.jpg)
79
DNA Chips - Structure
• Each chip has n “pixels” on it. Every pixel contains copies of a probe from a single gene.
• Do m experiments: cells in each experiment are taken from different conditions (different phase of the cell cycle, different patient, different type of tissue, etc.).
• Purpose: Measure mRNA expression levels (color coded) of all n genes in one experiment.
![Page 72: Algorithms in Computational Biology Fall 2005/6 Lecturer: Benny Chor (benny AT cs.tau.ac.il) Lectures: Thursdays 18:00-21:00, Schreiber 007](https://reader035.vdocument.in/reader035/viewer/2022070407/56649e375503460f94b27a9b/html5/thumbnails/72.jpg)
80
Gene Expression Matrix
• Rows correspond to genes. (Typically n between 500 and 15,000).
• Columns correspond to experiments. (Typically m between 10 and 200).
• Entry (i, j) = expression level of gene i in experiment j.
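As a rough illustration, the matrix can be held as a list of rows in Python. The sizes and values below are synthetic toy data, not measurements, and ranking genes by variance across experiments is only a hypothetical stand-in for "informative"; the slides do not prescribe a criterion.

```python
import random
from statistics import pvariance

random.seed(0)
n_genes, m_experiments = 6, 4  # toy sizes; real matrices are far larger

# expression[i][j] = expression level of gene i in experiment j (synthetic values)
expression = [
    [random.uniform(0.0, 10.0) for _ in range(m_experiments)]
    for _ in range(n_genes)
]


def gene_profile(matrix, i):
    """Row i: the levels of gene i across all experiments."""
    return matrix[i]


def experiment_profile(matrix, j):
    """Column j: the levels of all genes in experiment j."""
    return [row[j] for row in matrix]


def rank_genes_by_variance(matrix):
    """Gene indices sorted from most to least variable across experiments."""
    return sorted(range(len(matrix)), key=lambda i: pvariance(matrix[i]), reverse=True)


print(gene_profile(expression, 0))
print(experiment_profile(expression, 1))
print(rank_genes_by_variance(expression))
```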
![Page 73: Algorithms in Computational Biology Fall 2005/6 Lecturer: Benny Chor (benny AT cs.tau.ac.il) Lectures: Thursdays 18:00-21:00, Schreiber 007](https://reader035.vdocument.in/reader035/viewer/2022070407/56649e375503460f94b27a9b/html5/thumbnails/73.jpg)
81
Algorithmic Challenge
Analyse the vast amount of data in gene expression matrices.
Discover meaningful biological structures and functions.
And now, time for a break
![Page 74: Algorithms in Computational Biology Fall 2005/6 Lecturer: Benny Chor (benny AT cs.tau.ac.il) Lectures: Thursdays 18:00-21:00, Schreiber 007](https://reader035.vdocument.in/reader035/viewer/2022070407/56649e375503460f94b27a9b/html5/thumbnails/74.jpg)
82
Are We Close to Being Done?
“Now this is not the end.
It is not even the beginning of the end.
But it is, perhaps, the end of the beginning.”
Winston Churchill, 1942 (3 years into WW2)
http://www.globecartoon.com/neweconomy/10.html