introduction to bioinformatics - university of helsinki · 20 a good biology course for computer...
TRANSCRIPT
![Page 1: Introduction to Bioinformatics - University of Helsinki · 20 A good biology course for computer scientists and mathematicians? p Biology for methodological scientists (8 credits,](https://reader030.vdocument.in/reader030/viewer/2022041000/5ea0a1a20489630aba0239ea/html5/thumbnails/1.jpg)
Introduction toBioinformatics
Esa Pitkä[email protected]
Autumn 2008, I periodwww.cs.helsinki.fi/mbi/courses/08-09/itb
582606 Introduction to Bioinformatics, Autumn 2008
![Page 2: Introduction to Bioinformatics - University of Helsinki · 20 A good biology course for computer scientists and mathematicians? p Biology for methodological scientists (8 credits,](https://reader030.vdocument.in/reader030/viewer/2022041000/5ea0a1a20489630aba0239ea/html5/thumbnails/2.jpg)
Introduction toBioinformatics
Lecture 1:Administrative issues
MBI Programme, Bioinformatics coursesWhat is bioinformatics?Molecular biology primer
![Page 3: Introduction to Bioinformatics - University of Helsinki · 20 A good biology course for computer scientists and mathematicians? p Biology for methodological scientists (8 credits,](https://reader030.vdocument.in/reader030/viewer/2022041000/5ea0a1a20489630aba0239ea/html5/thumbnails/3.jpg)
3
How to enrol for the course?p Use the registration system of the Computer
Science department: https://ilmo.cs.helsinki.fin You need your user account at the IT department (“cc
account”)
p If you cannot register yet, don’t worry: attend thelectures and exercises; just register when you areable to do so
![Page 4: Introduction to Bioinformatics - University of Helsinki · 20 A good biology course for computer scientists and mathematicians? p Biology for methodological scientists (8 credits,](https://reader030.vdocument.in/reader030/viewer/2022041000/5ea0a1a20489630aba0239ea/html5/thumbnails/4.jpg)
4
Teachersp Esa Pitkänen, Department of Computer Science,
University of Helsinkip Elja Arjas, Department of Mathematics and
Statistics, University of Helsinkip Sami Kaski, Department of Information and
Computer Science, Helsinki University ofTechnology
p Lauri Eronen, Department of Computer Science,University of Helsinki (exercises)
![Page 5: Introduction to Bioinformatics - University of Helsinki · 20 A good biology course for computer scientists and mathematicians? p Biology for methodological scientists (8 credits,](https://reader030.vdocument.in/reader030/viewer/2022041000/5ea0a1a20489630aba0239ea/html5/thumbnails/5.jpg)
5
Lectures and exercisesp Lectures: Tuesday and Friday 14.15-16.00
Exactum C221
p Exercises: Tuesday 16.15-18.00 ExactumC221n First exercise session on Tue 9 September
![Page 6: Introduction to Bioinformatics - University of Helsinki · 20 A good biology course for computer scientists and mathematicians? p Biology for methodological scientists (8 credits,](https://reader030.vdocument.in/reader030/viewer/2022041000/5ea0a1a20489630aba0239ea/html5/thumbnails/6.jpg)
6
Status & Prerequisitesp Advanced level course at the Department
of Computer Science, U. Helsinkip 4 creditsp Prerequisites:n Basic mathematics skills (probability calculus,
basic statistics)n Familiarity with computersn Basic programming skills recommendedn No biology background required
![Page 7: Introduction to Bioinformatics - University of Helsinki · 20 A good biology course for computer scientists and mathematicians? p Biology for methodological scientists (8 credits,](https://reader030.vdocument.in/reader030/viewer/2022041000/5ea0a1a20489630aba0239ea/html5/thumbnails/7.jpg)
7
Course contentsp What is bioinformatics?p Molecular biology primerp Biological wordsp Sequence assemblyp Sequence alignmentp Fast sequence alignment using FASTA and BLASTp Genome rearrangementsp Motif finding (tentative)p Phylogenetic treesp Gene expression analysis
![Page 8: Introduction to Bioinformatics - University of Helsinki · 20 A good biology course for computer scientists and mathematicians? p Biology for methodological scientists (8 credits,](https://reader030.vdocument.in/reader030/viewer/2022041000/5ea0a1a20489630aba0239ea/html5/thumbnails/8.jpg)
8
How to pass the course?p Recommended method:n Attend the lectures (not obligatory though)n Do the exercisesn Take the course exam
p Or:n Take a separate exam
![Page 9: Introduction to Bioinformatics - University of Helsinki · 20 A good biology course for computer scientists and mathematicians? p Biology for methodological scientists (8 credits,](https://reader030.vdocument.in/reader030/viewer/2022041000/5ea0a1a20489630aba0239ea/html5/thumbnails/9.jpg)
9
How to pass the course?p Exercises give you max. 12 points
n 0% completed assignments gives you 0 points,80% gives 12 points, the rest by linearinterpolation
n “A completed assignment” means thatp You are willing to present your solution in the
exercise session andp You return notes by e-mail to Lauri Eronen (see
course web page for contact info) describing the mainphases you took to solve the assignment
n Return notes at latest on Tuesdays 16.15
p Course exam gives you max. 48 points
![Page 10: Introduction to Bioinformatics - University of Helsinki · 20 A good biology course for computer scientists and mathematicians? p Biology for methodological scientists (8 credits,](https://reader030.vdocument.in/reader030/viewer/2022041000/5ea0a1a20489630aba0239ea/html5/thumbnails/10.jpg)
10
How to pass the course?p Grading: on the scale 0-5
n To get the lowest passing grade 1, you need to get atleast 30 points out of 60 maximum
p Course exam: Wed 15 October 16.00-19.00Exactum A111
p See course web page for separate examsp Note: if you take the first separate exam, the
best of the following options will be considered:n Exam gives you 48 points, exercises 12 pointsn Exam gives you 60 points
p In second and subsequent separate exams, onlythe 60 point option is in use
![Page 11: Introduction to Bioinformatics - University of Helsinki · 20 A good biology course for computer scientists and mathematicians? p Biology for methodological scientists (8 credits,](https://reader030.vdocument.in/reader030/viewer/2022041000/5ea0a1a20489630aba0239ea/html5/thumbnails/11.jpg)
11
Literaturep Deonier, Tavaré,
Waterman: ComputationalGenome Analysis, anIntroduction. Springer,2005
p Jones, Pevzner: AnIntroduction toBioinformatics Algorithms.MIT Press, 2004
p Slides for some lectureswill be available on thecourse web page
![Page 12: Introduction to Bioinformatics - University of Helsinki · 20 A good biology course for computer scientists and mathematicians? p Biology for methodological scientists (8 credits,](https://reader030.vdocument.in/reader030/viewer/2022041000/5ea0a1a20489630aba0239ea/html5/thumbnails/12.jpg)
12
Additional literaturep Gusfield: Algorithms on
strings, trees andsequences
p Griffiths et al: Introductionto genetic analysis
p Alberts et al.: Molecularbiology of the cell
p Lodish et al.: Molecular cellbiology
p Check the course web site
![Page 13: Introduction to Bioinformatics - University of Helsinki · 20 A good biology course for computer scientists and mathematicians? p Biology for methodological scientists (8 credits,](https://reader030.vdocument.in/reader030/viewer/2022041000/5ea0a1a20489630aba0239ea/html5/thumbnails/13.jpg)
13
Questions about administrative &practical stuff?
![Page 14: Introduction to Bioinformatics - University of Helsinki · 20 A good biology course for computer scientists and mathematicians? p Biology for methodological scientists (8 credits,](https://reader030.vdocument.in/reader030/viewer/2022041000/5ea0a1a20489630aba0239ea/html5/thumbnails/14.jpg)
14
Master's Degree Programme inBioinformatics (MBI)p Two-year MSc programmep Admission for 2009-2010 in January 2009
n You need to have your Bachelor’s degree ready byAugust 2009
www.cs.helsinki.fi/mbi
![Page 15: Introduction to Bioinformatics - University of Helsinki · 20 A good biology course for computer scientists and mathematicians? p Biology for methodological scientists (8 credits,](https://reader030.vdocument.in/reader030/viewer/2022041000/5ea0a1a20489630aba0239ea/html5/thumbnails/15.jpg)
15
MBI programme organizers
Department of Computer Science,Department of Mathematics and StatisticsFaculty of Science, Kumpula Campus, HY
Laboratory of Computer andInformation Science, Laboratory of
CS and Engineering,TKK
Faculty of Medicine, Meilahti Campus, HY
Faculty of BiosciencesFaculty of Agriculture and Forestry
Viikki Campus, HY
![Page 16: Introduction to Bioinformatics - University of Helsinki · 20 A good biology course for computer scientists and mathematicians? p Biology for methodological scientists (8 credits,](https://reader030.vdocument.in/reader030/viewer/2022041000/5ea0a1a20489630aba0239ea/html5/thumbnails/16.jpg)
16
TKK, Otaniemi
HY, MeilahtiHY, Kumpula
HY, Viikki
Four MBI campuses
![Page 17: Introduction to Bioinformatics - University of Helsinki · 20 A good biology course for computer scientists and mathematicians? p Biology for methodological scientists (8 credits,](https://reader030.vdocument.in/reader030/viewer/2022041000/5ea0a1a20489630aba0239ea/html5/thumbnails/17.jpg)
17
MBI highlightsp You can take courses from both HY and
TKKp Two biology courses tailored specifically
for MBIp Bioinformatics is a new exciting field, with
a high demand for experts in job market
p Go to www.cs.helsinki.fi/mbi/careers tofind out what a bioinformatician could dofor living
![Page 18: Introduction to Bioinformatics - University of Helsinki · 20 A good biology course for computer scientists and mathematicians? p Biology for methodological scientists (8 credits,](https://reader030.vdocument.in/reader030/viewer/2022041000/5ea0a1a20489630aba0239ea/html5/thumbnails/18.jpg)
18
Admissionp Admission requirements
n Bachelor’s degree in a suitable field (e.g., computerscience, mathematics, statistics, biology or medicine)
n At least 60 ECTS credits in total in computer science,mathematics and statistics
n Proficiency in English (standardized language test:TOEFL, IELTS)
p Admission period opens in late Autumn 2009 andcloses in 2 February 2009
p Details on admission will be posted inwww.cs.helsinki.fi/mbi during this autumn
![Page 19: Introduction to Bioinformatics - University of Helsinki · 20 A good biology course for computer scientists and mathematicians? p Biology for methodological scientists (8 credits,](https://reader030.vdocument.in/reader030/viewer/2022041000/5ea0a1a20489630aba0239ea/html5/thumbnails/19.jpg)
19
Bioinformatics courses in Helsinkiregion: 1st periodp Computational genomics (4-7 credits, TKK)p Seminar: Neuroinformatics (3 credits, Kumpula)p Seminar: Machine Learning in Bioinformatics (3
credits, Kumpula)p Signal processing in neuroinformatics (5 credits,
TKK)
![Page 20: Introduction to Bioinformatics - University of Helsinki · 20 A good biology course for computer scientists and mathematicians? p Biology for methodological scientists (8 credits,](https://reader030.vdocument.in/reader030/viewer/2022041000/5ea0a1a20489630aba0239ea/html5/thumbnails/20.jpg)
20
A good biology course for computerscientists and mathematicians?p Biology for methodological scientists (8 credits, Meilahti)
n Course organized by the Faculties of Bioscience and Medicinefor the MBI programme
n Introduction to basic concepts of microarrays, medical geneticsand developmental biology
n Study group + book exam in I period (2 cr)n Three lectured modules, 2 cr eachn Each module has an individual registration so you can
participate even if you missed the first modulen www.cs.helsinki.fi/mbi/courses/08-09/bfms/
![Page 21: Introduction to Bioinformatics - University of Helsinki · 20 A good biology course for computer scientists and mathematicians? p Biology for methodological scientists (8 credits,](https://reader030.vdocument.in/reader030/viewer/2022041000/5ea0a1a20489630aba0239ea/html5/thumbnails/21.jpg)
21
Bioinformatics courses in Helsinkiregion: 2nd periodp Bayesian paradigm in genetic bioinformatics (6
credits, Kumpula)p Biological Sequence Analysis (6 credits, Kumpula)p Modeling of biological networks (5-7 credits, TKK)p Statistical methods in genetics (6-8 credits,
Kumpula)
![Page 22: Introduction to Bioinformatics - University of Helsinki · 20 A good biology course for computer scientists and mathematicians? p Biology for methodological scientists (8 credits,](https://reader030.vdocument.in/reader030/viewer/2022041000/5ea0a1a20489630aba0239ea/html5/thumbnails/22.jpg)
22
Bioinformatics courses in Helsinkiregion: 3rd periodp Evolution and the theory of games (5 credits, Kumpula)p Genome-wide association mapping (6-8 credits, Kumpula)p High-Throughput Bioinformatics (5-7 credits, TKK)p Image Analysis in Neuroinformatics (5 credits, TKK)p Practical Course in Biodatabases (4-5 credits, Kumpula)p Seminar: Computational systems biology (3 credits,
Kumpula)p Spatial models in ecology and evolution (8 credits,
Kumpula)p Special course in bioinformatics I (3-7 credits, TKK)
![Page 23: Introduction to Bioinformatics - University of Helsinki · 20 A good biology course for computer scientists and mathematicians? p Biology for methodological scientists (8 credits,](https://reader030.vdocument.in/reader030/viewer/2022041000/5ea0a1a20489630aba0239ea/html5/thumbnails/23.jpg)
23
Bioinformatics courses in Helsinki region:4th periodp Metabolic Modeling (4 credits, Kumpula)p Phylogenetic data analyses (6-8 credits,
Kumpula)
![Page 24: Introduction to Bioinformatics - University of Helsinki · 20 A good biology course for computer scientists and mathematicians? p Biology for methodological scientists (8 credits,](https://reader030.vdocument.in/reader030/viewer/2022041000/5ea0a1a20489630aba0239ea/html5/thumbnails/24.jpg)
24
1. What is bioinformatics?
![Page 25: Introduction to Bioinformatics - University of Helsinki · 20 A good biology course for computer scientists and mathematicians? p Biology for methodological scientists (8 credits,](https://reader030.vdocument.in/reader030/viewer/2022041000/5ea0a1a20489630aba0239ea/html5/thumbnails/25.jpg)
25
What is bioinformatics?p Bioinformatics, n. The science of information
and information flow in biological systems,esp. of the use of computational methods ingenetics and genomics. (Oxford EnglishDictionary)
p "The mathematical, statistical and computingmethods that aim to solve biological problemsusing DNA and amino acid sequences andrelated information." -- Fredj Tekaia
![Page 26: Introduction to Bioinformatics - University of Helsinki · 20 A good biology course for computer scientists and mathematicians? p Biology for methodological scientists (8 credits,](https://reader030.vdocument.in/reader030/viewer/2022041000/5ea0a1a20489630aba0239ea/html5/thumbnails/26.jpg)
26
What is bioinformatics?p "I do not think all biological computing is
bioinformatics, e.g. mathematical modelling isnot bioinformatics, even when connected withbiology-related problems. In my opinion,bioinformatics has to do with management andthe subsequent use of biological information,particular genetic information."-- Richard Durbin
![Page 27: Introduction to Bioinformatics - University of Helsinki · 20 A good biology course for computer scientists and mathematicians? p Biology for methodological scientists (8 credits,](https://reader030.vdocument.in/reader030/viewer/2022041000/5ea0a1a20489630aba0239ea/html5/thumbnails/27.jpg)
27
What is not bioinformatics?p Biologically-inspired computation, e.g., genetic algorithms
and neural networksp However, application of neural networks to solve some
biological problem, could be called bioinformaticsp What about DNA computing?
http://www.wisdom.weizmann.ac.il/~lbn/new_pages/Visual_Presentation.html
![Page 28: Introduction to Bioinformatics - University of Helsinki · 20 A good biology course for computer scientists and mathematicians? p Biology for methodological scientists (8 credits,](https://reader030.vdocument.in/reader030/viewer/2022041000/5ea0a1a20489630aba0239ea/html5/thumbnails/28.jpg)
28
Computational biologyp Application of computing to biology (broad
definition)p Often used interchangeably with bioinformaticsp Or: Biology that is done with computational
means
![Page 29: Introduction to Bioinformatics - University of Helsinki · 20 A good biology course for computer scientists and mathematicians? p Biology for methodological scientists (8 credits,](https://reader030.vdocument.in/reader030/viewer/2022041000/5ea0a1a20489630aba0239ea/html5/thumbnails/29.jpg)
29
Biometry & biophysicsp Biometry: the statistical analysis of biological
datan Sometimes also the field of identification of individuals
using biological traits (a more recent definition)
p Biophysics: "an interdisciplinary field whichapplies techniques from the physical sciencesto understanding biological structure andfunction" -- British Biophysical Society
![Page 30: Introduction to Bioinformatics - University of Helsinki · 20 A good biology course for computer scientists and mathematicians? p Biology for methodological scientists (8 credits,](https://reader030.vdocument.in/reader030/viewer/2022041000/5ea0a1a20489630aba0239ea/html5/thumbnails/30.jpg)
30
Mathematical biologyp Mathematical biology
“tackles biologicalproblems, but the methodsit uses to tackle them neednot be numerical and neednot be implemented insoftware or hardware.”-- Damian Counsell
Alan Turing
![Page 31: Introduction to Bioinformatics - University of Helsinki · 20 A good biology course for computer scientists and mathematicians? p Biology for methodological scientists (8 credits,](https://reader030.vdocument.in/reader030/viewer/2022041000/5ea0a1a20489630aba0239ea/html5/thumbnails/31.jpg)
31
Turing on biological complexityp “It must be admitted that the biological examples which
it has been possible to give in the present paper are verylimited.
This can be ascribed quite simply to the fact thatbiological phenomena are usually very complicated.Taking this in combination with the relatively elementarymathematics used in this paper one could hardly expect tofind that many observed biological phenomena would becovered.
It is thought, however, that the imaginary biologicalsystems which have been treated, and the principles whichhave been discussed, should be of some help ininterpreting real biological forms.”
– Alan Turing, The Chemical Basis of Morphogenesis, 1952
![Page 32: Introduction to Bioinformatics - University of Helsinki · 20 A good biology course for computer scientists and mathematicians? p Biology for methodological scientists (8 credits,](https://reader030.vdocument.in/reader030/viewer/2022041000/5ea0a1a20489630aba0239ea/html5/thumbnails/32.jpg)
32
Related conceptsp Systems biology
n “Biology of networks”n Integrating different levels
of information tounderstand how biologicalsystems work
p Computational systems biology
Overview of metabolic pathways inKEGG database, www.genome.jp/kegg/
![Page 33: Introduction to Bioinformatics - University of Helsinki · 20 A good biology course for computer scientists and mathematicians? p Biology for methodological scientists (8 credits,](https://reader030.vdocument.in/reader030/viewer/2022041000/5ea0a1a20489630aba0239ea/html5/thumbnails/33.jpg)
33
Why is bioinformatics important?p New measurement techniques produce
huge quantities of biological datan Advanced data analysis methods are needed to
make sense of the datan Typical data sources produce noisy data with a
lot of missing values
p Paradigm shift in biology to utilisebioinformatics in research
![Page 34: Introduction to Bioinformatics - University of Helsinki · 20 A good biology course for computer scientists and mathematicians? p Biology for methodological scientists (8 credits,](https://reader030.vdocument.in/reader030/viewer/2022041000/5ea0a1a20489630aba0239ea/html5/thumbnails/34.jpg)
34
Bioinformatician’s skill setp Statistics, data analysis methodsn Lots of datan High noise levels, missing valuesn #attributes >> #data points
p Programming languagesn Scripting languages: Python, Perl, Ruby, …n Extensive use of text file formats: need
parsersn Integration of both data and tools
p Data structures, databases
![Page 35: Introduction to Bioinformatics - University of Helsinki · 20 A good biology course for computer scientists and mathematicians? p Biology for methodological scientists (8 credits,](https://reader030.vdocument.in/reader030/viewer/2022041000/5ea0a1a20489630aba0239ea/html5/thumbnails/35.jpg)
35
Bioinformatician’s skill setp Modellingn Discrete vs continuous domainsn -> Systems biology
p Scientific computation packagesn R, Matlab/Octave, …
p Communication skills!
![Page 36: Introduction to Bioinformatics - University of Helsinki · 20 A good biology course for computer scientists and mathematicians? p Biology for methodological scientists (8 credits,](https://reader030.vdocument.in/reader030/viewer/2022041000/5ea0a1a20489630aba0239ea/html5/thumbnails/36.jpg)
36
Communication skills: case 1Biologist presents a problemto computer scientists /mathematicians
?
”I am interested in finding what affects theregulation gene x during condition y and how
that relates to the organism’s phenotype.”
”Define input and output of the problem.”
![Page 37: Introduction to Bioinformatics - University of Helsinki · 20 A good biology course for computer scientists and mathematicians? p Biology for methodological scientists (8 credits,](https://reader030.vdocument.in/reader030/viewer/2022041000/5ea0a1a20489630aba0239ea/html5/thumbnails/37.jpg)
37
Communication skills: case 2Bioinformatician is a partof a group that consistsmostly of biologists.
![Page 38: Introduction to Bioinformatics - University of Helsinki · 20 A good biology course for computer scientists and mathematicians? p Biology for methodological scientists (8 credits,](https://reader030.vdocument.in/reader030/viewer/2022041000/5ea0a1a20489630aba0239ea/html5/thumbnails/38.jpg)
38
Communication skills: case 2
...biologist/bioinformatician ratio is important!
![Page 39: Introduction to Bioinformatics - University of Helsinki · 20 A good biology course for computer scientists and mathematicians? p Biology for methodological scientists (8 credits,](https://reader030.vdocument.in/reader030/viewer/2022041000/5ea0a1a20489630aba0239ea/html5/thumbnails/39.jpg)
39
Communication skills: case 3A group ofbioinformaticiansoffers their services tomore than one group
![Page 40: Introduction to Bioinformatics - University of Helsinki · 20 A good biology course for computer scientists and mathematicians? p Biology for methodological scientists (8 credits,](https://reader030.vdocument.in/reader030/viewer/2022041000/5ea0a1a20489630aba0239ea/html5/thumbnails/40.jpg)
40
Bioinformatician’s skill setp How much biology you should know?
![Page 41: Introduction to Bioinformatics - University of Helsinki · 20 A good biology course for computer scientists and mathematicians? p Biology for methodological scientists (8 credits,](https://reader030.vdocument.in/reader030/viewer/2022041000/5ea0a1a20489630aba0239ea/html5/thumbnails/41.jpg)
41
Computer Science• Programming• Databases• Algorithmics
Mathematics andstatistics• Calculus• Probability calculus• Linear algebra
Biology & Medicine• Basics in molecular andcell biology• Measurement techniques
Bioinformatics• Biological sequence analysis• Biological databases• Analysis of gene expression• Modeling protein structure andfunction• Gene, protein and metabolicnetworks• …
Bioinformatician’s skill set
Prof. Juho Rousu, 2006
Where would you be in this triangle?
![Page 42: Introduction to Bioinformatics - University of Helsinki · 20 A good biology course for computer scientists and mathematicians? p Biology for methodological scientists (8 credits,](https://reader030.vdocument.in/reader030/viewer/2022041000/5ea0a1a20489630aba0239ea/html5/thumbnails/42.jpg)
42
A problem involving bioinformatics?
- ”I found a fruit fly that is immune to all diseases!”
- ”It was one of these”
Pertti Jarla, http://www.hs.fi/fingerpori/
![Page 43: Introduction to Bioinformatics - University of Helsinki · 20 A good biology course for computer scientists and mathematicians? p Biology for methodological scientists (8 credits,](https://reader030.vdocument.in/reader030/viewer/2022041000/5ea0a1a20489630aba0239ea/html5/thumbnails/43.jpg)
43
Molecular biology primer
Molecular Biology Primer by Angela Brooks, Raymond Brown,Calvin Chen, Mike Daly, Hoa Dinh, Erinn Hama, Robert Hinman,Julio Ng, Michael Sneddon, Hoa Troung, Jerry Wang, Che FungYungEdited for Introduction to Bioinformatics (Autumn 2007, Summer2008, Autumn 2008) by Esa Pitkänen
![Page 44: Introduction to Bioinformatics - University of Helsinki · 20 A good biology course for computer scientists and mathematicians? p Biology for methodological scientists (8 credits,](https://reader030.vdocument.in/reader030/viewer/2022041000/5ea0a1a20489630aba0239ea/html5/thumbnails/44.jpg)
44
Molecular biology primerp Part 1: What is life made of?p Part 2: Where does the variation in
genomes come from?
![Page 45: Introduction to Bioinformatics - University of Helsinki · 20 A good biology course for computer scientists and mathematicians? p Biology for methodological scientists (8 credits,](https://reader030.vdocument.in/reader030/viewer/2022041000/5ea0a1a20489630aba0239ea/html5/thumbnails/45.jpg)
45
Life begins with Cell
p A cell is a smallest structural unit of anorganism that is capable of independentfunctioning
p All cells have some common features
![Page 46: Introduction to Bioinformatics - University of Helsinki · 20 A good biology course for computer scientists and mathematicians? p Biology for methodological scientists (8 credits,](https://reader030.vdocument.in/reader030/viewer/2022041000/5ea0a1a20489630aba0239ea/html5/thumbnails/46.jpg)
46
Cellsp Fundamental working units of every living system.p Every organism is composed of one of two radically different types of
cells:n prokaryotic cells orn eukaryotic cells.
p Prokaryotes and Eukaryotes are descended from the sameprimitive cell.n All prokaryotic and eukaryotic cells are the result of a total of 3.5
billion years of evolution.
![Page 47: Introduction to Bioinformatics - University of Helsinki · 20 A good biology course for computer scientists and mathematicians? p Biology for methodological scientists (8 credits,](https://reader030.vdocument.in/reader030/viewer/2022041000/5ea0a1a20489630aba0239ea/html5/thumbnails/47.jpg)
47
Two types of cells: Prokaryotes andEukaryotes
![Page 48: Introduction to Bioinformatics - University of Helsinki · 20 A good biology course for computer scientists and mathematicians? p Biology for methodological scientists (8 credits,](https://reader030.vdocument.in/reader030/viewer/2022041000/5ea0a1a20489630aba0239ea/html5/thumbnails/48.jpg)
48
Prokaryotes and Eukaryotesp According to the most
recent evidence, thereare three mainbranches to the tree oflife
p Prokaryotes includeArchaea (“ancientones”) and bacteria
p Eukaryotes arekingdom Eukarya andincludes plants,animals, fungi andcertain algae Lecture: Phylogenetic trees
![Page 49: Introduction to Bioinformatics - University of Helsinki · 20 A good biology course for computer scientists and mathematicians? p Biology for methodological scientists (8 credits,](https://reader030.vdocument.in/reader030/viewer/2022041000/5ea0a1a20489630aba0239ea/html5/thumbnails/49.jpg)
49
All Cells have common Cycles
p Born, eat, replicate, and die
![Page 50: Introduction to Bioinformatics - University of Helsinki · 20 A good biology course for computer scientists and mathematicians? p Biology for methodological scientists (8 credits,](https://reader030.vdocument.in/reader030/viewer/2022041000/5ea0a1a20489630aba0239ea/html5/thumbnails/50.jpg)
50
Common features of organismsp Chemical energy is stored in ATPp Genetic information is encoded by DNAp Information is transcribed into RNAp There is a common triplet genetic codep Translation into proteins involves ribosomesp Shared metabolic pathwaysp Similar proteins among diverse groups of
organisms
![Page 51: Introduction to Bioinformatics - University of Helsinki · 20 A good biology course for computer scientists and mathematicians? p Biology for methodological scientists (8 credits,](https://reader030.vdocument.in/reader030/viewer/2022041000/5ea0a1a20489630aba0239ea/html5/thumbnails/51.jpg)
51
All Life depends on 3 critical moleculesp DNAs (Deoxyribonucleic acid)
n Hold information on how cell works
p RNAs (Ribonucleic acid)n Act to transfer short pieces of information to different
parts of celln Provide templates to synthesize into protein
p Proteinsn Form enzymes that send signals to other cells and
regulate gene activityn Form body’s major components (e.g. hair, skin, etc.)n “Workhorses” of the cell
![Page 52: Introduction to Bioinformatics - University of Helsinki · 20 A good biology course for computer scientists and mathematicians? p Biology for methodological scientists (8 credits,](https://reader030.vdocument.in/reader030/viewer/2022041000/5ea0a1a20489630aba0239ea/html5/thumbnails/52.jpg)
52
DNA: The Code of Life
p The structure and the four genomic letters code for all livingorganisms
p Adenine, Guanine, Thymine, and Cytosine which pair A-T and C-Gon complimentary strands.
Lecture: Genome sequencingand assembly
![Page 53: Introduction to Bioinformatics - University of Helsinki · 20 A good biology course for computer scientists and mathematicians? p Biology for methodological scientists (8 credits,](https://reader030.vdocument.in/reader030/viewer/2022041000/5ea0a1a20489630aba0239ea/html5/thumbnails/53.jpg)
53
Discovery of the structure of DNAp 1952-1953 James D. Watson and Francis H. C. Crick
deduced the double helical structure of DNA from X-raydiffraction images by Rosalind Franklin and data on amountsof nucleotides in DNA
James Watson andFrancis Crick
RosalindFranklin
”Photo 51”
![Page 54: Introduction to Bioinformatics - University of Helsinki · 20 A good biology course for computer scientists and mathematicians? p Biology for methodological scientists (8 credits,](https://reader030.vdocument.in/reader030/viewer/2022041000/5ea0a1a20489630aba0239ea/html5/thumbnails/54.jpg)
54
DNA, continuedp DNA has a double helix
structure which iscomposed ofn sugar moleculen phosphate groupn and a base (A,C,G,T)
p By convention, we readDNA strings in direction oftranscription: from 5’ endto 3’ end5’ ATTTAGGCC 3’3’ TAAATCCGG 5’
![Page 55: Introduction to Bioinformatics - University of Helsinki · 20 A good biology course for computer scientists and mathematicians? p Biology for methodological scientists (8 credits,](https://reader030.vdocument.in/reader030/viewer/2022041000/5ea0a1a20489630aba0239ea/html5/thumbnails/55.jpg)
55
DNA is contained in chromosomes
http://en.wikipedia.org/wiki/Image:Chromatin_Structures.png
p In eukaryotes, DNA is packed into chromatidsn In metaphase, the “X” structure consists of two identical
chromatids
p In prokaryotes, DNA is usually contained in a single,circular chromosome
![Page 56: Introduction to Bioinformatics - University of Helsinki · 20 A good biology course for computer scientists and mathematicians? p Biology for methodological scientists (8 credits,](https://reader030.vdocument.in/reader030/viewer/2022041000/5ea0a1a20489630aba0239ea/html5/thumbnails/56.jpg)
56
Human chromosomesp Somatic cells in humans
have 2 pairs of 22chromosomes + XX(female) or XY (male) =total of 46 chromosomes
p Germline cells have 22chromosomes + either X orY = total of 23chromosomes
Karyogram of human male using Giemsa staining(http://en.wikipedia.org/wiki/Karyotype)
![Page 57: Introduction to Bioinformatics - University of Helsinki · 20 A good biology course for computer scientists and mathematicians? p Biology for methodological scientists (8 credits,](https://reader030.vdocument.in/reader030/viewer/2022041000/5ea0a1a20489630aba0239ea/html5/thumbnails/57.jpg)
57
Length of DNA and number of chromosomesOrganism #base pairs #chromosomes (germline)
ProkayoticEscherichia coli (bacterium) 4x106 1
EukaryoticSaccharomyces cerevisia (yeast) 1.35x107 17Drosophila melanogaster (insect) 1.65x108 4Homo sapiens (human) 2.9x109 23Zea mays (corn / maize) 5.0x109 10
![Page 58: Introduction to Bioinformatics - University of Helsinki · 20 A good biology course for computer scientists and mathematicians? p Biology for methodological scientists (8 credits,](https://reader030.vdocument.in/reader030/viewer/2022041000/5ea0a1a20489630aba0239ea/html5/thumbnails/58.jpg)
58
1 atgagccaag ttccgaacaa ggattcgcgg ggaggataga tcagcgcccg agaggggtga61 gtcggtaaag agcattggaa cgtcggagat acaactccca agaaggaaaa aagagaaagc
121 aagaagcgga tgaatttccc cataacgcca gtgaaactct aggaagggga aagagggaag181 gtggaagaga aggaggcggg cctcccgatc cgaggggccc ggcggccaag tttggaggac241 actccggccc gaagggttga gagtacccca gagggaggaa gccacacgga gtagaacaga301 gaaatcacct ccagaggacc ccttcagcga acagagagcg catcgcgaga gggagtagac361 catagcgata ggaggggatg ctaggagttg ggggagaccg aagcgaggag gaaagcaaag421 agagcagcgg ggctagcagg tgggtgttcc gccccccgag aggggacgag tgaggcttat481 cccggggaac tcgacttatc gtccccacat agcagactcc cggaccccct ttcaaagtga541 ccgagggggg tgactttgaa cattggggac cagtggagcc atgggatgct cctcccgatt601 ccgcccaagc tccttccccc caagggtcgc ccaggaatgg cgggacccca ctctgcaggg661 tccgcgttcc atcctttctt acctgatggc cggcatggtc ccagcctcct cgctggcgcc721 ggctgggcaa cattccgagg ggaccgtccc ctcggtaatg gcgaatggga cccacaaatc781 tctctagctt cccagagaga agcgagagaa aagtggctct cccttagcca tccgagtgga841 cgtgcgtcct ccttcggatg cccaggtcgg accgcgagga ggtggagatg ccatgccgac901 ccgaagagga aagaaggacg cgagacgcaa acctgcgagt ggaaacccgc tttattcact961 ggggtcgaca actctgggga gaggagggag ggtcggctgg gaagagtata tcctatggga
1021 atccctggct tccccttatg tccagtccct ccccggtccg agtaaagggg gactccggga1081 ctccttgcat gctggggacg aagccgcccc cgggcgctcc cctcgttcca ccttcgaggg1141 ggttcacacc cccaacctgc gggccggcta ttcttctttc ccttctctcg tcttcctcgg1201 tcaacctcct aagttcctct tcctcctcct tgctgaggtt ctttcccccc gccgatagct1261 gctttctctt gttctcgagg gccttccttc gtcggtgatc ctgcctctcc ttgtcggtga1321 atcctcccct ggaaggcctc ttcctaggtc cggagtctac ttccatctgg tccgttcggg1381 ccctcttcgc cgggggagcc ccctctccat ccttatcttt ctttccgaga attcctttga1441 tgtttcccag ccagggatgt tcatcctcaa gtttcttgat tttcttctta accttccgga1501 ggtctctctc gagttcctct aacttctttc ttccgctcac ccactgctcg agaacctctt1561 ctctcccccc gcggtttttc cttccttcgg gccggctcat cttcgactag aggcgacggt1621 cctcagtact cttactcttt tctgtaaaga ggagactgct ggccctgtcg cccaagttcg1681 ag
Hepatitis delta virus, complete genome
![Page 59: Introduction to Bioinformatics - University of Helsinki · 20 A good biology course for computer scientists and mathematicians? p Biology for methodological scientists (8 credits,](https://reader030.vdocument.in/reader030/viewer/2022041000/5ea0a1a20489630aba0239ea/html5/thumbnails/59.jpg)
59
RNAp RNA is similar to DNA chemically. It is usually only a
single strand. T(hyamine) is replaced by U(racil)p Several types of RNA exist for different functions in
the cell.
http://www.cgl.ucsf.edu/home/glasfeld/tutorial/trna/trna.giftRNA linear and 3D view:
![Page 60: Introduction to Bioinformatics - University of Helsinki · 20 A good biology course for computer scientists and mathematicians? p Biology for methodological scientists (8 credits,](https://reader030.vdocument.in/reader030/viewer/2022041000/5ea0a1a20489630aba0239ea/html5/thumbnails/60.jpg)
60
DNA, RNA, and the Flow ofInformation
TranslationTranscription
Replication ”The central dogma”
Is this true?
Denis Noble: The principles of Systems Biology illustrated using the virtual hearthttp://velblod.videolectures.net/2007/pascal/eccs07_dresden/noble_denis/eccs07_noble_psb_01.ppt
![Page 61: Introduction to Bioinformatics - University of Helsinki · 20 A good biology course for computer scientists and mathematicians? p Biology for methodological scientists (8 credits,](https://reader030.vdocument.in/reader030/viewer/2022041000/5ea0a1a20489630aba0239ea/html5/thumbnails/61.jpg)
61
Proteinsp Proteins are polypeptides
(strings of amino acidresidues)
p Represented using stringsof letters from an alphabetof 20: AEGLV…WKKLAG
p Typical length 50…1000residues
Urease enzyme from Helicobacter pylori
![Page 62: Introduction to Bioinformatics - University of Helsinki · 20 A good biology course for computer scientists and mathematicians? p Biology for methodological scientists (8 credits,](https://reader030.vdocument.in/reader030/viewer/2022041000/5ea0a1a20489630aba0239ea/html5/thumbnails/62.jpg)
62
Amino acids
http://upload.wikimedia.org/wikipedia/commons/c/c5/Amino_acids_2.png
![Page 63: Introduction to Bioinformatics - University of Helsinki · 20 A good biology course for computer scientists and mathematicians? p Biology for methodological scientists (8 credits,](https://reader030.vdocument.in/reader030/viewer/2022041000/5ea0a1a20489630aba0239ea/html5/thumbnails/63.jpg)
63
How DNA/RNA codes for protein?p DNA alphabet contains four
letters but must specifyprotein, or polypeptidesequence of 20 letters.
p Dinucleotides are notenough: 42 = 16 possibledinucleotides
p Trinucleotides (triplets)allow 43 = 64 possibletrinucleotides
p Triplets are also calledcodons
![Page 64: Introduction to Bioinformatics - University of Helsinki · 20 A good biology course for computer scientists and mathematicians? p Biology for methodological scientists (8 credits,](https://reader030.vdocument.in/reader030/viewer/2022041000/5ea0a1a20489630aba0239ea/html5/thumbnails/64.jpg)
64
How DNA/RNA codes for protein?p Three of the possible
triplets specify ”stoptranslation”
p Translation usually startsat triplet AUG (this codesfor methionine)
p Most amino acids may bespecified by more thantriplet
p How to find a gene? Lookfor start and stop codons(not that easy though)
![Page 65: Introduction to Bioinformatics - University of Helsinki · 20 A good biology course for computer scientists and mathematicians? p Biology for methodological scientists (8 credits,](https://reader030.vdocument.in/reader030/viewer/2022041000/5ea0a1a20489630aba0239ea/html5/thumbnails/65.jpg)
65
Proteins: Workhorses of the Cell
p 20 different amino acidsn different chemical properties cause the protein chains to fold
up into specific three-dimensional structures that define theirparticular functions in the cell.
p Proteins do all essential work for the celln build cellular structuresn digest nutrientsn execute metabolic functionsn mediate information flow within a cell and among cellular
communities.p Proteins work together with other proteins or nucleic acids
as "molecular machines"n structures that fit together and function in highly specific, lock-
and-key ways.
Lecture 8: Proteomics
![Page 66: Introduction to Bioinformatics - University of Helsinki · 20 A good biology course for computer scientists and mathematicians? p Biology for methodological scientists (8 credits,](https://reader030.vdocument.in/reader030/viewer/2022041000/5ea0a1a20489630aba0239ea/html5/thumbnails/66.jpg)
66
Genesp “A gene is a union of genomic sequences encoding a
coherent set of potentially overlapping functional products”--Gerstein et al.
p A DNA segment whose information is expressed either asan RNA molecule or protein
5’ 3’
3’ 5’
… a t g a g t g g a …
… t a c t c a c c t …
augagugga ...
(transcription)
(translation)MSG …
(folding)
http://fold.it
![Page 68: Introduction to Bioinformatics - University of Helsinki · 20 A good biology course for computer scientists and mathematicians? p Biology for methodological scientists (8 credits,](https://reader030.vdocument.in/reader030/viewer/2022041000/5ea0a1a20489630aba0239ea/html5/thumbnails/68.jpg)
68
Genes & allelesp A gene can have different variantsp The variants of the same gene are called
alleles
5’
3’
… a t g a g t g g a …
… t a c t c a c c t …
augagugga ...
MSG …
5’
3’
… a t g a g t c g a …
… t a c t c a g c t …
augagucga ...
MSR …
![Page 69: Introduction to Bioinformatics - University of Helsinki · 20 A good biology course for computer scientists and mathematicians? p Biology for methodological scientists (8 credits,](https://reader030.vdocument.in/reader030/viewer/2022041000/5ea0a1a20489630aba0239ea/html5/thumbnails/69.jpg)
69
Genes can be found on both strands
3’
5’
5’
3’
![Page 70: Introduction to Bioinformatics - University of Helsinki · 20 A good biology course for computer scientists and mathematicians? p Biology for methodological scientists (8 credits,](https://reader030.vdocument.in/reader030/viewer/2022041000/5ea0a1a20489630aba0239ea/html5/thumbnails/70.jpg)
70
Exons and introns & splicing
3’
5’
5’
3’
Introns are removed from RNA after transcription
Exons
Exons are joined:
This process is called splicing
![Page 71: Introduction to Bioinformatics - University of Helsinki · 20 A good biology course for computer scientists and mathematicians? p Biology for methodological scientists (8 credits,](https://reader030.vdocument.in/reader030/viewer/2022041000/5ea0a1a20489630aba0239ea/html5/thumbnails/71.jpg)
71
Alternative splicing
A 3’
5’
5’
3’
B C
Different splice variants may be generated
A B C
B C
A C
…
![Page 72: Introduction to Bioinformatics - University of Helsinki · 20 A good biology course for computer scientists and mathematicians? p Biology for methodological scientists (8 credits,](https://reader030.vdocument.in/reader030/viewer/2022041000/5ea0a1a20489630aba0239ea/html5/thumbnails/72.jpg)
72
Where does the variation in genomes comefrom?p Prokaryotes are typically
haploid: they have a single(circular) chromosome
p DNA is usually inheritedvertically (parent todaughter)
p Inheritance is clonaln Descendants are faithful
copies of an ancestral DNAn Variation is introduced via
mutations, transposableelements, and horizontaltransfer of DNA
Chromosome map of S. dysenteriae, the nine ringsdescribe different properties of the genomehttp://www.mgc.ac.cn/ShiBASE/circular_Sd197.htm
![Page 73: Introduction to Bioinformatics - University of Helsinki · 20 A good biology course for computer scientists and mathematicians? p Biology for methodological scientists (8 credits,](https://reader030.vdocument.in/reader030/viewer/2022041000/5ea0a1a20489630aba0239ea/html5/thumbnails/73.jpg)
73
Causes of variationp Mistakes in DNA replicationp Environmental agents (radiation, chemical
agents)p Transposable elements (transposons)
n A part of DNA is moved or copied to another location ingenome
p Horizontal transfer of DNAn Organism obtains genetic material from another
organism that is not its parentn Utilized in genetic engineering
![Page 74: Introduction to Bioinformatics - University of Helsinki · 20 A good biology course for computer scientists and mathematicians? p Biology for methodological scientists (8 credits,](https://reader030.vdocument.in/reader030/viewer/2022041000/5ea0a1a20489630aba0239ea/html5/thumbnails/74.jpg)
74
Biological string manipulationp Point mutation: substitution of a base
n …ACGGCT… => …ACGCCT…
p Deletion: removal of one or more contiguousbases (substring)n …TTGATCA… => …TTTCA…
p Insertion: insertion of a substringn …GGCTAG… => …GGTCAACTAG…
Lecture: Sequence alignmentLecture: Genome rearrangements
![Page 75: Introduction to Bioinformatics - University of Helsinki · 20 A good biology course for computer scientists and mathematicians? p Biology for methodological scientists (8 credits,](https://reader030.vdocument.in/reader030/viewer/2022041000/5ea0a1a20489630aba0239ea/html5/thumbnails/75.jpg)
75
Meiosisp Sexual organisms are usually
diploidn Germline cells (gametes)
contain N chromosomesn Somatic (body) cells have 2N
chromosomesp Meiosis: reduction of
chromosome number from2N to N during reproductivecyclen One chromosome doubling is
followed by two cell divisions
Major events in meiosis
http://en.wikipedia.org/wiki/Meiosis
http://www.ncbi.nlm.nih.gov/About/Primer
![Page 76: Introduction to Bioinformatics - University of Helsinki · 20 A good biology course for computer scientists and mathematicians? p Biology for methodological scientists (8 credits,](https://reader030.vdocument.in/reader030/viewer/2022041000/5ea0a1a20489630aba0239ea/html5/thumbnails/76.jpg)
76
Recombination and variationp Recap: Allele is a viable DNA
coding occupying a given locus(position in the genome)
p In recombination, alleles fromparents become suffled inoffspring individuals viachromosomal crossover over
p Allele combinations inoffspring are usually differentfrom combinations found inparents
p Recombination errors lead intoadditional variations
Chromosomal crossover as described byT. H. Morgan in 1916
![Page 77: Introduction to Bioinformatics - University of Helsinki · 20 A good biology course for computer scientists and mathematicians? p Biology for methodological scientists (8 credits,](https://reader030.vdocument.in/reader030/viewer/2022041000/5ea0a1a20489630aba0239ea/html5/thumbnails/77.jpg)
77
Mitosis
http://en.wikipedia.org/wiki/Image:Major_events_in_mitosis.svg
p Mitosis: growth and development of the organismn One chromosome doubling is followed by one cell
division
![Page 78: Introduction to Bioinformatics - University of Helsinki · 20 A good biology course for computer scientists and mathematicians? p Biology for methodological scientists (8 credits,](https://reader030.vdocument.in/reader030/viewer/2022041000/5ea0a1a20489630aba0239ea/html5/thumbnails/78.jpg)
78
Recombination frequency and linked genesp Genetic marker: some DNA sequence of interest
(e.g., gene or a part of a gene)
p Recombination is more likely to separate twodistant markers than two close ones
p Linked markers: ”tend” to be inherited together
p Marker distances measured in centimorgans: 1centimorgan corresponds to 1% chance that twomarkers are separated in recombination
![Page 79: Introduction to Bioinformatics - University of Helsinki · 20 A good biology course for computer scientists and mathematicians? p Biology for methodological scientists (8 credits,](https://reader030.vdocument.in/reader030/viewer/2022041000/5ea0a1a20489630aba0239ea/html5/thumbnails/79.jpg)
79
Biological databasesp Exponential growth of
biological datan New measurement
techniquesn Before we are able to use
the data, we need to storeit efficiently -> biologicaldatabases
n Published data issubmitted to databases
p General vs specialiseddatabases
p This topic is discussedextensively in Practicalcourse in biodatabases (IIIperiod)
![Page 80: Introduction to Bioinformatics - University of Helsinki · 20 A good biology course for computer scientists and mathematicians? p Biology for methodological scientists (8 credits,](https://reader030.vdocument.in/reader030/viewer/2022041000/5ea0a1a20489630aba0239ea/html5/thumbnails/80.jpg)
80
10 most important biodatabases… accordingto ”Bioinformatics for dummies”
p GenBank/DDJB/EMBL www.ncbi.nlm.nih.gov Nucleotide sequencesp Ensembl www.ensembl.org Human/mouse genomep PubMed www.ncbi.nlm.nih.gov Literature referencesp NR www.ncbi.nlm.nih.gov Protein sequencesp UniProt www.expasy.org Protein sequencesp InterPro www.ebi.ac.uk Protein domainsp OMIM www.ncbi.nlm.nih.gov Genetic diseasesp Enzymes www.expasy.org Enzymesp PDB www.rcsb.org/pdb/ Protein structuresp KEGG www.genome.ad.jp Metabolic pathways
Sophia Kossida, Introduction to Bioinformatics, Summer 2008
![Page 81: Introduction to Bioinformatics - University of Helsinki · 20 A good biology course for computer scientists and mathematicians? p Biology for methodological scientists (8 credits,](https://reader030.vdocument.in/reader030/viewer/2022041000/5ea0a1a20489630aba0239ea/html5/thumbnails/81.jpg)
81
FASTA formatp A simple format for DNA and protein sequence
data is FASTA
>Hepatitis delta virus, complete genomeatgagccaagttccgaacaaggattcgcggggaggatagatcagcgcccgagaggggtgagtcggtaaagagcattggaacgtcggagatacaactcccaagaaggaaaaaagagaaagcaagaagcggatgaatttccccataacgccagtgaaactctaggaaggggaaagagggaaggtggaagagaaggaggcgggcctcccgatccgaggggcccggcggccaagtttggaggacactccggcccgaagggttgagagtaccccagagggaggaagccacacggagtagaacagagaaatcacctccagaggaccccttcagcgaacagagagcgcatcgcgagagggagtagaccatagcgataggaggggatgctaggagttgggggagaccgaagcgaggaggaaagcaaagagagcagcggggctagcaggtgggtgttccgccccccgagaggggacgagtgaggcttatcccggggaactcgacttatcgtccccacatagcagactcccggaccccctttcaaagtga…
Header line,begins with >