bioinformatics for stem cell lecture 1
![Page 1: Bioinformatics for Stem Cell Lecture 1](https://reader036.vdocument.in/reader036/viewer/2022062500/568150f8550346895dbf166f/html5/thumbnails/1.jpg)
Bioinformatics for Stem Cell Lecture 1
Debashis Sahoo, PhD
![Page 2: Bioinformatics for Stem Cell Lecture 1](https://reader036.vdocument.in/reader036/viewer/2022062500/568150f8550346895dbf166f/html5/thumbnails/2.jpg)
Outline
• Introduction
• History of Bioinformatics
• Introduction to computing
• Data collection
• Experiment design
• Data analysis
![Page 3: Bioinformatics for Stem Cell Lecture 1](https://reader036.vdocument.in/reader036/viewer/2022062500/568150f8550346895dbf166f/html5/thumbnails/3.jpg)
Bioinformatics Definition
• Biological Data
– Representation
– Storage
– Access
– Processing
• bi·o·in·for·mat·ics [bahy-oh-in-fer-mat-iks]
– noun (used with a singular verb) the retrieval and analysis of biochemical and biological data using mathematics and computer science, as in the study of genomes.
![Page 4: Bioinformatics for Stem Cell Lecture 1](https://reader036.vdocument.in/reader036/viewer/2022062500/568150f8550346895dbf166f/html5/thumbnails/4.jpg)
http://www.merriam-webster.com/dictionary/bioinformatics
http://www.ncbi.nlm.nih.gov/About/primer/bioinformatics.html
![Page 5: Bioinformatics for Stem Cell Lecture 1](https://reader036.vdocument.in/reader036/viewer/2022062500/568150f8550346895dbf166f/html5/thumbnails/5.jpg)
http://www.ncbi.nlm.nih.gov/About/primer/bioinformatics.html
![Page 6: Bioinformatics for Stem Cell Lecture 1](https://reader036.vdocument.in/reader036/viewer/2022062500/568150f8550346895dbf166f/html5/thumbnails/6.jpg)
The science behind Michael Levitt's Nobel Prize
Michael Levitt, PhD, has dramatically advanced the field of structural biology by developing sophisticated computer algorithms to build models of complex biological molecules.
![Page 7: Bioinformatics for Stem Cell Lecture 1](https://reader036.vdocument.in/reader036/viewer/2022062500/568150f8550346895dbf166f/html5/thumbnails/7.jpg)
“It is hard for me to say confidently that, after fifty more years of explosive growth of computer science, there will still be a lot of fascinating unsolved problems at peoples' fingertips, that it won't be pretty much working on refinements of well-explored things. I can't be as confident about computer science as I can about biology. Biology easily has 500 years of exciting problems to work on, it's at that level.”
Professor Donald E. Knuth
The "father" of the analysis of algorithms
He is the author of the seminal multi-volume work The Art of Computer Programming.
![Page 8: Bioinformatics for Stem Cell Lecture 1](https://reader036.vdocument.in/reader036/viewer/2022062500/568150f8550346895dbf166f/html5/thumbnails/8.jpg)
HISTORICAL PERSPECTIVE
![Page 9: Bioinformatics for Stem Cell Lecture 1](https://reader036.vdocument.in/reader036/viewer/2022062500/568150f8550346895dbf166f/html5/thumbnails/9.jpg)
History of Bioinformatics
• 1866 – Gregor Mendel (Verhandlungen des naturforschenden Vereins Brünn)
• 1951 – structure for the alpha-helix and beta-sheet
– Pauling and Corey (PNAS, 1951)
• 1953 – double helix model for DNA
– Watson and Crick (Nature, 171: 737-738, 1953)
• 1955 – protein sequence of bovine insulin
– F. Sanger
![Page 10: Bioinformatics for Stem Cell Lecture 1](https://reader036.vdocument.in/reader036/viewer/2022062500/568150f8550346895dbf166f/html5/thumbnails/10.jpg)
History of Bioinformatics
• 1958 – 1990
– Revolution in Computer Science and Engineering
– Computer, email, network, internet
• 1990 – BLAST
– Altschul, S.F., Gish, W., Miller, W., Myers, E.W. & Lipman, D.J. (1990) "Basic local alignment search tool." J. Mol. Biol. 215:403-410.
• 1995 – The Haemophilus influenzae genome (1.8 Mb) is sequenced.
• 1993 – 2013 – Microarrays
• 2005 – 2013 – High-throughput sequencing
![Page 11: Bioinformatics for Stem Cell Lecture 1](https://reader036.vdocument.in/reader036/viewer/2022062500/568150f8550346895dbf166f/html5/thumbnails/11.jpg)
INTRODUCTION TO COMPUTING
![Page 12: Bioinformatics for Stem Cell Lecture 1](https://reader036.vdocument.in/reader036/viewer/2022062500/568150f8550346895dbf166f/html5/thumbnails/12.jpg)
What is a computer?
[Figure: a Turing machine, with a Controller, a Read/Write head, and a Tape holding the symbols 1 0 0 1 0 1 1 0]
Turing Machine (1936)
Alan Turing, "On computable numbers, with an application to the Entscheidungsproblem", Proceedings of the London Mathematical Society, Series 2, 42 (1937), pp 230–265.
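The controller, read/write head, and tape can be sketched in a few lines of Python. This is only an illustration under assumptions: the rule table, the state names, and the bit-flipping task are invented for the example and do not come from the slide or from Turing's paper.

```python
def run_turing_machine(tape, rules, state="start", blank="_", max_steps=1000):
    """Simulate a one-tape Turing machine and return the final tape contents.

    rules maps (state, symbol) -> (symbol_to_write, move, next_state),
    where move is "R" (right) or "L" (left). The machine stops in state "halt".
    """
    cells = dict(enumerate(tape))   # the tape: cell position -> symbol
    head = 0                        # position of the read/write head
    for _ in range(max_steps):
        if state == "halt":
            break
        symbol = cells.get(head, blank)                 # read
        write, move, state = rules[(state, symbol)]     # controller decides
        cells[head] = write                             # write
        head += 1 if move == "R" else -1                # move the head
    return "".join(cells[i] for i in sorted(cells)).strip(blank)


# Hypothetical rule table: flip every bit, halt at the first blank cell.
FLIP_BITS = {
    ("start", "0"): ("1", "R", "start"),
    ("start", "1"): ("0", "R", "start"),
    ("start", "_"): ("_", "R", "halt"),
}

result = run_turing_machine("10010110", FLIP_BITS)
print(result)
```

The tape is stored as a dictionary so that it can grow in either direction, which stands in for the unbounded tape of the abstract machine.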
![Page 13: Bioinformatics for Stem Cell Lecture 1](https://reader036.vdocument.in/reader036/viewer/2022062500/568150f8550346895dbf166f/html5/thumbnails/13.jpg)
Modern Computer
[Figure: block diagram with Processor, Main Memory, Disk Drives, IO controller, Display, Keyboard, and Mouse]
![Page 14: Bioinformatics for Stem Cell Lecture 1](https://reader036.vdocument.in/reader036/viewer/2022062500/568150f8550346895dbf166f/html5/thumbnails/14.jpg)
What is a Computer Program?
Executable file
Load to Memory
Run the program
C Program
Assembly Program
![Page 15: Bioinformatics for Stem Cell Lecture 1](https://reader036.vdocument.in/reader036/viewer/2022062500/568150f8550346895dbf166f/html5/thumbnails/15.jpg)
DATA COLLECTION
![Page 16: Bioinformatics for Stem Cell Lecture 1](https://reader036.vdocument.in/reader036/viewer/2022062500/568150f8550346895dbf166f/html5/thumbnails/16.jpg)
Public Databases
• Gene Expression Omnibus (GEO)
• ArrayExpress
• National Center for Biotechnology Information (NCBI)
• UCSC Genome Browser
• The Human Protein Atlas
• Catalogue of Somatic Mutations in Cancer (COSMIC)
• The Cancer Genome Atlas (TCGA)
![Page 17: Bioinformatics for Stem Cell Lecture 1](https://reader036.vdocument.in/reader036/viewer/2022062500/568150f8550346895dbf166f/html5/thumbnails/17.jpg)
http://www.ncbi.nlm.nih.gov/geo/
![Page 18: Bioinformatics for Stem Cell Lecture 1](https://reader036.vdocument.in/reader036/viewer/2022062500/568150f8550346895dbf166f/html5/thumbnails/18.jpg)
http://www.ebi.ac.uk/arrayexpress
![Page 19: Bioinformatics for Stem Cell Lecture 1](https://reader036.vdocument.in/reader036/viewer/2022062500/568150f8550346895dbf166f/html5/thumbnails/19.jpg)
http://www.ncbi.nlm.nih.gov/pubmed
![Page 20: Bioinformatics for Stem Cell Lecture 1](https://reader036.vdocument.in/reader036/viewer/2022062500/568150f8550346895dbf166f/html5/thumbnails/20.jpg)
http://genome.ucsc.edu
![Page 21: Bioinformatics for Stem Cell Lecture 1](https://reader036.vdocument.in/reader036/viewer/2022062500/568150f8550346895dbf166f/html5/thumbnails/21.jpg)
http://www.proteinatlas.org/
![Page 22: Bioinformatics for Stem Cell Lecture 1](https://reader036.vdocument.in/reader036/viewer/2022062500/568150f8550346895dbf166f/html5/thumbnails/22.jpg)
http://www.sanger.ac.uk/genetics/CGP/cosmic/
![Page 23: Bioinformatics for Stem Cell Lecture 1](https://reader036.vdocument.in/reader036/viewer/2022062500/568150f8550346895dbf166f/html5/thumbnails/23.jpg)
https://tcga-data.nci.nih.gov/tcga/
![Page 24: Bioinformatics for Stem Cell Lecture 1](https://reader036.vdocument.in/reader036/viewer/2022062500/568150f8550346895dbf166f/html5/thumbnails/24.jpg)
EXPERIMENT DESIGN
![Page 25: Bioinformatics for Stem Cell Lecture 1](https://reader036.vdocument.in/reader036/viewer/2022062500/568150f8550346895dbf166f/html5/thumbnails/25.jpg)
To call in the statistician after the experiment is done may be no more than asking him to perform a postmortem examination: he may be able to say what the experiment died of.
- R. A. Fisher
![Page 26: Bioinformatics for Stem Cell Lecture 1](https://reader036.vdocument.in/reader036/viewer/2022062500/568150f8550346895dbf166f/html5/thumbnails/26.jpg)
http://graphpad.com/guides/prism/5/user-guide/prism5help.html
![Page 27: Bioinformatics for Stem Cell Lecture 1](https://reader036.vdocument.in/reader036/viewer/2022062500/568150f8550346895dbf166f/html5/thumbnails/27.jpg)
Independent Samples
• Statistical tests are based on the assumption that each subject was sampled independently.
• Independent samples provide the maximum amount of information.
• Independent samples provide a better estimate of the mean.
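The last point can be checked with a small simulation. The sketch below is an illustration under assumptions: the population is taken to be standard normal, and the "pseudo-replicated" design (2 subjects each measured 5 times) is a hypothetical contrast case that does not appear in the slides. Both designs yield 10 numbers per experiment, but only the first has 10 independent subjects.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the simulation is repeatable


def independent_samples(n=10):
    """Ten measurements, each from a different, independently chosen subject."""
    return [rng.gauss(0, 1) for _ in range(n)]


def pseudo_replicated_samples(subjects=2, repeats=5):
    """Ten measurements, but only two subjects, each measured five times."""
    values = [rng.gauss(0, 1) for _ in range(subjects)]
    return [v for v in values for _ in range(repeats)]


def spread_of_mean(draw_sample, trials=2000):
    """Standard deviation of the sample mean over many repeated experiments."""
    means = [statistics.fmean(draw_sample()) for _ in range(trials)]
    return statistics.stdev(means)


se_independent = spread_of_mean(independent_samples)
se_pseudo = spread_of_mean(pseudo_replicated_samples)
print(f"independent: {se_independent:.3f}  pseudo-replicated: {se_pseudo:.3f}")
```

Under these assumptions the mean from independent samples should scatter around the true value by roughly 1/sqrt(10), while the pseudo-replicated design behaves like a sample of size 2 and scatters by roughly 1/sqrt(2).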
![Page 28: Bioinformatics for Stem Cell Lecture 1](https://reader036.vdocument.in/reader036/viewer/2022062500/568150f8550346895dbf166f/html5/thumbnails/28.jpg)
The Gaussian Approximation
Everybody believes in the normal approximation, the experimenters because they think it is a mathematical theorem, the mathematicians because they think it is an experimental fact.
G. Lippmann (1845 – 1921)
![Page 29: Bioinformatics for Stem Cell Lecture 1](https://reader036.vdocument.in/reader036/viewer/2022062500/568150f8550346895dbf166f/html5/thumbnails/29.jpg)
Sample Size Estimation
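One common way to estimate sample size, shown here as a sketch rather than as the method on the slide, is the normal-approximation formula for comparing two group means: n per group = 2 (z_(1-alpha/2) + z_power)^2 (sd / effect)^2. The function name, the default alpha of 0.05, and the default power of 0.80 are assumptions made for the example.

```python
import math
from statistics import NormalDist


def samples_per_group(effect, sd, alpha=0.05, power=0.80):
    """Approximate samples needed per group for a two-sided, two-group comparison.

    effect: smallest difference between group means worth detecting
    sd:     assumed standard deviation within each group
    Uses the normal approximation, so it slightly underestimates the
    number required by an exact t-test calculation for small samples.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance
    z_power = NormalDist().inv_cdf(power)          # desired power
    return math.ceil(2 * ((z_alpha + z_power) * sd / effect) ** 2)


print(samples_per_group(effect=1.0, sd=1.0))  # difference of one standard deviation
print(samples_per_group(effect=0.5, sd=1.0))  # half a standard deviation
```

Halving the effect size roughly quadruples the required sample size, which is why the smallest effect of interest should be decided before the experiment is run.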
![Page 30: Bioinformatics for Stem Cell Lecture 1](https://reader036.vdocument.in/reader036/viewer/2022062500/568150f8550346895dbf166f/html5/thumbnails/30.jpg)
DATA ANALYSIS
![Page 31: Bioinformatics for Stem Cell Lecture 1](https://reader036.vdocument.in/reader036/viewer/2022062500/568150f8550346895dbf166f/html5/thumbnails/31.jpg)
Correlation
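As a minimal sketch, the Pearson correlation coefficient between two measurements can be computed directly from its definition. The gene names and expression values below are hypothetical and only serve to exercise the function.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of products of deviations, and sums of squared deviations.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical expression values of two genes across five samples.
gene_a = [1.0, 2.0, 3.0, 4.0, 5.0]
gene_b = [2.0, 4.0, 6.0, 8.0, 10.0]   # rises with gene_a
gene_c = [10.0, 8.0, 6.0, 4.0, 2.0]   # falls as gene_a rises

print(pearson_r(gene_a, gene_b))
print(pearson_r(gene_a, gene_c))
```

Values near +1 or -1 indicate a strong linear relationship, and values near 0 indicate little linear relationship.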
![Page 32: Bioinformatics for Stem Cell Lecture 1](https://reader036.vdocument.in/reader036/viewer/2022062500/568150f8550346895dbf166f/html5/thumbnails/32.jpg)
Hypothesis Testing
• Randomly select samples from the population
• State the null hypothesis
– The distributions of values in the two populations are the same
• Perform the statistical test
– t-test, F-test, chi-squared test
• Get the P-value
– Set a significance threshold (usually P < 0.05)
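The steps above can be sketched with a permutation test on the difference of group means. This is a swapped-in technique, chosen because it needs only the Python standard library; it is not one of the tests named on the slide. The measurements, group names, and function name are hypothetical.

```python
import random
from statistics import fmean


def permutation_p_value(group_a, group_b, n_permutations=10_000, seed=0):
    """Two-sided permutation test for a difference in means.

    Null hypothesis: both groups come from the same distribution, so the
    group labels are exchangeable.
    """
    rng = random.Random(seed)
    observed = abs(fmean(group_a) - fmean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    at_least_as_extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # reassign group labels at random
        diff = abs(fmean(pooled[:n_a]) - fmean(pooled[n_a:]))
        if diff >= observed - 1e-12:  # small tolerance for rounding error
            at_least_as_extreme += 1
    # Add one to numerator and denominator so the estimate is never zero.
    return (at_least_as_extreme + 1) / (n_permutations + 1)


# Hypothetical measurements from two conditions.
control = [5.1, 4.9, 5.3, 5.0, 5.2, 4.8]
treated = [6.1, 6.3, 5.9, 6.2, 6.0, 6.4]

p_value = permutation_p_value(control, treated)
print(f"P = {p_value:.4f}, significant at 0.05: {p_value < 0.05}")
```

The P-value is the fraction of random relabelings that produce a difference at least as large as the observed one, which is then compared with the chosen threshold.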
![Page 33: Bioinformatics for Stem Cell Lecture 1](https://reader036.vdocument.in/reader036/viewer/2022062500/568150f8550346895dbf166f/html5/thumbnails/33.jpg)
Multiple Comparisons
• The Bonferroni correction
– P < 0.05/N (N = number of comparisons)
• False Discovery Rate (FDR) – Q value
– What fraction of all the discoveries are false?
– Example: Q = 10%, N = 100, so the smallest p-value must be < Q/N = 0.001
– http://genomics.princeton.edu/storeylab/qvalue/
– Permutation-based approaches
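Both corrections can be sketched in a few lines. For FDR control the example uses the Benjamini-Hochberg step-up procedure, which is an assumption on my part: the slide refers to Q values and the Storey lab software, which is a related but different method. The p-values are hypothetical.

```python
def bonferroni(p_values, alpha=0.05):
    """Reject a hypothesis only if its p-value is below alpha / N."""
    threshold = alpha / len(p_values)
    return [p < threshold for p in p_values]


def benjamini_hochberg(p_values, q=0.05):
    """Benjamini-Hochberg step-up procedure for controlling the FDR at level q.

    Sort the p-values, find the largest rank k with p_(k) <= k * q / N,
    and reject the k hypotheses with the smallest p-values.
    """
    n = len(p_values)
    order = sorted(range(n), key=lambda i: p_values[i])
    largest_passing_rank = 0
    for rank, index in enumerate(order, start=1):
        if p_values[index] <= rank * q / n:
            largest_passing_rank = rank
    rejected = [False] * n
    for index in order[:largest_passing_rank]:
        rejected[index] = True
    return rejected


# Hypothetical p-values from five comparisons.
p_values = [0.035, 0.001, 0.2, 0.014, 0.012]

print(bonferroni(p_values))          # threshold 0.05 / 5 = 0.01
print(benjamini_hochberg(p_values))  # thresholds 0.01, 0.02, 0.03, 0.04, 0.05
```

For the smallest p-value the Benjamini-Hochberg threshold is q/N, the same as the Bonferroni threshold at that level; the thresholds then relax for higher ranks, so the procedure usually rejects more hypotheses than Bonferroni.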