Chapter 14 Genomes and Genomics
Sequencing DNA: dideoxy (Sanger) method
ddGTP ddATP ddTTP ddCTP
5’ TAATGTACG
   TAATGTAC
   TAATGTA
   TAATGT
   TAATG
   TAAT
   TAA
   TA
   T
Fred Sanger, Nobel prize 1980
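The fragment ladder on this slide can be read back into a sequence in a few lines. This is only a minimal sketch: it assumes each chain-terminated fragment is available as a string (on a real gel only the fragment length and its terminating ddNTP lane or dye are observed), and the function name `read_ladder` is hypothetical.

```python
# Chain-terminated fragments from the slide; each ends at the dideoxy
# base that stopped extension.
fragments = ["TAATGTACG", "TAATGTAC", "TAATGTA", "TAATGT", "TAATG",
             "TAAT", "TAA", "TA", "T"]

def read_ladder(frags):
    """Read the synthesized strand 5'->3' from a complete fragment ladder.

    Shorter fragments migrate further, so sorting by length and taking the
    last (terminating) base of each fragment spells out the sequence.
    """
    ordered = sorted(frags, key=len)
    # A readable ladder has exactly one fragment of each length 1, 2, 3, ...
    lengths = [len(f) for f in ordered]
    if lengths != list(range(1, len(ordered) + 1)):
        raise ValueError("ladder has missing or duplicate fragment lengths")
    return "".join(f[-1] for f in ordered)

print(read_ladder(fragments))
```

Sorting by length stands in for electrophoresis; the terminal base of each fragment stands in for the lane (or fluorescent colour) in which the band appears.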
Sequencing DNA: dideoxy (Sanger) method
Leroy Hood, Caltech – Fluorescence-based sequencing
Norm Dovichi – Capillary electrophoresis
Genomics era: High-throughput DNA sequencing
The first high-throughput genomics technology was automated DNA sequencing in the early 1990s.
In September 1999, Celera Genomics completed the sequencing of the Drosophila genome.
Baker’s yeast, Saccharomyces cerevisiae (about 12 million bp), was the first eukaryotic genome to be sequenced.
TIGR (The Institute for Genomic Research) 1995 – first whole genome sequence, Haemophilus influenzae
Genomics: Completed genomes as of 2002
Currently, the genomes of over 600 organisms have been sequenced:
This generates large amounts of information to be handled by individual computers.
http://www.genomesonline.org/
Cloning/libraries: BAC, YAC and ESTs
• BAC = bacterial artificial chromosome
  – 150 kb, replicates in E. coli
• YAC = yeast artificial chromosome
  – 150 kb to 1.5 Mb, replicates in yeast
Assembling contigs
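As an illustration of how overlapping reads are merged into a contig, here is a minimal greedy sketch. It is not how production assemblers work (they handle sequencing errors, repeats and both strands); the function names, the `min_len` cutoff and the toy reads are all assumptions made for the example.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of `a` that equals a prefix of `b`.

    Returns 0 if the best overlap is shorter than `min_len`.
    """
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0

def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the pair of reads with the largest overlap."""
    reads = list(reads)
    while len(reads) > 1:
        best = (0, None, None)
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best[0]:
                        best = (n, i, j)
        n, i, j = best
        if n == 0:
            break  # nothing overlaps: remaining reads are separate contigs
        merged = reads[i] + reads[j][n:]
        reads = [r for k, r in enumerate(reads) if k not in (i, j)] + [merged]
    return reads

# Toy reads (invented for illustration) that tile one 16-base region.
contigs = greedy_assemble(["ATGCGTAC", "GTACCTGA", "CTGATTAG"])
print(contigs)
```

Reads that share no sufficient overlap are left as separate contigs, which mirrors the gaps that ordered clones are used to close.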
Ordered-clone Sequencing
Clones ordered by restriction enzyme sites
Annotation
• ORF – open reading frame
• EST – expressed sequence tag, based on mRNA
• Comparative genomics
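To make the ORF bullet concrete, the sketch below scans the three forward reading frames for stretches that begin with ATG and end at the first in-frame stop codon. It is a simplified illustration, not an annotation pipeline: it ignores the reverse strand, introns and alternative start codons, and the name `find_orfs` and the `min_codons` cutoff are assumptions.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def find_orfs(dna, min_codons=3):
    """Return (start, end, sequence) for forward-strand ORFs in all 3 frames.

    An ORF here runs from an ATG to the first in-frame stop codon,
    inclusive; `end` is exclusive, so dna[start:end] is the ORF.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i                      # open a candidate ORF
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, dna[start:i + 3]))
                start = None                   # close it and keep scanning
    return orfs

# Toy sequence (invented): one short ORF in the third frame.
print(find_orfs("CCATGAAATTTTAGGG"))
```

In practice a much larger `min_codons` is used, because short ORFs occur by chance in random sequence.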
The trend of data growth
[Figure: nucleotides (billion), axis 0 to 8, plotted against years, 1980 to 2000]
The 21st century is a century of biotechnology:
Microarray: Global expression analysis. RNA levels of every gene in the genome are analyzed in parallel.
Proteomics: Global protein analysis generates large mass spectra libraries.
Metabolomics: Global metabolite analysis. 25,000 secondary metabolites characterized.
Genomics: New sequence information is being produced at increasing rates. (The contents of GenBank double every year.)
Glycomics: Global sugar metabolism analysis.
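The statement that GenBank doubles every year implies exponential growth, which is easy to underestimate. A small sketch of the arithmetic, taking the one-year doubling time from the slide at face value (the function name and the starting value of 1 are illustrative assumptions, not GenBank figures):

```python
def projected_size(initial, years, doubling_time_years=1.0):
    """Size after `years` if the quantity doubles every `doubling_time_years`."""
    return initial * 2 ** (years / doubling_time_years)

# Relative growth over a decade, starting from 1 arbitrary unit.
for year in (1, 5, 10):
    print(year, projected_size(1, year))
```

Ten doublings give roughly a thousand-fold increase, which is why storage and search, rather than sequencing itself, become the bottleneck.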
How to handle the large amount of information?
Drew Sheneman, New Jersey--The Newark Star Ledger
Answer: bioinformatics and Internet
Bioinformatics history
IBM 7090 computer
In the 1960s: the birth of bioinformatics
Margaret Oakley Dayhoff created:
• The first protein database
• The first program for sequence assembly
There is a need for computers and algorithms that allow accessing, processing, storing, sharing, retrieving, visualizing, annotating…
DNA (nucleotide sequences) databases
They are big databases and searching either one should produce similar results because they exchange information routinely.
-GenBank (NCBI): www.ncbi.nlm.nih.gov
-Arabidopsis: (TAIR) www.arabidopsis.org
Specialized databases: tissues, species…
-ESTs (Expressed Sequence Tags)
  ~at NCBI
  ~at TIGR
- ...many more!
Comparative genomics
BLAST – Basic Local Alignment Search Tool (http://www.ncbi.nlm.nih.gov/)
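BLAST itself is a fast heuristic, but the idea of a local alignment is easiest to see in the exact dynamic-programming method it approximates, Smith-Waterman. The sketch below is that swapped-in technique, not BLAST; the scoring values (match 2, mismatch -1, gap -2) and the toy sequences are arbitrary assumptions.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
    """Return (best_score, aligned_a, aligned_b) for one best local alignment."""
    rows, cols = len(a) + 1, len(b) + 1
    H = [[0] * cols for _ in range(rows)]      # score matrix, floor at 0
    best, best_pos = 0, (0, 0)
    for i in range(1, rows):
        for j in range(1, cols):
            diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            H[i][j] = max(0, diag, H[i - 1][j] + gap, H[i][j - 1] + gap)
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)
    # Trace back from the highest-scoring cell until a zero is reached.
    i, j = best_pos
    out_a, out_b = [], []
    while i > 0 and j > 0 and H[i][j] > 0:
        diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
        if H[i][j] == diag:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif H[i][j] == H[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return best, "".join(reversed(out_a)), "".join(reversed(out_b))

# Toy sequences (invented) sharing the local region ACGT.
print(smith_waterman("TTTACGTGGG", "CCACGTCC"))
```

Because scores are floored at zero, unrelated flanking sequence does not penalize the alignment, which is what "local" means in the BLAST name.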
Homologs, orthologs, paralogs
Question
You are a researcher who has tentatively identified a human homolog of a yeast gene. You determine the DNA sequence of cDNAs of both your yeast gene and the human gene and decide to compare the gene sequences, as well as the predicted protein sequence of each, using alignment software. You would expect the greatest sequence identity from comparisons of the:
a. cDNA sequences
b. Protein sequences
c. Genomic DNA sequences
d. Both (a) and (b) will give you equivalent sequence similarity
e. All will give equivalent sequence similarity
What is a microarray?
Types of Arrays
• Expression Arrays
  – cDNA
  – Genome
• Affymetrix (GeneChip®)
• Agilent
• Tiling arrays
Overview of Microarrays
Transcription Profiling of a mutant
WT
mutant
A “good” microarray plate
Red = only in treatment
Green = only in normal
Yellow = found in both
Black = found in neither
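The colour key above amounts to a simple rule on the two channel intensities of each spot. A minimal sketch of that rule, assuming the red channel carries the treatment sample and the green channel the normal sample as on the slide; the `threshold` value and the example intensities are invented for illustration:

```python
def spot_color(red, green, threshold=100.0):
    """Classify one spot from its red (treatment) and green (normal) signal.

    `threshold` is an arbitrary detection cutoff chosen for this example.
    """
    red_on = red >= threshold
    green_on = green >= threshold
    if red_on and green_on:
        return "yellow"   # transcript found in both samples
    if red_on:
        return "red"      # only in treatment
    if green_on:
        return "green"    # only in normal
    return "black"        # found in neither

# Hypothetical (red, green) intensities for four spots.
spots = {"geneA": (850, 20), "geneB": (15, 900), "geneC": (700, 650), "geneD": (10, 12)}
for name, (r, g) in spots.items():
    print(name, spot_color(r, g))
```

Real analyses usually work with the ratio of the two intensities rather than a hard on/off cutoff, so that partial changes in expression are not lost.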
Results
100’s of genes identified,
those turned on, those turned off
Expression map
red = up regulated
green = down regulated
Question
Microarray technology directly involves:
a. PCR
b. DNA sequencing
c. Hybridization
d. RFLP detection
e. None of the above
Protein – protein interactions
• ChIP (chromatin immunoprecipitation)
• Yeast two hybrid
• Bimolecular Fluorescence Complementation (BMFC)
ChIP and ChIP-chip
Yeast two hybrid
Citovsky et al., 2006
Bimolecular Fluorescence Complementation (BMFC)
Reverse genetics
• Gene knockouts
• RNAi
• Overexpression
• Altered expression
Summary
• DNA Sequencing and the rise of genomics
• Annotation of genome sequence
  – Comparative genomics
  – Functional genomics
• Protein-protein interactions
• ESTs
• Reverse genetics