Post on 21-Jan-2016
1
Genomics
Class: Molecular Biology, GIBMS 2004
Source: “Molecular Biology” by Robert F. Weaver, 2nd Edition, McGraw Hill Publishing, 2002
2
Subjects To Be Covered
Sequencing of Genomes
  The human genome project
  Vectors of large scale genome projects
  The clone-by-clone strategy
  Shotgun sequencing
  Progress in sequencing human genome
Genomics and Its Applications
  Techniques in functional genomics
  Positional cloning
  Applications of functional genomics
  Other applications
  Bioinformatics and proteomics
3
Sequencing of Genomes
1977; Fred Sanger; φX174 bacteriophage; 5,375 nt
  Concept of ORF as coding region
  Amino acid sequence of phage proteins
  Overlapping genes [Figure 24-1], only in viruses
1995; Craig Venter & Hamilton Smith
  Haemophilus influenzae (1,830,137 nt; 1st free-living organism)
  Mycoplasma genitalium (smallest free-living organism; 580,000 nt; 470 genes)
1996; Saccharomyces cerevisiae (1st eukaryote); 12,068,000 nt
1997; Escherichia coli; 4,639,221 nt; genetically more important
Many firsts followed
1999; Human chromosome 22; 53,000,000 nt
2000; Drosophila melanogaster; 180,000,000 nt
2001; Human; working draft; 3,200,000,000 nt
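The idea of an ORF as a candidate coding region can be illustrated with a small scan for start and stop codons. This is only a minimal sketch: the function name `find_orfs`, the `min_codons` threshold, and the test sequence are made up for illustration, and only the forward strand is scanned.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def find_orfs(seq, min_codons=3):
    """Return (start, end, frame) for each ATG...stop ORF on the given strand.

    'end' is the index just past the stop codon. Only the three forward
    reading frames are scanned (an illustrative simplification).
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i                      # open a candidate ORF
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None                   # close it and keep scanning
    return orfs

# Hypothetical sequence: ATG AAA TTT GGG TAA embedded in frame 2
print(find_orfs("CCATGAAATTTGGGTAACC"))
```

Overlapping genes, as in the phage genome, would appear in such a scan as ORFs in different frames whose coordinates overlap.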
4
6
Sequencing of Genomes: Human Genome Project
International project
Controversial: proposed in 1990
  Size and cost (500,000 pages just to print; time to read them?)
  Social implications, even more so
Approaches
  Systematic and conservative; Francis Collins; expected to be done by 2005
  1998; Craig Venter; Celera (VitaGenomics Taiwan); by 2000 using shotgun sequencing; needs powerful computers
Rough draft of the human genome
  Announced June 26, 2000; 3,200,000,000 nt; 85%-99% complete
8
Sequencing of Genomes: Vectors for Large-Scale Genome Projects
Vectors needed: yeast & bacterial artificial chromosomes
  Cloning capacity of a cosmid is only ~50 kb
Yeast artificial chromosome (YAC) [Fig. 24-2]
  Large capacity & self-replicating
  1,000,000+ nt capacity
  Drawbacks: inefficient; hard to isolate; unstable (linear); cryptic
Bacterial artificial chromosome (BAC) [Fig. 24-3]
  Based on F and F’ plasmids that conjugate between bacterial cells
  F plasmids can mobilize the whole host chromosome between cells after inserting into it
  300,000 nt capacity
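To see why cloning capacity matters, one can estimate the smallest number of clones that could tile a 3,200,000,000 nt genome with each vector. This is a rough sketch that assumes every insert is at full capacity and that inserts do not overlap; a real library needs several-fold more clones. The capacities are the ones quoted on the slide, and the names are illustrative.

```python
GENOME_NT = 3_200_000_000  # human genome size used on these slides

# Capacities as quoted above (nt)
CAPACITY_NT = {"cosmid": 50_000, "BAC": 300_000, "YAC": 1_000_000}

def min_clones(genome_nt, insert_nt):
    """Smallest number of non-overlapping inserts that covers the genome."""
    return -(-genome_nt // insert_nt)  # integer ceiling division

for vector, capacity in CAPACITY_NT.items():
    print(f"{vector}: at least {min_clones(GENOME_NT, capacity):,} clones")
```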
9
10
Constructed in 1992
MCS: multiple cloning site, for cloning
CmR: chloramphenicol resistance gene, for selection
12
Sequencing of Genomes: The Clone-by-Clone Strategy
Mapping (genetically & physically) the whole genome
  Use overlapping clones: the clone-by-clone sequencing strategy
  Looking for “flag posts”
Tools for mapping of genes:
  Restriction Fragment Length Polymorphisms (RFLPs) [Fig. 24-4]
    Used to determine the position/location of a gene or a stretch of DNA
    How to look for RFLPs?
  Variable Number of Tandem Repeats (VNTRs)
    Repeated sequences in tandem, derived from minisatellites
  Sequence Tagged Sites (STSs) [Fig. 24-5]
    Short (60-1000 bp) sequences detectable by PCR
  Microsatellites: repeats of very short sequences
    Highly polymorphic, thus genetic mapping is possible
    Useful in physical mapping or locating a specific sequence in the genome
13
2 individuals are polymorphic with respect to a HindIII site (in red)
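A polymorphic HindIII site of this kind can be mimicked in a few lines: digest two sequences that differ at one site and compare the fragment lengths. The sketch below uses the known HindIII recognition sequence (AAGCTT, cut after the first A); the two "individuals" are invented toy sequences, not data from the figure.

```python
HINDIII_SITE = "AAGCTT"  # HindIII cuts A^AGCTT
CUT_OFFSET = 1           # cut falls after the first base of the site

def digest(seq, site=HINDIII_SITE, cut_offset=CUT_OFFSET):
    """Return fragment lengths from a complete digest of a linear DNA."""
    seq = seq.upper()
    cuts = []
    pos = seq.find(site)
    while pos != -1:
        cuts.append(pos + cut_offset)
        pos = seq.find(site, pos + 1)
    bounds = [0] + cuts + [len(seq)]
    return [b - a for a, b in zip(bounds, bounds[1:])]

# Toy sequences: individual 2 has a single-base change that destroys the second site
individual_1 = "G" * 10 + "AAGCTT" + "C" * 20 + "AAGCTT" + "G" * 14
individual_2 = "G" * 10 + "AAGCTT" + "C" * 20 + "AAGCAT" + "G" * 14

print(digest(individual_1))  # three fragments
print(digest(individual_2))  # two fragments: the restriction pattern differs
```

The difference in fragment lengths between the two digests is the RFLP.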
14
Primers for PCR were designed from sequences of small areas of DNA that were already known
15
Sequencing of Genomes: The Clone-by-Clone Strategy
Tools for gene mapping: landmarks that relate to gene positions
Construction of a physical map with sequencing data
  Mapping with STSs [Fig. 24-6]
    Very laborious due to the sizes of the BACs
  Radiation hybrid mapping
    Ionizing radiation to create chromosome fragments
    Form hybrid cells with hamster cells
    Examine individually cloned cells
For mapping human chromosomes
  A set of landmarks or signposts is needed to relate the positions of genes
  1998: STS-based map constructed that included 30,000+ genes
16
After a number of positive BACs have been identified, one can begin mapping by screening these BACs for STSs in a sequential manner
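The logic of STS-content mapping can be sketched as follows: BACs that test positive for the same STS must overlap, and chains of overlaps form contigs. The screening results, the BAC and STS names, and both helper functions are hypothetical and serve only to illustrate the idea.

```python
# Hypothetical screening results: STSs for which each BAC tested positive
bac_sts = {
    "BAC_1": {"STS_1", "STS_2"},
    "BAC_2": {"STS_2", "STS_3", "STS_4"},
    "BAC_3": {"STS_4", "STS_5"},
    "BAC_4": {"STS_6"},
}

def overlapping_pairs(bac_sts):
    """Return {(bac_a, bac_b): shared STSs} for every pair sharing an STS."""
    names = sorted(bac_sts)
    pairs = {}
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            shared = bac_sts[a] & bac_sts[b]
            if shared:
                pairs[(a, b)] = shared
    return pairs

def contigs(bac_sts):
    """Group BACs into contigs: sets of clones linked through shared STSs."""
    remaining = set(bac_sts)
    groups = []
    while remaining:
        stack = [remaining.pop()]
        group = set(stack)
        while stack:
            current = stack.pop()
            linked = {b for b in remaining if bac_sts[b] & bac_sts[current]}
            remaining -= linked
            group |= linked
            stack.extend(linked)
        groups.append(group)
    return groups

print(overlapping_pairs(bac_sts))
print(contigs(bac_sts))
```

Here BAC_1, BAC_2 and BAC_3 form one contig, while BAC_4 shares no STS with the others and stays unplaced.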
18
Sequencing of Genomes: Shotgun Sequencing
The shotgun sequencing strategy [Fig. 24-7]
  Directly to sequencing without mapping
  1996; Craig Venter, Hamilton Smith, Leroy Hood
  500 nt/end x 2 ends x 300,000 BAC clones = 300 million nt = about 10% of the total human genome
  The 500-nt sequences are dispersed about every 5 kb
  Each acts as a sequence-tagged connector (STC) for its BAC clone
  Each of the 300,000 clones connects via STCs to about 30 other clones
  Fingerprinting of each clone
  “BAC walking”
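The STC figures can be checked with simple arithmetic. In this sketch the genome size, clone count and read length come from the slides; the average BAC insert of 150,000 nt is an assumption introduced here (it is not stated on the slide) to show where a figure of roughly 30 connecting clones could come from.

```python
GENOME_NT = 3_200_000_000
BAC_CLONES = 300_000
READ_NT = 500                    # sequenced from each end of a BAC insert
ENDS_PER_CLONE = 2
ASSUMED_BAC_INSERT_NT = 150_000  # assumption, not from the slide

stc_count = BAC_CLONES * ENDS_PER_CLONE           # number of STCs
stc_total_nt = stc_count * READ_NT                # total STC sequence
fraction = stc_total_nt / GENOME_NT               # share of the genome
spacing_nt = GENOME_NT / stc_count                # average distance between STCs
stcs_per_bac = ASSUMED_BAC_INSERT_NT / spacing_nt # STCs falling within one BAC

print(f"total STC sequence: {stc_total_nt:,} nt ({fraction:.1%} of genome)")
print(f"average STC spacing: {spacing_nt:,.0f} nt")
print(f"STCs per assumed BAC insert: {stcs_per_bac:.0f}")
```

Under these assumptions the STCs cover a little under 10% of the genome and fall roughly every 5 kb, so each BAC is linked to a few dozen others.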
19
<1> BAC library<2> Plasmid library<3> Fingerprinting<4> BAC walking Powerful computer
21
Sequencing of Genomes: Progress in Sequencing the Human Genome
Progress:
  Working draft: 90% complete with 1% error
  Final draft: as complete as possible, with less than 0.01% error (1 in 10,000)
1999; Final draft of human chromosome 22
  “Functionally complete”
  33,464,000 of the 34,491,000 nt (97.02%) were sequenced
  Error rate of 1 per 50,000 nt
  Primarily 22q
2000; Final draft of human chromosome 21
2001; Working draft of whole human chromosomes
What did we learn from chromosome 22?
  <1> still contains 11 gaps of “unclonable” and “unsequenceable” DNA
  <2> 800 genes (679 known, related & pseudogenes, 100 predicted, 225 unknown)
  <3> exons account for 3% of total length
  <4> recombination rates vary along the chromosome [Fig. 24-8]
  <5> local and long-range duplications
  <6> large regions of 22q are conserved in mouse [Fig. 24-9]
22
23
24
Sequencing of Genomes: Progress in Sequencing the Human Genome
1999; Final draft of human chromosome 22
2000; Final draft of human chromosome 21
  Involved in Down’s Syndrome (trisomy 21)
  Primarily from 21q, with a minor portion from 21p
  A total of 33,500,000+ nt were sequenced (99.7% of total length)
  Three gaps remain for which no sequence is available
  Relatively low gene density; 225 identified genes (127 known, 98 predicted)
  Total number of genes estimated in human:
    40,000 genes (based on chromosomes 21 & 22)
    30,000 genes (working draft of whole chromosomes)
  Large regions of conservation between human and mouse chromosomes
  Identity of gene(s) responsible for Down’s Syndrome still unknown
2001; Working draft of whole human chromosomes
25
Sequencing of Genomes: Progress in Sequencing the Human Genome
1999; Final draft of human chromosome 22
2000; Final draft of human chromosome 21
2001; Working draft of whole human chromosomes
  2.9 billion (Venter et al.) to 3.2 billion (Collins et al.) nt
  Gaps and inaccuracies, but nevertheless extremely informative
  25,000–40,000 genes (another 12,000 possible genes)
    Only 2x more than fruit flies
    Organism complexity is not proportional to gene number
  Expression of the human genome is more complex
    Alternative splicing? 40% of genes
    Post-translational modifications?
  Source of human genes: importation (from bacteria?)
  About 50% of the human genome came from transposon action
    All known transposons in humans are inactive now
27
Genomics and Its Applications
Structural genomics: sequencing data
What can we use the genomic DNA sequences for?
Applications:
  Study the expression of large numbers of genes: “Functional Genomics”
  Find/identify the functions of genes, especially in diseases: “Positional Cloning”
  Others
29
Genomics and Its Applications: Techniques in Functional Genomics
Blotting analysis was used in the past; it has now been miniaturized in order to study the pattern of expression of genes
DNA microarray
  0.25-1 nL (billionth of a liter) per spot [Fig. 24-10]
  5,808 DNA spots per microscope slide
DNA microchips
  Synthesize oligonucleotides directly on glass chips [Fig. 24-11]
Oligonucleotide array
  How long must an oligonucleotide be to uniquely identify a human gene in a mixture of all other human genes?
  Hybridization analysis on a DNA chip [Fig. 24-12]
  300,000 oligonucleotides in a 0.5” X 0.5” glass area
  Expression of every yeast gene has been measured at the same time
Serial analysis of gene expression (SAGE) [Fig. 24-13]
  Short cDNAs (tags) are synthesized from all mRNAs in a cell
  Tags are linked together in clones and sequenced to determine which genes are expressed
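The question of how long an oligonucleotide must be can be approached with a back-of-the-envelope estimate. The sketch below assumes a random sequence with equal base frequencies, so a given n-mer is expected about (target size / 4^n) times; it then finds the shortest n for which that expectation drops below one. Using the whole 3,200,000,000 nt genome as the target is a conservative assumption made here; real genomes are not random, so this is only a lower bound on a practical probe length.

```python
def min_unique_length(target_nt):
    """Smallest n with 4**n > target_nt (random-sequence model)."""
    n = 1
    while 4 ** n <= target_nt:
        n += 1
    return n

GENOME_NT = 3_200_000_000
n = min_unique_length(GENOME_NT)
print(n, 4 ** n)  # oligo length and the number of distinct sequences of that length
```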
30
1” X 3” glass microscopic slide with 5,808 tiny spots of DNA
31
Circle: reactive groupsRed: photosensitive blocking agentBlue: masking agent
32
Serum-starved: green (#3)Serum-stimulated: red (#2, #4)
33
35
Genomics and Its Applications: Positional Cloning
Before the genomic era
  Positional cloning is used to locate a gene responsible for a disease on the chromosome without knowing the function of its protein product
Strategies of positional cloning
  Obtain markers closely linked to the disease
  Scan regions between markers for possible genes
    Search for exons with the “exon trap” technique
    Locate “CpG islands” that tend to associate with genes
    Other tools
The Human Genome Project made the scanning much easier
36
Genomics and Its Applications: Positional Cloning
“Exon traps” or “exon amplification” technique [Fig. 24-14]
  Look for ORFs? More efficient with the “exon trap” technique
  Vector contains a chimeric gene under SV40 promoter control
  Look for exons in amplified products after cloning of cDNA
  All exons or ORFs contain splice sites and thus survive propagation in cells
Locate “CpG islands”
  Active human genes tend to associate with unmethylated CpG
  Inactive human genes mostly have methylated CpG
  HpaII recognizes only unmethylated CCGG
  HpaII will therefore cut only active genes
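The methylation sensitivity of HpaII can be modeled in a few lines. In this sketch a CCGG site is reported as cuttable only if its internal cytosine (the C of the CpG) is not in a set of methylated positions. The function name, the toy sequence and the methylation set are all illustrative assumptions.

```python
def hpaii_cut_sites(seq, methylated_positions=frozenset()):
    """Return start indices of CCGG sites that HpaII could cut.

    A site is skipped when its internal cytosine (index start + 1)
    is listed in methylated_positions.
    """
    seq = seq.upper()
    sites = []
    pos = seq.find("CCGG")
    while pos != -1:
        if pos + 1 not in methylated_positions:
            sites.append(pos)
        pos = seq.find("CCGG", pos + 1)
    return sites

toy = "ATCCGGTTACCGGA"                               # CCGG at indices 2 and 9
print(hpaii_cut_sites(toy))                          # both sites unmethylated
print(hpaii_cut_sites(toy, methylated_positions={10}))  # second site blocked
```

A cluster of cuttable sites in a region would then hint at an unmethylated CpG island, and hence at a nearby active gene.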
37
39
Genomics and Its Applications: Applications of Functional Genomics
Huntington’s Disease (“HD”)
  Progressive nerve disorder: emotional disturbances & adventitious movements
  Single dominant gene with linked RFLP identified [Fig. 24-15]
    Two (2) polymorphic sites were present in affected families
    Four (4) haplotypes or haploid genotypes were possible [Fig. 24-16]
    Which haplotype is associated with Huntington’s Disease? [Fig. 24-17]
      Answer: Haplotype “C” (those with both HindIII sites) is strongly associated with the disease
      However, this haplotype association varies with families
    RFLP can be used as a genetic marker, just like a gene
  “HD” gene was mapped to a region on chromosome 4 with repeats of CAG
    Normal individuals: 11-34 “CAG” repeats (98% have fewer than 24 repeats)
    Affected patients: >42 “CAG” repeats
Cystic fibrosis (“CF”)
40
4 haplotypes (A, B, C, D) result from the combinations of the presence or absence of the 2 HindIII sites
41
Haplotype  Site 1   Site 2   Fragments
A          Absent   Present  17.5; 3.7; 1.2
B          Absent   Absent   17.5; 4.9
C          Present  Present  15.0; 3.7; 1.2
D          Present  Absent   15.0; 4.9
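The table follows a simple rule: site 1 shortens the 17.5 fragment to 15.0, and site 2 splits the 4.9 fragment into 3.7 and 1.2. A small sketch makes the rule explicit and runs it in reverse to call a haplotype from observed fragments. The function names are hypothetical, and the fragment sizes are taken as given in the table (presumably kb).

```python
def hindiii_fragments(site1_present, site2_present):
    """Fragments detected for a given combination of the two HindIII sites."""
    fragments = [15.0] if site1_present else [17.5]
    fragments += [3.7, 1.2] if site2_present else [4.9]
    return fragments

# (site 1 present, site 2 present) for each haplotype in the table
HAPLOTYPES = {
    "A": (False, True),
    "B": (False, False),
    "C": (True, True),
    "D": (True, False),
}

def call_haplotype(observed):
    """Return the haplotype whose predicted fragments match the observed set."""
    for name, sites in HAPLOTYPES.items():
        if sorted(hindiii_fragments(*sites)) == sorted(observed):
            return name
    return None

for name, sites in HAPLOTYPES.items():
    print(name, hindiii_fragments(*sites))
print(call_haplotype([1.2, 3.7, 15.0]))
```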
42
<1> Most individuals with the “C” haplotype already have the disease<2> No disease sufferers lack the “C” haplotype
43
Genomics and Its Applications: Applications of Functional Genomics
Huntington’s Disease (“HD”)
  “HD” gene was located to a region near the end of human chromosome 4
  Identification of “HD” gene:
    Number of “CAG” repeats in a putative gene
      Normal: ranged from 11 to 34; 98% had <24
      Diseased: all have >42, and up to 100
    Prospective studies using an animal (mouse) model
  Applications:
    Genetic screening of potential patients
    Gene therapy?
    Normal function of “HD” gene (“huntingtin”)
    How the expansion of “CAG” repeats causes disease: extra glutamines in “huntingtin” protein?
Cystic fibrosis (“CF”)
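For the Huntington's Disease repeat counts above, genetic screening amounts to counting consecutive CAG units and comparing the count with the quoted ranges. The sketch below is an illustration only: the function names and the test sequence are invented, and counts that fall between the two ranges quoted on the slide are deliberately left unclassified.

```python
import re

def longest_cag_run(seq):
    """Number of repeats in the longest uninterrupted run of CAG."""
    runs = re.findall(r"(?:CAG)+", seq.upper())
    return max((len(run) // 3 for run in runs), default=0)

def classify(repeats):
    """Compare a repeat count with the ranges quoted on the slide."""
    if 11 <= repeats <= 34:
        return "normal range"
    if repeats > 42:
        return "affected range"
    return "outside the ranges given on the slide"

toy_allele = "ATG" + "CAG" * 20 + "CCG"   # hypothetical sequence
count = longest_cag_run(toy_allele)
print(count, classify(count))
```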
44
Genomics and Its Applications: Applications of Functional Genomics
Huntington’s Disease (“HD”)
Cystic fibrosis (“CF”)
  Most common “lethal” genetic disease affecting Caucasians
  Autosomal-recessive mutation; carrier rate is 1/20
  Affects the secretory epithelia in 1/1,600 live births
  Accumulation of mucus leads to infections
  Linkage to known markers was established on 7q31
  Positional cloning & “chromosome walking” followed [Fig. 24-18]
  Unclonable region: “chromosomal jumping” (over unclonable regions) [Fig. 24-19]
  “CF” gene spans 250 kb of DNA and includes at least 24 exons
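The 1/1,600 figure is consistent with the 1/20 carrier rate for a recessive disease: both parents must be carriers, and a child of two carriers has a 1 in 4 chance of inheriting both mutant alleles. The short check below assumes random mating and ignores affected parents; the variable names are illustrative.

```python
from fractions import Fraction

carrier_rate = Fraction(1, 20)      # from the slide
p_affected_child = Fraction(1, 4)   # two carrier parents, recessive inheritance

# Probability that both parents are carriers AND the child is affected
incidence = carrier_rate * carrier_rate * p_affected_child
print(incidence)
```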
45
46
47
Genomics and Its Applications: Applications of Functional Genomics
Huntington’s Disease (“HD”)
Cystic fibrosis (“CF”)
  Identification & authentication of “CF” gene
    <1> expressed in all tissues affected by CF
    <2> gene product contains a membrane-spanning domain
      regulates a channel for ions across the membrane
      CFTR: cystic fibrosis transmembrane conductance regulator
    <3> most CF patients have a 3-bp deletion in the “CFTR” gene
      a phenylalanine is missing
  Applications: transgenic animal model
  Applications: gene therapy; CFTR protein as drug
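Why a 3-bp deletion removes exactly one amino acid, rather than scrambling the rest of the protein, can be shown by translating a short stretch before and after the deletion. The DNA fragment below is an illustrative stand-in for the region around the missing phenylalanine and is not taken from the slides; the codon table covers only the codons used here.

```python
# Partial codon table: only the codons that occur in this example
CODONS = {"ATC": "I", "ATT": "I", "TTT": "F", "GGT": "G", "GTT": "V"}

def translate(seq):
    """Translate a DNA string in frame 0 using the partial table above."""
    return "".join(CODONS[seq[i:i + 3]] for i in range(0, len(seq) - 2, 3))

def delete(seq, start, length=3):
    """Remove 'length' bases beginning at index 'start'."""
    return seq[:start] + seq[start + length:]

normal = "ATCATCTTTGGTGTT"   # illustrative fragment: Ile-Ile-Phe-Gly-Val
mutant = delete(normal, 5)   # removes the three bases "CTT"

print(translate(normal))
print(translate(mutant))
```

Because three bases are removed, the reading frame downstream is preserved: the phenylalanine disappears while the flanking residues are unchanged.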
49
Genomics and Its Applications: Other Applications
Post-genomic era:
  Single Nucleotide Polymorphisms (SNPs)
    SNPs could be linked to human diseases
    Associations with:
      polygenic traits, such as intelligence
      responses to drugs: pharmacogenomics
    The vast majority of SNPs are located outside genes
    Similarities and differences between RFLPs and SNPs in humans
  Testing the function of each and every gene in microorganisms
    intentional and targeted mutation
  Protein-protein interactions and activities of gene products
    yeast two-hybrid system
51
Genomics and Its Applications: Bioinformatics & Proteomics
To access, analyze and interpret sequences in databases
Bioinformatics:
  Combines biology & computerized data-processing knowledge
  Building and manipulating biological databases
Proteomics
  Gene → genome, genomics
  Transcripts → transcriptome, transcriptomics
  Protein → proteome, proteomics
  Separation of proteins: 2-D PAGE
  Analysis of proteins: mass spectrometry [Fig. 24-20]
  Protein (antibody) microchips
52
Matrix-assisted laser desorption-ionization time-of-flight (MALDI-TOF) mass spectrometry