co 10. genome: the entire collection of genes encoded by a particular organism. determination of a...

Upload: lynne-singleton

Post on 28-Dec-2015

TRANSCRIPT

Page 1: CO 10. Genome: The entire collection of genes encoded by a particular organism. Determination of a entire genome sequence is a prerequisite to understanding

CO 10

Genome:

The entire collection of genes encoded by a particular organism.

Determination of an entire genome sequence is a prerequisite to understanding the complete biology of an organism.

Genomics:

Structural: construction of sequence data and gene maps.

Functional: functions of genes, their regulation, and their products.

Comparative: comparison of genes from different genomes to elucidate functional and evolutionary relationships.

History

1990: International Human Genome Project begins.

1. To generate physical, genetic, and sequence maps of the human genome.

2. To sequence the genomes of a variety of model organisms.

3. To develop improved technologies for mapping and sequencing.

4. To develop computational tools for capturing, storing, analyzing, displaying, and distributing map and sequence information.

5. To sequence EST (expressed-sequence tag) fragments of cDNA, and eventually full-length cDNAs, in different cell types of humans and mice.

6. To consider the ethical, social, and legal challenges posed by genomic information.

Fig. 10.1

What is in this chapter?

• Challenges and strategies of genome analysis

• Major insights emerging from complete genome sequences

• High-throughput tools for analyzing genomes and their products

Table 10.1

The genomes of living organisms vary enormously in size

Challenges and strategies of genome analysis

Sequences and polymorphisms

• Sequence error rate: 1% per sequence read

• Error rate of good genomic sequence: 1/10,000

• Polymorphisms: 1/500 bp

• Repeated sequences may be hard to place

• Unclonable DNA cannot be sequenced

Fig. 10.2

A divide and conquer strategy

10-fold sequence coverage

Sequencing every chromosomal region from 10 independent inserts can bring the error rate below 1/10,000.

Random sequence error: appears in 1 of 10 sequence fragments

Polymorphism: appears in 5 of 10 sequence fragments
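
As a rough check on why 10-fold coverage helps, the sketch below uses a simple binomial model. It assumes that each read is wrong at a given position independently, with the 1% per-read error rate quoted earlier; the function name `prob_at_least` and the independence assumption are illustrative and do not come from the slides.

```python
from math import comb

def prob_at_least(k, n, p):
    """Probability that at least k of n independent reads are wrong at one position."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# Chance that half or more of 10 reads are erroneous at the same position.
p_half_wrong = prob_at_least(5, 10, 0.01)
print(p_half_wrong < 1 / 10_000)  # True
```

Under this model, a variant seen in about half of the reads is far more plausibly a polymorphism than a random sequencing error, which shows up in roughly one read.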

Major techniques in genome characterization

Cloning

Hybridization

PCR amplification

Sequencing

Computational tools

Three types of maps used in the analysis of the human genome

• Linkage map (DNA markers)

• Physical map (divide and conquer)

• Sequence map

Human genome: 3 × 10^9 bp

Fig. 10.3

The making of large-scale linkage maps

Two common types of polymorphisms used for mapping

DNA markers

(expand or contract during replication)

Genomewide identification of genetic markers

Identification of SSRs (simple sequence repeats) with specific pairs of PCR primers

Human Linkage Map

• 20,000 SSRs, 4 million SNPs.

Fig. 10.4

Physical Maps

Overlapping DNA fragments that are ordered and oriented and span each of the chromosomes in a genome

The molecular counterparts of linkage maps

In humans: 1 cM = 1 Mb

In mice: 1 cM = 2 Mb
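
The centimorgan-to-megabase equivalences on this slide can be applied as a simple conversion. This is a minimal sketch that treats them as genome-wide averages; the names `MB_PER_CM` and `cm_to_mb` are hypothetical, and real recombination rates vary along a chromosome, so the result is only an approximation.

```python
# Average physical distance per centimorgan, taken from the slide.
MB_PER_CM = {"human": 1.0, "mouse": 2.0}

def cm_to_mb(distance_cm, species):
    """Convert a genetic distance (cM) to an approximate physical distance (Mb)."""
    return distance_cm * MB_PER_CM[species]

print(cm_to_mb(5, "human"))  # 5.0
print(cm_to_mb(5, "mouse"))  # 10.0
```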

How to build long-range physical maps:

Bottom-up and Top-Down approaches

A hypothetical physical map generated by the analysis of sequence-tagged sites

STS: sequence-tagged site

Fig. 10.5

Dark band: gene poor, AT rich

Light band: gene rich, GC rich

metaphase

Chromosome 7 at three levels of banding resolution

Fig. 10.6

FISH (fluorescent in situ hybridization)

Advantages of FISH compared to linkage mapping

1. All clones can be mapped by FISH, but only those that detect polymorphisms can be mapped by linkage analysis.

2. FISH can be done on any cloned locus in isolation, but linkage requires the analysis of one locus in relation to another.

3. FISH requires only a single sample, whereas linkage requires genotype information from a large cohort of individuals.

Disadvantage: low resolution (4-8 Mb)

A sequencing map is the highest-resolution genomic map

Hierarchical shotgun approach

Whole-genome shotgun approach

Fig. 10.12

Hierarchical shotgun approach

Minimally overlapping BACs

10× coverage across the BAC insert

200 kb × 10 / 2 kb = 1000

Fig. 10.13

Whole genome shotgun approach

10-fold sequence coverage

3 × 10^9 × 6 / 2000
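
The fragment-count arithmetic on the two shotgun slides follows one pattern: target length times coverage, divided by fragment length. The sketch below reproduces both expressions as written on the slides, including the factor of 6 in the whole-genome case; the function name `fragments_needed` is hypothetical, and reading the 2 kb figure as a fragment length is an assumption.

```python
def fragments_needed(target_bp, coverage, fragment_bp):
    """Fragments of length fragment_bp needed to cover target_bp `coverage` times."""
    return -(-target_bp * coverage // fragment_bp)  # ceiling division on integers

# Hierarchical shotgun: one 200 kb BAC insert at 10x coverage, 2 kb fragments.
print(fragments_needed(200_000, 10, 2_000))    # 1000

# Whole-genome shotgun: the slide's expression 3 x 10^9 x 6 / 2000.
print(fragments_needed(3 * 10**9, 6, 2_000))   # 9000000
```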

Whole genome shotgun approach

Advantage: no physical map needs to be constructed.

Disadvantage: some genomic sequences cannot be cloned.

The Human Genome Project has changed the practice of biology, genetics, and genomics

Gene finding and gene-function analyses:

• Through comparative genomics, sequence homology facilitates the identification of genes and gene functions in a second genome.

• Genes often encode one or more protein domains. This information provides insight into the functions of a protein.

Fig. 10.14

Fig. 10.15

Syntenic blocks

Major insights from the human and model organism genome sequences

1. There are approximately 30,000 human genes.

2. Genes encode either noncoding RNAs or proteins.

Noncoding RNAs: tRNA, rRNA, snoRNA (small nucleolar RNA), snRNA (small nuclear RNA)

3. Higher complexity of the human proteome: more genes, more paralogs, and alternative splicing.

Homologous genes: genes with enough sequence similarity to be evolutionarily related.

Orthologous genes: genes in two different species, identified by their sequence similarity, that arose from the same gene in the two species' common ancestor.

Paralogous genes: genes that arise by duplication within the same species.

4. More domain architectures:

5. Chemical modification of proteins

• 400 different chemical modifications

• 1 million different proteins

6. Repeated sequences constitute more than 50% of the human genome.

Transposon-derived repeats, pseudogenes, or simple sequence repeats

7. The genome contains distinct types of gene organization

A). Gene family: multiple related genes

Olfactory gene family (1000 genes), histones, hemoglobins

Fig. 10.19 Olfactory receptor gene family

1. One gene undergoes duplication to generate 20 paralogs.

2. Massive duplication created 30 sites of the original 20-paralog family.

Fig. 10.20

B). Gene-rich regions

60 genes/700 kb; 70% of the DNA is transcribed

C). Gene deserts

82 gene deserts: no identifiable gene within a megabase

Fig. 10.21

Combinatorial strategies may amplify genetic information and generate diversity at the DNA level

Antibody or T-cell receptor genes: VDJ recombination

Fig. 10.22

Combinatorial strategies may amplify genetic information and generate diversity at the RNA level

High-throughput genomic and proteomic platforms permit the global analysis of gene products

Fig. 10.23

Sanger sequencing scheme

DNA arrays

Macroarray: cDNA on a nylon membrane

Microarray: PCR-amplified products on a glass slide

Oligonucleotide array: chemically synthesized oligonucleotides, 20-60 nt of DNA or RNA

Fig. 10.25

Two-color DNA microarray: normal vs. tumor
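
Two-color experiments of this kind are commonly summarized as a log2 ratio of the two channel intensities for each spot. The slide does not give a formula, so the sketch below is an assumption: the function name `log2_ratio` and the intensity values are hypothetical, and background correction and normalization are left out.

```python
from math import log2

def log2_ratio(tumor_signal, normal_signal):
    """Log2 fold change of the tumor channel relative to the normal channel."""
    return log2(tumor_signal / normal_signal)

print(log2_ratio(200.0, 100.0))  # 1.0  (twofold higher in tumor)
print(log2_ratio(50.0, 100.0))   # -1.0 (twofold lower in tumor)
```

On this scale, equal expression in both samples gives 0, and increases and decreases of the same magnitude are symmetric around it.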

Fig. 10.27

Protein analyses

Mass/charge ratios

Fig. 10.28

MPSS (massively parallel signature sequencing): a method to identify the transcriptome

Fig. 10.31

Protein-protein interaction: affinity purification and mass spectrometry

Fig. 10.32

The yeast two-hybrid

Systems Biology

Global study of multiple components of biological systems and their simultaneous interactions

Systems Biology approaches

1. Formulate a computer-based model based on current understanding.

2. Define as many of the system's elements as possible by discovery science.

3. Perturb the system either genetically or environmentally and measure changes.

Fig. 10.33

Perturb the system and measure changes

Fig. 10.34

Fig. 10.35

4. Integrate the biological information, and compare these data against the predictions of the model.

5. Formulate hypotheses to explain disparities between experimental data and the model, and use these hypotheses as the basis for a second round of perturbations.

6. Refine the model until model and experiment are in accord with one another.
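
The perturb, measure, compare, and refine cycle in steps 3 to 6 can be illustrated with a toy loop. This is only a sketch under strong assumptions: the "system" is a hypothetical function `measure`, the model has a single parameter `gain`, and the least-squares update stands in for formulating new hypotheses. None of these names come from the slides.

```python
def measure(dose):
    """Hypothetical stand-in for an experimental measurement after a perturbation."""
    return 2.0 * dose

def refine(gain=1.0, doses=(1.0, 2.0, 3.0), tolerance=1e-9, max_rounds=20):
    """Iterate perturb -> measure -> compare -> refine for a one-parameter model."""
    for round_no in range(1, max_rounds + 1):
        # Steps 3-4: perturb, measure, and compare with the model's prediction.
        residuals = [measure(d) - gain * d for d in doses]
        if max(abs(r) for r in residuals) < tolerance:
            return gain, round_no  # Step 6: model and experiment are in accord.
        # Step 5: revise the model (least-squares update of the single parameter).
        gain += sum(r * d for r, d in zip(residuals, doses)) / sum(d * d for d in doses)
    return gain, max_rounds

gain, rounds = refine()
print(gain, rounds)
```

A real systems biology model has many interacting parameters and noisy data, so each round would involve new experiments rather than a closed-form update.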

TABLES

Table 10.2a

Table 10.2b

Table 10.3

Fig. 10.9b

Fig. 10.11

Fig. 10.16

Fig. 10.26

Fig. 10.36

Fig. 10.29

Fig. 10.30

Fig. 10.7

Basic procedures in building a whole chromosome physical map