Genomics lecture 3


Upload: iainj88

Post on 11-May-2015

DESCRIPTION

Background to genomics - based on the C. elegans genome project.

TRANSCRIPT

Page 1: Genomics lecture 3

>CEK06A5acaagagagggcgcctcggccgtatgttgaatgggagatcgatggaaccgagacaacgagaaaaggaatagagacggagaaagagagagagagcgcgcgttgttggaaggatgaaaaagaaaaaagacatgagctgcttcacaagagcttggcgaaagcaaagggcaaagtgttgacagcttagtggtggtagttggatcttctctcctcgttctctgctcacaactcgtctatcactcatatcacatttatttcccaatatcattttaacaacatcttccgatgcatgttcgtcaatattgcgcaaccactttgcaatattgtcaaaacttttcgcatttgtgatatcgtaaaccagcataattcccattgctccgcggtaatatgatgttgtgattgtgtggaatcgttcttgtccagctgtgtcccagatttgtaatttaatcttttttccttttaattcgatagttttaattttgaagtcgattcctgaatgaaaaaagaaaattattttgaaatcactagattctgaataaaaactaaccaatagttgagatgaatgtggtgttaaaggcatcatccgaaaatctgtacagaatgcaagtttttccaactcctgagtcgcctattagcagcaatttgaagagcatgtcatacggtcggcgagccatttttcttctgaaatgagaaaaagttgagaactaaagttgcacaaaagtaagagaaaagcacttgagtcatggcaaatagaacgaacactttgagatttcgaagaagttatcaagagttgacaattggaagatatttggaagaactttctaatttttttctagttttccaaaattaggtttttgtcataaaatgttgtcaaagaaaaaacaggacaaaatagttaattgttgtttccattataacaaaaaaaaatttgaacggagctattaacgcgtgcatgcgcaaatcacatcgattagctgtttctgggaaattctcgggaaaaggtgaacagcagctgctggcttcctctgcgggtcacgaaaacacaaagagatcattataattgttatttggaaaggaagcgaatctaaaacgggtacaggtggacgtttattgatcgaaagtgctttttatttgaaattgaatggtgaactttgcaattttgtaatgcaaagtacgttatcagatggcatgagatgtgtgaagtgataaggaataaaatgtgaacgacatgttcaagaaactgtgatttttcaataatttgtgatgaaatattttaggaacagaaatgaacatattaattgatataaaaacaataggaacactaactcataattatgataggtgaatatcaaaatgtgctagattttttgaagttaaaaaatacatttctaatattttttcaaataataagtttcagctgaaatttcagggtgatttcagaaagctatgttttgataaattgttttgaaaattaaaagaagctacagcaaaaaaaaattaaagagaacatcgctccctcgtagtgtataatttttgattatcgaaaaaaatgagtcaatgatgaaaaggaagtcgcaatctcaaaacttcaaaaatcaaaagaagccgttgcctctgtcatcaaaaattcagaagacaaggttgttgacaagggtcaattctcagtggtggagggcattgggcgtggtgaaatttttgaaggctagtgtggttggacctctactagatagacaaaacccccgaaatagacgtttaatttgatgagatggtggagaaagaaaaggactcattctctagatgatagagagaccagagatacagacaagagagggcgcctcggccgtatgttgaatgggagatcgatggaaccgagacaacgagaaaaggaatagagacggagaaagagagagagagcgcgcgttgttggaaggatgaaaaagaaaaaagacatgagctgcttcacaagagcttggcgaaagcaaagggcaaagtgtt
gacagcttagtggtggtagttggatcatgtgtttttatgtttccggtgggagaaggttcaacaaaaaatgaaaagaaaaagttcaagcggcatgaatcattctgagtttaaaacaaaattattgcgaaaattaatattaaaaccttttcacaaaacttcaagctaatctgttcatgaaaatttgaataatagttttttcccacctatttagaattaacttcatattaacgaaattaattaacgaatcgaaaattatgacttttcagaatcatctgaagttttttcacattccatgctgcatggaataatttgatcctggaatcgatatgtttttatggtatactttttaaccttcaatttagctggaaaagtatggaataaataattcccgaagctatgtacatatatgtagaattattgaatgattgtgagaacaacttgactttagcttgagtaggaatcggaatggctatcgaccgatcaacacttaggattgtaagaatggcagtaagaatatattgaagaaagaatgtttgttcataggaagagaaagagtattgcgaaatcatcatcgcccactttagaatggacgggcggtgagcggacatagagaattgtgaatgactaatgcttttgcagaatctagggcaaaatcgtaggaacaaacaattgtaatacggagaaaacaatcatatcgatcgatgatcatggagaaaaatgtgatttaagtgagtagacttggaaaaattaataaaagcatgaattgtcgatatttttcatttattttcattataaagctctttaaaaacaaattaaatattgagaatggcttcgaagaatattgtttcaaatatgttcaatggtgacaccttgcggataaaattaatgtaaaaatcatggaacacagattcactgatatctcattatctcaagcagtgtaattagagattttttggaacaattattttataaaactataaataaaccgtttatactactcaaagccaaatattcaagctattaccattttttttctaactaattcttgagcaattaaagtattccccagtttttattttgcaacgactccaggcaaacacgctccgttgcacttgccgccaaggcgttgcattcaaatcagagagacatctcattccgatttctgtttttcttccaataaacggtattttatgcctaatgggtgatacggaaattgttcctcttcgagtacaaaatgtacttgatagcgaaatcattcgtctcaacttgtggtccatgaaggtaactgtctagtttttttaagttttcatgatttcaatatttttacagtttaacgcgaccagtttcaaactcgaaggttttgtgagaaatgaagaaggcactatgatgcagaaagtttgttccgaatttatttgtgtaagtcgagaaacatattcgtcaacaattttcattaaatattcagagacgcttcacttctacgttgcttttcgatgtttccggacgtttcttcgacttggtcggacagattgatcgggaatatcaacaaaaaatgggaatgcctagtagaattattgatgaattttcaaatggaattcctgaaaattgggccgaccttatctattcctgcatgtcagccaaccaaagaagcgcacttcgccctatccaacaggctccaaaagaaccaattagaactagaacagaaccaattgttacgttggcagatgaaaccgagctaactggaggatgccagaaaaattccgaaaacgagaaagaaaggaacagacgtgagcgtgaagaacagcaaacaaaggaacgtgagagaagattagaagaagaaaaacaacgacgagatgctgaagctgaggctgaaagaaggcgaaaagaagaggaagagctggaagaagctaattacacccttcgtgctccgaaatctcagaacggcgagccaatcactccgataaga

C. elegans cosmid K06A5, 24,323 bp. Flat sequence file: 3,955 bp shown.
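A flat sequence file like the one above can be read with a few lines of code. This is a minimal sketch, assuming FASTA-style input (a `>` header line followed by sequence lines). The function names `read_fasta` and `at_fraction` are illustrative and not part of the lecture; the example record holds only the first 30 bases of the cosmid sequence shown.

```python
def read_fasta(text):
    """Parse FASTA-formatted text into a {header: sequence} dictionary."""
    records = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:].strip()
            records[header] = []
        elif header is not None:
            # Drop any spaces used to group bases and normalise the case.
            records[header].append(line.replace(" ", "").lower())
    return {name: "".join(parts) for name, parts in records.items()}


def at_fraction(seq):
    """Fraction of bases that are a or t (0.0 for an empty sequence)."""
    if not seq:
        return 0.0
    return (seq.count("a") + seq.count("t")) / len(seq)


# First 30 bases of the cosmid sequence, split over lines for illustration.
example = ">CEK06A5\nacaagagagg gcgcctcggc\ncgtatgttga\n"
records = read_fasta(example)
sequence = records["CEK06A5"]
print(len(sequence), at_fraction(sequence))
```

The same two functions could be pointed at the full 24,323 bp file to get its length and overall AT content.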

Page 2: Genomics lecture 3

Genome sequence of C. elegans.

Sequence of entire genome.

Sequence of cDNA clones.

Approximately 19,500 PREDICTED protein coding gene sequences.

Large number of functional RNAs of various kinds (not discussed further).

For this lecture: focus on predicted proteins.

Gene prediction? How?

Science, December 1998.

Page 3: Genomics lecture 3

Computer based predictions

GENEFINDER (C. elegans), BLAST (all genomes) and other computer programs.

Biases in coding sequence - in C. elegans non-coding sequence is AT rich.

Splice site signals, initiator methionines, termination codons.

Likely exons and probable/possible splice patterns.

BLAST: compare the translation of all 6 reading frames.
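Translating all 6 reading frames means reading the DNA in codons from three offsets on each strand. The following is a minimal sketch, assuming the standard genetic code and lower-case DNA; the function names are illustrative, and real tools also handle ambiguous bases and alternative genetic codes.

```python
from itertools import product

# Standard genetic code, codons ordered with bases t, c, a, g.
BASES = "tcag"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

COMPLEMENT = str.maketrans("acgt", "tgca")


def reverse_complement(dna):
    """Return the opposite strand, written 5' to 3'."""
    return dna.translate(COMPLEMENT)[::-1]


def translate(dna):
    """Translate complete codons only; unknown codons become 'X'."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frame_translations(dna):
    """Return the translations of frames +1, +2, +3, -1, -2 and -3."""
    dna = dna.lower()
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames


frames = six_frame_translations("atggcctaa")
for name, protein in frames.items():
    print(name, protein)
```

Each of the six protein strings can then be compared against a protein database, which is how a coding region can be found without knowing the strand or frame in advance.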

Evidence that a prediction is correct?

• Homology with genes in other organisms - homologues.
• Known protein families.
• Experimental evidence.

Page 4: Genomics lecture 3

The Basic Local Alignment Search Tool (BLAST) finds regions of local similarity between sequences.

The program compares nucleotide or protein sequences to sequence databases and calculates the statistical significance of matches.

http://www.ncbi.nlm.nih.gov/ The National Center for Biotechnology Information (NCBI), the U.S. National Library of Medicine.

mqnpmillifclfcavicsrgtdsdiphef

How does BLAST work?

BLAST compares small sequential blocks, or WINDOWS, of sequence against massive databases. It looks for regions of similarity and scores them.
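The window idea can be illustrated with a toy search. This is a simplified sketch, not the BLAST algorithm itself: it only finds exact matches between short windows, whereas BLAST also scores near-matches and extends hits into longer alignments. The window size of 3 and the subject sequence are assumptions made for illustration; the query is the start of the protein shown on the slide.

```python
def words(seq, size=3):
    """Every overlapping window of `size` residues, with its start position."""
    return [(i, seq[i:i + size]) for i in range(len(seq) - size + 1)]


def seed_hits(query, subject, size=3):
    """Return (query_pos, subject_pos) pairs where a window matches exactly."""
    index = {}
    for pos, word in words(subject, size):
        index.setdefault(word, []).append(pos)
    return [
        (q_pos, s_pos)
        for q_pos, word in words(query, size)
        for s_pos in index.get(word, [])
    ]


query = "mqnpmillif"      # first 10 residues of the slide's protein
subject = "aaanpmilkk"    # HYPOTHETICAL database sequence
hits = seed_hits(query, subject)
print(hits)
```

Hits that lie on the same diagonal (the same offset between query and subject position) point to one local region of similarity.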

[Figure: the protein sequence above, in single letter code, divided into search windows.]

Page 5: Genomics lecture 3

More BLAST

[Figure: a large protein with conserved regions (high similarity BLAST score) and non-conserved regions (low similarity BLAST score).]

Small windows of comparison - detect LOCAL regions of similarity.

Output - % identity and % similarity (the latter permits conservative substitutions of amino acids).

Gives overall score and probability of relatedness.

If the entire protein sequence were compared in one go, you might get a relatively low overall similarity.

How did genes and gene families evolve and what is meant by protein domains?

We need to come back to this: remember the question!
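The difference between % identity and % similarity can be shown on a pair of aligned sequences. This is a minimal sketch under stated assumptions: the groups of "conservative" amino acids below are a rough illustrative grouping (BLAST itself decides similarity from a substitution matrix), and alignment columns containing a gap are simply ignored.

```python
# ASSUMPTION: a rough grouping of chemically similar amino acids.
CONSERVATIVE_GROUPS = [
    set("ILVM"), set("FYW"), set("KRH"), set("DE"), set("ST"), set("NQ"),
]


def is_similar(x, y):
    """True if two residues are identical or share a conservative group."""
    return x == y or any(x in g and y in g for g in CONSERVATIVE_GROUPS)


def identity_and_similarity(a, b):
    """Return (% identity, % similarity) for two aligned, equal-length strings."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have the same length")
    pairs = [(x, y) for x, y in zip(a.upper(), b.upper()) if "-" not in (x, y)]
    if not pairs:
        raise ValueError("no aligned residues to compare")
    identical = sum(x == y for x, y in pairs)
    similar = sum(is_similar(x, y) for x, y in pairs)
    return 100 * identical / len(pairs), 100 * similar / len(pairs)


# Toy alignment: M=M, I~L, L=L, K~R
print(identity_and_similarity("MILK", "MLLR"))
```

In the toy alignment two of four positions are identical and the other two are conservative substitutions, so similarity is higher than identity.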

Page 6: Genomics lecture 3

HOMEWORK

BLAST is one of the powerful computational tools for comparative genomics.

Below is the sequence of a protein:

mqnpmillif clfcavicsr gtdsdiphef hkmlkhaksl nsllrdlhvi yspemtnrhv
ektdkhgaal slksgsmsaq rivsiqnisd demdgytlfh lqsmkdikqg ndtcnlqsvc
vpipqlsddp qvlmypkcye vkqcvgsccn svetchpgti nlvkkhvael lyigngrfmf
nmtkeitmee htscscfdcg sntpqcapgf vvgrsctcec ankeernncv gnatwnaetc
kcecdlkcee gkilhkdrcd cvrrrqhhgg prghhghrhh hrsrpidtee vqkigqlkvg
rigg

Go to NCBI http://www.ncbi.nlm.nih.gov/

Go to BLAST, then look down the left for “Choose a BLAST program to run”.

From within that section, select “protein blast”.

Copy the protein sequence above and paste it into the box on the top left of the web page.

Scroll down the page and click the big blue BLAST button.

Have a look at the outcome. Any questions: post to the Forum on Moodle.

Page 7: Genomics lecture 3

Expressed sequence tags (ESTs) – cDNA clones.

To make cDNA, mRNA is copied into DNA with reverse transcriptase.

RNA → DNA
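At the sequence level, the RNA → DNA step amounts to writing the complementary DNA strand. This is a minimal sketch that treats sequences as plain strings; the function names and the short mRNA are illustrative, and priming and enzyme behaviour are ignored.

```python
RNA_TO_CDNA = str.maketrans("ACGU", "TGCA")
DNA_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def first_strand_cdna(mrna):
    """DNA complementary to the mRNA, written 5' to 3'."""
    return mrna.upper().translate(RNA_TO_CDNA)[::-1]


def second_strand_cdna(first_strand):
    """DNA complementary to the first strand: the mRNA sequence with T for U."""
    return first_strand.translate(DNA_COMPLEMENT)[::-1]


mrna = "AUGGCCAAAAAA"                  # toy mRNA with a short poly-A tail
first = first_strand_cdna(mrna)        # begins with the T run that pairs with the tail
second = second_strand_cdna(first)
print(first, second)
```

The first strand starts with a run of T opposite the poly-A tail, and the second strand restores the message sequence as DNA.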

“The Central Dogma” of Molecular Biology

DNA → mRNA → Protein

Retroviruses (e.g. HIV).

RNA genome → DNA → integration → mRNA → protein

Computational biology is mostly predictive – not EXPERIMENTAL

Let's look at simple experimental evidence for the existence of genes.

Page 8: Genomics lecture 3

Making cDNA

Typical eukaryotic gene - double stranded DNA, with exons and introns.

1. RNA polymerase: primary transcript - single sense strand RNA - introns present (5’ to 3’OH).

2. Capping, splicing, poly-adenylation: messenger RNA (mRNA) with 5’ CAP and AAAAAAAAAAA 3’OH tail.

3. First strand cDNA synthesis - reverse transcriptase, primed by an OH-TTTTTTTT-5’ DNA primer, giving an RNA/cDNA duplex.

4. Second strand cDNA - DNA polymerase, giving double stranded cDNA.

Page 9: Genomics lecture 3

EST sequencing was carried out in parallel to genome sequencing.

Simplest experimental evidence that a bit of genomic DNA contains a gene.

Making cDNA

cDNA synthesis by oligo dT priming: an OH-TTTTTTTT-5’ DNA primer anneals to the AAAAAAAAAAA 3’OH tail of the messenger RNA (mRNA).

cDNA synthesis by random priming: OH-NNNNNNNNN-5’ DNA primers (random 6-mers or 9-mers) anneal along the mRNA.

The advantage of random priming is that cDNA clones are not biased towards the 3’ end of the gene.

Page 10: Genomics lecture 3

EST sequences

[Figure: a typical eukaryotic gene (double stranded DNA) with EST 1, EST 2, EST 3 and EST 4 aligned to it.]

Sequence data from random primed cDNA: ESTs (expressed sequence tags).

The sequencing of ESTs uncovered frequent examples of differential splicing.

Common examples include exon skipping (figure above), alternative 5’ exons, alternative splicing that alters stop codons, genes within genes, etc.

The above is true for C. elegans, humans, flies, and many other species.
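Exon skipping can be pictured as joining a different subset of exons from the same primary transcript. This is a toy sketch: the transcript, the exon coordinates and the `splice` helper are all hypothetical, chosen only so that the two isoforms are easy to read.

```python
def splice(pre_mrna, exons, skip=()):
    """Join exon segments of a primary transcript.

    exons: list of (start, end) pairs, 0-based, end exclusive.
    skip:  indices of exons to leave out (exon skipping).
    """
    return "".join(
        pre_mrna[start:end]
        for i, (start, end) in enumerate(exons)
        if i not in skip
    )


# HYPOTHETICAL transcript: exons in upper case, introns in lower case.
pre_mrna = "AAAA" + "gtccag" + "CCCC" + "gtttag" + "GGGG"
exons = [(0, 4), (10, 14), (20, 24)]

full_length = splice(pre_mrna, exons)            # all three exons
skipped = splice(pre_mrna, exons, skip={1})      # middle exon skipped
print(full_length, skipped)
```

ESTs derived from the two isoforms would differ exactly where the skipped exon is missing, which is how sequencing ESTs reveals differential splicing.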

Page 11: Genomics lecture 3

• C. elegans EST data from approximately 50,000 cDNA clones.
• Identified 9,356 different genes.

1. Grind up thousands of worms.
2. Prepare mRNA, convert to cDNA with reverse transcriptase, clone in plasmid.
3. Some mRNAs exist at extremely low levels of abundance.
4. Low abundance cDNAs may be impossible to clone randomly.

Page 12: Genomics lecture 3

Reverse transcriptase PCR – very sensitive.

cDNA from mRNA using reverse transcriptase.

Amplify cDNA by PCR – primers designed from predicted genes.

Clone and analyse products.
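Designing primers from a predicted gene can be sketched as taking the two ends of the predicted sequence. This is a simplified illustration only: the function name, the primer length and the toy sequence are assumptions, and real primer design also weighs factors such as melting temperature and specificity.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def primer_pair(target, length=20):
    """Return (forward, reverse) primers flanking a predicted sequence.

    The forward primer copies the start of the target; the reverse primer is
    the reverse complement of its end, so both are written 5' to 3'.
    """
    target = target.upper()
    if len(target) < 2 * length:
        raise ValueError("target is too short for two non-overlapping primers")
    forward = target[:length]
    reverse = target[-length:].translate(COMPLEMENT)[::-1]
    return forward, reverse


# HYPOTHETICAL predicted coding sequence, with very short primers for clarity.
forward, reverse = primer_pair("ATGGCCAAATTTGGGCCC", length=4)
print(forward, reverse)
```

A product of the expected size from cDNA would support the prediction that the region is expressed.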

Experimentally confirmed genes raised to > 18,000.

Full length cDNA: valuable for confirming intron/exon structure.

[Figure: a gene and its mRNA (with poly-A tail), with PCR primers A and B.]

Page 13: Genomics lecture 3

Summary of predicted and known gene sequences in C. elegans

1. Predicted 19,500 genes.

2. At least 18,000 expressed as RNA.

3. Average of 1 gene per 5 kb.

4. ~ 42% have detectable homologies to genes/proteins outside Nematoda.
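The figures in this summary can be cross-checked with simple arithmetic. The sketch below uses only numbers from the slides; the 100 Mb genome size is taken from the genome size table later in the lecture, and the variable names are illustrative.

```python
predicted_genes = 19_500
expressed_genes = 18_000          # "at least", so this is a lower bound
genome_size_bp = 100_000_000      # 100 Mb
fraction_with_homologues = 0.42   # approximate

percent_expressed = round(100 * expressed_genes / predicted_genes, 1)
kb_per_gene = round(genome_size_bp / predicted_genes / 1000, 1)
genes_with_homologues = round(predicted_genes * fraction_with_homologues)

print(percent_expressed)       # share of predictions with RNA evidence
print(kb_per_gene)             # consistent with "1 gene per 5 kb"
print(genes_with_homologues)   # genes with homologies outside Nematoda
```

So roughly nine in ten predictions have expression evidence, and on the order of 8,000 genes have a detectable relative outside the nematodes.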

Page 14: Genomics lecture 3

Genome Size

Organism                        Genome size   Genes
E. coli (bacteria)              4.64 Mb       4,377
S. cerevisiae (fungal)          12.1 Mb       6,163
C. elegans (metazoan)           100 Mb        19,300
Arabidopsis (plant)             118 Mb        ~20,000
D. melanogaster (fruit fly)     135.6 Mb      13,472
Mus musculus (mouse)            3059 Mb       ~25,000
Homo sapiens (obvious)          3286 Mb       ~25,000
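Dividing genome size by gene number gives a rough gene density for each organism in the table. This is a small sketch using the table's values; the approximate gene counts (marked ~) are treated as exact for the calculation, and the names are illustrative.

```python
# (genome size in Mb, number of genes), as listed in the table
genomes = {
    "E. coli": (4.64, 4_377),
    "S. cerevisiae": (12.1, 6_163),
    "C. elegans": (100, 19_300),
    "Arabidopsis": (118, 20_000),
    "D. melanogaster": (135.6, 13_472),
    "Mus musculus": (3059, 25_000),
    "Homo sapiens": (3286, 25_000),
}

# Average kilobases of genome per gene
kb_per_gene = {
    name: round(size_mb * 1000 / genes, 2)
    for name, (size_mb, genes) in genomes.items()
}

for name, density in kb_per_gene.items():
    print(f"{name:16s} {density:8.2f} kb per gene")
```

The spread from about 1 kb per gene in E. coli to over 100 kb per gene in mouse and human shows that genome size grows much faster than gene number.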

Page 15: Genomics lecture 3

The C. elegans Top 20 protein homologies

Number  Description
650     7 TM chemoreceptor
410     Eukaryotic protein kinase domain
240     Zinc finger, C4 (transcription factor)
170     Collagen
140     7 TM receptor
130     Zinc finger, C2H2 (transcription factor)
120     Lectin C-type domain short and long forms
100     RNA recognition motif (RRM, RBD, or RNP domain)
90      Zinc finger, C3HC4 type (transcription factor)
90      Protein-tyrosine phosphatase
90      Ankyrin repeat
90      WD domain, G-beta repeats
80      Homeobox domain (transcription factor)
80      Neurotransmitter-gated ion channel
80      Cytochrome P450
80      Helicases conserved C-terminal domain
80      Alcohol/other dehydrogenases, short-chain type
70      UDP-glucuronosyl and UDP-glucosyl transferases
70      EGF-like domain
70      Immunoglobulin superfamily

Page 16: Genomics lecture 3

Does the “Top 20” list tell us anything?

Previous slide looked rather boring?

Test your memory – what was on the list?

Many of the large gene families are implicated in developmental control.

Core set of proteins needed for general cell biology/metabolism to make a cell – e.g. S. cerevisiae ~6,163 genes.

Evolution of developmental complexity – amplification of families of regulatory molecules.

The above in part explains the increase in number of genes in multicellular organisms – it does not explain fully the increase in DNA content.

Page 17: Genomics lecture 3

How much does DNA sequence teach us?

Remember that what we can learn from protein similarities is limited by what we know about the similar proteins.

We still need to connect genes/proteins with functions.

Page 18: Genomics lecture 3

C. elegans mutants

dpy-7: Short fat worm – exoskeletal defect.

ced-4: Programmed cell death defective.

unc-51: Paralysed - abnormal axons.

dec-2: long defecation cycle – genetically constipated.

Wild Type

How has genomics influenced genetics?

Page 19: Genomics lecture 3

[Figure: genetic map of Chromosome I, scaled from -15 to 25 m.u., showing the left arm, central cluster and right arm, with genes bli-3, egl-30, mab-20, fog-1, unc-73, unc-57, dpy-5, dpy-14, fer-1, unc-29, lin-11, unc-75, unc-101, glp-4 and unc-54.]

Genetic mapping.

m.u. = map unit.

Genetic mapping – recombination.

1 m.u. is 1% recombination per meiosis.
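The definition of the map unit translates directly into a calculation from cross data. This is a minimal sketch with hypothetical progeny counts; it assumes the markers are close enough together that multiple crossovers can be ignored.

```python
def map_distance(recombinants, total):
    """Genetic distance in map units: percent of meiotic products that are recombinant."""
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= recombinants <= total:
        raise ValueError("recombinants must be between 0 and total")
    return 100.0 * recombinants / total


# HYPOTHETICAL counts: 12 recombinant chromosomes among 400 scored.
distance = map_distance(12, 400)
print(distance, "m.u.")
```

With these made-up counts the two markers would be placed 3 m.u. apart on the genetic map.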

[Figure: parent and recombinant chromosomes for the markers fog-1 and glp-4 (+ denotes the wild-type allele).]

We wanted to investigate the molecular detail of genes defined by mutation.

We knew where mutant genes mapped and we knew their phenotype.

Page 20: Genomics lecture 3

[Figure: genetic map of Chromosome I (-15 to 25 m.u.; genes bli-3, egl-30, mab-20, fog-1, unc-73, dpy-5, fer-1, lin-11, unc-75, unc-101, glp-4 and unc-54) shown above the physical map of clones and the sequence of the genome (individual chromosomes): AGCCTTTATGGCGAGATGGATAGCT...TATAA]

How can the physical and genetic maps be aligned?

Identify the sequence of genes defined by mutation.

Page 21: Genomics lecture 3

[Figure: genetic map of Chromosome I (genes bli-3 to unc-54, scale -15 to 25 m.u.) aligned with the physical map.]

• An association or alignment between the physical and genetic maps.

Page 22: Genomics lecture 3

[Figure: genetic map of Chromosome I (genes bli-3 to unc-54, scale -15 to 25 m.u.) aligned with the physical map.]

Positional cloning of genes defined by mutation.

Imagine lin-11 and unc-101 had both been cloned.

Where on the physical map might unc-75 be?
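One way to reason about the question is linear interpolation between the two cloned flanking genes. This is a sketch only: every number below is hypothetical (the slides give no coordinates), and because recombination is unlikely to be uniform along the chromosome, the result would be a first guess at which clones to test rather than an exact position.

```python
def interpolate_position(gene_mu, left_marker, right_marker):
    """Estimate a physical position (bp) from a genetic position (m.u.).

    Each marker is a (genetic position in m.u., physical position in bp) pair
    for a gene that has already been cloned.
    """
    left_mu, left_bp = left_marker
    right_mu, right_bp = right_marker
    if left_mu == right_mu or not left_mu <= gene_mu <= right_mu:
        raise ValueError("gene must lie between two distinct markers")
    fraction = (gene_mu - left_mu) / (right_mu - left_mu)
    return round(left_bp + fraction * (right_bp - left_bp))


# HYPOTHETICAL coordinates, for illustration only.
lin_11 = (8.0, 1_000_000)
unc_101 = (12.0, 2_000_000)
unc_75_estimate = interpolate_position(10.0, lin_11, unc_101)
print(unc_75_estimate)
```

Cosmids covering the estimated region would then be candidates for injection into the mutant.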

Page 23: Genomics lecture 3

Transgenic C. elegans: rescue of mutant phenotype.

DNA is injected into the gonads of adult hermaphrodites.

The injected DNA forms large heritable DNA molecules termed "free arrays".

Page 24: Genomics lecture 3

Inject unc-75 mutant worms.

1. Inject cosmid into the mutant.
2. Observe transgenic progeny for phenotypic rescue.
3. Subclone individual genes from cosmid.
4. Observe transgenic progeny for phenotypic rescue.

[Figure: cosmid sequence, the genes it contains, and phenotypic rescue.]

Page 25: Genomics lecture 3

[Figure: genetic map of Chromosome I (genes bli-3 to unc-54, scale -15 to 25 m.u.) aligned with the physical map.]

Positional cloning of genes defined by mutation.

Attempt phenotypic rescue with cosmids.

• The standard route to clone C. elegans genes defined by mutation.

• The more genes are cloned the easier it becomes to clone others.

Page 26: Genomics lecture 3

Can’t make transgenic humans – but the same positional information is used to identify Human disease genes.

Page 27: Genomics lecture 3

RNA Interference (RNAi)

RNAi: sequence-specific inactivation of gene function by either double stranded RNA or siRNA.

Since its discovery in C. elegans, it has been found to work in many organisms, e.g. cultured vertebrate cells, plants, trypanosomes, Drosophila.

Page 28: Genomics lecture 3

Mediators of RNAi - short interfering RNAs (siRNAs)

21-23 nt dsRNA duplexes.

DICER: highly conserved family of RNaseIII enzymes.

Targets double stranded RNA.
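The effect of dicing a long dsRNA can be pictured as cutting it into consecutive short pieces. This is a toy sketch on one strand only: the fixed size of 21 nt is one value from the 21-23 nt range on the slide, and details such as the structure of the duplex ends are ignored. The function name and input are illustrative.

```python
def dice(strand, size=21):
    """Cut one strand of a long dsRNA into consecutive full-length fragments.

    Any leftover shorter than `size` at the end is discarded.
    """
    if size <= 0:
        raise ValueError("size must be positive")
    return [strand[i:i + size] for i in range(0, len(strand) - size + 1, size)]


long_dsrna_strand = "AUGC" * 12 + "AU"    # HYPOTHETICAL 50 nt strand
fragments = dice(long_dsrna_strand)
print(len(fragments), [len(f) for f in fragments])
```

Each fragment carries sequence from the original dsRNA, which is what makes the resulting silencing sequence-specific.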

Page 29: Genomics lecture 3

[Figure: Argonaute with single stranded interfering RNA.]

Page 30: Genomics lecture 3

RNAi in C. elegans.

[Figure: dsRNA injection.]

Observe phenotype of F1 offspring.

Noticed that the site of injection did not matter: even the intestine works.

How could that affect embryos?

Systemic RNAi

Page 31: Genomics lecture 3

Bacterial Feeding Method in C. elegans

Express dsRNA of a cloned C. elegans gene in a strain of E. coli. Worms eat the bacteria as food.

RNAi of the gene can be obtained both in the worms that feed on the dsRNA expressing bacteria, and in the F1 progeny of these worms.

Page 32: Genomics lecture 3

Transport of dsRNA into Cells by the Transmembrane Protein SID-1. Science 301, 1545 (2003)

sid-1 mutants are defective in systemic RNAi

SID-1 protein

Page 33: Genomics lecture 3

Loss of function phenotype can be estimated by RNAi.

RNAi by feeding method – whole genome RNAi projects.

Clones of 16,757 predicted genes tested in genome wide screen.

10.3% gave obvious phenotype.
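The percentage can be turned into an approximate number of genes. A short worked calculation, using only the two figures on the slide (the variable names are illustrative):

```python
genes_tested = 16_757
fraction_with_phenotype = 0.103   # 10.3%

genes_with_phenotype = round(genes_tested * fraction_with_phenotype)
genes_without_phenotype = genes_tested - genes_with_phenotype

print(genes_with_phenotype, genes_without_phenotype)
```

So roughly 1,700 genes gave an obvious phenotype, while about 15,000 did not.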

RNAi as a tool for genetic analysis

Redundancy between genes.

RNAi is capable of functioning for more than one gene at a time.

Permits analysis of functionally redundant genes.

Page 34: Genomics lecture 3

Summary, C. elegans Genomics

Permits comparisons with human genes.

Most human disease genes have C. elegans homologues.

Powerful genetic tools – experiments on genes.

Detailed anatomy – relate gene to function.

Examples of processes investigated.

Programmed cell death.
Signalling.
Cell adhesion.
Axonal guidance.
Oncogene function.
Insulin pathway.
Ageing.

Page 35: Genomics lecture 3

How did genes evolve and what are gene/protein families?

Page 36: Genomics lecture 3

Early genomes

– Early genomes made of RNA
• RNA world - no cells (in modern sense), just RNA, starting with 1 gene
• Ribonucleotide polymerase activity - catalyse own synthesis
• Later on - translation - encoded info for production of proteins
– Involves nucleic acids ‘coding for’ proteins
– Later emergence of DNA as the info store - genome stability - less labile
– Modern functions of nucleic acids
• coding - proteins via mRNA
• catalytic - ribozymes
• structural - rRNA, tRNA
• regulatory - miRNAs

[Figure labels: inorganic surface, nucleotides, RNA, DNA, mRNA, tRNA, rRNA, protein.]

Page 37: Genomics lecture 3

‘Tree of Life’ - Tree of all Animals

Common ancestor => common genome

• Each species’ genome descended with modification from genome of ancestor

Reconstruction of picture of ‘ancestral genome’?

Comparative genomics - tells us about state of ancestor and changes along each branch

Where did our genome come from?

Page 38: Genomics lecture 3

Genes and Genome evolution

• What processes lead to genome evolution…?

Initial ligation to form early chromosomes

inversion

duplication / deletion

accumulation of point mutations

Invasion - horizontal gene transfer & transposable elements

Page 39: Genomics lecture 3

Structure of a typical eukaryotic gene

[Figure: gene with promoter, TSS, Exon 1, Intron 1, Exon 2, Exon 3 and Exon 4, with ATG and stop marked; mRNA with 5’-UTR, 3’-UTR and poly A tail; protein with Domain 1 and Domain 2.]

What features of all genes are missing from this diagram….?