

“The Genome Project: On Zoos and Curing Cancer”

Richard K. Wilson, Ph.D.
Genome Sequencing Center
Washington University School of Medicine

The Human Genome

• “Completed” April 2003
• ~3 billion base pairs (bp)
• 24 chromosomes: 1-22, X, Y
• ~25,000 - 30,000 genes
• Only 2% of the genome encodes a gene
• The sequence represents ~99% of the genome. Approximately 99% of the sequence is in a highly accurate “finished” state. The other ~1% is somewhat lower quality but is ordered & oriented.


http://genome.ucsc.edu

http://www.ncbi.nlm.nih.gov


What’s next?

“applied genomics”

more genomes…


Vertebrate genomes: Completed or in progress

[Figure: tree of Chordata with branches labeled Fishes, Amphibians, Reptiles, Birds, Monotremes, Marsupials, Rodents, Carnivores, Primates and H. sapiens]

WU-GSC: work in progress…

• Human: chr 2/4: manuscript in preparation…
• Mouse: finishing in progress… (~30 Mb/month; target: Apr 2005)
• Chimp:
  - 4X draft: assembly released, analysis in progress…
  - Improved genome sequence: additional WGS, BACs in progress…
  - Finishing: chr 7 & Y (and ENCODE) in progress…
• Chicken:
  - 6.6X draft: assembly released, analysis in progress…
  - ENCODE regions in finishing…
  - Submitted proposal to improve draft…
• S. mediterranea: 6-8X draft: sequencing in progress…
• Drosophila:
  - D. yakuba: 8X WGS: assembly complete, analysis & pre-finishing in progress…
  - D. simulans: 8X WGS: sequencing (multiple strains) in progress…
• Macaque: 7-8X WGS: 7.4M reads scheduled…
• Caenorhabditis:
  - C. remanei: 7X WGS: sequencing complete, assembly in progress…
  - C. japonica: 7X WGS: heterozygosity analysis in progress…
  - CB5161: 7X WGS: heterozygosity analysis in progress…
• Others: Platypus & Lamprey BACs: investigatory sequencing in progress…


Why sequence the chicken genome?

• Comparative genomics
• Developmental biology
• Agriculture


Why sequence the chicken genome?

The evolution of birds, flight and feathers

Gallus gallus
• Red jungle fowl
• Ancestor of modern domestic breeds
• Genome size = ~1.1 Gb
• Autosomes: ~38
• Chromosome subtypes: macrochromosomes and microchromosomes
• Sex chromosomes: ZW (the female is the heterogametic sex)

The comparative genomics playing field

[Figure: phylogenetic tree with divergence times marked at 250, 310, 400 and 900 Myr. Groups: Invertebrates, Fish, Amphibians, Birds, Mammals. Species shown: Caenorhabditis elegans, Drosophila melanogaster, Anopheles gambiae, Danio rerio, Fugu, Xenopus tropicalis, Gallus gallus, Ornithorhynchus anatinus, Monodelphis domestica, Bos taurus, Canis familiaris, Rattus norvegicus, Mus musculus, Rhesus macaque, Pan troglodytes, Homo sapiens]


G. gallus genome: Strategy…

[Figure: strategy diagram with labels: whole genome, plasmids, fosmids, BACs]

G. gallus genome: Strategy…

WGS sequence assembly
• Genome size: 1.06 Gb
• Total Q20 bp: 7.0 Gb
• Sequence coverage: 6.6X
• Total repeat bp: 7.6 Mb (7%)
• Avg. contig length: 10 kb
• Avg. scontig length: 32 kb
• N50 scontig length: 8.2 Mb

Physical map
• Total clones for FPC: 142,718
• Total contigs: 260
• Total contigs anchored to chicken chromosomes: 202

Genome assembly
• Ultracontigs: 82
• Avg. ucontig length: 11.4 Mb
• Anchored ucontigs: 60
• Anchored ucontig length: 0.77 Gb
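
The assembly statistics on this slide are related by simple arithmetic: fold coverage is the total number of Q20 bases divided by the estimated genome size, and the anchored fraction is the anchored ultracontig length over the same genome size. The sketch below reproduces those two ratios from the slide's own numbers and also shows how an N50 value is typically computed. The function names and the toy contig lengths are illustrative assumptions, not part of any WU-GSC tooling or data.

```python
from typing import Sequence


def fold_coverage(total_q20_bp: float, genome_size_bp: float) -> float:
    """Fold coverage = high-quality (Q20) bases / estimated genome size."""
    return total_q20_bp / genome_size_bp


def n50(lengths: Sequence[int]) -> int:
    """Length L such that contigs of length >= L hold at least half of all bases."""
    half_total = sum(lengths) / 2
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= half_total:
            return length
    raise ValueError("n50 needs at least one contig length")


# Values quoted on the slide.
GENOME_SIZE_BP = 1.06e9
TOTAL_Q20_BP = 7.0e9
ANCHORED_UCONTIG_BP = 0.77e9

coverage = fold_coverage(TOTAL_Q20_BP, GENOME_SIZE_BP)
anchored_fraction = ANCHORED_UCONTIG_BP / GENOME_SIZE_BP

# Hypothetical contig lengths, used only to exercise n50().
toy_lengths = [80, 70, 50, 40, 30, 20]

print(f"Sequence coverage: {coverage:.1f}X")
print(f"Genome in anchored ultracontigs: {anchored_fraction:.0%}")
print(f"N50 of toy contigs: {n50(toy_lengths)}")
```

With the slide's figures, the first ratio rounds to the quoted 6.6X, and roughly three quarters of the estimated genome sits in anchored ultracontigs.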


BAC minimal tiling path

• Clone set currently exists as 9,270 clones with 77 kb avg. overlap

Minilda output: Martin Kryswinski

Ensembl Chicken Gene Index

• Gene predictions generated using genewise and exonerate based on protein, cDNA and EST evidence
• 17,784 gene structure predictions (including 75 pseudogenes)
• 28,491 transcripts
• 6.5 exons/transcript
• 95% of chicken RefSeq CDS bases covered
• 87% of available chicken cDNAs aligned
• 18,995 EST genes produced from 440,815 ESTs
• http://pre.ensembl.org/chicken


Questions to Ponder

Genome sequence

• Why is the chick genome size constrained?
• How did sex chromosomes evolve?
• How did microchromosomes evolve?
• What genes are lost/retained in bird to mammal lineage?
• Why are certain repeat classes absent?
• Why is segmental duplication 2x that of human?
• How do we ascertain causative mutations from QTLs?
• What do chicken to mammal alignments tell us about evolution?
• Do non-genic conserved sequences exist in chicken?

Improving the G. gallus genome

1. Sequence selected BACs
• Use the physical map
• Sequence/local assembly of difficult/repetitive regions
• Sequence underrepresented regions (e.g., Z & W chr)

2. Pre-finishing
• Primer-directed reads to close gaps & improve low quality regions

3. Finishing
• If necessary, depending on genome-specific standard…


“applied genomics”

Variation and mutation

genome 1:  GCA AGA GAT AAT TGT …    Ala Arg Asp Asn Cys …
genome 2:  GCG AGG GAT AAT TGT …    Ala Arg Asp Asn Cys …
genome 3:  GCA AAA GAT AAT TGT …    Ala Lys Asp Asn Cys …
genome 4:  GCA AGC GAT AAT TGT …    Ala Ser Asp Asn Cys …
genome 5:  GCA AAA GAT AAT TGA …    Ala Lys Asp Asn STOP

Correlation to disease?
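
The five example genomes on this slide can be compared codon by codon to see which DNA differences change the protein. The sketch below is a minimal illustration, assuming genome 1 is taken as the reference and the standard genetic code applies. It labels each differing codon as synonymous (same amino acid), missense (different amino acid) or nonsense (new stop codon). The helper names `translate` and `classify` are made up for this example.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Codons exactly as shown on the slide.
GENOMES = {
    1: "GCA AGA GAT AAT TGT",
    2: "GCG AGG GAT AAT TGT",
    3: "GCA AAA GAT AAT TGT",
    4: "GCA AGC GAT AAT TGT",
    5: "GCA AAA GAT AAT TGA",
}


def translate(codons: str) -> str:
    """Translate space-separated codons to one-letter amino acids ('*' = stop)."""
    return "".join(CODON_TABLE[c] for c in codons.split())


def classify(reference: str, sample: str) -> list:
    """Return (codon position, ref codon, sample codon, effect) for each difference."""
    calls = []
    pairs = zip(reference.split(), sample.split())
    for position, (ref, alt) in enumerate(pairs, start=1):
        if ref == alt:
            continue
        ref_aa, alt_aa = CODON_TABLE[ref], CODON_TABLE[alt]
        if ref_aa == alt_aa:
            effect = "synonymous"
        elif alt_aa == "*":
            effect = "nonsense"
        else:
            effect = "missense"
        calls.append((position, ref, alt, effect))
    return calls


for name, codons in GENOMES.items():
    print(name, translate(codons), classify(GENOMES[1], codons))
```

Genome 2 differs from genome 1 at two codons without changing the protein, while genome 5 carries both an amino acid change and a premature stop. Variants of the last two kinds are the ones a mutational profiling study would most want to correlate with disease.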


“Mutational Profiling”

Mutational Profiling

[Figure labels: gene of interest; multiple individuals]


Mutational Profiling

• PCR amplification
• DNA sequencing

High-throughput M.P.


High-throughput M.P.: a new pipeline

• High quality data
• Perfect sample tracking
• Low cost


Mutational Profiling

• Pulmonary surfactant protein B (SPB) deficiency
• Prostate cancer
• Non-small cell lung cancer
• Acute myelogenous leukemia

Acute myelogenous leukemia

• A group of diseases caused by a variety of inherited and acquired genetic and epigenetic changes
• Most frequently reported form of leukemia among adults. Approximately 10,000 new cases per year in the U.S.
• Generic therapeutics exist, but most patients still die from this disease.
• Risk stratification:
  - Age
  - De novo vs. secondary
  - Cytogenetic and genetic alterations
  - Response to initial therapy, relapse


Acute myelogenous leukemia

A variety of genetic and epigenetic events are probably responsible for the heterogeneity of AML.

Acute myelogenous leukemia

• ~450 target genes
• “Discovery Set”: Matched DNA samples from 47 AML patients.
• “Validation Set”: An additional 94 matched patient samples.
• Expression profiling, array CGH, functional genomics/mouse models.


Acute myelogenous leukemia

Target genes

• Receptor tyrosine kinases
• Cytoplasmic tyrosine kinases
• HOX genes
• Abundant proteases
• Transcription factors
• Tumor suppressors
• DNA repair
• RAS pathway genes
• Signaling modifiers
• Phosphatases
• Cytokine receptors
• Cell cycle genes
• Immune surveillance
• Apoptosis related
• Drug resistance
• Miscellaneous

Acute myelogenous leukemia

Gene     Mutations
CBF-!    R151C
c-KIT    M541L
c-MYC    N11S, Y32H, V170I
FLT3     D835Y; exon 11 internal tandem duplication (ITD)
NRAS     G13R
PML      R307C
RAR-"    T43I

Legend: nonsynonymous; synonymous

Slide annotations: 20% of AML patients; 10% of AML patients
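
The point mutations in this table use the common shorthand of reference amino acid, residue number, then observed amino acid (for example, D835Y means aspartate at position 835 replaced by tyrosine). A small parser for that shorthand might look like the sketch below. The `Substitution` type and `parse_substitution` function are hypothetical names for this example, and entries that are not simple substitutions, such as the FLT3 internal tandem duplication, are returned as `None` rather than guessed at. Only a subset of the table is included.

```python
import re
from typing import NamedTuple, Optional


class Substitution(NamedTuple):
    ref: str        # one-letter reference amino acid
    position: int   # residue number in the protein
    alt: str        # one-letter observed amino acid


_SUBSTITUTION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_substitution(label: str) -> Optional[Substitution]:
    """Parse labels such as 'D835Y'; return None for anything else."""
    match = _SUBSTITUTION.match(label.strip())
    if match is None:
        return None
    ref, position, alt = match.groups()
    return Substitution(ref, int(position), alt)


# A subset of the calls listed on the slide.
AML_CALLS = {
    "c-KIT": ["M541L"],
    "c-MYC": ["N11S", "Y32H", "V170I"],
    "FLT3": ["D835Y", "exon 11 internal tandem duplication (ITD)"],
    "NRAS": ["G13R"],
    "PML": ["R307C"],
}

for gene, labels in AML_CALLS.items():
    for label in labels:
        print(gene, label, parse_substitution(label))
```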


Acute myelogenous leukemia

[Figure: FLT3 domain diagrams (TM, TK, TK) with labels: expansion, 20-25%; * D835Y, ~10%]


Non-small cell lung cancer

• Lung cancer is the leading cause of cancer deaths in men and women (US: 164,100 new cases and 156,900 deaths in 2000).
• Non-small cell lung cancer is the most common type of lung cancer. It typically metastasizes more slowly than small cell lung cancer.
• Cigarette smoking is the most common cause of lung cancer: 87% of all cases are associated with smoking.
• Treatment is typically a combination of surgery, chemotherapy and radiation therapy.

NSC lung cancer

• Mutational profiling is focused on kinase genes.
• Samples include tissue biopsies from patients who have responded well to the kinase inhibitors Iressa and Tarceva.
• Just underway…


Non-small cell lung cancer

Mutational profiling data for three Iressa responders

05.22.04 EGFR Paraffin Extracted Samples and Germline Controls, tested over exons 18-24

Region         IR18    E19     IR19    IR19    E20     E21     E23
Position       155432  156146  156286  162603  162740  173192  180094
H_AO-0005038   GG      ++      AA      TT      AG      TT      CT
H_AO-0005346   GG      --      AA      TT      GG      TT      TT
H_AO-0005039   AG      ++      AA      TT      AA      TT      CC
H_AO-0005344   AG      ++      AA      TT      AA      GT      CC
H_AO-0005040   GG      ++      AG      CT      AG      TT      CC
H_AO-0005342   GG      ++      AA      TT      AG      GT      CC
H_AO-cntr101   GG      ++      NN      CT      AG      TT      CC
H_AO-cntr102   GG      ++      NN      CT      AG      TT      CC
egfrcon.c1     GG      ++      AA      TT      GG      TT      TT

Indels: H_AO-0005346, E19 156146del TTAAGAGAAGCAACATCT
Exons examined: 18, 19, 20, 21, 22, 23, 24
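
One quick check on the exon 19 deletion reported in this table is whether its length is a multiple of three, since such a deletion removes bases without shifting the downstream reading frame. The sketch below applies that check to the deleted sequence quoted on the slide and lists the samples whose E19 call is not "++". Reading "++" as "no indel detected" and "--" as "indel detected" is an assumption made for this example, as are the function names.

```python
# Deleted bases quoted on the slide for sample H_AO-0005346.
E19_DELETION = "TTAAGAGAAGCAACATCT"

# E19 column of the genotype table (assumed: '++' = no indel, '--' = indel).
E19_CALLS = {
    "H_AO-0005038": "++",
    "H_AO-0005346": "--",
    "H_AO-0005039": "++",
    "H_AO-0005344": "++",
    "H_AO-0005040": "++",
    "H_AO-0005342": "++",
    "H_AO-cntr101": "++",
    "H_AO-cntr102": "++",
    "egfrcon.c1": "++",
}


def preserves_reading_frame(deleted_bases: str) -> bool:
    """A deletion keeps the downstream frame when its length is a multiple of 3."""
    return len(deleted_bases) % 3 == 0


def samples_with_indel(calls: dict) -> list:
    """Return sample IDs whose call differs from the assumed no-indel code '++'."""
    return [sample for sample, call in calls.items() if call != "++"]


print("Deletion length:", len(E19_DELETION), "bp")
print("Preserves reading frame:", preserves_reading_frame(E19_DELETION))
print("Samples with an E19 indel call:", samples_with_indel(E19_CALLS))
```

The deleted sequence is 18 bp long, so under this check the deletion leaves the downstream reading frame intact.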

Revolutionizing medicine!

• A better understanding of the genetics and the molecular events underlying human disease.
• Diagnostic DNA sequencing
• “Designer therapeutics”


Gene-based designer drugs

[Figure labels: bcr-abl; target protein]

Challenges!

• DNA samples
  - Collection & records
  - Sample type & purity
  - Informed consent/HIPAA
• Technology & throughput improvements
• Software tools & data management
• Data visualization & statistical analysis
• Correlation of complex datasets


Acknowledgements

• WU Genome Sequencing Center
Lucinda Fulton, Bob Fulton, Pat Minx, Tina Graves, Tracy Miner, Kim Delahaunty, Bill Nash, Kym Pepin, Ginger Fewell, Jim Eldred, Dave Dooling, Ph.D., Asif Chinwalla, LaDeana Hillier, Doris Kupfer, Ph.D., John Spieth, Ph.D., Wes Warren, Ph.D., Sandy Clifton, Ph.D., Elaine Mardis, Ph.D., many others…

• Acute myelogenous leukemia
Timothy J. Ley, M.D. - WUSM

• Non-small cell lung cancer
Harold E. Varmus, M.D. & William Pao, M.D., Ph.D. - MSKCC
