Personal Genomics & Watson’s Genome

Scott Bray, Jaimie Barkley, Rachel Blumhagen, Kristy Theodorson


Post on 08-Jan-2016


Page 1: Personal Genomics & Watson’s Genome

Personal Genomics & Watson’s Genome

Scott Bray, Jaimie Barkley, Rachel Blumhagen, Kristy Theodorson

Page 2: Personal Genomics & Watson’s Genome

• 1st human genome sequenced using NEXTGEN technology

• Identified novel genes, SNPs, CNVs and indel polymorphisms

• Results consistent with traditional methods used to sequence Venter’s genome

• Pilot project for personalized genome sequencing

Page 3: Personal Genomics & Watson’s Genome

NEXTGEN Pros

• Less time
– Two months

• Less expensive
– Approximately 1/100 of the cost of traditional capillary electrophoresis

• More efficient
– DNA is amplified in a cell-free system, which avoids the loss of genomic sequence that occurs with bacterial cloning

Page 4: Personal Genomics & Watson’s Genome

Quicker, smaller, cheaper

Genome sequenced (publication year)      HGP (2003)     Venter (2007)   Watson (2008)
Time taken (start to finish)             13 years       4 years         4.5 months
Number of scientists listed as authors   >2,800         31              27
Cost of sequencing                       $2.7 billion   $100 million    <$1.5 million
Coverage                                 8-10x          7.5x            7.4x
Number of institutes involved            16             5               2
Number of countries involved             6              3               1

Source: M. Wadman, Nature 452, 788 (2008).
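To make the "quicker, cheaper" comparison concrete, the short sketch below turns the table's time and cost figures into fold changes relative to Watson's genome. The numbers are copied from the table; the dictionary layout and the function name `fold_change` are only illustrative. Because Watson's cost is given as an upper bound (< $1.5 million), the cost ratios are lower bounds.

```python
# Time and cost figures copied from the Wadman (2008) table on this slide.
# Watson's cost is an upper bound, so cost ratios against it are lower bounds.
projects = {
    "HGP (2003)": {"months": 13 * 12, "cost_usd": 2.7e9},
    "Venter (2007)": {"months": 4 * 12, "cost_usd": 100e6},
    "Watson (2008)": {"months": 4.5, "cost_usd": 1.5e6},
}


def fold_change(earlier, later, field):
    """Return how many times larger `field` is for `earlier` than for `later`."""
    return projects[earlier][field] / projects[later][field]


for name in ("HGP (2003)", "Venter (2007)"):
    time_ratio = fold_change(name, "Watson (2008)", "months")
    cost_ratio = fold_change(name, "Watson (2008)", "cost_usd")
    print(f"{name} vs Watson: ~{time_ratio:.0f}x longer, "
          f"at least ~{cost_ratio:.0f}x more expensive")
```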

Page 5: Personal Genomics & Watson’s Genome

How’d they do it?

• Genomic DNA extracted from white blood cells

• Nebulization

• 454 pyrosequencing

• 234 runs @ 105 Mb per run

• Assemble?
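As a rough sanity check, the run count and per-run yield on this slide can be converted into a total yield and an approximate fold coverage. The genome size of 3.1 Gb is an assumption (it is not on the slide), and the function names are illustrative. The raw estimate comes out a little above the 7.4x quoted elsewhere in the talk; one plausible reason, not stated on the slide, is that not every sequenced base aligns to the reference.

```python
RUNS = 234              # from the slide
MB_PER_RUN = 105        # megabases per run, from the slide
GENOME_SIZE_GB = 3.1    # ASSUMPTION: approximate haploid human genome size


def raw_yield_gb(runs, mb_per_run):
    """Total sequenced bases in gigabases."""
    return runs * mb_per_run / 1000


def fold_coverage(yield_gb, genome_gb):
    """Average number of times each reference base is covered."""
    return yield_gb / genome_gb


total_gb = raw_yield_gb(RUNS, MB_PER_RUN)
print(f"Raw yield: {total_gb:.2f} Gb")
print(f"Approximate raw coverage: {fold_coverage(total_gb, GENOME_SIZE_GB):.1f}x")
```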

Page 6: Personal Genomics & Watson’s Genome

No Assembly required!! (ok, a little)

• “Reference” sequence (Build 36) to align reads

– official reference genome assembly

– includes both WGS and BAC sequence data assemblies

– additional genomic sequences incorporated

Page 7: Personal Genomics & Watson’s Genome

• Reads were aligned to the reference sequence, giving 7.4X coverage

• Reads that did not align to the reference (1.5 million) were assembled separately (WGS assembly)

Page 8: Personal Genomics & Watson’s Genome

7.4X Coverage

Why would they have lower coverage on the X chromosome?

X chromosome
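One likely answer to the question above is copy number: a male subject carries one X chromosome but two copies of each autosome, so roughly half as many reads should come from X. The sketch below shows that arithmetic. It treats 7.4x as the autosomal depth, which is a simplification, and the function name is hypothetical.

```python
def expected_depth(autosomal_depth, copies, autosomal_copies=2):
    """Scale the autosomal read depth by the number of chromosome copies present."""
    return autosomal_depth * copies / autosomal_copies


# ASSUMPTION: 7.4x is used as the autosomal depth for illustration.
print(f"Autosome (2 copies): {expected_depth(7.4, 2):.1f}x")
print(f"X in a male (1 copy): {expected_depth(7.4, 1):.1f}x")
```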

Page 9: Personal Genomics & Watson’s Genome
Page 10: Personal Genomics & Watson’s Genome

Single Nucleotide Polymorphisms

• For “known” SNPs: 50% homozygous and 50% heterozygous, but “novel” SNPs were mostly heterozygous. Why?

• Does this result support the hypothesis that the SNPs are “novel”?

• Filter and align: 14 million initially found

• Filter: 3.3 million
– 2.7 million matched “known” SNPs from dbSNP
– 0.61 million deemed “novel”
– 10,425 did not match dbSNP (unlikely to be a third allele or an error in dbSNP; 0.38% false discovery rate)
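The SNP counts on this slide can be cross-checked with a few lines of arithmetic. The sketch assumes the false discovery rate was obtained by dividing the 10,425 mismatching calls by the 2.7 million calls at known dbSNP positions; the slide does not spell out the calculation, so treat that as an assumption.

```python
known = 2.7e6        # SNPs matching dbSNP (from the slide)
novel = 0.61e6       # SNPs deemed "novel" (from the slide)
mismatched = 10_425  # calls at dbSNP positions whose allele did not match dbSNP

total = known + novel
# ASSUMPTION: false discovery rate = mismatching calls / calls at known positions.
false_discovery_rate = mismatched / known

print(f"Total SNPs: {total / 1e6:.2f} million")
print(f"Fraction novel: {novel / total:.1%}")
print(f"Estimated false discovery rate: {false_discovery_rate:.2%}")
```

Under that assumption the rate comes out close to the 0.38% quoted on the slide, and the known and novel counts add up to about 3.3 million.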

Page 11: Personal Genomics & Watson’s Genome

Traditional Sequencing: Venter’s Genome

• 7.5-fold coverage, using the whole-genome shotgun assembly (WGSA) method with Sanger sequencing

• Similar “novel” SNP results between NEXTGEN and traditional sequencing

Page 12: Personal Genomics & Watson’s Genome

Verification of SNP Identification

• “Known” SNPs identified were compared with experimental genotyping of the subject’s DNA using a microarray
– microarray of reference sequence hybridized to Watson’s DNA

• 494,713 markers successfully genotyped

• Watson’s DNA sequence agreed well with the microarray at homozygous-reference and homozygous-variant sites, but relatively poorly at heterozygous sites. Why?

Page 13: Personal Genomics & Watson’s Genome

Accuracy of SNP Identification

13-fold coverage required to detect 99% of all heterozygous SNPs

Coverage is key
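A toy model helps show why heterozygous sites need more coverage. Assume read depth at a site is Poisson-distributed around the mean coverage, each read samples one of the two alleles with probability 1/2, and a heterozygote is called only when at least two reads support each allele. All three assumptions are illustrative and are not the criteria used in the paper, so the sketch reproduces the trend rather than the exact 13-fold figure.

```python
import math


def poisson_at_least(k, lam):
    """P(X >= k) for X ~ Poisson(lam)."""
    below = sum(math.exp(-lam) * lam**i / math.factorial(i) for i in range(k))
    return 1.0 - below


def p_detect_het(coverage, min_reads_per_allele=2):
    """Probability that both alleles of a heterozygous site are seen often enough.

    With Poisson depth and a 50/50 allele split, the read counts for the two
    alleles are independent Poisson variables with mean coverage / 2.
    """
    per_allele_mean = coverage / 2.0
    return poisson_at_least(min_reads_per_allele, per_allele_mean) ** 2


for cov in (7.4, 13, 20):
    print(f"{cov:>4}x coverage: P(detect heterozygote) = {p_detect_het(cov):.3f}")
```

In this model a substantial fraction of heterozygous sites is missed at 7.4x, while the detection probability climbs toward 1 as coverage increases.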

Page 14: Personal Genomics & Watson’s Genome
Page 15: Personal Genomics & Watson’s Genome

Insertions-Deletions (Indels)

• Identified 222,718

• Size range of 2-38,896 bp

Why do they not have data on length of insertions?

Deletion frequency decreases as deletion size increases

Page 16: Personal Genomics & Watson’s Genome

Do the indels cause a frame shift?

• 345 indels found in coding regions

• Primers were designed for 111 of them, followed by Sanger sequencing
– 78 indels validated
– 66 of them had lengths that are multiples of 3 (no frame shift)
– 65 were found as heterozygotes
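The frame-shift question comes down to whether the indel length is a multiple of 3, since codons are three bases long. A minimal sketch follows; the function name is illustrative, and the counts are the ones on this slide.

```python
def causes_frameshift(indel_length_bp):
    """An indel in coding sequence shifts the reading frame unless its
    length is a multiple of 3 (one codon = 3 bases)."""
    return indel_length_bp % 3 != 0


for length in (3, 4, 6, 7):
    effect = "frame shift" if causes_frameshift(length) else "in frame"
    print(f"{length} bp indel: {effect}")

validated = 78   # validated coding indels (from the slide)
in_frame = 66    # of those, lengths that are multiples of 3 (from the slide)
print(f"Validated indels expected to shift the frame: {validated - in_frame}")
```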

Page 17: Personal Genomics & Watson’s Genome

Interesting Find…

• They found a homozygous 4-base deletion in exon 11 of Watson’s SGEF gene

• SGEF is highly conserved in vertebrates
– Guanine nucleotide exchange factor thought to regulate membrane dynamics in promotion of vesicle formation

What does this suggest???

Page 18: Personal Genomics & Watson’s Genome
Page 19: Personal Genomics & Watson’s Genome

CGH microarray

Page 20: Personal Genomics & Watson’s Genome

Copy Number Variations

• CNVs: local gains or losses of regions in the genome because of duplication or deletion
– associated with genetic disease
– detectable by variation in the average DNA sequence coverage of the region

• Comparative genomic hybridization (CGH) used
– Examine relative fluorescence intensity in wells
– Microarray revealed 23 CNV regions
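The idea that CNVs are detectable from sequence coverage can be sketched as a log2 ratio of each window's read depth to the genome-wide mean. A heterozygous deletion should give a ratio near -1 and a single-copy gain a ratio near +0.58. The window depths and the 0.4 threshold below are hypothetical values chosen for illustration, not data or parameters from the study.

```python
import math


def classify_windows(window_depths, mean_depth, threshold=0.4):
    """Label each window as 'loss', 'gain' or 'neutral' from its log2 depth ratio.

    threshold is a hypothetical cutoff on |log2(depth / mean_depth)|.
    """
    calls = []
    for depth in window_depths:
        if depth <= 0:
            calls.append("loss")  # no reads at all: treat as a loss
            continue
        ratio = math.log2(depth / mean_depth)
        if ratio <= -threshold:
            calls.append("loss")
        elif ratio >= threshold:
            calls.append("gain")
        else:
            calls.append("neutral")
    return calls


# Hypothetical per-window depths around a 7.4x genome-wide mean.
depths = [7.5, 7.2, 3.6, 3.9, 7.6, 11.3, 7.1]
print(classify_windows(depths, mean_depth=7.4))
```

At low coverage the depth of any single window is noisy, so a real analysis would need large windows or an independent method such as the CGH array used here.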

Page 21: Personal Genomics & Watson’s Genome

CNVs

• CNVs are polymorphic:
– segregate as alleles with varying frequency
– depend on the reference genome

• None of the CNV regions were identified as being involved with any known phenotype.
– However, 34 genes are predicted to be affected. These genes include two olfactory receptor groups, several with possible roles in prostate, breast, and colon cancer, a gene from the HLA-D locus, and two proteins involved in RNA editing.

Page 22: Personal Genomics & Watson’s Genome
Page 23: Personal Genomics & Watson’s Genome

Experimental Conclusions

• 3.3 million SNPs identified
– 8,996 were non-synonymous ‘known’ SNPs
– 1,573 were non-synonymous ‘novel’ SNPs

• Of the non-synonymous known SNPs, 342 alleles matched mutations found in the Human Gene Mutation Database (HGMD)
– 32 disease causing

Page 24: Personal Genomics & Watson’s Genome

Experimental Conclusions

– 10 out of 12 alleles are highly penetrant, Mendelian recessive disease-causing alleles

• 7 out of 10 were heterozygous, the other three only exhibited one allele

• Subject does not have the diseases.

Page 25: Personal Genomics & Watson’s Genome

• 1.5 million unaligned reads

• 65% matched known repeats

• 110K contigs, 29 Mb of sequence

• 33 cDNAs with no map location

• Protein prediction: 60 significant matches to 49 proteins

Page 26: Personal Genomics & Watson’s Genome

Criticism

• “It’s a new standard of sequencing technology,” says Venter. “But I don’t think it’s a new standard of genome coverage and independent assembly.”

• Good if reference seq. is available, if not?

• Dealing with repeats with small reads (no mate pairs, can coverage compensate?)

• Still haven't learned to read “the book of life”

Page 27: Personal Genomics & Watson’s Genome

Personal Genomics

“My Genome, My Self” - Steven Pinker, Jan. 11, 2009

• Personal Genome Project (PGP-10)

• Publicly available for association studies

• Personal genomics is important to the associations between human genetic variation, physiology and disease risk

Page 28: Personal Genomics & Watson’s Genome

Pros
– Personalized medicine, customized to patient’s biochemistry
– Better genetic testing for screening and prevention of at risk patients
– Creation of dataset that can be referenced for association studies
– Useful for evolution studies

Cons
– “Genes of Doom”
– Insurance and employment discrimination (GINA)
– Direct-to-consumer testing (bypasses health professionals to test for breast cancer alleles or even mutations linked to cystic fibrosis)
– Genetic determinism

Page 29: Personal Genomics & Watson’s Genome

“Genetic Determinism”: Examples of Single Gene Disorders

• Autosomal recessive:
– Cystic fibrosis (CF)
– Phenylketonuria (PKU)
– Sickle cell anemia
– ADA deficiency, a rare immunodeficiency disorder ("bubble boy" disease)

• Autosomal dominant:
– Familial hypercholesterolemia
– Huntington's disease

• X-linked recessive:
– Duchenne muscular dystrophy
– Hemophilia A

• X-linked dominant:
– few, very rare, disorders are classified as X-linked dominant
– hypophosphatemic rickets (vitamin D-resistant rickets)

Page 30: Personal Genomics & Watson’s Genome

All else is in the numbers…or better yet the genes

• “Geno’s Paradox”: single genes are not very informative

• Traits are typically a result of many genes, each having a small effect
– correlating genes with some traits is (currently) too complex
– a test for a gene can identify ONE contributor to a trait, but the observance of a trait

Page 31: Personal Genomics & Watson’s Genome

Pinker’s Results

• FALSE Results

• http://fire.biol.wwu.edu/young/470/stuff/steven_pinker_2.html

• Contradictory and confusing

“If you want to know whether you’re at risk for high cholesterol, have your cholesterol measured; if you want to know whether you are good at math, take a math test.”

–Steven Pinker

Page 32: Personal Genomics & Watson’s Genome

Common Types of Genetic Testing

• Newborn Screening: to identify disorders that can be treated in early stages of development
– PKU is treated with a low-phenylalanine diet for the infant

• Diagnostic: to confirm or rule out a specific genetic or chromosomal condition typically after symptoms are present

• Carrier: to identify if individual carries a copy of a mutated gene, typically done for prospective parents

• Predictive: presymptomatic, to assess probability of having a genetic disorder that may appear later in life

Page 33: Personal Genomics & Watson’s Genome

Nature versus Nurture… versus Chance?

• Environment and life experience

• Stochastic events (chance)
– e.g. identical twins: same genetic makeup, same environment

• Behavioral Genetics: … WHO AM I?
– Personality traits
– Behavioral traits
– Decision-making traits

Page 34: Personal Genomics & Watson’s Genome

Genetic Information Nondiscrimination Act (2008)

• Prohibits insurers from refusing coverage of a healthy individual or charging that person higher premiums based on their genetic predisposition to developing a disease

• Prohibits employers from using genetic information to discriminate against individuals in hiring, firing, job placement, etc.

• “[GINA] is necessary to ensure that biomedical research continues to advance… such legislation is necessary so that patients are comfortable availing themselves to genetic diagnostic tests.” - NHGRI

Page 35: Personal Genomics & Watson’s Genome

Ethics

• One copy of APOE E4 variant triples the risk of developing Alzheimer’s

• Should your genome be public or private?

• Should genetic counseling be required?

• Third party complications
– Ex: Pinker found he has a gene for familial dysautonomia, knew to get nieces and nephews tested

Page 36: Personal Genomics & Watson’s Genome

Conclusions

– Sequencing and interpretation of personal genomes will become more accurate as more individuals are sequenced

– Pro-active approach to ethical issues

– NEXTGEN sequencing: $100,000 genome
• http://www.knome.com/home/

– NEXT-NEXTGEN sequencing: $1,000 genome
• 2004: NHGRI awarded $38 million in grants

Page 37: Personal Genomics & Watson’s Genome

References

• Ellerbroek et al. 2004. SGEF, a RhoG guanine nucleotide exchange factor that stimulates macropinocytosis. Mol Biol Cell 15(7): 3309-19.

• Pinker, S. 2009. My Genome, My Self. NYT, Jan 2009, pp 23-31.

• Levy, S. et al. 2007. The diploid genome sequence of an individual human. PLoS Biol. 5: e254–e286.

• Wheeler et al. 2008. The complete genome of an individual by massively parallel DNA sequencing. Nature 452: 872-877.

• Olson, M. 2008. Dr. Watson’s base pairs. Nature 452: 819-820.

• Wadman, M. 2008. James Watson’s genome sequenced at high speed. Nature 452: 788.

Page 38: Personal Genomics & Watson’s Genome

Questions ???