epi519 gwas talk

Genome-wide Association Studies EPI 519 21 October 2010 Joshua C. Bis, PhD University of Washington, Cardiovascular Health Research Unit The Type 1 Diabetes Genetics Consortium. Nature Genetics, 2009 May 10

Upload: joshbis

Post on 14-Jun-2015


DESCRIPTION

A lecture for UW EPI 519 providing background for genome-wide association studies, a few examples of recent papers in the CVD GWAS literature, and some lessons and new directions. The talk was originally given in 2008 (in collaboration with a colleague); this version has been updated slightly for 2010 and includes references for further reading. Some of the typefaces may have been mangled on conversion; the file download should be more reliable.

TRANSCRIPT

Page 1: Epi519 Gwas Talk

Genome-wide Association Studies

EPI 519, 21 October 2010

Joshua C. Bis, PhD, University of Washington, Cardiovascular Health Research Unit

The Type 1 Diabetes Genetics Consortium. Nature Genetics, 2009 May 10

Page 2: Epi519 Gwas Talk
Page 3: Epi519 Gwas Talk

Complex phenotypes

Manolio et al. J. Clin. Invest. 118:1590-1605 (2008).

Page 4: Epi519 Gwas Talk

rationale for association studies

Balding. Nature Reviews Genetics. 2006; 7:781-791

Page 5: Epi519 Gwas Talk

candidate genes

Manolio, Boerwinkle, O’Donnell, Wilson. Arterioscler Thromb Vasc Biol. 2004;24:1567-1577.

Page 6: Epi519 Gwas Talk

highly consistent associations*

Trait                      Gene    Polymorphism     Frequency
Deep Vein Thrombosis       F5      Arg506Gln        0.015
Graves’ disease            CTLA4   Thr17Ala         0.62
Type 1 diabetes            INS     5’ VNTR          0.67
HIV infection              CCR5    32 bp Ins/Del    0.05-0.07
Alzheimer’s disease        APOE    Epsilon 2/3/4    0.16-0.24
Creutzfeldt-Jakob disease  PRNP    Met129Val        0.37

Hirschhorn et al. Genet Med. 2002; 4(2):45-61

* Associations between polymorphisms and disease where at least 75% of identified studies achieved statistical significance. (out of 600 gene–disease studies reviewed)

Page 7: Epi519 Gwas Talk

“genomics” The field within genetics concerned with the structure and function of the entire DNA sequence of an individual or population.

-- Thomas Roderick, McDonald’s Raw Bar, 1986

Page 8: Epi519 Gwas Talk

genome-wide association study “… a study of common genetic variation across the entire human genome designed to identify genetic associations with observable traits.”

-- National Institutes of Health, “Policy for sharing of data obtained in NIH-sponsored or conducted GWAS”

Page 9: Epi519 Gwas Talk

“A major strength of the genome-wide approach … has been its freedom from reliance on prior knowledge.”

-- “A HapMap harvest of insights into the genetics of common disease” (Manolio, Brooks, Collins)

Page 10: Epi519 Gwas Talk

genome-wide publication epidemic

genome.gov/GWAStudies :: 2 November 2009

Page 11: Epi519 Gwas Talk

Costs per Genotype

[Figure: cost per genotype ($1.00 down to $0.01) against number of SNPs per assay (1 to 10^6), 2001 to 2005. Platforms shown: ABI TaqMan, ABI SNPplex, Illumina GoldenGate, Affymetrix 10K, Affymetrix MegAllele, Illumina Infinium/Sentrix, Perlegen, Affymetrix 100K/500K, Illumina 2.5M.]

S. Chanock, NCI

Page 12: Epi519 Gwas Talk

Modified from http://www.ncbi.nlm.nih.gov/books/bv.fcgi?rid=helpsnpfaq

Page 13: Epi519 Gwas Talk

haplotypes

The International HapMap Consortium. Nature, Vol 437, 27 October 2005

Page 14: Epi519 Gwas Talk

“… to create a public, genome-wide database of common human sequence variation, providing information needed as a guide to genetic studies of clinical phenotypes.”

-- October 2002

Page 15: Epi519 Gwas Talk

Ben Fry, for Genome Research. November 2005

Page 16: Epi519 Gwas Talk

imputation

Use patterns of variation from HapMap to impute genotypes. Increases power by allowing for association testing at untyped markers and allows comparisons across studies and platforms by using a common set of SNPs.

Li, Willer, Sanna, Abecasis. Annu Rev Genomics Hum Genet. 2009;10:387-406

Page 17: Epi519 Gwas Talk

(SOME PRACTICAL CONSIDERATIONS)

Page 18: Epi519 Gwas Talk

genotyping

Page 19: Epi519 Gwas Talk

genotyping

Page 20: Epi519 Gwas Talk

genotyping: raw data

[Figure: genotype cluster plot. Axis: ratio of intensities from two channels. Calls = 2461, No calls = 27]

Page 21: Epi519 Gwas Talk

analysis × 2.5 million

Page 22: Epi519 Gwas Talk

association study

[Slide graphic: a wall of individual genotype calls (CC, CT, TT) at a single SNP, shown for controls and for cases.]

Odds ratio for C allele: 1.35, p = 6.3 x 10^-7
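A minimal sketch of how an allelic odds ratio and p-value like the one on this slide could be computed from genotype counts. The counts below are hypothetical (they are not the slide's data), the function name is made up for illustration, and the test shown is a plain 1-df allelic chi-square, which may differ from the test behind the slide's numbers.

```python
import math


def allelic_association(case_counts, control_counts):
    """Allelic odds ratio for C and a 1-df chi-square p-value.

    Each argument is a dict of genotype counts keyed by "CC", "CT", "TT".
    """
    def allele_counts(genotypes):
        c_alleles = 2 * genotypes["CC"] + genotypes["CT"]
        t_alleles = 2 * genotypes["TT"] + genotypes["CT"]
        return c_alleles, t_alleles

    a, b = allele_counts(case_counts)      # C and T alleles in cases
    c, d = allele_counts(control_counts)   # C and T alleles in controls

    odds_ratio = (a * d) / (b * c)

    # Pearson chi-square for the 2x2 allele-by-status table
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

    # Upper tail of a chi-square with 1 df: P(Z^2 > x) = erfc(sqrt(x / 2))
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return odds_ratio, chi2, p_value


# Hypothetical genotype counts, for illustration only
cases = {"CC": 300, "CT": 500, "TT": 200}
controls = {"CC": 250, "CT": 500, "TT": 250}

or_c, chi2, p = allelic_association(cases, controls)
print(f"OR for C allele = {or_c:.2f}, chi2 = {chi2:.2f}, p = {p:.2g}")
```

In a GWAS the same calculation is repeated at every SNP, which is why the next slides turn to the number of tests.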

Page 23: Epi519 Gwas Talk

Manhattan plot

(McCarthy et al.,Nature Reviews Genetics, May 2008)

Page 24: Epi519 Gwas Talk

p-value: the probability of seeing your data, or more extreme data, if the null hypothesis is true.

By chance, with 1,000,000 statistical tests:

•  a threshold of p = 0.05 would show 50,000 “significant” associations; 360 cases : 360 controls

•  a threshold of p = 0.05/1,000,000 (5 x 10^-8) would show 0.05 “significant” associations; 1590 cases : 1590 controls
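The arithmetic on this slide can be reproduced in a few lines. This is only a sketch of the expected number of chance findings under the null, assuming every test is null; the variable names are illustrative.

```python
n_tests = 1_000_000
alpha = 0.05

# Expected number of chance "hits" if each test uses p < 0.05
expected_nominal = n_tests * alpha

# Bonferroni-style threshold: divide alpha by the number of tests
bonferroni_threshold = alpha / n_tests

# Expected number of chance "hits" at the corrected threshold
expected_corrected = n_tests * bonferroni_threshold

print(f"expected false positives at p < 0.05: {expected_nominal:,.0f}")
print(f"corrected threshold: {bonferroni_threshold:.0e}")
print(f"expected false positives at corrected threshold: {expected_corrected:.2f}")
```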

Page 25: Epi519 Gwas Talk

study design considerations

Case-control or cohort
Sample size
Phenotype definition
Comparability of cases and controls
•  Genotyping quality
•  Population substructure
•  Laboratory procedures, genotyping, data cleaning

Page 26: Epi519 Gwas Talk

population stratification

requires both allele frequency and disease prevalence differences

Balding. Nature Reviews Genetics. 2006; 7:781-791
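A small worked sketch of why both differences are needed. It assumes two equally sized subpopulations and an allele with no effect on disease within either one; all frequencies and prevalences are made-up numbers, and the function name is illustrative.

```python
def pooled_allelic_or(allele_freqs, prevalences, weights):
    """Allelic odds ratio seen when subpopulations are pooled.

    The allele has no effect inside any subpopulation, so any departure
    from 1.0 is produced purely by the mixture of subpopulations.
    """
    # Share of cases / controls contributed by each subpopulation
    case_w = [w * d for w, d in zip(weights, prevalences)]
    ctrl_w = [w * (1 - d) for w, d in zip(weights, prevalences)]

    # Allele frequency among pooled cases and pooled controls
    p_case = sum(cw * f for cw, f in zip(case_w, allele_freqs)) / sum(case_w)
    p_ctrl = sum(cw * f for cw, f in zip(ctrl_w, allele_freqs)) / sum(ctrl_w)

    return (p_case / (1 - p_case)) / (p_ctrl / (1 - p_ctrl))


weights = [0.5, 0.5]  # hypothetical population mix

# Both allele frequency and prevalence differ: spurious association
both_differ = pooled_allelic_or([0.2, 0.6], [0.05, 0.15], weights)

# Only allele frequency differs: no spurious association
same_prevalence = pooled_allelic_or([0.2, 0.6], [0.10, 0.10], weights)

# Only prevalence differs: no spurious association
same_frequency = pooled_allelic_or([0.4, 0.4], [0.05, 0.15], weights)

print(both_differ, same_prevalence, same_frequency)
```

With these numbers the pooled odds ratio departs from 1 only in the first scenario, even though the allele is unrelated to disease in every subpopulation.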

Page 27: Epi519 Gwas Talk

(modified from McCarthy et al.,Nature Reviews Genetics, May 2008)

Q-Q plots
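A minimal sketch of how the points of a Q-Q plot are usually built: sorted observed -log10(p) values paired with the values expected under a uniform null. The function name and the (i - 0.5)/n plotting positions are assumptions chosen for illustration; published plots may use slightly different conventions.

```python
import math


def qq_points(p_values):
    """Return (expected, observed) pairs of -log10(p), smallest first.

    Assumes every p-value is strictly between 0 and 1.
    """
    n = len(p_values)
    observed = sorted(-math.log10(p) for p in p_values)
    # Expected quantiles of n uniform p-values under the null
    expected = sorted(-math.log10((i - 0.5) / n) for i in range(1, n + 1))
    return list(zip(expected, observed))


# Illustration: p-values that sit exactly on the null quantiles
n = 1000
null_like = [(i - 0.5) / n for i in range(1, n + 1)]
points = qq_points(null_like)

# Under the null the points follow the diagonal; systematic departure
# along the whole line suggests stratification or other artifacts,
# while departure only in the upper tail suggests true associations.
print(points[-1])
```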

Page 28: Epi519 Gwas Talk

TA Manolio et al. Nature 461, 747-753 (2009) doi:10.1038/nature08494

Feasibility of identifying genetic variants by risk allele frequency and strength of genetic effect (odds ratio).

Allele frequency & effect size

Page 29: Epi519 Gwas Talk

reasons for larger sample size: • More genotypes / tests

• More genotype error or misclassification

• Higher heterogeneity of association

• Lower effect size

• Lower frequency of risk allele

• Lower correlation between marker allele and risk allele.
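Several of these reasons can be seen in a standard normal-approximation sample size formula for comparing allele frequencies between cases and controls. This is a rough sketch under simplifying assumptions (allelic test, alleles treated as independent, no genotyping error, the typed marker is the risk allele); the function name and parameters are illustrative and are not taken from the talk.

```python
import math
from statistics import NormalDist


def alleles_per_group(p_control, odds_ratio, alpha, power):
    """Approximate number of alleles needed in each group (cases, controls).

    Divide by 2 for an approximate number of individuals per group.
    """
    # Risk-allele frequency in cases implied by the allelic odds ratio
    odds_case = odds_ratio * p_control / (1 - p_control)
    p_case = odds_case / (1 + odds_case)
    p_bar = (p_case + p_control) / 2

    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided threshold
    z_power = NormalDist().inv_cdf(power)

    numerator = (
        z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
        + z_power * math.sqrt(p_case * (1 - p_case) + p_control * (1 - p_control))
    ) ** 2
    return math.ceil(numerator / (p_case - p_control) ** 2)


# More tests -> stricter threshold -> more samples
n_single_test = alleles_per_group(0.3, 1.35, alpha=0.05, power=0.8)
n_genome_wide = alleles_per_group(0.3, 1.35, alpha=5e-8, power=0.8)

# Lower effect size and lower risk-allele frequency also need more samples
n_small_effect = alleles_per_group(0.3, 1.15, alpha=5e-8, power=0.8)
n_rare_allele = alleles_per_group(0.05, 1.35, alpha=5e-8, power=0.8)

print(n_single_test, n_genome_wide, n_small_effect, n_rare_allele)
```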

Page 30: Epi519 Gwas Talk

power & sample size

(Rice, personal communication)

Page 31: Epi519 Gwas Talk

Multi-stage discovery

Carry forward a large number of potential associations through multiple, narrowing stages.

Protect against false positives via replication

Minimize false negative results via permissive early thresholds

From Hoover, R. Epidemiology. 18(1):13-17, January 2007.

Page 32: Epi519 Gwas Talk

Meta-analysis

Combine results from several studies to increase power using traditional methods of meta-analysis.

Allows for first stage discovery of small effect sizes
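One traditional method is inverse-variance weighted fixed-effect meta-analysis, sketched below. The slide does not say which method a given consortium used, so treat this as one common choice; the study estimates are hypothetical and the function name is illustrative.

```python
import math


def fixed_effect_meta(estimates):
    """Inverse-variance weighted fixed-effect meta-analysis.

    estimates: list of (beta, se) pairs, where beta is a per-study
    log odds ratio and se is its standard error.
    """
    weights = [1 / se ** 2 for _, se in estimates]
    pooled_beta = sum(w * b for w, (b, _) in zip(weights, estimates)) / sum(weights)
    pooled_se = math.sqrt(1 / sum(weights))
    z = pooled_beta / pooled_se
    # Two-sided p-value from the standard normal distribution
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return pooled_beta, pooled_se, z, p_value


# Three hypothetical studies, each too small to be convincing alone
studies = [(0.30, 0.15), (0.30, 0.15), (0.30, 0.15)]
beta, se, z, p = fixed_effect_meta(studies)

single_p = math.erfc(abs(0.30 / 0.15) / math.sqrt(2))
print(f"pooled OR = {math.exp(beta):.2f}, p = {p:.2g} (single study p = {single_p:.2g})")
```

Pooling leaves the effect estimate unchanged here but shrinks its standard error, which is how small effects become detectable at the first stage.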

Page 33: Epi519 Gwas Talk

(SELECTED EXAMPLES)

Page 34: Epi519 Gwas Talk

Wellcome Trust Case Control Consortium

One of the biggest projects undertaken to identify genetic variation that may be associated with disease
£9 million in funding from Wellcome Trust
GWAS of seven common diseases: 2,000 cases each and 3,000 shared controls
All genotyping data available to scientific community

www.wtccc.org.uk; (Nature, vol 447, 7 June 2007)

Page 35: Epi519 Gwas Talk

Lon Cardon, SISG 2007

Page 36: Epi519 Gwas Talk

UK Control groups are NOT very different

Lon Cardon, SISG 2007

Page 37: Epi519 Gwas Talk

WTCCC Results

Page 38: Epi519 Gwas Talk

WTCCC results

Samani, NEJM 2007

Page 39: Epi519 Gwas Talk

WTCCC results

Page 40: Epi519 Gwas Talk

Coronary Disease GWAS: 9p21

author           McPherson              Helgadottir                     Samani                                Larson
where            Science                Science                         NEJM                                  BMC Med Gen
when             May 2007               May 2007                        August 2007                           Sept 2007
design           3-stage case control   case-control                    case control                          cohort
discovery        OHS 1, OHS 2, ARIC     deCODE: Iceland A               WTCCC                                 Framingham Heart Study
replication      CCHS, DHS, OHS-3       Iceland B, 3 U.S. case-control  German Family Study
case definition  severe premature CHD   MI                              MI or revascularization + fhx of CAD  incident MI
age at onset     <60                    <70 males, <75 females          <66

Page 41: Epi519 Gwas Talk

9p21 results

Helgadottir, Science 2007; McPherson, Science 2007

study               SNP         locus  hazard/odds ratio                              PAR
ARIC                rs10757274  9p21   AB: 1.18 (1.02-1.37); BB: 1.29 (1.09-1.52)     12-15%
CCHS                rs10757274  9p21   AB: 1.26 (1.12-1.42); BB: 1.38 (1.19-1.60)     10-13%
deCODE              rs10757278  9p21   AB: 1.26 (1.16-1.36); BB: 1.64 (1.47-1.82)     21%
deCODE early onset  rs10757278  9p21   AB: 1.49 (1.31-1.69); BB: 2.02 (1.72-2.36)     31%
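The PAR column (population attributable risk) combines how common the risk genotypes are with how much they raise risk. A minimal sketch using Levin's formula extended to several genotype categories follows; the genotype frequencies are hypothetical, the hazard ratios are used as stand-ins for relative risks, and the cited papers may have computed PAR differently.

```python
def attributable_risk(genotype_freqs, relative_risks):
    """Population attributable risk for several exposure categories.

    genotype_freqs: population frequency of each risk genotype (e.g. AB, BB)
    relative_risks: risk of each genotype relative to the reference (AA)
    """
    excess = sum(f * (rr - 1) for f, rr in zip(genotype_freqs, relative_risks))
    return excess / (1 + excess)


# Hypothetical genotype frequencies for AB and BB (not from the slide),
# paired with the ARIC hazard ratios shown in the table
freqs = [0.50, 0.25]
ratios = [1.18, 1.29]

par = attributable_risk(freqs, ratios)
print(f"PAR = {par:.1%}")
```

The point of the sketch is that a modest relative risk can still give a sizeable PAR when the risk genotypes are carried by most of the population.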

Page 42: Epi519 Gwas Talk

9p21 Gene Region

Page 43: Epi519 Gwas Talk

9p21 locus

not located within a “gene”
region contains CDKN2A and CDKN2B genes
•  role in cell proliferation, cell aging and apoptosis - important features of atherogenesis
•  Sequencing did not reveal obvious candidates
may implicate a previously unrecognized gene or regulatory element
same region also associated with type 2 diabetes

Page 44: Epi519 Gwas Talk

MIGen: Population

Study                                    Cases (mean age)   Controls (mean age)
Italian ATVB (Italy)                     1,693 (39 y)       1,668 (39 y)
Heart Attack Risk in Puget Sound (USA)   505 (46 y)         559 (45 y)
REGICOR (Spain)                          312 (46 y)         317 (46 y)
MGH (USA)                                204 (47 y)         260 (54 y)
FINRISK (Finland)                        167 (47 y)         172 (47 y)
Malmö Diet & Cancer (Sweden)             86 (47 y)          99 (49 y)

Page 45: Epi519 Gwas Talk

MIGen: Design

Page 46: Epi519 Gwas Talk

MIGen: Early MI SNPs

   Locus     genes of interest
✔  1p13      CELSR2-PSRC1-SORT1
✔  1q41      MIA3
✘  2p36      --
✪  2q33      WDR12
✪  3q22.3    MRAS
✪  6p24.1    PHACTR1
✘  6q25      MTHFD1L
✔  9p21      CDKN2A-CDKN2B
✔  10q11     CXCL12
✘  15q22     SMAD3
✪  19p13.2   LDLR
✪  21q22     MRPS6-SLC5A3-KCNE2

Page 47: Epi519 Gwas Talk

MIGen: Risk Score

4 replicated loci: CDKN2A-B, CELSR2-PSRC1-SORT1, MIA3, CXCL12

5 new loci: SLC5A3-MRPS6-KCNE2, PHACTR1, WDR12, LDLR, PCSK9

Page 48: Epi519 Gwas Talk

(LESSONS, QUESTIONS, DIRECTIONS)

Page 49: Epi519 Gwas Talk

Published Genome-Wide Associations through 6/2010: 904 published GWA at p < 5 x 10^-8 for 165 traits

NHGRI GWA Catalog www.genome.gov/GWAStudies

Page 50: Epi519 Gwas Talk

new biology: genomic context

Manolio, NEJM 2010

Page 51: Epi519 Gwas Talk

new biology: mechanisms

Manolio, NEJM 2010

Page 52: Epi519 Gwas Talk

new biology: connections

Manolio, NEJM 2010

Page 53: Epi519 Gwas Talk

missing heritability (2009)

Manolio, Nature 2009

trait                              number of loci   % of heritability explained
Age-related macular degeneration   5                50%
Crohn’s disease                    32               20%
Type-2 diabetes                    18               6%
HDL cholesterol                    7                5%
Height                             40               5%
Early-onset MI                     9                2.8%
Fasting glucose                    4                1.5%

Page 54: Epi519 Gwas Talk

missing heritability

many variants with small effects yet to be found
•  larger sample sizes have revealed more loci
true positives below significance threshold
contribution of rare variants
failure to identify true causal variant
structural variants poorly captured by arrays
previous estimates of heritability flawed
GxG or GxE interactions

Page 55: Epi519 Gwas Talk

missing heritability (update)

Meta-analysis of > 100,000 individuals discovers 59 new associations
SNPs explain ~12% of trait variability & ~25% of heritability
Some predict MI risk; point to LDL/HDL differences

Page 56: Epi519 Gwas Talk

disease prediction

hope: highly predictive and affordable genetic tests

reality: low discriminatory and predictive ability

Manolio, NEJM 2010

Page 57: Epi519 Gwas Talk

next steps

Ever larger sample sizes
Studies of non-European ethnic populations
Sequencing implicated genetic regions
More complex genetic models
•  Gene x Gene interactions
•  pooling of rare variants
Functional biology: work in basic science and animal models

Page 58: Epi519 Gwas Talk

summary

GWAS have led to new biology
Small effect sizes
Not useful in prediction
Much yet to be discovered
More complicated than we thought

Don’t forget:
•  case definition
•  QC measures
•  sample size and power
•  multiple testing
•  independent replication

Page 59: Epi519 Gwas Talk

“There have been few, if any, similar bursts of discovery in the history of medical research”

-- “Drinking from the fire hose …” (Hunter & Kraft)

Page 60: Epi519 Gwas Talk

Consumer Genotyping Toys

Page 61: Epi519 Gwas Talk

Consumer Genotyping Toys

Page 62: Epi519 Gwas Talk

Consumer Genotyping Toys

Page 63: Epi519 Gwas Talk

Consumer Genotyping Toys

Page 64: Epi519 Gwas Talk

Sources / References / Reading

1.  The International HapMap Consortium.* A haplotype map of the human genome. Nature, 2005. 437(7063): p. 1299-320. [16255080]
2.  The Type 1 Diabetes Genetics Consortium.* Genome-wide association study and meta-analysis find that over 40 loci affect risk of type 1 diabetes. Nature Genetics, 2009 May 10. [19430480]
3.  Myocardial Infarction Genetics Consortium.* Genome-wide association of early-onset myocardial infarction with single nucleotide polymorphisms and copy number variants. Nat Genet. 2009 Mar;41(3):334-41. [19198609]
4.  The Wellcome Trust Case Control Consortium.* Genome-wide association study of 14,000 cases of seven common diseases and 3,000 shared controls. Nature, 2007. 447(7145): p. 661-78. [17554300]
5.  Balding, D.J., A tutorial on statistical methods for population association studies. Nat Rev Genet, 2006. 7(10): p. 781-91. [16983374]
6.  Christensen, K. and J.C. Murray, What genome-wide association studies can do for medicine. N Engl J Med, 2007. 356(11): p. 1094-7. [17360987]
7.  Frazer, K.A., et al., A second generation human haplotype map of over 3.1 million SNPs. Nature, 2007. 449(7164): p. 851-61. [17943122]
8.  Hirschhorn, J.N., et al., A comprehensive review of genetic association studies. Genet Med, 2002. 4(2): p. 45-61. [11882781]
9.  Hoover, R. The evolution of epidemiologic research: from cottage industry to "big" science. Epidemiology. 2007 Jan;18(1):13-7. [17179754]
10.  Hunter, D.J. and P. Kraft, Drinking from the fire hose--statistical issues in genomewide association studies. N Engl J Med, 2007. 357(5): p. 436-9. [17634446]
11.  Li Y, Willer C, Sanna S, Abecasis G., Genotype imputation. Annu Rev Genomics Hum Genet. 2009;10:387-406. [19715440]
12.  Johnson AD and O’Donnell CJ: Open access database of GWA results, BMC Medical Genetics 2009: 10:6
13.  Manolio, T.A., et al., Genetics of ultrasonographic carotid atherosclerosis. Arterioscler Thromb Vasc Biol, 2004. 24(9): p. 1567-77. [15256397]
14.  Manolio, T.A., L.D. Brooks, and F.S. Collins, A HapMap harvest of insights into the genetics of common disease. J Clin Invest, 2008. 118(5): p. 1590-605. [18451988]
15.  Manolio TA, Collins FS, Cox NJ, et al. Finding the missing heritability of complex diseases. Nature. 2009 Oct 8;461(7265):747-53. [19812666]
16.  Manolio, TA. Genomewide association studies and assessment of the risk of disease. N Engl J Med. 2010 Jul 8;363(2):166-76. [20647212]
17.  McCarthy, M.I., et al., Genome-wide association studies for complex traits: consensus, uncertainty and challenges. Nat Rev Genet, 2008. 9(5): p. 356-69. [18398418]
18.  Pearson, T.A. and T.A. Manolio, How to interpret a genome-wide association study. JAMA, 2008. 299(11): p. 1335-44. [18349094]
19.  Samani NJ, Erdmann J, Hall AS, et al. Genomewide association analysis of coronary artery disease. N Engl J Med. 2007 Aug 2;357(5):443-53. [17634449]
20.  Teslovich TM, Musunuru K, Smith AV, et al. Biological, clinical and population relevance of 95 loci for blood lipids. Nature. 2010 Aug 5;466(7307):707-13. [20686565]
21.  NHGRI catalog of published GWA studies (http://genome.gov/GWASstudies)

Page 65: Epi519 Gwas Talk