On genome-wide association studies (GWAS)

Upload: kieu

Post on 13-Jan-2016

Page 1: On genome-wide association studies (GWAS)

On genome-wide association studies (GWAS)

Page 2: On genome-wide association studies (GWAS)

•association

•linkage disequilibrium

•population structure

Page 4: On genome-wide association studies (GWAS)

•case/control design

•single nucleotide polymorphism data

Page 5: On genome-wide association studies (GWAS)

Chromosome 1

TTCAGTCAGATCC T AGCCC

AAGTCAGTCTAGG A TCGGG

Chromosome 2

TTCAGTCAGATCC C AGCCC

AAGTCAGTCTAGG G TCGGG

SNP

Page 11: On genome-wide association studies (GWAS)

Population structure explained part of the significant +11.2% inflation of test statistics we observed in an analysis of 6,322 nonsynonymous SNPs in 816 cases of type 1 diabetes and 877 population-based controls from Great Britain. The remainder of the inflation resulted from differential bias in genotype scoring between case and control DNA samples, which originated from two laboratories, causing false-positive associations.

Nature Genetics 37, 1243-1246 (2005) | Published online: 9 October 2005 | doi:10.1038/ng1653

Population structure, differential bias and genomic control in a large-scale, case-control association study

David G Clayton, Neil M Walker, Deborah J Smyth, Rebecca Pask, Jason D Cooper, Lisa M Maier, Luc J Smink, Alex C Lam, Nigel R Ovington, Helen E Stevens, Sarah Nutland, Joanna M M Howson, Malek Faham, Martin Moorhead, Hywel B Jones, Matthew Falkowski, Paul Hardenbol, Thomas D Willis & John A Todd

Page 12: On genome-wide association studies (GWAS)

•premise: population structure inflates the variance of the test statistic under the null

•ideally, Y_i^2 ~ chi-square(1)

•with structure, Y_i^2 ~ lambda * chi-square(1), where lambda is the inflation factor

•so use T_i = Y_i^2 / lambda.hat

•lambda.hat = median(Y_i^2) / [null median of chi-square(1), about 0.455]

Genomic Control (Devlin and Roeder)
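
The genomic control recipe on this slide can be sketched in a few lines. This is a minimal illustration, not the authors' code: the function name `genomic_control` and the simulated statistics are assumptions made for the example, and the null median is hard-coded as the median of a chi-square distribution with 1 degree of freedom.

```python
import numpy as np

# Median of a chi-square distribution with 1 degree of freedom.
CHI2_1_MEDIAN = 0.4549364


def genomic_control(stats):
    """Return (lambda_hat, T) for an array of 1-df test statistics Y_i^2.

    lambda_hat = median(Y_i^2) / null median, and T_i = Y_i^2 / lambda_hat.
    """
    stats = np.asarray(stats, dtype=float)
    lambda_hat = np.median(stats) / CHI2_1_MEDIAN
    return lambda_hat, stats / lambda_hat


# Hypothetical data: null statistics whose variance is inflated by 10%.
rng = np.random.default_rng(0)
y = rng.standard_normal(100_000) * np.sqrt(1.1)
lambda_hat, adjusted = genomic_control(y ** 2)
print(f"estimated inflation factor: {lambda_hat:.3f}")
```

The median is used rather than the mean so that a small number of truly associated SNPs with very large statistics has little influence on the estimate.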

Page 13: On genome-wide association studies (GWAS)

•genomic control (Devlin & Roeder)

•structured association (Pritchard et al.)

•principal components (Price et al.)

Handling population structure

Page 14: On genome-wide association studies (GWAS)

Article | Nature 447, 661-678 (7 June 2007) | doi:10.1038/nature05911 | Received 26 March 2007; Accepted 11 May 2007

Genome-wide association study of 14,000 cases of seven common diseases and 3,000 shared controls

The Wellcome Trust Case Control Consortium

Page 15: On genome-wide association studies (GWAS)

•UK population; European ancestry

•seven diseases (BD, CAD, CD, HT, RA, T1D, T2D); 50 research groups

•2,000 cases per disease

•3,000 common controls (two distinct sets)

•Affymetrix 500K mapping array set

Page 16: On genome-wide association studies (GWAS)

•16,179 samples included (809 dropped for reasons such as contamination and non-Caucasian ancestry)

•469,557 SNPs included (93.8%)

•Average call rate 99.63%

•392,575 SNPs have minor allele frequency (MAF) > 1%

Quality Control
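
Per-SNP summaries like the call rate and MAF above can be computed from a genotype matrix roughly as follows. This is only a sketch under stated assumptions: genotypes are coded 0/1/2 with NaN for a missing call, the function name `snp_qc` is made up for the example, and the 0.95 call-rate threshold is a placeholder rather than the criterion used in the study (only the MAF > 1% cut appears on the slide).

```python
import numpy as np


def snp_qc(genotypes, min_call_rate=0.95, min_maf=0.01):
    """Per-SNP call rate, minor allele frequency and a keep/drop mask.

    genotypes: samples x SNPs array of allele counts (0, 1, 2), NaN = no call.
    min_call_rate is a hypothetical threshold chosen for illustration.
    """
    g = np.asarray(genotypes, dtype=float)
    called = ~np.isnan(g)
    call_rate = called.mean(axis=0)

    # Frequency of the counted allele among the called genotypes only.
    n_called = called.sum(axis=0)
    with np.errstate(invalid="ignore", divide="ignore"):
        freq = np.nansum(g, axis=0) / (2 * n_called)
    maf = np.minimum(freq, 1 - freq)

    keep = (call_rate >= min_call_rate) & (maf > min_maf)
    return call_rate, maf, keep


# Toy matrix: 4 samples x 4 SNPs.
toy = np.array([
    [0, 1, 2, 0],
    [0, 1, np.nan, 0],
    [1, 0, np.nan, 0],
    [2, 2, np.nan, 0],
])
call_rate, maf, keep = snp_qc(toy)
print(keep)  # SNP 3 fails on call rate, SNP 4 is monomorphic
```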

Page 22: On genome-wide association studies (GWAS)

There may be important population structure that is not well captured by current geographical region of residence. Present implementations of strongly model-based approaches such as STRUCTURE (refs 11, 12) are impracticable for data sets of this size, and we reverted to the classical method of principal components (refs 13, 14), using a subset of 197,175 SNPs chosen to reduce inter-locus linkage disequilibrium. Nevertheless, four of the first six principal components clearly picked up effects attributable to local linkage disequilibrium rather than genome-wide structure. The remaining two components show the same predominant geographical trend from NW to SE but, perhaps unsurprisingly, London is set somewhat apart.
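
As a rough illustration of the principal components approach, the sketch below standardizes each SNP by its allele frequency and takes the leading components from a singular value decomposition. It is not the consortium's pipeline: the function name `genotype_pcs`, the simulated two-population data and the absence of any LD-based SNP selection or missing-data handling are all simplifying assumptions.

```python
import numpy as np


def genotype_pcs(genotypes, n_components=2):
    """Leading principal components of a samples x SNPs matrix of 0/1/2 counts.

    Assumes no missing genotypes; monomorphic SNPs are dropped.
    """
    g = np.asarray(genotypes, dtype=float)
    p = g.mean(axis=0) / 2.0                      # estimated allele frequency
    polymorphic = (p > 0) & (p < 1)
    g, p = g[:, polymorphic], p[polymorphic]

    # Centre by 2p and scale by the binomial standard deviation sqrt(2p(1-p)).
    z = (g - 2 * p) / np.sqrt(2 * p * (1 - p))
    u, s, _ = np.linalg.svd(z, full_matrices=False)
    return u[:, :n_components] * s[:n_components]


# Hypothetical data: two subpopulations whose allele frequencies differ by 0.2.
rng = np.random.default_rng(1)
base = rng.uniform(0.2, 0.8, size=500)
labels = np.repeat([0, 1], 100)
freqs = np.where(labels[:, None] == 0, base + 0.1, base - 0.1)
geno = rng.binomial(2, freqs)

pcs = genotype_pcs(geno)
r = np.corrcoef(pcs[:, 0], labels)[0, 1]
print(f"|correlation| of PC1 with population label: {abs(r):.2f}")
```

In this toy setting the first component separates the two simulated groups; in real data, components can also pick up local linkage disequilibrium, as the passage above notes.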

Page 23: On genome-wide association studies (GWAS)

The overall effect of population structure on our association results seems to be small, once recent migrants from outside Europe are excluded. Estimates of over-dispersion of the association trend test statistics (usually denoted λ; ref. 15) ranged from 1.03 and 1.05 for RA and T1D, respectively, to 1.08–1.11 for the remaining diseases. Some of this over-dispersion could be due to factors other than structure, and this possibility is supported by the fact that inclusion of the two ancestry informative principal components as covariates in the association tests reduced the over-dispersion estimates only slightly (Supplementary Table 6), as did stratification by geographical region. This impression is confirmed on noting that P values with and without correction for structure are similar (Supplementary Fig. 9). We conclude that, for most of the genome, population structure has at most a small confounding effect in our study, and as a consequence the analyses reported below do not correct for structure. In principle, apparent associations in the few genomic regions identified in Table 1 as showing strong geographical differentiation should be interpreted with caution, but none arose in our analyses.
