GWAS for quantitative traits, Peter M. Visscher (peter.visscher@qimr.edu.au), Queensland Institute of Medical Research


Page 1: GWAS for quantitative traits - Bioinformaticsbioinformatics.org.au/ws09/presentations/Day3_PVisscher.pdf · GWAS for quantitative traits Peter M. Visscher peter.visscher@qimr.edu.au

GWAS for quantitative traits

Peter M. Visscher

peter.visscher@qimr.edu.au

Queensland Institute of Medical Research


Overview

• Darwin and Mendel

• Background: population genetics

• Background: quantitative genetics

• GWAS

– Examples

– Analysis

– Statistical power


[Galton, 1889]


Mendelian Genetics

Following a single (or several) genes that we can directly score

Phenotype highly informative as to genotype


Darwin & Mendel

• Darwin (1859), Origin of Species

– Instant classic, major immediate impact

– Problem: model of inheritance

• Darwin assumed blending inheritance

• Offspring = average of both parents

• $z_o = (z_m + z_f)/2$

• Fleeming Jenkin (1867) pointed out the problem

– $\mathrm{Var}(z_o) = \mathrm{Var}[(z_m + z_f)/2] = \tfrac{1}{2}\,\mathrm{Var}(\text{parents})$

– Hence, under blending inheritance, half the variation is removed each generation and this must somehow be replenished by mutation.
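The halving of variance under blending inheritance can be checked with a small simulation. This is only a sketch: the function names, the normally distributed starting population, the population size and the seed are illustrative assumptions, not part of the slides.

```python
import random


def variance(values):
    """Population variance of a list of numbers."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def blend_generation(parents, rng):
    """Each offspring is the midpoint of two parents drawn at random."""
    return [(rng.choice(parents) + rng.choice(parents)) / 2 for _ in parents]


def variance_trajectory(n=20000, generations=5, seed=1):
    """Trait variance in each generation under blending inheritance."""
    rng = random.Random(seed)
    pop = [rng.gauss(0.0, 1.0) for _ in range(n)]  # assumed starting population
    variances = [variance(pop)]
    for _ in range(generations):
        pop = blend_generation(pop, rng)
        variances.append(variance(pop))
    return variances


if __name__ == "__main__":
    for t, v in enumerate(variance_trajectory()):
        print(f"generation {t}: variance = {v:.4f}")
```

Each printed variance should be roughly half of the previous one, up to sampling noise.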

Mendel

• Mendel (1865), Experiments in Plant Hybridization

• No impact, paper essentially ignored

– Ironically, Darwin had an apparently unread copy in his library

– Why ignored? Perhaps too mathematical for 19th century biologists

• Rediscovery in 1900 (by three independent groups)

• Mendel’s key idea: Genes are discrete particles passed on intact from parent to offspring


The height vs. pea debate

(early 1900s)

Do quantitative traits have the same hereditary and evolutionary properties as discrete characters?

Biometricians Mendelians

RA Fisher (1918). Transactions of the Royal Society of Edinburgh 52: 399-433.

[Figure: trait distributions for genotypes QQ, Qq and qq, with the values m-a, m+d and m+a marked on the trait axis]

Population Genetics

• Allele and genotype frequencies

• Hardy-Weinberg Equilibrium

• Linkage (dis)equilibrium


Allele and Genotype Frequencies

Given genotype frequencies, we can always compute allele frequencies, e.g.,

$$p_i = \mathrm{freq}(A_iA_i) + \frac{1}{2}\sum_{j \neq i} \mathrm{freq}(A_iA_j)$$

The converse is not true: given allele frequencies we cannot uniquely determine the genotype frequencies

For n alleles, there are n(n+1)/2 genotypes

If we are willing to assume random mating, genotypes are in Hardy-Weinberg proportions:

$$\mathrm{freq}(A_iA_j) = \begin{cases} p_i^2 & \text{for } i = j \\ 2p_ip_j & \text{for } i \neq j \end{cases}$$
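As a rough illustration of the two formulas on this slide, the sketch below computes allele frequencies from genotype frequencies and then the Hardy-Weinberg proportions implied by those allele frequencies. The function names, the dictionary layout and the example frequencies are assumptions made for the example.

```python
from itertools import combinations_with_replacement


def allele_frequencies(genotype_freqs):
    """genotype_freqs maps unordered allele pairs, e.g. ("A", "a"), to frequencies."""
    p = {}
    for (a1, a2), f in genotype_freqs.items():
        # A homozygote adds f to its allele; a heterozygote adds f/2 to each allele.
        p[a1] = p.get(a1, 0.0) + f / 2
        p[a2] = p.get(a2, 0.0) + f / 2
    return p


def hardy_weinberg(allele_freqs):
    """Genotype frequencies expected under random mating."""
    hw = {}
    for a1, a2 in combinations_with_replacement(sorted(allele_freqs), 2):
        if a1 == a2:
            hw[(a1, a2)] = allele_freqs[a1] ** 2
        else:
            hw[(a1, a2)] = 2 * allele_freqs[a1] * allele_freqs[a2]
    return hw


if __name__ == "__main__":
    observed = {("A", "A"): 0.5, ("A", "a"): 0.2, ("a", "a"): 0.3}  # made-up values
    p = allele_frequencies(observed)
    print(p)
    print(hardy_weinberg(p))
```

The made-up genotype frequencies are not in Hardy-Weinberg proportions, which illustrates that allele frequencies alone do not determine genotype frequencies.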

Hardy-Weinberg

• Prediction of genotype frequencies from allele freqs

• Allele frequencies remain unchanged over generations, provided:

• Infinite population size (no genetic drift)

• No mutation

• No selection

• No migration

• Under HW conditions, a single generation of random mating gives genotype frequencies in Hardy-Weinberg proportions, and they remain forever in these proportions

QC in GWAS studies
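The slide does not say which test is used for QC. One common choice, assumed here, is a 1-df chi-square goodness-of-fit test of observed genotype counts against Hardy-Weinberg expectations. The function name and the example counts are illustrative.

```python
import math


def hwe_chisq(n_AA, n_Aa, n_aa):
    """Chi-square test (1 df) of genotype counts against HW proportions."""
    n = n_AA + n_Aa + n_aa
    p = (2 * n_AA + n_Aa) / (2 * n)  # frequency of allele A
    q = 1 - p
    if p == 0 or q == 0:
        raise ValueError("SNP is monomorphic; the test is undefined")
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_AA, n_Aa, n_aa)
    chisq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Upper tail of a chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2))
    p_value = math.erfc(math.sqrt(chisq / 2))
    return chisq, p_value


if __name__ == "__main__":
    print(hwe_chisq(360, 480, 160))  # counts in HW proportions
    print(hwe_chisq(500, 0, 500))    # no heterozygotes at all
```

In practice an exact test is often preferred for small counts; the chi-square version is shown only because it follows directly from the HW proportions.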


Linkage equilibrium

Random mating and recombination eventually change gamete frequencies so that they are in linkage equilibrium (LE).

Once in LE, gamete frequencies do not change (unless acted on by other forces)

At LE, alleles in gametes are independent of each other:

freq(AB) = freq(A) * freq(B)

freq(ABC) = freq(A) * freq(B) * freq(C)

Linkage disequilibrium

When linkage disequilibrium (LD) is present, alleles are no longer independent; knowing that one allele is in the gamete provides information on alleles at other loci:

freq(AB) ≠ freq(A) * freq(B)

The disequilibrium between alleles A and B is given by

D_AB = freq(AB) - freq(A) * freq(B)

GWAS relies on LD between markers and causal variants
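A minimal sketch of the disequilibrium coefficient defined above. The squared correlation r2 is not defined on this slide; the standard definition D^2 / [p_A(1 - p_A) p_B(1 - p_B)] is assumed here, and the function name is illustrative.

```python
def ld_stats(f_AB, f_A, f_B):
    """Return (D_AB, r2) from gamete and allele frequencies."""
    d = f_AB - f_A * f_B
    # Assumed standard definition of the squared LD correlation.
    r2 = d * d / (f_A * (1 - f_A) * f_B * (1 - f_B))
    return d, r2


if __name__ == "__main__":
    print(ld_stats(0.25, 0.5, 0.5))  # linkage equilibrium
    print(ld_stats(0.50, 0.5, 0.5))  # A always travels with B
```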

[Figure: two panels of example haplotypes. Linkage equilibrium: all four haplotypes Q1 M1, Q2 M2, Q1 M2 and Q2 M1 occur. Linkage disequilibrium: only Q1 M1 and Q2 M2 occur.]

The Decay of Linkage Disequilibrium

The frequency of the AB gamete is given by

freq(AB) = freq(A) * freq(B) + D_AB

If the recombination frequency between the A and B loci is c, the disequilibrium in generation t is

D(t) = D(0) (1 - c)^t

Note that D(t) -> zero, although the approach can be slow when c is very small

[Figure: LD (y-axis, 0.00 to 1.00) against generation (x-axis, 0 to 100) for c = 0.10, c = 0.01 and c = 0.001]

NB: Gene mapping & GWAS
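The decay formula can be evaluated directly for the three recombination frequencies shown in the plot. This is a sketch; the function names and the half-life helper are illustrative additions.

```python
import math


def ld_after(d0, c, t):
    """Disequilibrium after t generations: D(t) = D(0) * (1 - c)^t."""
    return d0 * (1 - c) ** t


def generations_to_fraction(c, fraction):
    """Smallest whole number of generations t with (1 - c)^t <= fraction."""
    return math.ceil(math.log(fraction) / math.log(1 - c))


if __name__ == "__main__":
    for c in (0.10, 0.01, 0.001):
        remaining = ld_after(1.0, c, 100)
        half_life = generations_to_fraction(c, 0.5)
        print(f"c = {c}: D(100)/D(0) = {remaining:.4f}, half-life = {half_life} generations")
```

For tightly linked loci (c = 0.001) most of the initial LD is still present after 100 generations, which is why LD between markers and nearby causal variants can be exploited for mapping.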

Forces that Generate LD

• Drift (finite population size)

• Selection

• Migration (admixture)

• Mutation

• Population structure (stratification)

Effective population size determines the number of markers needed for GWAS

Quantitative Genetics

The analysis of traits whose variation is determined by both a number of genes and environmental factors

Phenotype is highly uninformative as to underlying genotype

[Figure: trait distributions for genotypes QQ, Qq and qq, with the values m-a, m+d and m+a marked on the trait axis]


Complex (or Quantitative) trait

• No (apparent) simple Mendelian basis for variation in the trait

• May be a single gene strongly influenced by environmental factors

• May be the result of a number of genes of equal (or differing) effect

• Most likely, a combination of both multiple genes and environmental factors.

• Examples: blood pressure, cholesterol levels, IQ, height, etc.


Basic model of Quantitative Genetics

Basic model: P = G + E

G = average phenotypic value for that genotype if we are able to replicate it over the universe of environmental values, G = E[P]

G x E interaction: G values are different across environments. Basic model now becomes P = G + E + GE

Biometrical model for single diallelic Quantitative Trait Locus (QTL)

Contribution of the QTL to the Mean (X)

Genotypes            AA     Aa     aa
Frequencies, f(x)    p^2    2pq    q^2
Effect, x            a      d      -a

$$\mu = \sum_i x_i f(x_i) = a(p^2) + d(2pq) - a(q^2)$$

Mean (X) = a(p-q) + 2pqd
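The mean can be computed as the frequency-weighted sum of genotype effects and compared with the closed form a(p-q) + 2pqd. A small sketch, with illustrative function names and example values:

```python
def qtl_mean(p, a, d):
    """Mean contribution of a diallelic QTL under HW proportions."""
    q = 1 - p
    freqs = (p * p, 2 * p * q, q * q)  # AA, Aa, aa
    effects = (a, d, -a)
    return sum(f * x for f, x in zip(freqs, effects))


def qtl_mean_closed_form(p, a, d):
    """Closed form from the slide: a(p - q) + 2pqd."""
    q = 1 - p
    return a * (p - q) + 2 * p * q * d


if __name__ == "__main__":
    p, a, d = 0.3, 1.0, 0.5  # made-up values
    print(qtl_mean(p, a, d), qtl_mean_closed_form(p, a, d))
```

The two agree because p^2 - q^2 = (p - q)(p + q) = p - q.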


Example: Apolipoprotein E & Alzheimer’s

Genotype ee Ee EE

Average age of onset 68.4 75.5 84.3

2a = G(EE) - G(ee) = 84.3 - 68.4 --> a = 7.95

d = G(Ee) - [G(EE) + G(ee)]/2 = -0.85

d/a = -0.11: only a small amount of dominance
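The same arithmetic as a short sketch, using the three genotype means from the slide. The function name is illustrative.

```python
def additive_dominance(g_low, g_het, g_high):
    """Return (a, d) from the genotypic values of the two homozygotes and the heterozygote."""
    a = (g_high - g_low) / 2
    d = g_het - (g_high + g_low) / 2
    return a, d


if __name__ == "__main__":
    a, d = additive_dominance(68.4, 75.5, 84.3)  # ee, Ee, EE
    print(f"a = {a:.2f}, d = {d:.2f}, d/a = {d / a:.3f}")
```

With these values d/a comes out at about -0.107.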

Biometrical model for single diallelic QTL

Contribution of the QTL to the Variance (X)

Genotypes            AA     Aa     aa
Frequencies, f(x)    p^2    2pq    q^2
Effect, x            a      d      -a

Frequencies are in HW proportions

$$\mathrm{Var}(X) = \sum_i (x_i - \mu)^2 f(x_i) = (a-m)^2p^2 + (d-m)^2\,2pq + (-a-m)^2q^2 = V_{QTL}$$

(here m denotes the mean µ of X)

Biometrical model for single diallelic QTL

$$\mathrm{Var}(X) = (a-m)^2p^2 + (d-m)^2\,2pq + (-a-m)^2q^2$$

$$= 2pq[a+(q-p)d]^2 + (2pqd)^2$$

$$= V_{A_{QTL}} + V_{D_{QTL}}$$

Additive effects: the main effects of individual alleles

Dominance effects: represent the interaction between alleles
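The decomposition can be checked numerically by comparing the variance computed directly from genotype frequencies with the sum of the additive and dominance terms. A sketch with illustrative names and made-up parameter values:

```python
def qtl_variance_direct(p, a, d):
    """Variance of the genotypic effect, summed over genotypes in HW proportions."""
    q = 1 - p
    freqs = (p * p, 2 * p * q, q * q)  # AA, Aa, aa
    effects = (a, d, -a)
    mean = sum(f * x for f, x in zip(freqs, effects))
    return sum(f * (x - mean) ** 2 for f, x in zip(freqs, effects))


def qtl_variance_components(p, a, d):
    """Return (V_A, V_D) from the closed-form expressions."""
    q = 1 - p
    v_a = 2 * p * q * (a + (q - p) * d) ** 2
    v_d = (2 * p * q * d) ** 2
    return v_a, v_d


if __name__ == "__main__":
    p, a, d = 0.3, 1.0, 0.5  # made-up values
    v_a, v_d = qtl_variance_components(p, a, d)
    print(qtl_variance_direct(p, a, d), v_a + v_d)
```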

Biometrical model for single biallelic QTL

[Figure: genotypes aa, Aa and AA on the x-axis; values m, -a, a and d marked]

Var (X) = Regression Variance + Residual Variance

= Additive Variance + Dominance Variance

Fisher 1918
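The regression interpretation can be sketched as a weighted least-squares fit of genotypic value on the number of A alleles, with weights given by HW genotype frequencies. The coding x = 0, 1, 2 for aa, Aa, AA and the function name are assumptions for this example.

```python
def regression_partition(p, a, d):
    """Return (slope, regression variance, residual variance)."""
    q = 1 - p
    x = (0, 1, 2)                  # copies of allele A in aa, Aa, AA
    g = (-a, d, a)                 # genotypic values
    w = (q * q, 2 * p * q, p * p)  # HW genotype frequencies
    mean_x = sum(wi * xi for wi, xi in zip(w, x))
    mean_g = sum(wi * gi for wi, gi in zip(w, g))
    var_x = sum(wi * (xi - mean_x) ** 2 for wi, xi in zip(w, x))
    cov_xg = sum(wi * (xi - mean_x) * (gi - mean_g) for wi, xi, gi in zip(w, x, g))
    slope = cov_xg / var_x
    v_total = sum(wi * (gi - mean_g) ** 2 for wi, gi in zip(w, g))
    v_regression = slope ** 2 * var_x
    return slope, v_regression, v_total - v_regression


if __name__ == "__main__":
    print(regression_partition(0.3, 1.0, 0.5))  # made-up values
```

For these values the slope equals a + (q-p)d = 1.2, the regression variance equals 2pq[a+(q-p)d]^2 and the residual variance equals (2pqd)^2.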


Association (GWAS)

• State of play

• Model

• Analysis method

• Power of detection

• GWAS works

• Effect sizes are typically small

– Disease: OR ~1.1 to ~1.3

– Quantitative traits: % var explained <<1%

Disease                              Number of loci   Percent of heritability measure explained   Heritability measure
Age-related macular degeneration     5                50%                                         Sibling recurrence risk
Crohn’s disease                      32               20%                                         Genetic risk (liability)
Systemic lupus erythematosus         6                15%                                         Sibling recurrence risk
Type 2 diabetes                      18               6%                                          Sibling recurrence risk
HDL cholesterol                      7                5.2%                                        Phenotypic variance
Height                               40               5%                                          Phenotypic variance
Early onset myocardial infarction    9                2.8%                                        Phenotypic variance
Fasting glucose                      4                1.5%                                        Phenotypic variance

Effect sizes QT (104 SNPs)

[Figure: histogram of % variance explained, quantitative traits; x-axis 0.1 to 1.7, y-axis frequency 0 to 35]


Linear model for single SNP

• Allelic (additive model)

Y = µ + b*x + e

x = 0, 1, 2 for genotypes aa, Aa and AA

• Genotypic (additive + dominance model)

Y = µ + G_i + e

G_i = genotype group effect corresponding to genotypes aa, Aa and AA


Method

• Linear regression

• ANOVA

• (other: maximum likelihood, Bayesian)

Test statistic (allelic model)

$$T = \hat{b}/\sigma(\hat{b}) \sim t_N \approx N(0,1)$$

$$T^2 = \hat{b}^2/\mathrm{var}(\hat{b}) \sim F_{1,N} \approx \chi^2_1$$

$$\mathrm{var}(\hat{b}) = \frac{\sigma_e^2}{N\,\mathrm{var}(x)} = \frac{\sigma_e^2}{2Np(1-p)}$$
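A minimal single-SNP test under the allelic model, written with the standard library only. The simulated data (allele frequency 0.3, effect 0.2, unit residual variance, N = 5000), the seed and the function names are assumptions for illustration. The p-value uses the N(0,1) approximation, which is reasonable for large N.

```python
import math
import random


def allelic_test(x, y):
    """Least-squares fit of y = mu + b*x + e; returns (b, se, T, p_value)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)  # equals N * var(x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    mu = mean_y - b * mean_x
    rss = sum((yi - mu - b * xi) ** 2 for xi, yi in zip(x, y))
    sigma2_e = rss / (n - 2)                   # residual variance estimate
    se = math.sqrt(sigma2_e / sxx)             # sqrt of var(b_hat)
    t = b / se
    p_value = math.erfc(abs(t) / math.sqrt(2))  # two-sided, N(0,1) approximation
    return b, se, t, p_value


def simulate(n=5000, p=0.3, b=0.2, seed=7):
    """Simulate genotypes in HW proportions and a phenotype with additive effect b."""
    rng = random.Random(seed)
    x = [(rng.random() < p) + (rng.random() < p) for _ in range(n)]
    y = [b * xi + rng.gauss(0.0, 1.0) for xi in x]
    return x, y


if __name__ == "__main__":
    x, y = simulate()
    b, se, t, p_value = allelic_test(x, y)
    print(f"b = {b:.3f}, se = {se:.4f}, T = {t:.2f}, p = {p_value:.2e}")
```

The estimated standard error should be close to sqrt(sigma_e^2 / (2Np(1-p))), the expression on the slide.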

Statistical Power (additive model)

$$q^2 = \frac{2p(1-p)[a + d(1-2p)]^2}{\sigma_p^2}$$

Non-centrality parameter of $\chi^2$ test:

$$\lambda = Nq^2/(1-q^2) \approx Nq^2$$

Required sample size given type-I (α) and type-II (β) error:

$$N = \frac{1-q^2}{q^2}\,(z_{1-\alpha/2} + z_{1-\beta})^2 \approx (z_{1-\alpha/2} + z_{1-\beta})^2 / q^2$$
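The sample-size formula can be evaluated with the standard normal quantile function. The example values (q2 = 0.5% of variance, alpha = 5e-8, 80% power) are assumptions chosen for illustration and do not come from the slides.

```python
from statistics import NormalDist


def required_n(q2, alpha, power):
    """Return (exact, approximate) sample size; power = 1 - beta."""
    z = NormalDist().inv_cdf
    z_sum = z(1 - alpha / 2) + z(power)
    exact = (1 - q2) / q2 * z_sum ** 2
    approx = z_sum ** 2 / q2
    return exact, approx


if __name__ == "__main__":
    exact, approx = required_n(q2=0.005, alpha=5e-8, power=0.8)
    print(f"exact N = {exact:.0f}, approximate N = {approx:.0f}")
```

For small q2 the two values differ very little, which is why the approximation N ≈ (z + z)^2 / q2 is usually sufficient.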

LD again

$r^2$ = squared LD correlation between QTL and genotyped SNP

Proportion of variance explained at SNP = $r^2q^2$

Required sample size for detection

$$N \approx (z_{1-\alpha/2} + z_{1-\beta})^2 / (r^2q^2)$$
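Inverting the approximate sample-size formula gives the power for a fixed N: power ≈ Φ(sqrt(N r2 q2) - z(1-α/2)). The sketch below uses this normal approximation and ignores the opposite tail. The function name, the default alpha and the example values are assumptions.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def approx_power(n, q2, r2=1.0, alpha=5e-8):
    """Approximate power to detect a QTL through a SNP in LD (r2) with it."""
    ncp = n * r2 * q2  # non-centrality parameter at the genotyped SNP
    z_alpha = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    return _STD_NORMAL.cdf(math.sqrt(ncp) - z_alpha)


if __name__ == "__main__":
    for r2 in (1.0, 0.8, 0.5):
        print(f"r2 = {r2}: power = {approx_power(8000, 0.005, r2):.3f}")
```

Lower r2 between the SNP and the QTL reduces power at a fixed N, or equivalently inflates the required sample size by a factor 1/r2.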

Genetic Power Calculator (Shaun Purcell)

http://pngu.mgh.harvard.edu/~purcell/gpc/

Serum bilirubin: if all GWAS were so simple…

[Figure: 95% CI of phenotype (y-axis, -0.500 to 2.000) by RS2070959_A genotype (x-axis: 2, 1, 0)]

38% of phenotypic variance explained


1984