Lecture 6: Introduction to Quantitative genetics Bruce Walsh lecture notes Liege May 2011 course version 25 May 2011



Quantitative Genetics

The analysis of traits whose variation is determined by both a number of genes and environmental factors

Phenotype is highly uninformative as to underlying genotype

Complex (or Quantitative) trait

• No (apparent) simple Mendelian basis for variation in the trait

• May be a single gene strongly influenced by environmental factors

• May be the result of a number of genes of equal (or differing) effect

• Most likely, a combination of both multiple genes and environmental factors

• Example: Blood pressure, cholesterol levels

– Known genetic and environmental risk factors

• Molecular traits can also be quantitative traits

– mRNA level on a microarray analysis

– Protein spot volume on a 2-D gel

Phenotypic distribution of a trait

Consider a specific locus influencing the trait

For this locus, mean phenotype = 0.15, while overall mean phenotype = 0

Goals of Quantitative Genetics

• Partition total trait variation into genetic (nature) vs. environmental (nurture) components

• Predict resemblance between relatives

– If a sib has a disease/trait, what are your odds?

• Find the underlying loci contributing to genetic variation

– QTL: quantitative trait loci

• Deduce molecular basis for genetic trait variation

• eQTLs: expression QTLs, loci with a quantitative influence on gene expression

– e.g., QTLs influencing mRNA abundance on a microarray

Dichotomous (binary) traits

Presence/absence traits (such as a disease) can (and usually do) have a complex genetic basis

Consider a disease susceptibility (DS) locus underlying a disease, with alleles D and d, where allele D significantly increases your disease risk

In particular, Pr(disease | DD) = 0.5, so that the penetrance of genotype DD is 50%

Suppose Pr(disease | Dd) = 0.2, Pr(disease | dd) = 0.05

dd individuals can occasionally display the disease, largely because of exposure to adverse environmental conditions

If freq(d) = 0.9, what is Pr(DD | disease)?

Assuming Hardy-Weinberg genotype frequencies, freq(disease) = $0.1^2 \times 0.5 + 2 \times 0.1 \times 0.9 \times 0.2 + 0.9^2 \times 0.05 = 0.0815$

From Bayes’ theorem, Pr(DD | disease) = Pr(disease | DD) × Pr(DD) / Pr(disease) = $0.1^2 \times 0.5 / 0.0815 = 0.06$ (6%)

dd individuals give rise to phenocopies 5% of the time, showing the disease but not as a result of carrying the risk allele

Pr(Dd | disease) = 0.442, Pr(dd | disease) = 0.497

Thus about 50% of the diseased individuals are phenocopies
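
The Bayes' theorem arithmetic above can be checked with a short script. This is a minimal sketch rather than part of the original notes: the names `prior`, `penetrance` and `posterior_given_disease` are illustrative, and the genotype frequencies assume Hardy-Weinberg proportions, which the slide's arithmetic uses implicitly.

```python
# Posterior genotype probabilities given disease, using the numbers on the slide.
freq_d = 0.9            # freq(d), from the text
freq_D = 1.0 - freq_d   # freq(D)

# Genotype frequencies under Hardy-Weinberg proportions (assumption).
prior = {
    "DD": freq_D ** 2,
    "Dd": 2 * freq_D * freq_d,
    "dd": freq_d ** 2,
}

# Penetrances Pr(disease | genotype), from the text.
penetrance = {"DD": 0.5, "Dd": 0.2, "dd": 0.05}


def posterior_given_disease(prior, penetrance):
    """Return (Pr(disease), {genotype: Pr(genotype | disease)}) via Bayes' theorem."""
    joint = {g: prior[g] * penetrance[g] for g in prior}
    total = sum(joint.values())
    return total, {g: joint[g] / total for g in joint}


freq_disease, post = posterior_given_disease(prior, penetrance)
print("freq(disease) =", round(freq_disease, 4))
for genotype, prob in post.items():
    print(f"Pr({genotype} | disease) = {prob:.3f}")
```

The dd posterior is the fraction of affected individuals who are phenocopies.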

Basic model of Quantitative Genetics

Basic model: P = G + E

P = phenotypic value (we will occasionally also use z for this value)

G = genotypic value

E = environmental value

G = average phenotypic value for that genotype if we are able to replicate it over the universe of environmental values, G = E[P]


G x E interaction: G values are different across environments. Basic model now becomes P = G + E + GE

Contribution of a locus to a trait

Genotypic values under three equivalent parameterizations:

Genotype    Q1Q1     Q2Q1          Q2Q2
Value       C        C + a(1+k)    C + 2a
Value       C        C + a + d     C + 2a
Value       C - a    C + d         C + a

2a = G(Q2Q2) - G(Q1Q1)

d = ak = G(Q1Q2) - [G(Q2Q2) + G(Q1Q1)]/2

d measures dominance, with d = 0 if the heterozygote is exactly intermediate to the two homozygotes

k = d/a is a scaled measure of the dominance

Example: Apolipoprotein E & Alzheimer’s

Genotype                ee      Ee      EE
Average age of onset    68.4    75.5    84.3

2a = G(EE) - G(ee) = 84.3 - 68.4 --> a = 7.95

ak = d = G(Ee) - [G(EE) + G(ee)]/2 = -0.85

k = d/a = -0.11. Only a small amount of dominance

Example: Booroola (B) gene

Genotype               bb      Bb      BB
Average litter size    1.48    2.17    2.66

2a = G(BB) - G(bb) = 2.66 - 1.48 --> a = 0.59

ak = d = G(Bb) - [G(BB) + G(bb)]/2 = 0.10

k = d/a = 0.17
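
The a, d and k calculations in the two examples follow the same recipe, sketched below. The function name `locus_parameters` is illustrative and not from the notes; the inputs are the three genotypic means from the tables. Because k = d/a, k takes the sign of d, so a heterozygote below the midpoint of the homozygotes gives a negative k.

```python
def locus_parameters(g_low, g_het, g_high):
    """Return (a, d, k) from the genotypic means of the two homozygotes and the heterozygote.

    a = half the difference between the homozygotes
    d = heterozygote deviation from the homozygote midpoint
    k = d / a, the scaled dominance
    """
    a = (g_high - g_low) / 2.0
    d = g_het - (g_high + g_low) / 2.0
    k = d / a
    return a, d, k


# Apolipoprotein E: average age of onset for ee, Ee, EE
apoe = locus_parameters(68.4, 75.5, 84.3)

# Booroola: average litter size for bb, Bb, BB
booroola = locus_parameters(1.48, 2.17, 2.66)

for name, (a, d, k) in [("ApoE", apoe), ("Booroola", booroola)]:
    print(f"{name}: a = {a:.2f}, d = {d:.2f}, k = {k:.2f}")
```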

Fisher’s (1918) Decomposition of G

One of Fisher’s key insights was that the genotypic value consists of a fraction that can be passed from parent to offspring and a fraction that cannot.

Consider the genotypic value $G_{ij}$ resulting from an $A_iA_j$ individual

$G_{ij} = \mu_G + \alpha_i + \alpha_j + \delta_{ij}$

Mean value, with $\mu_G = \sum G_{ij} \cdot \mathrm{freq}(Q_iQ_j)$

$\alpha_i$ = average contribution to genotypic value for allele i

In particular, under sexual reproduction, parents only pass along SINGLE ALLELES to their offspring

Since parents pass along single alleles to their offspring, the $\alpha_i$ (the average effect of allele i) represent these contributions

The average effect for an allele is POPULATION-SPECIFIC, as it depends on the types and frequencies of alleles that it pairs with

The genotypic value predicted from the individual allelic effects is thus

$\hat{G}_{ij} = \mu_G + \alpha_i + \alpha_j$

Dominance deviations: the difference (for genotype $A_iA_j$) between the genotypic value predicted from the two single alleles and the actual genotypic value,

$G_{ij} - \hat{G}_{ij} = \delta_{ij}$

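Fisher’s decomposition is a Regression

$G_{ij} = \mu_G + \alpha_i + \alpha_j + \delta_{ij}$

Predicted value: $\mu_G + \alpha_i + \alpha_j$. Residual error: $\delta_{ij}$

A notational change clearly shows this is a regression,

$G_{ij} = \mu_G + 2\alpha_1 + (\alpha_2 - \alpha_1) N + \delta_{ij}$

Independent (predictor) variable N = # of Q2 alleles

Intercept: $\mu_G + 2\alpha_1$. Regression slope: $\alpha_2 - \alpha_1$

$2\alpha_1 + (\alpha_2 - \alpha_1) N = \begin{cases} 2\alpha_1 & \text{for } N = 0 \text{, e.g., } Q_1Q_1 \\ \alpha_1 + \alpha_2 & \text{for } N = 1 \text{, e.g., } Q_1Q_2 \\ 2\alpha_2 & \text{for } N = 2 \text{, e.g., } Q_2Q_2 \end{cases}$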

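[Figure: three panels plotting genotypic value G (points G11, G21, G22) against N = 0, 1, 2, annotated "Slope = $\alpha_2 - \alpha_1$". Panel titles: "Allele Q2 common, $\alpha_1 > \alpha_2$"; "Allele Q1 common, $\alpha_2 > \alpha_1$"; "Both Q1 and Q2 frequent, $\alpha_1 = \alpha_2 = 0$"]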

Consider a diallelic locus, where p1 = freq(Q1)

Genotype           Q1Q1    Q2Q1      Q2Q2
Genotypic value    0       a(1+k)    2a

Mean: $\mu_G = 2 p_2 a (1 + p_1 k)$

Allelic effects:

$\alpha_2 = p_1 a [1 + k (p_1 - p_2)]$

$\alpha_1 = -p_2 a [1 + k (p_1 - p_2)]$

Dominance deviations: $\delta_{ij} = G_{ij} - \mu_G - \alpha_i - \alpha_j$
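
The regression view of Fisher's decomposition can be checked numerically for the diallelic locus. The sketch below is not from the notes: it assumes Hardy-Weinberg genotype frequencies, uses NumPy, and the function name `fisher_decomposition` and the example values p1 = 0.3, a = 1, k = 0.5 are illustrative. It regresses G on N (the number of Q2 alleles), weighting each genotype by its frequency, and recovers the allelic effects from the intercept and slope; the residuals are the dominance deviations.

```python
import numpy as np


def fisher_decomposition(p1, a, k):
    """Frequency-weighted regression of genotypic value G on N = number of Q2 alleles."""
    p2 = 1.0 - p1
    n = np.array([0.0, 1.0, 2.0])                  # Q1Q1, Q1Q2, Q2Q2
    g = np.array([0.0, a * (1.0 + k), 2.0 * a])    # genotypic values from the table
    w = np.array([p1**2, 2.0 * p1 * p2, p2**2])    # Hardy-Weinberg frequencies (assumed)

    mu_g = np.sum(w * g)                           # mean genotypic value
    mean_n = np.sum(w * n)
    # Weighted least-squares slope = Cov(N, G) / Var(N) = alpha2 - alpha1
    slope = np.sum(w * (n - mean_n) * (g - mu_g)) / np.sum(w * (n - mean_n) ** 2)

    # The intercept is mu_g + 2*alpha1, and it also equals mu_g - slope*mean_n
    alpha1 = -slope * mean_n / 2.0
    alpha2 = alpha1 + slope
    # Residuals from the regression line are the dominance deviations
    delta = g - (mu_g + 2.0 * alpha1 + slope * n)
    return mu_g, alpha1, alpha2, delta


mu_g, alpha1, alpha2, delta = fisher_decomposition(p1=0.3, a=1.0, k=0.5)
print("mean:", mu_g)
print("alpha1, alpha2:", alpha1, alpha2)
print("dominance deviations (Q1Q1, Q1Q2, Q2Q2):", delta)
```

The regression output agrees with the closed-form expressions for the mean and the allelic effects, and the frequency-weighted mean of the dominance deviations is zero.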

Average effects and Additive Genetic Values

The $\alpha$ values are the average effects of an allele

A key concept is the Additive Genetic Value (A) of an individual

$A(G_{ij}) = \alpha_i + \alpha_j$

Summing over the n loci,

$A = \sum_{k=1}^{n} \left( \alpha_i^{(k)} + \alpha_j^{(k)} \right)$

A is called the Breeding value or the Additive genetic value

Why all the fuss over A?

Suppose father has A = 10 and mother has A = -2 for (say) blood pressure

Expected blood pressure in their offspring is (10-2)/2 = 4 units above the population mean. Offspring A = average of parental A’s

KEY: parents only pass single alleles to their offspring. Hence, they only pass along the A part of their genotypic value G
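
The midparent rule can be illustrated by enumerating the gametes each parent can transmit. In the sketch below, which is not from the notes, each parent is a list of loci and each locus holds the average effects of that parent's two alleles. The allele effects are hypothetical values chosen only so that the parental breeding values match the example (10 and -2).

```python
from itertools import product


def breeding_value(genotype):
    """A = sum over loci of the average effects of both alleles."""
    return sum(x + y for x, y in genotype)


def expected_offspring_A(father, mother):
    """Average offspring breeding value over all equally likely allele transmissions."""
    total = 0.0
    for f_locus, m_locus in zip(father, mother):
        # Each parent passes one of its two alleles with probability 1/2,
        # giving four equally likely offspring genotypes at this locus.
        combos = list(product(f_locus, m_locus))
        total += sum(x + y for x, y in combos) / len(combos)
    return total


# Hypothetical average effects at two loci (illustrative values only)
father = [(4.0, 2.0), (3.0, 1.0)]     # A = 10
mother = [(-1.0, 0.5), (-2.0, 0.5)]   # A = -2

A_father = breeding_value(father)
A_mother = breeding_value(mother)
A_offspring = expected_offspring_A(father, mother)
print(A_father, A_mother, A_offspring)
```

Only the expectation is computed here; an individual offspring's A deviates from the midparent value because of which alleles happen to be transmitted.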

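Genetic Variances

$G_{ij} = \mu_G + (\alpha_i + \alpha_j) + \delta_{ij}$

$\sigma^2(G) = \sigma^2(\mu_G + (\alpha_i + \alpha_j) + \delta_{ij}) = \sigma^2(\alpha_i + \alpha_j) + \sigma^2(\delta_{ij})$

As Cov($\alpha$, $\delta$) = 0

Over n loci,

$\sigma^2(G) = \sum_{k=1}^{n} \sigma^2\left(\alpha_i^{(k)} + \alpha_j^{(k)}\right) + \sum_{k=1}^{n} \sigma^2\left(\delta_{ij}^{(k)}\right)$

$\sigma^2_G = \sigma^2_A + \sigma^2_D$

$\sigma^2_A$ = Additive Genetic Variance (or simply Additive Variance)

$\sigma^2_D$ = Dominance Genetic Variance (or simply dominance variance)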

Key concepts (so far)

• $\alpha_i$ = average effect of allele i

– Property of a single allele in a particular population (depends on genetic background)

• A = Additive Genetic Value

– A = sum (over all loci) of average effects

– Fraction of G that parents pass along to their offspring

– Property of an Individual in a particular population

• Var(A) = additive genetic variance

– Variance in additive genetic values

– Property of a population

• Can estimate A or Var(A) without knowing any of the underlying genetical detail (forthcoming)

One locus, 2 alleles: $\sigma^2_A = 2 p_1 p_2 a^2 [1 + k (p_1 - p_2)]^2$

Genotype           Q1Q1    Q1Q2      Q2Q2
Genotypic value    0       a(1+k)    2a

Dominance alters additive variance

When dominance is present, additive variance is an asymmetric function of allele frequencies

$\sigma^2_A = 2 E[\alpha^2] = 2 \sum_{i=1}^{m} \alpha_i^2 p_i$

Since $E[\alpha] = 0$, $\mathrm{Var}(\alpha) = E[(\alpha - \mu_\alpha)^2] = E[\alpha^2]$

Dominance variance

$\sigma^2_D = E[\delta^2] = \sum_{i=1}^{m} \sum_{j=1}^{m} \delta_{ij}^2 p_i p_j$

One locus, 2 alleles: $\sigma^2_D = (2 p_1 p_2 a k)^2$

Genotype           Q1Q1    Q1Q2      Q2Q2
Genotypic value    0       a(1+k)    2a

Equals zero if k = 0

This is a symmetric function of allele frequencies

Can also be expressed in terms of d = ak: $\sigma^2_D = (2 p_1 p_2 d)^2$
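
As a check on the one-locus, two-allele expressions, the sketch below compares the closed forms for the additive and dominance variances with the general sums over alleles and genotypes. It is not from the notes: it assumes Hardy-Weinberg frequencies, the function names are illustrative, and p1 = 0.3, a = 1, k = 0.5 are arbitrary example values.

```python
def variances_closed_form(p1, a, k):
    """Additive and dominance variance from the one-locus, two-allele formulas."""
    p2 = 1.0 - p1
    va = 2.0 * p1 * p2 * a**2 * (1.0 + k * (p1 - p2)) ** 2
    vd = (2.0 * p1 * p2 * a * k) ** 2
    return va, vd


def variances_from_sums(p1, a, k):
    """Same quantities from the general sums, plus the total genotypic variance."""
    p2 = 1.0 - p1
    p = [p1, p2]
    scale = a * (1.0 + k * (p1 - p2))
    alpha = [-p2 * scale, p1 * scale]                        # average effects of Q1, Q2
    g = [[0.0, a * (1.0 + k)], [a * (1.0 + k), 2.0 * a]]     # G[i][j] for alleles i, j
    pairs = [(i, j) for i in range(2) for j in range(2)]

    mu = sum(g[i][j] * p[i] * p[j] for i, j in pairs)
    va = 2.0 * sum(alpha[i] ** 2 * p[i] for i in range(2))
    vd = sum((g[i][j] - mu - alpha[i] - alpha[j]) ** 2 * p[i] * p[j] for i, j in pairs)
    vg = sum((g[i][j] - mu) ** 2 * p[i] * p[j] for i, j in pairs)
    return va, vd, vg


va_cf, vd_cf = variances_closed_form(0.3, 1.0, 0.5)
va_sum, vd_sum, vg = variances_from_sums(0.3, 1.0, 0.5)
print(va_cf, vd_cf)
print(va_sum, vd_sum, vg)
```

The two routes agree, and the additive and dominance variances add up to the total genotypic variance at this locus.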

[Figure: additive variance, VA, plotted against allele frequency, p, with no dominance (k = 0); VA and VD plotted against allele frequency, p, under complete dominance (k = 1)]
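
The shape of these curves can be explored by evaluating the one-locus variance formulas over a grid of allele frequencies. This is a rough sketch, not the code behind the original plots. It takes the frequency axis to be p1 = freq(Q1), which is an assumption, and sets a = 1; `va_vd` and `argmax_frequency` are illustrative names.

```python
def va_vd(p1, a=1.0, k=0.0):
    """One-locus, two-allele additive and dominance variances."""
    p2 = 1.0 - p1
    va = 2.0 * p1 * p2 * a**2 * (1.0 + k * (p1 - p2)) ** 2
    vd = (2.0 * p1 * p2 * a * k) ** 2
    return va, vd


# Allele frequencies 0.00, 0.01, ..., 1.00
grid = [i / 100 for i in range(101)]


def argmax_frequency(k, component):
    """Grid frequency at which VA (component=0) or VD (component=1) is largest."""
    return max(grid, key=lambda p1: va_vd(p1, k=k)[component])


peak_va_additive = argmax_frequency(k=0.0, component=0)   # no dominance
peak_va_dominant = argmax_frequency(k=1.0, component=0)   # complete dominance
peak_vd_dominant = argmax_frequency(k=1.0, component=1)

print(peak_va_additive, peak_va_dominant, peak_vd_dominant)
```

With k = 0 the additive variance peaks at p1 = 0.5 and is symmetric. With k = 1 the heterozygote has the Q2Q2 value, so Q1 is the recessive allele: VA peaks at p1 = 0.75, while VD still peaks at p1 = 0.5.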

Epistasis

$G_{ijkl} = \mu_G + (\alpha_i + \alpha_j + \alpha_k + \alpha_l) + (\delta_{ij} + \delta_{kl}) + (\alpha\alpha_{ik} + \alpha\alpha_{il} + \alpha\alpha_{jk} + \alpha\alpha_{jl}) + (\alpha\delta_{ikl} + \alpha\delta_{jkl} + \alpha\delta_{kij} + \alpha\delta_{lij}) + (\delta\delta_{ijkl})$

$= \mu_G + A + D + AA + AD + DD$

These components are defined to be uncorrelated (or orthogonal), so that

$\sigma^2_G = \sigma^2_A + \sigma^2_D + \sigma^2_{AA} + \sigma^2_{AD} + \sigma^2_{DD}$

Additive x Additive interactions: $\alpha\alpha$, AA

Interactions between a single allele at one locus and a single allele at another

Additive x Dominance interactions: $\alpha\delta$, AD

Interactions between an allele at one locus and the genotype at another, e.g. allele $A_i$ and genotype $B_kB_l$

Dominance x Dominance interactions: $\delta\delta$, DD

The interaction between the dominance deviation at one locus and the dominance deviation at another.