population genetics 2: linkage...


Post on 25-May-2020


Page 1: Population Genetics 2: Linkage disequilibriumawarnach.mathstat.dal.ca/~joeb/biol3046/PDFs/slides/Slides_PopGen_T2.pdf · Population Genetics 2: Linkage disequilibrium Genotype frequencies


Population Genetics 2:

Linkage disequilibrium

Genotype frequencies in a population

A gene: f_AA = p^2, f_Aa = 2pq, f_aa = q^2, with p + q = 1

B gene: f_BB = x^2, f_Bb = 2xy, f_bb = y^2, with x + y = 1

Consider two loci and 1 generation of random mating:

A gene: AA, Aa, and aa

B gene: BB, Bb, and bb

Random association of alleles at a single locus: Hardy-Weinberg equilibrium (HWE)

What about random association of alleles at different loci after random mating?


Random association in gametes

                           Alleles at A locus
                           A (p)       a (q)
Alleles at      B (x)      AB (px)     aB (qx)
B locus         b (y)      Ab (py)     ab (qy)

remember: p + q = 1 and x + y = 1

Consider two loci and 1 generation of random mating:

Random association of alleles at different loci:

LINKAGE EQUILIBRIUM (also called GAMETIC PHASE EQUILIBRIUM)
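
The gamete table under random association can be written out as a short sketch. The function name and the allele frequencies used here (p = 0.6, x = 0.3) are made up for illustration and do not come from the slides.

```python
def expected_gamete_freqs(p, x):
    """Gamete frequencies when alleles at the two loci associate at random.

    p = frequency of allele A (q = 1 - p)
    x = frequency of allele B (y = 1 - x)
    """
    q, y = 1.0 - p, 1.0 - x
    return {"AB": p * x, "Ab": p * y, "aB": q * x, "ab": q * y}


# Hypothetical allele frequencies, chosen only to show the arithmetic.
freqs = expected_gamete_freqs(p=0.6, x=0.3)
for gamete, f in freqs.items():
    print(f"{gamete}: {f:.2f}")
```

The four products always sum to 1 because (p + q)(x + y) = 1.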

Consider two loci and 1 generation of random mating:

Surprisingly common result:

Gene A: HWE

Gene B: HWE

Gene A + Gene B: disequilibrium


Consider two loci (on different chromosomes) and 1 generation of random mating:

Example:
Population 1: 100% AABB
Population 2: 100% aabb
Mix populations equally: 50% AABB + 50% aabb

1 generation of random mating (only three matings possible):
AABB x AABB = AABB
aabb x aabb = aabb
AABB x aabb = AaBb

Nine genotypes are possible:
AABB AABb AAbb
AaBB AaBb Aabb
aaBB aaBb aabb

We only see 1/3 of them (AABB, AaBb, and aabb) after 1 generation of random mating! The population did not reach equilibrium after one generation of random mating. With continued random mating the “missing” genotypes would appear, but not immediately at their equilibrium frequencies!

Consider two loci and 1 generation of random mating:

- attainment of linkage equilibrium is gradual
- for unlinked loci (recombination fraction r = 0.5), about 50% of disequilibrium “breaks down” per generation
- linkage disequilibrium (LD) persists in populations for many generations
- LD = gametic phase disequilibrium
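
The admixture example can be checked numerically. This is a minimal sketch, assuming random union of gametes and unlinked loci (r = 0.5); the helper names are hypothetical. It uses the standard two-locus recurrence in which each generation moves a fraction r of the disequilibrium from the AB and ab gametes to the Ab and aB gametes.

```python
from itertools import product


def genotype_label(g1, g2):
    """Unphased two-locus genotype, e.g. gametes 'AB' and 'ab' -> 'AaBb'."""
    locus_a = "".join(sorted(g1[0] + g2[0]))  # uppercase sorts before lowercase
    locus_b = "".join(sorted(g1[1] + g2[1]))
    return locus_a + locus_b


def random_union(gametes):
    """Genotype frequencies produced by random union of gametes."""
    genotypes = {}
    for (g1, f1), (g2, f2) in product(gametes.items(), repeat=2):
        label = genotype_label(g1, g2)
        genotypes[label] = genotypes.get(label, 0.0) + f1 * f2
    return {g: f for g, f in genotypes.items() if f > 0}


def next_gametes(gametes, r):
    """Gamete frequencies one generation later for recombination fraction r."""
    d = gametes["AB"] * gametes["ab"] - gametes["Ab"] * gametes["aB"]
    return {
        "AB": gametes["AB"] - r * d,
        "ab": gametes["ab"] - r * d,
        "Ab": gametes["Ab"] + r * d,
        "aB": gametes["aB"] + r * d,
    }


# Equal mix of AABB and aabb: the gamete pool is half AB and half ab.
gametes = {"AB": 0.5, "Ab": 0.0, "aB": 0.0, "ab": 0.5}
offspring = random_union(gametes)              # generation 1
second = random_union(next_gametes(gametes, r=0.5))  # generation 2

print(offspring)
print(second)
```

Generation 1 contains only AABB, AaBb, and aabb. In generation 2 all nine genotypes are present, but a genotype such as AAbb is still well below its equilibrium frequency of p^2 y^2 = 0.0625.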


LD in individuals (BIOL 2030 stuff):

Case 1: AB gamete + ab gamete = AaBb
Case 2: Ab gamete + aB gamete = AaBb

We need a new symbolism:

AB/ab indicates the union of AB gamete + ab gamete

LD in individuals:

Let’s take an AB/ab individual as an example. What types of gametes can the AB/ab individual make?

(1) AB
(2) ab
Parental or non-recombinant gametes

(3) Ab
(4) aB
Non-parental or recombinant gametes

By the way, let’s assume physical linkage: A and B sit on one chromosome, a and b on its homologue.

Notation = AB/ab


Review meiosis in a single meiocyte:

Parental configuration: AB/ab

Gametes:
Parental (non-recombinant): A B
Parental (non-recombinant): a b
Recombinant: A b
Recombinant: a B

1 crossing over event: 50% parental and 50% recombinant!

LD in individuals:

Physical linkage: A and B on one chromosome, a and b on its homologue

Notation = AB/ab

When genes are on the same chromosome:

f_AB = f_ab ≥ f_Ab = f_aB

f(non-recombinant) ≥ f(recombinant)

Recombination fraction (r) is the proportion of recombinant gametes produced by an individual.

When r = 0: f_AB + f_ab = 100% [f_Ab + f_aB = 0%]

When r = 0.5: f_AB + f_ab = 50% [f_Ab + f_aB = 50%]


LD in individuals:

[Figure from: iGenetics, P. J. Russell (2002), page 350]

When genes are on different chromosomes:

f_AB = f_ab = f_Ab = f_aB

f(non-recombinant) = f(recombinant)

Unlinked genes: A and a on one pair of chromosomes, B and b on another

r = 0.5
•  when genes are on different chromosomes
•  when genes are on the same chromosome and recombination is high enough for independent assortment


LD in individuals:

Individual AB/ab produces the following:

(1) AB: f_AB = 0.38
(2) ab: f_ab = 0.38
(3) Ab: f_Ab = 0.12
(4) aB: f_aB = 0.12

r = 0.12 + 0.12 = 0.24
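
The same arithmetic as a small sketch: r is the share of gametes that are not in the parental configuration. The function name and the `parental` argument are hypothetical. For an Ab/aB individual the parental pair would be ("Ab", "aB") instead.

```python
def recombination_fraction(gametes, parental=("AB", "ab")):
    """Proportion of recombinant gametes produced by one individual."""
    total = sum(gametes.values())
    recombinant = sum(f for g, f in gametes.items() if g not in parental)
    return recombinant / total


# Gamete frequencies of the AB/ab individual from the slide.
gametes = {"AB": 0.38, "ab": 0.38, "Ab": 0.12, "aB": 0.12}
r = recombination_fraction(gametes)
print(round(r, 2))
```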

What do we expect for individuals if allelic association is random?

What do we expect in a population if allelic association is random?

LD in populations:

Random association in gametes

                           Alleles at A locus
                           A (p)       a (q)
Alleles at      B (x)      AB (px)     aB (qx)
B locus         b (y)      Ab (py)     ab (qy)

remember: p + q = 1 and x + y = 1

f_AB = px, f_ab = qy, f_Ab = py, f_aB = qx

f_AB + f_ab + f_Ab + f_aB = 1


LD in populations: The frequency of an AB gamete in a population has two sources:

•  Some individuals have this configuration: non-recombinant [parental]

•  Some individuals produce this as a recombinant configuration.

f'_AB = (1 - r) f_AB + r (px)

where
   f_AB = frequency of AB gametes in the last generation
   (1 - r) = probability of no recombination
   (1 - r) f_AB = non-recombinants
   r = probability of recombination
   px = probability of putting A and B together at random from recombinants
   r (px) = recombinants

LD in populations:


f'_AB - px = (1 - r)(f_AB - px)     [observed - expected]

D = f_AB - px = the linkage disequilibrium parameter, so D' = (1 - r) D

f_AB = px + D, f_ab = qy + D     (in excess due to LD)

f_Ab = py - D, f_aB = qx - D     (deficient due to LD)
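
Written out step by step, the recurrence gives the decay of D over t generations. This is a sketch that assumes the allele frequencies p and x stay constant from one generation to the next (random mating, no selection, migration, or drift); the subscripts t and 0 are notation added here.

```latex
\begin{align*}
f'_{AB} &= (1-r)\,f_{AB} + r\,px \\
f'_{AB} - px &= (1-r)\,f_{AB} + r\,px - px = (1-r)\,(f_{AB} - px) \\
D' &= (1-r)\,D \\
D_t &= (1-r)^{t}\,D_0
\end{align*}
```

With r = 0.5 the factor (1 - r) is 1/2, so D halves every generation. With r = 0 the factor is 1 and D never decays.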

Remember:

Individual AB/ab produces the following:

(1) AB: f_AB = 0.38
(2) ab: f_ab = 0.38
(3) Ab: f_Ab = 0.12
(4) aB: f_aB = 0.12

r = 0.12 + 0.12 = 0.24

What do we expect for an individual if association is random?

What do we expect in a population if association is random?


LD in populations:

D = f_AB × f_ab - f_Ab × f_aB

   f_AB × f_ab: in excess due to LD
   f_Ab × f_aB: deficient due to LD

Forces that increase D in populations:

1. Migration
2. Natural selection
3. Genetic drift

•  Comparing D among populations is difficult.

•  Standardize D as a fraction of the theoretical maximum for the population: D / D_max

D_max = qx or py (whichever is smaller)


LD in a population:

Example: blood group polymorphism in a sample of 1000 British people

MN blood group: f_M = p = 0.5425, f_N = q = 0.4575

Ss blood group: f_S = x = 0.3080, f_s = y = 0.6920

Gamete    Observed frequency           Expected
MS        474/2000 = 0.2370 (+)        px = 0.1671
Ms        611/2000 = 0.3055 (-)        py = 0.3754
NS        142/2000 = 0.0710 (-)        qx = 0.1409
Ns        773/2000 = 0.3865 (+)        qy = 0.3166

D = f_MS × f_Ns (non-recombinant) - f_Ms × f_NS (recombinant)

D = (0.2370)(0.3865) - (0.3055)(0.0710) = 0.07

D_max = qx or py (whichever is smaller): qx = 0.14 or py = 0.38; so D_max = 0.14

D is (0.07/0.14)*100 = 50% of the theoretical maximum
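
The blood group calculation can be reproduced from the four gamete counts alone. This is a minimal sketch; variable names are hypothetical and the N allele is written with an uppercase N. The D_max rule is the one from the slides, which applies when D is positive.

```python
# Gamete counts from the sample of 1000 people (2000 gametes).
counts = {"MS": 474, "Ms": 611, "NS": 142, "Ns": 773}
total = sum(counts.values())
f = {gamete: n / total for gamete, n in counts.items()}

# Allele frequencies are the marginal sums of the gamete frequencies.
p = f["MS"] + f["Ms"]   # f_M
x = f["MS"] + f["NS"]   # f_S
q, y = 1 - p, 1 - x

# Two equivalent ways of computing D.
d = f["MS"] * f["Ns"] - f["Ms"] * f["NS"]
d_alt = f["MS"] - p * x          # observed minus expected

# D_max = smaller of qx and py (the slides' rule, valid for D > 0).
d_max = min(q * x, p * y)

print(f"p = {p:.4f}, x = {x:.4f}")
print(f"D = {d:.4f} (check: {d_alt:.4f})")
print(f"D/D_max = {d / d_max:.2f}")
```

Both expressions for D give the same value, which is why the slides can use either "observed minus expected" or the product difference.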

Homework:

Genotype counts in the population:
MN locus: MM = 298, MN = 489, NN = 213
Ss locus: SS = 99, Ss = 418, ss = 483

Gamete counts: MS = 474, Ms = 611, NS = 142, Ns = 773

Use chi-square test to:

1.  determine if each locus is in HWE

2.  determine if gamete frequencies are in equilibrium
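
One way to set up the two tests is sketched below, using only the standard library. The function names are hypothetical. Allele frequencies are estimated from the counts themselves, so both tests have 1 degree of freedom, and 3.841 is the usual 5% critical value for 1 degree of freedom. Only the MN locus and the gamete counts are run here; the same function applies to the Ss locus.

```python
def chi_square(observed, expected):
    """Pearson chi-square statistic: sum of (O - E)^2 / E."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def hwe_chi_square(n_hom1, n_het, n_hom2):
    """Test one locus with two alleles against Hardy-Weinberg proportions."""
    n = n_hom1 + n_het + n_hom2
    p = (2 * n_hom1 + n_het) / (2 * n)   # frequency of the first allele
    q = 1 - p
    expected = [n * p * p, n * 2 * p * q, n * q * q]
    return chi_square([n_hom1, n_het, n_hom2], expected)


def gametic_chi_square(n_AB, n_Ab, n_aB, n_ab):
    """Test gamete counts against random association (px, py, qx, qy)."""
    n = n_AB + n_Ab + n_aB + n_ab
    p = (n_AB + n_Ab) / n
    x = (n_AB + n_aB) / n
    q, y = 1 - p, 1 - x
    expected = [n * p * x, n * p * y, n * q * x, n * q * y]
    return chi_square([n_AB, n_Ab, n_aB, n_ab], expected)


CRITICAL_5PCT_DF1 = 3.841  # chi-square critical value, 1 df, alpha = 0.05

chi2_mn = hwe_chi_square(298, 489, 213)                # MM, MN, NN
chi2_gametes = gametic_chi_square(474, 611, 142, 773)  # MS, Ms, NS, Ns

print(f"MN locus: chi2 = {chi2_mn:.3f}, reject HWE: {chi2_mn > CRITICAL_5PCT_DF1}")
print(f"gametes:  chi2 = {chi2_gametes:.1f}, reject equilibrium: {chi2_gametes > CRITICAL_5PCT_DF1}")
```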


Recombination reduces LD:

[Figure: Rate of decay of LD under various recombination rates. x-axis: generations (1 to 97); y-axis: standardized disequilibrium D/D_max (0 to 1); one curve each for r = 0.001, r = 0.01, r = 0.1, and r = 0.5]
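
Curves of this kind can be regenerated from the one-generation rule that a fraction r of the disequilibrium is lost each generation. This is a sketch, not the code behind the original figure: it assumes every curve starts at D/D_max = 1, and the function names are hypothetical.

```python
import math


def ld_decay(d0, r, generations):
    """Standardized disequilibrium after 0, 1, ..., `generations` generations."""
    return [d0 * (1 - r) ** t for t in range(generations + 1)]


def half_life(r):
    """Number of generations for disequilibrium to fall to half its value."""
    return math.log(0.5) / math.log(1 - r)


for r in (0.001, 0.01, 0.1, 0.5):
    curve = ld_decay(1.0, r, 100)
    print(f"r = {r}: after 100 generations D/D_max = {curve[100]:.4f}, "
          f"half-life = {half_life(r):.1f} generations")
```

With r = 0.5 the half-life is exactly one generation, while at r = 0.01 it is close to 69 generations, which is why tightly linked loci stay in LD for so long.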

Recombination reduces LD:

“Hitchhiking” of a mutator gene with and without recombination

[Figure adapted from Sniegowski et al. (2000) BioEssays 22:1057-1066. Panels: no recombination (r = 0) and recombination (r = 0.5). Legend: mutator allele that increases the mutation rate; beneficial allele subject to strong positive selection.]


Mapping disease genes:

1.  Family studies
•  Uses family pedigrees
•  Co-segregation of disease and a marker on the pedigree

2.  Allelic association studies (LD mapping)
•  Uses population data
•  Relies on strong LD only among closely linked loci
•  Sample affected and unaffected individuals
•  Very large samples are required!
•  Look for markers with more LD in affected individuals

Both approaches have strengths and pitfalls

Mapping disease genes:

r = 0.01 (1 centiMorgan) is about 1 million bp in humans


Female-limited Batesian mimicry (Papilio memnon females)

[Figure panels: bitter tasting; tasty mimics (Papilio memnon females); male]

5-locus linkage group (supergene):

§  Tail length

§  Hind-wing pattern

§  Forewing pattern

§  Epaulet color

§  Body color

Selection: Only certain complex color morphs provide gains in fitness

LD: Few maladaptive patterns produced per generation due to linkage

The MHC locus on human chromosome 6

[Figure: Class I, Class III, and Class II regions]

LD over 10s-100s million of bp due to selection for combinations of loci


Sexual reproduction reduces LD

Linkage disequilibrium: Keynotes

• Attainment of equilibrium between alleles at different loci is gradual; it takes > 1 generation of random mating.

• Physical linkage slows the approach to equilibrium even more!

• “r” determines the rate of approach to equilibrium: the lower the recombination fraction, the longer it takes to reach equilibrium.

• When r = 0.5 the loci are said to be unlinked; such loci are very far apart on the same chromosome, or on different chromosomes. When r < 0.5 the genes are said to be linked. When r = 0 the loci are in permanent disequilibrium.

• Disequilibrium can arise from sources other than linkage:
   o Admixture of populations
   o Natural selection acting on one or more of the loci
   o Inbreeding in plants that regularly undergo self-fertilization
   o Genes located in a chromosomal inversion (SUPERGENE)

• The term LINKAGE DISEQUILIBRIUM is used to describe any source of disequilibrium, regardless of whether the two genes are physically linked or not.