statgen.psu.edu/resource/lecture1.pdf
TRANSCRIPT
Recombination
January 20, 2010
3 Linkage Analysis and Map Construction
3.2 Experimental Design
The tendency of alleles of different genes on the same chromosome to pass into the same haplotype at meiosis is related to the degree of linkage between these genes. Thus, the development of linkage analysis critically relies upon a segregating pedigree in which both recombinant and nonrecombinant gamete types can be counted. A pedigree comprising a backcross or an F2 population, initiated with two contrasting inbred lines, has proven a most powerful and efficient tool for linkage analysis.
In practice, two inbred lines that are homozygous for two alternative alleles of each gene are crossed as parents P1 and P2 to generate an F1 progeny. Thus, all F1 individuals are heterozygous at all genes. These heterozygous F1’s can either be backcrossed to each of their parents to generate two backcrosses (B1 and B2), or the F1 individuals can be crossed with each other to produce the F2 generation. This crossing procedure is illustrated in Fig. 3.1.
[Fig. 3.1: crossing scheme]
P1 × P2: AB/AB × ab/ab
F1: AB/ab
F1 gametes: AB, Ab, aB, ab with frequencies (1-r)/2, r/2, r/2, (1-r)/2
B1 (F1 × P1): AB/AB, AB/Ab, AB/aB, AB/ab with frequencies (1-r)/2, r/2, r/2, (1-r)/2
B2 (F1 × P2): ab/AB, ab/Ab, ab/aB, ab/ab with frequencies (1-r)/2, r/2, r/2, (1-r)/2
F2 (F1 × F1): AB/AB, AB/Ab, Ab/Ab, AB/aB, AB/ab or Ab/aB, Ab/ab, aB/aB, aB/ab, ab/ab with frequencies (1-r)^2/4, 2r(1-r)/4, r^2/4, 2r(1-r)/4, [2(1-r)^2 + 2r^2]/4, 2r(1-r)/4, r^2/4, 2r(1-r)/4, (1-r)^2/4
Fig. 3.1. Experimental design used for linkage analysis of markers.
Consider two markers, A, with alleles A and a, and B, with two alleles B and b. Two inbred line parents, P1 and P2, are homozygous for the large and small alleles of these two genes, respectively. Parent P1 generates gamete or haplotype AB during meiosis, whereas parent P2 generates gamete ab. These two gametes are combined
Recombination Genetic Maps Mendel and others Calculations Simulations
Genetic Recombination
A chromosome inherited by an offspring from a parent is actually a mosaic of the parent’s two chromosomes.
Genetic recombination: genetic material is exchanged between a chromosome of paternal origin and the corresponding chromosome of maternal origin.
January 20, 2010 Recombination 2 / 23
Genetic Maps and Physical Maps
Recombination fraction: the probability that two given loci end up on regions of different parental origin; this occurs when the two loci are separated by an odd number of crossovers.
Genetic Linkage Map: indicates the chromosomal order and recombination fractions between genetic markers.
Sequence Map: indicates the chromosomal order and number of base pairs between genetic markers.
Cytogenetic Map: constructed from visible bands after chromosomal staining.
Radiation Hybrid (RH) Map: a precise map, analogous to a genetic linkage map, generated from radiation-induced DNA fragmentation.
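The "odd number of crossovers" definition of the recombination fraction can be checked numerically. The sketch below makes an assumption that the slides do not state: the number of crossovers between the two loci is Poisson with mean d (the map distance in Morgans, no interference). Under that assumption the probability of an odd count is (1 - exp(-2d))/2, known as Haldane's map function. The names `d`, `n_meioses`, and `haldane` are illustrative only.

```r
set.seed(1)

d <- 0.2            # assumed map distance in Morgans (illustrative value)
n_meioses <- 100000 # number of simulated gametes

# Assumption: crossover counts between the two loci are Poisson(d)
crossovers <- rpois(n_meioses, lambda = d)

# A gamete is recombinant when the number of crossovers is odd
r_sim <- mean(crossovers %% 2 == 1)

# Probability of an odd Poisson count (Haldane's map function)
haldane <- function(d) (1 - exp(-2 * d)) / 2
r_theory <- haldane(d)

c(simulated = r_sim, theory = r_theory)
```

Because an even number of crossovers restores the parental arrangement, the recombination fraction under this model never exceeds 1/2, however large d becomes.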
NCBI — Map Viewer
Map Viewer
Getting Started Using Map Viewer
Vertebrates: human, chimpanzee, lab mouse, rat, duck-billed platypus, cattle, river buffalo, dog, horse, cat, sheep, pig, zebrafish, chicken
Invertebrates: pea aphid, African malaria mosquito, honey bee, fruit fly, jewel wasp, red flour beetle, nematode, purple sea urchin
Protozoa: social amoeba
Plants: onion, asparagus, cultivated oat, barley, rice, rye, wheat, maize, beet, brown mustard, black mustard, field mustard, pepper, soybean, kidney bean, alfalfa, poplar, sweet almond, oak, nightshade, tomato, eggplant, potato, cocoa, mungbean, wine grape
Fungi: baker’s yeast, fission yeast
Red indicates species we have collaborated on. Details: statgen.psu.edu
Linkage Map Examples
Definitions
Genetic Marker: a polymorphic strand of DNA; we will denote markers by symbols such as A, B, and C.
Allele: one of a series of different forms of a genetic locus; we will denote alleles by symbols such as A or a for two alleles, or A1, ..., Ak for k alleles.
Genotype: describes the two alleles (in diploid cells) at a single locus; we will denote genotypes by symbols such as AA, Aa, and aa.
Homozygous: genotypes that have two identical alleles; e.g. AA, aa.
Heterozygous: genotypes that have two different alleles; e.g. Aa.
Haplotype: the sequence of alleles along a chromosome; we will denote haplotypes by symbols such as AbC or ACTGGTCA.
1 Basic Genetics
[Diagram (1.2): one chromosome carries A and b; its homolog carries a and B.]
Definition 1.1. [Some Basic Terms] For alleles A and B, the arrangement displayed in diagram (1.1) is termed coupling and is written AB/ab; the arrangement in diagram (1.2) is called repulsion and is indicated by Ab/aB. The relative arrangement of nonalleles (i.e., A vs. B, A vs. b, a vs. B, or a vs. b) at different loci along a chromosome is called the linkage phase.
At an early stage of meiosis, the two chromosomes 1 and 2 lie side by side with corresponding loci aligned. If the parental genotype is AB/ab, we can represent the alignment as in Fig. 1.3A. Each of the paired chromosomes is then duplicated to form two sister strands (chromatids) connected to each other at a region called the centromere. The homologous chromosomes form pairs, so that each resulting complex consists of four chromatids known as a tetrad (Fig. 1.3B). At this stage, the non-sister chromatids adhere to each other in a semi-random fashion at regions called chiasmata. Each chiasma represents a point where crossing over between two non-sister chromatids can occur (Fig. 1.3C). Chiasmata do not occur entirely at random, as they are more likely farther away from the centromere, and it is unusual to find two chiasmata in very close proximity to each other.
[Fig. 1.3 panels: A (pairing up), B (tetrad), C (crossing over), D (haplotype). Labels: centromere, chromosomes 1 and 2, chromatid, chiasma; the haplotypes in panel D are marked NR (nonrecombinant) or R (recombinant).]
Fig. 1.3. Diagram for crossing-over between linked loci A and B.
Each gamete receives one chromatid from a tetrad to make up the haploid complement (Fig. 1.3D). Since it is possible that more than one crossover occurs on the chromosomes, some chromosomes in the haploid complement consist of a number of segments from the two parental chromosomes. The number of segments is determined by the number of crossovers that occurred in the formation of the chromatid that became the chromosome. If no crossovers occur, then the chromosome will be a replicate of an entire parental chromosome. If one crossover occurs between two loci A and B, then the chromosome will consist of two segments, one from each parental chromosome. In the former case, the resultant gametes must be AB or ab, just like the parental chromosomes. In the latter case, where there is one point of exchange, we have the new combinations Ab and aB, called recombinant types. In general, if
genotype at top locus: Aa
genotype at bottom locus: Bb
genotype at both loci: AaBb
haplotype on left: Ab
haplotype on right: aB
diplotype: Ab/aB
parental diplotype: Ab|aB
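As a rough illustration of parental versus recombinant types, the sketch below simulates gametes from a coupling-phase double heterozygote AB/ab. It assumes the gamete frequencies (1-r)/2, r/2, r/2, (1-r)/2 for AB, Ab, aB, ab; the value of `r`, the sample size, and all variable names are illustrative choices, not taken from the slides.

```r
set.seed(2)

r <- 0.1            # assumed recombination fraction (illustrative)
n_gametes <- 10000  # number of simulated gametes

# Gamete frequencies for an AB/ab parent:
# AB and ab are nonrecombinant, Ab and aB are recombinant
gamete_probs <- c(AB = (1 - r) / 2, Ab = r / 2, aB = r / 2, ab = (1 - r) / 2)

gametes <- sample(names(gamete_probs), n_gametes, replace = TRUE,
                  prob = gamete_probs)
counts <- table(factor(gametes, levels = names(gamete_probs)))

# The proportion of recombinant gametes estimates r
r_hat <- sum(counts[c("Ab", "aB")]) / n_gametes

counts
r_hat
```

For a repulsion-phase parent Ab/aB the roles swap: Ab and aB become the nonrecombinant types, so the linkage phase must be known before recombinants can be counted.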
Gregor Mendel (1822–1884)
“father of modern genetics”
monastic priest and abbot of St. Thomas’s Abbey, Brno
cultivated and researched some 29,000 pea plants in the monastery garden
published results in 1866, but their importance was only recognized in the 20th century, well after his death
R. A. Fisher (1890–1962) suggested Mendel’s data was “too good”.
The Two “Laws”
1 Law of Segregation states that allele pairs separate during gamete formation and randomly unite at fertilization. Nonrandom association can be tested with a χ2 statistic.
2 Law of Independent Assortment states that alleles of different genes assort independently of one another during gamete formation. This is only true for genes not linked to each other, e.g. genes on different chromosomes. Two genes can be tested for independent assortment with a χ2 statistic.
F2 Cross: Depicting Dominance and Co-Dominance
F2 Cross: Two Dominant Traits, No Linkage
color marker B: B (brown, dominant); b (white)
tail length marker S: S (short, dominant); s (long)
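With complete dominance, three quarters of F2 individuals show the dominant phenotype at each marker (genotypes with at least one dominant allele). If the two markers are unlinked, the joint phenotype probabilities are products of the single-marker probabilities, which gives the 9:3:3:1 ratio. A minimal sketch, with illustrative variable names:

```r
# Single-marker phenotype probabilities in an F2 with complete dominance
p_color <- c(brown = 3/4, white = 1/4)
p_tail  <- c(short = 3/4, long = 1/4)

# Independent assortment: joint probabilities are the outer product
p_joint <- outer(p_color, p_tail)

# Expressed in sixteenths, the table reads 9, 3, 3, 1
p_joint * 16
```

These four probabilities are the null-hypothesis vector that a χ2 goodness-of-fit test compares against observed phenotype counts.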
Bateson, Saunders, and Punnett Suspect Linkage
Bateson, W., et al. Experimental studies in the physiology of heredity. Reports to the Evolution Committee of the Royal Society 2, 1-55, 80-99 (1905).
Our First R Calculation...
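# Observed counts in the four phenotype classes, tested against the
# 9:3:3:1 ratio expected for two unlinked dominant traits
chisq.test(c(284,21,21,55),p=c(9/16,3/16,3/16,1/16))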
Chi-squared test for given probabilities
data: c(284, 21, 21, 55)
X-squared = 134.7282, df = 3, p-value < 2.2e-16
The Rest of the Story
Blixt, S., 1975 Why didn’t Gregor Mendel find linkage? Nature 256, 206.
Lobo, I. and Shaw, K., 2008 Discovery and types of genetic linkage. Nature Education 1(1).
More R Calculations: Testing Mendelian Segregation
Examples

                      Backcross          F2
                      Aa      aa         AA      Aa      aa
Observation           44      59         43      86      42
Expected frequency    1/2     1/2        1/4     1/2     1/4
Expected number       51.5    51.5       42.75   85.5    42.75

The $\chi^2$ test statistic is calculated by $\chi^2 = \sum (\text{obs} - \text{exp})^2/\text{exp}$.
For the BC: $\chi^2 = (44-59)^2/103 = 2.184 < \chi^2_{df=1} = 3.841$.
For the F2: $\chi^2 = (43-42.75)^2/42.75 + (86-85.5)^2/85.5 + (42-42.75)^2/42.75 = 0.018 < \chi^2_{df=2} = 5.991$.
The marker under study does not deviate from Mendelian segregation in either the BC or the F2.
# Backcross: test the 1:1 segregation ratio
chisq.test(c(44,59),p=c(1/2,1/2))
Chi-squared test for given probabilities
data: c(44, 59)
X-squared = 2.1845, df = 1, p-value = 0.1394
# F2: test the 1:2:1 segregation ratio
chisq.test(c(43,86,42),p=c(1/4,1/2,1/4))
Chi-squared test for given probabilities
data: c(43, 86, 42)
X-squared = 0.0175, df = 2, p-value = 0.9913
Recombination Fraction in Backcross
$$\hat{r} = \min\left(\frac{n_{10} + n_{01}}{n}, \frac{1}{2}\right)$$
Testing for linkage (BC)

                        AaBb       aabb       Aabb     aaBb
Obs                     n11        n00        n10      n01       n = Σ nij
Freq                    (1-r)/2    (1-r)/2    r/2      r/2
Gamete type             nNR = n11 + n00       nR = n10 + n01
Freq with no linkage    1/2                   1/2
Exp                     n/2                   n/2

$$\chi^2 = \sum \frac{(\text{obs} - \text{exp})^2}{\text{exp}} = \frac{(n_{NR} - n_R)^2}{n} \sim \chi^2_{df=1}$$
Example

         AaBb    aabb    Aabb    aaBb
Obs      49      47      3       4

nNR = 49 + 47 = 96, nR = 3 + 4 = 7, n = 96 + 7 = 103
$$\chi^2 = \sum \frac{(\text{obs} - \text{exp})^2}{\text{exp}} = \frac{(96 - 7)^2}{103} = 76.903 > \chi^2_{df=1} = 3.841$$
These two markers are statistically linked. $\hat{r} = 7/103 = 0.068$
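The shortcut form of the statistic follows from the general definition in one step. With no linkage both pooled classes have expected count n/2, and n = n_NR + n_R, so each deviation equals half the difference between the two counts:

```latex
\begin{align*}
\chi^2
  &= \frac{(n_{NR} - n/2)^2}{n/2} + \frac{(n_{R} - n/2)^2}{n/2} \\
  &= \frac{2}{n}\left[\left(\frac{n_{NR} - n_R}{2}\right)^2
       + \left(\frac{n_R - n_{NR}}{2}\right)^2\right] \\
  &= \frac{2}{n}\cdot\frac{(n_{NR} - n_R)^2}{2}
   = \frac{(n_{NR} - n_R)^2}{n}.
\end{align*}
```

The same identity explains the single-marker backcross calculation (44 - 59)^2/103 used for the 1:1 segregation test.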
# Test all four genotype classes against equal frequencies (no linkage)
chisq.test(c(49,47,3,4),p=rep(1/4,4))
Chi-squared test for given probabilities
data: c(49, 47, 3, 4)
X-squared = 77, df = 3, p-value < 2.2e-16
A more powerful test pools the classes into nonrecombinant and recombinant counts (df = 1):
chisq.test(c(49+47,3+4),p=rep(1/2,2))
Chi-squared test for given probabilities
data: c(49 + 47, 3 + 4)
X-squared = 76.9029, df = 1, p-value < 2.2e-16
cat("r hat is ",(3+4)/(49+47+3+4))
r hat is 0.06796117
Recombination in F2
Testing Linkage in F2

                  BB          Bb          bb
AA    Obs         n22 = 20    n21 = 17    n20 = 3
      Exp         n/16        n/8         n/16
Aa    Obs         n12 = 20    n11 = 49    n10 = 19
      Exp         n/8         n/4         n/8
aa    Obs         n02 = 3     n01 = 21    n00 = 19
      Exp         n/16        n/8         n/16

(Exp: expected count with no linkage.)
$n = \sum n_{ij} = 171$
$$\chi^2 = \sum \frac{(\text{obs} - \text{exp})^2}{\text{exp}} = \frac{(20 - 171/16)^2}{171/16} + \cdots = 27.807 > \chi^2_{df=8} = 15.507$$
Therefore, the two markers are significantly linked.
chisq.test(c(20,20,3,17,49,21,3,19,19),
p=c(1/16,1/8,1/16,1/8,1/4,1/8,1/16,1/8,1/16))
Chi-squared test for given probabilities
data: c(20, 20, 3, 17, 49, 21, 3, 19, 19)
X-squared = 27.807, df = 8, p-value = 0.0005124
Recombination Estimation in F2
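# Observed F2 genotype counts: rows AA, Aa, aa; columns BB, Bb, bb
N <- matrix(c(20, 20, 3, 17, 49, 21, 3, 19, 19), 3, 3)
n <- sum(N)
r <- 0.1        # starting value
dif <- 1
eps <- 1e-10    # convergence tolerance
count <- 0
# EM iterations
while (dif > eps) {
  r.old <- r
  # E-step: probability that a double heterozygote AaBb was formed
  # from two recombinant gametes
  phi <- r^2 / ((1 - r)^2 + r^2)
  # M-step: expected number of recombinant gametes over the 2n gametes
  r <- 1 / (2 * n) * (2 * (N[3, 1] + N[1, 3]) + N[3, 2] + N[2, 3] + N[2, 1] + N[1, 2] + 2 * phi * N[2, 2])
  dif <- abs(r - r.old)
  count <- count + 1
}
r
[1] 0.3073867
count
[1] 22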
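As a cross-check on the iterative estimate, the multinomial log-likelihood can be maximized directly. The sketch below uses base R's `optimize()` on the interval (0.001, 0.5); the function name `loglik` and the choice of interval are assumptions made for illustration, and this is a different route to the same estimate rather than the method used on the slide.

```r
# Observed F2 genotype counts: rows AA, Aa, aa; columns BB, Bb, bb
N <- matrix(c(20, 20, 3, 17, 49, 21, 3, 19, 19), 3, 3)

# Multinomial log-likelihood of r, dropping the constant multinomial coefficient
loglik <- function(r, N) {
  probs <- matrix(c(
    (1 - r)^2 / 4,   r * (1 - r) / 2,          r^2 / 4,
    r * (1 - r) / 2, ((1 - r)^2 + r^2) / 2,    r * (1 - r) / 2,
    r^2 / 4,         r * (1 - r) / 2,          (1 - r)^2 / 4
  ), 3, 3, byrow = TRUE)
  sum(N * log(probs))
}

# One-dimensional numerical maximization over r
fit <- optimize(loglik, interval = c(0.001, 0.5), N = N, maximum = TRUE)
r_direct <- fit$maximum
r_direct
```

The value should agree with the iterative estimate up to the default tolerance of `optimize()`, which is much looser than the 1e-10 stopping rule used in the loop.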
Likelihood Ratio Test
Multinomial likelihood
$$L(p_1, \ldots, p_k) = \frac{n!}{n_1! \cdots n_k!} \prod_{i=1}^{k} p_i^{n_i}$$
We use $L_0$ for the value of $L$ when the $p$’s are confined to the null hypothesis, and $L_1$ for the alternative hypothesis. The likelihood ratio statistic is
$$LR = -2 \log\left(\frac{L_0}{L_1}\right) = 2\,(\ln L_1 - \ln L_0) = 2\left[\sum_{i=1}^{k} n_i\,(\ln \hat{p}_{i1} - \ln \hat{p}_{i0})\right] \sim \chi^2_{df}$$
where $\hat{p}_{i0}$ and $\hat{p}_{i1}$ are estimates of the $p$’s under the null and alternative hypotheses, respectively, and the degrees of freedom $df$ are determined by the number of additional constraints imposed by the null over the alternative.
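To make the statistic concrete, the sketch below applies it to the pooled backcross counts used earlier in this lecture (49 + 47 nonrecombinants, 3 + 4 recombinants). The null fixes r = 1/2; the alternative uses the observed proportions, so the null imposes one extra constraint and the reference distribution has 1 degree of freedom. Variable names are illustrative.

```r
# Pooled backcross counts: nonrecombinant (NR) and recombinant (R)
n_obs <- c(NR = 49 + 47, R = 3 + 4)

# Class probabilities under H0 (r = 1/2) and under the alternative (MLE)
p_null <- c(0.5, 0.5)
p_alt  <- n_obs / sum(n_obs)

# Likelihood ratio statistic: 2 * sum of n_i * (ln p_i1 - ln p_i0)
LR <- 2 * sum(n_obs * (log(p_alt) - log(p_null)))

# Upper-tail p-value from the chi-square distribution with 1 df
p_value <- pchisq(LR, df = 1, lower.tail = FALSE)

LR
p_value
```

The multinomial coefficient n!/(n_1! ... n_k!) cancels in the ratio, which is why only the log-probabilities appear in the code.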
A Second Test of Linkage in F2
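Testing for no linkage amounts to testing $H_0: r = \frac{1}{2}$ vs. $H_1: r \neq \frac{1}{2}$.
Linkage analysis in the F2

                BB                Bb                         bb
AA    Obs       n22               n21                        n20
      Freq      (1-r)^2/4         r(1-r)/2                   r^2/4
Aa    Obs       n12               n11                        n10
      Freq      r(1-r)/2          (1-r)^2/2 + r^2/2          r(1-r)/2
aa    Obs       n02               n01                        n00
      Freq      r^2/4             r(1-r)/2                   (1-r)^2/4

Likelihood function
$$L(r \mid n_{ij}) = \frac{n!}{n_{22}! \cdots n_{00}!} \left[\tfrac{1}{4}(1-r)^2\right]^{n_{22}+n_{00}} \left[\tfrac{1}{4}r^2\right]^{n_{20}+n_{02}} \left[\tfrac{1}{2}r(1-r)\right]^{n_{21}+n_{12}+n_{10}+n_{01}} \left[\tfrac{1}{2}(1-r)^2 + \tfrac{1}{2}r^2\right]^{n_{11}}$$
Setting the score to 0 gives the MLE of r, but solving it is difficult because AaBb contains a mix of two genotype formation types (the denominator of its score term is $\tfrac{1}{2}(1-r)^2 + \tfrac{1}{2}r^2$).
$$LR = 2\Big[\, n_{22}\Big(\ln\big[\tfrac{1}{4}(1-\hat{r})^2\big] - \ln\big[\tfrac{1}{4}(1-\tfrac{1}{2})^2\big]\Big) + n_{12}\Big(\ln\big[\tfrac{1}{2}\hat{r}(1-\hat{r})\big] - \ln\big[\tfrac{1}{2}\cdot\tfrac{1}{2}(1-\tfrac{1}{2})\big]\Big) + \cdots + n_{00}\Big(\ln\big[\tfrac{1}{4}(1-\hat{r})^2\big] - \ln\big[\tfrac{1}{4}(1-\tfrac{1}{2})^2\big]\Big) \Big]$$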
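The nine genotype frequencies in the table can be derived rather than memorized: each F2 individual is the union of two independent F1 gametes with frequencies (1-r)/2, r/2, r/2, (1-r)/2 for AB, Ab, aB, ab. The sketch below collapses the 4 x 4 table of gamete pairs into the 3 x 3 genotype table. The function name `f2_freqs` and its internals are hypothetical and serve only as a check of the table.

```r
# Derive F2 two-locus genotype frequencies from F1 gamete frequencies
f2_freqs <- function(r) {
  g <- c(AB = (1 - r) / 2, Ab = r / 2, aB = r / 2, ab = (1 - r) / 2)
  joint <- outer(g, g)  # probabilities of ordered gamete pairs

  # Number of A alleles and of B alleles carried by each gamete
  nA <- c(AB = 1, Ab = 1, aB = 0, ab = 0)
  nB <- c(AB = 1, Ab = 0, aB = 1, ab = 0)
  A_total <- outer(nA, nA, "+")
  B_total <- outer(nB, nB, "+")

  out <- matrix(0, 3, 3,
                dimnames = list(c("AA", "Aa", "aa"), c("BB", "Bb", "bb")))
  for (i in 0:2) {
    for (j in 0:2) {
      # i copies of A and j copies of B; two copies map to the first row/column
      out[3 - i, 3 - j] <- sum(joint[A_total == i & B_total == j])
    }
  }
  out
}

f2_freqs(0.3)
```

The AaBb cell pools two kinds of gamete pairs, AB with ab (both nonrecombinant) and Ab with aB (both recombinant), which is the mixture that makes the score equation awkward to solve in closed form.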
LR Test in R
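N
     [,1] [,2] [,3]
[1,]   20   17    3
[2,]   20   49   19
[3,]    3   21   19
r
[1] 0.3073867
# F2 genotype probabilities as a function of r,
# listed in the column-major order of N
p <- function(r) {
  c(1/4 * (1 - r)^2,
    1/2 * r * (1 - r),
    1/4 * r^2,
    1/2 * r * (1 - r),
    1/2 * (1 - r)^2 + 1/2 * r^2,
    1/2 * r * (1 - r),
    1/4 * r^2,
    1/2 * r * (1 - r),
    1/4 * (1 - r)^2)
}
sum(p(r))   # the nine probabilities sum to 1
[1] 1
n <- as.vector(N)
# Likelihood ratio statistic for H0: r = 1/2 against the estimated r
LR <- 2 * sum(n * log(p(r)) - n * log(p(1/2)))
LR
[1] 27.98067
# Upper-tail p-value with 1 degree of freedom
pchisq(LR, 1, lower.tail = FALSE)
[1] 1.225332e-07
This is a more powerful test than the basic χ2 test performed with 8 = 9 - 1 degrees of freedom.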
Performance of r̂
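# EM estimate of r from a 3 x 3 table of F2 genotype counts
# (rows AA, Aa, aa; columns BB, Bb, bb)
r.est <- function(N) {
  n <- sum(N)
  r <- 0.1
  dif <- 1
  eps <- 1e-10
  while (dif > eps) {
    r.old <- r
    phi <- r^2 / ((1 - r)^2 + r^2)
    r <- 1 / (2 * n) * (2 * (N[3, 1] + N[1, 3]) +
      N[3, 2] + N[2, 3] + N[2, 1] + N[1, 2] +
      2 * phi * N[2, 2])
    dif <- abs(r - r.old)
  }
  r
}
# One simulated F2 sample of size 200 with true r = 0.25;
# p() is the genotype-probability function defined for the LR test
N <- matrix(rmultinom(1, 200, p(.25)), 3, 3)
r.est(N)
[1] 0.2328327
# Mean and standard deviation of the estimate over 999 samples of size 20
res <- replicate(999,
  r.est(N = matrix(rmultinom(1, 20, p(.25)), 3, 3)))
mean(res); sd(res)
[1] 0.2522241
[1] 0.08790692
# The same with samples of size 200
res <- replicate(999,
  r.est(N = matrix(rmultinom(1, 200, p(.25)), 3, 3)))
mean(res); sd(res)
[1] 0.2497108
[1] 0.02560488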