Intro to Quantitative Genetics
HGEN502, 2011
Hermine H. Maes
Intro to Quantitative Genetics
1/18: Course introduction; Introduction to Quantitative Genetics & Genetic Model Building
1/20: Study Design and Genetic Model Fitting
1/25: Basic Twin Methodology
1/27: Advanced Twin Methodology and Scope of Genetic Epidemiology
2/1: Quantitative Genetics Problem Session
Aims of this talk
Historical Background
Genetical Principles
- Genetic Parameters: additive, dominance
- Biometrical Model
Statistical Principles
- Basic concepts: mean, variance, covariance
- Path Analysis
- Likelihood
Quantitative Genetics Principles
Analysis of patterns and mechanisms underlying variation in continuous traits to resolve and identify their genetic and environmental causes
- Continuous traits have a continuous phenotypic range; often polygenic & influenced by environmental effects
- Ordinal traits are expressed in whole numbers; can be treated as approximately discontinuous or as threshold traits
- Some qualitative traits can be treated as having an underlying quantitative basis, expressed as a threshold trait (or multiple thresholds)
Types of Genetic Influence
Mendelian Disorders
- Single gene, highly penetrant, severe, small % affected (e.g., Huntington’s Disease)
Chromosomal Disorders
- Insertions, deletions of chromosomal sections, severe, small % affected (e.g., Down’s Syndrome)
Complex Traits
- Multiple genes (of small effect), environment, large % population, susceptibility – not destiny (e.g., depression, alcohol dependence, etc)
Genetic Disorders
Great 19th Century Biologists
Gregor Mendel (1822-1884): Mathematical rules of particulate inheritance (“Mendel’s Laws”)
Charles Darwin (1809-1882): Evolution depends on differential reproduction of inherited variants
Francis Galton (1822-1911): Systematic measurement of family resemblance
Karl Pearson (1857-1936): “Pearson Correlation”; graduate student of Galton
Family Measurements
Standardize Measurement
Pearson and Lee’s diagram for measurement of “span” (finger-tip to finger-tip distance)
From Pearson and Lee (1903) p.378
Parent Offspring Correlations
From Pearson and Lee (1903) p.387
Sibling Correlations
© Lindon Eaves, 2009
Nuclear Family Correlations
Quantitative Genetic Strategies
Family Studies
- Does the trait aggregate in families?
- The (Really!) Big Problem: Families are a mixture of genetic and environmental factors
Twin Studies
- Galton’s solution: Twins
- One (Ideal) solution: Twins separated at birth
- But unfortunately MZAs are rare
- Easier solution: MZ & DZ twins reared together
Twin Studies Reared Apart
Minnesota Study of Twins Reared Apart (T. Bouchard et al., 1979)
- >100 sets of reared-apart twins from across the US & UK
- All pairs spent formative years apart (but vary tremendously in amount of contact prior to study)
- 56 MZAs participated
Types of Twins
Monozygotic (MZ; “identical”): result from fertilization of a single egg by a single sperm; share 100% of genetic material
Dizygotic (DZ, “fraternal” or “non-identical”): result from independent fertilization of two eggs by two sperm; share on average 50% of their genes
Logic of Classical Twin Study
MZs share 100% genes, DZs (on avg) 50%
Both twin types share 100% environment
- If rMZ > rDZ, then genetic factors are important
- If rDZ > ½ rMZ, then growing up in the same home is important
- If rMZ < 1, then non-shared environmental factors are important
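The three rules above can be sketched as a small check on a pair of twin correlations. This is only an illustration: the function name and the example correlations are hypothetical, and the rules are applied to point values without any allowance for sampling error.

```python
def interpret_twin_correlations(r_mz, r_dz):
    """Apply the three qualitative rules of the classical twin study."""
    return {
        # rMZ > rDZ: genetic factors are important
        "genetic": r_mz > r_dz,
        # rDZ > 1/2 rMZ: growing up in the same home is important
        "shared_environment": r_dz > 0.5 * r_mz,
        # rMZ < 1: non-shared environmental factors are important
        "nonshared_environment": r_mz < 1.0,
    }


# Hypothetical correlations, chosen only for illustration.
result = interpret_twin_correlations(r_mz=0.8, r_dz=0.5)
print(result)
```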
Causes of Twinning
For MZs, appears to be random
For DZs:
- Increases with mother’s age (follicle stimulating hormone, FSH, levels increase with age)
- Hereditary factors (FSH)
- Fertility treatment
Rates of twins/multiple births are increasing, currently ~3% of all births
Zygosity of Twins
Chorionicity of Twins
100% of DZ twins are dichorionic
~1/3 of MZ twins are dichorionic and ~2/3 are monochorionic
Virginia Twin Study of Adolescent Behavioral Development
Twin Correlations
MZ Stature DZ Stature
© Lindon Eaves, 2009
Ronald Fisher (1890-1962)
1918: On the Correlation Between Relatives on the Supposition of Mendelian Inheritance
1921: Introduced concept of “likelihood”
1930: The Genetical Theory of Natural Selection
1935: The Design of Experiments
Fisher developed mathematical theory that reconciled Mendel’s work with Galton and Pearson’s correlations
Fisher (1918): Basic Ideas
- Continuous variation caused by lots of genes (polygenic inheritance)
- Each gene followed Mendel’s laws
- Environment smoothed out genetic differences
- Genes may show different degrees of dominance
- Genes may have many forms (multiple alleles)
- Mating may not be random (assortative mating)
- Showed that correlations obtained by Pearson & Lee were explained well by polygenic inheritance
[“Mendelian” Crosses with Quantitative Traits]
Biometrical Genetics
Lots of credit to: Manuel Ferreira, Shaun Purcell, Pak Sham, Lindon Eaves
Revisit common genetic parameters - such as allele frequencies, genetic effects, dominance, variance components, etc
Use these parameters to construct a biometrical genetic model
Model that expresses the:
(1) Mean
(2) Variance
(3) Covariance between individuals
for a quantitative phenotype as a function of genetic parameters.
Building a Genetic Model
Population level: allele and genotype frequencies
Transmission level: Mendelian segregation, genetic relatedness
Phenotype level: biometrical model, additive and dominance components
[Figure: genotypes (G) at the population and transmission levels giving rise to phenotypes (P)]
Genetic Concepts
Population level
1. Allele frequencies
A single locus, with two alleles
- Biallelic / diallelic
- Single nucleotide polymorphism, SNP
Alleles A and a
- Frequency of A is p
- Frequency of a is q = 1 – p
Every individual inherits two alleles
- A genotype is the combination of the two alleles
- e.g. AA, aa (the homozygotes) or Aa (the heterozygote)
2. Genotype frequencies (Random mating)
                 Allele 2
Allele 1         A (p)           a (q)
A (p)            AA ($p^2$)      Aa ($pq$)
a (q)            aA ($qp$)       aa ($q^2$)
Hardy-Weinberg Equilibrium frequencies
P (AA) = $p^2$
P (Aa) = $2pq$
P (aa) = $q^2$
$p^2 + 2pq + q^2 = 1$
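A minimal sketch of the Hardy-Weinberg frequencies for a biallelic locus under random mating. The function name and the allele frequency 0.3 are hypothetical and used only for illustration.

```python
def hardy_weinberg(p):
    """Genotype frequencies at a biallelic locus under random mating."""
    q = 1.0 - p  # frequency of allele a
    return {"AA": p * p, "Aa": 2 * p * q, "aa": q * q}


# Hypothetical allele frequency for A.
freqs = hardy_weinberg(0.3)
print(freqs, sum(freqs.values()))
```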
Population level
Transmission level
Mendel’s experiments
Intercross: pure lines AA × aa give F1 Aa; F1 × F1 (Aa × Aa) gives AA, Aa, Aa, aa (3:1 segregation ratio)
Back cross: F1 (Aa) × pure line (aa) gives Aa, aa (1:1 segregation ratio)
Transmission level
Segregation, Meiosis
Mendel’s law of segregation
Father (A1A2) gametes: A1 (½), A2 (½)
Mother (A3A4) gametes: A3 (½), A4 (½)
Offspring: A1A3 (¼), A2A3 (¼), A1A4 (¼), A2A4 (¼)
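The segregation table can be reproduced by enumerating one gamete from each parent, each transmitted with probability ½. This is a sketch only; the function name and the tuple representation of genotypes are assumptions made for the example.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def offspring_distribution(father, mother):
    """Offspring genotype probabilities under Mendel's law of segregation.

    Each parent is a pair of alleles; each allele is transmitted with
    probability 1/2, so every gamete combination has probability 1/4.
    """
    dist = Counter()
    for paternal, maternal in product(father, mother):
        genotype = tuple(sorted((paternal, maternal)))  # ignore parental origin
        dist[genotype] += Fraction(1, 4)
    return dict(dist)


four_allele = offspring_distribution(("A1", "A2"), ("A3", "A4"))
intercross = offspring_distribution(("A", "a"), ("A", "a"))  # F1 x F1
backcross = offspring_distribution(("A", "a"), ("a", "a"))   # F1 x pure line
print(four_allele, intercross, backcross, sep="\n")
```

With a dominant allele A, the intercross probabilities (¼ AA, ½ Aa, ¼ aa) give the 3:1 phenotypic ratio, and the back cross gives 1:1.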
Transmission level
1. Classical Mendelian traits
Dominant trait (D - presence, R - absence)
- AA, Aa: D
- aa: R
Recessive trait (D - absence, R - presence)
- AA, Aa: D
- aa: R
Codominant trait (X, Y, Z)
- AA: X
- Aa: Y
- aa: Z
Phenotype level
2. Dominant Mendelian inheritance
Father (Dd) gametes: D (½), d (½)
Mother (Dd) gametes: D (½), d (½)
Offspring: DD (¼), dD (¼), Dd (¼), dd (¼)
Phenotype level
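3. Dominant Mendelian inheritance with incomplete penetrance and phenocopies
Father (Dd) gametes: D (½), d (½)
Mother (Dd) gametes: D (½), d (½)
Offspring: DD (¼), dD (¼), Dd (¼), dd (¼)
Incomplete penetrance: some carriers of D are unaffected
Phenocopies: some dd individuals are affected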
Phenotype level
4. Recessive Mendelian inheritance
Father (Dd) gametes: D (½), d (½)
Mother (Dd) gametes: D (½), d (½)
Offspring: DD (¼), dD (¼), Dd (¼), dd (¼)
Phenotype level
Two kinds of differences
Continuous
- Graded, no distinct boundaries
- e.g. height, weight, blood-pressure, IQ, extraversion
Categorical
- Yes/No
- Normal/Affected (Dichotomous)
- None/Mild/Severe (Multicategory)
- Often called “threshold traits” because people are “affected” if they fall above some level of a measured or hypothesized continuous trait
Phenotype level
Polygenic Traits
1 Gene: 3 Genotypes, 3 Phenotypes
2 Genes: 9 Genotypes, 5 Phenotypes
3 Genes: 27 Genotypes, 7 Phenotypes
4 Genes: 81 Genotypes, 9 Phenotypes
Mendel’s Experiments in Plant Hybridization showed how discrete particles (particulate theory of inheritance) behaved mathematically: all-or-nothing states (round/wrinkled, green/yellow), “Mendelian” disease.
How do these particles produce a continuous trait like stature or liability to a complex disorder?
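One way to see how discrete genes can produce a near-continuous trait is to count genotypes and phenotype classes as genes are added. The sketch below assumes biallelic genes with equal, purely additive effects and allele frequency ½ at every gene; these assumptions, and the function name, are introduced for the example and are not stated on the slide.

```python
from math import comb


def polygenic_summary(n_genes):
    """Count genotypes and phenotype classes for n biallelic additive genes."""
    n_genotypes = 3 ** n_genes   # AA, Aa or aa at each gene
    n_alleles = 2 * n_genes      # two alleles per gene
    # The phenotype class is the number of "increasing" alleles (0..2n);
    # with allele frequency 1/2 the class frequencies are binomial.
    class_probs = [comb(n_alleles, k) / 2 ** n_alleles
                   for k in range(n_alleles + 1)]
    return n_genotypes, len(class_probs), class_probs


for n in range(1, 5):
    genotypes, phenotypes, probs = polygenic_summary(n)
    print(n, genotypes, phenotypes, [round(x, 3) for x in probs])
```

As the number of genes grows, the class frequencies approach a bell-shaped distribution, which is the intuition behind treating stature or liability as continuous.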
Phenotype level
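Quantitative traits
[Figure: distributions P(X) of trait X for genotypes AA, Aa and aa]
Phenotype level
Biometric Model
Genotypic effects, relative to the midpoint m of the two homozygotes: aa = $-a$, Aa = $d$, AA = $+a$
Genotypic means: aa = $m - a$, Aa = $m + d$, AA = $m + a$
Phenotype level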
Very Basic Statistical Concepts
1. Mean (X)
2. Variance (X)
3. Covariance (X,Y)
4. Correlation (X,Y): $r_{x,y} = \frac{\mathrm{cov}_{x,y}}{s_x s_y}$
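A minimal sketch of the four statistics for paired observations. It assumes the usual sample (n − 1) definitions of variance and covariance, which the transcript does not show; the function names and data are hypothetical.

```python
def mean(xs):
    return sum(xs) / len(xs)


def covariance(xs, ys):
    """Sample covariance with an n - 1 denominator (an assumption here)."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)


def variance(xs):
    # The variance is the covariance of a variable with itself.
    return covariance(xs, xs)


def correlation(xs, ys):
    # r = cov(x, y) / (s_x * s_y)
    return covariance(xs, ys) / (variance(xs) ** 0.5 * variance(ys) ** 0.5)


# Hypothetical data: y is exactly twice x, so the correlation is 1.
x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 4.0, 6.0, 8.0]
print(mean(x), variance(x), covariance(x, y), correlation(x, y))
```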
Biometrical model for single biallelic QTL
Biallelic locus
- Genotypes: AA, Aa, aa
- Genotype frequencies: $p^2$, $2pq$, $q^2$
Alleles at this locus are transmitted from parent to offspring (P-O) according to Mendel’s law of segregation
Genotypes for this locus influence the expression of a quantitative trait X (i.e. locus is a QTL)
Biometrical genetic model that estimates the contribution of this QTL towards the (1) Mean, (2) Variance and (3) Covariance between individuals for this quantitative trait X
1. Contribution of the QTL to the Mean (X)
Genotypes            AA        Aa        aa
Frequencies, f(x)    $p^2$     $2pq$     $q^2$
Effect, x            $a$       $d$       $-a$
Mean (X) $= a(p^2) + d(2pq) - a(q^2) = a(p-q) + 2pqd$
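The simplification uses $p^2 - q^2 = (p-q)(p+q) = p - q$. A short numerical check, with a hypothetical function name and parameter values:

```python
def qtl_mean(p, a, d):
    """Mean contribution of a biallelic QTL, computed two ways."""
    q = 1.0 - p
    # Sum of effect times frequency over the three genotypes.
    by_genotype = a * p * p + d * 2 * p * q - a * q * q
    # Closed form given on the slide.
    closed_form = a * (p - q) + 2 * p * q * d
    return by_genotype, closed_form


# Hypothetical values of p, a and d.
by_genotype, closed_form = qtl_mean(p=0.3, a=1.0, d=0.5)
print(by_genotype, closed_form)
```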
Biometrical model for single biallelic QTL
2. Contribution of the QTL to the Variance (X)
Genotypes            AA        Aa        aa
Frequencies, f(x)    $p^2$     $2pq$     $q^2$
Effect, x            $a$       $d$       $-a$
Var (X) $= (a-m)^2 p^2 + (d-m)^2\, 2pq + (-a-m)^2 q^2 = V_{QTL}$, where $m$ = Mean (X)
Broad-sense heritability of X at this locus = $V_{QTL} / V_{Total}$
Broad-sense total heritability of X = $\sum V_{QTL} / V_{Total}$
Biometrical model for single biallelic QTL
Var (X) $= (a-m)^2 p^2 + (d-m)^2\, 2pq + (-a-m)^2 q^2$
$= 2pq[a+(q-p)d]^2 + (2pqd)^2$
$= V_{A_{QTL}} + V_{D_{QTL}}$
Additive effects: the main effects of individual alleles
Dominance effects: represent the interaction between alleles
[Figure: genotypic values of aa ($-a$), Aa ($d$) and AA ($+a$) around $m$, shown for $d = 0$, $d > 0$ and $d < 0$]
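The decomposition into additive and dominance variance can be checked numerically. This is a sketch with hypothetical parameter values; `qtl_variance_components` is a name chosen for the example.

```python
def qtl_variance_components(p, a, d):
    """Total QTL variance and its additive and dominance parts."""
    q = 1.0 - p
    m = a * (p - q) + 2 * p * q * d  # Mean (X)
    # Squared deviations from the mean, weighted by genotype frequency.
    total = ((a - m) ** 2 * p * p
             + (d - m) ** 2 * 2 * p * q
             + (-a - m) ** 2 * q * q)
    va = 2 * p * q * (a + (q - p) * d) ** 2  # additive variance
    vd = (2 * p * q * d) ** 2                # dominance variance
    return total, va, vd


# Hypothetical values of p, a and d.
total, va, vd = qtl_variance_components(p=0.3, a=1.0, d=0.5)
print(total, va, vd, va + vd)
```

Setting `d=0` makes the dominance variance vanish, so all QTL variance is additive.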
Biometrical model for single biallelic QTL
Var (X) = Regression Variance + Residual Variance = Additive Variance + Dominance Variance
[Figure: regression of genotypic value ($-a$, $d$, $+a$) on genotype (aa, Aa, AA), with $m$ marked]
Biometrical model for single biallelic QTL
Var (X) $= 2pq[a+(q-p)d]^2 + (2pqd)^2 = V_{A_{QTL}} + V_{D_{QTL}}$
Demonstrate
2A. Average allelic effect
2B. Additive genetic variance
NOTE: Additive genetic variance depends on allele frequency $p$ & additive genetic value $a$ as well as dominance deviation $d$
Additive genetic variance is typically greater than dominance variance
Biometrical model for single biallelic QTL
2A. Average allelic effect (α)
The deviation of the allelic mean from the population mean
Population mean: Mean (X) $= a(p-q) + 2pqd$
Allele   Paired with A (freq. p)   Paired with a (freq. q)   Allelic mean   Average allelic effect (α)
A        AA, effect $a$            Aa, effect $d$            $ap + dq$      $\alpha_A = q(a + d(q-p))$
a        Aa, effect $d$            aa, effect $-a$           $dp - aq$      $\alpha_a = -p(a + d(q-p))$
Biometrical model for single biallelic QTL
Denote the average allelic effects
- $\alpha_A = q(a + d(q-p))$
- $\alpha_a = -p(a + d(q-p))$
If only two alleles exist, we can define the average effect of allele substitution
- $\alpha = \alpha_A - \alpha_a$
- $\alpha = (q-(-p))(a + d(q-p)) = a + d(q-p)$
Therefore:
- $\alpha_A = q\alpha$
- $\alpha_a = -p\alpha$
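The average allelic effects can be computed directly as allelic mean minus population mean and compared with the closed forms $q\alpha$ and $-p\alpha$. A sketch with hypothetical names and parameter values:

```python
def average_allelic_effects(p, a, d):
    """Average effects of alleles A and a, and of allele substitution."""
    q = 1.0 - p
    pop_mean = a * (p - q) + 2 * p * q * d
    mean_A = a * p + d * q   # A paired with A (freq p) or a (freq q)
    mean_a = d * p - a * q   # a paired with A (freq p) or a (freq q)
    alpha_A = mean_A - pop_mean
    alpha_a = mean_a - pop_mean
    alpha = a + d * (q - p)  # average effect of allele substitution
    return alpha_A, alpha_a, alpha


# Hypothetical values of p, a and d.
p_ex = 0.3
alpha_A, alpha_a, alpha = average_allelic_effects(p_ex, a=1.0, d=0.5)
print(alpha_A, alpha_a, alpha)
```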
Biometrical model for single biallelic QTL
2B. Additive genetic variance
The variance of the average allelic effects, using $\alpha_A = q\alpha$ and $\alpha_a = -p\alpha$
Genotype   Freq.     Additive effect
AA         $p^2$     $2\alpha_A = 2q\alpha$
Aa         $2pq$     $\alpha_A + \alpha_a = (q-p)\alpha$
aa         $q^2$     $2\alpha_a = -2p\alpha$
$V_{A_{QTL}} = (2q\alpha)^2 p^2 + ((q-p)\alpha)^2\, 2pq + (-2p\alpha)^2 q^2$
$= 2pq\alpha^2$
$= 2pq[a + d(q-p)]^2$
If $d = 0$: $V_{A_{QTL}} = 2pqa^2$
If $p = q$: $V_{A_{QTL}} = \tfrac{1}{2}a^2$
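A numerical check that the frequency-weighted sum of squared additive effects collapses to $2pq\alpha^2$. The additive effects have mean zero, so no mean correction is needed. Names and values are hypothetical.

```python
def additive_variance(p, a, d):
    """Additive variance from genotype-level additive effects."""
    q = 1.0 - p
    alpha = a + d * (q - p)
    additive_effect = {"AA": 2 * q * alpha,
                       "Aa": (q - p) * alpha,
                       "aa": -2 * p * alpha}
    freq = {"AA": p * p, "Aa": 2 * p * q, "aa": q * q}
    summed = sum(freq[g] * additive_effect[g] ** 2 for g in freq)
    closed_form = 2 * p * q * alpha ** 2
    return summed, closed_form


# Hypothetical values of p, a and d.
summed, closed_form = additive_variance(p=0.3, a=1.0, d=0.5)
print(summed, closed_form)
```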
Biometrical model for single biallelic QTL
1. Contribution of the QTL to the Mean (X)
2. Contribution of the QTL to the Variance (X)
2A. Average allelic effect (α)
2B. Additive genetic variance
3. Contribution of the QTL to the Covariance (X,Y)
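Biometrical model for single biallelic QTL
3. Contribution of the QTL to the Cov (X,Y)
Cross-products of deviations for each pair of genotypes (individual i in columns, individual j in rows):
               AA $(a-m)$         Aa $(d-m)$         aa $(-a-m)$
AA $(a-m)$     $(a-m)^2$
Aa $(d-m)$     $(a-m)(d-m)$       $(d-m)^2$
aa $(-a-m)$    $(a-m)(-a-m)$      $(d-m)(-a-m)$      $(-a-m)^2$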
Biometrical model for single biallelic QTL
3A. Contribution of the QTL to the Cov (X,Y) – MZ twins
Joint genotype frequencies for MZ pairs:
               AA $(a-m)$    Aa $(d-m)$    aa $(-a-m)$
AA $(a-m)$     $p^2$
Aa $(d-m)$     $0$           $2pq$
aa $(-a-m)$    $0$           $0$           $q^2$
Cov $(X_i, X_j) = (a-m)^2 p^2 + (d-m)^2\, 2pq + (-a-m)^2 q^2$
$= 2pq[a+(q-p)d]^2 + (2pqd)^2$
$= V_{A_{QTL}} + V_{D_{QTL}}$
Biometrical model for single biallelic QTL
3B. Contribution of the QTL to the Cov (X,Y) – Parent-Offspring
Joint genotype frequencies for parent-offspring pairs:
               AA $(a-m)$    Aa $(d-m)$    aa $(-a-m)$
AA $(a-m)$     $p^3$
Aa $(d-m)$     $p^2 q$       $pq$
aa $(-a-m)$    $0$           $pq^2$        $q^3$
Biometrical model for single biallelic QTL
e.g. given an AA father, an AA offspring can come from either AA x AA or AA x Aa parental mating types
- AA x AA will occur with probability $p^2 \times p^2 = p^4$ and have AA offspring with Prob() = 1
- AA x Aa will occur with probability $p^2 \times 2pq = 2p^3 q$ and have AA offspring with Prob() = 0.5 and Aa offspring with Prob() = 0.5
therefore, P(AA father & AA offspring) $= p^4 + p^3 q = p^3(p+q) = p^3$
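The same reasoning extends to every father-offspring cell by enumerating all mating types under random mating. This is a sketch; the function name, the dictionary layout and the value of p are assumptions made for the example.

```python
from itertools import product


def father_offspring_joint(p):
    """Joint probabilities of (father genotype, offspring genotype)."""
    q = 1.0 - p
    freq = {"AA": p * p, "Aa": 2 * p * q, "aa": q * q}
    # Probability that a parent of each genotype transmits allele A.
    transmit_A = {"AA": 1.0, "Aa": 0.5, "aa": 0.0}
    joint = {}
    for father, mother in product(freq, repeat=2):
        mating = freq[father] * freq[mother]  # random mating
        fa, ma = transmit_A[father], transmit_A[mother]
        child = {"AA": fa * ma,
                 "Aa": fa * (1 - ma) + (1 - fa) * ma,
                 "aa": (1 - fa) * (1 - ma)}
        for genotype, prob in child.items():
            key = (father, genotype)
            joint[key] = joint.get(key, 0.0) + mating * prob
    return joint


# Hypothetical allele frequency.
p_val = 0.3
joint = father_offspring_joint(p_val)
print(joint[("AA", "AA")], p_val ** 3)
```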
Biometrical model for single biallelic QTL
3B. Contribution of the QTL to the Cov (X,Y) – Parent-Offspring
Weighting each cross-product of deviations by its parent-offspring joint frequency ($p^3$, $p^2 q$, $0$, $pq$, $pq^2$, $q^3$):
Cov $(X_i, X_j) = (a-m)^2 p^3 + \dots + (-a-m)^2 q^3$
$= pq[a+(q-p)d]^2$
$= \tfrac{1}{2} V_{A_{QTL}}$
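The elided sum can be written out and checked against $\tfrac{1}{2}V_{A_{QTL}}$. The sketch below takes the six joint frequencies from the slide as given; off-diagonal cells are counted twice because either member of the pair can carry either genotype. Names and parameter values are hypothetical.

```python
def parent_offspring_cov(p, a, d):
    """Parent-offspring covariance at a biallelic QTL, and half of VA."""
    q = 1.0 - p
    m = a * (p - q) + 2 * p * q * d
    dev = {"AA": a - m, "Aa": d - m, "aa": -a - m}
    # Lower triangle of the joint frequency table.
    joint = {("AA", "AA"): p ** 3, ("AA", "Aa"): p * p * q,
             ("AA", "aa"): 0.0, ("Aa", "Aa"): p * q,
             ("Aa", "aa"): p * q * q, ("aa", "aa"): q ** 3}
    cov = 0.0
    for (g1, g2), prob in joint.items():
        weight = prob if g1 == g2 else 2 * prob  # both orderings
        cov += weight * dev[g1] * dev[g2]
    half_va = p * q * (a + (q - p) * d) ** 2
    return cov, half_va


# Hypothetical values of p, a and d.
cov_po, half_va = parent_offspring_cov(p=0.3, a=1.0, d=0.5)
print(cov_po, half_va)
```

Note that the dominance deviation contributes to the parent-offspring covariance only through its effect on α, not as a separate dominance variance term.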
Biometrical model for single biallelic QTL
3C. Contribution of the QTL to the Cov (X,Y) – Unrelated individuals
Joint genotype frequencies for unrelated pairs:
               AA $(a-m)$    Aa $(d-m)$     aa $(-a-m)$
AA $(a-m)$     $p^4$
Aa $(d-m)$     $2p^3 q$      $4p^2 q^2$
aa $(-a-m)$    $p^2 q^2$     $2pq^3$        $q^4$
Cov $(X_i, X_j) = (a-m)^2 p^4 + \dots + (-a-m)^2 q^4 = 0$
Biometrical model for single biallelic QTL
3D. Contribution of the QTL to the Cov (X,Y) – DZ twins and full sibs
# identical alleles inherited from parents:
- 2 alleles: ¼ of the genome (as for MZ twins)
- 1 allele: ½ of the genome, ¼ from the mother and ¼ from the father (as for P-O)
- 0 alleles: ¼ of the genome (as for unrelateds)
Cov $(X_i, X_j) = \tfrac{1}{4}$ Cov(MZ) $+ \tfrac{1}{2}$ Cov(P-O) $+ \tfrac{1}{4}$ Cov(Unrel)
$= \tfrac{1}{4}(V_{A_{QTL}} + V_{D_{QTL}}) + \tfrac{1}{2}(\tfrac{1}{2} V_{A_{QTL}}) + \tfrac{1}{4}(0)$
$= \tfrac{1}{2} V_{A_{QTL}} + \tfrac{1}{4} V_{D_{QTL}}$
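The mixture argument can be written as a weighted sum of the three covariances derived earlier. A minimal sketch; the function name and the variance values are hypothetical.

```python
def sibling_covariance(va, vd):
    """DZ twin / full sib covariance as a mixture over allele sharing."""
    cov_mz = va + vd     # share 2 alleles
    cov_po = 0.5 * va    # share 1 allele
    cov_unrel = 0.0      # share 0 alleles
    return 0.25 * cov_mz + 0.5 * cov_po + 0.25 * cov_unrel


# Hypothetical additive and dominance variances.
va_ex, vd_ex = 0.6048, 0.0441
cov_dz = sibling_covariance(va_ex, vd_ex)
print(cov_dz)
```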
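Biometrical model for single biallelic QTL
Biometrical model predicts contribution of a QTL to the mean, variance and covariances of a trait
1 QTL
- Var (X) $= V_{A_{QTL}} + V_{D_{QTL}}$
- Cov (MZ) $= V_{A_{QTL}} + V_{D_{QTL}}$
- Cov (DZ) $= \tfrac{1}{2} V_{A_{QTL}} + \tfrac{1}{4} V_{D_{QTL}}$
Multiple QTL
- Var (X) $= \sum(V_{A_{QTL}}) + \sum(V_{D_{QTL}}) = V_A + V_D$
- Cov (MZ) $= \sum(V_{A_{QTL}}) + \sum(V_{D_{QTL}}) = V_A + V_D$
- Cov (DZ) $= \sum(\tfrac{1}{2} V_{A_{QTL}}) + \sum(\tfrac{1}{4} V_{D_{QTL}}) = \tfrac{1}{2} V_A + \tfrac{1}{4} V_D$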
Summary
Biometrical model underlies the variance components estimation performed in Mx
- Var (X) $= V_A + V_D + V_E$
- Cov (MZ) $= V_A + V_D$
- Cov (DZ) $= \tfrac{1}{2} V_A + \tfrac{1}{4} V_D$
Path Analysis
HGEN502, 2011
Hermine H. Maes
Model Building
Write equations for means, variances and covariances of different types of relatives, or
Draw path diagrams for easy derivation of expected means, variances and covariances and translation to mathematical formulation
Method of Path Analysis
Allows us to represent linear models for the relationship between variables in diagrammatic form, e.g. a genetic model; a factor model; a regression model
Makes it easy to derive expectations for the variances and covariances of variables in terms of the parameters of the proposed linear model
Permits easy translation into matrix formulation as used by statistical programs
Path Diagram Variables
Squares or rectangles denote observed variables
Circles or ellipses denote latent (unmeasured) variables
Upper-case letters are used to denote variables
Lower-case letters (or numeric values) are used to denote covariances or path coefficients
Variables
latent variables
observed variables
Path Diagram Arrows
Single-headed arrows or paths (–>) are used to represent causal relationships between variables under a particular model - where the variable at the tail is hypothesized to have a direct influence on the variable at the head
Double-headed arrows (<–>) represent a covariance between two variables, which may arise through common causes not represented in the model. They may also be used to represent the variance of a variable
Arrows
double-headed arrows
single-headed arrows
Path Analysis Tracing Rules
Trace backwards, change direction at a 2-headed arrow, then trace forwards (implies that we can never trace through two-headed arrows in the same chain).
The expected covariance between two variables, or the expected variance of a variable, is computed by multiplying together all the coefficients in a chain, and then summing over all possible chains.
Non-genetic Example
[Path diagram of observed variables A, B and C; figure not reproduced]
Expectations
Cov AB $= kl + mqn + mpl$
Cov BC $= no$
Cov AC $= mqo$
Var A $= k^2 + m^2 + 2kpm$
Var B $= l^2 + n^2$
Var C $= o^2$
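The tracing rules translate into the matrix form Σ = ΛΦΛᵀ, where Λ holds the single-headed paths and Φ the double-headed arrows. The path diagram itself is not in the transcript, so the layout below is an assumption inferred from the expectations: three unit-variance latent variables (called D, E, F here, hypothetical names) with D → A (k), D → B (l), E → A (m), F → B (n), F → C (o), D <–> E (p) and E <–> F (q). The numeric path values are arbitrary.

```python
def implied_covariance(loadings, latent_cov):
    """Expected covariance matrix Lambda * Phi * Lambda^T (pure Python)."""
    n_obs, n_lat = len(loadings), len(latent_cov)
    return [[sum(loadings[r][i] * latent_cov[i][j] * loadings[c][j]
                 for i in range(n_lat) for j in range(n_lat))
             for c in range(n_obs)]
            for r in range(n_obs)]


# Arbitrary illustrative values for the paths and covariances.
k, l, m, n, o, p, q = 0.5, 0.4, 0.3, 0.6, 0.7, 0.2, 0.1

# Rows: observed A, B, C. Columns: assumed latent D, E, F.
loadings = [[k, m, 0.0],
            [l, 0.0, n],
            [0.0, 0.0, o]]
# Unit latent variances; D and F are assumed uncorrelated.
latent_cov = [[1.0, p, 0.0],
              [p, 1.0, q],
              [0.0, q, 1.0]]

sigma = implied_covariance(loadings, latent_cov)
print(sigma[0][1], k * l + m * q * n + m * p * l)  # Cov AB both ways
```

Because each chain passes through at most one entry of Φ, the matrix product never traces through two double-headed arrows in the same chain.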
Genetic Examples
- MZ Twins Reared Together
- DZ Twins Reared Together
- MZ Twins Reared Apart
- DZ Twins Reared Apart
- Parents & Offspring
MZ Twins Reared Together
Expected Covariance (variances on the diagonal, covariance off the diagonal)
           Twin 1               Twin 2
Twin 1     $a^2+c^2+e^2$        $a^2+c^2$
Twin 2     $a^2+c^2$            $a^2+c^2+e^2$
MZ Twins RT
DZ Twins Reared Together
Expected Covariance
           Twin 1               Twin 2
Twin 1     $a^2+c^2+e^2$        $.5a^2+c^2$
Twin 2     $.5a^2+c^2$          $a^2+c^2+e^2$
DZ Twins RT
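The two expected covariance matrices can be generated from the variance components, and the expectations can be solved back for the components when the trait is standardized (total variance 1). The inversion below is plain algebra on the two matrices above, not a model-fitting procedure; the function names and component values are hypothetical.

```python
def ace_expected_cov(a2, c2, e2, zygosity):
    """Expected 2x2 twin covariance matrix; a2, c2, e2 are squared paths."""
    variance = a2 + c2 + e2
    covariance = a2 + c2 if zygosity == "MZ" else 0.5 * a2 + c2
    return [[variance, covariance], [covariance, variance]]


def ace_from_correlations(r_mz, r_dz):
    """Solve rMZ = a2 + c2 and rDZ = .5*a2 + c2 with a2 + c2 + e2 = 1."""
    a2 = 2 * (r_mz - r_dz)
    c2 = 2 * r_dz - r_mz
    e2 = 1 - r_mz
    return a2, c2, e2


# Hypothetical standardized variance components.
mz = ace_expected_cov(0.5, 0.2, 0.3, "MZ")
dz = ace_expected_cov(0.5, 0.2, 0.3, "DZ")
recovered = ace_from_correlations(mz[0][1], dz[0][1])
print(mz, dz, recovered)
```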
MZ Twins Reared Apart
DZ Twins Reared Apart
Twins and Parents
Role of model mediating between theory and data