Advances in Quantitative Trait Analysis
Martin Farrall
Dept. Cardiovascular Medicine, University of Oxford
Wellcome Trust Centre for Human Genetics
a variety of vegetables of the Brassica oleracea species,
differing morphologies and tastes
the familiar response to artificial selection (selective breeding) is driven by QTL
a selection of members of the Canis familiaris species,
differing sizes, shapes, temperaments
[Figure: scatter plots of first and co-twin log(% F-cell) trait; panels: DZ twins, MZ twins; axes: first-twin (x), co-twin (y), both -0.5 to 1.5]
Many human biochemical and physiological traits show quantitative genetic variation
linkage analysis (thousands of samples, hundreds of markers)
unselected sibships: 61680 sibpairs
selected sibships (affecteds > P95): 4941 ASPs
association analysis (hundreds of samples, thousands of markers)
unselected cohorts: 994 individuals
selected cases and controls (cases > P95, controls < P50): 256 individuals
Mapping human QTL in natural populations by brute force!
QTL h2 = 5%, α = 0.00005, 1 - β = 0.85
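To see how the three parameters on this slide drive sample size, here is a minimal sketch for the simplest design only: an unselected cohort and a 1-df association test. The normal approximation, the two-sided alpha and the function name `cohort_size` are assumptions for illustration. It is not the calculation behind the sample sizes quoted above and should not be expected to reproduce them.

```python
from math import ceil
from statistics import NormalDist


def cohort_size(h2_qtl: float, alpha: float, power: float) -> int:
    """Approximate number of unrelated individuals for a 1-df association test.

    Assumption (not from the slides): the test statistic has noncentrality
    N * h2 / (1 - h2), and alpha is two-sided.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)   # critical value for the test
    z_power = z.inv_cdf(power)           # quantile for the desired power
    return ceil((z_alpha + z_power) ** 2 * (1 - h2_qtl) / h2_qtl)


# Parameters from the slide: QTL h2 = 5%, alpha = 0.00005, power = 0.85
n = cohort_size(h2_qtl=0.05, alpha=0.00005, power=0.85)
print(n)
```

Halving the QTL heritability roughly doubles the required cohort, which is why small-effect QTL call for the "brute force" of the slide title.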
Genetic architecture of QTL
•how many?
•range of effect sizes?
•interactions with other QTL and environment
Molecular mechanisms for QTL
•non-synonymous - protein function
•affect gene regulation - cis & trans
Genetic architecture of quantitative traits
•infinitesimal model – huge number of genes, each with a tiny effect (RA Fisher 1930) - evolutionary change is imperceptible and gradual
Ronald Aylmer Fisher 1890 - 1962
[Figure: character vs. time, two panels: gradual change, punctate change]
Genetic architecture of quantitative traits
•oligogenic model – handful of genes of equal effect (mathematically simple, hopelessly optimistic)
•exponential model – finite number of genes with varying effect sizes
first proposed by Alan Robertson (1920 - 1989)
theoretical model – H Allen Orr Evolution 1998 52:935-49
QTL for number of mechanosensory bristles in Drosophila melanogaster
Dilda and Mackay Genetics 2002 162:1655-74
beware the “Beavis” effect!
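The Beavis effect is the tendency for the estimated effects of QTL that pass a detection threshold to be inflated, most strongly when power is low. A minimal simulation sketch follows; the true effect, standard error and threshold are hypothetical values chosen only to make the bias visible.

```python
import random
from statistics import mean

random.seed(1)

TRUE_EFFECT = 0.02   # hypothetical true fraction of variance explained
STD_ERROR = 0.02     # hypothetical sampling error of each estimate
THRESHOLD = 0.05     # hypothetical estimate needed to declare a QTL

# Each replicate stands for one mapping experiment estimating the same QTL.
estimates = [random.gauss(TRUE_EFFECT, STD_ERROR) for _ in range(100_000)]

# Only experiments whose estimate clears the threshold report the QTL.
detected = [e for e in estimates if e > THRESHOLD]

print(f"mean of all estimates:      {mean(estimates):.3f}")
print(f"mean of detected estimates: {mean(detected):.3f}")
print(f"fraction detected:          {len(detected) / len(estimates):.3f}")
```

Averaged over all replicates the estimate is unbiased, but the detected subset necessarily lies above the threshold and so overstates the true effect.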
Incubation time QTL (CAST/Ei and NZW/OlaHSd) F2 mice inoculated with scrapie (prion) extract
[Figure: gene rank (y-axis) vs. fraction of variance (x-axis, 0 to 0.25)]
Lloyd et al. Proc Natl Acad Sci 2001 98:6279-83
QTL for Lp(a)
[Figure: gene rank (y-axis) vs. lodscore (x-axis, 0 to 65)]
Thanks to: Simona Barlera, Claudia Specchia, Enrico Nicolis and Benedetta Chiodini
Istituto Mario Negri - Milano
PROCARDIS - www.procardis.org
Lipoprotein(a) - a risk factor for coronary heart disease
Fisher-Orr model of adaptive evolution
•imagine a 2-dimensional adaptive space
•in which mutations have random and pleiotropic effects
•mutations can have different effect sizes; big mutations are less likely to be adaptive
•but successful mutations must shrink the space
•so new mutations tend to be smaller and smaller until the optimal state is attained
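The adaptive walk described on these slides can be sketched as a small simulation. This is a simplified illustration, not Orr's analysis: the starting point, the exponential distribution of mutation sizes, and the rule that every beneficial mutation is accepted (ignoring fixation probability) are all assumptions, and `adaptive_walk` is a hypothetical name.

```python
import math
import random

random.seed(0)


def adaptive_walk(start=(1.0, 0.0), mean_step=0.5, n_mutations=2000):
    """Return the sizes of accepted mutations and the final distance to the optimum.

    The optimum sits at (0, 0). Simplification: any mutation that moves the
    phenotype closer to the optimum is accepted.
    """
    x, y = start
    accepted = []
    for _ in range(n_mutations):
        size = random.expovariate(1 / mean_step)    # random magnitude
        angle = random.uniform(0, 2 * math.pi)      # random direction (pleiotropy)
        nx = x + size * math.cos(angle)
        ny = y + size * math.sin(angle)
        if math.hypot(nx, ny) < math.hypot(x, y):   # adaptive: closer to optimum
            x, y = nx, ny
            accepted.append(size)
    return accepted, math.hypot(x, y)


steps, final_distance = adaptive_walk()
half = len(steps) // 2
print(f"accepted mutations: {len(steps)}")
print(f"mean size, first half:  {sum(steps[:half]) / half:.4f}")
print(f"mean size, second half: {sum(steps[half:]) / (len(steps) - half):.4f}")
print(f"final distance from optimum: {final_distance:.4f}")
```

A mutation can only be adaptive if its size is less than twice the current distance to the optimum, so as the walk closes in, the accepted steps are forced to become smaller.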
[Figure: gene rank (y-axis) vs. effect size (x-axis)]
Xu S Genetics 2003 163:789-801
a Bayesian analysis
b conventional single QTL models
QTL mapping under the exponential model
250 QTL exponential model
[Figure: gene rank by effect size (y-axis, 0 to 250) vs. QTL-specific heritability (x-axis, 0.0% to 8.0%)]
L-shaped gamma distribution - Piganeau and Eyre-Walker PNAS 100:10335-40, 2003
Scaled so the total h2 = 30%; the 10 largest QTLs jointly account for 86% of the total h2
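A rough sketch of how such a distribution of QTL-specific heritabilities could be generated: draw 250 effects from a gamma distribution with shape below 1 (which gives the L-shaped density), then rescale so they sum to 30%. The shape value and the seed are assumptions, since the slide does not give the parameters, so the top-10 share printed here need not match the 86% quoted above.

```python
import random

random.seed(42)

N_QTL = 250        # number of QTL, from the slide
TOTAL_H2 = 0.30    # total heritability, from the slide
SHAPE = 0.1        # assumed gamma shape (< 1 gives an L-shaped density)

# Draw raw effect sizes, then rescale so they sum to the total heritability.
raw = [random.gammavariate(SHAPE, 1.0) for _ in range(N_QTL)]
scale = TOTAL_H2 / sum(raw)
h2_by_qtl = sorted((r * scale for r in raw), reverse=True)

top10_share = sum(h2_by_qtl[:10]) / TOTAL_H2
print(f"largest QTL-specific h2: {h2_by_qtl[0]:.2%}")
print(f"share of total h2 from the 10 largest QTL: {top10_share:.0%}")
```

Changing `SHAPE` shows how strongly the concentration of heritability in a few large QTL depends on the assumed distribution.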
So just how hard might it be to map the larger QTL?
cases selected with extreme traits (5% tail), “healthy controls”, 85% power, common variant sampled, allowance for 30K tests and FDR (false discovery rate) method to allow for multiple testing
total QTL   mappable QTL   % heritability mapped   smallest effect   alpha      no. cases
250         10             85%                     0.77%             1.67E-05   1230
5K          45             51%                     0.19%             7.50E-05   4318
30K         100            26%                     0.05%             1.67E-04   15350
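The alpha column of the table is consistent with a simple FDR-style relation: per-test alpha = target FDR × (number of mappable QTL) / (number of tests). The sketch below checks that arithmetic. The 5% target FDR is an assumption, because the slide names the FDR method but does not state the level or the formula; `per_test_alpha` is a hypothetical helper.

```python
N_TESTS = 30_000   # "allowance for 30K tests", from the slide
FDR = 0.05         # assumed target false discovery rate (not stated on the slide)


def per_test_alpha(n_mappable: int, n_tests: int = N_TESTS, fdr: float = FDR) -> float:
    """Per-test significance level when n_mappable of n_tests are expected to be true."""
    return fdr * n_mappable / n_tests


# Mappable QTL counts from the three rows of the table.
for mappable in (10, 45, 100):
    print(f"{mappable:>3} mappable QTL -> alpha = {per_test_alpha(mappable):.2E}")
```

Under this assumption the three rows give 1.67E-05, 7.50E-05 and 1.67E-04. The threshold relaxes as more QTL become mappable, but the smallest mappable effect shrinks faster, so the number of cases still rises steeply.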
Genetic architecture of QTL
•how many?
•range of effect sizes?
•interactions with other QTL and environment
Molecular mechanisms for QTL
•non-synonymous - protein function
•affect gene regulation - cis & trans
What types of genetic variation underlie speciation?
...and quantitative variation within a species?
•birth of new genes - duplication
•death of old genes - pseudogenes, deletions
•non-synonymous - directly affecting protein function
•variation affecting gene regulation
M. C. King, A. C. Wilson, Science 188, 107 (1975)
comparative sequencing of chimpanzee and human: <1.5% single nucleotide substitutions, many indels and retrotransposons
•25% of genes in human, yeast and fly genomes are differentially expressed between individuals
•cis- and trans- acting factors are potential mechanisms for eQTL (25-35% cis, 65-75% trans)
•cis- acting factors affect a specific gene, trans- acting factors may well affect multiple genes
•cis-regulatory variants could lie in promoter or enhancer sequences, alter chromatin structure epigenetically, or lie in transcribed sequence and affect half-life
•trans-regulatory variation could affect level of transcription factor (TF) or alter binding/activity of TF
eQTL - expression QTL
Comparison of Brown Norway and Wistar rat liver expression, Affymetrix RAE230a chip (15923 genes)
Thanks to Christine Blancher, Steven Wilder and Dominique Gauguier for sharing this data
Strain differences in gene expression indicate quantitative genetic variation in transcriptome - eQTL
A porcine regulatory mutation in IGF2 causes a major QTL effect on muscle growth, fat mass and heart size
Van Laere et al. Nature 2003 425:832-6
methylation status of CpG island differs between muscle and liver
EMSA: q = unmethylated wild-type, Q = unmethylated mutant, q* = methylated wild-type
IGF2 in vitro expression is reduced by q > Q (P3 = pig IGF2 promoter; q and Q are intron 3 fragments)
[Figure: ACE activity (y-axis, -0.6 to 0.8) by clade (A, B, C, D)]

s.e.   mean    freq.    A31958G  31839insC  A23495G  A6138C
0.40   -0.24    0.62%   A        ins        G        A
0.19    0.71    1.94%   G        ins        A        C
0.15   -0.31    2.60%   G        del        A        C
0.16    0.44    3.29%   G        ins        G        C
0.08   -0.09    7.46%   G        del        A        A
0.08    0.32    7.75%   G        ins        A        A
0.06    0.62   17.07%   G        ins        G        A
0.04   -0.42   28.11%   A        ins        A        C
0.04   -0.18   30.94%   A        ins        A        A

[Figure labels: 5’ to 3’; 5’ breakpoint, 3’ breakpoint]
Quantitative variation in human ACE maps to multiple SNPs (promoter, intron and synonymous variants)
clade A, clade B
two clades have accumulated different variants during their evolution
most are neutral variants, some are low variants, some are high variants
clade A, clade B, clade C
recombination can generate new combinations of variants
clade A, clade B, clade D
depending on the crossover point!
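A toy sketch of the point made on these slides: the same two parental haplotypes yield different recombinant haplotypes depending on where the crossover falls. The haplotype strings and the helper `recombine` are hypothetical and are not taken from the ACE data.

```python
def recombine(hap_5prime: str, hap_3prime: str, crossover: int) -> str:
    """Join the 5' part of one haplotype to the 3' part of another.

    `crossover` is the number of sites taken from the 5' parent.
    """
    if len(hap_5prime) != len(hap_3prime):
        raise ValueError("haplotypes must cover the same sites")
    return hap_5prime[:crossover] + hap_3prime[crossover:]


# Hypothetical haplotypes over six variant sites (not taken from the slides).
clade_a = "AAGGCT"
clade_b = "GTGACC"

for point in (2, 4):
    print(point, recombine(clade_a, clade_b, point))
# 2 AAGACC
# 4 AAGGCC
```

If a high or low variant lies between the two crossover points, the two recombinants carry different alleles at it, which is what makes recombinant clades useful for localising functional variants.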
[Figure labels: HBS1L, cMYB, AHI1; m4, m7]
O. Røsby & K. Berg J Intern Med. 2000 Jan;247(1):139-52
But not all quantitative variation is attributable to regulatory variation: Lp(a) depends on number of “kringle IV” repeats in apolipoprotein(a)
Summary
•the exponential model has theoretical and empirical support
•large cohorts are inevitably required for reliable QTL mapping
•regulatory variants (eQTL) make an important contribution to quantitative variation within and between species
The future
•can we use the exponential model to our advantage?
•what is the balance between relatively rare non-synonymous variants vs. common variants influencing gene regulation?
•what proportion of cis eQTL map to promoter vs. enhancer sequences?