![Page 1: Regulatory variants explain much more heritability … · more heritability than coding variants across 11 common diseases ... •Case/control ascertainment ... PowerPoint Presentation](https://reader031.vdocument.in/reader031/viewer/2022022018/5b8ee2c009d3f2691f8ba82e/html5/thumbnails/1.jpg)
Regulatory variants explain much more heritability than coding
variants across 11 common diseases
Alexander Gusev
Harvard School of Public Health
May 8, 2014
![Page 2: Regulatory variants explain much more heritability … · more heritability than coding variants across 11 common diseases ... •Case/control ascertainment ... PowerPoint Presentation](https://reader031.vdocument.in/reader031/viewer/2022022018/5b8ee2c009d3f2691f8ba82e/html5/thumbnails/2.jpg)
Functional annotation of the human genome
“The question as to what proportion of this complexity is truly functional remains open, however, and this ambiguity presents a serious challenge to genome scientists.” (Mudge et al. 2013 Genome Research)
“Here, we assign biochemical functions for 80% of the genome” (ENCODE Consortium 2012 Nature)
![Page 3: Regulatory variants explain much more heritability … · more heritability than coding variants across 11 common diseases ... •Case/control ascertainment ... PowerPoint Presentation](https://reader031.vdocument.in/reader031/viewer/2022022018/5b8ee2c009d3f2691f8ba82e/html5/thumbnails/3.jpg)
Enrichment in functional categories
• Effect-sizes enriched in genic categories (promoter, UTR, exon)¹.
• GWAS hits enriched 1.4x in DNaseI Hypersensitivity Sites (DHS)²,³.
• $h_g^2$ enriched 1.5x around genes⁴.
• Can we precisely quantify this enrichment?
¹Schork et al. 2013 PLoS Gen; ²Maurano et al. 2012 Science; ³Trynka et al. 2013 Nat. Gen.; ⁴Lee et al. 2012 Nat. Gen.
Ecker et al. 2012 Nature
![Page 4: Regulatory variants explain much more heritability … · more heritability than coding variants across 11 common diseases ... •Case/control ascertainment ... PowerPoint Presentation](https://reader031.vdocument.in/reader031/viewer/2022022018/5b8ee2c009d3f2691f8ba82e/html5/thumbnails/4.jpg)
“hidden” vs. “missing” heritability
$h^2_{\mathrm{GWAS}} \approx 0.01$
$h_g^2 = 0.27$ (se = 0.02)
$h^2 = 0.81$ (CI = 0.73-0.90)
Sullivan et al., Arch Gen Psych 2003; Ripke et al. Nat Gen, 2013; Lee et al. Nat Gen, 2012
![Page 5: Regulatory variants explain much more heritability … · more heritability than coding variants across 11 common diseases ... •Case/control ascertainment ... PowerPoint Presentation](https://reader031.vdocument.in/reader031/viewer/2022022018/5b8ee2c009d3f2691f8ba82e/html5/thumbnails/5.jpg)
“hidden” vs. “missing” heritability
$h^2_{\mathrm{GWAS}} \approx 0.01$
$h_g^2 = 0.27$ (se = 0.02)
$h^2 = 0.81$ (CI = 0.73-0.90)
[Diagram: GWAS (0.01) ← hidden → $h_g^2$ (0.27) ← missing → $h^2$ (0.81)]
Other traits: Yang et al. 2010 Nat Gen; Visscher et al. 2012 AJHG; Lee et al. 2013 Nat Gen
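As a worked check, the two gaps on this slide follow by subtraction from the three estimates. The symbols $\Delta_{\text{hidden}}$ and $\Delta_{\text{missing}}$ are labels introduced here for illustration and do not appear in the talk.

```latex
\begin{align*}
\Delta_{\text{hidden}}  &= h_g^2 - h^2_{\mathrm{GWAS}} \approx 0.27 - 0.01 = 0.26 \\
\Delta_{\text{missing}} &= h^2 - h_g^2 \approx 0.81 - 0.27 = 0.54
\end{align*}
```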
![Page 6: Regulatory variants explain much more heritability … · more heritability than coding variants across 11 common diseases ... •Case/control ascertainment ... PowerPoint Presentation](https://reader031.vdocument.in/reader031/viewer/2022022018/5b8ee2c009d3f2691f8ba82e/html5/thumbnails/6.jpg)
Estimating $h_g^2$
• Given a set of annotations, the phenotype is generated from a linear combination of Normal genetic effect sizes and Normal noise/environment.
• The variance of the phenotype is modeled by multiple kinships (pairwise sample relationships/covariance), one per annotation. Kinships compete for variance.
• Each kinship is estimated directly from the SNPs in the relevant annotation.
• Heritability ($h_g^2$) is the genetic component of V(y).
Yang et al. 2010 Nat Genet.
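A minimal sketch of how one kinship per annotation could be built from SNPs, assuming the common form K = Z Zᵀ / M on standardized genotypes. The function names, the toy genotypes, and the two annotation labels are hypothetical, and the variance-component fit that makes the kinships "compete for variance" is not shown.

```python
import random
from statistics import fmean, pstdev

def standardize(column):
    """Scale one SNP's genotypes to mean 0 and variance 1 across samples."""
    mu, sd = fmean(column), pstdev(column)
    return [(g - mu) / sd for g in column]

def kinship(snp_columns):
    """K = Z Z^T / M for a list of per-SNP genotype columns (each of length N)."""
    # Monomorphic SNPs carry no information and cannot be standardized.
    z_cols = [standardize(col) for col in snp_columns if pstdev(col) > 0]
    n, m = len(z_cols[0]), len(z_cols)
    return [[sum(z[i] * z[j] for z in z_cols) / m for j in range(n)]
            for i in range(n)]

def annotation_kinships(snp_columns, labels):
    """Group SNPs by annotation label and build one kinship per group."""
    groups = {}
    for col, lab in zip(snp_columns, labels):
        groups.setdefault(lab, []).append(col)
    return {lab: kinship(cols) for lab, cols in groups.items()}

# Toy data (hypothetical): 20 samples, 60 SNPs with allele counts in {0, 1, 2}.
random.seed(0)
n_samples, n_snps = 20, 60
snps = [[sum(random.random() < 0.3 for _ in range(2)) for _ in range(n_samples)]
        for _ in range(n_snps)]
labels = ["DHS"] * 15 + ["other"] * 45
K = annotation_kinships(snps, labels)
```

Because every SNP is standardized, the diagonal of each kinship averages to 1, which is a quick sanity check on the construction.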
![Page 7: Regulatory variants explain much more heritability … · more heritability than coding variants across 11 common diseases ... •Case/control ascertainment ... PowerPoint Presentation](https://reader031.vdocument.in/reader031/viewer/2022022018/5b8ee2c009d3f2691f8ba82e/html5/thumbnails/7.jpg)
Measuring $h_g^2$ enrichment
• Enrichment: $\mathrm{Enrichment}_i = \%h_{g,i}^2 \,/\, \%\mathrm{SNPs}_i$
Note: % SNPs ≈ % bp
• P-value: from the Z-score $Z_i = (\%h_{g,i}^2 - \%\mathrm{SNPs}_i) \,/\, \mathrm{se}(\%h_{g,i}^2)$
• Extensive simulations:
– Accounting for LD between categories
– Unbiased for uniform causal variants.
– Nearly-unbiased for complex disease architecture (all rare / rare DHS / etc).
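The two statistics on this slide can be sketched as below. The 79% and 16% inputs are the DHS figures reported later in the talk; the standard error of 0.08 is a hypothetical value chosen for illustration, and the two-sided normal test is an assumption, since the slide does not state sidedness.

```python
import math

def enrichment(pct_h2, pct_snps):
    """Fold enrichment: share of SNP-heritability divided by share of SNPs."""
    return pct_h2 / pct_snps

def enrichment_pvalue(pct_h2, pct_snps, se_pct_h2):
    """P-value from Z = (%h2 - %SNPs) / se(%h2).

    Assumption: two-sided test against a standard normal null.
    """
    z = (pct_h2 - pct_snps) / se_pct_h2
    return math.erfc(abs(z) / math.sqrt(2))

fold = enrichment(0.79, 0.16)
p_value = enrichment_pvalue(0.79, 0.16, se_pct_h2=0.08)  # se is hypothetical
```

A category whose heritability share equals its SNP share gives an enrichment of 1 and a P-value of 1.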
![Page 8: Regulatory variants explain much more heritability … · more heritability than coding variants across 11 common diseases ... •Case/control ascertainment ... PowerPoint Presentation](https://reader031.vdocument.in/reader031/viewer/2022022018/5b8ee2c009d3f2691f8ba82e/html5/thumbnails/8.jpg)
Broad classes of functional variants

| Category | Definition | % bp |
|---|---|---|
| Coding | All exons | 1% |
| UTR | 5’ and 3’ untranslated regions | 1% |
| Promoter | +/- 2 kbp of TSS | 2% |
| DHS* | Regulatory regions from 217 cell-types | 16% |
| Intron | | 29% |
| Intergenic | | 52% |

*ENCODE and ROADMAP data analyzed by: H. Xu, C. Zang, Liu Lab; G. Trynka, Raychaudhuri Lab
![Page 9: Regulatory variants explain much more heritability … · more heritability than coding variants across 11 common diseases ... •Case/control ascertainment ... PowerPoint Presentation](https://reader031.vdocument.in/reader031/viewer/2022022018/5b8ee2c009d3f2691f8ba82e/html5/thumbnails/9.jpg)
Hierarchical, non-overlapping
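One way to read "hierarchical, non-overlapping" is that each SNP is assigned to the first category it matches in a fixed priority order. The sketch below assumes that priority follows the order of the table (Coding first, Intergenic last); the talk does not spell the order out, and the function name `assign_category` is hypothetical.

```python
# Assumed priority order, taken from the order of rows in the table.
PRIORITY = ["Coding", "UTR", "Promoter", "DHS", "Intron", "Intergenic"]

def assign_category(overlaps):
    """Return the single category for a SNP.

    overlaps: set of annotation names the SNP falls in. The highest-priority
    match wins, so no SNP is counted in two categories.
    """
    for name in PRIORITY:
        if name in overlaps:
            return name
    # A SNP overlapping nothing is treated as intergenic.
    return "Intergenic"

example = assign_category({"DHS", "Intron"})
```

Under this reading, a SNP in an intronic DHS peak counts only toward DHS, which is why the percentages can be treated as a partition of the genome.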
![Page 10: Regulatory variants explain much more heritability … · more heritability than coding variants across 11 common diseases ... •Case/control ascertainment ... PowerPoint Presentation](https://reader031.vdocument.in/reader031/viewer/2022022018/5b8ee2c009d3f2691f8ba82e/html5/thumbnails/10.jpg)
Real data: Analysis of 11 complex traits
WTCCC1
2,700 shared controls
1,700 cases each:
• Bipolar disorder
• Coronary artery disease
• Crohn’s disease*
• Hypertension
• Rheumatoid arthritis*
• Type 1 diabetes*
• Type 2 diabetes
WTCCC2
5,200 shared controls
1,800-9,300 cases:
• Schizophrenia
• Ankylosing spondylitis*
• Multiple sclerosis*
• Ulcerative colitis*
*HLA excluded from autoimmune traits
WTCCC, Nature 2007; WTCCC Nature 2011
![Page 11: Regulatory variants explain much more heritability … · more heritability than coding variants across 11 common diseases ... •Case/control ascertainment ... PowerPoint Presentation](https://reader031.vdocument.in/reader031/viewer/2022022018/5b8ee2c009d3f2691f8ba82e/html5/thumbnails/11.jpg)
Meta-analysis: Significant DHS enrichment
[Figure: % SNP-heritability by functional category, mean observed vs. expected (% SNPs); panels: Genotyped SNPs and 1000G Imputed SNPs]

| Functional category | Genotyped SNPs: enrichment (P-value) | 1000G imputed SNPs: enrichment (P-value) |
|---|---|---|
| Coding | 4.1x (2.6e-04) | 13.8x (4.7e-04) |
| UTR | 3.5x (2.2e-03) | 8.4x (4.3e-03) |
| Promoter | 2.2x (1.9e-02) | 2.8x (1.2e-01) |
| DHS | 1.6x (8.0e-06) | 5.1x (<1e-20) |
| Intron | 0.8x (6.4e-02) | 0.1x (5.5e-12) |
| Intergenic | 0.6x (4.1e-13) | -0.1x (<1e-20) |

• Very large DHS enrichment: 79% of $h_g^2$ vs. 16% of SNPs.
• Significant coding enrichment: 8% of $h_g^2$ vs. 1% of SNPs.
• Intron/intergenic not significantly different from zero!
• Enrichment greater in autoimmune traits
![Page 12: Regulatory variants explain much more heritability … · more heritability than coding variants across 11 common diseases ... •Case/control ascertainment ... PowerPoint Presentation](https://reader031.vdocument.in/reader031/viewer/2022022018/5b8ee2c009d3f2691f8ba82e/html5/thumbnails/12.jpg)
Enrichment greatest in imputed variants
Consistent with simulations from imputed data
![Page 13: Regulatory variants explain much more heritability … · more heritability than coding variants across 11 common diseases ... •Case/control ascertainment ... PowerPoint Presentation](https://reader031.vdocument.in/reader031/viewer/2022022018/5b8ee2c009d3f2691f8ba82e/html5/thumbnails/13.jpg)
Enrichment at functional sub-categories

Each panel splits DHS into one sub-category and "other".

| Category | Definition | Enrichment | P-value |
|---|---|---|---|
| DHS (15.7%) | | 5.1x | |
| Specific (4.4%) | DHS peaks appearing in ≤2 cell-types (Trynka, 2013 Nat. Gen.) | 6.1x | 3.2e-03 |
| DGF (8.5%) | Digital genomic foot-printing, specific cleavage sites (Neph, 2013 Nature) | 5.1x | 9.0e-01 |
| Enhancer (3.2%) | Computationally inferred enhancers (Hoffman, 2013 Nucl. Ac. Res.) | 9.8x | 5.1e-04 |

![Page 14: Regulatory variants explain much more heritability … · more heritability than coding variants across 11 common diseases ... •Case/control ascertainment ... PowerPoint Presentation](https://reader031.vdocument.in/reader031/viewer/2022022018/5b8ee2c009d3f2691f8ba82e/html5/thumbnails/14.jpg)
Phenotype/cell-type specific enrichment
• Enrichment in DHS peaks from specific cell-type versus remaining DHS.
• Tested 83 cell-types in autoimmune traits; six show significant enrichment relative to DHS:

| Cell type | Enrichment relative to DHS | P-value | Ref. |
|---|---|---|---|
| Th1 T-Cell | 5.8x | 1.4e-05 | [2] |
| Fetal kidney | 5.4x | 4.3e-04 | |
| Monocyte CD14+ | 4.3x | 4.3e-04 | [2] |
| CD8+ primary | 4.0x | 1.7e-04 | [1] |
| Leukemia | 3.5x | 5.9e-05 | |
| Lymphoblastoid | 3.4x | 3.1e-05 | [2] |

[1] Trynka 2013, Nat Gen.; [2] Maurano 2012, Science
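The talk does not say which significance threshold was applied across the 83 cell-types. As a hedged illustration only, the sketch below checks the six reported P-values against a Bonferroni threshold at an assumed family-wise level of 0.05; both the correction method and the 0.05 level are assumptions, not the author's stated procedure.

```python
N_TESTS = 83   # cell-types tested, from the slide
ALPHA = 0.05   # assumed family-wise error rate (not stated in the talk)

# Bonferroni: divide the family-wise level by the number of tests.
threshold = ALPHA / N_TESTS

# P-values as reported on the slide.
reported = {
    "Th1 T-Cell": 1.4e-05,
    "Fetal kidney": 4.3e-04,
    "Monocyte CD14+": 4.3e-04,
    "CD8+ primary": 1.7e-04,
    "Leukemia": 5.9e-05,
    "Lymphoblastoid": 3.1e-05,
}

passing = [cell for cell, p in reported.items() if p < threshold]
```

Under these assumptions the threshold is about 6.0e-04, and all six reported cell-types fall below it.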
![Page 15: Regulatory variants explain much more heritability … · more heritability than coding variants across 11 common diseases ... •Case/control ascertainment ... PowerPoint Presentation](https://reader031.vdocument.in/reader031/viewer/2022022018/5b8ee2c009d3f2691f8ba82e/html5/thumbnails/15.jpg)
Estimates are robust to artifacts
• Disease architecture
– Extensive LD/MAF simulations (in real data)
– Replication in multiple PGC cohorts
• Estimated variance
– Empirical jack-knife estimate is consistent
• Shared controls
– Permuted study-wide enrichment minor: λGC = 1.3
• Case/control ascertainment
– Alternative $h_g^2$ estimation¹, same answer
• Rare variants?
¹Golan & Rosset, 2013 arXiv
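For the jack-knife check mentioned above, a generic delete-one jack-knife standard error can be sketched as follows. The talk does not state which units were deleted (samples, genomic blocks, or something else), so the input here is simply a hypothetical list of values, and the function name `jackknife_se` is illustrative.

```python
import math
from statistics import fmean

def jackknife_se(values, estimator=fmean):
    """Delete-one jack-knife standard error of `estimator` applied to `values`."""
    n = len(values)
    # Re-estimate with each observation left out in turn.
    leave_one_out = [estimator(values[:i] + values[i + 1:]) for i in range(n)]
    center = fmean(leave_one_out)
    return math.sqrt((n - 1) / n * sum((x - center) ** 2 for x in leave_one_out))

# Hypothetical per-unit enrichment values, for illustration only.
values = [4.2, 5.6, 4.9, 5.3, 5.0]
se = jackknife_se(values)
```

When the estimator is the mean, this reduces to the usual standard error (sample standard deviation divided by √n), which gives a simple way to validate the implementation before applying it to a more complex estimator.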
![Page 16: Regulatory variants explain much more heritability … · more heritability than coding variants across 11 common diseases ... •Case/control ascertainment ... PowerPoint Presentation](https://reader031.vdocument.in/reader031/viewer/2022022018/5b8ee2c009d3f2691f8ba82e/html5/thumbnails/16.jpg)
Rare coding variants do not confound
• Half of $h^2$ still “missing” … can rare variants bias the observed results?
• Data: Exome-chip on 6,500-sample schizophrenia cohort; 64k rare (MAF<0.01) coding variants.
• Contribution from rare coding variants non-significant ($h_g^2$ = 0.037 ± 0.029).
• No impact on DHS enrichment.
• Purcell et al. 2014 Nature: contribution from rare exome-sequence variants modest (0.4%-0.6%).
![Page 17: Regulatory variants explain much more heritability … · more heritability than coding variants across 11 common diseases ... •Case/control ascertainment ... PowerPoint Presentation](https://reader031.vdocument.in/reader031/viewer/2022022018/5b8ee2c009d3f2691f8ba82e/html5/thumbnails/17.jpg)
Enrichment observed in association statistics
A: Stratified QQ plot (height). Axes: expected P-value vs. observed P-value (by function).
B: P-value enrichment (Crohn’s disease). Axes: GWAS P-value threshold vs. enrichment in DHS.
¹Schork et al. 2013 PLoS Gen; ²Maurano et al. 2012 Science
![Page 18: Regulatory variants explain much more heritability … · more heritability than coding variants across 11 common diseases ... •Case/control ascertainment ... PowerPoint Presentation](https://reader031.vdocument.in/reader031/viewer/2022022018/5b8ee2c009d3f2691f8ba82e/html5/thumbnails/18.jpg)
Simulations: Noisy relationship between biology and association statistics
[Figure: P-value enrichment. x-axis: minimum −log10(P-value); y-axis: fold enrichment; series: Coding, UTR, Promoter, DHS, Intron, Intergenic]
Simulation:
• 80% of $h_g^2$ from DHS; 10% from coding; others uniform.
• Generate phenotypes, run 100s of 32k-sample imputed GWAS. Look for enrichment.
• LD & relative enrichment confound the estimate.
[Figure: enrichment vs. GWAS P-value threshold across 100 simulations; series: Coding, Promoter, UTR, Intron, DHS, Intergenic]
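The simulated architecture can be sketched as a per-SNP effect-size variance. This assumes "others uniform" means the remaining 10% of heritability is spread evenly over all SNPs outside DHS and coding; the total heritability of 0.5, the toy SNP counts (roughly following the % bp table), and the function name are hypothetical. Genotypes, phenotypes, and the GWAS itself are not simulated here.

```python
import random

def per_snp_variances(labels, total_h2, shares):
    """Effect-size variance for each SNP.

    Each category in `shares` receives that fraction of total_h2, split
    evenly over its SNPs; the remainder is split evenly over all other SNPs.
    """
    counts = {}
    for lab in labels:
        counts[lab] = counts.get(lab, 0) + 1
    n_rest = sum(c for lab, c in counts.items() if lab not in shares)
    rest_share = 1.0 - sum(shares.values())
    variances = []
    for lab in labels:
        if lab in shares:
            variances.append(total_h2 * shares[lab] / counts[lab])
        else:
            variances.append(total_h2 * rest_share / n_rest)
    return variances

# Toy annotation of 1,000 SNPs (hypothetical counts).
labels = (["Coding"] * 10 + ["UTR"] * 10 + ["Promoter"] * 20
          + ["DHS"] * 160 + ["Intron"] * 290 + ["Intergenic"] * 510)
variances = per_snp_variances(labels, total_h2=0.5,
                              shares={"DHS": 0.8, "Coding": 0.1})

# Draw Normal effect sizes with the per-SNP variances.
rng = random.Random(1)
effects = [rng.gauss(0.0, v ** 0.5) for v in variances]
```

With these toy counts a coding SNP has twice the effect-size variance of a DHS SNP (10% over 10 SNPs versus 80% over 160), even though DHS carries most of the heritability in aggregate.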
![Page 19: Regulatory variants explain much more heritability … · more heritability than coding variants across 11 common diseases ... •Case/control ascertainment ... PowerPoint Presentation](https://reader031.vdocument.in/reader031/viewer/2022022018/5b8ee2c009d3f2691f8ba82e/html5/thumbnails/19.jpg)
Conclusions
• Partitioning “hidden” heritability informs biology and disease architecture. Implications for GWAS, fine-mapping, risk-prediction.
• Very large enrichment at DHS elements (enhancers, cell-types, etc.) across 11 traits.
• Low upper-bound on non-regulatory intronic/intergenic contribution.
• Non-significant contribution from low-frequency exome-chip variants (in SCZ).
• Larger cohorts will yield trait-specific results.
![Page 20: Regulatory variants explain much more heritability … · more heritability than coding variants across 11 common diseases ... •Case/control ascertainment ... PowerPoint Presentation](https://reader031.vdocument.in/reader031/viewer/2022022018/5b8ee2c009d3f2691f8ba82e/html5/thumbnails/20.jpg)
Acknowledgements
S Hong Lee
Benjamin M Neale
Gosia Trynka
Bjarni Vilhjalmsson
Hilary Finucane
Han Xu
Chongzhi Zang
Stephan Ripke
Eli Stahl
Schizophrenia Working Group of the Psychiatric Genomics Consortium
SWE-SCZ Consortium
Anna K Kahler
Christina M Hultman
Shaun M Purcell
Steven A McCarroll
Mark Daly
Patrick F Sullivan
Naomi R Wray
Soumya Raychaudhuri
Alkes L Price
… and thanks also to: M. Kellis, A. Sarkar, J. Pickrell, XS. Liu, N. Patterson
Gusev et al. 2014 bioRxiv