
Post on 25-Dec-2015


Epigenomic and regulatory genomics of complex human disease

Manolis Kellis

MIT Computer Science & Artificial Intelligence Laboratory

Broad Institute of MIT and Harvard

Family inheritance: recombination breakpoints. Me vs. my brother; my dad; Dad’s mom; Mom’s dad

Human ancestry

Disease risk

Genomics: Regions → mechanisms → drugs

Systems: genes → combinations → pathways

Personal genomics today: 23 and Me

AMD Risk

Genetic variant: CATGACTGCATGCCTG

Molecular phenotypes:
• Epigenetic changes: methylation, DNA accessibility, H3K27ac; enhancer, promoter, insulator
• Gene expression changes, by tissue/cell type: muscle, heart, cortex, lung, blood, skin, nerve

Organismal phenotypes:
• Endophenotypes: lipids, tension, heart rate, metabolism, drug response
• Disease

Environment: feedback from environment / disease state

Regulatory and systems genomics
1. Chromatin states
2. Enhancer linking
3. Causal regulators

Apply to complex disease
1. Interpret GWAS
2. Epigenomics in patients
3. Disease networks

Diverse tissues and cells:
1. Adult tissues and cells (brain, muscle, heart, digestive, skin, adipose, lung, blood…)
2. Fetal tissues (brain, skeletal muscle, heart, digestive, lung, cord blood…)
3. ES cells, iPS, differentiated cells (meso/endo/ectoderm, neural, mesench, trophobl)

Epigenomics Roadmap across 100+ tissues/cell types

Diverse epigenomic assays:
1. Histone modifications
• H3K4me3, H3K4me1
• H3K36me3
• H3K27me3, H3K9me3
• H3K27ac, H3K9ac
2. Open chromatin
• DNase
3. DNA methylation
• WGBS, RRBS, MRE/MeDIP
4. Gene expression
• RNA-seq, Exon Arrays

Art: Rae Senarighi, Richard Sandstrom

Diverse chromatin signatures encode epigenomic state

• 100s of known modifications, many new still emerging
• Systematic mapping using ChIP-, Bisulfite-, DNase-Seq

Promoters: H3K4me3, H3K9ac, DNase

Transcribed: H3K36me3, H3K79me2, H4K20me1

Enhancers: H3K4me1, H3K27ac, DNase

Repressed: H3K9me3, H3K27me3, DNA methylation

H3K4me3, H3K4me1, H3K27ac, H3K36me3, H4K20me1, H3K27me3, H3K9me3, H3K9ac
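To make the idea of a "chromatin signature" concrete, here is a minimal sketch that assigns a region to the element class whose characteristic marks it overlaps best. The mark groupings follow this slide; the scoring rule, the `call_state` name and the "Unassigned" label are illustrative assumptions, not the method used in the talk.

```python
# Characteristic marks per element class, as grouped on the slide.
SIGNATURES = {
    "Promoter": {"H3K4me3", "H3K9ac", "DNase"},
    "Transcribed": {"H3K36me3", "H3K79me2", "H4K20me1"},
    "Enhancer": {"H3K4me1", "H3K27ac", "DNase"},
    "Repressed": {"H3K9me3", "H3K27me3", "DNAmethyl"},
}


def call_state(present_marks):
    """Return the class whose signature has the largest fraction of marks present.

    Hypothetical rule for illustration: ties keep the first class listed,
    and a region with no matching mark is reported as 'Unassigned'.
    """
    present = set(present_marks)
    best_state, best_score = "Unassigned", 0.0
    for state, marks in SIGNATURES.items():
        score = len(present & marks) / len(marks)
        if score > best_score:
            best_state, best_score = state, score
    return best_state


enhancer_like = call_state({"H3K4me1", "H3K27ac"})
promoter_like = call_state({"H3K4me3", "H3K9ac", "DNase"})
silent = call_state(set())
```

A real annotation would learn mark combinations from genome-wide data rather than hard-code them; the lookup above only shows why a combination of marks is more informative than any single mark.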

Deep sampling of 9 reference epigenomes (e.g. IMR90)

Chromatin state + RNA + DNase + 28 histone marks + WGBS + Hi-C

WashU Epigenome Browser, Ting Wang

Chromatin states capture combinations and dynamics

• Single annotation track for each cell type
• Capture combinations of histone marks
• Summarize cell-type activity at a glance
• Study activity pattern across cell types

Correlated activity

Predicted linking
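The "correlated activity" to "predicted linking" step can be sketched as follows: an enhancer is linked to a gene when its activity across cell types tracks that gene's expression. This is a toy version under stated assumptions. The Pearson correlation, the `min_r` threshold and all names and numbers are hypothetical, and no genomic distance constraint is applied.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def link_enhancers(enhancer_activity, gene_expression, min_r=0.8):
    """Return (enhancer, gene, r) for every pair whose correlation is at least min_r."""
    links = []
    for enh, activity in enhancer_activity.items():
        for gene, expression in gene_expression.items():
            r = pearson(activity, expression)
            if r >= min_r:
                links.append((enh, gene, round(r, 3)))
    return links


# Toy data: one value per cell type (five hypothetical cell types).
enhancer_activity = {"enh1": [1, 0, 1, 0, 1], "enh2": [0, 1, 0, 1, 0]}
gene_expression = {"geneA": [5.0, 1.0, 4.0, 0.5, 6.0], "geneB": [1.0, 5.0, 1.0, 4.0, 0.5]}
links = link_enhancers(enhancer_activity, gene_expression)
```

Here `enh1` pairs with `geneA` and `enh2` with `geneB`, because each enhancer is active in exactly the cell types where its partner gene is highly expressed.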

Chromatin state annotations across 127 epigenomes

Reveal epigenomic variability: enh/prom/tx/repr/het

Anshul Kundaje

2.3M enhancer regions, only ~200 activity patterns

Wouter Meuleman

Cluster labels: immune, dev/morph, morph, learning, muscle, heart, smooth muscle, kidney, liver

54000+ measurements (x2 cells, 2x repl)

Kheradpour et al Genome Research 2013

Systematic motif dissection in 2000 enhancers: 5 activators and 2 repressors in 2 cell lines

Example activator: conserved HNF4 motif match

WT expression specific to HepG2

Non-disruptive changes maintain expression

Motif match disruptions reduce expression to background

Random changes depend on effect to motif match
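The distinction between disruptive and non-disruptive changes can be illustrated by scoring a sequence against a position weight matrix before and after a substitution. The sketch below uses a made-up four-position matrix (it is not the HNF4 motif) and hypothetical sequences; only the scoring technique, a log-odds sum against a uniform background, is standard.

```python
from math import log2

# Hypothetical 4-position motif, NOT a real HNF4 matrix:
# probability of each base at each position.
PWM = [
    {"A": 0.85, "C": 0.05, "G": 0.05, "T": 0.05},
    {"A": 0.05, "C": 0.05, "G": 0.85, "T": 0.05},
    {"A": 0.05, "C": 0.05, "G": 0.05, "T": 0.85},
    {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25},  # uninformative position
]
BACKGROUND = 0.25  # uniform base frequency


def motif_score(site):
    """Log-odds score of one site of the same width as the matrix."""
    return sum(log2(column[base] / BACKGROUND) for column, base in zip(PWM, site))


def best_match(sequence):
    """Best log-odds score over all windows of the sequence."""
    width = len(PWM)
    return max(
        motif_score(sequence[i:i + width])
        for i in range(len(sequence) - width + 1)
    )


wild_type = "CCAGTACC"       # contains the strong site AGTA
non_disruptive = "CCAGTCCC"  # change at the uninformative position
disruptive = "CCACTACC"      # G>C at a highly informative position

scores = {name: best_match(seq) for name, seq in [
    ("wild_type", wild_type),
    ("non_disruptive", non_disruptive),
    ("disruptive", disruptive),
]}
```

The non-disruptive variant keeps the wild-type score while the disruptive one drops sharply, which mirrors the pattern the slide reports for expression: the effect of a change depends on what it does to the motif match.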


The challenge of interpreting disease-association studies

• Large associated blocks with many variants: fine-mapping challenge
• No information on cell type/mechanism, most variants non-coding
→ Epigenomic annotations help find relevant cell types / nucleotides

• Disease-associated SNPs enriched for enhancers in relevant cell types
• E.g. lupus SNP in GM enhancer disrupts Ets-1 predicted activator

Revisiting disease-associated variants

Mechanistic predictions for top disease-associated SNPs

Lupus erythematosus in GM lymphoblastoid: Disrupt activator Ets-1 motif → Loss of GM-specific activation → Loss of enhancer function → Loss of HLA-DRB1 expression

Erythrocyte phenotypes in K562 leukemia cells: Creation of repressor Gfi1 motif → Gain K562-specific repression → Loss of enhancer function → Loss of CCDC162 expression

GWAS hits in enhancers of relevant cell types

Immune traits, heart, height, platelets, in relevant tissues

Luke Ward

Rank-based functional testing of weak associations

• Rank all SNPs based on GWAS signal strength
• Functional enrichment for cell types and states

Enrichment peaks at 10,000s of SNPs down the rank list, even after LD pruning!

Abhishek Sarkar
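A rank-based enrichment curve of this kind can be sketched in a few lines. SNPs are sorted by GWAS p-value, and at each rank cutoff k the number of top-k SNPs falling in an annotation (observed) is compared with the number expected from the annotation's overall frequency. Normalizing the difference by the total number of SNPs is an assumption here, as are the function name and the toy data; LD pruning is not included.

```python
def enrichment_curve(pvalues, in_annotation):
    """Cumulative (observed - expected) / total along the GWAS rank list.

    pvalues: dict mapping SNP id -> GWAS p-value.
    in_annotation: set of SNP ids overlapping the annotation (e.g. an enhancer set).
    Returns a list of (k, deviation) for k = 1 .. number of SNPs.
    """
    ranked = sorted(pvalues, key=pvalues.get)  # strongest association first
    total = len(ranked)
    annotated_fraction = sum(snp in in_annotation for snp in ranked) / total
    curve, observed = [], 0
    for k, snp in enumerate(ranked, start=1):
        observed += snp in in_annotation
        expected = k * annotated_fraction
        curve.append((k, (observed - expected) / total))
    return curve


# Toy data: ten hypothetical SNPs, four of them inside the annotation.
pvalues = {
    "snp1": 1e-8, "snp2": 1e-6, "snp3": 1e-4, "snp4": 0.01, "snp5": 0.05,
    "snp6": 0.2, "snp7": 0.4, "snp8": 0.6, "snp9": 0.8, "snp10": 0.9,
}
annotated = {"snp1", "snp2", "snp4", "snp9"}

curve = enrichment_curve(pvalues, annotated)
peak_rank, peak_value = max(curve, key=lambda point: point[1])
```

The rank at which the curve peaks indicates how far down the list the annotation keeps contributing signal; by construction the curve returns to zero at the final rank.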

Weak-effect T1D hits in 1000s T-cell enhancers

• Enhancer enrichment strong for top ~30k SNPs
• Heritability estimates also increase until ~30k SNPs

Figure legend: enhancers, CD4+ T-cells; T-cells; B-cells; other cell types. Per state: (Obs – Exp) / Total, shown for enhancers and promoters.

Abhishek Sarkar

Brain methylation changes in AD patients

• 10,000s of methylation differences in AD vs. control
• Harbor 1000s of genetic variants associated with AD
• Localized in brain-specific enhancers and pathways

T1D/RA-enriched enhancers spread across genome

• High concentration of loci in MHC, high overlap
• Yet: many distinct regions, 1000s of distinct loci

Abhishek Sarkar

Bayesian model for joining weak SNPs in pathways

Inputs:
• GWAS summary statistics (SNP P-values)
• Interaction network
• Physical distances between ncSNPs and TSS

Outputs:
• SNP disease-relevance (yes/no)
• Gene disease-relevance (yes/no)
• Gene target (if any) of each SNP

Legend: disease-relevant gene; gene near relevant SNP; disease-relevant SNP

Gerald Quon
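The slide lists the model's inputs and outputs but not its form. As a deliberately simplified stand-in, not the model presented in the talk, the sketch below shows how a weak SNP p-value can still yield a substantial posterior probability of relevance when prior evidence is favorable, for instance when the SNP lies near a gene whose network neighbors look relevant. The two-group mixture, the `effect_sd` parameter and the prior values are all assumptions made for illustration.

```python
from math import sqrt
from statistics import NormalDist

STANDARD_NORMAL = NormalDist(0.0, 1.0)


def z_from_p(p_value):
    """Absolute z-score for a two-sided p-value in (0, 1]."""
    return -STANDARD_NORMAL.inv_cdf(p_value / 2)


def posterior_relevant(p_value, prior, effect_sd=3.0):
    """Posterior P(relevant | z) under a two-group mixture.

    Null SNPs:     z ~ N(0, 1)
    Relevant SNPs: z ~ N(0, 1 + effect_sd**2)   (hypothetical effect spread)
    """
    z = z_from_p(p_value)
    null_density = STANDARD_NORMAL.pdf(z)
    alt_density = NormalDist(0.0, sqrt(1.0 + effect_sd ** 2)).pdf(z)
    numerator = prior * alt_density
    return numerator / (numerator + (1.0 - prior) * null_density)


# Same weak association, two hypothetical priors:
# a generic SNP vs. one near a gene with network support.
weak_generic = posterior_relevant(1e-3, prior=0.01)
weak_supported = posterior_relevant(1e-3, prior=0.20)
strong_generic = posterior_relevant(1e-8, prior=0.01)
```

The design point is that network and distance information enter through the prior, so SNPs that would never pass a genome-wide threshold on their own can still be prioritized when they cluster around plausible genes.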

Histograms: p(SNP relevant), 0 to 1, vs. # SNPs (p>0), 0 to 1200; p(gene relevant), 0 to 1, vs. # genes, 0 to 15k. Legend: poorly ranked SNP nearby; highly ranked SNP nearby.

Example 1: MAZ predicted role in T1D

• MAZ no direct assoc, but clusters w/ many T1D hits
• MAZ indeed known regulator of insulin expression

Gerald Quon

Example 2: SP3 predicted role in MS

• SP3 no direct assoc, but clusters w/ many MS hits
• SP3 is indeed down-regulated in MS patients

Histograms: p(SNP relevant), 0 to 1, vs. # SNPs (p>0), 0 to 300; p(gene relevant), 0 to 1, vs. # genes, 0 to 8k. Legend: poorly ranked SNP nearby; highly ranked SNP nearby.

Gerald Quon

# non-genetic hits missing heritability

Gerald Quon

• Missing heritability partly due to weak variants
• Regulators lacking association harbor rare variants

E.g. coronary artery disease: GATA6 (congenital heart disease), HNF1A (cardiovascular), PPARG (lipid metabolism, partial lipodystrophy)

Validate weak variant targets in model organisms

• Use CRISPR/Cas to edit nucleotides, knock down target genes
• Alzheimer: differential activity in mouse neurodegeneration
• Cardiac: repolarization interval in zebrafish heart

Andreas Pfenning, Xinchen Wang


Integrative analysis of 100+ epigenomes

1. Reference epigenomes → chromatin states, linking
– Annotate dynamic regulatory elements in multiple cell types
– Activity-based linking of regulators → enhancers → targets

2. Interpreting disease-associated sequence variants
– Mechanistic predictions for individual top-scoring SNPs
– Functional roles of 1000s of disease-associated SNPs

3. Disease networks: links SNPs → genes → phenotypes
– Module-based linking of enhancers to their target genes
– Bayesian model for evaluating disease genes and SNPs

4. Genetic / epigenomic variation in health and disease
– Genetic variation → Brain methylation → Alzheimer’s disease
– Global repression of distal enhancers. NRSF, ELK1, CTCF

MIT Computational Biology Group

Wouter Meuleman, Jason Ernst, Luke Ward, Soheil Feizi, Gerald Quon, Daniel Marbach, Bob Altshuler, Anshul Kundaje, Matt Eaton, Abhishek Sarkar, Pouya Kheradpour, Mariana Mendoza, Jessica Wu, Manasi Vartak, David Hendrix, Mukul Bansal, Matt Rasmussen, Stefan Washietl, Andreas Pfenning, Hayden Metsky, Luis Barrera, Manolis Kellis

Roadmap Epigenomics Integrative Analysis Team

Lisa Chadwick, Ting Wang, John Stam, Bing Ren, Martin Hirst, Joe Costello, Brad Bernstein, Aleks Milosavljevic

Anshul Kundaje, Wouter Meuleman, Jason Ernst, Misha Bilenky, Jianrong Wang, Angela Yen, Luke Ward, Abhishek Sarkar, Gerald Quon, Pouya Kheradpour, Alireza Heravi-Moussavi

Cristian Coarfa, Alan Harris, Michael Ziller, Matthew Schultz, Matt Eaton, Andreas Pfenning, Xinchen Wang, Paz Polak, Rosa Karlic, Viren Amin, Yi-Chieh Wu, Richard S Sandstrom, Zhizhuo Zhang, GiNell Elliott, Rebecca Lowdon