
  • Integrating multiple lines of evidence to identify candidate AD drivers

    Ben Logsdon, Sage Bionetworks

  • AMP-AD

    ➢ 5-year U01 grant with joint funding from NIA and the FNIH

        a. 6 academic centers (Broad-Rush, UFL-Mayo-ISB, Emory, Mt. Sinai, Duke, Harvard)

        b. 4 industry partners (Lilly, Biogen Idec, AbbVie, GSK)

    ➢ Goal: identify new targets for intervention in Alzheimer’s disease

        a. Each academic center must provide a list of targets for further preclinical validation at the end of the grant

    ➢ All data released on an aggressive quarterly timeline

        a. RNAseq, exome sequencing, microRNA, DNA methylation, histone (H3K9) acetylation, genotype, clinical, etc.

    ➢ All data deposited into the AMP-AD Knowledge Portal

        a. https://www.synapse.org/project/AMP_AD_Knowledge_Portal

  • ROS/MAP

    ➢ Combination of two longitudinal studies (ROS and MAP) to study Alzheimer’s disease in normal aging populations [1, 2]

    ➢ RNA sequencing data available on post-mortem brains from 592 patients (202 patients with clinically diagnosed AD)

    ➢ After processing and filtering (Broad pipeline), FPKM gene expression levels are available for 22,894 genes (using Ensembl gene models)

    1. Bennett et al. (2012) (http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3409291/)

    2. Bennett et al. (2012) (http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3439198/)

  • Correlation between covariates

    1. RIN and batch show some correlation

    2. Cognition scores and RIN are highly correlated

    3. Cognition scores are also concordant with age at death and age at visit

    [Figure: pairwise correlation matrix of covariates. Axis labels: PMI, BATCH, RIN, cogdx, spanish, race, edu, sex, Age Dx, Apoe, Age visit, Age death]
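The covariate comparison on this slide can be illustrated with a pairwise Pearson correlation table. This is only a sketch: the covariate names and values below are made-up toy numbers, not ROS/MAP data, and the measure actually used for the slide (especially for categorical covariates such as batch or sex) is not stated in the transcript.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def covariate_correlations(covariates):
    """Return {(name_a, name_b): r} for every unordered pair of covariates."""
    return {
        (a, b): pearson(covariates[a], covariates[b])
        for a, b in combinations(covariates, 2)
    }


# Toy values only (hypothetical, NOT taken from the study).
toy = {
    "RIN": [6.1, 7.4, 8.0, 5.2, 7.9, 6.6],
    "PMI": [9.0, 5.5, 4.0, 11.0, 6.0, 7.5],
    "age_death": [88, 91, 79, 94, 85, 90],
}

corr = covariate_correlations(toy)
# Print pairs from strongest to weakest absolute correlation.
for pair, r in sorted(corr.items(), key=lambda kv: -abs(kv[1])):
    print(pair, round(r, 2))
```

Pearson correlation only suits numeric covariates; categorical ones would need a different association measure.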

  • Correlation between mRNA expression and covariates

    Based on 22,894 genes and 592 samples

    [Figure panels: Batch effects, CogDx, RIN, Age, PMI]

  • Generate networks with multiple methods

    1st Generation Network Inference Methods and 2nd Generation Network Inference Methods

    [Figure: methods shown with their footnote numbers: SPARROW (1), Genie3 (2), Tigress (3), ARACNe (4), Ridge (5), LASSO (6), WGCNA (7)]

  • Metanetwork Approach

    Ben Logsdon, Sage

    Thanneer Perumal, Sage

    Lara Mangravite, Sage

  • Generate Consensus Network and Modules

    [Figure labels: Network, Edge Rank, Consensus network, Modules]

    14 distinct penalized regression methods to infer undirected graphs

    Modules identified in network
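The slide names an edge-rank consensus but the transcript does not say how ranks are combined. One simple possibility, shown here purely as an assumption, is to rank edges within each method by absolute score, average the ranks across methods, and keep the best-ranked edges. The method names, gene labels and scores are hypothetical.

```python
def rank_edges(scores):
    """Map each edge to its rank within one method (1 = strongest |score|)."""
    ordered = sorted(scores, key=lambda edge: -abs(scores[edge]))
    return {edge: i + 1 for i, edge in enumerate(ordered)}


def rank_consensus(method_scores, top_k):
    """Average per-method edge ranks and return the top_k consensus edges.

    method_scores: {method_name: {(gene_a, gene_b): score}}
    An edge missing from a method gets the worst possible rank for that
    method (an assumption made for this sketch).
    """
    edges = sorted(set().union(*method_scores.values()))
    worst = len(edges)
    per_method = [rank_edges(scores) for scores in method_scores.values()]
    mean_rank = {
        edge: sum(ranks.get(edge, worst) for ranks in per_method) / len(per_method)
        for edge in edges
    }
    return sorted(edges, key=lambda edge: mean_rank[edge])[:top_k]


# Toy input: three methods scoring edges among three genes (hypothetical).
method_scores = {
    "lasso": {("A", "B"): 0.9, ("A", "C"): 0.1, ("B", "C"): 0.5},
    "ridge": {("A", "B"): 0.7, ("A", "C"): 0.2, ("B", "C"): 0.6},
    "aracne": {("A", "B"): 0.4, ("B", "C"): 0.8},
}

consensus = rank_consensus(method_scores, top_k=2)
print(consensus)
```

Averaging ranks rather than raw scores avoids having to put the different methods' edge weights on a common scale.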

  • Network Inference Pipeline

    Starcluster + MPI Integration

  • Modularising Networks

  • Network Modules: Algorithm

    Hierarchical Agglomeration*

        a. Implementation: fast greedy algorithm of igraph

        b. Complexity: O(m d log n) ~ O(n log² n)

        c. Modularity (Q):

            a. Interpretation: how different our modules are from those of a random network

            b. Objective: maximise Q by merging communities (the leaf nodes in a dendrogram of communities)

    * Finding community structure in very large networks (http://arxiv.org/pdf/cond-mat/0408187v2.pdf)
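The modularity Q named on this slide is not written out in the transcript. As a reference point, the definition from the cited paper (Clauset, Newman and Moore) is reproduced below. The symbols follow that paper rather than the slides: A is the adjacency matrix, k_v the degree of vertex v, m the number of edges, and c_v the community of vertex v.

```latex
Q = \frac{1}{2m} \sum_{v,w} \left[ A_{vw} - \frac{k_v k_w}{2m} \right] \delta(c_v, c_w)
  = \sum_i \left( e_{ii} - a_i^2 \right),
\qquad
e_{ij} = \frac{1}{2m} \sum_{v,w} A_{vw}\, \delta(c_v, i)\, \delta(c_w, j),
\qquad
a_i = \frac{1}{2m} \sum_{v} k_v\, \delta(c_v, i)

% Change in Q when communities i and j are merged (the quantity the greedy step maximises):
\Delta Q_{ij} = e_{ij} + e_{ji} - 2 a_i a_j = 2 \left( e_{ij} - a_i a_j \right)
```

A value of Q near 0 means the within-community edge density is what a random graph with the same degrees would give; larger values indicate stronger community structure.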

  • Module Evaluation in AD Rank Consensus (ROSMAP)

    Network Properties

        # Genes: 22899

        # Edges: 11117

    Module Properties

        # of Modules (size > 20): 21

        # of genes in the above 36 modules: 4804

    Module Name    Size    Cell Markers**    Odds Ratio
    red            870     Endothelial       23
    purple         348     Microglia         65
    blue           447     MyelinOligos      74
    brown          776     Astrocyte         23
    black          259     Neuron            8
    greenyellow    156     Neuron            11
    tan            111     OPC               19

    ** Zhang, Y. et al. (2014). J Neurosci 34, 11929-11947, doi:10.1523/JNEUROSCI.1860-14.2014

  • Rank Consensus AD Network (ROSMAP)

    [Figure: network with annotated modules: Endothelial; Microglia; Astrocyte; Myelin Oligodendrocytes; Ribosome; mRNA splicing; Response to Dicycloverine (muscarinic receptor inhibitor); Neuron (three modules); OPC]

  • Module Evaluation in NCI Rank Consensus (ROSMAP)

    Network Properties

        # Genes: 22899

        # Edges: 11117

    Module Properties

        # of Modules (size > 20): 21

        # of genes in the above 36 modules: 4804

    Module Name    Size    Cell Markers**    Odds Ratio
    brown          312     Endothelial       32
    pink           675     Microglia         30
    yellow         432     MyelinOligos      97
    turquoise      803     Astrocyte         24
    magenta        210     Neuron            7
    greenyellow    189     Neuron            13
    blue           106     OPC               17

    ** Zhang, Y. et al. (2014). J Neurosci 34, 11929-11947, doi:10.1523/JNEUROSCI.1860-14.2014

  • Rank Consensus NCI Network (ROSMAP)

    [Figure: network with annotated modules: Endothelial; Microglia; Astrocyte; Myelin Oligodendrocyte; Neuron (two modules); Prefrontal Cortex; Chromatin/Alzheimer’s Disease Signatures; Ion Channel; Ribosome; OPC]

  • Cogdx Networks

    [Figure panels: NCI, MCI, AD]

    https://www.synapse.org/#!Synapse:syn5553756

  • BRAAK Networks

    [Figure panels: BRAAK12, BRAAK34, BRAAK56]

    https://www.synapse.org/#!Synapse:syn5553756

  • Overlap

    Odds ratios (from Fisher’s exact test) comparing edge overlaps between ROSMAP networks. All are significant at FDR < 1e-16.
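To make the edge-overlap comparison concrete, here is a small sketch of the 2x2 table, odds ratio and one-sided Fisher's exact p-value for two networks on the same gene set. The choice of universe (all unordered gene pairs) and the one-sided alternative are assumptions, since the transcript does not state them, and the toy networks are hypothetical.

```python
from math import comb


def edge_overlap_table(edges_a, edges_b, n_genes):
    """2x2 counts over all unordered gene pairs: (both, only A, only B, neither)."""
    total_pairs = n_genes * (n_genes - 1) // 2
    both = len(edges_a & edges_b)
    only_a = len(edges_a - edges_b)
    only_b = len(edges_b - edges_a)
    neither = total_pairs - both - only_a - only_b
    return both, only_a, only_b, neither


def odds_ratio(both, only_a, only_b, neither):
    """Sample odds ratio (both * neither) / (only_a * only_b)."""
    if only_a * only_b == 0:
        return float("inf")
    return (both * neither) / (only_a * only_b)


def fisher_greater(both, only_a, only_b, neither):
    """One-sided p-value: P(overlap >= observed) under the hypergeometric null."""
    n_a = both + only_a            # edges in network A
    n_b = both + only_b            # edges in network B
    total = both + only_a + only_b + neither
    tail = sum(
        comb(n_a, k) * comb(total - n_a, n_b - k)
        for k in range(both, min(n_a, n_b) + 1)
    )
    return tail / comb(total, n_b)


# Toy networks on 6 genes; edges stored as sorted tuples (hypothetical data).
net_a = {(1, 2), (1, 3), (2, 3), (4, 5)}
net_b = {(1, 2), (1, 3), (2, 3), (5, 6)}

table = edge_overlap_table(net_a, net_b, n_genes=6)
print("table:", table)
print("odds ratio:", odds_ratio(*table))
print("p-value:", fisher_greater(*table))
```

The exact binomial sums are fine for toy sizes. For networks over roughly 22,000 genes a library routine such as `scipy.stats.fisher_exact` would be the practical choice, followed by a multiple-testing correction to obtain FDR values.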

  • NCI Microglia Module

  • Enrichment analysis of microglia module

    Gene Set Name Odds Ratio FDR

    TF-LOF_Expression_from_GEO

    foxa2_20483781_p15_lung_lof_mouse_gpl1261_gse19204_up 9.22 4.07E-34

    myc_20940306_e13dot5_erythroblast_purified_from_liver_gof_mouse_gpl6885_gse18558_up 5.55 1.50E-22

    glis2_17618285_kidney_lof_mouse_gpl2897_gds2817_up 4.31 6.10E-16

    irf8_00000000_splenic_cd11bplusgrdash1_hdash2b_gen_background_lof_mouse_gpl6887_gse39228_down 3.69 2.99E-12

    gfi1b_22201127_amulv_gof_mouse_gpl6246_gds4302_up 3.46 1.12E-10

    Chip Experimental Analysis

    IRF8-21731497-J774-MOUSE 5.27 6.10E-08

    RUNX1-20887958-HPC-7-MOUSE 3.31 2.77E-11

    NR1H3-23393188-ATHEROSCLEROTIC-FOAM-HUMAN 3.31 6.90E-08

  • Mouse Microglial Overlaps (2 Months)

  • Mouse Microglial Overlaps (4 Months)

  • Mouse Microglial Overlaps (6 Months)

  • Mouse Microglial Overlaps (8 Months)

  • Expression Drivers (e-drivers)

    [Figure: schematic of patient-by-gene expression matrices. Labels: Driver Event; Driver Mutation; Active e-driver; E-driver; Pathway genes under selection; Patients; Residual expression]

    Genes with expression under similar selection across patients 1-5

    Correlation among expression levels: matrix with 1 on the diagonal and off-diagonal entries of .6 and .3

    Conditioning on e-driver

    Partial correlations among expression levels: matrix with 1 on the diagonal and off-diagonal entries of -.4 and 0

    * Logsdon et al., Sparse expression bases in cancer reveal tumor drivers, Nucleic Acids Research (2015): gku1290
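The idea of conditioning on an e-driver can be illustrated with a first-order partial correlation: two genes that both track a driver look correlated, and the correlation shrinks once the driver is accounted for. This is a toy simulation under assumed parameters (500 simulated patients, noise scale 0.5); it is not meant to reproduce the numbers in the slide's matrices.

```python
import random
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def partial_corr(x, y, z):
    """Correlation between x and y after conditioning on a single variable z."""
    rxy, rxz, ryz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (rxy - rxz * ryz) / sqrt((1 - rxz ** 2) * (1 - ryz ** 2))


# Simulated data (hypothetical): two genes driven by one shared e-driver.
rng = random.Random(0)
n_patients = 500
driver = [rng.gauss(0, 1) for _ in range(n_patients)]
gene_a = [d + 0.5 * rng.gauss(0, 1) for d in driver]
gene_b = [d + 0.5 * rng.gauss(0, 1) for d in driver]

marginal = pearson(gene_a, gene_b)
conditional = partial_corr(gene_a, gene_b, driver)
print(f"correlation: {marginal:.2f}")
print(f"partial correlation given driver: {conditional:.2f}")
```

In this simulation the two genes share nothing except the driver, so the partial correlation is close to zero while the marginal correlation is large.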

  • [Figure: schematic of patient-by-gene expression matrices. Labels: Patients; Driver Events (Unobserved); Candidate E-drivers; Synaptic Pruning; Protein Misfolding]

  • [Figure: schematic of expression matrices. Labels: AD pathway genes G1, G2, ..., Gm; Patients; Candidate E-drivers; sparse basis parameters β1, β2, β3, β4, ..., βp]

    G1: ~3000 possible e-drivers

    Learned G1 sparse basis: β1, β2, β3 = 1/3; β4, ..., βp = 0

    [Plot: selection frequency in sparse basis for apoptosis inhibition genes (across G1, ..., Gm), on a 0 to 1 axis, against sparse basis parameters ordered by how often they are selected, i.e. βj ≠ 0]
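The selection-frequency ordering described on this slide can be sketched as a simple count: for each candidate e-driver, the fraction of pathway genes whose learned sparse basis gives it a nonzero coefficient. The sparse fits themselves are taken as given here, and all gene names, candidate names and coefficients are hypothetical; only the G1 row mirrors the slide's example (three coefficients of 1/3, the rest 0).

```python
def selection_frequency(bases, tol=0.0):
    """Rank candidates by how often their coefficient is nonzero.

    bases: {pathway_gene: {candidate: beta}}
    Returns [(candidate, frequency)] sorted by decreasing frequency,
    with ties broken by candidate name.
    """
    counts = {}
    for coefs in bases.values():
        for candidate, beta in coefs.items():
            counts.setdefault(candidate, 0)
            if abs(beta) > tol:
                counts[candidate] += 1
    n_genes = len(bases)
    freqs = [(cand, k / n_genes) for cand, k in counts.items()]
    return sorted(freqs, key=lambda item: (-item[1], item[0]))


# Toy sparse bases for three pathway genes over four candidates (hypothetical).
bases = {
    "G1": {"b1": 1 / 3, "b2": 1 / 3, "b3": 1 / 3, "b4": 0.0},
    "G2": {"b1": 0.5, "b2": 0.0, "b3": 0.5, "b4": 0.0},
    "G3": {"b1": 1.0, "b2": 0.0, "b3": 0.0, "b4": 0.0},
}

ranked = selection_frequency(bases)
for candidate, freq in ranked:
    print(candidate, round(freq, 2))
```

Candidates at the top of this ordering are the ones selected for most pathway genes, which is the sense in which the slide orders the sparse basis parameters.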

  • Identifying candidate e-drivers

    • Learn a graph structure with SPARROW*

    • Use the hub genes as top candidate e-drivers

    • Rank selected genes based on e-driver score

    [Figure: AMP-AD expression data matrix, genes by ~700 patients]

    * Logsdon et al., Sparse expression bases in cancer reveal tumor drivers, Nucleic Acids Research (2015): gku1290
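The hub-selection step above can be sketched once a graph is available. The transcript does not define the e-driver score, so node degree is used here as a stand-in (an assumption), and the edge list is a hypothetical toy graph rather than SPARROW output.

```python
from collections import Counter


def rank_hubs(edges, top_n):
    """Return the top_n (gene, degree) pairs of an undirected edge list.

    Degree is used as a simple hub score; ties are broken by gene name.
    """
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return sorted(degree.items(), key=lambda kv: (-kv[1], kv[0]))[:top_n]


# Toy undirected graph (hypothetical gene labels).
edges = [
    ("g1", "g2"),
    ("g1", "g3"),
    ("g1", "g4"),
    ("g2", "g3"),
    ("g5", "g1"),
]

hubs = rank_hubs(edges, top_n=2)
print(hubs)
```

Any other per-gene score could replace degree in the sort key without changing the rest of the sketch.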

  • Top Hubs in NCI Network