functional networks inference from rule-based …ico2s.org/data/papers/lazzarini2016.pdffunctional...
TRANSCRIPT
Lazzarini et al.
METHODOLOGY
Functional networks inference from rule-basedmachine learning modelsNicola Lazzarini1, Pawe l Widera1, Stuart Williamson2, Rakesh Heer3, Natalio Krasnogor1 and JaumeBacardit1*
Abstract
Background: Functional networks play an important role in the analysis of biological processes and systems.The inference of these networks from high-throughput (-omics) data is an area of intense research. So far, thesimilarity-based inference paradigm (e.g. gene co-expression) has been the most popular approach. It assumesa functional relationship between genes which are expressed at similar levels across different samples. Analternative to this paradigm is the inference of relationships from the structure of machine learning models.These models are able to capture complex relationships between variables, that often aredifferent/complementary to the similarity-based methods.
Results: We propose a protocol to infer functional networks from machine learning models, called FuNeL. Itassumes, that genes used together within a rule-based machine learning model to classify the samples, mightalso be functionally related at a biological level. The protocol is first tested on synthetic datasets and thenevaluated on a test suite of 8 real-world datasets related to human cancer. The networks inferred from thereal-world data are compared against gene co-expression networks of equal size, generated with 3 differentmethods. The comparison is performed from two different points of view. We analyse the enriched biologicalterms in the set of network nodes and the relationships between known disease-associated genes in a context ofthe network topology. The comparison confirms both the biological relevance and the complementary characterof the knowledge captured by the FuNeL networks in relation to similarity-based methods, and demonstratesits potential to identify known disease associations as core elements of the network. Finally, using a prostatecancer dataset as a case study, we confirm that the biological knowledge captured by our method is relevant tothe disease and consistent with the specialised literature and with an independent dataset not used in theinference process.
Availability: The implementation of our network inference protocol is available at:http://ico2s.org/software/funel.html
Keywords: machine learning; biological knowledge extraction; network inference; functional networks
Background
The inference of biological networks is a highly relevantand challenging task in systems biology and integra-tive bioinformatics. Biological networks are graphs in
*Correspondence: [email protected]
1 Interdisciplinary Computing and Complex BioSystems (ICOS) research
group, School of Computing Science, Newcastle University, Newcastle upon
Tyne, UK
Full list of author information is available at the end of the article
which nodes represent genes or proteins, and a con-nection between them indicates some kind of biologi-cal relationship, e.g. regulatory or functional. The net-work inference is, in an essence, an attempt to reverseengineer the biological relationships from the high-throughput biological data [1].
Most biological network inference methods focus onthe definition of gene regulatory networks, in whichedges represent direct regulatory interactions between
Lazzarini et al. Page 2 of 16
NETWORK INFERENCE OUTPUTINPUT
X
Y
SamplesX Y
similarity (X,Y) > thresholdX
Y
X Y
X
Y
Classification rules-omics data
X Y Pheno.
Phenotype information
Cancer Normal
XY
YX
Machine learning model Knowledge extraction
Edge inferenceProfile similarity
If X > 0.23 and Y > 0.55: Cancer
Fig 1 Two approaches to functional network inference: one based on the expression profile similarity and the other based on theextraction of knowledge from machine learning models. The similarity-based methods construct a new network edge X ↔ Y , whenthe similarity between the expressions of genes X and Y across the samples is above a threshold. Methods based on machinelearning, first build a predictive model, in this example a rule-based model, using the samples phenotype information (class labels)and then construct a network edge X ↔ Y , when genes X and Y are used together within that model to classify the samples. Asthese two approaches lead to different functional networks, it is possible that they capture complementary knowledge.
genes [2–4]. Far less effort has been put into the de-sign of methods to build functional networks in whicha connection indicates a functional relationship, e.g.membership in the same pathway or protein complex.One of the typical uses of these networks is the iden-tification of functional modules (subset of genes withmultiple internal connections and a few connectionswith genes outside the module that describe, explainor predict a biological process or phenotype.).
One of the earliest (but still widely used) approach toinfer functional networks is the ”guilt-by-association”principle [5]. That is, if two genes show similar expres-sion profiles, it is assumed they are also functionallyrelated (via a direct or indirect interaction). Initially,this paradigm was applied to infer networks from tran-scriptomics data, and this is why in most of the liter-ature it is known as the co-expression network infer-ence principle. Nevertheless, it is abstract enough tobe applied to all kinds of biological data. It has beendemonstrated that co-expression networks are able toeffectively identify pathways and candidate biomark-ers [6] or reveal gene modules representing a biologicalprocess perturbed in a disease [7], just to name a fewexamples, and the similarity-based approach remainsthe dominant method of functional network inferencetoday, with many recent examples: [8–12].
A different approach that is recently gaining popu-larity, is the use of machine learning techniques toinfer biological networks. Due to the wide range of
knowledge representations used within machine learn-ing methods (e.g. classification rules, decision trees, ar-tificial neural networks, SVM kernels, etc.), they candiscover more complex and diverse relationships, andovercome the limitations of the similarity-based meth-ods. This is possible since within machine learningmodels the attributes are associated not because theyare similar (e.g. have similar expression profiles), butbecause together they detect strong patterns. In addi-tion, if learning is supervised, it can take advantage ofthe additional phenotype information (class labels ofthe samples, e.g. case and control) available with thedata. Therefore, by mining the complex machine learn-ing models, it should be possible to uncover new anddifferent (biological) knowledge, that is likely to escapethe traditional approaches. Figure 1 illustrates thesedifferences between the two approaches (similarity-based methods vs. knowledge extraction from the ma-chine learning models).
Alternative strategies exist to infer networks usingmachine learning. One approach is to train machinelearning models that directly predict network edges[13], but this process requires an experimentally ver-ified ”ground truth” of known interactions and suit-able controls. A different approach, which is the focusof this work, is to generate machine learning modelsfrom the biological data and then mine the structureof the models to infer networks. Several types of ma-chine learning have been successfully applied to this
Lazzarini et al. Page 3 of 16
task: unsupervised learning in the form of associationrules [14], supervised learning using regression (modeltrees [15]) or classification (random forest [16]).
The specific focus of this paper is the network infer-ence from rule-based machine learning models. Suchmodels have been successfully applied before to ex-tract knowledge from genetic data [17] and identifydisease risk factors in a bladder cancer study [18]. Themethods presented in these works share some pipelinecomponents with our current work, such as the per-mutation test and a 2-phase learning strategy. In ourprevious works we applied rule-based machine learningto transcriptomics [19, 20], proteomics [21], lipidomics[22] and protein structure data [23]. We formulateda paradigm called co-prediction (in opposition to theclassic co-expression) in which the prediction rules ofa classification algorithm, in our case BioHEL [24], areused to identify relationships between genes.
Co-prediction is based on the assumption that at-tributes (e.g. genes) within the same classificationrules, due to their co-operation in predicting the sam-ple class, have an increased likelihood of being func-tionally related to the biological process in question(Figure 2). Differently than co-expression, the co-prediction approach exploits the phenotype informa-tion of the data (class labels) to detect functional re-lations.
Fig 2 Co-prediction paradigm. Association between the genesis inferred from their co-occurrence in classification rules.
However, from a methodological perspective, manyquestions remained unanswered. Can the co-predictionapproach identify known genetic relationships? Howcan we quantify the biological significance of the co-prediction networks? What is the impact of data pre-processing on the generated networks? Is this method-ology able to capture knowledge that escapes other
methods? Are the discovered functional relationshipsmeaningful in the human disease context?
To address these questions, we propose in this arti-cle a new network inference protocol, called FuNeL(Functional Network Learning). FuNeL substantiallyextends our previous work [19] by incorporating: (1)statistical filtering of inferred functional relationshipsvia permutation tests, (2) a multi-stage network gener-ation to maximise the knowledge extraction, and (3) aconfigurable feature selection stage to control the sizeof the generated networks.
We first tested FuNeL’s ability to correctly iden-tify functional relationships using a set of syntheticdatasets. Then, we evaluated FuNeL on 8 real-worldtranscriptomics datasets related to different types ofcancer. For each dataset we tested 4 different configu-rations of the protocol and compared the inferred net-works to co-expression networks of equivalent size. Inorder to have an extensive evaluation of our approach,we employed 3 different methods to generate co-expression networks. We systematically looked at thedifferences between co-prediction and co-expressionnetworks from two points of view: (1) the enriched bi-ological terms and (2) the relationships between thegenes known to be associated with a particular type ofcancer. Finally, we used a prostate cancer dataset asa case study and performed a more detailed biologicalanalysis of the enriched terms and the disease relatedgenes. We looked at the largest hubs and the mostcentral nodes in the prostate cancer co-prediction net-works and studied their involvement in the disease. Wefound literature support for the association betweenthese topologically important genes and prostate can-cer, and we further confirmed it with an independenttranscriptomics dataset (not used as a source in the in-ference process). Overall, we found that the FuNeL in-ferred networks: (1) capture relevant biological knowl-edge that is complementary to the knowledge capturedby different co-expression networks, and (2) more ad-equately represent the relationships between genes as-sociated with the disease targeted by each dataset.
Materials and Methods
In this section we describe the proposed network in-ference protocol, the datasets from which we inferredthe networks and the experimental design we used toevaluate it.
The functional network inference protocol
The stages of the co-prediction inference protocol areillustrated in Figure 3. Two of these stages are optional
Lazzarini et al. Page 4 of 16
(1 and 4), they lead to a total of 4 different protocolconfigurations. If the first optional stage (feature se-lection) is performed, the original dataset is reducedto the most relevant attributes. In the second stage arule-based machine learning is used to infer a network.This network is statistically refined in Stage 3, in whicha permutation test is used to filter out non-significantnodes. The final stage, in which the network generationis repeated for the second time, is again optional. Acomplete time complexity analysis of the FuNeL pro-tocol is available in Section 2 of the SupplementaryMaterial.
Fig 3 Stages of the functional network inference protocol.
Feature selection (stage 1) When datasets contain alarge number of attributes, some might be irrelevantto the prediction target and discarding them helps theclassification algorithm to focus its learning effort onthe attributes that matters. Therefore, the feature se-lection is the first stage of the inference process. Topick the relevant attributes we used the support vec-tor machine recursive feature elimination (SVM-RFE)[25]. We opted for the SVM algorithm with a linear
kernel as our preliminary studies suggested that it caneliminate as much as 90% of the original dataset at-tributes, without losing much of the classification ac-curacy (see Section 1 in Supplementary Material).
Rule-based network inference (stage 2) To infer therule-based classification models we used BioHEL [24].It generates sets of classification rules using a geneticalgorithm and is able to work with large datasets. Dueto the stochastic nature of BioHEL’s learning process,each of its runs generates a different rule set. We lever-age this fact by creating a large number of alterna-tive hypotheses of functional relationships via multi-ple runs of the algorithm. For each dataset we runBioHEL 10 000 times and infer the network from theconsensus of all the generated rule sets. To do that, weuse all the pairs of attributes that appear together inthe same classification rule as the network edges (co-prediction paradigm). Then, we score each networknode (attribute) by counting how many times it hasbeen used in the rules (node score).
Permutation test (stage 3) Given a list of edges(attribute-attribute associations) extracted from therule sets, we try to filter out the non-significant nodes.To determine the node significance, we follow a statis-tical analysis procedure based on a permutation test,similar to the one described in [17]. We generate 100permutated datasets by randomly shuffling the classlabels. Next, we infer the co-prediction networks (asin Stage 2) from these permutated datasets. Then,for each node, we calculate a distribution of scoresacross the 100 networks generated from the permu-tated datasets. Using a one-tailed permutation test, weassign to each node a p-value, to estimate how likelyit is to draw its score from the calculated distribution.With this process we make sure that the nodes withhigh scores are really tied to the classes present in thedata, and that the network truly represents functionalrelationships. To decide if a node is statistically signif-icant we use a typical α = 0.05 threshold.
After preliminary experiments we realised, that usingsignificant nodes alone leads to small and dense net-works. To counter that, we relaxed the node pruning toalso keep all direct neighbours of the significant nodes.
Network construction (stage 4) There are two waysto interpret the result of the statistical test (option 2in Figure 3). The first approach is to use the significantnodes as a filter for the inferred relationships (edges)and remove all the edges between two non-significantnodes. The second approach is to use the permuta-tion test as a further feature selection and build a new
Lazzarini et al. Page 5 of 16
rule-based machine learning model using only the sig-nificant nodes. This second run of the learning algo-rithm is then focused only on the statistically impor-tant genes and creates the final network.
Protocol configurations As a result of two indepen-dent optional stages in the FuNeL protocol, there are4 different configurations that it can run with (see Ta-ble 1). We decided to test them all and infer four net-works from each dataset, one per configuration.
Table 1 Protocol configurations used in the experiments.
Config. Description
C1 reduced dataset + 1 stage of network generationC2 original dataset + 1 stage of network generationC3 reduced dataset + 2 stages of network generationC4 original dataset + 2 stages of network generation
Datasets
Synthetic datasets
To verify if FuNeL is able to correctly identify func-tional relationships we tested it on a set of syntheticdatasets. Although there are several generators thatmodel expression data with genetic relationships, suchas GNW used in several DREAM challenges [26], theygenerate unlabeled samples (without phenotype infor-mation, e.g. case vs. control) and the class labels arenecessary to perform the supervised learning at thecore of FuNeL.
For that reason, we decided to use GAMETES instancegenerator [27], designed to create genetic datasets withmulti-locus disease associations, where no fewer thann loci can predict a phenotype (disease status). GA-METES generates genotype data (rather than gene ex-pression data) based on models with specific geneticconstraints, e.g. different heritabilities or frequenciesof the SNPs.
To generate the synthetic datasets, we used a set of2-locus configurations similar to what was employedin a recent work of Li et al. [28] to evaluate permutedrandom forest networks of gene interactions. Specifi-cally, the genetic models varied in terms of heritability(0.001-0.4) and number of attributes (5-25), with fixedallele frequency of 0.2 and 2000 samples per dataset.For each configuration, we selected from 100 000 ran-dom models, two models with extreme value of the easeof detection metric (EDM) (the least and the most dif-ficult). Finally, for each selected model we generated50 datasets, obtaining 4000 datasets in total.
Real-world datasets
We used 8 publicly available human cancer microar-ray datasets (see Table 2). These datasets representa broad range of characteristics in terms of biologi-cal information (different types of cancers), number ofsamples (patients) and number of attributes (genes).For each dataset the attributes were defined by theprobes used in the microarray experiment. Generally,a gene can be represented by more than one probeand extra post-processing step is needed to merge theinformation and generate networks where nodes trulyrepresent genes. We used MADGene [29] to map theAffymetrix probe IDs into HUGO gene IDs, then forall probes mapped to the same gene, we merged theprobes and their connections. If a probe was unmappedit was removed from the network.
Table 2 Description of the source datasets used to infernetworks.
Name Attributes Samples Class labels
Dlbcl [30] 2647 77 Dlbcl; Follicular lymphomaCNS [31] 7129 60 Survivor; FailuresLeukemia [32] 7129 72 AML; ALLLung-Michigan [33] 7129 96 Tumor; NormalLung-Harvard [34] 12534 181 Mesothelioma; ADCAProstate [35] 12600 102 Tumor; NormalAML [36] 12625 54 Remission; RelapseColon-Breast [37] 22283 52 Colon cancer; Breast cancer
While in this instance we focused on transcriptomicsdatasets only, the FuNeL protocol is general and canbe applied to other types of biological data too (pro-teomics, lipidomics, etc.).
Co-expression networks
In this paper we are comparing our FuNeL networksagainst co-expression networks. The co-expressionparadigm identifies similarity of gene expression pat-tern under different experimental conditions. Co-expression edges are an abstraction of functional rela-tionships between genes and do not represent physicalbinding as in protein interaction or gene regulatorynetworks. Two genes are considered to be function-ally related (co-expressed), if their transcript levelsare similar across a set of samples.
In here we employed three well known methods to inferco-expression networks, each one uses a different met-ric to assess gene expressions similarity: Pearson cor-relation coefficient, ARACNE [2] and MIC [38]. In thefollowing subsections we briefly present those methods,for more details check the cited original papers.
Pearson correlation coefficient
Pearson’s correlation coefficient (PCC) is a well knownmeasure of linear dependence between two variables.
Lazzarini et al. Page 6 of 16
Applied to gene expression profiles, it measures thesimilarity in the direction of gene response across sam-ples. Its main disadvantages are the lack of distribu-tional robustness (it assumes data normality) and thesensitivity to outliers. We generated the PCC-basedco-expression networks using the SciPy Python library[39].
ARACNE: Algorithm for the Reconstruction of GeneRegulatory Networks
The ARACNE method [2] measures the dependencebetween two gene expression profiles using mutual in-formation. Mutual information I(X;Y ) estimates en-tropy to quantify the amount of information that Ycontains about X (measured in bits). In contrast tocorrelation, it is able to detect non-linear dependen-cies. ARACNE calculates I(X;Y ) for every pair ofgene expression profiles X and Y , and applies the dataprocessing inequality to remove the majority of indi-rect dependencies. For each triplet X, Y and Z theweakest link is removed, e.g. the edge between X andY is removed if I(X;Y ) ≤ min(I(X;Z), M(Z;Y )) −ε. The tolerance threshold ε is used to adjust forthe variance of the mutual information estimator. Togenerate the ARACNE based networks we used theminet R package [40] with the following parameters:mi.empirical estimator, equalwidth distance and ε = 0.
MIC: Maximal Information Coefficient
The MIC [38] is a recently proposed measure of thestrength of association between two variables, closelyrelated to mutual information. Instead of using a singlediscretisation strategy to bin the compared variables,it chooses individual bins for each variable, such thatvalue of mutual information I(X;Y ) is maximised.Compared to standard estimation of I(X;Y ) valueused in ARACNE, the optimised estimation providedby MIC is able to detect a wider range of non-linearassociations. To generate MIC based networks we usedthe minepy Python library [41] with the following pa-rameters: α = 0.6 and c = 15.
Inference of the co-expression networks counterparts
To fairly compare the co-prediction and co-expressionnetworks generated from the same data, we had tomake sure they match in size. To do that, for everyco-prediction network C with m edges and n nodes,we created two co-expression counterparts:
• SE(C): co-expression network with m edges
• SN(C): co-expression network with n nodes
PCC and MIC methods directly compute the pairwisesimilarity between the gene expressions. Given that,we generated SE(C) using m gene pairs with the high-est similarity coefficient. To build SN(C) we used asmany top gene pairs as needed, to reach at least nnodes (as we included all pairs tied on the similarityvalue, sometimes we end up with a few nodes more).
ARACNE uses a pruning procedure and generates aweighted network, not a list of pairwise similarities.When the resulting network was smaller than m edgesor n nodes, we increased the default tolerance thresh-old ε to obtain a large enough network. This was thecase for the CNS (ε = 0.002) and the Dlbcl datasets(ε = 0.043). Then we used the edge weights to selecttop gene pairs, as in the case of PCC and MIC meth-ods.
Several examples of inferred co-prediction networksand corresponding co-expression networks are visu-alised in Section 7 of the Supplementary Material andare accompanied, in there, by an initial analysis of se-lected topological properties in Section 3.
Enrichment analysis
To understand the biological information capturedby the generated networks we conducted an enrich-ment analysis. This is a statistical method of check-ing whether a set of genes have common character-istics. In our study, the set is defined by the nodesof the generated functional network and is analysedwith PANTHER [42]. Because many statistical testsare performed (one for each term) at the same time,PATHER uses Bonferroni correction for multiple test-ing with α = 0.05. We searched for two categories ofbiological knowledge: Gene Ontology (GO) terms andPANTHER pathways (176 primarily signalling path-ways). From the set of GO term, we selected only themanually curated annotations that were supported byexperimental evidence.
Disease association analysis
To evaluate the predictive power of the generated net-works, and to assess their relevance within a cancer-related context, we analysed the relationships be-tween known disease-associated genes. We used twosources for the disease associations: Malacards (ameta-database of human maladies consolidated from64 independent sources) [43] and the union of severalmanually curated databases (OMIM [44], Orphanet[45], Uniprot [46] and CTD [47]). We looked at twoproperties: (1) the proximity of the disease-associated
Lazzarini et al. Page 7 of 16
Table 3 FuNeL success rate in identification of disease-predicting SNPs. The datasets differed with respect to heritability, number ofSNPs and detection difficulty (L-EDM models were the hardest, H-EDM the easiest).
5 SNP 10 SNP 15 SNP 20 SNP 25 SNP
Her. L-EDM H-EDM L-EDM H-EDM L-EDM H-EDM L-EDM H-EDM L-EDM H-EDM
0.001 6 % 16 % 8 % 18 % 4 % 10 % 4 % 12 % 12 % 16 %0.005 8 % 82 % 0 % 86 % 6 % 80 % 2 % 82 % 8 % 72 %0.01 8 % 96 % 8 % 100 % 8 % 100 % 12 % 100 % 14 % 100 %0.05 14 % 100 % 60 % 100 % 42 % 100 % 34 % 100 % 34 % 100 %0.1 100 % 100 % 100 % 100 % 100 % 100 % 100 % 100 % 100 % 100 %0.2 100 % 100 % 100 % 100 % 100 % 100 % 100 % 100 % 100 % 100 %0.3 100 % 100 % 100 % 100 % 100 % 100 % 100 % 100 % 100 % 100 %0.4 100 % 100 % 100 % 100 % 100 % 100 % 100 % 100 % 100 % 100 %
genes within a network and (2) the number of tri-angles in a network, containing one or more disease-associated genes.
Higher proximity represents stronger functional rela-tionship between genes involved in the disease. Trian-gles represent groups of attributes used together acrossdifferent prediction rules, and therefore indicate strongmutual relationship between the genes (useful in thediscovery of potential new disease associations). Tri-angles are also the smallest non-trivial motifs that canbe found in a complex network and over-representedmotifs usually identify functional units of biologicalprocesses in cells [48].
The proximity of disease-associated genes was mea-sured using the average shortest path length (SPL).The proximity was defined as a ratio of two distances:average SPL between all pairs of the non-associatedgenes and average SPL between all pairs of disease-associated genes A:
1
n
n∑i=1
wiSPL(CCi \A)
SPL(A),where wi =
|CCi|∑nj=1 |CCj |
As the generated networks often were disconnected(had more than 1 connected component), we intro-duced a weight wi that represents the relative size of aconnected component CCi. Components with less than3 nodes or disease-associated genes were not used inthe calculation.
Results
The main results described in this section are basedon the analysis of 8 real-world datasets. The only ex-ception is the subsection below, which reports the testresults on synthetic datasets.
Identification of predefined relationships in syntheticdatasets
To verify how well FuNeL is able to identify func-tional relationships, we tested it first on synthetic
datasets generated using GAMETES. We used 80 dif-ferent model configurations that varied in heritability,number of SNPs and ease of detection, and tested thesuccess rate on 50 datasets per model. Given the smallnumber of attributes in the synthetic datasets, we usedonly the C2 protocol configuration in the tests (no fea-ture selection, single learning phase). The percentageof successfully identified relationships for each modelis reported in Table 3. We counted as success the pres-ence of an edge between the interacting pair of SNPsin the inferred network.
As expected, a higher success rate was obtained formodels where relationships were easy to detect (H-EDM). The performance increased with higher valuesof heritability and 100% success rate was obtained forheritability values above 0.05 regardless of model diffi-culty. The overall results are similar to those reportedin [28], or even slightly better, as FuNeL’s success ratewas unaffected by the increase in the number of SNPs.
Complementarity of the enriched terms
To test how unique are the biological terms (GO termsand pathways) over-represented in the inferred Fu-NeL networks, we measured an overlap between termsfound for each type of network. We defined the over-lap between terms enriched for networks inferred usingconfigurations Ca and Cb as:
O(Ca, Cb) =c
ua + ub + c
where c is the number of common terms, ua is thenumber of unique terms for Ca and ub is the numberof unique terms for Cb.
Table 4 summaries the pair-wise overlap between the4 different FuNeL configurations. For GO terms we re-ported the average overlap between the biological pro-cess, cellular component and molecular function cat-egories. Although configurations that operate on thesame dataset (C1/C3 and C2/C4) shared the most
Lazzarini et al. Page 8 of 16
terms/pathways, the overlap is quite far from 100%.The observed difference is a result of the second train-ing stage. Configurations used on different datasets(i.e. different set of attributes) resulted in networkssharing less than 40% GO terms and 20% pathways.
Table 4 Average overlap of enriched GO terms and pathwaysbetween different FuNeL configurations. The overlap wasaveraged across all 8 datasets.
Gene Ontology Pathways
C1 C2 C3 C4 C1 C2 C3 C4
C1 — 0.353 0.749 0.405 — 0.186 0.513 0.183C2 — 0.321 0.701 — 0.095 0.591C3 — 0.364 — 0.104C4 — —
Similarly, we analysed the term overlap between co-prediction and co-expression by comparing the Ci net-works with their co-expression counterparts SE (Ci)and SN (Ci) generated with different approaches (seeTable 5). We found the percentage of overlap to be sim-ilar across the different inference methods. The over-lap in enriched terms was never higher than 62% (stillleading to a difference around 40%) and was the largestfor configuration not using feature selection (C2 andC4). In general the percentages were lower for biolog-ical pathways with a minimum of only 10% of sharedterms. Low values of terms overlap indicate that theco-prediction and the co-expression approaches can beseen as complementary. Despite starting from the samedataset, they generate networks expressing differentbiological information.
Table 5 Average overlap of enriched GO terms and pathwaysbetween the co-prediction and co-expression networks. Eachco-expression network Ci was compared to the correspondingco-expression networks SE(Ci) and SN(Ci). The overlap wasaveraged across all 8 datasets.
Co-expression (SE) Co-expression (SN)
Method Cat. C1 C2 C3 C4 C1 C2 C3 C4
PCCGO 0.280 0.414 0.297 0.432 0.315 0.576 0.367 0.488
path. 0.223 0.260 0.258 0.190 0.264 0.400 0.175 0.287
ARACNEGO 0.348 0.621 0.272 0.565 0.333 0.612 0.277 0.535
path. 0.126 0.463 0.139 0.479 0.085 0.423 0.016 0.356
MICGO 0.316 0.513 0.283 0.487 0.300 0.614 0.289 0.527
path. 0.097 0.339 0.142 0.315 0.112 0.469 0.080 0.352
Quantifying the amount of captured biologicalknowledge
The amount of biological knowledge (number of en-riched terms) captured by a network is related to itssize (number of nodes). To fairly compare the networksof different sizes we used the normalised EnrichmentScore (ES):
ES =number of enriched terms
number of nodes
The score assesses if a network contains biologicallyrelated nodes. The higher it is, the larger is a biologicalsimilarity between the nodes of a network.
To have a global view of the performances of each in-ference method in term of ES, we performed a two-stepanalysis for each enrichment category. First, using theES, we ranked the networks generated by each methodin order to identify the best performing one. See Sec-tion 4 of the Supplementary Material for the completeanalysis.
Once we identified the best network for each method,we ranked them together by ES and calculated theiraverage rank across the datasets. The results of thisanalysis are reported in Table 6. MIC performed bestwhen ES was calculated using the GO terms (it wasranked first in each of those categories). When ESwas calculated using the biological pathways, C4 andARACNE SE(C1) shared the highest rank.
Table 6 Average ranks based on the Enrichment Score for thebest performing networks of each inference method. For eachcategory and for each method, we report the network used in theanalysis. The ranks (in brackets) were averaged across all 8datasets, and the highest ranks are shown with bold font. The lastrow reports the average ranks across all the biological categories.The following abbreviations were used for GO categories:biological process (BP), molecular function (MF) and cellularcomponent (CC).
Category FuNeL PCC ARACNE MIC
GO BP C4 (3) SE(C3) (1.5) SN(C3) (4) SN(C3) (1.5)GO MF C3 (3.5) SN(C3) (3.5) SN(C3) (2) SN(C3) (1)GO CC C3 (4) SN(C1) (3) SN(C3) (2) SN(C3) (1)Patwhays C4 (1.5) SN(C2) (3.5) SE(C1) (1.5) SE(C3) (3.5)
Average 3 2.88 2.38 1.75
Table 6 shows that the best performing networks foreach method were mostly C3 co-expression counter-parts, in particular SN(C3). This is consistent withthe result of the topological analysis in Section 3 ofthe Supplementary Material were these networks werefound to have the lowest number of nodes, and suggeststhat smallest networks tend to be more enriched. Thedifference in performance between the FuNeL configu-rations is mainly a result of the application of the sec-ond machine learning phase (the best networks wereC3 and C4).
In Supplementary Table 6 we reported the results ofa similar analysis where we compared the similarity-based inference methods against FuNeL (ranks in thererange from 1 to 12: 4 Ci + 4 SE (Ci) + 4 SN (Ci). Inthis pairwise analysis, FuNeL networks performed sim-ilarly to PCC and ARACNE. We did not observe anyconsistent winner across all the enrichment categories.MIC seems to have better results than FuNeL only forGO categories, as emerged from Table 6, while FuNeLnetworks tend to be more enriched for biological path-ways.
Lazzarini et al. Page 9 of 16
Evaluation of the networks in a disease context
To verify if the topology of the inferred networks isbiologically meaningful, we analysed how it definesthe relationships between genes that are known to beassociated with a disease targeted by each dataset.We expected the disease-associated genes to be moreclosely connected than other genes and to be presentin functional units, such as triangle motifs. We mea-sured the proximity of the disease-associated genes(i.e. how closely connected they are compared withnon-disease-associated genes) and counted the num-ber of triangular relationships present in each network(i.e. the percentage of triangles containing one, two orthree disease-associated genes). We repeated the two-step analysis as presented in Section Quantifying theamount of captured biological knowledge by using thegene-disease metrics for the ranking. The results arereported in Table 7. The detailed results for each in-ference method are available in Section 5 of the Sup-plementary Material.
Table 7 Average ranks based on the disease-associations forthe best performing networks of each inference method. Foreach category and for each method we report the network usedfor the analysis. The ranks (in brackets) were averaged across all8 datasets, and the highest ranks are shown with bold font. Thelast row reports the average ranks across all the categories. Thenumber of disease-associated genes participating in a triangle isdenoted as 1A, 2A and 3A.
Source Cat. FuNeL PCC ARACNE MIC
Curated
1A C2 (1) SN(C2) (4) SN(C3) (2.5) SN(C2) (2.5)2A C3 (1) SN(C3) (2) SE(C2) (3) SN(C2) (4)3A C1 (2) SN(C1) (3) SE(C4) (4) SE(C2) (1)Proximity C2 (1) SN(C3) (2.5) SE(C4) (2.5) SE(C2) (4)
Average 1.25 2.88 3 2.88
Malacards
1A C2 (1) SN(C2) (4) SN(C4) (3) SE(C4) (2)2A C2 (1.5) SN(C4) (4) SE(C4) (1.5) SN(C2) (3)3A C3 (2) SN(C4) (3) SE(C2) (4) SN(C2) (1)Proximity C2 (1) SE(C4) (4) SE(C4) (3) SE(C2) (2)
Average 1.78 3.75 2.88 2
The average ranks, for both sources of disease associa-tions, suggest that co-prediction outperforms the otherinference paradigms. The proximity of the disease-associated genes was in general higher in C2 network.Therefore, the co-prediction paradigm has identifiedthe core elements of the network more accurately. Thisresult highlights the benefits of including functional in-formation, whenever these are available, in the networkinference process (FuNeL is using the class labels as-signed to the samples of the dataset), in contrast tothe co-expression approach solely based on gene ex-pression similarity (unsupervised).
There is also a clear difference in the number ofdisease-associated genes participating in the triangles;co-prediction networks were ranked higher than the co-expression networks. The only category in which MIC
had a higher rank was 3A. However, considering thatthere were not many triangles with disease-associatedgenes, many ties affected the ranks in this category.Overall, these results demonstrate the higher predic-tive potential of the FuNeL networks in identifying newdisease associations.
Prostate cancer case study: enriched terms
To compare in detail the difference in biological knowl-edge captured by the co-prediction and co-expressionnetworks, we followed our global analysis with a casestudy focused on a dataset targeting a single disease— prostate cancer [35]. We were especially interestedin specific knowledge captured by one paradigm butnot the other.
In Figures 4 and 5 we compared the co-prediction andPCC co-expression networks inferred from the prostatecancer dataset. We focused on unique GO terms andpathways, enriched only in one type of networks. Forthe sake of readability we filtered out the generic GOterms (with depth < 9 in the GO hierarchical struc-ture). C2 was the network with the largest number ofunique terms, followed by C4 and SN(C2). We found16 GO terms and 21 pathways unique to co-predictionnetworks and only 3 GO terms and 4 pathways uniqueto co-expression networks. A similar disproportion infavour of the co-prediction networks was found in com-parison with MIC and ARACNE networks (see Sup-plementary Figures 2 and 3).
We found several of the unique GO terms enrichedin the co-prediction networks to be related to prostatecancer. The role of the Protein ubiquination in prostatecancer was recently analysed and showed an impact forits treatments [49]. ERK pathway is involved in themotility of prostate cancer cells [50]. Prostate cancercells seems to alter the nature of their calcium influxto promote growth and acquire apoptotic resistance[51]. Furthermore, the role of calcium homeostasis inthe majority of the cell-signaling pathways involvedin carcinogenesis has been well established, prostatecancer included [52].
A number of enriched pathways specific to co-predictionnetworks are also highly relevant to the prostate can-cer. Several studies demonstrated the involvement ofthe JAK/STAT pathway in the prostate cancer devel-opment [53, 54]. There is multiple evidence suggest-ing that one of the major aging-associated influenceson prostate carcinogenesis is oxidative stress and itscumulative impact on DNA damage [55, 56]. Finally,FAS (also called Apo1 or CD95) plays a central role inthe physiological regulation of programmed cell death
Lazzarini et al. Page 10 of 16
C1 C2 C3 C4
SE(C1)
SE(C2)
SE(C3)
SE(C4)
SN(C1)
SN(C2)
SN(C3)
SN(C4)
positive regulation of peptidyltyrosine phosphorylation (GO:0050731)
regulation of MAP kinase activity (GO:0043405)
positive regulation of cytosolic calcium ion concentration (GO:0007204)
negative regulation of protein phosphorylation (GO:0001933)
regulation of calcium ion transport (GO:0051924)
regulation of metal ion transport (GO:0010959)
positive regulation of T cell proliferation (GO:0042102)
positive regulation of ERK1 and ERK2 cascade (GO:0070374)
cellular calcium ion homeostasis (GO:0006874)
regulation of protein ubiquitination (GO:0031396)
positive regulation of neurogenesis (GO:0050769)
regulation of epithelial cell apoptotic process (GO:1904035)
regulation of focal adhesion assembly (GO:0051893)
regulation of epithelial cell migration (GO:0010632)
positive regulation of epithelial cell migration (GO:0010634)
positive regulation of neuron differentiation (GO:0045666)
regulation of neuron differentiation (GO:0045664)
calcium ion homeostasis (GO:0055074)
regulation of neuron projection development (GO:0010975)
GO biological process terms
Fig 4 Number of unique enriched GO terms (biological process) for each network configuration (generated from the prostatecancer dataset). On the x-axis we show the 12 investigated networks. On the y-axis we show the names of enriched terms unique toco-prediction or PCC co-expression networks. Red terms are associated with co-expression networks, blue with co-prediction. Emptycolumns indicate networks with no unique terms.
C1 C2 C3 C4
SE(C1)
SE(C2)
SE(C3)
SE(C4)
SN(C1)
SN(C2)
SN(C3)
SN(C4)
VEGF signaling pathway (P00056)
Insulin/IGF pathwaymitogen activated protein kinase kinase/MAP kinase cascade (P00032)
Axon guidance mediated by semaphorins (P00007)
Huntington disease (P00029)
Beta1 adrenergic receptor signaling pathway (P04377)
5HT4 type receptor mediated signaling pathway (P04376)
Dopamine receptor mediated signaling pathway (P05912)
5HT2 type receptor mediated signaling pathway (P04374)
5HT1 type receptor mediated signaling pathway (P04373)
Opioid proenkephalin pathway (P05915)
Opioid prodynorphin pathway (P05916)
Opioid proopiomelanocortin pathway (P05917)
Cortocotropin releasing factor receptor signaling pathway (P04380)
Angiotensin IIstimulated signaling through G proteins and betaarrestin (P05911)
Insulin/IGF pathwayprotein kinase B signaling cascade (P00033)
Beta3 adrenergic receptor signaling pathway (P04379)
Beta2 adrenergic receptor signaling pathway (P04378)
Oxidative stress response (P00046)
Oxytocin receptor mediated signaling pathway (P04391)
Metabotropic glutamate receptor group II pathway (P00040)
Muscarinic acetylcholine receptor 2 and 4 signaling pathway (P00043)
JAK/STAT signaling pathway (P00038)
Alpha adrenergic receptor signaling pathway (P00002)
Enkephalin release (P05913)
FAS signaling pathway (P00020)
Biological pathways
Fig 5 Number of unique enriched biological pathways for each network configuration (generated from the prostate cancer
dataset). On the x-axis we show the 12 investigated networks. On the y-axis we show the names of enriched pathways unique toco-prediction or PCC co-expression networks. Red terms are associated with co-expression networks, blue with co-prediction. Emptycolumns indicate networks with no unique pathways.
Lazzarini et al. Page 11 of 16
Fig 6 Overlap of enriched terms between the best performing networks in the disease-association analysis (curated databases).On the left the overlap of GO terms (including all 3 categories: BP, CC and MF), on the right the overlap of pathways.
and has been implicated in the pathogenesis of vari-ous malignancies and diseases of the immune systemincluding prostate cancer [57].
We also performed an additional analysis of the bi-ological terms related to the hubs (highly connectednodes) of the inferred networks. A node v was consid-ered to be a hub if its degree was at least one standarddeviation above the mean network degree. To comparethe networks, we used the 10 most frequent Gene On-tology terms (biological processes with at least depth10) shared among each network’s hubs. We found 16unique terms for co-prediction networks, 19 uniqueterms for PCC co-expression networks and 11 com-mon terms. These results further highlight that somebiological terms are exclusively associated either withco-prediction or co-expression networks. The completeanalysis (method by method) is available in the Sup-plementary Material (Supplementary Figures 4, 5 and6).
A further analysis of term overlap was conducted us-ing only the best performing networks in the curateddisease-association analysis (namely C2 for FuNeL,SN(C3) for PCC, SE(C4) for ARACNE and SE(C2)for MIC, see Section 5 of the Supplementary Mate-rial for details). In Figure 6 we show the overlap ofGO terms (including all three GO categories) andpathways across networks from different inference al-gorithms. In both categories FuNeL had much largernumber of unique terms than the co-expression meth-ods and it shared the largest number of terms withARACNE. In total 122 common GO terms were foundbetween all the methods, while there was only 1 com-
mon pathway. Figure 6 further highlights the comple-mentarity between the co-prediction and co-expressionapproaches in terms of captured biological knowledge.
Prostate cancer case study: disease associations
We searched the literature and the public cancerdatabases (not used in the inference process), to verifyif key nodes in the generated networks are associatedwith prostate cancer. As a measure of node importancewe used the node degree (number of connections) andthe betweenness centrality (number of shortest pathsbetween all pair of nodes pass through a given node).
Literature analysis We picked the top 3 most con-nected nodes (hubs) for each of the four co-predictionnetworks. The set contained six genes: GSTM2, NELL2,CFD, PTGDS, PAGE4 and LMO3. All the genes fromthis set, except LMO3, were also found to be the mostcentral nodes (with highest betweenness centrality).
Almost all these genes are related with prostate cancer:
• NELL2 contributes to alterations in epithelial-stromal homeostasis in benign prostatic hyperpla-sia and codes for a novel prostatic growth factor[58], and is also an indicator of expression changesin cancer samples [59],
• CFD (adipsin gene) is over expressed in PPperiprostatic adipose tissue of prostate cancer pa-tients [60],
Lazzarini et al. Page 12 of 16
• PTGDS (and other 2 genes) are expressed at con-sistently lower levels in clinical prostate cancer tis-sues and form a signature that predicts biochem-ical relapse [61],
• PAGE4 modulates androgen receptor signaling,promoting the progression to advanced lethalprostate cancer [62], and has a significantly lowerexpression level in patients with prostate recur-rent disease [63],
• LMO3 interacts with p53, a well known gene tu-mour suppressor in prostate cancer [64].
The only gene without literature support was GSTM2.It might represent a good target for further experimen-tal verification.
Validation on independent data To further validatethe biological significance of the inferred networks,we used an independent prostate cancer dataset [65]from the cBioPortal for Cancer Genomics [66]. Weanalysed the top 10 hubs (nodes with highest degree)and the top 10 central nodes (with highest between-ness centrality) in the co-prediction network that bet-ter performed in the gene-disease association analy-sis using the curated databases: C2 (see Supplemen-tary Table 8a). The genes with highest degree were:PTGDS, PAGE4, NELL2, GSTM2, PARM1, MAF,LMO3, COL4A6, RBP1 and ABL1. For the between-ness centrality, the set was almost identical, only RBP1was replaced by MYH11. On average the expressionin samples was altered in 31.8% cases for hubs andin 35.6% cases for central nodes. The most alteredgenes were found to be downregulated at the mRNAlevel: COL4A6 (65%), MYH11 (58%), PARM1 (53%)and GSTM2 (52%). In addition, genomic alterationsin several key genes have been found to be strongly co-occurent (e.g. PTGDS – GSTM2, PAGE4 – COL4A6,PAGE4 – RBP1, etc.).
When we repeated this analysis for the co-expressionnetworks that were best ranked in the gene-diseaseanalysis using the curated databases (SN(C3) forPCC, SE(C4) for ARACNE and SE(C2) for MIC),we found that on average the alteration level was con-sistently lower, at most half of the co-prediction keygenes. The percentages of alterations are representedas boxplots in Figure 7, while the average alterationsare reported in Table 8. As Figure 7 shows, our methodis able to identify many more genes with higher per-centage of alteration than other methods. Therefore,the topologically important nodes in the best co-prediction network represent genes more strongly re-lated to the prostate cancer, with over two times morefrequent genomic alterations.
Fig 7 Distribution of the percentage genomic alteration inthe samples of an independent dataset for top 10 hubs andcentral nodes. The topologically important genes wereselected from the best performing networks in thedisease-association analysis on curated datasets.
Table 8 Average percentage of genomic alteration for top hubsand central nodes in the independent dataset.
Genes FuNeL PCC ARACNE MIC
Hubs 31.8 % 14.2 % 12.3 % 15.2 %Central nodes 35.6 % 14.7 % 12.2 % 17.1 %
The detailed list of genomic alterations for top 10 hubsand top 10 central nodes for each analysed networkis shown in Section 6 of the Supplementary Material(Figures 7–14).
Discussion
We proposed FuNeL, a protocol to infer functionalnetworks based on the co-prediction paradigm wherethe structure of a rule-based machine learning model(in this paper the rules of a classification algorithmcalled BioHEL) is used to identify relationships be-tween genes. We tested FuNeL on synthetic datasetsand obtained a high success rate in identifying pair-wise relationships between attributes. Encouraged bythis result, we hypothesised that a rule-based machinelearning model, with its complex knowledge represen-tation, might be used to identify biologically mean-ingful relationships that escape the standard inferencemethods.
To test this hypothesis, we evaluated 4 different con-figurations of the inference protocol using 8 cancer-related transcriptomics datasets. We compared Fu-NeL with other 3 co-expression inference methodsby using networks of matching size generated fromthe same data. We looked at the differences, betweenco-prediction and co-expression, from three points of
Lazzarini et al. Page 13 of 16
view: basic topological properties, enriched biologi-cal terms and relationships between known disease-associated genes.
The comparison of networks topology (see Section 3 ofthe Supplementary Material) revealed the influence ofthe protocol options. Not surprisingly, both the featureselection and the second training phase reduced thesize of the networks, but at the same time, increasedthe clustering coefficient and the number of connec-tions. The clustering coefficient was found to be lowerin almost all the ARACNE networks, probably dueto the pruning procedure, it was also lower in manyMIC networks. Moreover, when feature selection wasapplied, the resulting networks had higher clusteringcoefficient than PCC co-expression networks with thesame number of edges. Interestingly, all co-expressionnetworks were less compact, with up to 3 times higherdiameter for PCC and ARACNE and up to 7 timeshigher for MIC.
The differences in networks topology translated to dif-ferences in contained biological information. The over-lap between enriched GO terms and pathways acrossprotocol configurations was generally low, indicatingthat different configurations infer networks that cap-ture different biological knowledge. The same termsoverlap between the co-prediction networks and theirequivalent co-expression counterparts was even lower,never exceeding 62%. We interpret that as evidence,that the biological knowledge captured by the twoparadigms is not completely redundant, but in a largepart complementary.
The most apparent differences between the networkswere observed during the analysis of the connectionsbetween genes known to be related to a specific dis-ease. The disease-associated genes were more closelyconnected (higher proximity) in the co-prediction net-works, which means that the disease-related nodes ofthe network were closer to its core. We also found thatthe number of functional units (triangle motifs), thatcan identify new gene-disease associations, was higherin the co-prediction networks. Therefore, we concludethat the co-prediction networks better capture the ab-stract concept of functional relationship.
The prostate cancer case study further confirmed thisconclusion. We found enriched GO terms and biologi-cal pathways, unique to the co-prediction networks, tobe reported in the literature as related to prostate can-cer. Furthermore, FuNeL generated networks enrichedwith knowledge totally missed by all the co-expressionnetworks when using the prostate cancer dataset. Wealso found that genes corresponding to the topologi-cally important nodes in the co-prediction networks:
(1) were altered in a high percentage of tumour sam-ples in an independent cancer transcriptomic study,and (2) were already associated with prostate can-cer according to the specialised literature. Therefore,the co-prediction networks not only capture biologi-cal knowledge complementary to the co-expression net-works, but also highlights better the important genesinvolved in the disease process.
The superior performance of FuNeL networks in identi-fying the disease-associated genes is likely a result of ef-fective use of the class labels of the samples, which thesimilarity-based methods ignore. Although it would betempting to attribute this performance difference en-tirely to the use of supervised learning in FuNeL, itwould be an overstatement, as the knowledge of ex-plicit links between genes and diseases is not availableto it in training. Our hypothesis is that this is rather aresult of differences in expression values of the disease-associated genes, which taken together are able to dis-criminate between sample phenotypes.
Given that our co-prediction networks were found tobe not only biologically meaningful, but also comple-mentary to similarity-based functional networks, webelieve that network inference based on machine learn-ing models deserves to be studied in more detail in thefuture. In here we only touched the subject of featureselection and network post-processing, and althoughwe now know they indeed influence the network topol-ogy and its biological interpretation, there are manystrategies to choose from in that respect.
At the same time, the machine learning step in theFuNeL protocol does not have to be limited to therule-based machine learning methods. We can imagineunsupervised methods, such as the Apriori algorithmfor association rule learning, or other supervised meth-ods, such as decision tree algorithms (e.g. C4.5 or ran-dom forest), replacing BioHEL in the FuNeL protocol.Some adjustment would be necessary to extract theknowledge from a different model representation, butthe rest of the protocol could remain unchanged. Forexample in the case of the decision trees, relationshipscould be inferred between attributes that share thesame path from the root to the leaves of a tree. Thispotential flexibility in the choice of a learning algo-rithm, together with the ability to apply the protocolto different types of data, becomes important in thecontext of results correctness. As has been discussedto a great length in [67], when methods or data usedin the network inference process are tightly controlled,some results will replicate more easily than others notbecause they are correct, but due to a replicable bias.Therefore a diversity in methods and data is a neces-sary condition to be able to converge on the scientifictruth.
Lazzarini et al. Page 14 of 16
Finally, in terms of testing new functional networks,there is a limit of how thorough and complete a man-ual literature analysis can be, which leads to a greatneed of synthetic or experimentally validated bench-marks, similar to those proposed for protein-proteininteraction networks or gene regulatory networks. Al-though we understand that this would be a difficultand challenging task, we see this as a necessary stepon the way to refining the functional inference meth-ods.
Conclusions
We presented FuNeL: a protocol for the inference offunctional networks from rule-based machine learningmodels. FuNeL is based on the co-prediction paradigm,which hypothesises that genes used together with arule-based machine learning model, are more likely tobe functionally related. We verified that FuNeL cor-rectly identifies relationships in synthetic datasets andwe thoroughly compared FuNeL to three co-expressioninference methods: PCC, ARACNE and MIC, on 8real-world datasets. We contrasted the different ap-proaches by looking at the inferred networks topol-ogy, enriched biological terms and the relationshipsbetween genes associated with cancer. We found thatFuNeL networks capture relevant biological knowledgethat is complementary to what is captured by theco-expression approaches, and demonstrated that Fu-NeL networks are better at identifying relationshipsbetween genes with known disease associations.
Availability of data and material
The datasets used for the analysis were collected from the following public
domain resources:
Dlbcl http://ico2s.org/datasets/microarray.html
CNS http://datam.i2r.a-star.edu.sg/datasets/krbd/NervousSystem/NervousSystem.html
Leukemia http://datam.i2r.a-star.edu.sg/datasets/krbd/Leukemia/ALLAML.html
Lung-Michigan http://datam.i2r.a-star.edu.sg/datasets/krbd/LungCancer/LungCancer-Michigan.html
Lung-Harvard http://datam.i2r.a-star.edu.sg/datasets/krbd/LungCancer/LungCancer-Harvard2.html
Prostate http://datam.i2r.a-star.edu.sg/datasets/krbd/ProstateCancer/ProstateCancer.html
AML http://www.biolab.si/supp/bi-cancer/projections/info/AMLGSE2191.html
Colon-Breast http://www.biolab.si/supp/bi-cancer/projections/info/BC_CCGSE3726_frozen.html
The FuNeL source code is publicly available:
Project name: FuNeL
Project home page: http://ico2s.org/software/funel.html
Operating system(s): GNU/Linux
Programming language: Python, R
License: GNU GPLv3
Author’s contributions
NL, PW, NK and JB designed the experiments. NL and PW performed the
experiments. NL, PW, NK and JB analysed the data. RH and SW
contributed to the biological validation of the results. NL, PW, NK and JB
wrote the paper. All authors read and approved the final manuscript.
Acknowledgements
We thank Joseph Mullen for the help provided in finding the gene-disease
associations. This work made use of the facilities of N8 HPC Centre of
Excellence, provided and funded by the N8 consortium and EPSRC
[EP/K000225/1]. The Centre is co-ordinated by the Universities of Leeds
and Manchester.
Funding
This work was supported by the Engineering and Physical Sciences Research
Council [EP/L001489/2, EP/J004111/2, EP/I031642/2, EP/N031962/1].
The funding agency was not involved with the design of the study, analysis
and interpretation of data or in the writing of the manuscript.
Author details
1 Interdisciplinary Computing and Complex BioSystems (ICOS) research
group, School of Computing Science, Newcastle University, Newcastle upon
Tyne, UK. 2 Clinical and Experimental Pharmacology Group, Cancer
Research UK Manchester Institute, University of Manchester, Manchester,
UK. 3 Northern Institute for Cancer Research, Medical School, Newcastle
University, Newcastle upon Tyne, UK.
References
1. Bansal, M., Belcastro, V., Ambesi-Impiombato, A., di Bernardo, D.:
How to infer gene networks from expression profiles. Molecular
Systems Biology 3(1) (2007). doi:10.1038/msb4100120
2. Margolin, A.A., Nemenman, I., Basso, K., Wiggins, C., Stolovitzky, G.,
Dalla Favera, R., Califano, A.: ARACNE: An algorithm for the
reconstruction of gene regulatory networks in a mammalian cellular
context. BMC Bioinformatics 7(Suppl 1), 1–15 (2006).
doi:10.1186/1471-2105-7-S1-S7
3. Barzel, B., Barabasi, A.-L.: Network link prediction by global silencing
of indirect correlations. Nature biotechnology 31(8), 720–725 (2013).
doi:10.1038/nbt.2601
4. Huynh-Thu, V.A., Irrthum, A., Wehenkel, L., Geurts, P.: Inferring
regulatory networks from expression data using tree-based methods.
PLoS ONE 5(9), 1–10 (2010). doi:10.1371/journal.pone.0012776
5. Childs, K.L., Davidson, R.M., Buell, C.R.: Gene Coexpression Network
Analysis as a Source of Functional Annotation for Rice Genes. PLoS
ONE 6(7), 22196 (2011). doi:10.1371/journal.pone.0022196
6. Presson, A.P., Sobel, E.M., Papp, J.C., Suarez, C.J., Whistler, T.,
Rajeevan, M.S., Vernon, S.D., Horvath, S.: Integrated weighted gene
co-expression network analysis with an application to chronic fatigue
syndrome. BMC Systems Biology 2(1), 1–12 (2008).
doi:10.1186/1752-0509-2-95
7. Ray, M., Jianhua, R., Weixiong, Z.: Variations in the transcriptome of
Alzheimer’s disease reveal molecular networks involved in
cardiovascular diseases. Genome Biology 9(10), 148 (2008).
doi:10.1186/gb-2008-9-10-r148
8. Ransbotyn, V., Yeger-Lotem, E., Basha, O., Acuna, T., Verduyn, C.,
Gordon, M., Chalifa-Caspi, V., Hannah, M.A., Barak, S.: A
combination of gene expression ranking and co-expression network
analysis increases discovery rate in large-scale mutant screens for novel
Arabidopsis thaliana abiotic stress genes. Plant Biotechnology Journal
13(4), 501–513 (2015). doi:10.1111/pbi.12274
9. Yang, Y., Han, L., Yuan, Y., Li, J., Hei, N., Liang, H.: Gene
co-expression network analysis reveals common system-level properties
of prognostic genes across cancer types. Nature Communications 5(2014). doi:10.1038/ncomms4231
10. Kommadath, A., Bao, H., Arantes, A.S., Plastow, G.S., Tuggle, C.K.,
Bearson, S.M.D., Luo Guan, L., Stothard, P.: Gene co-expression
network analysis identifies porcine genes associated with variation in
Salmonella shedding. BMC Genomics 15(1), 1–15 (2014).
doi:10.1186/1471-2164-15-452
11. Wei, S.-N., Zhao, W.-J., Zeng, X.-J., Kang, Y.-M., Du, J., Li, H.-H.:
Microarray and co-expression network analysis of genes associated with
acute doxorubicin cardiomyopathy in mice. Cardiovascular Toxicology
15(4), 377–393 (2015). doi:10.1007/s12012-014-9306-7
Lazzarini et al. Page 15 of 16
12. Silva, A.T., Ribone, P.A., Chan, R.L., Ligterink, W., Hilhorst, H.W.M.:
A predictive co-expression network identifies novel genes controlling
the seed-to-seedling phase transition in Arabidopsis thaliana. Plant
Physiology 170(4), 2218–2231 (2016). doi:10.1104/pp.15.01704
13. Mordelet, F., Vert, J.-P.: ProDiGe: Prioritization Of Disease Genes
with multitask machine learning from positive and unlabeled examples.
BMC Bioinformatics 12(1), 1–15 (2011).
doi:10.1186/1471-2105-12-389
14. Martınez-Ballesteros, M., Nepomuceno-Chamorro, I.A., Riquelme, J.C.:
Inferring gene-gene associations from Quantitative Association Rules.
In: ISDA, pp. 1241–1246 (2011). doi:10.1109/ISDA.2011.6121829
15. Nepomuceno-Chamorro, I.A., Aguilar-Ruiz, J.S., Riquelme, J.C.:
Inferring gene regression networks with model trees. BMC
bioinformatics 11(1), 517 (2010). doi:10.1186/1471-2105-11-517
16. Yoshida, M., Koike, A.: SNPInterForest: a new method for detecting
epistatic interactions. BMC bioinformatics 12(1), 469 (2011).
doi:10.1186/1471-2105-12-469
17. Urbanowicz, R.J., Granizo-Mackenzie, A., Moore, J.H.: An Analysis
Pipeline with Statistical and Visualization-Guided Knowledge
Discovery for Michigan-Style Learning Classifier Systems. IEEE Comp.
Int. Mag. 7(4), 35–45 (2012). doi:10.1109/MCI.2012.2215124
18. Urbanowicz, R.J., Andrew, A.S., Karagas, M.R., Moore, J.H.: Role of
genetic heterogeneity and epistasis in bladder cancer susceptibility and
outcome: a learning classifier system approach. Journal of the
American Medical Informatics Association : JAMIA 20(4), 603–612
(2013). doi:10.1136/amiajnl-2012-001574
19. Bassel, G.W., Glaab, E., Marquez, J., Holdsworth, M.J., Bacardit, J.:
Functional Network Construction in Arabidopsis Using Rule-Based
Machine Learning on Large-Scale Data Sets. The Plant Cell Online
23(9), 3101–3116 (2011). doi:10.1105/tpc.111.088153
20. Glaab, E., Bacardit, J., Garibaldi, J.M., Krasnogor, N.: Using
Rule-Based Machine Learning for Candidate Disease Gene
Prioritization and Sample Classification of Cancer Gene Expression
Data. PLoS ONE 7(7), 39932 (2012).
doi:10.1371/journal.pone.0039932
21. Swan, A.L., Hillier, K.L., Smith, J.R., Allaway, D., Liddell, S.,
Bacardit, J., Mobasheri, A.: Analysis of mass spectrometry data from
the secretome of an explant model of articular cartilage exposed to
pro-inflammatory and anti-inflammatory stimuli using machine
learning. BMC musculoskeletal disorders 14(1), 349 (2013).
doi:10.1186/1471-2474-14-349
22. Fainberg, H.P., Bodley, K., Bacardit, J., Li, D., Wessely, F., Mongan,
N.P., Symonds, M.E., Clarke, L., Mostyn, A.: Reduced Neonatal
Mortality in Meishan Piglets: A Role for Hepatic Fatty Acids? PLoS
ONE 7(11), 49101 (2012). doi:10.1371/journal.pone.0049101
23. Bacardit, J., Widera, P., Marquez-Chamorro, A., Divina, F.,
Aguilar-Ruiz, J.S., Krasnogor, N.: Contact map prediction using a
large-scale ensemble of rule sets and the fusion of multiple predicted
structural features. Bioinformatics 28(19), 2441–2448 (2012).
doi:10.1093/bioinformatics/bts472
24. Bacardit, J., Burke, E.K., Krasnogor, N.: Improving the scalability of
rule-based evolutionary learning. Memetic Computing 1(1), 55–67
(2009). doi:10.1007/s12293-008-0005-4
25. Guyon, I., Weston, J., Barnhill, S., Vapnik, V.: Gene selection for
cancer classification using Support Vector Machines. Machine Learning
46(1-3), 389–422 (2002). doi:10.1023/A:1012487302797
26. Schaffter, T., Marbach, D., Floreano, D.: GeneNetWeaver: in silico
benchmark generation and performance profiling of network inference
methods. Bioinformatics 27(16), 2263–2270 (2011).
doi:10.1093/bioinformatics/btr373
27. Urbanowicz, R.J., Kiralis, J., Sinnott-Armstrong, N.A., Heberling, T.,
Fisher, J.M., Moore, J.H.: GAMETES: a fast, direct algorithm for
generating pure, strict, epistatic models with random architectures.
BioData Mining 5(1), 1–14 (2012). doi:10.1186/1756-0381-5-16
28. Li, J., Malley, J.D., Andrew, A.S., Karagas, M.R., Moore, J.H.:
Detecting gene-gene interactions using a permutation-based random
forest method. BioData Mining 9(1), 1–17 (2016).
doi:10.1186/s13040-016-0093-5
29. Baron, D., Bihouee, A., Teusan, R., Dubois, E., Savagner, F.,
Steenman, M., Houlgatte, R., Ramstein, G.: MADGene: retrieval and
processing of gene identifier lists for the analysis of heterogeneous
microarray datasets. Bioinformatics 27(5), 725–726 (2011).
doi:10.1093/bioinformatics/btq710
30. Shipp, M.A., Ross, K.N., Tamayo, P., Weng, A.P., Kutok, J.L., Aguiar,
R.C., Gaasenbeek, M., Angelo, M., Reich, M., Pinkus, G.S., Ray, T.S.,
Koval, M.A., Last, K.W., Norton, A., Lister, T.A., Mesirov, J.,
Neuberg, D.S., Lander, E.S., Aster, J.C., Golub, T.R.: Diffuse large
B-cell lymphoma outcome prediction by gene-expression profiling and
supervised machine learning. Nature medicine 8(1), 68–74 (2002).
doi:10.1038/nm0102-68
31. Pomeroy, S.L., Tamayo, P., Gaasenbeek, M., Sturla, L.M., Angelo, M.,
McLaughlin, M.E., Kim, J.Y.H., Goumnerova, L.C., Black, P.M., Lau,
C., Allen, J.C., Zagzag, D., Olson, J.M., Curran, T., Wetmore, C.,
Biegel, J.A., Poggio, T., Mukherjee, S., Rifkin, R., Califano, A.,
Stolovitzky, G., Louis, D.N., Mesirov, J.P., Lander, E.S., Golub, T.R.:
Prediction of central nervous system embryonal tumour outcome based
on gene expression. Nature 415(6870), 436–442 (2002).
doi:10.1038/415436a
32. Golub, T.R., Slonim, D.K., Tamayo, P., Huard, C., Gaasenbeek, M.,
Mesirov, J.P., Coller, H., Loh, M.L., Downing, J.R., Caligiuri, M.A.,
Bloomfield, C.D., Lander, E.S.: Molecular Classification of Cancer:
Class Discovery and Class Prediction by Gene Expression Monitoring.
Science 286(5439), 531–537 (1999). doi:10.1126/science.286.5439.531
33. Beer, D.G., Kardia, S.L., Huang, C.-C.C., Giordano, T.J., Levin, A.M.,
Misek, D.E., Lin, L., Chen, G., Gharib, T.G., Thomas, D.G., Lizyness,
M.L., Kuick, R., Hayasaka, S., Taylor, J.M., Iannettoni, M.D.,
Orringer, M.B., Hanash, S.: Gene-expression profiles predict survival of
patients with lung adenocarcinoma. Nature medicine 8(8), 816–824
(2002). doi:10.1038/nm733
34. Gordon, G.J., Jensen, R.V., Hsiao, L.-L., Gullans, S.R., Blumenstock,
J.E., Ramaswamy, S., Richards, W.G., Sugarbaker, D.J., Bueno, R.:
Translation of Microarray Data into Clinically Relevant Cancer
Diagnostic Tests Using Gene Expression Ratios in Lung Cancer and
Mesothelioma. Cancer Research 62(17), 4963–4967 (2002)
35. Singh, D., Febbo, P.G., Ross, K., Jackson, D.G., Manola, J., Ladd, C.,
Tamayo, P., Renshaw, A.A., D’Amico, A.V., Richie, J.P., Lander, E.S.,
Loda, M., Kantoff, P.W., Golub, T.R., Sellers, W.R.: Gene expression
correlates of clinical prostate cancer behavior. Cancer Cell 1(2),
203–209 (2002). doi:10.1016/S1535-6108(02)00030-2
36. Yagi, T., Morimoto, A., Eguchi, M., Hibi, S., Sako, M., Ishii, E.,
Mizutani, S., Imashuku, S., Ohki, M., Ichikawa, H.: Identification of a
gene expression signature associated with pediatric AML prognosis.
Blood 102(5), 1849–1856 (2003). doi:10.1182/blood-2003-02-0578
37. Chowdary, D., Lathrop, J., Skelton, J., Curtin, K., Briggs, T., Zhang,
Y., Yu, J., Wang, Y., Mazumder, A.: Prognostic gene expression
signatures can be measured in tissues collected in RNAlater
preservative. The Journal of Molecular Diagnostics 8(1), 31–39 (2006).
doi:10.2353/jmoldx.2006.050056
38. Reshef, D.N., Reshef, Y.A., Finucane, H.K., Grossman, S.R., McVean,
G., Turnbaugh, P.J., Lander, E.S., Mitzenmacher, M., Sabeti, P.C.:
Detecting novel associations in large data sets. Science 334(6062),
1518–1524 (2011). doi:10.1126/science.1205438
39. Jones, E., Oliphant, T., Peterson, P., et al.: SciPy: Open source
scientific tools for Python (2001–). http://www.scipy.org/
40. Meyer, P.E., Lafitte, F., Bontempi, G.: minet: A R/Bioconductor
package for inferring large transcriptional networks using mutual
information. BMC Bioinformatics 9(1), 1–10 (2008).
doi:10.1186/1471-2105-9-461
41. Albanese, D., Filosi, M., Visintainer, R., Riccadonna, S., Jurman, G.,
Furlanello, C.: minerva and minepy: a C engine for the MINE suite and
its R, Python and MATLAB wrappers. Bioinformatics 29(3), 407–408
(2013). doi:10.1093/bioinformatics/bts707
42. Thomas, P.D., Campbell, M.J., Kejariwal, A., Mi, H., Karlak, B.,
Daverman, R., Diemer, K., Muruganujan, A., Narechania, A.:
PANTHER: A library of protein families and subfamilies indexed by
function. Genome Research 13(9), 2129–2141 (2003).
doi:10.1101/gr.772403
43. Rappaport, N., Nativ, N., Stelzer, G., Twik, M., Guan-Golan, Y.,
Iny Stein, T., Bahir, I., Belinky, F., Morrey, C.P., Safran, M., Lancet,
D.: MalaCards: an integrated compendium for diseases and their
annotation. Database 2013 (2013). doi:10.1093/database/bat018
Lazzarini et al. Page 16 of 16
44. Hamosh, A., Scott, A.F., Amberger, J.S., Bocchini, C.A., McKusick,
V.A.: Online Mendelian Inheritance in Man (OMIM), a knowledgebase
of human genes and genetic disorders. Nucleic Acids Research
33(suppl 1), 514–517 (2005). doi:10.1093/nar/gki033
45. INSERM: Orphanet: an online database of rare diseases and orphan
drugs (1997). http://www.orpha.net/
46. Magrane, M., Consortium, U.: UniProt Knowledgebase: a hub of
integrated protein data. Database 2011 (2011).
doi:10.1093/database/bar009
47. Davis, A.P., Grondin, C.J., Lennon-Hopkins, K., Saraceni-Richards, C.,
Sciaky, D., King, B.L., Wiegers, T.C., Mattingly, C.J.: The
Comparative Toxicogenomics Database’s 10th year anniversary: update
2015. Nucleic Acids Research 43(D1), 914–920 (2015).
doi:10.1093/nar/gku935
48. Tran, N.H., Choi, K.P., Zhang, L.: Counting motifs in the human
interactome. Nature communications 4 (2013).
doi:10.1038/ncomms3241
49. Chen, Z., Lu, W.: Roles of Ubiquitination and SUMOylation on
Prostate Cancer: Mechanisms and Clinical Implications. International
Journal of Molecular Sciences 16(3), 4560–4580 (2015).
doi:10.3390/ijms16034560
50. McCubrey, J.A., Steelman, L.S., Chappell, W.H., Abrams, S.L., Wong,
E.W.T., Chang, F., Lehmann, B., Terrian, D.M., Milella, M., Tafuri,
A., Stivala, F., Libra, M., Basecke, J., Evangelisti, C., Martelli, A.M.,
Franklin, R.A.: Roles of the Raf/MEK/ERK pathway in cell growth,
malignant transformation and drug resistance. Biochimica et
Biophysica Acta (BBA) - Molecular Cell Research 1773(8), 1263–1284
(2007). doi:10.1016/j.bbamcr.2006.10.001. Mitogen-Activated Protein
Kinases: New Insights on Regulation, Function and Role in Human
Disease
51. Monteith, G.R.: Prostate cancer cells alter the nature of their calcium
influx to promote growth and acquire apoptotic resistance. Cancer cell
26(1), 19–32 (2014). doi:10.1016/j.ccr.2014.06.015
52. Flourakis, M., Prevarskaya, N.: Insights into Ca2+ homeostasis of
advanced prostate cancer cells. Biochimica et Biophysica Acta (BBA)
- Molecular Cell Research 1793(6), 1105–1109 (2009).
doi:10.1016/j.bbamcr.2009.01.009. 10th European Symposium on
Calcium
53. Kwon, E.M., Holt, S.K., Fu, R., Kolb, S., Williams, G., Stanford, J.L.,
Ostrander, E.A.: Androgen metabolism and JAK/STAT pathway genes
and prostate cancer risk. Cancer Epidemiology 36(4), 347–353 (2012).
doi:10.1016/j.canep.2012.04.002
54. Barton, B.E., Karras, J.G., Murphy, T.F., Barton, A., Huang, H.F.-S.:
Signal transducer and activator of transcription 3 (STAT3) activation
in prostate cancer: Direct STAT3 inhibition induces apoptosis in
prostate cancer lines. Molecular Cancer Therapeutics 3(1), 11–20
(2004)
55. Minelli, A., Bellezza, I., Conte, C., Culig, Z.: Oxidative stress-related
aging: A role for prostate cancer? Biochimica et Biophysica Acta
(BBA) - Reviews on Cancer 1795(2), 83–91 (2009).
doi:10.1016/j.bbcan.2008.11.001
56. Khandrika, L., Kumar, B., Koul, S., Maroni, P., Koul, H.K.: Oxidative
stress in prostate cancer. Cancer Letters 282(2), 125–136 (2009).
doi:10.1016/j.canlet.2008.12.011
57. Drewa, T., Wolski, Z., Skok, Z., Czajkowski, R., Wisniewska, H.: The
FAS-related apoptosis signaling pathway in the prostate intraepithelial
neoplasia and cancer lesions. Acta Poloniae Pharmaceutica 63(4),
311–315 (2006)
58. DiLella, A.G., Toner, T.J., Austin, C.P., Connolly, B.M.: Identification
of Genes Differentially Expressed in Benign Prostatic Hyperplasia.
Journal of Histochemistry and Cytochemistry 49(5), 669–670 (2001).
doi:10.1177/002215540104900517
59. Luo, J., Dunn, T.A., Ewing, C.M., Walsh, P.C., Isaacs, W.B.:
Decreased gene expression of steroid 5 alpha-reductase 2 in human
prostate cancer: Implications for finasteride therapy of prostate
carcinoma. The Prostate 57(2), 134–139 (2003).
doi:10.1002/pros.10284
60. Ribeiro, R., Monteiro, C., Silvestre, R., Castela, A., Coutinho, H.,
Fraga, A., Prıncipe, P., Lobato, C., Costa, C., Cordeiro-da-Silva, A.,
Lopes, J.M., Lopes, C., Medeiros, R.: Human periprostatic white
adipose tissue is rich in stromal progenitor cells and a potential source
of prostate tumor stroma. Experimental Biology and Medicine
237(10), 1155–1162 (2012). doi:10.1258/ebm.2012.012131
61. Thompson, V.C., Day, T.K., Bianco-Miotto, T., Selth, L.A., Han, G.,
Thomas, M., Buchanan, G., Scher, H.I., Nelson, C.C., Greenberg,
N.M., Butler, L.M., Tilley, W.D.: A gene signature identified using a
mouse model of androgen receptor-dependent prostate cancer predicts
biochemical relapse in human disease. Int J Cancer 131(3), 662–672
(2012). doi:10.1002/ijc.26414
62. Sampson, N., Ruiz, C., Zenzmaier, C., Bubendorf, L., Berger, P.:
PAGE4 Positivity Is Associated with Attenuated AR Signaling and
Predicts Patient Survival in Hormone-Naive Prostate Cancer. The
American Journal of Pathology 181(4), 1443–1454 (2012).
doi:10.1016/j.ajpath.2012.06.040
63. Shiraishi, T., Terada, N., Zeng, Y., Suyama, T., Luo, J., Trock, B.,
Kulkarni, P., Getzenberg, R.: Cancer/Testis antigens as potential
predictors of biochemical recurrence of prostate cancer following
radical prostatectomy. Journal of Translational Medicine 9(1), 153
(2011). doi:10.1186/1479-5876-9-153
64. Larsen, S., Yokochi, T., Isogai, E., Nakamura, Y., Ozaki, T.,
Nakagawara, A.: LMO3 interacts with p53 and inhibits its
transcriptional activity. Biochemical and Biophysical Research
Communications 392(3), 252–257 (2010).
doi:10.1016/j.bbrc.2009.12.010
65. Taylor, B.S., Schultz, N., Hieronymus, H., Gopalan, A., Xiao, Y.,
Carver, B.S., Arora, V.K., Kaushik, P., Cerami, E., Reva, B., Antipin,
Y., Mitsiades, N., Landers, T., Dolgalev, I., Major, J.E., Wilson, M.,
Socci, N.D., Lash, A.E., Heguy, A., Eastham, J.A., Scher, H.I., Reuter,
V.E., Scardino, P.T., Sander, C., Sawyers, C.L., Gerald, W.L.:
Integrative Genomic Profiling of Human Prostate Cancer. Cancer Cell
18(1), 11–22 (2010). doi:10.1016/j.ccr.2010.05.026
66. Cerami, E., Gao, J., Dogrusoz, U., Gross, B.E., Sumer, S.O., Aksoy,
B.A., Jacobsen, A., Byrne, C.J., Heuer, M.L., Larsson, E., Antipin, Y.,
Reva, B., Goldberg, A.P., Sander, C., Schultz, N.: The cBio Cancer
Genomics Portal: An Open Platform for Exploring Multidimensional
Cancer Genomics Data. Cancer Discovery 2(5), 401–404 (2012).
doi:10.1158/2159-8290.CD-12-0095
67. Verleyen, W., Ballouz, S., Gillis, J.: Positive and negative forms of
replicability in gene network analysis. Bioinformatics 32(7), 1065–1073
(2016). doi:10.1093/bioinformatics/btv734
Supplementary MaterialFunctional networks inference from machine learning models
Nicola Lazzarini, Paweł Widera, Stuart Williamson, Rakesh Heer, Natalio Krasnogor and Jaume Bacardit
1 Classification accuracy under feature selection
To choose the default percentage of attributes retained in the feature selection procedure, we performed apreliminary analysis using all 8 transcriptomic datasets from the main article. We evaluated how the Bio-HEL classification accuracy changes with the number of selected features (using linear SVM-RFE). The ac-curacy was measured using a standard 10 cross-fold validation. The full experiment (not reported here) used100%, 90%, 80%, ..., 10% of the original dataset attributes.
We found that even when only 10% of the attributes are retained, the classification accuracy remains almostunchanged. Specifically, with 10% of the original attributes the accuracy increased for 2 datasets, slightlydecreased for 3 datasets and remained unaltered for the other datasets. The exact results are reported inSupplementary Table 1 below:
Supplementary Table 1: BioHEL classification accuracy for each dataset, in 10-fold cross-validation experiments onthe original and reduced set of attributes (before and after the feature selection). Linear SVM-RFE was used to selectbest 10% of the attributes.
dataset all attributes 10% attributesDlbcl 0.871 0.886CNS 0.473 0.451Leukemia 0.945 0.945Lung-Michigan 0.980 0.980Lung-Harvard 0.978 0.964Prostate 0.892 0.892AML 0.637 0.592Colon-Breast 0.903 0.940
Given that we were able to maintain good classification accuracy despite large reduction in number of usedattributes, we decided to use 10% attributes as a default setting for the FuNeL feature selection procedure.However, this FuNeL parameter is under the user control and the default setting can be changed.
2 Time complexity
The FuNeL protocol has four stages (see Figure 2 in the main article): (1) feature selection (optional), (2)rule-based network generation, (3) permutation test and (4) second rule-based network generation (optional).
The running time for the whole pipeline depends on the rule set generation time (execution time of BioHEL),as the optional feature selection stage can be seen as running in constant time. Two main factors that influencethe rule set generation time are: (1) the number of attributes and (2) the number of samples.
We performed an execution time analysis of BioHEL using the largest (in terms of number of attributes) Colon-Breast dataset (Chowdary et al., 2006). In the feature selection stage we retained: 20, 200, 2000, 10 000 and20 000 attributes. From each of these 5 datasets we generated 100 random subsets of 50, 40, 30, 20 and 10samples. Finally, we ran BioHEL 1000 times to obtain 1000 rule sets for each dataset.
Supplementary Figure 1 shows the running times averaged across 100 000 runs (1000 runs for each of the 100datasets).
1
0 5000 10000 15000 20000attributes
0.5
1.0
1.5
2.0
2.5
3.0
3.5
4.0
4.5
seco
nds
BioHel execution times
50 samples40 samples30 samples20 samples10 samples
Supplementary Figure 1: Average execution times of a single BioHEL run for a given number of samples andattributes.
The total execution time of FuNeL configurations C1 and C2 is calculated as:
T1 = (rule_sets× t(atts1, samples)) + (permutation_runs× t(atts1, samples)) (1)
where rule_sets is the number of inferred rule sets, permutation_runs is the number of randomised datasetsused in the permutation test and t(atts1, samples) represents execution time of a single BioHEL run, thatlinearly depends on the size of a dataset measured in number of attributes and samples.
Configurations C3 and C4 require an additional run of BioHEL (step 4), and their total execution time is:
T2 = T1 + (rule_sets× t(atts2, samples)) (2)
where atts2 is the number of attributes after the permutation test (atts1 ≤ atts2).
It is important to notice that each run of BioHEL is independent, thus the generation of the rule sets can betrivially parallelised without any extra overhead. Given n computational cores, the total execution times couldbe reduced to:
Treal1 = T1
nTreal2 = T2
n(3)
2
3 Comparison of networks topological properties
The network topology refers to the spatial arrangements of its elements. The analysis of topological propertiestells us how different nodes are connected to each other and how their communication paths look like. Thereare many aspects and characteristics that can be evaluated in a network. For simplicity we report just fourmetrics: number of nodes, number of edges, clustering coefficient and diameter.
The clustering coefficient is a measure of degree to which nodes in a network tend to cluster together. Itexpresses the likelihood that any two nodes with a common neighbour are themselves connected. The diameterindicates the maximum distance between two nodes in the network.
We compared the topology of the networks built with two different approaches: co-prediction and co-expression.For each generated network we calculated the topological properties described above. We compare the FuNeLnetworks with co-expression networks inferred with different methods. The Supplementary Tables 2 to 4 belowshow the results for the PCC, ARACNE and MIC networks.
Supplementary Table 2: Topological properties for co-prediction and PCC co-expression networks generated for all8 datasets.
Co-prediction Co-expression (SE) Co-expression (SN)Dataset Cat. C1 C2 C3 C4 SE(C1) SE(C2) SE(C3) SE(C4) SN(C1) SN(C2) SN(C3) SN(C4)
Leukemia
Nodes 421 1480 294 988 683 873 843 941 422 1482 293 979Edges 1529 2294 2154 2646 1529 2294 2154 2646 680 7145 409 2870Clust.Coef. 0.712 0.155 0.589 0.33 0.333 0.348 0.354 0.341 0.323 0.388 0.303 0.344Diameter 5 6 4 6 24 18 19 22 16 18 9 20
LungH
Nodes 429 1419 382 1030 578 930 955 1214 432 1413 384 1027Edges 1068 2317 2398 3410 1068 2317 2398 3410 617 4302 476 2650Clust.Coef. 0.344 0.298 0.43 0.404 0.356 0.373 0.376 0.372 0.341 0.386 0.296 0.376Diameter 5 8 5 7 10 23 23 21 6 22 6 23
LungM
Nodes 91 919 48 247 76 280 59 119 90 915 50 248Edges 134 1858 78 410 134 1858 78 410 224 13574 64 1510Clust.Coef. 0.379 0.262 0.418 0.457 0.465 0.525 0.446 0.514 0.539 0.523 0.493 0.511Diameter 3 5 3 3 6 11 6 6 5 14 6 12
CNS
Nodes 501 4257 494 3538 945 2152 1616 2607 501 4261 488 3532Edges 4302 25069 12769 40840 4302 25069 12769 40840 1553 171052 1502 90395Clust.Coef. 0.743 0.255 0.521 0.302 0.354 0.389 0.367 0.400 0.346 0.427 0.35 0.421Diameter 4 7 4 6 21 15 23 13 12 13 14 12
Dlbcl
Nodes 201 1699 201 1617 207 1411 1238 1790 200 1699 200 1614Edges 848 10471 7351 33170 848 10471 7351 33170 832 24280 832 17865Clust.Coef. 0.872 0.574 0.642 0.453 0.508 0.438 0.411 0.51 0.501 0.504 0.501 0.481Diameter 3 5 3 5 2 16 17 14 2 14 2 15
GSE2191
Nodes 890 4802 846 3561 837 1848 1239 1750 897 4799 839 3553Edges 3290 13424 6469 12074 3290 13424 6469 12074 3711 90410 3292 47806Clust.Coef. 0.488 0.082 0.317 0.291 0.377 0.409 0.394 0.4 0.382 0.415 0.377 0.417Diameter 5 9 5 9 25 23 19 21 21 13 25 15
GS3726
Nodes 668 2077 524 1170 879 1300 992 1367 759 2300 1739 3440Edges 1761 3255 2051 3502 1761 3255 2051 3502 1471 9808 5479 26761Clust.Coef. 0.134 0.0077 0.307 0.109 0.226 0.23 0.223 0.233 0.213 0.287 0.254 0.346Diameter 8 10 7 8 20 26 20 25 26 15 19 15
Prostate
Nodes 938 4290 704 2277 356 543 322 448 920 4298 702 2287Edges 3796 10175 3090 6546 3796 10175 3090 6546 33250 914829 16934 24427Clust.Coef. 0.328 0.245 0.29 0.25 0.565 0.607 0.541 0.584 0.655 0.703 0.641 0.711Diameter 7 10 6 8 6 9 5 8 8 11 7 12
3
Supplementary Table 3: Topological properties for co-prediction and ARACNE co-expression networks generated forall 8 datasets.
Co-prediction Co-expression (SE) Co-expression (SN)Dataset Cat. C1 C2 C3 C4 SE(C1) SE(C2) SE(C3) SE(C4) SN(C1) SN(C2) SN(C3) SN(C4)
Leukemia
Nodes 421 1480 294 988 1024 1426 1356 1577 422 1480 294 989Edges 1529 2294 2154 2646 1529 2294 2154 2646 512 2416 327 1479Clust.Coef. 0.712 0.155 0.589 0.330 0.002 0.002 0.002 0.002 0.000 0.002 0.000 0.002Diameter 5 6 4 6 17 19 22 19 9 17 11 19
LungH
Nodes 429 1419 382 1030 907 1614 1653 2066 429 1419 382 1030Edges 1068 2317 2398 3410 1068 2317 2398 3410 435 1924 375 1250Clust.Coef. 0.344 0.298 0.430 0.404 0.007 0.006 0.006 0.005 0.013 0.006 0.012 0.007Diameter 5 8 5 7 23 16 15 13 14 18 10 18
LungM
Nodes 91 919 48 247 143 1321 96 370 91 920 48 247Edges 134 1858 78 410 134 1858 78 410 72 1127 34 259Clust.Coef. 0.379 0.262 0.418 0.475 0.000 0.002 0.000 0.009 0.000 0.005 0.000 0.014Diameter 3 5 3 3 13 17 11 18 11 17 5 11
CNS
Nodes 501 4257 494 3538 2002 4509 3581 5342 502 4257 494 3538Edges 4302 25069 12769 40840 4302 25069 12769 41661 513 20409 505 12358Clust.Coef. 0.743 0.255 0.521 0.302 0.004 0.005 0.006 0.026 0.004 0.005 0.004 0.005Diameter 4 7 4 6 12 8 12 7 20 9 20 12
Dlbcl
Nodes 201 1699 201 1617 380 1452 1191 2236 201 1699 201 1617Edges 848 10471 7351 33170 848 10471 7351 33890 269 14149 269 12903Clust.Coef. 0.872 0.574 0.642 0.453 0.136 0.126 0.140 0.176 0.113 0.110 0.113 0.115Diameter 3 5 3 5 13 9 11 5 12 8 12 9
GSE2191
Nodes 890 4802 846 3561 2574 5226 3846 5027 890 4802 846 3561Edges 3290 13424 6469 12074 3290 13424 6469 12076 846 10671 794 5564Clust.Coef. 0.488 0.082 0.317 0.291 0.002 0.002 0.002 0.002 0.004 0.002 0.004 0.002Diameter 5 9 5 9 19 13 16 13 30 15 30 17
GS3726
Nodes 668 2077 524 1170 1362 2167 1546 2279 668 2166 524 1170Edges 1761 3255 2051 3502 1761 3255 2051 3502 787 3250 597 1455Clust.Coef. 0.134 0.077 0.307 0.109 0.024 0.053 0.029 0.050 0.014 0.053 0.016 0.021Diameter 8 10 7 8 20 20 19 18 15 20 13 19
Prostate
Nodes 938 4290 704 2277 2760 6805 2268 4575 939 4290 704 2277Edges 3796 10175 3090 6546 3796 10175 3090 6546 1300 6095 1017 3102Clust.Coef. 0.328 0.245 0.290 0.250 0.005 0.003 0.006 0.003 0.001 0.003 0.002 0.005Diameter 7 10 6 8 13 13 15 13 12 13 9 15
4
Supplementary Table 4: Topological properties for co-prediction and MIC co-expression networks generated for all 8datasets.
Co-prediction Co-expression (SE) Co-expression (SN)Dataset Cat. C1 C2 C3 C4 SE(C1) SE(C2) SE(C3) SE(C4) SN(C1) SN(C2) SN(C3) SN(C4)
Leukemia
Nodes 421 1480 294 988 640 807 780 896 421 1480 294 989Edges 1529 2294 2154 2646 1529 2294 2155 2647 749 6173 432 3096Clust.Coef. 0.712 0.155 0.589 0.330 0.162 0.182 0.180 0.179 0.138 0.180 0.127 0.175Diameter 5 6 4 6 18 29 27 29 10 17 8 18
LungH
Nodes 429 1419 382 1030 384 685 703 944 429 1419 382 1030Edges 1068 2317 2398 3410 1068 2317 2399 3410 1264 5867 1045 3841Clust.Coef. 0.344 0.298 0.430 0.404 0.349 0.308 0.305 0.302 0.339 0.282 0.343 0.305Diameter 5 8 5 7 9 13 13 17 7 18 9 19
LungM
Nodes 91 919 48 247 118 626 79 219 91 919 48 247Edges 134 1858 78 410 134 1858 78 410 93 3109 38 484Clust.Coef. 0.379 0.262 0.418 0.475 0.212 0.272 0.213 0.306 0.208 0.235 0.153 0.302Diameter 3 5 3 3 8 18 7 8 6 14 3 7
CNS
Nodes 501 4257 494 3538 1424 3104 2357 3725 501 4257 495 3538Edges 4302 25069 12769 40840 4305 25131 12771 40850 704 62208 694 36027Clust.Coef. 0.743 0.255 0.521 0.302 0.124 0.154 0.144 0.159 0.089 0.162 0.091 0.161Diameter 4 7 4 6 17 11 12 10 11 10 11 10
Dlbcl
Nodes 201 1699 201 1617 475 1140 1047 1453 203 1699 203 1617Edges 848 10471 7351 33170 848 10471 7362 33172 196 74773 196 59307Clust.Coef. 0.872 0.574 0.642 0.453 0.111 0.240 0.219 0.319 0.082 0.381 0.082 0.366Diameter 3 5 3 5 21 13 16 15 11 11 11 11
GSE2191
Nodes 890 4802 846 3561 1700 4129 2617 3883 890 4803 846 3563Edges 3290 13424 6469 12074 3299 13433 6469 12207 1380 17797 1271 10540Clust.Coef. 0.488 0.082 0.317 0.291 0.109 0.095 0.098 0.098 0.120 0.095 0.118 0.099Diameter 5 9 5 9 22 15 18 16 19 15 21 15
GS3726
Nodes 668 2077 524 1170 1271 1921 1357 1996 672 2152 526 1172Edges 1761 3255 2051 3502 1890 3261 2056 3524 852 3921 538 1705Clust.Coef. 0.134 0.077 0.307 0.109 0.110 0.100 0.104 0.100 0.126 0.099 0.121 0.117Diameter 8 10 7 8 23 29 23 28 14 24 14 24
Prostate
Nodes 938 4290 704 2277 687 839 667 773 964 4290 712 2277Edges 3796 10175 3090 6546 3981 10186 3777 8257 15928 1763794 5254 308709Clust.Coef. 0.328 0.245 0.290 0.250 0.167 0.278 0.169 0.265 0.313 0.758 0.218 0.661Diameter 7 10 6 8 7 8 7 8 7 8 7 9
When analysing FuNeL networks we observed, as expected, that configurations having feature selection (C1 andC3) lead to networks with a smaller number of nodes than when the original set of attributes is used (C2 andC4). Furthermore, the second phase of machine learning modeling (C3 and C4) tends to reduce the number ofnodes as it uses a reduced set of attributes as input (only significant nodes and their neighbours from the firsttraining phase), while increasing both clustering coefficient and number of edges.
When comparing FuNeL and co-expression networks we notice that the ARACNE SE counterparts have ingeneral more nodes. The same patter can be found in SE(C2 and SE(C4) counterparts generated with PCCand MIC, while it’s not true for the SE-networks based on configurations that use feature selection (C1 andC2). Conversely, SN -networks differ according to the inference method used. In fact ARACNE generated SNcounterparts with less edges, while this is true only for SN(C1) and SN(C3) inferred with MIC and PCC.The clustering coefficient is constantly lower in ARACNE networks than in FuNeL, this is probably due to thepruning phase operated by the method. A similar trend can be noticed for MIC networks with some exceptions(e.g Prostate SN(C2) and SN(C4)). A more balanced situation occurs when FuNeL is contrasted with PCC,in fact networks generated with feature selection (C1 and C3) have a lower coefficient than their co-expressioncounterparts. Finally a clear pattern emerge when analysing the diameter of the networks. Co-predictionnetworks are always more compact than co-expression counterparts having up to 3 time lower diameter for MICand PCC and up to 7 time lower for ARACNE.
5
4 Enrichment Score analysis
In this section we report the network average rankings, based on the Enrichment Score, across the 8 datasetsfor each inferring method. The networks are ranked between 1 and N (where N = 4 for FuNeL and N = 8for PCC, ARACNE and MIC: 4 SE(Ci) + 4 SN(Ci)). We considered Gene Ontology terms (biological process(BP), molecular function (MF) and cellular component (CC)) and biological pathways. The last row of eachtable represents the average rank across different biological categories.
Supplementary Table 5: Average network ranks for each method across the 8 datasets (based on ES).
Cat. C1 C2 C3 C4
GO BP 3 4 2 1GO MF 4 2.5 1 2.5GO CC 2 4 1 3Pathways 4 2 3 1
Average 3.25 3.125 1.75 1.88
(a) FuNeL networks
PCC (SE) PCC (SN)Cat. C1 C2 C3 C4 C1 C2 C3 C4
GO BP 2 4 1 3 6 7 5 8GO MF 8 3.5 5 6 7 3.5 1 2GO CC 2 5 4 6 1 8 3 7Pathways 6 5 4 3 7 1 8 2Average 4.5 4.38 3.5 4.5 5.25 4.88 4.25 4.75
(b) PCC networks
ARACNE (SE) ARACNE (SN)Cat. C1 C2 C3 C4 C1 C2 C3 C4
GO BP 4 7 5.5 8 2 5.5 1 3GO MF 4 8 4 7 2 6 1 4GO CC 3 7 5 8 2 6 1 4Pathways 1 4 2 6 7 5 8 3Average 3 6.5 4.13 7.25 3.25 5.63 2.75 3.5
(c) ARACNE networks
MIC (SE) MIC (SN)Cat. C1 C2 C3 C4 C1 C2 C3 C4
GO BP 2 6 4 5 3 8 1 7GO MF 1 6 4 7 3 8 2 5GO CC 3 5.5 4 5.5 2 8 1 7Pathways 2.5 4 1 5.5 8 5.5 7 2.5Average 2.13 5.38 3.25 5.75 4 7.38 2.75 5.38
(d) MIC networks
We also compared the generated networks against FuNeL. The networks are ranked from 1 to 12: 4 Ci + 4SE(Ci) + 4 SN (Ci). The ranks in Supplementary Table 6 are averaged across the 8 datasets, for each biologicalcategory and for each network. The row-wise rank is given in brackets and the highest ranks are shown withbold font. The following abbreviations were used for GO categories: biological process (BP), molecular function(MF) and cellular component (CC).
Supplementary Table 6: Average network ranks (co-prediction vs. co-expression).
Co-prediction Co-expression (SE) Co-expression (SN)Method Cat. C1 C2 C3 C4 SE(C1) SE(C2) SE(C3) SE(C4) SN(C1) SN(C2) SN(C3) SN(C4)
PCC
GO BP 6.06 (6) 7.00 (7.5) 7.00 (7.5) 5.88 (3.5) 5.88 (3.5) 5.88 (3.5) 5.12 (1) 5.88 (3.5) 7.06 (9) 7.75 (12) 7.12 (10) 7.38 (11)GO MF 7.81 (11) 5.38 (2) 6.19 (5) 5.62 (3) 9.12 (12) 6.50 (7.5) 6.50 (7.5) 7.38 (10) 7.19 (9) 6.25 (6) 4.31 (1) 5.75 (4)GO CC 4.31 (5) 11.00 (12) 4.19 (4) 9.00 (10) 3.88 (2) 6.25 (6.5) 6.25 (6.5) 8.12 (8) 3.19 (1) 9.38 (11) 4.06 (3) 8.38 (9)Pathways 8.12 (10.5) 4.75 (2) 8.12 (10.5) 4.38 (1) 6.94 (8) 6.50 (6) 6.69 (7) 5.88 (5) 7.62 (9) 5.00 (3) 8.50 (12) 5.50 (4)
ARACNE
GO BP 6.69 (7) 6.25 (5.5) 6.94 (10) 4.75 (1) 6.25 (5.5) 8.12 (11) 6.88 (8.5) 9.25 (12) 5.19 (3) 6.88 (8.5) 5.06 (2) 5.75 (4)GO MF 7.44 (10) 6.50 (8) 6.19 (6.5) 5.69 (3) 6.00 (5) 8.75 (12) 5.62 (2) 8.62 (11) 5.81 (4) 7.00 (9) 4.19 (1) 6.19 (6.5)GO CC 4.31 (4) 10.75 (12) 3.44 (3) 8.25 (8) 5.50 (5) 9.38 (10) 6.75 (7) 10.12 (11) 2.44 (2) 8.88 (9) 2.31 (1) 5.88 (6)Pathways 7.88 (10.5) 5.38 (3) 7.88 (10.5) 5.25 (2) 4.88 (1) 6.25 (6) 5.50 (4) 7.00 (8) 7.56 (9) 6.50 (7) 8.38 (12) 5.56 (5)
MIC
GO BP 7.44 (8.5) 7.88 (11) 7.44 (8.5) 6.38 (5.5) 4.12 (2) 7.00 (7) 6.00 (4) 6.38 (5.5) 4.81 (3) 9.12 (12) 3.94 (1) 7.50 (10)GO MF 8.06 (10) 8.50 (12) 7.19 (8) 7.75 (9) 3.62 (1) 6.50 (5.5) 5.12 (3) 6.75 (7) 5.31 (4) 8.12 (11) 4.56 (2) 6.50 (5.5)GO CC 5.19 (6) 11.62 (12) 4.44 (4) 10.12 (10) 4.25 (3) 7.12 (7.5) 5.12 (5) 7.12 (7.5) 3.19 (2) 10.38 (11) 1.69 (1) 7.75 (9)Pathways 8.75 (12) 4.75 (1) 7.50 (10) 5.50 (3) 6.00 (5) 6.50 (6) 5.00 (2) 6.62 (7) 7.56 (11) 7.00 (9) 6.94 (8) 5.88 (4)
6
5 Disease association analysis
In this section we report the network average rankings across the 8 datasets for every inferring method basedon the gene-disease association properties: participation in triangular relationship and proximity. We used twosources for the disease associations: Malacards (Rappaport et al., 2013) (a meta-database of human maladiesconsolidated from 64 independent sources) and manually curated databases (OMIM (Hamosh et al., 2005),Orphanet (INSERM, 1997), Uniprot (Magrane and Consortium, 2011) and CTD (Davis et al., 2015)). Thenetworks are ranked between 1 and N (where N = 4 for FuNeL and N = 8 for PCC, ARACNE and MIC: 4SE(Ci) + 4 SN(Ci)). The number of disease-associated genes participating in a triangle is denoted as 1A, 2Aand 3A. The last row of each table represents the average rank across different metrics.
Supplementary Table 7: Average network ranks for each method across the 8 datasets (based on diseaseassociations from Malacards).
Cat. C1 C2 C3 C4
1A 3 1 4 22A 4 1 2 33A 3 3 1 3Proximity 3 1 4 2
Average 3.25 1.5 2.75 2.5
(a) FuNeL networks
PCC (SE) PCC (SN)Cat. C1 C2 C3 C4 C1 C2 C3 C4
1A 8 5 6 3 4 1 2 72A 7 4 5 3 8 2 6 13A 6.5 6.5 6.5 6.5 3 2 4 1Proximity 1.5 5 3 1.5 7 6 8 4Average 5.75 5.13 5.13 3.5 5.5 2.75 5 3.25
(b) PCC networks
ARACNE (SE) ARACNE (SN)Cat. C1 C2 C3 C4 C1 C2 C3 C4
1A 4 8 7 5.5 2.5 2.5 5.5 12A 7 6 8 1 4 3 2 53A 5.5 1 5.5 2 5.5 5.5 5.5 5.5Proximity 7 3 5 1 6 2 8 4Average 5.88 4.5 6.38 2.38 4.5 3.25 5.25 3.88
(c) ARACNE networks
MIC (SE) MIC (SN)Cat. C1 C2 C3 C4 C1 C2 C3 C4
1A 6 3 7.5 1 5 2 7.5 42A 8 3 5 2 7 1 6 43A 7 2 8 6 5 1 3 4Proximity 6 1 2 5 7 4 8 3Average 6.75 2.25 5.63 3.5 6 2 6.13 3.75
(d) MIC networks
Supplementary Table 8: Average network ranks for each method across the 8 datasets (based on diseaseassociations from curated databases).
Cat. C1 C2 C3 C4
1A 3 1 4 22A 3 4 1 23A 1 2.5 2.5 4Proximity 4 1 3 2
Average 2.75 2.13 2.67 2.5
(a) FuNeL networks
PCC (SE) PCC (SN)Cat. C1 C2 C3 C4 C1 C2 C3 C4
1A 8 4 6.5 1.5 6.5 1.5 3 52A 2 7 5.5 3 4 5.5 1 83A 5 7 4 8 1 6 3 2Proximity 4 8 2.5 7 5 2.5 1 6Average 4.5 6.5 4.63 4.88 5.13 3.88 2 5.25
(b) PCC networks
ARACNE (SE) ARACNE (SN)Cat. C1 C2 C3 C4 C1 C2 C3 C4
1A 3 5 8 6 2 4 1 72A 5 1 2 3 7.5 6 7.5 43A 4.5 4.5 4.5 4.5 4.5 4.5 4.5 4.5Proximity 6 5 4 1 7.5 3 7.5 2Average 4.63 3.88 4.63 3.63 5.38 4.38 5.13 4.38
(c) ARACNE networks
MIC (SE) MIC (SN)Cat. C1 C2 C3 C4 C1 C2 C3 C4
1A 6 2 4 3 8 1 7 52A 7 3 8 1.5 5.5 1.5 5.5 43A 4 1 8 2 6 5 7 3Proximity 5.5 1 3 4 7 2 8 5.5Average 5.63 1.75 5.75 2.63 6.63 2.38 6.88 4.38
(d) MIC networks
7
6 Case study: prostate cancer dataset
In this section we report the additional results from the analysis performed using the prostate dataset (Singhet al., 2002) as a case study. In particular we show: 1) the overlap of enriched terms between co-prediction andco-expression networks, 2) the overlap between GO terms associated to the hubs of the networks generated withdifferent methods and FuNeL and 3) the average percentages of alteration for key nodes of both co-predictionand co-expression networks in an independent dataset.
6.1 Overlap of the enriched terms
We performed an analysis on the enriched terms of each network to highlight the complementary nature of theco-prediction and the co-expression paradigm. We generated heatmaps showing the unique terms associatedonly to co-prediction or co-expression networks. The main manuscript includes the comparison between FuNeLand PCC networks, in here we report the analysis performed considering ARACNE (Supplementary Figure 2)and MIC (Supplementary Figure 3) networks. For the sake of readability we filtered out the generic GO terms(with depth < 9 in the GO hierarchical structure).
Supplementary Figure 2: Number of non-common enriched GO terms (biological process) for eachnetwork configuration (generated from the prostate cancer dataset). On the x-axis we show the 12 investigatednetworks. On the y-axis we show the names of enriched terms unique to co-prediction or ARACNE co-expressionnetworks. Red terms are associated with co-expression networks, blue with co-prediction. Empty columns indicatenetworks with no unique terms.
Supplementary Figure 3: Number of non-common enriched GO terms (biological process) for eachnetwork configuration (generated from the prostate cancer dataset). On the x-axis we show the 12 investigatednetworks. On the y-axis we show the names of enriched terms unique to co-prediction or MIC co-expression networks.Red terms are associated with co-expression networks, blue with co-prediction. Empty columns indicate networks withno unique terms.
8
When comparing ARACNE and FuNeL, we found 16 unique pathways for co-prediction networks and 8 forco-expression. In terms of unique GO terms, the overlap was more balanced, 7 for co-prediction networks and9 for co-expression networks. C2 and C4, generated without feature selection, had the largest number of uniquepathways, while SE(C2) had the highest number of terms for ARACNE. The comparison of FuNeL with MICgenerated many empty columns (Supplementary Figure 3) for the GO terms because several networks resultedhaving no unique enriched terms. All the 15 unique GO terms related to MIC were associated to SN(C2) (andwith SN(C4) in two cases), conversely FuNeL had more networks sharing the 12 unique terms. Finally, asnoticed for in the ARACNE comparison, FuNeL networks are more enriched in biological pathways: 16 against8 unique terms for MIC co-expression.
6.2 Overlap of hub-related terms
We also analysed the gene associated to the hubs of each network in order to compare the biological knowledgeassociated to them. A node v was considered to be a hub if its degree was at least one standard deviation abovethe mean network degree, that is if:
d(v) > µd + σd (4)
where d(v) is a degree of the node v, and µd and σd are the mean and standard deviation of a network nodedegree distribution.
To compare the networks, we used the 10 most frequent GO terms (biological processes) shared among eachnetwork’s hubs. Supplementary Figures 4 to 6 show the terms-overlap analysis between FuNeL networks andPCC, ARACNE and MIC respectively. To make this analysis more specific we have discarded the most generic/ most common terms (which could be be associated with many genes), we considered only the GO termssituated at level 10 of the GO hierarchy or lower.
Blue terms were found only in co-prediction networks, red terms were found only in co-expression networks,and green terms were found in both. In Supplementary Table 9 we summarise the number of unique andcommon terms shared between networks created with different approaches. This analysis further highlights thecomplementary nature of co-prediction and co-expression approach, the terms that are paradigm-specif alwaysoutnumber the common ones.
Supplementary Table 9: Unique and common terms from networks’ hubs
Terms PCC ARACNE MICCo-prediction 16 18 16Co-expression 19 20 19Common 11 9 11
9
C1 C2 C3 C4
SE(C1)
SE(C2)
SE(C3)
SE(C4)
SN(C1)
SN(C2)
SN(C3)
SN(C4)
positive regulation of ERK1 and ERK2 cascade (GO:0070374)
TRIFdependent tolllike receptor signaling pathway (GO:0035666)
tolllike receptor TLR6:TLR2 signaling pathway (GO:0038124)
positive regulation of neuron projection development (GO:0010976)
epidermal growth factor receptor signaling pathway (GO:0007173)
tolllike receptor TLR1:TLR2 signaling pathway (GO:0038123)
positive regulation of cytosolic calcium ion concentration (GO:0007204)
tolllike receptor 2 signaling pathway (GO:0034134)
positive regulation of peptidyltyrosine phosphorylation (GO:0050731)
tolllike receptor 3 signaling pathway (GO:0034138)
activation of MAPKK activity (GO:0000186)
positive regulation of phosphatidylinositol 3kinase signaling (GO:0014068)
positive regulation of MAP kinase activity (GO:0043406)
regulation of intracellular pH (GO:0051453)
positive regulation of calcineurinNFAT signaling cascade (GO:0070886)
tolllike receptor 4 signaling pathway (GO:0034142)
MyD88dependent tolllike receptor signaling pathway (GO:0002755)
tolllike receptor 5 signaling pathway (GO:0034146)
tolllike receptor signaling pathway (GO:0002224)
positive regulation of the force of heart contraction (GO:0098735)
calcium ion transport (GO:0006816)
regulation of Rho protein signal transduction (GO:0035023)
positive regulation of sodium ion export from cell (GO:1903278)
positive regulation of collateral sprouting (GO:0048672)
negative regulation of activated T cell proliferation (GO:0046007)
regulation of proteasomal ubiquitindependent protein catabolic process (GO:0032434)
spliceosomal trisnRNP complex assembly (GO:0000244)
MyD88independent tolllike receptor signaling pathway (GO:0002756)
positive regulation of cAMP biosynthetic process (GO:0030819)
activation of MAPK activity (GO:0000187)
stimulatory Ctype lectin receptor signaling pathway (GO:0002223)
negative regulation of ERK1 and ERK2 cascade (GO:0070373)
transforming growth factor beta receptor complex assembly (GO:0007181)
activation of protein kinase A activity (GO:0034199)
negative regulation of neuron projection development (GO:0010977)
negative regulation of histone deacetylation (GO:0031064)
negative regulation of activationinduced cell death of T cells (GO:0070236)
activation of phospholipase C activity (GO:0007202)
regulation of skeletal muscle contraction by regulation of release of sequestered calcium ion (GO:0014809)
positive regulation of protein autophosphorylation (GO:0031954)
signal transduction involved in DNA damage checkpoint (GO:0072422)
regulation of cardiac muscle contraction by regulation of the release of sequestered calcium ion (GO:0010881)
positive regulation of peptidylserine phosphorylation (GO:0033138)
negative regulation of BMP signaling pathway (GO:0030514)
T cell receptor signaling pathway (GO:0050852)
positive regulation of glucocorticoid receptor signaling pathway (GO:2000324)
Supplementary Figure 4: Top 10 most frequent biological processes from Gene Ontology found in the network hubswhen comparing FuNeL and PCC co-expression networks.
10
C1 C2 C3 C4
SE(C1)
SE(C2)
SE(C3)
SE(C4)
SN(C1)
SN(C2)
SN(C3)
SN(C4)
stimulatory Ctype lectin receptor signaling pathway (GO:0002223)
TRIFdependent tolllike receptor signaling pathway (GO:0035666)
positive regulation of neuron projection development (GO:0010976)
epidermal growth factor receptor signaling pathway (GO:0007173)
tolllike receptor TLR1:TLR2 signaling pathway (GO:0038123)
positive regulation of cytosolic calcium ion concentration (GO:0007204)
tolllike receptor 2 signaling pathway (GO:0034134)
tolllike receptor 3 signaling pathway (GO:0034138)
activation of MAPKK activity (GO:0000186)
protein polyubiquitination (GO:0000209)
sodium ion export from cell (GO:0036376)
positive regulation of ubiquitinprotein ligase activity involved in regulation of mitotic cell cycle transition (GO:0051437)
DNA damage response, signal transduction by p53 class mediator resulting in cell cycle arrest (GO:0006977)
tolllike receptor 4 signaling pathway (GO:0034142)
negative regulation of telomere single strand break repair (GO:1903824)
tolllike receptor 5 signaling pathway (GO:0034146)
tolllike receptor 9 signaling pathway (GO:0034162)
calcium ion transport (GO:0006816)
anaphasepromoting complexdependent proteasomal ubiquitindependent protein catabolic process (GO:0031145)
activation of protein kinase activity (GO:0032147)
positive regulation of cAMP biosynthetic process (GO:0030819)
positive regulation of sodium ion export from cell (GO:1903278)
protein K11linked ubiquitination (GO:0070979)
positive regulation of activin receptor signaling pathway (GO:0032927)
positive regulation of adenylate cyclase activity involved in Gprotein coupled receptor signaling pathway (GO:0010579)
cellular iron ion homeostasis (GO:0006879)
MyD88independent tolllike receptor signaling pathway (GO:0002756)
MyD88dependent tolllike receptor signaling pathway (GO:0002755)
activation of MAPK activity (GO:0000187)
regulation of cardiac muscle contraction by regulation of the release of sequestered calcium ion (GO:0010881)
negative regulation of ERK1 and ERK2 cascade (GO:0070373)
transforming growth factor beta receptor complex assembly (GO:0007181)
activation of protein kinase A activity (GO:0034199)
negative regulation of neuron projection development (GO:0010977)
positive regulation of ERK1 and ERK2 cascade (GO:0070374)
positive regulation of peptidyltyrosine phosphorylation (GO:0050731)
negative regulation of histone deacetylation (GO:0031064)
negative regulation of activationinduced cell death of T cells (GO:0070236)
activation of phospholipase C activity (GO:0007202)
regulation of skeletal muscle contraction by regulation of release of sequestered calcium ion (GO:0014809)
positive regulation of protein autophosphorylation (GO:0031954)
signal transduction involved in DNA damage checkpoint (GO:0072422)
tolllike receptor TLR6:TLR2 signaling pathway (GO:0038124)
positive regulation of peptidylserine phosphorylation (GO:0033138)
negative regulation of BMP signaling pathway (GO:0030514)
T cell receptor signaling pathway (GO:0050852)
positive regulation of glucocorticoid receptor signaling pathway (GO:2000324)
Supplementary Figure 5: Top 10 most frequent biological processes from Gene Ontology found in the network hubswhen comparing FuNeL and ARACNE co-expression networks.
11
C1 C2 C3 C4
SE(C1)
SE(C2)
SE(C3)
SE(C4)
SN(C1)
SN(C2)
SN(C3)
SN(C4)
stimulatory Ctype lectin receptor signaling pathway (GO:0002223)
positive regulation of ERK1 and ERK2 cascade (GO:0070374)
TRIFdependent tolllike receptor signaling pathway (GO:0035666)
tolllike receptor TLR6:TLR2 signaling pathway (GO:0038124)
epidermal growth factor receptor signaling pathway (GO:0007173)
tolllike receptor TLR1:TLR2 signaling pathway (GO:0038123)
positive regulation of cytosolic calcium ion concentration (GO:0007204)
tolllike receptor 2 signaling pathway (GO:0034134)
tolllike receptor 3 signaling pathway (GO:0034138)
positive regulation of peptidylserine phosphorylation (GO:0033138)
activation of MAPKK activity (GO:0000186)
positive regulation of cysteinetype endopeptidase activity involved in apoptotic signaling pathway (GO:2001269)
positive regulation of dendrite extension (GO:1903861)
regulation of intracellular pH (GO:0051453)
positive regulation of adenylate cyclase activity involved in Gprotein coupled receptor signaling pathway (GO:0010579)
positive regulation of protein ubiquitination (GO:0031398)
MyD88dependent tolllike receptor signaling pathway (GO:0002755)
tolllike receptor 5 signaling pathway (GO:0034146)
tolllike receptor signaling pathway (GO:0002224)
tolllike receptor 9 signaling pathway (GO:0034162)
positive regulation of peptidylthreonine phosphorylation (GO:0010800)
calcium ion transport (GO:0006816)
positive regulation of cAMP biosynthetic process (GO:0030819)
negative regulation of activated T cell proliferation (GO:0046007)
positive regulation of histone acetylation (GO:0035066)
tolllike receptor 4 signaling pathway (GO:0034142)
positive regulation of proteasomal ubiquitindependent protein catabolic process (GO:0032436)
MyD88independent tolllike receptor signaling pathway (GO:0002756)
activation of JUN kinase activity (GO:0007257)
activation of MAPK activity (GO:0000187)
negative regulation of ERK1 and ERK2 cascade (GO:0070373)
transforming growth factor beta receptor complex assembly (GO:0007181)
activation of protein kinase A activity (GO:0034199)
positive regulation of neuron projection development (GO:0010976)
negative regulation of neuron projection development (GO:0010977)
positive regulation of peptidyltyrosine phosphorylation (GO:0050731)
negative regulation of histone deacetylation (GO:0031064)
negative regulation of activationinduced cell death of T cells (GO:0070236)
activation of phospholipase C activity (GO:0007202)
regulation of skeletal muscle contraction by regulation of release of sequestered calcium ion (GO:0014809)
positive regulation of protein autophosphorylation (GO:0031954)
signal transduction involved in DNA damage checkpoint (GO:0072422)
regulation of cardiac muscle contraction by regulation of the release of sequestered calcium ion (GO:0010881)
negative regulation of BMP signaling pathway (GO:0030514)
T cell receptor signaling pathway (GO:0050852)
positive regulation of glucocorticoid receptor signaling pathway (GO:2000324)
Supplementary Figure 6: Top 10 most frequent biological processes from Gene Ontology found in the network hubswhen comparing FuNeL and MIC co-expression networks.
12
6.3 Validation on independent dataset
In this section we report additional informations about the analysis performed using data from the independentprostate cancer study (Taylor et al., 2010) available in the cBioPortal for Cancer Genomics (Cerami et al., 2012).In particular we report the full list of alterations for the topologically important genes analysed in the mainarticle. The Supplementary Figure 7–14 show the percentage of altered tumour samples for top 10 hubs (nodeswith highest degree) and top 10 central nodes (with highest betweenness centrality) in the best performingnetworks according to the gene-disease association analysis (using the information from the curated databases).The selected networks are C2 for FuNeL, SN(C3) for PCC, SE(C4) for ARACNE and SE(C2) for MIC. Forall of them we report the alterations for both hubs and central nodes.
PTGDS 45%
PAGE4 28%
LMO3 11%
GSTM2 52%
NELL2 9%
COL4A6 65%
MAF 24%
ABL1 11%
RBP1 20%
PARM1 53%
Genetic Alteration Deep Deletion Missense Mutation mRNA Upregulation mRNA Downregulation
Supplementary Figure 7: Percentage of alterations in tumour samples from an independent cancer genomic study.Genes with highest degree (hubs) in C2 network are shown.
PTGDS 45%
PAGE4 28%
GSTM2 52%
LMO3 11%
MAF 24%
NELL2 9%
ABL1 11%
COL4A6 65%
PARM1 53%
MYH11 58%
Genetic Alteration Deep Deletion Missense Mutation mRNA Upregulation mRNA Downregulation
Supplementary Figure 8: Percentage of alterations in tumour samples from an independent cancer genomic study.Genes with highest betweenness centrality (central nodes) in C2 network are shown.
13
PCBP3 5%
HAUS5 19%
PSG1 16%
LGALS9 9%
KAT7 5%
POU4F1 1%
SLC9A1 44%
PAX8 4%
ODF1 20%
BCR 19%
Genetic Alteration Amplification Deep Deletion mRNA Upregulation mRNA Downregulation
Supplementary Figure 9: Percentage of alterations in tumour samples from an independent cancer genomic study.Genes with highest degree (hubs) in PCC SN(C3) network are shown.
SBNO2 22%
CADM4 25%
WBSCR22 39%
FOXE1 11%
KAT7 5%
POU4F1 1%
PSG1 16%
PCBP3 5%
DHRS1 4%
HAUS5 19%
Genetic Alteration Amplification mRNA Upregulation mRNA Downregulation
Supplementary Figure 10: Percentage of alterations in tumour samples from an independent cancer genomic study.Genes with highest betweenness centrality (central nodes) in PCC SN(C3) network are shown.
IL10RA 16%
THRA 33%
ALAS2 4%
SAMD14 1%
RHBDL1 8%
PSG1 16%
DDX11 14%
GNAS 13%
BTF3 13%
KIFC3 5%
Genetic Alteration Amplification mRNA Upregulation mRNA Downregulation
Supplementary Figure 11: Percentage of alterations in tumour samples from an independent cancer genomic study.Genes with highest degree (hubs) in ARACNE SE(C4) network are shown.
14
IL10RA 16%
THRA 33%
PSG1 16%
SAMD14 1%
BTF3 13%
GNAS 13%
ALAS2 4%
RHBDL1 8%
MYBPC3 4%
DDX11 14%
Genetic Alteration Amplification mRNA Upregulation mRNA Downregulation
Supplementary Figure 12: Percentage of alterations in tumour samples from an independent cancer genomic study.Genes with highest betweenness centrality (central nodes) in ARACNE SE(C4) network are shown.
WBSCR22 39%
ADAM15 12%
GP2 7%
COPS6 14%
RIMS2 16%
ATP1B2 18%
CEP170B 21%
POU4F1 1%
HAUS5 19%
PCBP3 5%
Genetic Alteration Amplification Deep Deletion mRNA Upregulation mRNA Downregulation
Supplementary Figure 13: Percentage of alterations in tumour samples from an independent cancer genomic study.Genes with highest degree (hubs) in MIC SE(C2) network are shown.
ADAM15 12%
WBSCR22 39%
COPS6 14%
CEP170B 21%
CPNE6 34%
GP2 7%
ATP1B2 18%
EFNA3 9%
UBE3B 13%
PAX8 4%
Genetic Alteration Amplification Deep Deletion mRNA Upregulation mRNA Downregulation
Supplementary Figure 14: Percentage of alterations in tumour samples from an independent cancer genomic study.Genes with highest betweenness centrality (central nodes) in MIC SE(C2) network are shown.
15
7 Visualisation of the co-prediction and co-expression networks
In this section we include the layouts of co-prediction and co-expression networks for three of the datasets usedin the main article: Prostate (used in the case study) and Lung-Michigan (small networks). Only a few examplesare shown, for which the differences in the topology of the networks generated with the two approaches is themost visible. The networks were visualised using the Organic Layout in Cytoscape (Shannon et al., 2003).
Prostate: FuNeL – C1 Prostate: FuNeL – C3
HIST1H2BM
CXorf40B
H1F0
NAB1
NOLC1KDELR1
NR2F6
GPNMB
GUSB IGL@
PHLDA2
EPS8
IL18
SPRY1 CDKN1C
RBPMS
HYAL2
NCF4
MAPKAPK5
NDUFB8ARHGAP22
NPYACSL1
PEBP1
TPT1
OCRL
H2AFY
TSPYL2
CR1
DGKZHLA-DRA
NAGLU
BRE
PRYDNAJB5
ETV5
MATN2
TNFSF10LEPROT
CPMIFI30
HLA-DOA
GPR125
SERPINE1
SOX9
BIK
SFRS2B
PARP1
BOLA2
IL6ST
IGH@
ATP5A1
TOM1L2
TSTA3
MAOB
HLA-DQB1
OLIG2
DIO2
TBXA2R
CRYAA
CUL7
ITGA3
EXTL2
ATP1B1
LY6D
CRABP2
PRKAR1B
ERCC1
SKIV2L2
FMO3
FAM107A
MDM4
LY96
TARP
NECAB3
MDN1
CTNNA1
RIF1
SLC19A1
HIST1H2AL
NCK1CSNK1D
DHX8
TRIP6SC65
ARPC5
RABGGTA
FOLH1PPP1R2
MID2
PPP5C
RRAS
RPS11ANXA6
SHMT2
HBA1
PLCL2
MMP3
MYO6
CTAGE5KIAA1279
SMARCA4KRT17
APOE
VEGFAPHF21A
ARL4DYWHAZRNPS1
BCKDHB
NELL2
SSBP2KIAA1109
ALDH2
TP63
MIPEP
SWAP70
VAT1
LSM12TSPAN1
DEFB1ZZEF1
PTPN13
CD53
RBBP4
SLC25A40
COL6A2INPP5DATP1B3
SPOCK1
CASP2
FECH PRKAR1A
PMP22
DSPPRIM2HMGB2
BLK
FKBP9SMARCE1
COL1A2
GPRC5A
CAPZB
HPCA
SRGAP3
SLC9A7 HBBELF3
MCL1
NEFM
SHROOM2
COL7A1
ANPEP
PARVB
SLC7A8MPRIPUCNAHCYL1
SNTB2GATA2
GNMT
VEZF1
INPP1
SPRR1A
LILRB4
LIMS1
NSFL1C
PSCAIRF3
TFPI ATP9AXBP1
TTBK2
MAP2
IFI44IGHD
TBC1D4PPBP
KRT13
OBSL1C6orf145
DBI
NRP1
ITGA6
CLSTN1HOMER2RUNX1T1
MT1B
NAMPT
RIN2 BECN1HLA-JPIK3R2ABI1ALDH4A1
EIF2S3KRAS
CREB3L1
HNRNPH1
S1PR1DOC2B
CDC2L2
NUCB1GORASP2
ADK
LOC100134331
CCND2MEIS2 BMI1
ZNF37AWSB2
RRM1
TRMT1
TMEM106CAMFR
TFF1
NBL1
TMSB4X
FMODCALM1
GALR3
RBP1GNB2L1P4HB
RCAN1
TMEM47
HLA-ASKIL
ME1HOXC6
C6orf54MAPKAPK3
LGALS1
CXCL12PAGE4
SNCGFEM1C
NDUFV2BCL2 CANXATP6V0A2RAP1GAP
FUBP3ARHGAP1
UGP2
ARG2
AOX1SGCD
KRT10 LDOC1
SMAD5CALD1
KRT15
RBBP6IQCK
SEPT10NKX2-5
VCAN
FBLN1
RAB27A
RGS10
HLA-DRB4
DNASE1L3
GRK6MDC1MAF PSG11
SERBP1 FN1
C7 PTGDS
ITGB3BP
TXNIP
PRKRA
IGF1 PTPRC
CDS1COL13A1
GSTP1GPD1L
ATP6V0D1BMP2
TIMM17A
PKD2
LYZ
UBE3BTCP1HLA-DPA1
ZNF133
ANXA4
KRT19CBR4
ERBB3 COX5A
PTH
THOP1
PWP2
CFDALDH9A1
LDLRMTHFD2
NCAPD3
TSPAN31
IL10RB
DFFA
GCLC
ARHGEF18
TARDBP
CDC42BPAEMP2
GPX1
TACSTD2
BGN
RHCE
PLXNB2
TMSB15A
ACYP1
CCL14
DOPEY2
FKBP4
MT1E
TRAF4
PTPRG
NXPH4
FAM114A1
ABP1
ATF5
NPHP4
LCP1TRO
DPTLDHA
CIT
STK17A
P2RX7
DUSP2
PPPDE2
MAP3K8
BRD3RNASE4
SCN2BUQCRQ
SMR3B
TBC1D2BALDH1B1
C1QBPDEAF1
GGA3
CASP10
POU2F3ATP6V1E1
RABGAP1BTG2
MT3
SPINK4
ULK1
BUD31CHKB
LTBP4
FBXL2SIVA1
AHNAK
NCAPD2PDGFRL
PYGB
EFEMP1TAPBPTK2 RAP1B
SLC6A15DLGAP2
DPYSFAM91A2
LRP10XRCC6
ILK
AGR2
NFKBIA
STC1
FES
COX7C
PPAP2B
SFRS16
SLC25A6
CFDP1MYL9
ROR2
MAPK6
CCND1POP5
FADD
ZP3KIAA0495
TLX1
ID1 LITAF
DUT RAB1A
ROCK1BTF3 KIAA0182
FOXO1
CSRP2
HNRNPDTOM1
KIAA1012
CUL5
ARHGAP5
YWHAQ
HADH
CYP3A5KRT5
SNRPE
SPON1
PFDN6NAP1L1RMND5A
UBE2B
STAT5B
ANXA3RALYL
VGLL4
CARD8
PDGFRA
SLC15A2
ALCAM
SEPT8
DHX15DPM1
PCBP2
IL1R1
USP9XRPS3
MRPS27
PTRF
DUSP1
FBXL7
DST
ACTG1
RYBP
GRAP2 DNMT1TSPAN7MYOF PAX5
SYPL1 SERPINF1LAMA2
NACA
GSTA4 AP1S2
EIF4EBP2DUSP5 CLCA1
BNIP3LUSP10
ACSM3
PCNA SAT1
LTF
ARFGAP3
DUSP3
GSTA2
FZD6
TBX6
PPIL2 ERCC2
MAP7
ATP2C1
KCNMA1
HSP90AA1RUNX2
INSR
NFATC4RNF41
LDHBMARK3
GRK5
GUCY2FNELL1
COL16A1POLR2DOVOL1
ITGA4
RPS16
RPL28
TRAK1COPB2
STOML2
GPD1
MGST3
MARCKSL1
BST2
PCGF1
SPINK1PLXDC1
KIAA0040
PRDM2
COPS2CPA3
CAPG
GPR39
USP13
TYMS
CALCOCO2
RPSA
HEPH
GSTM3
EIF2S1
ALDH3A2
DKFZP564O0823
IGFBP2
NCKAP1
BACH1
TM9SF2
PITX1
HDDC2
CCK
PALM
ALDOA
PRKDC
ABCA8
CELA3BRAB3GAP2
FXYD1POLD3ASCL1
TOB1
SERTAD2
EMP3 CEBPDASH2L
PIBF1GUCY1A3
RPS6KA5TNFRSF21
TBPL1 STACSEMA6C
TGFB3
ACAT2SMAD9
GNA15
TFIP11
LGALS2PLP2
ZNF410
PPM1H
PAMFADS1GPRASP1
TOR1ARAB11FIP3EPX WSB1
KDM5A
KCNB1
SIK1PNN
IQGAP2
ACTB
GSTM2
EPCAMCES1
INPP4A
HRSP12WIF1
BTG1
RPL10
WIPF1
RRAGA
SALL2RGS2
DHX29
NUP205
PGCP
ROCK2
N4BP2L2COL11A2
KANK1
VIM
ARFRP1NFIBIL11RA
EIF1B
SCGB1A1TFAP2A
CHM
ARL2BPCASKPCMT1
WNT5A
G3BP2
FAM65B
KCNAB1
ZNF268
HAUS3MEG3
MAN1A1
LRIG1
ATP6V1H
CCL25
SH3YL1
H1FX
FBP1
GTF2BZMPSTE24
FKBP2MPHOSPH9
PDXK
FKBP5
MANBA
MT1X
LYN
BMPR1A
BAG2
ACAA1
PTPN4
IPO5
HSPB11
BTN3A1
RPL8
LPCAT1TRBC1PDE3B
SET
PTPRCAP TOR1AIP1
SMPDL3A
MAL
G0S2
F13A1
GDF15
KCNS1
ASTN1
PDGFA
MAPK8SPOP
AUTS2
C5orf13
CSF2RB
MGAT5AHSA1
FAS
FUSIP1
ANKRD1HCFC1
EEF2UAP1PTTG1IP
XCL1
CAND1
RANGAP1
MMP7NFYB
SYNJ2HBEGFSETD4
SAP18
MYL1
GAPDH
MRC2
R3HCC1
LCK
HNRPDL
CKS1B
RPL19
BCAR3
COL4A3
GCHFRNME2 ABCC1
PGA3
BMP2KTRPV6
BBXPCLO PREP
SMURF1STX7
TAX1BP3
BCL6
PIM1
AZI1DENND3
ULK2
MAP4
ESD
OAT
C10orf116
R3HDM2
NFKB1
TGFBR3FAAH
PRSS23
PAH
RRAGB
LPGAT1
STMN2
ACTR2
STAT1
RPL23
SPTB
IRF1
DLG5
SLC35A1
HSPD1
RPL18A
NR3C2 LNPEP
CAV1
IARS2PLCH2
ATP5C1
MLH3
ANXA2P3ANKLE2
SMAD1
USP6
CDR1
CETN2
PPY
PIK3R3
CDC5L
HIST1H4C
CD38
TTRAP
CRABP1
RNF103SERPINB10
GCDH
BAT2D1
PDIA5
GMFB
VHL
UTP18ETHE1
SLC7A1
PCNX
CXADR
LTBP1
ATP6AP1
RAB11A
CCT5
AZGP1
SIM2
S100A4 DDHD2TAF2
LAMC1
KIAA0247
LPIN1
TNPO1
SPCS2HSD11B1
C19orf2MPPED2
SORL1ZHX2
GPM6B
FLAD1EHMT2
NTRK3GUCY2D
MT1H
FCN1
GOSR2
TMEM63A
ACO1PKIGTMED10
LGALS3BP
SAG
KLK2AMD1
NEFH
MAST4
CLEC3B
ATP6AP2RSU1PTMA
ACTG2XRCC5
PAK1
ADD3
IFT20
CLU
HLA-DPB1
IGFBP5
SPRY2
PEX3
TMEM87AMCM5PLAGL1
TPST1MAN2B2
KLHL21CBX5
RABEPK
CRELD1NBPF15 NR4A2
ID4
MT1GSATB1
NDUFS2TMX4
MTMR15RP11-345P4.4
RPL39L
ISCU
SH3GLB1
EEF1G
PTPRM
TUBB2A
C16orf45
RPLP0
LUM
NCBP2
TBL3
MTA1
FBXO9
RAC1
FADS2
PRKRIR
LMOD1
RPL9
LTB
MCM6CPE
PDLIM1
ZNF652
WNK1
CPD
DUSP6
KRT18
LMO3
TMEM123CYR61
KLK11CETN3HOXB13
BCAM
IFNG
COX7A1
PPT2CMAH
HSF2
SNX7
MGMTCTSE
CEACAM1
STAT3
ATP5F1TPBG
SORD
C4orf46
ALKBH1
ARID5B
CD81
RHOBTB1
YTHDC1
EPB41L3
CYP4F2
MTIF2
ZNF646
DNAJB6
NR4A3NDRG1
ALG13
HES2
CDT1RGS1
C6orf130
SMG5
CLDN3
AIM1
EFS
GALNT1
DHCR24
BRD2MAPK10
ABCC4
KIAA0907JAG2
IL1B
ALB
CRYM
PPP1R3C
SDC1HEBP2
MMETRA2AAQP3RFK
RND3
NAB2
ITGA1
SLC14A1PRKAR2B
SERPINB5
PPP2R1A
EEF1A2
DYNC1H1
CCNT2
GGCT CCNOCYBAPROSC
RTN4
ARHGEF1
RBPJ
GAS1
TPM1TSC22D3ZFP36L2
PRKCIMETTL7A
CADM1MSH6
WDR1
STAT6SH3BGRL
PKMYT1
EEF1A1SON
COL4A6
ANGPT1
F5
SFRS6
MYH11PFDN5
SFRP1
VAMP1CCR1
UBB
ATP1A1
NCR1
RAB11A
ACAA1
CCL14
INPP5D
DPM1
HEBP2
G0S2
RNASE4
PWP2 CPD
SPTB
BUD31
PALM
TMEM106C
NCF4
SFRS2B
ROR2
LNPEP
CYBA
FUSIP1
CDKN1CLITAF
RAB27A
TP63
NPY
GCLC R3HCC1
ATP5C1
TPBG
MAPK10
TFIP11
LSM12
PGCP
CANX
KRT17
STAT6
CCL25
INPP4A
NR2F6
RAB3GAP2
PCNA
TFPI
TOM1
LAMA2
EPB41L3
KDM5ARPSA
STK17A
PAMRTN4
P4HB
BTN3A1
NCBP2
SC65
SPRY1HNRNPH1
UCNTRA2A
RHCE
LYZ
ARFRP1
PPP2R1A
TOM1L2
RRAS
NR3C2
FUBP3
GUCY1A3
HIST1H2BMBMI1
HMGB2
CHM
SERBP1
ZNF268MGMT
ISCU
HOMER2FAM107A
LCP1
ATP6V1E1
TBC1D4
PRIM2
PPT2
KLHL21WNK1
SLC7A8
KDELR1SEPT8
HIST1H2AL
BRD3
KLK2 ETV5
PRKDC
CLU COL13A1
FKBP4
LMO3
ABCC4
SRGAP3NEFH
GGCT
SLC35A1
KRT18
ARL4D
IRF1
GPX1
ALBMME
GNB2L1
KCNMA1
ABP1
MT1E
FKBP9
STAT1
PKMYT1
GAPDH
SORD
DSP
SMG5
GALNT1
GUSB
IFI30
MID2
PARVB
EEF1A2PPM1H
PLAGL1
ATP6AP1
SOX9
CXADR
RPS6KA5
SATB1
ALCAM
HLA-J
TMSB15A
SNRPE
SGCD MT1H
IFI44
CPE
GDF15
HOXB13SPON1
SH3GLB1 ACO1
HNRPDL
PRYCIT
IQCK
XRCC6
BIK
BTG2
CLEC3B
TSPAN1
VEZF1 FOLH1 TPST1
AIM1
PPP5C
XBP1
SERTAD2
C6orf130
LTF
LY6D
TROSALL2
TARDBPPROSC
SPRR1A COPS2
AMD1
ACSL1
SHROOM2
ALKBH1
ITGA1TSPAN31
PPBP
PTPRGMIPEPCDS1
BTG1
HLA-DPA1TM9SF2 DEFB1
CASK
EFS
NDRG1
CDC2L2
CXCL12
IFNG
GSTA4
HES2
UQCRQ
GCHFR
CMAH
ACSM3
DUSP6
CHKB
MYH11TTBK2
HBB
NEFM
MAPKAPK3
PAGE4
ITGA4CLCA1
VCAN
ATP5F1
KIAA0040
SPINK4
PTGDS
SH3BGRLRND3
MARK3
MMP7
R3HDM2
DPT
KRT10
PRKAR1ASPOCK1 NCAPD3BCL2GSTP1
COL4A6
MT1G
NXPH4
TOR1AAP1S2
TMED10
PLXNB2
GPM6B
HCFC1
MDC1
NDUFV2
PCMT1TNFSF10
DLGAP2
LGALS3BP
GRK6CCNOPCBP2
RPLP0
SCN2B
CXorf40B
DST
NAB1
TMEM47MAP2
HDDC2TACSTD2
EEF1A1
CFDCAPZB
ZNF652
ATP9AC7 MCM5
CSF2RBPEX3
RSU1RUNX1T1
BMP2
FAAHPPY
YTHDC1C10orf116
CLSTN1
POP5ANGPT1
RHOBTB1
DHCR24
MMP3
IGL@
AZGP1
PLCH2
RAB11FIP3
CBR4
NTRK3
MTIF2
POLD3GPD1
PKD2
GATA2COL6A2ADD3
EXTL2
KCNS1
STMN2
THOP1ATP6V0D1
RP11-345P4.4
SLC25A40
AHNAK
ME1
NAMPT
NELL2PRKAR2BULK1
RRAGA
SFRS6TXNIP
MEIS2
CALCOCO2
RBBP4IL10RB
BLK
PIBF1UBB
GSTM3
TMX4
HLA-A
SDC1IPO5
GRK5
RNPS1
CCT5
UBE3B
MATN2
IGF1
CCND2
ASH2L
ASCL1
SWAP70
RABGAP1
MEG3RBP1
TRAK1CYR61
RIF1
LRP10
FAM114A1
DNASE1L3
LDOC1
DHX8
CTAGE5
COL11A2
SKIV2L2
CCR1
MRC2
PKIG
MT1B
ITGB3BP
ATP1B1S100A4
LY96
MRPS27
RAP1GAP
SON
ERBB3
RPS3
INPP1
PDGFRL
ACYP1
PLP2 BGN
HSP90AA1
SFRP1
RAB1A
PTH
MAN1A1
ID4
VGLL4
LGALS2
GPRC5ATLX1
S1PR1 CAPG
CDC5L
LOC100134331
PSCA
MGAT5
SCGB1A1
AQP3
NKX2-5
SPOPFBXL2
DBI
SNTB2ABCA8
TSTA3
OBSL1
HYAL2
BAG2BMP2K
MANBA
GALR3 RCAN1
ZP3FN1
WIF1
TRIP6
TMSB4X
NBL1
TBXA2R
DKFZP564O0823
CDC42BPASMAD5
FADD
HSD11B1
AOX1
EPS8
NAB2
CD38
SERPINE1
CPA3
CTSE
LYN
GUCY2FANKRD1 SNX7 TAF2
LEPROT
PHF21AMTA1 DUSP3
H1F0TGFBR3
CASP2
SORL1 ESD
TBX6
CUL5CBX5
LDHAPPIL2
GOSR2
ARHGAP1
CTNNA1 RFKIGFBP5 PIM1LTBP1
IFT20 ARHGAP5
CALD1TFAP2A
VAT1
OVOL1VIM
RPL23
SIVA1RPL39L
GPR39HBEGF
COL1A2FBXO9EHMT2 POLR2D
ZNF410
RANGAP1
PEBP1MCL1
RRAGBRMND5A
HOXC6
GPRASP1TK2
OCRL
TAX1BP3
AUTS2
ZNF133 SAGSH3YL1
PFDN6
COX7CAHCYL1
CLDN3
CARD8PAK1
RPL28SYPL1
LRIG1TPT1
ITGA6
KIAA1279
SSBP2DUSP1
LTBP4 ACTG1ADK
CRELD1 PFDN5
EPX
MAPK6
NFKBIAZFP36L2
FZD6
HRSP12
NBPF15 ANXA4
NDUFB8 USP10
SFRS16
DYNC1H1 HPCA
NFATC4SAP18
DNAJB5C1QBP
HADH
PAH
MPPED2
IRF3
MAP7
GPR125
G3BP2
IGH@
TRPV6
MPRIP
ALDH3A2
NUCB1
PITX1
HLA-DOA
MTMR15
APOECADM1
SMURF1
DIO2
ARID5B
LDHBKIAA0182
BST2
FKBP5
LPGAT1
N4BP2L2 POU2F3IGFBP2
COL4A3
DGKZFOXO1
RPL10
ALDH1B1
CCKTNPO1
MAP4
IGHD
AZI1
USP9X
HEPHTSPAN7DENND3
C6orf145
MAN2B2CDT1
DOC2BSYNJ2
LIMS1SLC14A1
WDR1
FAM65B
HLA-DRB4
KIAA1012PAX5
TGFB3
PTPRC
CR1
CETN3
YWHAZ
WSB1
ACTG2
SHMT2
HBA1
BCL6
PLXDC1
ROCK1
PDGFRA
TTRAP
IL6ST
NDUFS2XRCC5
TBC1D2B
LILRB4STAC
CYP3A5
SLC19A1
ROCK2
ALDH9A1
NCKAP1
H1FX
SMAD9MYO6
PTPRMIQGAP2
RIN2
ATP1B3
SPCS2
CCND1
SAT1
RPS11 LMOD1
PLCL2
VAMP1
PRKRA
LAMC1
ATP6V0A2
ATP1A1FBXL7
PSG11
KIAA0907
ALDH4A1RGS1
PTMA MAST4CRYAA
ACTBTMEM87A
TSC22D3FADS1
TIMM17A
CPM
CREB3L1
BBXC16orf45
BRE
VEGFAEMP3
TYMS
MSH6 NFIBDLG5
ALG13
INSR
LUM
PRKAR1BNSFL1C
DHX29 DEAF1MDN1FMO3
SIK1
NPHP4
GPD1LMYOFMAF
SLC9A7
FBLN1
TNFRSF21RBBP6
IARS2
STAT3CRYM
IL1B
PARP1
WIPF1
USP13
BACH1LPIN1
ARHGEF1
GSTM2PIK3R2
MDM4 SMR3BPHLDA2GNMT
ATP6AP2
SETD4
MTHFD2
CSRP2
ACAT2TAPBP
SKIL
PIK3R3
EIF1B
RUNX2GAS1
CD53
HLA-DRA
SERPINF1
RABEPK
TRAF4
CALM1
HLA-DPB1
PNNKRT19
C6orf54
OAT
KRT13
JAG2
CYP4F2
RPS16RABGGTA
DDHD2 KRT15
PRKCI
KLK11
CRABP2
PDLIM1
PPP1R3C
NFKB1
EIF2S1MT3SERPINB5ID1
DNMT1
PDXK
DUT
KANK1
TPM1EIF2S3KIAA1109
SEPT10COL7A1 CEACAM1ZHX2STC1
Prostate: ARACNE – SN(C1) Prostate: ARACNE – SE(C3)
ESR1
FLT3LG
NQO1
SEMA7A
GCKCOX10
GJA1MAP2K4
SCAPERPPP3CA PI4KBSSTR5
SLC9A1LRCH4
ACANADRBK1
ASPHD1COBRA1
EFNA3
ECE2
IVL
PAK2
CECR7 CRCPCCKAR
FETUB
CD33
COL19A1
RAD51L3
HAP1
OCEL1
DDX51
PRKCB
HIST1H3E
LOC26102
KYNU
RPS18
TMSB4Y
KRT6A
PLEKHM1
RPL11
TGFB3
MR1IFIT1
GUCA1A
PRPF6
DHRS1
PAIP2B
LOC644165
CCL19
SLC24A1
ASGR2
FAM153A
MAX
CNTN1
PRPF31IQSEC1
LOC100134331
SPEF1
RPL36A LOC100131311
RBCK1
DNAJC7
RPS24EEF1D FAU CD34ECDELAVL3CCL22FUSFAM38ANACA
HHLA1
RPL23A
RPL17
psiTPTE22
RPS8
ABO
MAP3K14
SLC6A11SEMA3F
UTP3
ZNF239
RPL9KLK3 ISLR IGSF9BKLK2MCL1
NSFL1CSAP18NFKB1AGPS DPT
AANATNEUROG3MAPKAPK2
EXOSC2 MFNGUNC5B
MPP2MAD1L1
LLGL2
AATKRAP1GAP
GRM4
ITGA10
TOP2A
IL17RA
RAD54L2
SLC7A4
SPN
PSG7PRPF19
ARID3AASMTL
GPR35
SORBS3
C2orf55
TBXA2R
IFRD2
UGT2B15
MYO1C
GRINA
SLC4A8
NTNG1MYBPC3
SLC25A11
PAX8
HAUS5
WBSCR22
DUS4L
ANKRD46
CCKBR
SLC8A1RABGGTA
LOC201229
MYST1PDGFRA
BZRPL1
PRF1
MLANAZNF264
LY9
ARPC4
CRTC1
IGHG1
HTR7
POU4F1
NOC2L
GPR143
TNIK
PTGER3
CHST2
LTK
ABL1GRB2
HMX1
RHO
TNKS
LUZP1 XDHMC2R
RAPGEF5
COL1A1CASP2
OBSL1KRT86
TCOF1
EPOR
TERF1
JAK3ZBED4
MLH1
EZH2
OPN1SW
TNP1 DOK2
DIDO1IFT140
SLA
LAIR1PTGS1
C9
FGF2BCAM NDST1
COL6A2
SLC1A7ITGAD
BFSP2 KRT35
TRPC4APCAD
TRAFD1
SMOX
POL3SCDX2
FSHB
NTSR2
RPH3A
EXD2
ELAVL2
B4GALT3
ARSE
RGS9
CHST3ATP6V1B1
CARD10
LYPLA2
CORT
IL18RAP
GAD2
EPAS1
TCF7L2
CD52PPP3CC
SLC6A9PLCD1
ADAM15PAXIP1
MAG
KLRC2
RBM8AABCB1
MDC1
LSM4
KRT1
RAGE
HLA-DOB HOXC11
METTL3
TYMS
KLF1
NFRKBCYP2A6
MYEF2CPNE6
IGFBP5
MAP2K5
AGFG2
UBL3
RPRD2
DDIT3
TCAP
ITGA2B
PCGF1
ADAMTSL2
ANKLE2TM4SF5
CRYBA4
SLC20A2
PDLIM7EDIL3
USP19POU2F2
ATP1B2
NRTNPAX3
C16orf7
LOC729400DSG1
MLX
RECQL5BMI1CUBN
TUBB2B
PRPS1L1INSL3
TFF3ARMC6
ARFIP2
TIMP3 PPP1R10
PNMTREG1B
TCF3SRM
FANCASDCCAG1
B4GALT5
PIM1
KCTD20
LEFTY1
P2RX6
DGCR9
DKK4PITPNM1
MTHFR
F7
RAB40BPARD3
CTSG
PQBP1PRH1
NCOR2ATP2B2
PLIN1
RRP12DEFA3 MEF2C
MAGEA5KCTD17
FRAT2
BTF3HIPK2
MSCANK1
AXIN1
SEPT6
CYP3A4
USP11UBXN1
TAF5
FCGR2A
TULP3
GAGE12J
NFYB MGP
SLC6A7
E2F1
CD40
GALK1SAR1A
IGH@
CHKA
FURIN
PMF1
CELP
SLC35A2
GPR19
TROAPVCX3A
VAMP5
CDK2
EFNA2RNF167
ABCC8
TNK1
PTPRD
ARTN
ITGAM
FLOT1
TPMT
GTSE1
AMELX
CHIT1
CNP
STK25
AP2A2
PDE4A
GCKR
KCNQ1
NFATC2IP
CCNF
ERBB2
ABCB6
OTUB1FSCN2
FZR1
INPP4A
ACY1
JAM3
PPM1D
USP6
GIGYF2
ST3GAL1
AGAP8 HTR2A
NLE1
RDH16
SKI
GNAT1
MGAT3
MFAP2
TET3
CNNM2
SMARCD3
PTH
WNT7AUBE2N
MAP3K11
ARHGAP26
SLC12A4
TRIM58
SVEP1
LGALS9
TBX5
HDGF
SLC30A3
WWOX
USP9X
IL8RB
LRP3
MATN3
SIT1
TRIM16
RAB3A
PDE4C
PPAP2C
C5orf30
PCDH9
PIGC
GP2
ORC1LTPPP
UBE3BLOC552889
EPR1
SULT4A1
SERPIND1
KIR2DL4
BCL3
RIMS2
AQP7PCBP3
PIGR
ELN
OGG1
BRCA1
TAF1
ANGEL1
KRT33A
SDC3 RUNX1
ZNF592
TPOATF6B
MTX1
SIX3
ZBTB7A
MEN1
WNT10B
FGD1
ANG
PTAFR
MMACHC
TRIM22GTF2H3KIF22
ASGR1
CYP27A1SNX27
SAMD14
KLF11
APOBEC3C
AKAP3
LOC145678
CEACAM3
ADCYAP1ETV3
ZKSCAN3
MVKLGR5
GAGE2E
DEPDC5COPS6
LY6G6CSPTLC2
SLC5A2
SLC4A3
PAMR1
TUB
NRG2
UPK2
WNT11
ZNF500
KCNJ4
FLT1
FDXR
SMARCD2
TOP3BERBB3
JAG1UBOX5
CCDC69
ZNF157
CDC2L5EIF2AK2
KRT85
POLG
PHKG2CFDP1
NCKIPSD
EVPL
SCAMP5
CAPNS1AQP5
DGKA
DDX11
RASSF7
CYP3A5
NF1
FKBP15
UTRN
ACR
ACCN1
GYPB
BCRHNRNPL
FOSL1
ARID4B
KLHL18
HCRTR1TMPRSS6
SLURP1
PXDN
CRXST14CEACAM4
TERF2ETS1
FAM5C
NUDT3
DOK1C19orf50
TREX1 MAPK8IP1
IRF4
NBEAL2
KRT2
CBARA1
GUCA2A
HTR1B
HRAS
GPA33
DNAH9
SMYD5
RYR3
GRIK5CDK5R1
PAX9
SLC16A2
FAM102A
TRA2A
RFC1
CRYBB3
HAS1
KIF1C
SLC1A6CORO1A
PGM3
CHD8
TBC1D22A
RHBDL1 PSMB4
CYHR1
IRF5
IL10RAC14orf1
IGF1R
GPR3
LILRA4
CAPN5
SCARB1
MFHAS1
SPP2LOC91316
CBLN1
PCTK3
SOX12
HADHAGNL1
MRPL28
RELA
MAGEA4ABCA1
C1orf144
TENC1
RPL3L
NRG1
HTR4
LRIG2
HOXD13
IAH1
CPSF1
RAD52
RASGRP3
MAGEA9
PTOV1
DHX30
MYCL1
LY6E
LIPE
PFDN1
IGF2
MEGF6
GDI1
SBNO2
TOMM34
MAGEA1
RAB8A
SUPT6H
ERF
UCHL1
P2RX7
FADS1
RBBP8
POU6F2
CCR4
STK17A
ST20
EML3
STAT2
TSSK2
SLC16A3
FOXC2
HPGD
ALAS2
SIM2ANXA9
POLR2E
CYP11A1
RAB14
PPP4R1
DCBLD2
IL1RL1
CRHR2
STMN1
TFAP2A
NECAP1
PCDH7
MMP25
IKZF1
GYPESLC17A3
CSNK1A1
SFPQ
CCHCR1
MVD
VENTX
TAP2
TNIP1
YPEL1
KIAA0323
TTYH2
SERPINA4
DNAJC4
ADCY2
STXBP1
TGOLN2
SMG6
PCDH1
ITPR2
LALBA
DCT
PML
TIMP2
SCNN1D
PBX1
REM1
SMOU2AF2
TBX19
SFTPD
KCNH2
NCDN
TRIM3
FOXE1
OR2B6
ZC3H7B
DTX4
EXOSC7
ZNF783
KCNF1
SLC6A2
HSF4
FAM189A2
CSNK2A1
MAP3K3
ATP10B
SLC25A17
SERBP1TP53I11
BAIAP3NFATC3
CD22
GPRIN2FOSL2 GNB3
ACOT11
PTGES
EPHB2PRPHMLLT4
SLMO1
STATHHRK
HTRA2PHF21A
MYST2TYRGSTA2
CEP135
NFKB2 KRTAP26-1RPS6KB2
XRCC2IDS MLL
FAM189BST6GALNAC4
THRATNFRSF25
SPC25
PRRC1
SLC38A10
AIF1
AFAP1
SPRR2CTHBS3KIAA0284
HBBP1ALPPL2
LSM7
ODF1
GPR161
LPAR2
SCYL3 H6PDSPINT3
CACNB4BAHD1
E2F4
SPG20
ALDOC
EIF5BIGHMBP2
RPS26
LBP
MFAP3
PGAP2
FLNA
CNN1
TUBB3
MYL9
HSP90B1
RPS27PTPN9
RPS20CACNB1
ERP29
RPL37ACAMK2BP4HB
RPL10
RPSA
RPS17
RPL3
RPL29
RPLP0
RPS3
RPS14
RPL18
RPS19
RPL15
RPS5
GNB2L1
RPL19
SLC25A6
RPL28
EEF1G
KIF20BATG2A NTRK3SLC6A12
ACADVL
NDUFV1
CCDC85B
CDC37PPP2R5D
GPX4
POM121L9PGDF5
ENTPD2PTPRO
DDN
FGF3
GABRG3
CLEC11A
GFPT2
NRP2
MPDU1
IHH
RPS3A
RPL31RPS23
ARL3 MYO9BTUBB2A RPL24 RPS15A RPS21
TUBB2CRPS6
SEC16A CROCC
RER1
MYH8
NUP214ARVCF
ERGIC3
EIF4G1
KDM1
TUBGCP4 GARNL4PXN
ZBED1CSF1
ZNF593
STIP1
KIAA0999
LOC652147
CCL16
T
FANCL
NHP2
ALOX5
GPR6PDPN
CAMSAP1GRN
PSG1
GGT5CALR
ARHGEF6AHSG
HSP90AB1
DNAL4
ATP2A3
SF1
GIPRDGCR6L
MUC1
TCF25
DDX17SLC2A1
TUBGCP2
BTRC
MSX1
XPO6
DGCR11
RAB11FIP5
AMMECR1
OPRL1NCSTN
SLC1A4
IKBKG
RALYL
TGFB1SIX6
NDST2
MARCO
RPA2
SGCAIL13
GRIP2
FUT7
SRRT
ATP13A3
NOMO1
ATF7
ACVR1B
PRB1
DAPK2
OAS2
NAAA
CADM4
GRIN2A
NCR2
LMTK2
SLC18A3
ZSCAN12WDR62
GTF2ISMPDL3B
GNAS
VEGFC
RFC5
SSH1
INA
GHITM
CA11
NCR3 ZNF169
MCM2
SOCS3
DTNA
ZNF274
CDK6
GJB3
RBM4
MYB
MGAT1GJB5TNPO1
DUSP7
TLX2
CCBP2
CCPG1
DHCR7
AGPAT1
PVT1SRPX
TRIM15
SLC9A3R2
METTL1
AURKA
FAM50A
RUNX2
RP1-21O18.1
ITM2B
POLD2
PFKFB2
HIRIP3SSX1
SLC6A8
DAXX
BAT3PSMD9
ADH1C PLEC1
CHMP7
RREB1
FOXN2
PHLDA2
KIFC3
WNT1
TK2
DUSP14
XPNPEP2
IFI35
GPT
REL
ANKRD12
HCG9UBE2H
ADAM20
TIMP2
EFNA5
PBX1
HTR1E
ZNF239
MED21
GREM1
SH2B1U2AF2
B4GALT4
KRASCTSO
SCGB1D2
SIGLEC7
GPR182NRL
CD33SERHL2
ZNF264
USP24
ALB
WNT7A
MAGEA1
CRCP SPTLC2
TRIP13
KRT37
MRPS12
MR1
SOD3
GTSE1
CARD10
PRPF31
KIF1C
FOSL1SMYD5
AKR1C4PDE1A
HEG1
STX5
SLC9A3R2SNAP25
AGPAT1
CCBP2
TRIM15TK2
C19orf21
NUDT1
NUPR1
THBS1
DOM3Z
UBE2H
PDPN
SOX1ATXN10
CDC37
LITAF YBX1
PAX6
PTPRO
PCYT1B
HLA-DOB
NEBSERPINC1
FAM50A
METTL1SLC28A1
CCPG1
WDR45 RDH5
DTNA
ALPPL2
ST6GALNAC4FPR3
GPR176
PRPH
CD4
HAS1
PIP4K2A
UBTF
FANCG
CCL16
DNAH9
RIBC2
TAX1BP1
PTPRU
DHRS1
GNS
PGRMC2
PAXIP1
ARPC4
HAUS5PAX8
KRT1
DOK2
WNT2B
MASP2
ADAM15
POM121L9P
EZH1
MAP1B
RASL10A
NEU3
SCAMP1TBC1D5
LIN37
INPP4A
CRX
C1QL1
OPRL1
TYMS
MVK
VAT1
GGT5
DCT
PIK3R1
MLH1
ODF1
HIPK2
TRIM58
KIFC3
PPAT
SOX2
PMS2CL
NFKB2
PTPRS
ZNF592
SLC9A7
CR1
GPR162MAP3K11
STK25B4GALNT1
AP2A2
CHIT1
ACAN
ZP2
UBE2N
FZR1
EXTL1IQSEC1
SLC35A2
IFT140
SORBS3
PHF21A
AQP7
SLC30A3
WASF2
ABCD1
FLJ40113
TRIM22
TYR
LPCAT4
KIR2DL4
KATNB1
SLC16A3
HHLA1
TNKS
KLF1
RNASEH2B
PDXKSIM2
LLGL2
SPC25
LRRC15
HBBP1
LGALS9
F7
ANGEL1
CENPA
GHRHR
UBOX5
SLC12A4
NLE1
AGAP8
RAPGEF5
DNAJA3
HTR3A
ALG3
PSMD9
CRHR1ABCB1
GBA
LGR5
MDC1
WNT10B
LOC100134331
UBE2D1
GHITM
RAB40B
SLC22A6
PLCD1
NR0B2
IL8
COX6A2
ARL4CPDCL
RQCD1ST5
C16orf7
CTSG
TEFACCN1
IGSF9B
N4BP2L1 ISLR
CSNK2A1CEP68
AFAP1
COL21A1
MAX
OMG
NTSR2
CDK5
KHK
MOCS1
CASP2
TMSB4Y
TCF3
ASIP
LOC100129500
SYNGR4
SEMA7A
SEZ6L
RHO
DIDO1
C2orf55
TGFB3
psiTPTE22
AATK
AANAT
TEX28
BCL3
RRP12 SHC2PRPF6
CELSR2
TKT
RAC2
CACNA1B
EIF4G2
HOXD10
CPNE1
CNP
CYC1
SIT1
EZH2
MPZL1
LUZP1
DHPS
EPB41ST8SIA5
RAGE
NEUROD2BYSL
PAIP2B
NSFL1C
PCDH9
LOC157627
ITIH4
MAG
IGF1RFAM189B
MYL2
LDB2
ADORA1
RP5-1000E10.4
CYP2B7P1
RPL13P5
PGAM2
CSNK1G2
DCHS1
XPO6
HSPBP1
SEMA3A
STAU1
HMOX2
SLC25A11
CACNA1CBRCA1 GPR12EFNA3
TCF25
SAP18NPAT
RBMX2
HSPA4
FSCN2
NOC2L
PAK2
PQBP1
PRRC1
PCBP3
FAM89B
DOPEY1
ATF6B
TBR1
EAPP
FKBP6
CCR3IL18RAPHDGF
TUBGCP2NQO1
TPO
APOBEC3C
GTF2H3
GUCA1A
MAZ
NTSR1
CTAG1A
BCL2L11
C21orf33
CST6
DDX51
ITGA3
GRIN2C
MCM2
RBMS1
TRBV5-4
KCNQ1
DYRK1BS100A5
IGF2
GDF5
FGD1
IDS
TMPRSS6
IHH
NPPB
ACVRL1
FAM5B
UNC5B
IPCEF1
KIAA1024AQP8
RECQL5PTGS1ZNF263
IL1B
FOXN1
HNRNPL
KCNJ4
MYEF2
CPSF4
MFHAS1LECT2
LHX2
IGHMBP2
PRKDC
NRXN1CSRP3
SLITRK3
CRTC1
CYP2A6
AGPS
S100A4WBSCR22
LEFTY2
FNDC3A
DBN1
IFNA4NFRKB
PPP3CC
PRKCBRBM8A
WASF3
ZBED4
NTNG1
RANBP9
FAIM2
FAM119BLPAR2
REL
ATP2B2PLIN1
RHCE
MXD1
ACOT11
TRAFD1
FSHB
P2RX6
PPP1R8PITPNM1
FOSL2
LEFTY1MYT1
PLAU
GM2A
CD22
KCNJ6
PSD
CRYBA4
CHKB
TIMP3PTGES
ATRN
TFF1
ERBB2
XPNPEP1
HOXD13
PHF16
CHRNE
RPL3L
MYCL1
CRMP1
PHLDB1
RASGRP3
RPL6
RPL34
RPL24
RPL31
RPL27
AIF1
DYRK2
RPL41
RPL7
PTPRJ
RPL37A
RPS29
PCSK7
RDH16ZNF500
RPS20
RPS18TSSK2
DNAJC4 RPL32
RPS8
RPS27AINHA
RPS3A
RPL23
NPM1
DPP6
IGFALS
ZBTB17
ANXA9
DOLKTROVE2 CCT8
SNAPC4
CYP17A1
RPL11
MVD
TAP2IL23A
CACNA1G
RPL21
RPL13ARPS13SLC16A6
RPS25
RPS23
RPS6KB1
TJP1
LAIR1
CHMP1A
NCOR2
ABCC1
TBCD
PKP4
CCR7
PRUNE
TNIK
LY9RAPGEF1LALBA
ARL3
SNORA21
REM1PML
RPS15 RPL9 KIF21BIL24 SPRR2C
RPS17RPL19 STMN1
SCNN1DRPS16
RPSA
RPS11
LMO1RPS14RPL4
GARTRIMS2
RPL38
RPL12 UQCRB
COX6C
DNASE1L1TOMM20
CCDC94
TFF3
MYH8
FAM65B
POR
SCAPER
EPAS1
MC5R
RGS4FCN3
SLC13A2
TCF7L2
AGT
AMBP
FEM1B
TECR
ARHGAP1
OCEL1
GIP
PI4KB
SLC8A1
GRP
RBM38
IL17RA
CACYBP
TET3
PDE4D
LHCGR
RABGAP1L
NCAN
NACAD
PRAMEF12
LOC728643
LSM4
PDGFRA
GAGE2E
TAF12
DGCR11
TOP3B
RAD52
UPK2
NCL
ITGA10
EPORZKSCAN3
MMACHC
C19orf26
MAPKAPK2
ASCC2
RAD54L
TGDS
AHSG
DGCR6L
ARHGEF17
TTLL4
SLC24A1
ATP6V1B1
ARSESMAD9
CCKBR
CSF2RB
BTRC
SFRS1
LOC399904
GPR18
PCDHA3
CPT1B
ACSM3
COL6A2
B4GALT5ARMC6
CCDC6
HTR1B
SFT2D2
ZBTB40
CDKL5
NEUROG3
CYP27B1KRT35
LOC145678
NMB
MICAL2
PRM1
NFKB1
KIAA0892
VCX3A
MORF4L2
KPNA6SLC25A17POL3S
RUNX1
DDX6
MRFAP1L1
AKAP3
GATA4
C16orf88
PDE4A
PIM1
CCDC86
RPRD2KIAA0226
EFNB1
AKAP4
REG1B
RAB31
SLC22A18AS
C10orf12
SYNPO
SLC16A5
FST
MRPL28
GNLY
ANXA5
HPCAL1
PIGC
UCP2
VLDLR
MAPK3
SOS1
FANCA
SV2B
ZNF282
RSC1A1
CD3E
ISG20L2
GARNL4
ST3GAL5
PARD3
TSHB
KEAP1
ITGAD
SDCCAG1
ANKLE2
HAP1GAD2
CD52
SGTA
KLK3
CAPN5
GRAP2
GPT C19orf50
NCR2
SEMA3F
PSMB8
YPEL1
KDM4A
DCBLD2
NIACR2SMO
CYP1A2SMOX
C19orf6
PTPRB
PDE3A
SMAP1
SDF2
RPS19
GAMT
RPL18NDST2
ZNF157
CCT3
LY6E
RFXANK
LMTK2XCL1MSX1
RAB11FIP5FBLSC5DL
FCHO1
SCP2
ARFIP2PLLP
HERPUD1
EEF2
IL1RAP MLLT4
PRKD2
CSTF3
NKX2-2
ATPAF2
RPL3
ITGB6
ENC1
BAK1ARAF
FAM155B
TMEM8BFRAT2
KLK2
ELOVL5
CEP135
RPLP1
VENTX EIF3B
LIAS
JTB
RBM42
PHOX2B
KLHL18
ABHD14ATWF2
ARVCF
PMF1
DDX3X RHEB FAUATP5J2
TSPAN1
SBNO2
TTBK2NGF
MTHFR
PSG1
HPR
C18orf1
DLX2
TAF6L
FMO5
STAC
HSPA5PNMTRPL28
FAM153ARPL10SMARCD2
CYP27A1
RPL18A
RPS5EEF1A2
GDI1
PTPN21PLIN3PPP1R2
ADAM3ACDKN2B
CTSA
PFDN5
FLOT1
FOXD1
MET
TFDP2
PAX3
TM4SF5
UBL3
HRASUSP13
VPS8
EXOSC7
LRIG2
LTA
SLC30A9
ANKRD46
HTR4
TRIM3 RPS12RPL5RPL14
SRM
PPM1D
EPR1
APCVPS11
ST3GAL1
CRHR2KCNQ3
P2RX7
MUC1ZNF79
THOC1PAMR1
C1orf77
KNG1
USP22
POU6F1
NOX1
SDC2
CBFA2T3
DPM2
GYG1
PCGF1
MAP3K3TRIM38
PIK3R2
KRT7
IKBKG
PFDN1
ASGR1
MYO9B
SF1AKT3
C11orf52
AGFG2
PPP1R10
POLR2C
CBARA1
RPL15
KYNU
RPL29
LPHN2
BCL2L1
ETV3
RPS27
PTK7
KCNF1
NACA
ATP5H
FGF2PET112L
CIAPIN1NTRK1
TBX5 C11orf80TAGLN3
SLC25A16
APLP1OPN1SW
REG1A
RAB3GAP2
AMMECR1
TPT1
RPS4X
RPS6
RPS3
RPL36ALRPL10A
RPLP0
MBL1P1
NKX2-1
MELK
LRRC48
GLP1R
CAPN3
ATRX
XAF1
NFATC3
RB1
SLC4A8
CPA1
TUBB2B
SLC6A6
GCKR
ADRBK1
IGHG1 PLA2G6
KRT12
KCNH1
LDLR
GZMM
RGS9
PRF1
DPT
BZRPL1
DKK4
COL6A1
DEFA3
STX11PTGER3
DDX11
LOC652147
CHRNB2
TPMT
PTH
ABTB2
KPNA5
PSMD1
MUC8
UGT2B15
PIK3IP1
NUCB1TRPC4AP
DLEC1SPP2
ZSCAN12
RPA2
MAGED2 IL13
ADH1C
GDI2VAMP7
HLA-F
SRRT
COLQSMARCC2
TRADD
PDCD10LOC728849BCAP29
ATP13A3TCP1
LSM7FUT7 TUBB2A
EIF4A2
ARF1 SLC25A6 SRP9
TMEM87A
SLC1A4
CFDP1
PSME1
PRRG1
RABGGTBCEL
SRD5A1 C3orf32
RPS24
RPL27A
ATP5G3 GLT25D2
ICA1
ASGR2
KRT85
RPL36AGNB2L1
NHP2
SLC25A3
SNRPB
CLIC1NDUFV2FKBP8
RER1
EEF1G
EIF2AK2
PADI2
SAR1A
C1orf21
RPS15ARPL37
GCOM1KIFC1NME2 NME1PDCD6 HINT1CCL22 ELAVL3 KIF17 ACADS UQCRQHNRNPA2B1SIPA1
RPL30
RPS7
EIF3E
PTPN9
RPL23A
RPL35
RPS21
RPL17
RPS10CACNB1
EIF2B5CAMK2B
DRG2SLC22A24 COL4A1CTNS C14orf79 COL4A2TGM3 TNXB
CDH1
SORD
PTPRF
KRT18
ATP6AP1
FAM120A
GRN
PVALB
CBX6
TRAP1
PSMB4
GTF2I
AR
EIF4G1
ECH1
FCGR2A
PRSS23
KISS1
MAP2K3
C9
PCDH1
PSMD2
COPA
TNK1
SLC6A2
DLGAP4
POLG
TRIM28
MRPL12
UROS
RALY
ITGA2B
UQCRC1
ACOT2
SIRPB1MTFR1
NCSTN
LCK
CD63 DDX5
GRIP2
SKIV2LNASP
ASNA1
FAM168B
VAMP5
HMG20B
RFC5
WNT11
SFTPD
USP9X
MTERFD2
SRPX
AK1
TTC15
INPP5D
CD72
DOK1
SCYL3
PPYR1
FIG4
NUDT3
OR2B6
LARGE
TLX2
JAG1
RNF167
IRF4
RBP3DDIT3
GPR6
SIGMAR1
C1orf216
SLMO1
FAM102A
SLC2A1LILRA4
LAS1L
KCNMA1
EDIL3AP3M2
USP19
ADAM20
MAPK8IP1
C1orf105
LYPD3
DDX17
IL8RB
FLJ10038
SMPDL3B
MAST2
PSTPIP1
TCOF1
SPEF1
MARK2
CCKAR
DEXI
ABL1
HRC
ANG
RANBP2
IFRD2FNTB
SPN
CGREF1
NECAB2
DDN
OR5I1
FKBP15
GTF3A
C5orf30
MLLT1
STX16
CSF1
PXN HTR6
EMD
FEZ1
CNNM2
CNOT4CYP2D6PFKM
OSGIN2
KRT86
STX1A
PIGA
HSF1KIF1A
TEAD3
UBL4A
ABCC8
PIK3CD
ITGAM
ARTN
B3GAT2
LOR
ARHGAP4
TRIM66
NRCAM
ACY1
E2F5
IL5RACLCN5
ATF7
CDC20
ST7ITM2B
SAPS1
ZP3
BMP1
PYGM
ORC1L
NTRK2
CADM3
ABCC3
ERLIN2
UPP1
ATP6V1D
GSTA2
ZNF593
NFIC
COX10
CYP3A5
EXOSC2
CCDC69
HMX1
DNAL4
RHBDL1
HCRTR1
TACC2
CDC42 PHF20
GUCA2A
GRIK5
SLC5A2
GGT2
RP1-21O18.1RAF1
EPHB2
FCGRT
PTH2R
CAPNS1
POU2F2
OBSL1
SMARCD3
EVPLDEPDC5
THBS3
CLINT1FANCL
FAM13A
NUP214
DUSP7
ACR
KRTAP5-9
UTRN
CREB5
DYNC1I1
ACVR1B
BAHD1
MCRS1
TXNRD2DGKG
SKAP1
EPS15
NCDN
CLCN6
IFNA6ZBTB25
PAK4
ABCA2
SULT4A1
ALAS2CELP SOCS3
UBE3B
CDK6E2F4
GPR161
ABCA1TPI1
SLC11A1
BPHL
HSP90AB1
C12orf47TGOLN2
GALK1
TAF15
GNL1
MMP19ADCY1
SSTR2
TTYH2
LOC201229
PIK3CB
GPR3
RGS12
MFAP2
CDC34
NFATC4H2AFX
PTK6
ARHGEF6
LIF
CNTN2
PEX6ARID3ASLC7A4
GPR35
IL7R
MYO1C
IGKC
UTF1
RELA
PTPRDSLC6A7
RFC1
CSPG4
CDC2L5
BCL11A
PDPK1
STATH
SEMA5A
IAH1
TNPO1
PRKCE
C9orf114
KLRC3
ZNF101
HTR2B
SLC7A2
FDXR
TREX1
EVX1
PXDN
GRINA
MAGEA4
PLXND1CIZ1
JAK3
PDHA2
CROCC
SPARC
COL1A2
ATG2A
MAN1A2
KDM1
MUC5ACKIF20B
NTRK3
PRSS16
CDR2L
TEKT2
ILVBL
STXBP1
STIM1
LGALS1
EFNA1HNRNPA3
CCNI
RBPMS
SFRS11
NUP210TTC3
ACTB
BCAT1
ZNF134
CEBPE
DKC1
RBM4
TUBGCP4
LOC552889
CAMSAP1
TP53TG5
KDM5B
GPR19
SHBG
XRCC2
CEACAM3
GP2
SKI
CACNB4
RAB8A
FAM5C
GH1
TROAP
THRA M6PR
PPAP2C ProSAPiP1
SLURP1
KRTAP26-1
CADM4
OAS2
YME1L1
NRG2
TBX19SLC15A2
IGFBP6
LOC100128031
VDAC2
sept-06
NONO
FAM127ADYRK1A
MARCO
MTUS1SFRS2
CRYAB
ZFP36L2
ZNF451
CTNNA1
SUPT6H
ZMYM3
SLC4A3
TGFB1
NOMO1
CYHR1HSPB3
NAALADL1
RALYL
GIPR
NECAP1
DHX9
DNAJA1
HMGXB3
PCDHGC3
UBE2S
HIST1H2AG
WDR62
RHOB
LTK
PLEKHG3
PFKFB2TIAM1
EIF3F
WFS1TGFBR2CSNK1A1
LPXN
H1FX
SFRS4
OASL
SPINT1CALR
TNFSF12
UBXN1
C12orf51
NCKIPSD
KIAA0430
SEC16A
RSAD2
NAAA
PLEC1
DCTD
SPG7
MAT1A
SLC5A5
PIK3R3
MFN1
MPV17
CD40
C9orf33
GHRH
LOC729400
SMTN
PPP1R12B
DFFB
MMP15
NBAS
IRGC
HIRIP3
DPF2
MT1A
MT1X
DCAF8
LSR
ARHGEF10
NDUFV1
DGCR2
HOXC4
NEU1
ACADVL
FGFR1
DYNC1LI2
PLXNA3
PPIH
SCAND2
HS3ST1
GNAS
LCT
MTG1
MGAT4C
EIF3CL
MXRA8
CRYBB3
CAP1
NFYB
MRC2TRAF3IP2
UQCRC2DHCR7SPTB
TADA3L
CLSTN3
RRAGA
EMR1KANK3
NID1
ACTN1
POU2F3
INPP1
TNFSF9
MOGS
GABARAPL2
NOLC1
FAM38A
HPS1
KHSRP
TNFAIP6
H2AFY
VASH1
GYPC
IRF2
C1orf63
SIX6
FOXK2
EIF2C2
DHX30
BAT3IRAK1
ZFR
HNRNPCDNMBP
ZNF143
TCEB1
PRCP
FBXO21
PTOV1
SERPINB7
UBE2C
LIMK2
EIF3G
TLE1
DAP
ILF3
MLX
CTNNBIP1
CYP2A13
SLC25A1
PAN2
CPSF1
SLC7A8
ZBTB16
UBE4A
SLC38A10
C8orf84
TFDP1
RAE1
SMR3B
RNF185
SERBP1
LSM14A
ACTR2PPP3R1
SLA
ADD2
MPZ
ALDH1B1
MGAT5
CAD
MAML3
FGRZER1
ZFY
CCRK
NRGN
MED24
KCNJ9
ERBB3
SAG
HSD17B8
ADCYAP1
MGAT3D4S234E
TRA2A
MLL
DBT
ABCF2
DPH2
CD79A
BRD3
ARFIP1
BFSP2
NDUFS8
DDX23
IKBKE
SKP1
RYR3
TGFBRAP1
STX2
MAGEA5DUS4L
NOL7LRP3
LRIT1
PRPS1L1KCTD20
DOHH
DPH1SPG20CLIP2MAGEA9
RPS26
EIF5B
ZMYND10
SERPIND1
F8
SVEP1
ZNF259C19orf29
MDM2
VSIG4
MEF2C
MPDU1
ZNF202
CD7
ABL2
PMS2L1
C19orf57
LOC283079
INSL3
SLC6A9
HOXC11
KRT16
PCNXL2 STBD1
ARL2
DRG1
CLDN5
HRK
PPP2R5D
CTBS
PDAP1
MGAT1
MFAP3
BNIP1
LYZL6
VPS41
HIPK3
CHP
PFKFB3
RNF113A
GPSM3
ZBTB24
MAPK9
ESR1
ATIC
SLC6A11
ABO
GYPB
GCN1L1
CHST3 NUAK1
ACP1
KLRC2B4GALT3
EMX1ARHGEF12
DGKD
KCND3
NCAM1
PLEKHB1
C19orf36
SNX27
KRT33A
PAPPA2
FLT1
CCT4
FZD9
PTEN
ITGB3BPHTR1D
HIST1H3E
FOXE1
MLANA
RNF5
CHKAODF2
XRCC5HTR2A
BBC3MT1B
SLC9A1
PPP3CA
ALOX5ERCC4
KCNA3
DNAJC7
PTPRR
PLEKHM2
ANK1
RBCK1
ZNF467
CYP2C18
HMGA2
PRKAR2ASLC29A2
COL19A1
CNTN6RABGGTA
PAX7C2orf24
CCR5
GCK DMP1
SSTR5
KCTD17
CHST1OLFM1
RXRB
SLC6A4SH3PXD2A
BZRAP1
SCN1B
PAFAH1B1
RBM4B
IGH@
MT3 ZC3H7B
DTX4PSMC5
POLR2J
NUP155
LRCH4
CECR7BMP8AASPHD1
NME6
MDH2DMBT1
CIITA
IMP4PRH1
IFNW1
BMI1
NEURL P2RY6
GATA2
CUBN
ARID4A
ZNF22
COBRA1
SLCO2A1
DOC2A
TNP1
TERF1
TBC1D9B
EDEM1
CRABP1
FURIN
ZKSCAN5
TUB
ABCC10
KLHL35
CHP2
KIAA1539
IL3
DGCR9
SRPK3
ARHGAP26
ERC1
XDH
ODZ4
SLC18A3
ENTPD2
DUSP14
CCDC85B
SLC17A7
RAMP3
GRB10
FGF3
KIAA0467MFSD10
GFPT2CXCL5
PIGR
VAMP4
FLRT2
BCAM
PRKACG
PHLDA2 EIF2B1
SH2B2
NCR3
MDK
WNT1
LOC91316
SLC16A2
PBX2
ILF2
S1PR4
ACOT8
ADCY2
KLHL25
TRIM33
KCNH2
ALPK3
PHF8
IFNGR2
CLEC11A
GABRG3
SCN8A
CYB5R1
HNF4G
MC1R
ZNF187
FSHR
SERPINA4TNPO2REG3A
PGM3
CALCOCO1
CREB3
PTMA
YWHAQ
STIP1
TCF15
FKBP2
SGCA
EP300
SLC1A6
USF2
NDUFS7
GRIN1
C6orf120EEF1D
BGLAP
BRD2
HLA-G
MIF
DDOST
RPL8
PHB2
SCAMP5
RASGRF1
TMBIM6
GOLGA2
SUMO1ACP2
ACAA1
HSPA8
PYCR1
P4HB
PSMA2
ATP6V0E1
RNASE4
SIAH2
ERP29
IGFBP2
PSMA3
HSP90B1
GTF2A2
PDIA3
MGST3
PARK7
VWA5A
TUSC2
ATF4
ZNF132
BAT1
ERGIC3
NME3POLD2
GJB5
OAZ1
FUS
JUNBNUP188
GPX4
TUBB2C
DAXX
TALDO1SOX3MMP23B
TENC1
TP53
MATR3
HIST1H2AH
ATP10B
ARFGEF2
PVRL1
ZBED1
ERCC1
IRF8
IFIT1
MPP2
KRT6A
SLC17A1
IVLSAPS2
TNFRSF10C
TNFRSF1B
MAGEC1
BCR
TCL1A
SPINT3
PSD4
TBC1D22A
MTX1
ERF
NRG1
GIGYF2
CPZ
TPPP
C20orf194
TLE3 IQCB1
GRAMD1B
IL1RL1
AMHR2
TTLL5
HSF4
HOXD9
ZNF337
ECE2
BBS9
GBX1
MSC
RPS6KA3
BAT2D1BMP10
TTC22
ATXN3
PPP1R16B
LDB3
CASP10
HYAL2
NR6A1
MC2R
CDH16
CCNF
BTF3
E2F1
TOP2A
C9orf125
MLXIP
MAP2K4DRD5
MYO16GRB2
SLC9A3
AURKA
UBE2M
CCR4
CXCR5LYPLA2
F2RL3
RGS10
EXD2
C8orf71
CARS
ALDOC
PTPRN
ARR3
TBX1WHAMMRBBP8
MRPL33
MYBPC3
LOC26102
CMA1
FLT3LG
ELAVL2
DNPEP
RPH3A
POU4F1
PPIF
TBXA2R
ZBTB7A
LSM12
ETFB
NAGPA
OSTF1
HCLS1
WWOX
CORT
CALB2
CD34ECD
USP6PHLDA1
GPR143
RAP1GAP
GRM4
DNAJC8
POU6F2
DNASE1
GNAT1
SLN
MEN1
SH3BP1
CYP11A1
C1orf68
CA5A
GOLGA1
POLA2
BRD8
HPGD
PLEKHM1
FGF9
FADS1
SSX1
LAD1
CNTN1
FMO4
TCF20MOBKL1B
PI3
COL1A1 CHST2
RAD51L3
VILL
CTSC
DDX19A
TRIM16
ATP6V0A2
SOX12ADAMTSL2
OPCML
RAB3GAP1
RMND5B
TAF1
ITPR2
JMJD6SCD
MYST2
BTBD2
NPM3
STARD5
GLE1ASMTL
XK PSG7
PRLRH2AFB1
NPAS2SDC3PTAFR
WDR18
DSC1
ZNF76
STK17A
MAPK12
MATN4
ST20
RTN2
TOMM34
BHMTAQP2
GJA1
ATP11A
PRPF19
SIX3 NFIX
LOC644165
MATN3
PAX9
AQP5
PCTK3
KIF22PRB1
CDK2
MFAP4
COPS6
EIF3KCDC25C
IFI35
MYB
H6PD
ARHGAP12RCC1
ARID4B
HTR7
HOXD4
HCG9
OGG1
UQCRFS1 NDRG2
PCDHGA12 EML3TCAP
SLC15A1 NF1CPNE6
DGKA
SETD2NRTN
DAGLA
GYPE
UCHL1
ARHGEF1
ZNF20
LBPCDKN2CAHDC1
OTUB1VAPB
SLC6A8
MYST1
RGS7
ITIH1
ZNF167
ALDH5A1
UCP3DAPK2
SFPQ
FLIILOC729678
PGAP2
LY6G6C
BAX
AMELX
PRKACAMFNG AVPR1B
ACHE
CASK
DKFZP564C196
PDE4C
RASSF9
KLF11
LPIN2LLGL1
RAB3ASTAT2
DMWDAXIN1
T
SLC1A7
IL10RA
ANKRD12HARS
KIAA0513
CCR1C3orf51
SAMD14
ELN CHMP2A
EWSR1
KLF12
TFAP2A
HMOX1
CD99
KIAA0586KIAA0284
C2CD2
TYMP
SEPT8
GPLD1
CHMP7
NFATC2IP
GPA33
ATP1B2
POLR2E
GRK5
TERF2
ETS1
HADHA
DSG1
EIF4BNADK
AMIGO2
CAMK2G
FOXN2
RAD54L2
TULP3
TMCC1
CADPS
MEGF6TNFRSF25
HTRA2
INA
TFCP2
PLEKHO2
COL16A1
ADRA1A
NDST1
MAEA
PALLD
HSPB1
TPM2
TAGLN
CSRP1
FHL1SVIL
MOG
CD1C
C2
CTRL
SLC17A3
MMP25
ZFP36
IKZF1
PPP4R1
GRIN2A
VEGFC
PHKG1
LOC100131311
WWTR1
ACTG2
MYLK
MAP2K2
IGFBP5
CDX2
MAP2K5GNB3
PDLIM7SLC20A2
INPP5B
KRT2FUBP3
PPM1E
FERMT2
FGF8
ATF3
CKB
MYH11
SYNM
ALDH1L1
PEA15
MYL9
CDC6
UTP3
MAP3K14
CCL19
AP3B2
BAIAP3
TP53I11
RND1
VPS72
PSMD14
ESPL1
SF3B2
NRP2
RGN
GPRIN2
GPR44
PCBP1
DPYSL3
SLC6A12
PVT1
PLRG1PARP2
ZMYND11
SNAPC2
FOSB
MAPKAPK3CCHCR1
SHC1
RASSF7
MCL1
GAGE12JPRMT2
RAB14PHKG2
PCDH7
CEACAM4
KIAA0323
NR4A1
H3F3B
C1orf144
SCARB1
FTH1ACTG1
CSNK2BVIM
TPM1
FMNL1
CSK
KIAA0999
ST14
PSMD11
CDC42EP4
PKM2
LCAT
TCF7
CLIP3
RPS6KB2
NBEAL2
NPPA
AHCYL1FAM189A2
PCBP2
ATP2A3
LDHA
COASY
FOXC2
LIPE
GLUD2
IRF5
TUBB3
ZFHX3
RBM5
ZCCHC24
CORO1A
SULT2B1
PTGIS
WDR1
ARHGAP6
ZNF783
PZP
CNN1
TRAK1MGP
FETUB
FLNA
JAM3
C14orf1
KIR3DL3
NCF4
TRD@
USP11
RUNX2
ARL4A
FBLN1
ATP5S
EFNA2
SSH1TNFSF8
CHD8
PROL1
LASS1
BAP1
TAOK3
MAD1L1
DDIT4
MYH2GNB1BCL2
ZNF169
TCTA
DUSP1
PLCG2
CA11
METTL3
ATP6V1A
SMG6
TTLL12
ABCB6
CBLN1
CDK5R1
SERPING1
UBC
SV2A
RREB1
CALCA
PUM2
APBB3
SEPT6
CYP3A4 TAF5
TNIP1
XPNPEP2
SNX13HSF2
ZNF274
FMOD
GJB3
16
Prostate: PCC – SN(C1) Prostate: PCC – SE(C3)
PITPNM1USP11PSG1
SAP18LBPTOMM34
DMWD MYO9BTUBB2BCAMSAP1PXN
TMPRSS6CEACAM3
IKBKG
LLGL2SLC2A1
SIM2STMN1
CD33KIAA0323
FAM189BH6PD
SLC6A2DGCR11
SLC24A1
AANAT
ACR
RUNX1
LOC91316
TUBGCP2SULT4A1 TCOF1
AGAP8 JAG1
RP1-21O18.1 EML3
POU4F1
BCRHLA-DOB
RHBDL1MFAP2
IGSF9BGPR161
CACNB1GRIK5NUDT3
MYBKIFC3CD22
NRP2
MLANA SULT2B1ATP6V1B1
PLIN1SCAMP5
ITM2BAHSG
LYPD3
DOLK
MR1
NTRK3MT4
GRIN2A
HNRNPL LRIG2
KRT33A
N4BP2L1SLC6A9
ARPC4
CCPG1
TTLL12
PRH1
ESPL1
SPN
VAT1
PIK3CB
USP9X
D4S234EANGEL1
ARHGEF12 NEUROD2
FGF7
CYP2A6MUC1C19orf26
HMGXB3
GRB2
HIST1H3ESLC4A3UTP3
MFSD10
SAR1A
WWOX
PDE4CMSC
CDK2ADCYAP1
LEFTY1
AGPAT1
TAZ
PIM1
PAFAH1B1FNTB
ESR1GOLGA1PLEKHG3
TRIM58M6PR
NTNG1
MAPK8IP1
IHHRUNX2
U2AF2
ACVRL1
RPS17
KLHL18
SIX6
RPS5
RPS19
RPL18
MAGEA9
EXOSC2STX16
MYO16
MATR3
CASP2
RAPGEF5
ST5
SEMG2
TUBB2C
TUBB3
TPM1
TUBB2A
SPINK2
SEMG1
AKR1B1
ACTG2 PIPMYH11
GPR35SRPX
RPL29
RPSA
LUZP1
CCDC69
SEMA3F
DSG1
MYEF2
RAB11FIP5CIAPIN1
EDIL3ADAM20
B4GALT3
ALPPL2
CCL16RPA2
PVT1
MYO1C
ARHGEF6
HTR4
CNNM2MRPS12
POLA2
SEMA7A
HSF4
DMBT1
LRRC23PTGS1SLC7A4
LOC100134331
UBE3B
SLC22A24
LY6G6CNCKIPSD
RIMS2
RAD51L3TERF2
IGHMBP2
DOK1
JAK3
EPOR
DNAH9
PRRC1IL8RB
CHST3
ACVR1B
TBX5WBSCR22
HPS1
KRTAP26-1
SLC30A3
NFATC3ITGA2B TGFB1HTRA2
GTF2H3
LY6E SIX3
CNP
SSTR5
GPA33SLC6A8
AP2A2
SMYD5CRCPRREB1
PI4KBSLC6A7
NUPR1
OPRL1
ZBTB7ASVEP1
DGKA
PIK3R1
TUB
ARID3A
LILRA4NQO1
RPL31RPS23
RPS15ARPS12
RPS21
RNASEH2B
AFAP1GP2INA
SLC9A3R2
CDK6RBBP8
ST20
CSK
PAX7
PPM1D
TBC1D22A
SCAMP1
HRK
RPS6KB2
SORBS3
HTR1BMDC1FSTGPR6BAHD1PIGR
ARSEKRT6A
GARNL4XPO6MYH2
HPR
ZNF167CBLN1LRCH4
CRYBA4
GAGE2E
VCX3A
SLC18A3
ANKRD12
POM121L9P
CNTN6
RPS16 GNB2L1 EEF1GRPL27A MCL1 LOC100131311RPS11RPL24
VASH1
SERPINA4THBS1
GYPBRAB8A
RABGGTATNIKTMSB4Y
TBXA2RFOSL1
ASGR1
POL3S
TNPO1
SLAATP2B2
TAF15
UTRN
ENTPD2
SFTPD
CNKSR1IFIT1
RANBP9 CHD8
SLC16A5MMACHC
PGM3
CDK5R1
MYCL1
FUT7
FAM189A2
AQP7
SPRR2C
TUBGCP4
REG1BLY9
LOC145678
RALYL MAGEA4ABCA1
CCNFGGT5
DDX51
AMIGO2C16orf7
ADAMTSL2
ATF6B
DAPK2
RELA
SLC17A3
ELAVL2
PCTK3
IDSSTAT2
ABOCEACAM4
GPR3
ALPK3PAX9
XRCC2CCKAR
OR2B6
STATH
DHRS1PSD4
THRAKIFC1
GRINASHBG
NRG2PCDH9CDR2L
KRT86GPR19HOXC11
IFT140 HPGDSMPDL3B
DTX4ARID4B
VENTXMC2R
CRABP1
CDH4
FAM153A
ABCC8STXBP1
EIF5B
SMARCD2
MTX1
PXDN
LOC552889
ORC1L
RTN2
MCM2
FGF3MGPEIF2AK2
UBE2S
ALDOC
SCARB1IGHG1 SKI
TCF7TRD@
CYP11A1
E2F1
YPEL1
CEP135DDX11
MAG
ZNF500
KRT2
MYBPC3
KCNH2MMP25
PAXIP1HTR7
IGF2T
PCDHGC3
CD34LAIR1
PRPF31
ALAS2ZNF157
ANKRD46
NTSR2
FZR1SPEF1
PHB
NDST1
PTPN9 KCNJ4JMJD6
AGFG2PLEKHM2
GUCA2ALIPE CCDC94
SLC1A7MEGF6
UBE2C
DCISOCS3CCDC86
FETUBPTPN21
AP3M2
PRAMEF12
PPP3CB
NFIX
IFI35BBC3
SERPIND1
ZMYND10
MMP19RB1
NRL
ARHGAP6 POLR2J
SAG
COL6A2
BTF3C9orf114
C20orf194
SNUPN
RBMX2KLHL25
KRT37
CCL22
ARHGEF17
CHKASNAPC4
HSF1
ARR3
GPR162
KATNB1CSNK1G2
BAP1
FGD1CYP27A1
RBP3
CEL
ELOVL5TEKT2COL6A1
CUBN RDH16
SOLH
SLC9A3
SV2A
SLC28A1
DYRK1B
MC1R
YY1
KRTAP5-9AMHR2
SPTLC2
TCAP
KPNA6
SLC6A12
ZBTB25
TPPP
XPNPEP2
C9
IQSEC3ATP11A
TNFSF12LSM7
PTGES
CROCC
PCDHA3ZFY
ARHGAP26
UTF1NOMO1
KRT35
KIR2DL4
SEMA5A
MATN3
HIRIP3
PCDH1GRN
PAMR1
RHO
UCHL1
MAX
SMARCD3
SLC35A2
CAPN5
DDNZNF239
GATA4
INPP4A
RAD54L2PRPF6 CD52
ARMC6KCNQ1
ARFIP2
TAP2DDIT3
JAM3
ZNF593
VEGFC
RBM38UBOX5
CA11MFHAS1
TCF7L2
ABCB1
SMO
HLFOPN1SWPHKG2
HSPA4
SLC25A17
TRIM3
GBX1
SERPINB7
RRP12 NCOR2CBARA1
STX11
PDE3A
C2orf55
CADM4
E2F4OTUB1
PFKFB2
KIF20B
PAX3
DPEP1
CD40MEF2CZKSCAN3
CARD10RPS26
UNC5BHAUS5POU2F2 CNTN1COPS6
EFNA3ADAM15GTSE1 CDX2
TFF3EPHB2
ADRBK1MYST2GNL1SPINT3
AATKABL1
HDGF
APOBEC3CGJB5
SEMA4DTBX19
FAM102A
LTKPPP3CCWNT10B
SLC25A11
FCGR2A
SPC25
ZBED1TRIM33
CHST1
BCL3
PTPRN
TPMT
NUP214
FOXE1
KCTD17GIPRNCDN
CLEC11ABMI1
NRTN
LSM4
RGS9
CPNE6SCYL3
MEN1
KIAA0999
C2orf24
FKBP15
PPP2R5D
CAPN3
CAMK2B
SLC29A2
BCAM
SBNO2SLC12A4
ACP1
ITPR2
PAIP2B
CYP3A5
WDR62MSX1
CSNK2A1
HHLA1
GPLD1
GJB3
IRF4
C1orf105
PRKCBCEBPE
GART
RAB3ANCSTN
RAB40B
HCG9
MAD1L1
RNF167
COX10
EPR1
IL10RA
GPR143
DUSP7
KIAA0284
NLE1
TNK1
GNAS
SCAPER
XRCC5
ABCB6
LAS1L
MARCO
POLR2E
PPP1R8
C14orf1GJA1 AURKA
NADK CHRNB2
IL18RAPADCY2
METTL3
SLC38A10
LOC644165
KDM1DGCR6L
P2RX6
HIPK2
SLC8A1NCR2
DTNA
ITGAM
ZC3H7B
WNT11
HCRTR1
PGAP2
NFATC2IP
TROAP
HMX1
P2RY6
SGCA
PTAFR ZNF264
IL17RA
NSFL1C
TERF1SLC16A3
SOX12
CPSF1
POLG PLEKHM1
CORO1A
FLT3LG
C1orf144
KLF1
ADH1C
PSTPIP1
PLCD1
FAM5C
ATG2A
UPK2
NOC2LCRTC1
FGF2
SEC16A
CRYBB3
GPT
CCL19
MGAT3
TREX1
PRB1
ETS1
BAIAP3 LOC652147
REM1IRF2
RBCK1
FLOT1KIF22
SMG6
UBXN1
PDLIM7
MVDGALK1GCKR
DEPDC5
MLH1PLEC1
RASGRP3 PTH
MAP3K3
IAH1
SAMD14PAK2
AXIN1TOP3B
MAP2K4INSL3
FSCN2
ACAN
WNT7A
KCNF1
UBE2HFAM50AGSTA2
CHMP7GDF5EVPL
MAP3K14
ANG MPDU1
ALOX5
LOC26102
PSMD14
IQSEC1
CRXC19orf50RPL3L
ERFTGFBRAP1
ITGA3SRM
IL3
GRM4
PRPS1L1
ELAVL3
GPRIN2
TGOLN2C19orf29
PIGC
GYPE
LPAR2
F7
RAP1GAP
ETV3
LOC729400PDAP1
PDGFRA
AIF1SSX1
SDC3
psiTPTE22
CCBP2
GIGYF2
GNB3
USP19
RAE1
ARVCF
CYP2A13
SFPQ
POU6F2DUS4L
HTR2A
TBX1
TRIM15
IVLPPP3CA
CTRB2
RGS10
LMTK2
CACNA1B
ABCC10EMR1
METTL1
BAX
EMID1
GRIN2C
AMMECR1
ACCN1RPL11SERBP1
CREB3
SPINT1
OGG1
FDXRZNF592PDPN
DGCR9
TNFRSF25
ENC1 ALDH4A1
PRPH
ELN
MLLT4
TULP3
RASSF7
PTGER3
BRD2PARD3
RECQL5
MPP2KYNU
IGH@
RPS18 RPS8
TACC2
HMGB3
TRA2ACAD
RAGEAQP5
KRT1
TET3
SLMO1EPAS1
MAPKAPK2
COBRA1ZBTB40
GUCA1ATADA3L
CDC2L5
MYST1
DCBLD2
TRAFD1HAP1
CCHCR1
DHCR7DEFA3 RPH3A
ST14 TNP1FLT1NFRKBFEZ1PCNXL2
ITIH4
ASPHD1MGAT1BZRPL1PIK3R3ARTN
PZP LGR5ZNF783CSF1MVK
MXRA8 ZSCAN12COL19A1
SLC5A2PPP1R10
NUP188
OCEL1ODF1
NFKB2SLC9A1
PCBP3LOC201229
LGALS9PAX8TYR
TLX2LYPLA2
RELIRF5HSP90AB1
SLC1A6XDH RBM8A
CHIT1
UGT2B15PSG7
ATP1B2
MTHFRDPT
PNMTPTPRD ITGAD
AGPS
ISLR
OAS2
MAGEA1
ERBB3
GADD45B
CECR7 TYMS SIT1
GRIP2SLN
GAD2
PLIN3
IL4
RAMP3
AFF2CSTF3
MAGEA5
KIAA1539
CCR7DNAJC9
ST6GALNAC4
RPRD2GLT25D2
LOC652147
FGF2
PLCD1 COL6A2
GPT
KLF1
RB1
CHMP7
UBE2H
PTH
MAP3K14
GDF5
IAH1PTAFR BAIAP3
TOP3B
FLOT1MVD
GSTA2
NOC2L
TREX1
SERPIND1
FSCN2
UPK2
PHKG2
ATG2A
GALK1
IL10RA SCAPER
GCKR
POLGKIF22 CRXANG
GPR143
ETS1
MGAT3
AXIN1
RASGRP3SBNO2MAP2K4
CCDC94
FGD1
AGFG2CRTC1
XPNPEP2
ARR3
ARHGAP6
CYP27A1
ACAN
SAMD14
UBXN1
CROCC
UBOX5
RNF167
ADH1C
CADM4
WNT11 LOC26102
HCG9
BTF3
NLE1
C1orf144
ARHGAP26
INSL3
REM1
ZMYND10
MMP19
WNT7A
CEACAM3
UGT2B15
ZNF167
U2AF2
MR1
SKI
TPMTMYST2
LTK
TAZ
H6PD
FOXE1
ANKRD46ORC1L
CDK2
ADCYAP1
SPN
SLC24A1
UBE3B
ACVR1B
AGAP8OCEL1
CAMSAP1
GPR161
LOC201229
LY6G6C
SCARB1
CD34
ARPC4RPS26
C2orf55CSF1
REL
MVK
SIM2
BMI1
JAG1
TUBGCP2
SPINT3
SLC12A4
PFKFB2
CYP3A5EML3
TMPRSS6
PNMT
KIFC3
ALDOC
FCGR2A
OR2B6
LLGL2
CPNE6RAP1GAP
LYPLA2
TLX2
TFF3
HTR2A
GIGYF2
MGAT1
COL19A1 RBM8A
POU2F2
MEF2C
CLEC11ASULT4A1
TGOLN2
ABL1
MYO9BSLC1A6
SEMG2
SEMG1
GRIP2
PIP
NTNG1
psiTPTE22 HHLA1
NTSR2SLC5A2
TOMM34
SLC6A9SLC18A3
TYR
TBX19 IKBKG
STXBP1
RGS9
SLC4A3
TROAP
USP19
XDH
FEZ1
GRB2
MYBTCOF1AATK
MSCEPHB2TRIM58
KCTD17MXRA8
CD40
RUNX2SDC3
STMN1
LSM4
MFAP2
PAIP2B
GTSE1
FDXR
FAM102A
E2F4
DPT
APOBEC3C
CDX2
KRT86ITGAD
WNT10B
OAS2
GRIK5
SLC6A2
HTR7
LBP
INA
ZNF500
RPL11
CNTN1
PCBP3
CARD10LGR5
ZSCAN12SEMA4DELAVL3
ZKSCAN3
NCDN
GIPR
PAX8
KCNJ4
FAM189B
HDGF
USP11
TUBB2B
SLC9A1
NFKB2CCPG1
MC2RPSG7
LGALS9
ATP1B2
WDR62
BCR
PSG1
HLA-DOB
CD22
GARNL4
ADAM15CRYBA4
GNL1
PPP3CCCYP11A1
ADRBK1
ODF1
GRM4
COPS6
ARID4BRHBDL1
PAXIP1
MYBPC3SPC25
CEP135
TET3
NUDT3
CD33
SLC2A1
POU4F1
GP2
FUT7
GPA33
AQP7
PRPH
GPR35
RBBP8
ALAS2
CDK6
SLC9A3R2
IDSB4GALT3
SPRR2CWBSCR22
LY9
DEFA3
SMYD5
KRTAP26-1
MAG
PAX9
DNAH9
YPEL1
RECQL5
AGPS
THRACOBRA1
PRRC1DDX11NCKIPSD
RAD51L3FZR1
SLC22A24
DOK1
CDK5R1
BCL3 MTHFRIGSF9B UNC5B
PITPNM1CACNB1
AQP5
SAP18
ZNF157
LOC100134331
PTPRD
RP1-21O18.1
NRG2
DHRS1
TERF2
PRPF31
SCYL3
ACR
CHST3
RUNX1
IL8RB
RAGEDGCR11
MSX1AANAT
HAUS5
EFNA3GJB5
RIMS2
ZBED1
SEMA7A
FOSL1
ACCN1 HTR4SLC30A3AP2A2
EPAS1EPOR
SLMO1
HTRA2 ITGA2B
ZNF592 SLC6A8IGHMBP2
MAPKAPK2
TRAFD1
Prostate: MIC – SN(C1) Prostate: MIC – SE(C3)
IRF2BP1
PTPRS
MYOZ3
SLC16A5RPL3L
ERGIC3
DUSP1
FAM38A GNB1
EIF3B
NAAA VSIG4TRIM28 GLT25D2
HOXD10
JUNB
C9orf114
NFATC3
ERBB2GFAP
PTPN21ITIH4
CPNE6
REEP2ABCC8RAB14NUDT3CRMP1
FAU ZNF76MYL2
GCOM1
TRIM66
CHST15
TULP3BCL2
ORC1LCACNA1BIFRD2
POL3S
NF1
CAPN3SMOX
RMND5B
UBE2NCCBP2
BCL2L1
LARGE
TAF1CBX3
CCR7
GPR12 ZNF593
KCNJ6
FOXE1
PBX1CARD10SFPQ
DTX4SCNN1D
KIF22ZNF169
MYO9BVCX3ASKI
C19orf29PGAP2
CA11
MSX1
KCTD20CD40CTSG
AKAP3
ATP11A
SERBP1AATK
IL8
NCSTNPDE4C
DNAJA3
IRF5
TRIM3
MR1 HTR7NFRKBHLA-DOB
OGG1
NTRK3
TUBB2B
ALOX5SIT1
PTH
GRM4WNT11
RUNX1
SERPIND1
KCNJ4
INA
EVPLCYP3A5
ELN
SPRR2C
AGFG2
CDR2LLUZP1
ELAVL3
TBX5
CECR7HPGD
PTPRNGRNARPC4AURKAHTR2AOSGIN2
DOK1SLC6A3USP9X
MLXARMC6AFAP1 RP1-21O18.1
VASH1
INPP4A
MAP3K11MYO16ZBTB25
MAPK3
KYNU
SLC18A3RPRD2
H6PDBZRPL1
CD3EPPYR1
ATP2A3
JAG1CSF1NFIX
PARP2
LYPLA2
ARFIP2IGFBP5
NRLLLGL2 GCK
SNUPN
NDST2
CHST3MLANA
KDM5B
MAX
MYOGFAM153A
IDS
EMID1
POLD2
CYP11A1AQP5 HMX1
PHKG2PMS2L1
SRMDEPDC5
ITM2BSTIP1
FNTB
IFIT1
SEC14L1
CD33 MCM2
ESR2CRXMYBPC2RAC2
ACCN1
DDN
PISDATF7
ST5
PTK7
HTRA2
JMJD6
SORBS3EPOR
CHD8CYP2A6
EFNA3
C10orf12NFKB1
OLFM1
CHKA
CNTN1
C19orf26
ADCYAP1
HPCAL1
SLC25A17
CLEC11A
KIR2DL4
GSTA2SLC1A7COL4A1COL4A2P4HBRPL18GAGE2ECAP2
HIST1H3E
POU6F1
GIGYF2
ADRBK1
GDI1
ZNF264
MT3
FANCL
SYNM
MYH11
MYL9
ACTG2
PTPN9ZP2 TAX1BP1 GZMMCSPG4 NADK
MTX1CNNM2CYP3A4
SPC25
MYBPC3LOC201229
COL1A1
NCR1
ATG4B PRPS1L1
CRCP
ALAS2
DPEP1
TFDP2
KIFC1
RREB1
KIAA0284
ST6GALNAC4
TUB
EIF2AK2
AVPR1B
CTNNBIP1
LOC145678
CLSTN3
UQCRC1
MYST1
CHRNB2
LOC91316
SIX3STATH
IRF2
THBS1PNMT
WNT7A
S1PR4EIF3G
BRD2
GARNL4
ARFGEF2
RASSF9
KCNH2
NRP2
GALK1
AMMECR1
CCL22
GTF2H3FNDC3A
GNAT1NFATC4
LIPETAF5
HIPK2 MT4
TNK1
CRTC1
CCDC86
TPPP
ZMYND10
GPR18
TWF2
CCNF
TIMP3ABCC3
TBXA2RARHGAP4
DCI
DNAJC9
TRPC4AP
LPCAT4LEFTY1GAMT
FLT1
CADM4UPK2
GRINA
MFAP2ELAVL2
SLC5A2
FURIN SMOANKRD12
FAM5C
DGKG
SEPT6TOP3B
KEAP1PPAP2C
PAX3ABL1
SLC6A8
ESR1RAB3A
C6orf120
ASGR1NCDNLRRC23
KCTD17DUSP7KIAA0323
COPS6KIFC3KLF1
MAGEA5
CGB
IL10RA
TMPRSS6 ECE2IAH1
SLC9A3R2
PCBP3KRTAP26-1
BRCA1CAMSAP1NFKB2OPRL1
MAGEA1
RELCAPNS1
NOMO1
CEACAM3
RAB8A
SAR1A
psiTPTE22TCF7L2
GTSE1
LILRA4EXOSC2
IL8RB
ENTPD2
ARSE
ATP6V1B1ZNF167CRHR1LRRC48
PPP1R16BDGKA
CBFA2T3
RIMS2
KLHL18
PROX1SLC17A3
PDGFRA
CCR4
ZBED4
C16orf7
ANKRD46KRT6A
USP6
ZNF783NTNG1
MYO1C
CDK2
GJB3
MLL
GYPB
TYMSHCG9
FZR1TPMTIGHMBP2
SIM2LGALS9
DPT
AMIGO2
RPS6KB2PTGES
TUBGCP2GP2
MGAT3
CDK6
CROCCSAP18 LOC652147
BTF3RHBDL1
C19orf50 SDC3
XDH
NEUROG3
FGD1OR2B6
DCBLD2
RELA
PSG1
KRT1E2F4
MVK
COL6A2 GPA33TRIM15 F2RL3BCAM
NCR3
CCDC69
METTL3PRB1CYP27A1
NRTN
DUSP14
HIRIP3ANK1
MFNG
SLC4A3
IGF1RABCB1 MMACHC
POM121L9P
GGT5RAD54L2ITIH3
SLC6A12BAIAP3 LYZL6BFSP2
RNASEH2B
TNFRSF25TLX2
FCGR2A
GCKRPFKFB2
HAUS5
IGF2
IQSEC1PSG7
NCKIPSDABCB6
MYB
ITGADMYST2
AGAP8WBSCR22ADAM15
SLC6A9
GPR35MRPL28
IGSF9B
ATG2ASULT4A1
ZSCAN12
POU4F1
C2orf55
MTHFR
SBNO2PPP3CC
ODF1CCHCR1RPA2HTR1B
RASGRP3PLEC1
EPHB2
POU6F2AHSG
CEACAM4
FLOT1
RABGGTASLC18A1
CAD
FEM1B
EDIL3LPAR2
MAEA
THBS3
FANCA
ALDOC SCAPER
B4GALT3MUC1
LOC26102ABOLY6G6C
POLR2J
LGR5ERBB3
KRT85
PRPH
RAB11FIP5PAIP2B
UGT2B15SLC12A4 EML3
TERTARHGAP6AXIN1SEC16A
CRYBB3PAX9TREX1
CD52
DNAJC4
DGCR11
C12orf47
TRA2A
MFAP3
CHMP7
INSL3
LRRC68PML
XPNPEP2MEGF6
C19orf21
POLG
SPN
HSF4
KHKSLC9A1
UNC5B
CYP2A13
GAD2
TNFSF12ADRA1A
C1orf144TICAM1
ITGAMRRP12
FOSL1MVD
IFI35
TMSB4Y
SLC28A1CSTF3
OPN1SW
KLHL25
EFNA2
COX6A2
SVEP1
SLCO2A1
CCDC85B
ARHGEF12
WHAMML2
ADAMTSL2
RBP3
NTRK2
SLC16A3
CDH16
DNAL4
KATNB1
REM1
C14orf1ITGA10
GRIN2C
LALBA
GRB2
SEMA5A
LDB3
STAU1
SAMD14
RBM8A
ITGA2B
LSM7
NCR2
NTSR2
ARTN
PTGS1
DHCR7
KIR3DL3
PAMR1
SLC16A2
ABCA1
REG1BGARS
PIK3R2
SLC8A1
TGFBR2
ABTB2
GTF3A
SLC6A7
MFHAS1 LOC728849
CACNB1ST3GAL1
IVLAANAT GPR19
GSTM5
GNAS
MAG
SLC22A18AS
POLA2F7
PARD3
TNPO1
COL6A1
BAT3
XRCC2
RASSF7TGOLN2
HCRTR1
ALDOB
TNKSCD72
CDH4MAPKAPK2EPAS1
GLUD2 KIAA0586RAD52
GHITMATF6B
TERF2
VEGFCPCGF1FSCN2
GPTHAP1
TFF3
RGS9
PQBP1YPEL1
SLMO1MGAT1
ATP1B2PRRC1
LSM4
ZNF592UBE3B
TGFBRAP1 RNF5TUBGCP4
RAP1GAP
LOC100134331
SLC30A3APOBEC3C
ZKSCAN3
SLC1A4
RBBP8
MYEF2PRPF31IKBKG
GPR161
MAD1L1STK17APAX8
TRAFD1
CDK5R1TYR
SLC20A2
ZBTB7AKRT35GJA1
TMCC1
CIAPIN1
GRIN2ALRCH4ZNF157LBP
LOC552889
CNKSR1
SPINT3BCL3
ACR GTF2F1TNFRSF10C
INHA
T
CCPG1MAP2K4
LAD1
SCARB1NRG2POU2F2
HDGFPVT1MLH1
TROAP
RAGEOCEL1
SEMA3AFAM89B
COL19A1
CD22
TAZBCRSTAT2
CRYBA4CCKAR
ACVR1BNOC2LIGH@
NAGPA
MATR3
HOXD4
SOCS3
FAM168B
CUBN
KAT5
LOC644165
TRD@
CEBPE
GHRHRHTR6
ZFYKDM1
TK2
PIGC
ARR3 RFC1
CCNA2
ARVCF
POLEDCT
PAK2
GUCA2A
CCL16
SLC9A3PLEKHM2
ZMYM3
TP53TG5
HOXC11IGHG1
KCND3
CBARA1
BTRC
HMGA2
TRIM58
PLIN1PCTK3
ERF
NLE1
ALPPL2
NDST1
UBL3KIAA1024
GNL1
EPR1
HSF1
ZNF500
MEN1
RB1TBX1
AIF1SLC22A24
FAM189A2
TERF1
NRXN1
LMTK2
SMPDL3B
RECQL5
TMBIM6PPP2R5D
ABCF2
MATN3
JAK3
DDX51
GPR3 PTGER3
PSMB8
CRHR2FUSHSP90AB1
PIK3R1
VAPBSETD2
SIX6MAPK8IP1
FOXN2PRKCB PYGME2F1ADAM20
NME6
FAM50A
OTUB1
RASSF1
KRT33A
RPL11 EWSR1
ASPHD1
IQSEC3
DDIT3PLEKHG3
KCNQ1CNTN6
GPRIN2
MRPS12
SMARCD3
ZC3H7B
ACAN
DTNA
ARHGAP26
TBC1D22A
DUS4LSLC6A2
PLCD1
PPP1R10
PTPRD
OAS2
CDC45L
LTK
IL1RL1
SLC15A2
STK25
UBOX5
SCAMP5
SAG
FLT3LGGUCA1A
PIGR
AQP7
TNIP1
HTR4CAPN5
USP19
CSNK1G2
TBX19 CORT
NSFL1CDOK2
VAMP5CEP135
KRTAP5-9
FEZ1 PDPNMELK
P2RX6
ADH1CUBXN1D4S234E
PSMB4RAB31
CREB3
PCDH9HNF1A
EXD2SMG6
SLC1A6
TGFB3TNP1
CDC2L5COX10
SLNIL18RAP
GPR143VENTX
REG1A
GNB3
PITPNM1
FLJ10038
CCL19
PTPRBCCDC94ODZ4
ITPR2
DEFA3MLLT4
AHDC1
ADCY2UBE2H
SLURP1STXBP1HHLA1
KRT37
PRPF6
MEF2C
GJB5
AGPSHLF
PAFAH1B1
CASP2
FGF2
AKT3
SERPINC1
AP2A2
THRA
USP11
IL13ARID4BFAM189B
TCOF1
POLR2E
KRT86
MAGEA9SERPINA4
PDE3A
AQP8
FAM102ASMYD5
HNRNPL
PRH1
SLC25A11
LRP3
GRIK5
MYT1
KRT2
ATP10B
U2AF2
IL4
TGFB1
CHIT1CNP
RUNX2
MC2RSCYL3
FUT7
RDH16
FKBP15
PTAFR
CORO1A
ZBED1
DAPK2SEMA4D
ETV3TM4SF5
TENC1
ATP2B2CHRNA3
KLF11
PAXIP1WWOXMPP2
NEUROD2
CD34
ATP6V0A2
RAD51L3
SERPINB7
DNAJC7CBLN1
GRIP2FCER2
DDX11MED24SV2A
GDF5RAB40BDHRS1MAP3K14
AGPAT1
WNT10BWDR62
DNAH9
RPH3ASLC2A1
UTP3SOX12
METTL1ETS1
FDXR
FGF3PHF21A GPR143
RBBP8
MLANA
MLL
TBXA2R
PHF21A
ARID4B
PAXIP1
KRT2
CASP2CPNE6
GTF2F1
GPRIN2
MYT1
STIP1TGFB1
NSFL1CAKT3 SLC15A2COL19A1TYMS
SCARB1
LLGL2
MGAT1
GNAT1SLC22A18AS
C2orf55
NLE1
TM4SF5
SDC3
ATP11A
MYST2
IFI35
TERF2
PPP2R5D
CBLN1
VAPBLEFTY1
KIAA0586
FEM1B
KRT86
ZBED1GTF2H3
NTNG1
SAMD14
TRA2A
NCKIPSD
UBXN1
CRHR2
ADRBK1
LPCAT4
SERBP1
HIPK2
PTAFRMYO1C MYBPC2
ANKRD12
GUCA2A
LTKMRPL28USP19
SMARCD3
CDC45L
PRPF31
NCR2
P2RX6CADM4
RBM8A
PAX9
ANK1DGKG
VENTXPITPNM1
KCTD17CCKAR
EWSR1
ARHGAP26
DTNAKRT1
ADCY2
MMACHC
MVD MEF2CARSE
CCHCR1
NFKB2KDM5B
SAGITGA2B
HOXC11
TRAFD1
SLC9A3
KIF22
CYP2A13GJB3 SEPT6
USP9X
ABCB6SERPIND1
VAMP5
USP11MAP2K4
MAP3K14
FNTBSAR1A
PTPRD
CRYBB3
SIT1
NOC2LGRIN2A
SERPINC1
CORO1ACYP27A1
UNC5B RHBDL1TCOF1
CHMP7CHST3ITGAMGARNL4
GNB3
MFAP2
CHRNA3CD52
UTP3POM121L9P
RAD51L3
RAB40BLBP
IGSF9B
ZFY
PAIP2B
LOC652147
HLA-DOB
RELA
T
TGFB3
CSNK1G2TREX1
MEGF6
DNAL4GRIN2CCRMP1SFPQBCL2L1
MLLT4
GSTM5
CHIT1
ZSCAN12
ZNF169
IL8
LAD1FKBP15
LRCH4
PDE3A
LYZL6
ARVCF SKI
C12orf47
UBOX5
GRIK5CA11HTR4 INADUSP7
HMGA2PAFAH1B1
SLC28A1XPNPEP2FLT3LG RDH16
TNIP1CCL16
HLFHPCAL1
UPK2
IGHG1
ZMYM3
MAGEA9
DCI
VCX3A
CD40
KCNJ4
AQP7
SLURP1 KLF1
SLC18A3KRTAP26-1
CHRNB2
PTPN21
MAGEA1
CSTF3 MTX1
SLC25A17
ARHGAP4
EFNA3
SLC25A11
PRB1TNFSF12
EIF3G
MFHAS1ARMC6
SPNSEMA4D
TOP3B
ACR
CNTN1C19orf50
KRT35
PGAP2
ADAM20
BTF3EVPLCOPS6
IL18RAP
DCBLD2TRPC4AP
PTGS1 GRINA
ELAVL2
MSX1 DNAH9
SMO
FAM38ATRIM28 ORC1LMAPK8IP1 PBX1LOC644165 HOXD4 KCNQ1 PLEKHG3RECQL5
GJA1
KHK
GAD2RREB1
ERF COL6A1
ABL1SRM
USP6
BAT3
CYP3A4 HIST1H3EEPAS1
RPL11
POLR2J
FLOT1
EML3
NME6
PAMR1
POLD2
HTRA2LOC91316
CYP2A6
ERBB3
BRCA1
TFDP2
RAGESLC22A24
FAM189A2
CHKA
LILRA4
HCRTR1
MUC1
TENC1
EDIL3
SORBS3
POU4F1
YPEL1
LY6G6C
CRCP
ZNF593
CIAPIN1
LGR5
IDS
KIAA0284
TGOLN2FANCA
CNNM2CLSTN3
CD22
LOC201229
FANCL
MYOG
SNUPN
ALAS2
AGPAT1
FCER2
JAK3
CORTMELK
HTR1B
DGCR11
RPS6KB2TCF7L2
MAPKAPK2
REL
PARP2
CCR4IRF2
KIFC1
MAEA
TMBIM6ST6GALNAC4
MLH1
ZBTB7A
RASGRP3
GHITM
CBFA2T3
MAGCDK2
HSP90AB1ST3GAL1 PRPS1L1SLC18A1
ENTPD2 MYO16TUB
IGH@
TERF1 AATK
TGFBRAP1
CAMSAP1CHD8
STAT2FAM153A
TGFBR2
ZBED4
KRT33A
ALPPL2
KRT6A
DHCR7
XRCC2
ACVR1B
F2RL3
INSL3
CEP135
LRRC68AXIN1
KDM1
HSF4
IGF2
CCL19
SLC30A3
FURINIQSEC1
GRIP2
PAK2
ARR3
CD33
PML
SEC16ANRTN
DCT
TBX19
IL1RL1
D4S234E
REEP2
ATP6V1B1CYP3A5
AURKA
ZMYND10
RP1-21O18.1
SLC5A2
C14orf1
TBC1D22A
PLCD1
ITPR2ADH1C
DHRS1
SCYL3
PCDH9
SLC16A5
POU6F2THBS1GPR12
REG1A
RNF5
REM1
TUBB2B
STAU1
TICAM1
CRTC1CSF1
NCSTN
NEUROG3ZNF157
IFIT1
SIX6
C1orf144
POLGITGAD
SMG6PMS2L1
ADRA1A
FOXE1
BFSP2 ABCC8
ARHGAP6
ATG2A
METTL3
MFNGMYEF2
PRPHPDPN
TPPP SULT4A1
JAG1
DPT
ZNF592CNP
GPR35
PRRC1ASGR1
NCR3
IGHMBP2MYO9B
EXOSC2MFAP3 NTRK3
IAH1SERPINB7
MYBPC3LYPLA2STK17A
OSGIN2
NFRKB
CEACAM3
KRT85
SLC20A2
WWOX
RAB14
RAB8A
GPR18
PVT1
RABGGTA
PSMB4AQP5
AGFG2BCL3
TPMTSTXBP1
MAGEA5
AMIGO2
KYNU
UBE3B
TROAP
DEPDC5NOMO1
TRIM15
TMCC1
IL8RB
ADCYAP1SLC9A3R2
FSCN2FUT7
LOC552889
UGT2B15
SLMO1
PCBP3
ARTNDNAJC4
ESR1
PTHPTGER3
CYP11A1
CEACAM4IRF5
CCBP2ALOX5DPEP1
B4GALT3SPC25
GP2
AGAP8
MVKRAB3A
CAPN5EPOR
ALDOCKIAA0323
EPHB2MTHFRBZRPL1
OPRL1
ATP2B2
SBNO2
LPAR2
HMX1
COL6A2
MGAT3SOX12
RGS9PPP3CC
TUBGCP2PHKG2
TBX5 SLC6A8
OR2B6
SLC6A2DNAJC7ESR2TFF3
ZNF783STK25H6PD
NCR1LOC145678
TULP3
EIF2AK2
IL10RA
GIGYF2
HCG9
KRTAP5-9
DOK2
REG1B
GFAP
AANAT
CECR7 NADK
CDK5R1
COX10
RASSF7
TMPRSS6
MT3TRIM58
FDXR
GCOM1
TWF2
SCAMP5
CARD10CNKSR1
NFKB1
MAPK3ABCA1 MRPS12LMTK2GNAS IL4
HNRNPL
IL13PDGFRASETD2
CLEC11A
HPGD
GRN
ADAM15
TUBGCP4
SLC4A3
WDR62
E2F1EMID1
DOK1
SIM2
CROCC
ATP1B2
HDGFLRRC23
SIX3
GGT5
PRKCBNEUROD2
PTPRNKIFC3
HAUS5
PTGES
WBSCR22
C19orf29
OCEL1LGALS9MAD1L1FZR1
AP2A2
VASH1METTL1SPINT3
IKBKG
WNT11DDX11 RPA2
SERPINA4
GCKR
CDK6
WNT10B
HAP1
SLC12A4SAP18
SLC6A9GNL1
ODF1
SLC1A4 ZC3H7B TYRACCN1
SLC2A1
GTSE1
SCAPER
GPA33
INPP4A
UBE2H
MYB
RAD54L2
LSM7
FCGR2A
NRG2
CAPNS1
THRA
ITIH3
SLC9A1
CCPG1
MCM2
FOSL1
XDH
ABCB1
BCR
PSG7CRYBA4
GPR161
CRX
CDC2L5
ELAVL3
psiTPTE22
CDR2LRAP1GAP
HTR2A
ELN
MC2RNTSR2
ECE2
FGF2
SEMA3A
TRIM3
TNFRSF25
PAX8
APOBEC3C
LOC100134331
LSM4
E2F4
TAZ
OGG1
RRP12PRH1
PFKFB2
ATF6B
HTR7RIMS2
ZKSCAN3
U2AF2CD34
NCDNFGD1
GRM4
MR1
AHDC1
TNP1TLX2
ARPC4
LUZP1
PTPN9
RPH3A
PSG1RUNX2
FAM189B
RUNX1SLC6A3
17
Lung-Michigan: FuNeL – C2 Lung-Michigan: FuNeL – C4
MYT1PDGFRL WNT5A
EPS8MLC1
FOXK2
IER2
RB1
PCNA
MRPL19
HAX1
SEPT8 COPB1
BGLAP
CYP2D6
F13A1
USP14
PTGER4
STIL
POLR2B
C2orf3
RRS1
DES
COX7B
SNRPEAQP4
MCFD2
PSMC1CBR1
PRIM1 ASNSCLPS
GPA33
TNCGUK1MUC2USP13
GIP
SUZ12ARVCF
CPT2 JUND
RCAN1
NDUFV2
FGF13CYP1B1
MT1P2
PKMYT1
MYB
MC1R CAMK2G
FRMPD4KLK1HTR1E
SNRPN
MEOX2
GTF2H4
FMO2
HBEGFME1
GPKOWIRS1
CDR2
CRABP2
PLP1
BTG2
HLCSPIK3CA
RAD21
GALNSGNL1
IL11RGN
MAN2C1FUT1
ANKRD36B
EBNA1BP2 HSPA6
ZC3H3PCK1
MAN1A1
S100A3
STK3AGTR2
RASA2
CLEC3B
ITGB1
COPS8
PPBPL2
PML
CTDSP2
PIGH
FOSB
SF1
SLC5A5 TSN
FCGBP
CA3
MEST
CYP4A11
CD300C
ALDOC
TAF13 PTMA
NEK4
CHRNE
KRT2
MTMR11
TAP2
DOC2A
MT1G
FDFT1
STARD3
ADRA2C
CYB561
FKBP1A
NR4A1
TNIP1
PITPNA
SFTPC
ZNF592
CTSE
TPM2
COL11A1
DPYSL3AFAP1
ATP1B2
EDEM1GRIA1
KIAA0226
FMO1
NELL2UBE2A
FBN1
RUNX1
LGALS1
COL6A2
RPL28
KIF5A
GAMT
NR4A3
FN1
TSPAN8
ZAP70
RPL19
BBC3
DCT
FASLG
RPL18
COX7A1
PDGFRBCOL19A1
CD99
RNASE4TPM1 TMEM151B
KNTC1
IGJ
NRGNCDC5LTIMP3
THRA
TSPAN7
TGFB3
PTN
FABP5L7
PPIC
HARS
VLDLR
FUT3
COL6A1
FOLR1
ACTA2
ADAM10
TM9SF1
CYP4B1
C4A
COL6A3
CTNNB1
IGHM
ZNF135
HLA-E
GJA5
CD52
ACP5
CES1
DNMT1
C7VCAN
CKS1B
C4BPA
ADCY7
RRM1
PMCH
BMS1
GNL2
LRRC16A
HK3EPHX1
MN1
TIAL1
MEIS3P1
CXCL12
LAMA4AGL
GRM1
LAMB1
RNASE1
TEC
KIAA0114
PAK2
LSSRAB2A
IL8RB
BCL2L2
C8A
JUNBLAMP2
HSPA13
GTF2H1
BET1
TFPI
EMP1ANXA3PPP6C
CHRFAM7A
ACTG2
GLRXCCL13
DNAJB1
GZMM
ELMO1
FGF2
PRKCB
PRDX6
CHAD
APOA1
GP9
PTK2B
FTH1
GUCY1A3
HSF2
GABPB1
SLC2A4
POLD2
DLGAP1
RPS27
IDS
OCM2
GDNF
CELA2B
CUGBP2
PWP2
HLA-G
FUT6
PLA2G16
FABP4
LRRC14DCHS1
H19RPL36AL
NTRK2
OXTR
CDX1LUM
EMP2
TNFAIP6
LTA4H
ETV5
ARHGAP5
PSMD2
TFRC
MCAM
ZNF250
CFPTALDO1
CKMHSPA4
S100A4
ANXA1
RPS21
CD226
GPD2
HOXA1
DDX1EFNB3MAOB
ATP6V1C1
HMGN3
CLEC2B
CHRND
ORC2L
KCNJ8
HSD11B2
SRPK3
RRM2MPO
RDX
PSEN2
KIAA0232
LMAN2
ARHGDIG
TJP2
C21orf2
HIST1H4CCALCRL
OMD
AGER
PDCD2
ARL4DKRT5
SEPT5
EMG1
TRH ALOX12
RUNX2DGCR5
SCRN1
IGFBP6
ALDH1A1PTPRMAKAP12
FOXF1
HBB
STOM
GATA6
MAPK3
ARL6IP1
THOC1
KIAA0513
ANGPT1
CHN1
CPB2HOXB7CD36
NMI
CSNK1G2
FPR1
ABCG1SERPINB2
GBE1
ALDH2
GRK5
OASL EZR
KCNJ15
KCNN3
PIK3R1
ADH1C
PLIN2
CAV2
KRT4
CAT
ADARB1
EMP3
LPL
PLA2G1B
FGR
CLDN5
GPM6A
CA2
CALD1
CFD
CNN1
SPARCL1
GPC3SFTPA1B
AGTR1
VWF
LTB
S100A9
SPOCK2
AQP1
FEZ1
TM9SF2
IL1R1
PCSK1
PPAT
SMARCA1NT5E
CD46
TNFRSF11B
CXorf40BPZP
GSS
IL4R
NUCB2
RPS25
CDC34
FMR1
CLOCKBMI1
PLAT
CAV1
SEMA3F
SSBP1
GBP1SSB
PASK
TFDP2
POU6F1
SEPP1
LITAF
UBC
SELP
SERPINI1
HIST1H4E
CCR6
CNTN1
CORO2A
MYH8LPIN1
FGFR1
MPZ
TRPC1
RUNX3
DSC3
TNNC1
ACHEARPC2
CYTH2PDHX
ITGAE
CYP2A6
ENPEP
COL4A3POU3F2TYRPPOX
MGAT3C18orf1GYPAGNA11
PECAM1
RAC1
STATH
PCGF2
CPM
FDPS
SLC39A9
SPRR2C
DNA2
VRK2
PTPN11
COL10A1ROR2 SQSTM1 CTSC
HLA-DPB1
LOX PTHLHHIST2H4A
FLI1
ZNF638
PDLIM1
SCTR
CHI3L1LYL1
SEC24CBAG1
ATP5D
SEMA3B
FGF7
TLN2
HIST1H3A
CLCN6
AP2S1
PLN
APPEFTUD2
LBR
HERC1
HTATSF1
NUMB
PTGR1HSPB3
ATP5J
UQCRC2
TIMM44
GCH1
RBMX
MPV17
KYNU
ELAVL4
RPP30
ABCB10
HSD17B10NUP214
TGFB1
MYOM1
SF3A1
KIF5B
SLC15A1
LOC96610
ANXA6
GALM
PCDHGA12FAM65B
AKAP10
ATF3 MEF2A
CTBP1ELK1
ADSS
ACSL1
RPS27A
MED21
PKN2
PSMA2
UPK1A
PNOC
ARHGAP22
ERCC2
GATA3RNASEL
AYP1p1
PTGISZFP36KRT31MAP3K5
MCHR1
FUT4 CCNHUBE2V2
IDUA ZCCHC11LRRC17
H1F0
C8B
PTPN9
PBX3
GSR
SEMA4DRXRG
E2F4
ATP2B3
ITGALF10
HPRT1
FBP1KTN1CACNA1DNME4
EZH2 CCL19
PTPRC
SLC9A6
TBX5ST3GAL1
OAT
PPP2R2B
CCL25
ST6GALNAC2PPP1CC
LEP
RPLP1
DDX21
RAB5B
WEE1
KPNB1
HAS1
TWF1
GNA15
KLKB1
ACVR1
BRCA1
ABAT
CEACAM6
HNRNPC
CASP7RHEB CTSDSPTBN1
SAT1ADRB2
PSMA6
TFPI2
TNNT2
CD1C
GLI3
FST
PRPH
EBP
DEFB1
CYP2J2
GALK1
CDS1
SLPI
HBA2
ALDH3B1
S100A8
NR5A1
EDN2
CPA3ABCA3
P2RY14
ETF1
RARRES2
ANXA8L2
FCGR3B
NPC1QB
TAGLNAPBB2
WARSPDXK
SERPINB6 CYBBGNG11
GAL3ST1
POSTNIL15
SIRPD
HMOX1
PGC
EIF2S1
ELN
HOXB5
ATP2B1TBX2CTSKIQGAP1
CNN3
PEBP1
FAM189A2NAP1L1
POLG
SPARC
EIF5A
AADAC
UGP2
CST6CCL15
TNXB
RBP4
MRC1
MYH11
OXA1L
XRCC6
COMT
MALL
ARHGEF6
SERPINA1
IGLL1
RDBP
NDUFB7
CD22IL10RASEPT6FRZB
PPP3CB
HCLS1
MMRN1
RASGRF1
LCP1
GFER
CR2
DLG1
UGT2B4
STXBP2ITK
SEPT7
GABRA3
CTBS
FGF4
CD79B
KRT7
ACOT2
EFNB1
CSTA
AOC3
G6PD
NKG7
NR2F6MYL9SRPX
CUL7
EDNRB
FHL1
PSMD7
FGF8
SPRR2E
GCNT2CRYGD
TUBGCP3
ITGA5
FGFBP1
FPR2
PTGDS
CD48
BTN3A2
PCOLCE
VCAM1
PEPD
MAP4K1NEUROG1IL13RA2
CLCN5INPP5D
PTPN7
ISCU
SLC11A1
MT1ECYP2B7P1
POU5F1 CDC25C
BLMH
PLEK
KRT85
TULP1
HAGH
TBXA2R
CD6
FCAR
ADAR
PBX1
CORO1A
CXCR4
BCL2
MAP1A
DBI
CYP2C9
KCTD2
FOXJ1ILVBLMB
MPST
MYH6
SLC4A2CFHR2
FES
SELLPRCP
IL16
NPPC
SLC39A7
IDH3BMAPRE2
PITX1
F3
TCF7L2SCP2
KCNJ5POLB
LOXL1TSFM
DLX2 GCNT1
CEBPE COASYBST1
TFCP2 ORC1LSTAT6
PTX3
FLAD1
FAS
PUM2
RPL3
CRY1 PSG7
EXT2
NEFL
TNFAIP2
TCP1
YAP1
PIK3C3
NCOR2
PEX19
UCHL1
MARS
KRT9
ATP13A3
BMPR1A
CPD
INPP5A
MUC8
COX17
CLIP1
ANPEP
HIST1H2AD
GAS1
TXNRD1
ITGA9
NR2C1
LMO4
SMARCD2
SF3B3
YWHAB
ACTB
RUNX1T1
RNF2
NCAM1
TRIB1
GEM
AESFAM110B
HSD17BP1MFSD10
THAP11C2
OAZ1
KIAA0020
GNPDA1
KIAA0040
EEF1A1
INHA
ITPR1
KIFC1
CKS2POLR2LGJA1CLTA
SLC2A3SLC25A3
CNTFRGTF2E2
RLF
FDXR
SIM2
MARCH6
PSMB8
MCM3
ACY1
CDKL5
SFRS2 CYB5A GTF2E1 UBE2D3 ABL2 UCP2 CDK10 TAF10 CLP1 PSIP1 DDX39 ZNF84DUSP5
APLP2IFNA5
NAMPT
HTT
TAF1A
NDUFA12
HRGDEK
BMP5
HYAL1
SDC4
ZFP161
IGF2R
CCR4
MSTP9
AP3M2NID2
PSME2
LPO
TFAM
ERBB3
FLOT2
SLC18A2
GSTO1
YY1
PDE4A
HSPA1A
RELA
BTN1A1
SNW1
RBBP6
MKRN3
KLF4
ADD1ETS2ITGB3BP
ETFARARS
FAM3C
ABI2
SGK1 TERF2GATA1
OR3A1BPGM
RB1CC1
MTHFD1
GPR21
PRKAR2B
SUMO1
ANXA2
USP9X
PDXDC1
NRF1
SOAT1
RRAGB
ZNF154
STS
RAD50
IRX4
ACO1
NCK1
RABEPK
KRT15S1PR1
ZNF24COL13A1
ALPLIFI35
CRHR2
CYP3A4
RBMS2
SP140WRN
LGALS7B
SMAD4
GRM4
NOLC1
CDC27
RNF113A EXTL2
GPM6A
GAS1
CYP2C9
ZFP36
HIST2H4A
FCGR3B
IL1R1
CA2
HSD11B2
IRX4
ANGPT1
IL15
S100A4
SLC9A6OR3A1
KRT5
SAT1
FEZ1
CA3
CCL15
RBBP6
ELAVL4
RELAH19
SPTBN1
CCR4
PLA2G16
CAV2
MMRN1
ITK
UBC
RASGRF1
MT1E
ISCU
EMP2
NRF1
USP9X
TAGLN
MPV17
GALM
SLC11A1
ETV5
EFNB3
IFI35
ITGB1
HBA2
CLOCK
FAM189A2
GPC3USP13
RDBP
ACTG2
LMO4
GATA6
ZNF135
TNFAIP6
SGK1
ANXA1MPO
SFTPC
LCP1
HCLS1
MEF2A
FLI1
CALCRL
EMP1
SERPINB6
PDLIM1
KRT4
FMO2
ARPC2
ARHGEF6MYL9
AGTR2 PLEK
TM9SF2
SPARCL1 HSPA4
TNNC1
MYH6
FHL1
CD52
DGCR5
IL16
NEFL
ADRA2C
RLF
MEOX2
FABP5L7
HBB
PIK3R1
COX7A1
SNW1
ADH1C
MCAM
TAP2
TXNRD1
AOC3
HIST1H3A
TAF10
ANPEP
C1QBGTF2H1
PIK3C3 C7
CXorf40B
POU3F2
AKAP12
TNXBELMO1
DCHS1
C18orf1
GDNF
CYP4B1
PPP3CB
SFTPA1B
ANXA3
GP9DBI
S100A9
PTN
TFRC
GJA1ACHE
CAV1SLC5A5
SRPK3
CXCL12RNASEL
ACSL1
BTN3A2
CHI3L1
CTBP1
RBMS2 SERPINA1 GSTO1CLIP1KCNJ8VWF
MYT1
BCL2 NUP214
KRT2
FOXF1HAS1
FUT3
CPB2 UPK1A
GRK5
RRM2
CLEC3BMRC1
KIAA0232HIST1H4C
LYL1LMAN2
RPL36ALHERC1
CD226NUCB2FN1
DOC2A
SEPP1
LTB
PTGER4CFD
LAMP2
EFTUD2
RXRGSCP2ITGAL
KRT31PIK3CA
GYPA
LAMA4
CPA3
IER2
RDX
MYOM1
ARHGAP5
CES1
PECAM1MCHR1
SRPX
RASA2
MYH8
ARHGDIG
INHA
ACOT2
ST6GALNAC2FGF13
MARCH6
ARL4DRAD21
THOC1
UBE2V2
GSS
CYP4A11
RPS27A
FLOT2
POU5F1
BMP5
CCNH
CUGBP2
CPT2
TPM1
HTT
TPM2
PRIM1
S1PR1
CD300C
TYRPPAT
AGER
FABP4
UCHL1
HAGH
GNL1
DDX1
ADARB1
COMT
TGFB1
NCK1
TRPC1
GTF2H4
TERF2
KTN1
COPB1
PBX1
EZH2
BCL2L2
ARVCF
TIAL1
DES
PTPN11
COL19A1
DNA2
AKAP10
PRCP
EMP3
NCAM1
FTH1
SNRPN
HOXA1
CRY1
INPP5D
NDUFB7
Lung-Michigan: ARACNE – SE(C2) Lung-Michigan: ARACNE – SN(C4)
COL5A2
COL1A1
BGN
ITGB5
COL5A1
INHBA MFAP2SPARC
THY1
VDR
PLAU
RARRES2
COL3A1
HTRA1
POSTNWAPAL NR4A2
IER2
CNN3
COL6A1NR4A1
FAP
THBS2
EGR2
HERC1
INPP5D
MAP4K1
MARCKS
COL4A2
ACTN1
VCAN
COL1A2
NCKAP1L
LIPA
BAP1
KIF2A
RELA
ITGAM
AIF1
ALOX5
IFI30
LAPTM5
PSMB9IL18PIAS1
AMPD3
PEA15
CYTH1
CASP1
DPTIQGAP1 GM2A
SELPLG
ARHGEF6
SLA
FCGR3BSF1
MLLT3
FAS
HLX
GPSM3
CCND2
DAB2SRGN
DOCK2CAMLG
MYD88
TPM4
CTNNA1CASP8
PTK2B
KIAA0430
MAP7
KRT18
SAC3D1
SSFA2
PSMB8
LMO2
SLC31A2PRCP
HLA-DRAHLA-DMB
GPNMB
CAPG
HLA-DPA1
CRY1
RSU1
HMGN4
NDUFB7
HMGB1
MGP
TGFBR2
DARC
KAL1AOC3
NR3C1PMP22
ITGA5FBN1
JUNB
PCOLCE
CCR7
ZFP36
LUM
ITGB7
DUSP1
ISCU
RHOH
COL6A3
CPZ
GDF10ARID5A
TIAM1
HSD3B2
ID2B
FBLN2
LCK
FMOD
LAMA2
ZFP36L2
CALD1
C1R
CYBA
IGSF2
EGR1 TAGLN
LOXL1
CDH11DYRK4
CD72CD2
TLR1
IL2RB
UBE2D2
TRD@DDR2
FN1
ATF3
DPYSL3
CORO1ARUNX3
IRF1
TRBV5-4
EML1
STAT4
DDX5
CD3D
LAMA4
CCL5
CD19
CTGF
EGR3
TPM2
LGALS1PDGFRA
MMP2
C1S
RCAN2
ACTA2
FEZ1 ICAM3
CD6
GNG10PDLIM7
HMHA1
CTSK
COL6A2
MSH3
PRMT2
ZNF266
TAF12
C12orf32
PHF3
GTF2A2 ATP6V0D1
GLG1
EIF4G2
APOC3
CREB5 RARA
RBM6
MYO7A
RAB28 RBBP6NR1H2
BNC1 HEXA
MORC3 STIP1 LZTR1
RCVRN
HTR2A
CRIP1
ZNF638E2F4
TULP1
IARS
ACP1
RCC1
GNA13TDG
UBE2V2
DHX9
HSPE1DDX1
MYL3
MORF4L2
AZIN1COASY
PRDM2SSRP1
AKT2
SKI
ATP6V1C1
CALB2
XRCC5
NMT1NAP1L4
TFCP2 TFAMKDM5A
DLD
IFNAR1
SILVRAB9A
TSG101
HLA-DPB1NCF2 LST1
ACP5
GRN
FGRAPOE
LILRB4
FCER1G
SPNCCR5
PTPN6
CD247
CYP3A7ITGAL
SLC1A3FCGR1A
PIK3IP1 CD52
LCP1
GNPDA1
IL7R
APOC1
MNDACD86
CSTA
SPI1
FGL2
IL10RA
HLA-DQA1
PTPRC
CYTIP
C1QB
LCP2ZAP70
HLA-DMA
ICAM1VAMP2
ANXA6
CD69 IRF2TRIM22
CD74
SYK
IL2
PRNP
GZMA
CCL4IL2RG
TNFAIP3
ENG
CCR2
PTPROHLA-E
ATP6V1F
ITGB2
ARHGDIB
PSAPBTKCD83
AOAH
SEC14L1CXCL12
PECAM1
VIM
SELP
ATP6V1B2
GYPC
RASSF2
PIK3R1
A2M
PAFAH1B2TGFBR3
ADRB2
STOM
SLC8A1
PPP2R5C
FRY
TGFBI
ATXN1
MYL9
FCN1
IGFBP4
LYZITGA4
MAP4
RPA2
ARHGAP19
PSMC3
BTG2
PDE2A
CRYAB
RNASE6
CSF1R UBC
LTC4S
SMARCA2
ALOX15RAP1A
TANK
EMP3
RGS19
HCLS1
STAT5A
LRRC32
SDCBPJUND
CYR61
HIC1
SERPING1
IL15RA
MOBKL1BRTN1
GRM8
PDE4B
CIRBPCFDFILIP1L
CX3CL1CXCR4
NR1H3
TCF4
PLEK
TIMP3
C3AR1
CD14
IL16
CD37
LSP1BCL2A1
TBXAS1
ZKSCAN1
NCF4
SIRPA
GNAI2
LRP1
IFNGR1N4BP2L1
ARHGAP25 RAC2
CD48
CORO1B
CD33
GLIPR1
CD53
CD59HBA2DPYSL2
CREB1
MAOB
ITGA8
ARSA
CAT
MYH10
VWF
ARPC2
FHL1C7
SCGB1A1SERPINF1PPAP2B
GPC3
SFRS7
HSD11B1
PROS1DPYD
NBL1
FMO3
SSR4
ANGPT1
PTGDS
GRK5CYP4B1
TRPC1
FMO2
XBP1
INPP5AADARB1
SPARCL1
MFAP4
COPB1
PTPN7
CHRM1CALCRLCDH5
HLA-C
PLOD2
MAP4K5SNAPC1
ATF6
LOC91316
CCL15 CPA3
LMOD1
MYH11
TPSAB1
COX7A1
CFH
SRPX
U2AF1
HOXA5P2RY14COL4A1
SYT5
PRG2
NTN3
HERPUD1
MYO1C
GABRB1
PSMB4FERMT2
ADH4
GTF3C2
CHRM4SNX19
FEZ2MYOM1
SCAMP1
MAP3K11
BRAF
AHRS100A10
SNTB2
ANXA2
CTSHCSNK1E
SLC35B1
BIRC2 PKNOX1MOGSP4HB
TIAL1
ZYX
NME1
COL16A1
INPPL1
GPS1
PPP2R1A
NONO
IL10RBSRPK1
TRIP10
SUMO1AFF2
GHRHR
CYP1A2
ARHGDIA
OPCML
PPY
BAK1NR5A1
KIAA0125
TNFRSF17
SLC25A16
RPL41
IGL@
IGHG1
CD38
MS4A1 IGHM
CD79B
CD27POU2AF1
CD79AGINS1
ZNF136XPCPUM2SEPP1
TGFB1 SP3IL11RAMIF
PURA
FLII
DPF1
RAD21TIMM17A
PRDX3
ATP2A2
FMR1
PSMD1
NOLC1
ATP5G3
SFRS2
HNRNPF
TWF1ARL3
PSMD14
MUT
CLNS1A
SLC25A11PSMD12
BCL9TLK2
C1orf61
DEFA4
POLR2G
FAM38A
PUM1
RYK
UNC119DNM1L
RBBP7
EIF3AESR1
UBE2A
CCT8
LMAN2
NUMB
UBXN4
PTPRF
MLANA
SART3
PCBP1
XPO1
SLC25A3
CDC42EP1
NKX2-1
PMS2L3
AKAP6
RHEB
MPO
STMN1
AADAC
LMNB1
KIF2CMCM2
MCM3 RPL8
PHYHIP
PGK1
CDC20
TYMS
NEK2
MCM6
DLGAP5
MCM4CDC2
HMGA1
CSTF1
KPNA2
CHERP
CDKN3
DTYMK
CKS2
VCL
ZNF133
EPRS
GALE
EIF4A3
TBL3
MELK TTK
TROAP
ATP5B
RGS3
GTF2E2
LBR
CKAP5
CTPS OCRL
PRKDCNUP88 SSB
SRP9
CLTC
RAB1A
PHKA1DDX10
MRPL3
TTC35
RPL6
RPS10
PSMA4
N4BP2L2
MCM7
PSMD7
ARL1
BRF1
PKN1
TMEM131
SMARCA1
MUM1MAN2C1
UBE2S
PALM
JUP
PMS2L5
HADHB
GSPT1
NAMPTPSMA5
SMC1A
RBM10
EIF2S2
YARS
EZH2RPP38
DAZAP2MRPS22
SFRS3
MAN2A2
NNT
CCT3 ATP5C1SNRPG
BYSL
MRPL12
KIF11
CKS1BSLBP
RFC4MKI67
H2AFX
GGH
MYBL2
PSEN2
NCAPH
MRPL19
PCNA
FLAD1
HNRNPA0
ILF3
COPS5
CYP51A1 SNRPB
CENPAPAICS
PRLR
PLK3
GIPR
CCNB1
FADD
ANXA5
CDC123EIF3B
HUWE1
PSMB1
CCT5
TCEB1
SNRPE
KLHDC3
CBX1
FH
CCT7YWHAQ
CENPE
AIMP2
PCSK7
PPP1CC
TARBP1
SFRS9
CLIP1
TM9SF2 THOC1
UBE2H
AGPS
RANBP2
COPS8PFN2
PWP1
GUCY1A3
CST3
KIF14
RPS21
LDHA
REEP5
TCP1
CYCS
EPHX2
CSNK1D
LMNB2
COX7B
ERCC5
PDHX
PIK3CA
GLUL
HPCABAT1
TUBB3
EMG1
GNL2USF2
H2AFZ
DGKG
STX1A
SON
KIAA0232
ACTG1
CCNA2
HNRNPA2B1
PCBP2
SNRPF
NARS
UNG
LOC150786
ATP6V0A1
SNRPB2
ATP1A3
MTHFD2
UBE2N
CTCF
PTPN11
PSMB7
YIF1AGOT2
NUP153
ALDH18A1
TRAP1LSM1
ELAVL1
TROVE2PPIF
EBNA1BP2
PCCB
CDC25C
TP73
C20orf24 GRSF1
GART
DYRK2
PSMA3
SRP19
CSNK2B
CCNF
CENPF FOXM1
KPNB1
DAP3
SPTBN1
CHAF1A
TK1
P2RY10
RRAS
DNASE1L1
IL8RBVARS
PSMB2HMMR
BRCA1
SRPR
ARCN1
GK
LOC96610MTIF2
FGFR2
KIAA0101TRIP13
SLC2A1
AGFG1PKM2
TOP2A
NDUFA4
BBS9MAPK1IP1L
CXorf40B
RBBP4
GTF2H1HCFC1
CHIC1
CCL17 CDK9TEF
EXOSC10
MAD1L1
ADRA2A
DRG2
NXF1
RAB21
SSTR3
MAP2K1
ADRB3
GPR162 TRIP12
DDX39
SSR1ZMYND11
HOXB13
HSD3B1
PSMD6ATXN10
PROL1 C1D
RCOR1 CLCN3
NEUROG1PHB2
THAP11
IGLL1
ETV2C4orf10
ROS1
SIGMAR1ATP5O
AANAT
GP2ACSL1
NPDC1HSD17B10
GIP
FLT3LG
RFC1
BNIP2
WTAP
RRP1B
CHRND
CAMK2G
ARHGEF12FRG1
RAB2APIN1
GLI3
MAPRE2
CNP
PPM1A
CALCOCO2
B4GALNT1POU2F2
BCL2L1
HBZ
PPP1CB
JARID2
DDX18PCDH1
PPP3CB
PIK3R2
SETMAR
EP300
APC
IMMT
KIAA0240
ATXN2PSMA1
MYO5BCUL4A
RBM3
DYRK1A
LPCAT3
H3F3A
SNRPA1 FCGR2B
SUMO3
TLN2
FNTA
NCOA4
WWOX
IFNA21
CETP
TKT
AVPR1B
ACTC1
PEX1
SMPD1
FUT6
NHLH1
TNNT3
L1CAM
ZNF592
DRD2
GPR3
SNCB
GTPBP1
LIMK2SMAD7
MAP1A
MYLPF
NUP188
MT1A
FUT3PAX8
LIG3MICB
GNB3
CAMPLOC100128792
C21orf2
GSTM5 EBI3
MSX1
OXT
KCNJ4HTR2C
MT1X
RABEPK
ZNF410
CYP4A11
DNASE2
ZNF142P2RX7
GCK
TNNI1
PRKACA
SAA1
MSNTIMP4
NKX2-5
PSMC5CDSN
PTPRN
DBT
ITPKB
ADAM15
ACAA1
PLOD1
DAP
EED
MAP3K4
CYP2C19
ADCYAP1
CD36
PLCD1
TUB
MYL5
ZWINTAS
DRD1
HLA-DOA
FES
IL3RA
KIAA0195
GYPBLOC201229
OR2H1
THRA
ZBTB17PSD
NTSR1 JAK3
HTT
MAPKAPK2
CKM
USP11
DRD4
CTRC
IL17A
PIGF
KPNA3CIR1
FIG4
HARS
SCN7ALPP
ZNF8PPT2
CDC2L5
LAMP1 ADAM9DNTTIP2
MTF1
CNTNAP1LOC728849
DHX8
RAD23A
SP2
SDF2EIF2S1
ATP5F1CSTF3
MTHFD1PAX6 CCL22
MYH6GPER
C4orf6KIR2DL4
RPS9RPL18
RPL32RPS19
RPA1 EEF1G
KLHDC10ERCC3
COX17
MYOD1
RPS5RPL28
GANAB
RPS28
RPL29 RPS15
PTS PDXK
PIGAPPARD
RPSAP12RPS8
MED1EWSR1
RPL13A
WIPF2TIMM8ANAP1L3KRT2 TBCEZNF91NUTF2 RAB4A ZMYND8ARAFGTF2F2ANKS1A GGCX HMGCL HNRNPH1 TAF7 TTYH2MTORMAP3K5JAK1FAM120ARAC1IFNA8ANXA7 HSP90AB1 IL13RA1SLC35A2CCNHHIST1H2AC RPL11 RPL3 ADAM8 HNRNPH2 BRPF1 KDM5D RPP30RPS4Y1NQO1 HIST1H1CCLP1RGNRORCBMP4 ERBB2 WNT2B CDK5R1 IDUA RASA1 EFEMP2 RPL7A EPHA2ZNF273MMP3RPL35 EXTL2 RBM42 COX6B1 PRMT5 GNAT2APEX1 UGP2 DYNLL1 BMS1GSS
IFI35
HK3
IRF9
CCL3
PSME1
LHFPL2
PLA2G7
PTAFR
CYBB
IRF8FLI1
EVI2B
HLA-DQB1
IL32FYB
TGFB3
IGHD
HLA-DOB
TRIM21
IFI44
IFI27
MX1
NR4A3 VGLL4
FOSB
AKAP13HLA-DRB1
C6orf142
OSTF1PSME2
UBTF
ITGAV
CXCL9
HCK
FOLR2
MRC1
FBP1
RHOG
ALOX5AP
MAPKAPK3
IFI6
CCL13
CXCL10GBP1
CXCL11
CENPB
KIFAP3CAP1
SYPL1
C2
RBM25
RPS6KA1HLA-F
TIMP2IFNGR2
MUC1
SLC28A1
CBFB AHNAK
ENPEP
PSMB10
BTN3A2
TMSB10
SLC10A3
TNFRSF1BMPP1
CCR1
C5AR1
LGALS9
ATOX1
RFTN1
BAT3
AGPAT1
ABL2
GAPDH
PARP1
COL18A1
EHMT2
SOD3
ENO1 TMED10
HAGH
AHDC1
NUP62
KRT8
DSC2
GPIETF1
TUBB
DMPK
ESR2
CDH12
SLC11A1
GHRH
LSR
DRD5WARS
SKP1
CFDP1
PIGHNFYC
TAP1
DDR1 MAPK14
RPL31
WIT1EVPL
NFX1
RPL27A
EIF3E
RPS15A
EIF3F
RPS18
RPS3A
RPL10A
RPS6
YWHAH SFTPA2BYWHABPABPC1GPD1LCDH1
ABCA3
E2F5
TCF3 GTF2E1
ITGB1 SFTPA1BSFTPB
NAAA
NFIC
GOLGA4
PAFAH1B1
RPA3
EPS15
LRRC16A
MTMR6
SEPT7
BCL2
GNAQ
DMD
MICA
TNXB
GNB1
CCNG2ADH1C
CD302
HRG
SNRK
ELF1
TUBG1
NPEPPS
TIE1SERINC3
TCEA2KIAA0141
ST14
RDBPWWP1
PSMA2
SHC3
TOP2B
SFRS5
LTBP2
CCT6A
MYL6B
LAMB2
TMED9
RPN2
MYL7IP6K1
SLC1A5
NFIL3
PSMB6
NUDT1
ARD1A
SF3B2PPT1
NFKB1RASA2TOP1 HSPA1L TCN2 FAM190B GABRA6LASS1 HSF1ACY1
TCF12
UQCRC1
FUT8
KCNJ9
GABRQ
RALY CTBS ZPBP SREBF1NCSTNMRPS31 MBTPS1 LARP4B ATXN3 COL14A1DHCR24 IL9 SKIV2L WFDC2 MALL BNIP1 RPS23
ELAVL4 KRT76
TALDO1PGD
BLOC1S1
AKR1C1
RB1
TNFSF8
IGFALS FHL3NCOA6
TM9SF4 AKR1C3SLC19A1
GALT
SNW1COX7C HRH1 ACPP TAF1A PPIG TAX1BP1 AK1 VASPIRX4 TNFRSF8 GRM1 HSPA9 HARS2 UBXN8 SUPT5H
C7
CD302ADH1C
MGP
LAMA2
BCL2
TGFBR2
NME1
DPYSL2MYH11
P4HB
AOC3
FEZ1
ITGA8
COX7A1RPN2
PTGDS
CPA3 MFAP4
LTBP2
A2M
VWF
SPARCL1
SEPP1
FHL1
ANGPT1CFD
TPSAB1
HLA-DQB1
HLA-DQA1HLA-DMA
HLA-DPA1
HLA-DPB1
HLA-DRB1
HLA-DMB
CORO1A
CCL5
GZMA
MAP4K1
CD48
RUNX3
CD2
CD3D
CYTIP
FGL2
LCP2
ACP5
NCKAP1L
FYB
IL2RG
FGR
CCND2
IL2RB
LCP1
HCLS1
CD74
RNASE6
CD14
PTPN6
ARHGAP25
CD37
TRBV5-4
TRD@
LCK
COL4A1
THY1
COL5A1VCAN
COL1A2
SPARC
COL1A1
COL4A2
COL6A3INHBA
THBS2
FAP
COL3A1
POSTN
CYR61CTGFPDGFRAPWP1UBE2NNOLC1
THOC1
RBBP4
TPM2COPS5
COASY
ACTA2
MYL9
HLA-F
PSMB9 EBNA1BP2TAGLN
MRC1
ALOX5AP
HCK
GYPC
CXCL12
THRA
LIPA
ITGB2
LST1
ITGAM
MNDA
ARHGEF6
JUNB
IER2
POU2AF1
LOC91316
CD86
SIRPA
CYBB
AIF1
SPI1
GM2A
HLA-C
IGL@
IGHG1 GBP1
CXCL10
STAT5A
DAB2
BTK
EMP3
FCGR3B
SRGN
CD52
APOC1
C1QB
FCGR1A
C3AR1
FCER1G
LAPTM5
GLIPR1
EIF2S2
SNRPF
MTHFD2
PLA2G7
CSF1R
PEA15
HLA-DRA
MPP1
APOE
GPNMB
DOCK2
TDG
PSMB8
TAP1
ITGAL
LSP1
CAPG
TCF4
CCR1
NCF4
VIM
IFI30
EVI2B
SLA
IL10RA
PLEK
ZAP70
PTPRC
CD53
HLX
RASSF2
ALOX5
NCF2
LMO2
SELPLG
RPS28 ACP1 GRSF1 KRT18 KRT8 LUM CDH11 SFTPA1BC1R COL6A1 COL6A2 PCNA SNRPB TCEB1 AZIN1 RPS9C1SNXF1PCDH1C4orf10JUNDBTG2CIRBP
CDC2
KIAA0101
H2AFZ
CKS1B
CCNB1
ADCYAP1MSX1
AVPR1B
MELK
DUSP1
ADRB3
ZFP36
TRIP13
CENPA
KIF14
KIF11
TYMS
MCM6
CDC20
TOP2A
TK1
AIMP2
CCT5
SFTPA2B RRAS SPTBN1 KDM5D RPS4Y1 SLC2A1 GAPDH RPL18
CD79AEGR1
L1CAMTEF
APOC3
NTN3 CXCL9
SMARCA2RPS5 HNRNPA2B1 HMGB1 PIK3R1 HTR2CNHLH1 ATP2A2MRPL3
CCL4 CCL3 SFTPB NAAA NBL1TIMP3
18
Lung-Michigan: PCC – SE(C2) Lung-Michigan: PCC – SN(C4)
ASS1
PI3
IL11
KRT6C
RASGRF1
GRIK3KRT5
ALPI
PRKG1SDS
TRD@
ZNF132
PRB2TAF1
TUB
NR5A1
GPR68MPZ
PLCD1CD2
TRBV5-4
IL8RB
NHLH1
EPHA1
CHRM1
LTKIL17A
SPRR1A
TNNI1
FUT1
MATK
RAB33A
MLH1
TNFRSF25
HTR2CTHBS3
NTN3
UROD
MAGEA11
ACVRL1
LEP
COL11A1
P2RY4
C4orf6
ASCL1 IFIT1 MX1CALCA PCSK1
LY9INPP5J
PKMYT1
KLK3
PDCD1
MAS1
IKBKE
LOC643332
CD79AGNB3
CCL22GP9
DRD4
FOXO4
EFNA3
FUT6
TBX2
SCNN1G
HNRNPLGNAO1
PAX6
MYBPC2
PRKCGPRRX2
CSN3
EVX1HAAO
SYN1
PLK1
KCNQ1MYO1F
CRYBB2KCNJ4
MAP3K14
ALX1
GSTM5PSD
C21orf2 KIAA0195GFAPGUCA2A
HTT
HLA-DOA
BCAM
CSN2
RBP3RHOD
FES
OXT
CG030
OMP
CCR9
OR2H1PSG7
IDUA
RARBSOX2
PRKCBKRT4
ATP12A
GSTA2
CCDC88A
GRIA1
FGF2
KCNA3
CACNA1A
GYPA
SERPINB2
CACNA1C
CRHR1
IFNA6
CTAG1B
SRP54
PSMA6
PCCA
CALB1
NNAT
CGA
CDH15
CHERPINPP5E TACR3
LOC729991
MYL4
MLL3
ZNF33B
AKAP1
ELN
PLXNB1
HSF4
PMS2L11
SIRPD
NTSR1
AQP2SLC17A4ETV2
CD53 SNAP25CNN1ACTG2CXCL5 CXCL3PTPRCMYL9 TAGLNTFF1 MUC5AC
STARD3
SPARC
APOB
SCGB2A2
AGTR1
TF
GRB7
ERBB2
INSL4
AGT
PLGLB2
TIAL1
S100A9
INHA
IFI35
ERVK6
CST2MAPK1UBE2L3HCK
METTL1
CDK4
AKR1C4
VIL1HPD
APOA1
DEFA5
NEFH
TSPAN31
WASL
SCG2CHGAMAGEA3MAGEA12KRT13NPY1RNFATC3ZNF185CCL15VWFCST4
GNGT1
ACVR2B
ID1
STAT5BITK CR2
COL1A2
COL1A1
COL3A1
CXCL10
CXCL11
GBP1
APOA2
AHSG
TSFM
TAC1
IGFBP1
STRA13
FABP1
RBP1
CGREF1
APOC4 CCT2
LST1LTF NRGN ITIH2 CRISP3 CD48 CD37F10
DCHS1
LOR
KRT76
HOXA9
GPLD1
PGK1
KEL
BMPR1B
CELA3B
MSX2
CYP11B1
CACNA2D1
MLN
FBXW4P1
AOC3
FHL1
COX7A1
CLEC3B
FABP4
FMO2
ADH1C
SPARCL1
GAS2L1
PRM2
MYH10
CPLX2
HLA-C
MT1L
PRRT1
SSTR1
AMELX
CLCNKB
FEZ1
HR44
MYL2
MATN1
NOS1
SSTR2
CTRCSSTR3
PLCB2
BMP4
NEUROD2
GRAP
HSD17BP1
CD3E
SFTPC
AGER
ANGPT1
A2M
FOXF1
CAV1
METTL1
CDK4
TSFMPI3
KRT5
RASGRF1
PRKG1GRIK3
TAGLNASCL1 CD53TSPAN31 CXCL5MYL9 CXCL3PTPRCCALCACOL1A1
COL3A1 KRT6C
COL1A2
SDS
SCG2CHGAMAGEA3MAGEA12KRT13NPY1RNFATC3ZNF185 ITIH2F10 LTF CRISP3SNAP25 NRGNACTG2 CNN1CXCL11 VWF CCL15CST2 CST4CXCL10
PDCD1
PRB2
RAB33A
TRBV5-4
ALPI
CD2
UROD
P2RY4
LEP
MAS1
MPZ ACVRL1
MAGEA11TRD@
IKBKE
GNB3
FOXO4ZNF132LY9TAF1
GPR68
NR5A1
COL11A1
CCR9
GUCA2A
HNRNPLHAAOCCL22
INPP5JGP9
OR2H1
KCNQ1
MATK
FEZ1
WASL
CG030
PLK1
FUT6
CD3E
EFNA3
MAP3K14
CRYBB2
MYO1F
KRT76
KLK3
TUBPSG7
OMPTNFRSF25
EVX1
MLH1DRD4IL8RB
GFAP
PAX6GSTM5
HTTRHOD
PSDGNAO1
CSN3
KCNJ4FESSCNN1G
SPRR1A
ETV2SYN1
PRKCG
PRRX2
BCAM
GYPA
HSD17BP1
SLC17A4
MYBPC2
MLN
PLXNB1
CHRM1
NHLH1
PLCD1
FUT1
BMP4
PGK1
LTK
HCKLST1 SRP54CD48 MAPK1UBE2L3PSMA6CD37
SCGB2A2
STARD3
IL11
ERBB2
AGTR1 ASS1
INSL4
TF
IL17A
GRB7 APOB
NEUROD2
SSTR1
MYL4
HOXA9
HSF4
CTRC
SIRPD
RBP3
CSN2
GRAP
PLCB2
OXT
SSTR2THBS3
HLA-DOA
C21orf2
HTR2C
KIAA0195ALX1
ITK
S100A9
ACVR2B
ID1
GNGT1
IFI35STAT5B
TIAL1
PRM2
CPLX2
HLA-C
MYH10
MT1L
FABP1
AHSGTAC1
CTAG1BCGREF1
APOA2
CALB1
NNAT
APOC4AKR1C4
PLGLB2
DEFA5
APOA1
IGFBP1
INHAHPD
ERVK6
NEFH
RBP1
VIL1 PCCAGAS2L1
FHL1
AOC3
COX7A1FMO2
SPARCL1
ADH1C
ANGPT1
FABP4
SFTPC
AGER
CLEC3B
A2M
AMELX
ZNF33B
BMPR1B
HR44
PRRT1SSTR3CELA3B
INPP5ECDH15
CACNA2D1
MATN1
MSX2
ELN
AKAP1
CLCNKB
MLL3
GRIA1
CRHR1RARB
SERPINB2
IDUA
KCNA3
CACNA1C
PRKCB
TACR3
PMS2L11
CYP11B1
CHERP
NTSR1AQP2LOC729991
GSTA2
ATP12A KRT4
FGF2
SOX2
Lung-Michigan: MIC – SE(C2) Lung-Michigan: MIC – SN(C4)
SEMA4D SYKMX1SFTPA1B IFIT1SFTPA2B IFI35PSMD4 EIF2AK2ISG15IFI44IFI44LILF2ARHGEF16 HLA-FZNF592
ITGAX
FGR
PARP1
SNRPE
SRPXDPYSL3
HIST1H2AC
PDGFRA
PMP22
NBL1
HIST1H2BE
MMP2
HIST1H1C
BOP1
KDM5D KPNB1
EIF2S1
PUF60
SUZ12
PSMA3RPS4Y1
PSMC6
XIST
CYC1
ACTA2
TAGLNACTG2
MYL9
FSTL1
C1R
C1S
CENPA
CDC20
TYMS
MCM7
CSE1L
CDKN3
H2AFZ
AQP1
MCM4
MCM2
NCAPHTTK
STMN1
CYCS
ATP5G3
CCNA2
TRIP13
HNRNPA2B1
PLK4C20orf24
EPR1
LOC96610
CDC2KPNA2
PCNA
MKI67
DTYMK
CCT2
UBE2S
HMGA1
SNRPB
LTBP2CDC6
CSNK2A1
CKS2
UBE2I
MCM6TK1
RFC4
MYH10
MYBL2
CENPF
UBE2C
UNG
EGR3
ZFP36
NR4A1
NR4A3FOSB
ATF3
PTGR1 GCLM
AKR1C1TSFM
PGD
G6PD
TALDO1
IER2
EGR1DUSP1
GPX2 JUNB
AKR1C3
NR4A2
EPHA1CRYBB1UBE2NRAB21EPRSRABIFRANBP1PTGES3IL6SLC2A3NONOUQCRB HSF4 DECR1 HBBRPL31RPS9 COL6A1RPS28 HBA2KRT5AANATYY1 CFTR RPL7A KRT6CRPL35TOMM20 RBL1CAPGPRKACG FILIP1L DPYSL2 RRASEDNRA PURAVIMTIMP3RPL3RPL10ACOL6A2
YIF1A
COASY
PTGDS
XPO1
CCT4
COPS5
RASSF2
MS4A2
PSMB9
GBP1
TAP1
PSMB8
CXCL9
BTN3A2
SPARCL1
CCNB1
ANGPT1
MTHFD2
TCEB1
A2M
TNXB FMO2
VWF
LOC150786
LRP1ITGA8CFD
CLEC3BPTNCCL15
UBE2V2
DARC
EIF2S2
SERPING1
PSMD14
CXCL12
GYPC
ANXA6
CD83
MRPL12
GPD2
CCT7 TUBG1
RARRES2
AIMP2
CCT5
PHB GAS1
ELN
GOT1
KIF14
KIF11
MELKKIAA0101
LMOD1
GAS6
FOXM1
PSMA5
C7
SEPP1XPC
FHL1
ADH1C
FMO3
MYH11
CPA3
CKS1B
NME1
MGP
LTC4S
CRYAB
COX7A1
EIF4A3
MFAP4
AOC3GAPDH
PAICS
H2AFX
SNRPF
TPSAB1LDHA
POLR2J2
CSNK1A1
EZH2
CTPS
PPIA
RARS
SHFM1
UBA7
TROAP CKAP5
PSMB5
RCAN2
CYP4B1 CD302
TGFBR2
PRMT5
NEK2 KIF2CTOP2A
ARHGEF6HLA-DMA
NCKAP1L
HLA-DQA1
BTK NCF1C
ARHGAP25
INPP5D
PTPRC
DOCK2
CCL19
CD48IFI30
FGL2
SIRPA
CD37
CCL4
NCF2
LTB
SELL
ALOX5AP
LIPA
GPR183
CD14
MAN2B1
ALOX5
CSF1R
LST1
HLA-DRAACP5
HLA-DMB
NCF4SLC31A2
FCGR1A
CCR1
ITGB2
CYBBAIF1
APOC1
PLEK
EVI2B
IL10RA
CYTIP
CD2
TRBV5-4
IL2RB
ICAM3
LCP2
FYB
PRKCB
CORO1A
CXCR4
LILRB4
PECAM1
LCP1
TRIM22
HCK
GPNMB
ITGAM
SPI1
RNASE6SELPLG
GLIPR1
ATP6V1B2
C3AR1
GM2A
HCLS1
MAP4K1
MNDA
SRGN
HLA-DQB1
MRC1
CD74
HLA-DPB1
HLA-DRB1
SPN
LMO2
HLA-DPA1ITGAL
CD53
CD33
HSP90AA1
C1QB
FCER1G
LSP1
LAPTM5
CD52
PTPN6
CACNG1
HPS1
HMOX2
MNT
GCM1
ADCYAP1
L1CAM
SSX5
ASNA1
CTCF
SLC9A1
CPA2
SUMO1LY9
EFNB1
MVK
IRF5
PIK3CA
EFNA3
P2RY4MAPK1CTTN
ACTC1
PIK3R2
MSX1
GH2
PTPN11
MAPKAPK2 C21orf2
CSN2
ALPILIMK2
PIGR
SRPR
MNAT1
KRT18
AQP7
ARCN1
KRT8
IL3RA
NXF1 GALE
PRKACA
NPDC1
ARL1
TDG
SLC33A1
ATP2A2
RENBP
PHB2 C4orf10
IGL@CD3E
HLA-DOB
AZU1 CCR7
LOC91316
TNFRSF17
PCDH1
CD79A
IGHG1COL16A1
IGKV1D-13
IGJRPS7
RABGGTB
KIAA0125
HLA-C
HERPUD1
SRP72
POU2AF1
MUM1
ATP13A3
HRC
IARS
IGHM
CD27
MS4A1
WWOX
CD6
TIAL1 USP14
NTN3
CD19
GNAT1
SPTBN1
SSBP1
CORO1B
LCK
CD3D
EMP3
APOE
IL2RG
MRPL3
CCL5
P2RX5
OXT
TAF12
TSG101
EXOSC10CXCL1
C4orf6
TRD@
TBX2
CLCN7
BCAM
PRB2GUCA2A
KLK3 NTSR1
CETP
MYO1F
TNNI1
NKX2-5
GP9
TH
ZNF132
LIFRCDH15
GNAO1
BMP4
HADHB
GZMA
PLA2G7
LGALS1
SHC3
KRT76
SPTB ATP4A
ETV2
AXIN1
BMPR1B
GSPT1
SYN1
PSD
IL8SIRPD
GNB3
HOXA9EVX1
MYL5
DRD4HTR1B
THBS2
CALD1
MFAP2
THY1
CDH11
LUM
HTRA1
LOXL1
KISS1
CTSKFRG1
HAAO
PDGFRBFBN1
MYL4
SLC17A4CUL1PTPN9
KRT86
COL4A1
COL4A2
VCAN
COL5A2
INHBA
COL5A1
COL11A1
REG1A
CHERPCYP17A1PAX6
TRIP12
TACR3 SSTR3
CACNA1B
GRIN2C
IFNA21
ACTR1A
TWIST1GPR3
UGT2B15
GPER
SCNN1GHTT
C1D
SSTR2
ABP1
CGBCYP4A11
AVPR1B
DRG2
FAP
COL6A3
COL1A1
COL1A2
COL3A1
POSTN
SPARC
PLAU
CAMK1
EBNA1BP2PSME2
JAK3
OR2H1 MT1L
SSTR1
EN2
GFAP
PMS2L11
CTRC
PPY
CHRM1
AMELX
MLANA
C16orf35
NEUROD2
CXCL10
HR44
ITGA5
CELA3B
MSX2
KIAA0195
MYBPC2GYPA
GRAP
KCNJ4RBP3
OMP
GNAT2
ATP1A3
FHL3
DNTTIP2
PTPRS
GRSF1
SCN1B
IKBKESNCB
PRKCG RHOD
PDCD1
SPRR1A
FEVAQP2
CLCNKB
LOC729991
MATK
FOXO4
PAX8
SAFB2
IL8RB
CLIP1
PRRX2
CTAG1B
TFDP1 TEF
HNRNPL
LTKADRB3
DRD2
TNNT3
APOC3
CCKAR
GSTM5 DEFA5INPP5J
RBBP6
PSG7
TUB
CHIC1
MLH1
FES
UROD
THRA
PTCRA
HTR2C
NHLH1SSTR4
PLCD1
MLN
HLA-DOASERPINF1
CYP11B1
CCR9
REG1P
CLCNKB
GYPA
HOXA9
CETP
MYBPC2
PAX6 GPER
OMP
LIPA
IFI30FGL2
MNDAAIF1
CD53PTPRCIL10RA
ALOX5MAN2B1
EMP3
HCK
ARHGAP25
ACP5LST1
LAPTM5
NCKAP1L
HLA-DRB1
CD74
HLA-DPA1
HLA-DMAHLA-DQB1
HLA-DMB
HLA-DPB1
HLA-DQA1HLA-DRA
CD37
LSP1
MYO1F
CSN2
HLA-DOA
PAX8
CDH15
SCN1B
SYN1FEV
SCNN1G
FES
PLCD1
PSD
AVPR1B
IGL@
RBBP6
CCR9
IKBKE
ZNF132
FHL3
SIRPD
CYP11B1
REG1P
TH
PDCD1
RHOD
OR2H1
MLH1
UROD LIMK2ITGAL
OXTNHLH1
TUB
CELA3B
LOC729991
KLK3
CHERP
AQP2
SPRR1A
DRD4
EVX1
CTRC
GBP1KRT76PSMB8
CXCL9PSMB9TRD@
PSME2CXCL10TAP1
POSTN
FAPSPARC
COL1A2
COL5A1
COL3A1
COL6A3
MFAP2
ANGPT1
CD302PLAU
VCAN
TNXB
CD2
TRBV5-4
MCM2 COL5A2 COL1A1 THY1
THBS2
UBE2C
TOP2A
TK1
KIF11
FOXM1
UBE2S
CDKN3
TTK
COASY
EBNA1BP2
TCEB1
TUBG1
ETV2C4orf10SFTPA2BSFTPA1BTPSAB1CPA3IER2EGR1 USP14ATP2A2
CD48
CORO1B
MGP
SPARCL1
LMOD1
ADH1C
FHL1
NBL1PMP22 COL4A1COL4A2
HLA-C
IGHM
CD79A
CD3E
IGJ
NTN3
POU2AF1
TNFRSF17
LOC91316
COL6A1 COL6A2 MX1
CENPF
MCM4
NCAPH
IFIT1
A2M
MYH11
TIAL1 FBN1IARS HTRA1
AZU1 EVI2BNCF4 C1QB
IFNA21
HNRNPL
HTTPRRX2
RBP3KCNJ4 NEUROD2 GP9PRKCG
FCGR1AARHGEF6LCP1
GPNMBPTPN6
LIFR GFAP SSTR4 HTR2C
SSTR3
TNNI1
ITGB2CD52CYBB
APOC1 GM2AFCER1G
ADRB3MLN
C21orf2FOXO4
CLIP1PPY
NTSR1MYL5 TWIST1 PMS2L11
AMELX
CHIC1LCP2
PLA2G7
PLEK
APOE
CORO1AC3AR1HCLS1
RNASE6
COX7A1 MYL9
GYPC LRP1
MFAP4
PHBGOT1LGALS1SIRPAH2AFXNME1BCAMC4orf6 HBBHBA2
PTGDS
DARC TAGLN
ACTA2RPS4Y1
XIST
CTPS
NR4A1
KDM5D
FOSB
ATF3
ZFP36
KPNA2
KIAA0101
CDC20
MELKKIF14
CCNB1
EIF4A3
EZH2
19
ReferencesCerami, E. et al (2012). The cBio Cancer Genomics Portal: An Open Platform for Exploring MultidimensionalCancer Genomics Data. Cancer Discovery, 2(5), 401–404.
Chowdary, D. et al (2006). Prognostic gene expression signatures can be measured in tissues collected inRNAlater preservative. The Journal of Molecular Diagnostics, 8(1), 31–39.
Davis, A.P. et al (2015). The Comparative Toxicogenomics Database’s 10th year anniversary: update 2015.Nucleic Acids Research, 43(D1), D914–D920.
Hamosh, A. et al (2005). Online Mendelian Inheritance in Man (OMIM), a knowledgebase of human genes andgenetic disorders. Nucleic Acids Research, 33(suppl 1), D514–D517.
INSERM (1997). Orphanet: an online database of rare diseases and orphan drugs.
Magrane, M. et al (2011). UniProt Knowledgebase: a hub of integrated protein data. Database, 2011.
Rappaport, N. et al (2013). MalaCards: an integrated compendium for diseases and their annotation. Database,2013.
Shannon, P. et al (2003). Cytoscape: a software environment for integrated models of biomolecular interactionnetworks. Genome Research, 13(11), 2498–2504.
Singh, D. et al (2002). Gene expression correlates of clinical prostate cancer behavior. Cancer Cell, 1(2),203–209.
Taylor, B.S. et al (2010). Integrative Genomic Profiling of Human Prostate Cancer. Cancer Cell, 18(1), 11–22.
20